Connecting an S3 Bucket
This example illustrates a few capabilities of Beam:
Customize the runtime
First, you’ll define a Runtime with an Image.
We’re going to be defining two things:
- Which packages to install in the runtime
- A storage volume to temporarily store images downloaded from S3
from beam import App, Runtime, Image, Volume

app = App(
    name="s3-background-remover",
    runtime=Runtime(
        cpu=1,
        memory="16Gi",
        image=Image(
            python_version="python3.8",
            python_packages=["pillow", "rembg", "boto3"],
        ),
    ),
    volumes=[Volume(path="./unprocessed_images", name="unprocessed_images")],
)
Storing AWS secrets
Since we’re pulling image files from Amazon S3, you’ll need your own AWS credentials to run this example. You can save your AWS credentials in the Beam Secrets Manager and read them at runtime as environment variables through os.environ.
os.environ["AWS_ACCESS_KEY"]
os.environ["AWS_SECRET_ACCESS_KEY"]
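A missing secret otherwise surfaces as a bare KeyError the first time the job runs. As a minimal sketch, the lookup can be wrapped in a small helper that fails with a clearer message. The helper name load_aws_credentials is hypothetical and not part of Beam; the variable names are the two used in this example.

```python
import os

# Secret names as used in this example; adjust if yours are named differently.
REQUIRED_SECRETS = ("AWS_ACCESS_KEY", "AWS_SECRET_ACCESS_KEY")


def load_aws_credentials(env=os.environ):
    """Return the AWS credentials found in `env`, failing fast if any are missing.

    Hypothetical helper for illustration; `env` defaults to the process environment.
    """
    missing = [name for name in REQUIRED_SECRETS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing secrets: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED_SECRETS}
```

Calling load_aws_credentials() once at the top of the job makes a misconfigured secret obvious before any S3 request is attempted.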
Reading and writing files from S3
Let’s write a basic client to read files from and write files to an S3 bucket. You’ll set up one dedicated bucket for unprocessed images and another bucket for finished images.
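One possible shape for such a client is sketched below, using boto3. The class name S3Bucket and its methods are assumptions made for illustration; only the boto3 calls (boto3.client, list_objects_v2, download_file, upload_file) and the two secret names come from boto3 and from this example. The boto3 import is deferred so that a stub client can be passed in for local testing.

```python
import os


class S3Bucket:
    """Thin wrapper around a boto3 S3 client for a single bucket (illustrative)."""

    def __init__(self, bucket_name, client=None):
        if client is None:
            # Imported lazily so the class can be exercised with a stub client.
            import boto3

            client = boto3.client(
                "s3",
                aws_access_key_id=os.environ["AWS_ACCESS_KEY"],
                aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
            )
        self.bucket_name = bucket_name
        self.client = client

    def list_keys(self):
        """Return object keys in the bucket (one call returns at most 1,000)."""
        response = self.client.list_objects_v2(Bucket=self.bucket_name)
        return [obj["Key"] for obj in response.get("Contents", [])]

    def download(self, key, directory):
        """Download one object into `directory` and return the local path."""
        os.makedirs(directory, exist_ok=True)
        local_path = os.path.join(directory, os.path.basename(key))
        self.client.download_file(self.bucket_name, key, local_path)
        return local_path

    def upload(self, local_path, key=None):
        """Upload a local file, using its file name as the key by default."""
        key = key or os.path.basename(local_path)
        self.client.upload_file(local_path, self.bucket_name, key)
        return key
```

With this in place, you would create one instance per bucket, for example S3Bucket("my-unprocessed-images") and S3Bucket("my-processed-images"), where both bucket names are placeholders for your own.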
Processing images with rembg
We’ll use rembg to remove the backgrounds from our images.
Let’s write a function to:
- Download all the files in your bucket to a Storage Volume
- Apply the background removal process to each image with rembg
- Upload each processed image to an S3 bucket
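The three steps above can be sketched as a single function. This is an illustration under stated assumptions, not the original implementation: the function name remove_backgrounds, its parameters, and the "-processed.png" naming are hypothetical. It expects a boto3 S3 client and relies on rembg's remove function accepting image bytes and returning PNG bytes.

```python
import os


def remove_backgrounds(s3, input_bucket, output_bucket, volume_path, remove_fn=None):
    """Download each image, strip its background, and upload the result.

    `s3` is a boto3 S3 client; `remove_fn` maps image bytes to image bytes.
    Returns the list of keys uploaded to `output_bucket`.
    """
    if remove_fn is None:
        # Imported lazily; rembg.remove accepts raw bytes and returns PNG bytes.
        from rembg import remove as remove_fn

    os.makedirs(volume_path, exist_ok=True)
    uploaded = []

    # Note: a single list_objects_v2 call returns at most 1,000 keys.
    response = s3.list_objects_v2(Bucket=input_bucket)
    for obj in response.get("Contents", []):
        key = obj["Key"]
        if key.endswith("/"):
            continue  # skip folder placeholder objects

        # 1. Download the original into the storage volume.
        local_path = os.path.join(volume_path, os.path.basename(key))
        s3.download_file(input_bucket, key, local_path)

        # 2. Remove the background.
        with open(local_path, "rb") as f:
            result = remove_fn(f.read())

        # 3. Save as PNG (which supports transparency) and upload.
        output_path = os.path.splitext(local_path)[0] + "-processed.png"
        with open(output_path, "wb") as f:
            f.write(result)
        output_key = os.path.basename(output_path)
        s3.upload_file(output_path, output_bucket, output_key)
        uploaded.append(output_key)

    return uploaded
```

Passing remove_fn explicitly makes it possible to try the download and upload flow with a stand-in function before installing rembg, which pulls in a sizeable model on first use.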
Running the function on a schedule
Since we want this to run on a schedule, we’ll add a Scheduled Job to the Beam app.
@app.schedule(when="every 5m")
Deploying the app
To deploy the app, open a shell in your working directory and run:
beam deploy app.py
After you run this command, your app will run on the schedule defined above (every 5 minutes in this example), indefinitely.
You can modify the frequency by updating the schedule interval and redeploying the app. If you decide that you’d rather invoke the function manually as a REST API, you can do that too.