Mounting S3 Buckets
Attach S3 buckets to your apps
Beam allows you to mount your own S3 buckets to your apps. Buckets are mounted using AWS’s mountpoint-s3. In general, any provider with an S3-compatible API should work. For instance, AWS S3, Cloudflare R2, and Tigris all work out of the box.
Mountpoint is optimized for reading large files with high throughput and writing new files from a single client at a time. It does not provide full POSIX compliance. For instance, it does not support appending to files.
Cloud buckets allow you to expose your own S3-compatible storage as a file system.
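Because appends are not supported, code that would normally open a file in append mode needs a different pattern. One option, sketched below, is to write each chunk as a new file and concatenate the chunks on read. The helper names `write_segment` and `read_segments` and the file naming scheme are illustrative assumptions, not part of Beam or Mountpoint.

```python
import os
import uuid


def write_segment(directory: str, index: int, data: str) -> str:
    """Write `data` to a brand-new file instead of appending to an existing one."""
    os.makedirs(directory, exist_ok=True)
    # The zero-padded index keeps segments sortable; the uuid avoids name clashes
    # if two writers happen to use the same index.
    name = f"{index:08d}-{uuid.uuid4().hex}.log"
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write(data)
    return path


def read_segments(directory: str) -> str:
    """Concatenate all segments in file-name order."""
    parts = []
    for name in sorted(os.listdir(directory)):
        with open(os.path.join(directory, name)) as f:
            parts.append(f.read())
    return "".join(parts)
```

Inside a Beam function, `directory` would be a path under the bucket's mount path. Each call creates a new object, so this suits occasional checkpoints or log chunks better than high-frequency small writes.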
Mounting an S3 Bucket
External S3 buckets are special cases of the volume abstraction with some extra configuration. To connect your bucket to your app, you need to provide the following:
- The bucket name
- An AWS access key
- An AWS secret key
- An S3 endpoint (if you’re using a non-AWS S3-compatible provider)
The access key and secret key must have permission to read, write, and list objects in the bucket. Within AWS, you can use IAM policies to control these permissions. Other S3-compatible providers (like Cloudflare R2 and Tigris) often provide these keys when you sign up for their service.
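For AWS, the read, write, and list permissions above could be expressed as an IAM policy along the following lines. This is a minimal sketch: the helper name `bucket_policy` is hypothetical, the bucket name `weights` is only an example, and the action list follows the pattern commonly recommended for Mountpoint, so verify it against your own requirements before attaching it to a user or role.

```python
import json


def bucket_policy(bucket_name: str) -> dict:
    """Build an IAM policy document granting list, read, and write access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket_name}"],
            },
            {
                # Reads, writes, and deletes apply to the objects inside the bucket.
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:AbortMultipartUpload",
                    "s3:DeleteObject",
                ],
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be pasted into the IAM console.
    print(json.dumps(bucket_policy("weights"), indent=2))
```

For a bucket that will only be mounted read-only, the write-related actions (`s3:PutObject`, `s3:AbortMultipartUpload`, `s3:DeleteObject`) can be dropped.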
You will need to store the access key and secret key in Beam’s secret manager. You can do this using the Beam CLI:
beam secret create S3_KEY "your-access-key"
beam secret create S3_SECRET "your-secret-key"
When a request to start the container is received, Beam looks up these secrets and uses them to mount the bucket. You can use any names you like for the secrets, as long as they match the values you enter in the CloudBucketConfig.
The endpoint is optional. If you’re using AWS S3, you can omit it, but if you’re using a non-AWS S3-compatible provider, you will need to provide it.
from beam import CloudBucket, CloudBucketConfig, function

mount_path = "./weights"

weights = CloudBucket(
    name="weights",
    mount_path=mount_path,
    config=CloudBucketConfig(
        access_key="S3_KEY",
        secret_key="S3_SECRET",
    ),
)


@function(volumes=[weights])
def sandbox():
    import os
    import uuid

    # Write to the bucket.
    file_name = f"{uuid.uuid4()}.txt"
    file_path = os.path.join(weights.mount_path, file_name)
    try:
        with open(file_path, "w") as f:
            f.write("hello world")
    except Exception as e:
        print(e)

    # Read from the bucket.
    with open(file_path, "r") as f:
        print(f.read())


if __name__ == "__main__":
    sandbox.remote()
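For a non-AWS provider, the configuration also needs the endpoint. The sketch below only assembles the keyword arguments so it can run without Beam installed. It assumes the CloudBucketConfig field is named `endpoint`, which this page does not state, so check the SDK reference. The helper `bucket_config_kwargs` is hypothetical, and `<account_id>` is a placeholder to replace with your own Cloudflare account ID.

```python
from typing import Optional


def bucket_config_kwargs(
    access_key_secret: str,
    secret_key_secret: str,
    endpoint: Optional[str] = None,
) -> dict:
    """Collect keyword arguments for CloudBucketConfig, adding the endpoint only when set."""
    kwargs = {"access_key": access_key_secret, "secret_key": secret_key_secret}
    if endpoint:
        # Assumption: the config field is called `endpoint`.
        kwargs["endpoint"] = endpoint
    return kwargs


# AWS S3: no endpoint needed.
aws_kwargs = bucket_config_kwargs("S3_KEY", "S3_SECRET")

# Cloudflare R2: endpoints have the form https://<account_id>.r2.cloudflarestorage.com
r2_kwargs = bucket_config_kwargs(
    "S3_KEY",
    "S3_SECRET",
    endpoint="https://<account_id>.r2.cloudflarestorage.com",
)
```

Under that assumption, the result would be passed along as `CloudBucketConfig(**r2_kwargs)` in place of the explicit arguments shown in the example above.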
Read Only Buckets
You can mount your bucket as read-only by setting the read_only flag to True. This will prevent any writes to the bucket.
weights = CloudBucket(
    name="weights",
    mount_path="./weights",
    config=CloudBucketConfig(
        access_key="S3_KEY",
        secret_key="S3_SECRET",
        read_only=True,
    ),
)
Specifying a region
You can specify a region for your bucket by setting the region field in the CloudBucketConfig. This option can be important when mounting AWS buckets.
From mountpoint’s documentation:
Amazon S3 buckets are associated with a single AWS Region. Mountpoint attempts to automatically detect the region for your S3 bucket at startup time and directs all S3 requests to that region. However, in some scenarios like cross-region mount with a directory bucket, this region detection may fail, preventing your bucket from being mounted and displaying Access Denied or No Such Bucket errors.
weights = CloudBucket(
    name="weights",
    mount_path="./weights",
    config=CloudBucketConfig(
        access_key="S3_KEY",
        secret_key="S3_SECRET",
        region="us-east-1",
    ),
)
Egress fees
Beam is a multi-tenant service. This means that you might get charged egress fees if your container happens to start in a region that is different from your bucket’s. You can read more about AWS S3 egress fees here.