Distributed Storage Volumes
Attach distributed storage volumes to your apps
Beam allows you to create highly-available storage volumes that can be used across tasks. You might use volumes for things like storing model weights or large datasets.
Beam Volumes are mounted directly to the containers that run your code, so they are more performant than using cloud object storage.
We strongly recommend using Beam Volumes for any data you plan to access from your Beam functions.
How to Write Files in Beam
Your apps run in read-only containers.
There are two use cases for saving files: persistent files that you want to access between tasks, and temporary files that are deleted when your container spins down.
- Persisting Files: write to a volume.
- Temporary Files: write to the /tmp directory in your Beam container. For example, you could save an image to /tmp/myimage.png.
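As a minimal sketch of the temporary-file case, the helper below writes bytes under /tmp using only the Python standard library. The name `save_temp_file` and its `tmp_dir` parameter are illustrative assumptions and are not part of the Beam SDK.

```python
from pathlib import Path

# Scratch directory described above; its contents are lost when the container spins down.
TMP_DIR = Path("/tmp")


def save_temp_file(name: str, data: bytes, tmp_dir: Path = TMP_DIR) -> Path:
    """Write `data` to `tmp_dir/name` and return the full path (hypothetical helper)."""
    path = Path(tmp_dir) / name
    path.write_bytes(data)
    return path
```

Calling `save_temp_file("myimage.png", image_bytes)` inside a function would place the file at /tmp/myimage.png, where it stays available only for the lifetime of that container.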
Reading and Writing to Volumes
You can read from and write to a Volume using ordinary Python file operations:
from beam import function, Volume

VOLUME_PATH = "./model_weights"


@function(
    volumes=[Volume(name="model-weights", mount_path=VOLUME_PATH)],
)
def access_files():
    # Write files to a volume
    with open(f"{VOLUME_PATH}/somefile.txt", "w") as f:
        f.write("This is being written to a file in the volume")

    # Read files from a volume
    with open(f"{VOLUME_PATH}/somefile.txt", "r") as f:
        print(f.readlines())


if __name__ == "__main__":
    access_files()
To run this code, run python [filename].py. You’ll see it print the text we just wrote to the file.
(.venv) $ python reading_and_writing_data.py
=> Building image
=> Using cached image
=> Syncing files
Reading .beamignore file
Collecting files from /Users/beta9/beam/examples/06_volume
=> Files synced
=> Running function: <reading_and_writing_data:access_files>
['This is being written to a file in the volume']
=> Function complete <e1526222-f665-47a5-9377-6f9036de3951>
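Because a mounted volume behaves like an ordinary directory, standard-library file tools should work on it unchanged. The sketch below round-trips a JSON file through a directory. `save_json` and `load_json` are hypothetical helpers, not Beam APIs. Inside a Beam function you would pass the volume's mount path as `root`.

```python
import json
from pathlib import Path


def save_json(root: str, name: str, payload: dict) -> Path:
    """Serialize `payload` to `root/name`, creating parent directories if needed."""
    path = Path(root) / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload))
    return path


def load_json(root: str, name: str) -> dict:
    """Read back a JSON file previously written under `root`."""
    return json.loads((Path(root) / name).read_text())
```

For example, `save_json(VOLUME_PATH, "runs/config.json", {"epochs": 3})` in one task and `load_json(VOLUME_PATH, "runs/config.json")` in a later task would share the same file, assuming both functions mount the same volume.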
Mounting a Volume
Volumes can be attached to anything you run on Beam.
By default, Volumes are shared across all apps in your Beam account.
from beam import function, Volume

VOLUME_PATH = "./model_weights"


@function(
    volumes=[Volume(name="model-weights", mount_path=VOLUME_PATH)],
)
def load_model():
    from transformers import AutoModel

    # Load the model from the weights cached on the mounted volume
    AutoModel.from_pretrained(VOLUME_PATH)
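The example above assumes the weights are already on the volume. One way to handle the first run is a populate-if-missing pattern, sketched below with the standard library only. `ensure_cached` and its `populate` callback are hypothetical names, and how the weights are actually fetched is left to the caller.

```python
from pathlib import Path
from typing import Callable


def ensure_cached(path: str, populate: Callable[[Path], None]) -> Path:
    """Return the directory at `path`, calling `populate` first if it is missing or empty."""
    target = Path(path)
    if not target.exists() or not any(target.iterdir()):
        target.mkdir(parents=True, exist_ok=True)
        populate(target)  # caller-supplied: write the files into `target`
    return target
```

Inside `load_model`, `populate` might download the weights and write them into the volume directory, so that later tasks find the files already present and skip the download.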
Uploading Data
You can upload files with the CLI using the beam cp command.
Usage: beam cp [OPTIONS] LOCAL_PATH REMOTE_PATH
Copy contents to a volume.
Options:
-c, --context TEXT The config context to use.
-h, --help Show this message and exit.
Match Syntax:
This command provides support for Unix shell-style wildcards, which are not the same as regular expressions.
* matches everything
? matches any single character
[seq] matches any character in `seq`
[!seq] matches any character not in `seq`
For a literal match, wrap the meta-characters in brackets. For example, '[?]' matches the character '?'.
Examples:
# Copy contents to a remote volume
beam cp mydir myvol/subdir
beam cp myfile.txt myvol/subdir
# Use a question mark to match a single character in a path
beam cp 'mydir/?/data?.json' myvol/sub/path
# Use an asterisk to match all characters in a path
beam cp 'mydir/*/*.json' myvol/data
# Use a sequence to match a specific set of characters in a path
beam cp 'mydir/[a-c]/data[0-1].json' myvol/data
# Escape special characters if you don't want to single quote your local path
beam cp mydir/\[a-c\]/data\[0-1\].json myvol/data
beam cp mydir/\?/data\?.json myvol/sub/path
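The wildcard rules listed in the help text read the same as those of Python's standard `fnmatch` module, so patterns can be tried locally before copying. This sketch assumes `beam cp` follows the same matching rules, which the help text suggests but does not confirm. The paths are hypothetical. Note that `fnmatch` lets `*` match path separators too, and `beam cp` may treat `/` differently.

```python
from fnmatch import fnmatchcase

# Hypothetical local paths, used only to illustrate the wildcard rules.
paths = [
    "mydir/a/data0.json",
    "mydir/b/data1.json",
    "mydir/d/data2.json",
    "mydir/a/notes.txt",
]


def matching(pattern: str) -> list:
    """Return the paths that match a Unix shell-style wildcard pattern."""
    return [p for p in paths if fnmatchcase(p, pattern)]


print(matching("mydir/*/*.json"))              # '*' matches everything
print(matching("mydir/?/data?.json"))          # '?' matches any single character
print(matching("mydir/[a-c]/data[0-1].json"))  # '[seq]' matches any character in seq
print(matching("mydir/[!a-c]/*"))              # '[!seq]' matches any character not in seq
print(fnmatchcase("what?.txt", "what[?].txt")) # '[?]' matches a literal '?'
```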
CLI Management Commands
Create a Volume
beam volume create [VOLUME-NAME]
$ beam volume create weights
Name       Created At     Updated At     Workspace Name
───────────────────────────────────────────────────────
weights    May 07 2024    May 07 2024    cf2db0
Delete a Volume
beam volume delete [VOLUME-NAME]
$ beam volume delete model-weights
Any apps (functions, endpoints, taskqueue, etc) that
refer to this volume should be updated before it is deleted.
Are you sure? (y/n) [n]: y
Deleted volume model-weights
List Volumes
beam volume list
$ beam volume list
Name       Size          Created At    Updated At    Workspace Name
─────────────────────────────────────────────────────────────────────────────────────
weights    240.23 MiB    2 days ago    2 days ago    cf2db0
1 volumes | 240.23 MiB used
List Volume Contents
beam ls [VOLUME-NAME]
$ beam ls weights
Name                          Size          Modified Time     IsDir
──────────────────────────────────────────────────────────────────
.locks                        0.00 B        29 minutes ago    Yes
models--facebook--opt-125m    240.23 MiB    28 minutes ago    Yes
2 items | 240.23 MiB used
Copy Files to Volumes
beam cp [LOCAL-PATH] [VOLUME-NAME]
=> weights (copying 1 object)
[LennonBeatlemania.pth] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 kB 0:00:00
Move Files in Volumes
beam mv [SOURCE] [DEST]
$ beam mv file.txt files/text-files
Moved file.txt to files/text-files/file.txt
Remove Files from Volumes
beam rm [FILE]
=> weights/app.py (1 object deleted)
app.py