Beam lets you create highly available storage volumes that can be shared across tasks. Volumes are a good fit for things like model weights or large datasets.

Beam Volumes are mounted directly to the containers that run your code, so they are faster to access than cloud object storage.

We strongly recommend storing any data you plan to access from your Beam functions in Beam Volumes.

How to Write Files in Beam

Your apps run in read-only containers.

There are two use cases for saving files: persistent files that you want to access between tasks, and temporary files that are deleted when your container spins down.

  1. Persisting Files: write to a volume.
  2. Temporary Files: write to the /tmp directory in your Beam container. For example, you could save an image to /tmp/myimage.png.
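
As a rough illustration of the temporary-file case, the sketch below writes some bytes to /tmp using only the Python standard library. The helper name save_temp_file and the placeholder bytes are hypothetical and not part of the Beam SDK; inside a Beam container you would call something like this from your own function.

```python
from pathlib import Path

# Scratch directory described above; contents are lost when the container spins down.
TMP_DIR = Path("/tmp")


def save_temp_file(filename: str, data: bytes, tmp_dir: Path = TMP_DIR) -> Path:
    """Write bytes to a scratch file and return its path (hypothetical helper)."""
    target = tmp_dir / filename
    target.write_bytes(data)
    return target


if __name__ == "__main__":
    # Placeholder bytes stand in for real image data.
    path = save_temp_file("myimage.png", b"placeholder image bytes")
    print(f"Wrote {path.stat().st_size} bytes to {path}")
```

Anything that must survive past the current container belongs on a volume instead.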

Reading and Writing to Volumes

You can read from and write to a Volume using ordinary Python file operations:

from beam import function, Volume


VOLUME_PATH = "./model_weights"


@function(
    volumes=[Volume(name="model-weights", mount_path=VOLUME_PATH)],
)
def access_files():
    # Write files to a volume
    with open(f"{VOLUME_PATH}/somefile.txt", "w") as f:
        f.write("This is being written to a file in the volume")

    # Read files from a volume
    with open(f"{VOLUME_PATH}/somefile.txt", "r") as f:
        print(f.readlines())


if __name__ == "__main__":
    access_files()

To run this code, run python [filename].py. It prints the text that was just written to the file.

(.venv) $ python reading_and_writing_data.py

=> Building image
=> Using cached image
=> Syncing files
Reading .beamignore file
Collecting files from /Users/beta9/beam/examples/06_volume
=> Files synced
=> Running function: <reading_and_writing_data:access_files>

['This is being written to a file in the volume']

=> Function complete <e1526222-f665-47a5-9377-6f9036de3951>
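
Because a mounted volume behaves like a regular directory, the same approach extends to structured data. The following is a minimal sketch, assuming the volume is already mounted at the path you pass in; save_json and load_json are hypothetical helper names, and the demo uses a local temporary directory as a stand-in for the mount path so it can run without Beam.

```python
import json
import tempfile
from pathlib import Path


def save_json(volume_path: str, relative_name: str, payload: dict) -> Path:
    """Serialize payload as JSON under the mounted volume path (hypothetical helper)."""
    target = Path(volume_path) / relative_name
    # Create intermediate directories so nested names like "runs/1/metrics.json" work.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(payload))
    return target


def load_json(volume_path: str, relative_name: str) -> dict:
    """Read a JSON file back from the mounted volume path."""
    return json.loads((Path(volume_path) / relative_name).read_text())


if __name__ == "__main__":
    # A temporary directory stands in for the volume mount path in this local demo.
    with tempfile.TemporaryDirectory() as fake_volume:
        save_json(fake_volume, "runs/1/metrics.json", {"accuracy": 0.5})
        print(load_json(fake_volume, "runs/1/metrics.json"))
```

Inside a Beam function you would pass the same path you used as mount_path, for example VOLUME_PATH from the example above.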

Creating a Volume

Volumes can be attached to anything you run on Beam.

By default, Volumes are shared across all apps in your Beam account.

from beam import function, Volume


VOLUME_PATH = "./model_weights"


@function(
    volumes=[Volume(name="model-weights", mount_path=VOLUME_PATH)],
)
def load_model():
    from transformers import AutoModel

    # Load the model from the files stored on the mounted volume
    AutoModel.from_pretrained(VOLUME_PATH)
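
The load_model example assumes the weights are already on the volume. One common way to get them there is a populate-once pattern: check for a marker file, and only fetch the data if it is missing. The sketch below illustrates that idea in plain Python; ensure_populated, fake_download, and the config.json marker are assumptions made for illustration and are not part of the Beam SDK. In a real app the populate step might download a model and save it to the volume path.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_populated(
    volume_path: str, marker: str, populate: Callable[[Path], None]
) -> bool:
    """Run populate(root) only when the marker file is missing.

    Returns True if populate ran, False if the volume was already populated.
    """
    root = Path(volume_path)
    if (root / marker).exists():
        return False
    root.mkdir(parents=True, exist_ok=True)
    populate(root)
    return True


def fake_download(root: Path) -> None:
    """Stand-in for a real download; writes a small marker file."""
    (root / "config.json").write_text("{}")


if __name__ == "__main__":
    # A temporary directory stands in for the volume mount path in this local demo.
    with tempfile.TemporaryDirectory() as fake_volume:
        print(ensure_populated(fake_volume, "config.json", fake_download))  # first call populates
        print(ensure_populated(fake_volume, "config.json", fake_download))  # second call skips
```

Since volumes persist between tasks, later runs skip the populate step and read straight from the volume.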

If you add a volume to your app, it will be created automatically. You can also create volumes manually with the CLI:

$ beam volume create my-volume

  Name         Created At   Updated At   Workspace Name
 ───────────────────────────────────────────────────────
  my-volume    just now     just now     f6fa28

Uploading Data

You can upload files with the CLI using the beam cp command.

beam cp [local-file] beam://[volume-name]

Files

beam cp file.txt beam://myvol/              # ./file.txt => beam://myvol/file.txt
beam cp file.txt beam://myvol/file.txt      # ./file.txt => beam://myvol/file.txt
beam cp file.txt beam://myvol/file.new      # ./file.txt => beam://myvol/file.new
beam cp file.txt beam://myvol/hello         # ./file.txt => beam://myvol/hello.txt (keeps the extension)

Directories

beam cp mydir beam://myvol                  # ./mydir/file.txt => beam://myvol/file.txt
beam cp mydir beam://myvol/mydir            # ./mydir/file.txt => beam://myvol/mydir/file.txt
beam cp mydir beam://myvol/newdir           # ./mydir/file.txt => beam://myvol/newdir/file.txt

Downloading Data

Files

beam cp beam://myvol/file.txt .             # beam://myvol/file.txt => ./file.txt
beam cp beam://myvol/file.txt file.new      # beam://myvol/file.txt => ./file.new

Directories

beam cp beam://myvol/mydir .                # beam://myvol/mydir/file.txt => ./file.txt
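
If you want to script uploads and downloads, one option is to build the beam cp argument list in Python and hand it to subprocess.run. The sketch below only constructs the commands, following the beam cp [source] [destination] syntax shown above; the helper names volume_uri, upload_command, and download_command are hypothetical, and running the commands assumes the beam CLI is installed and authenticated.

```python
import shlex
from typing import List


def volume_uri(volume: str, remote_path: str = "") -> str:
    """Build a beam:// URI for a volume and an optional path inside it."""
    if remote_path:
        return f"beam://{volume}/{remote_path.lstrip('/')}"
    return f"beam://{volume}"


def upload_command(local_path: str, volume: str, remote_path: str = "") -> List[str]:
    """Argument list for copying a local path to a volume."""
    return ["beam", "cp", local_path, volume_uri(volume, remote_path)]


def download_command(volume: str, remote_path: str, local_path: str = ".") -> List[str]:
    """Argument list for copying a path on a volume to the local machine."""
    return ["beam", "cp", volume_uri(volume, remote_path), local_path]


if __name__ == "__main__":
    # Print the commands; pass a list to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(upload_command("file.txt", "myvol", "file.new")))
    print(shlex.join(download_command("myvol", "file.txt")))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.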

CLI Management Commands

Create a Volume

beam volume create [VOLUME-NAME]
$ beam volume create weights

  Name       Created At    Updated At    Workspace Name
 ───────────────────────────────────────────────────────
  weights   May 07 2024   May 07 2024   cf2db0

Delete a Volume

beam volume delete [VOLUME-NAME]
$ beam volume delete model-weights

Any apps (functions, endpoints, taskqueue, etc) that
refer to this volume should be updated before it is deleted.
Are you sure? (y/n) [n]: y

Deleted volume model-weights

List Volumes

beam volume list
$ beam volume list

  Name                                Size   Created At   Updated At   Workspace Name
 ─────────────────────────────────────────────────────────────────────────────────────
  weights                       240.23 MiB   2 days ago   2 days ago   cf2db0

  1 volumes | 240.23 MiB used

List Volume Contents

beam ls [VOLUME-NAME]
$ beam ls weights

  Name                               Size   Modified Time    IsDir
 ──────────────────────────────────────────────────────────────────
  .locks                           0.00 B   29 minutes ago   Yes
  models--facebook--opt-125m   240.23 MiB   28 minutes ago   Yes

  2 items | 240.23 MiB used

Copy Files to Volumes

beam cp [LOCAL-PATH] beam://[VOLUME-NAME]
$ beam cp my-file beam://my-volume

[beam://my-volume/my-file] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.0/10.0 MiB 1.29 MiB/s 0:00:07

Move Files in Volumes

beam mv [SOURCE] [DEST]
$ beam mv file.txt files/text-files

Moved file.txt to files/text-files/file.txt

Remove Files from Volumes

beam rm [FILE]
=> weights/app.py (1 object deleted)
app.py