Applications on Beam run inside containers. A container is like a lightweight VM that bundles the software your application needs. The benefit of using containers is portability: the required runtime environment is packaged alongside the application.

Containers are created from container images, which define the contents of the container and how it should be built.

Because you are building a custom application, it likely depends on some custom software to run. This could include Python packages, libraries, binaries, and drivers.

You can customize the container image used to run your Beam application with the Image class. The options specified in the Image class will influence how the image is built.

Exploring the Beam Image Class

Every application that runs on Beam instantiates the Image class. This class provides a variety of methods for customizing the container image used to run your application.

It exposes options for:

  • Installing a specific version of Python
  • Adding custom shell commands that run during the build process
  • Adding custom Python packages to install in the container
  • Choosing a custom base image to build on top of
  • Using a custom Dockerfile to build your own base image
  • Setting up a custom conda environment using micromamba
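
These options can be combined by chaining methods on a single Image. A minimal sketch, using only methods covered later in this guide (the package version is illustrative):

from beam import Image

image = (
    Image(python_version="python3.11")      # pin the Python version
    .add_commands(["apt-get update -y"])    # run shell commands during the build
    .add_python_packages(["numpy==2.2.0"])  # install Python packages
)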

The default Beam image uses ubuntu:22.04 as its base and installs Python 3.10.

from beam import function


# This function will use ubuntu:22.04 with Python 3.10
@function()
def hello_world():
    return "Hello, world!"

hello_world.remote()

Adding Python Packages

The most common way to customize your image is to add the Python packages required by your application. This is done by calling the add_python_packages method on the Image object with a list of package names.

Pinning the version of the package is recommended. This ensures that when you re-deploy your application, you won’t accidentally pick up a new version that breaks your application.

from beam import Image, endpoint

image = Image(python_version="python3.11").add_python_packages(["numpy==2.2.0"])

@endpoint(image=image)
def handler():
  return {}

Importing requirements.txt

If you already have a requirements.txt file, you can pass it directly to the Image constructor’s python_packages parameter:

from beam import Image, endpoint

image = Image(python_version="python3.11", python_packages="requirements.txt")

@endpoint(image=image)
def handler():
  return {}

Adding Shell Commands

Sometimes, it is necessary to run additional shell commands while building your image. This can be achieved by calling the add_commands method on the Image object with a list of commands.

For instance, you might need to install libjpeg-dev when using the Pillow library. In the example below, we’ll install libjpeg-dev and then install Pillow.

from beam import Image, endpoint

image = (
    Image(python_version="python3.11")
    .add_commands(["apt-get update", "apt-get install libjpeg-dev -y"])
    .add_python_packages(["Pillow"])
)

@endpoint(image=image)
def handler():
  return {}

Customizing the Base Image

Some applications and libraries require specific dependencies that are not available in the default Beam image. In these cases, you can use a custom base image.

Some of the most common custom base images are the CUDA development images from NVIDIA (e.g. nvcr.io/nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04). These images come with additional libraries, debugging tools, and nvcc installed.

The image below will use a custom CUDA image as the base.

from beam import Image, function

image = Image(
    base_image="nvcr.io/nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04"
)

@function(image=image)
def hello_world():
    return "Hello, world!"

hello_world.remote()

CUDA Drivers & NVIDIA Kernel Drivers

When choosing a custom base image, it is important to understand the difference between the NVIDIA Kernel Driver and the CUDA Runtime & Libraries.

Component                  Location       Role
NVIDIA Kernel Driver       Host Machine   Low-level GPU management; talks directly to hardware.
CUDA Runtime & Libraries   Container      Provides high-level APIs and libraries for applications.

The NVIDIA Kernel Driver on the host must support the CUDA version used by the container.

In general, if the CUDA version supported by the host driver is greater than or equal to the CUDA version in the container, the NVIDIA Kernel Driver on the host will support the CUDA version used by the container.

For example, using a CUDA 12.2 image on a host with a CUDA 12.4 driver will work. However, using a CUDA 12.8 image on a host with a CUDA 12.4 driver will not work.

You can consult the table below to help you choose a compatible base image.

GPU        Driver Version   CUDA Version
T4         550.127.05       12.4
A10G       550.90.12        12.4
RTX4090    550.127.05       12.4
L40s       550.127.05       12.4
A100-40    550.127.05       12.4
H100       550.127.05       12.4
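
You can also verify compatibility empirically by printing the driver version from inside a container. A minimal sketch, assuming the @function decorator accepts a gpu parameter (as used elsewhere in Beam’s docs) and that nvidia-smi is available in the CUDA base image:

from beam import Image, function

image = Image(base_image="nvcr.io/nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04")


@function(image=image, gpu="T4")  # gpu parameter assumed from Beam's GPU docs
def check_driver():
    import subprocess

    # nvidia-smi reports the host kernel driver version and the highest
    # CUDA version that driver supports.
    return subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout


print(check_driver.remote())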

Using a Specific Python Version

To install a specific version of Python, you can use the python_version parameter:

from beam import function, Image


# This function will use ubuntu:22.04 with Python 3.11
@function(image=Image(python_version="python3.11"))
def hello_world():
    return "Hello, world!"

hello_world.remote()

This function will use the CUDA image as the base and install Python 3.10, because no python_version is specified and the CUDA image does not come with Python installed.

from beam import Image, function


@function(
    image=Image(
        base_image="nvcr.io/nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04",
    )
)
def custom_image_no_python():
    return "Hello, world!"

This function will use the CUDA image as the base and install Python 3.11 because a python_version is specified.

from beam import Image, function


@function(
    image=Image(
        base_image="nvcr.io/nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04",
        python_version="python3.11",
    )
)
def custom_image_python_requested():
    return "Hello, world!"

If your image comes with a pre-installed version of Python 3, it will be used by default as long as you don’t specify a python_version in your Image constructor. This function will use the PyTorch image as the base and will use the Python version that already exists in that image.

from beam import Image, function


@function(
    image=Image(
        base_image="docker.io/pytorch/pytorch:2.2.1-cuda12.1-cudnn8-devel"
    )
)
def custom_image_pytorch():
    return "Hello, world!"

Building on GPU

By default, Beam builds your images on CPU-only machines. However, sometimes you might need the build to occur on a machine with a GPU.

For instance, some libraries might compile CUDA kernels during installation. In these cases, you can use the build_with_gpu() command to run your build on the GPU of your choice.

from beam import Image

image = (
    Image()
    .add_commands(
        [
            "apt-get update -y",
            "apt-get install ffmpeg -y",
            "apt-get install nvidia-cuda-toolkit -y", # Requires GPU to install
        ]
    )
    .build_with_gpu(gpu="T4") # Run the build on a T4
)

Building with Environment Variables

Often, shell commands require certain environment variables to be set. You can set these using the with_envs command:

from beam import Image

image = (
    Image()
    .add_python_packages(["huggingface_hub[cli]", "accelerate"])
    .with_envs(["HF_HUB_ENABLE_HF_TRANSFER=1", "HF_HOME"=/models])
    .add_commands(["huggingface-cli download meta-llama/Llama-3.2-3B"])
)

Injecting Secrets

Sometimes, you might not want environment variables to be set in plain text. In these cases, you can leverage Beam secrets and the with_secrets command.

You can create a secret using the CLI: beam secret create HF_TOKEN <your_token>. Then reference it in your image:

from beam import Image

image = (
    Image()
    .add_python_packages(["huggingface_hub[cli]", "accelerate"])
    .with_envs(["HF_HUB_ENABLE_HF_TRANSFER=1", "HF_HOME"=/models])
    .with_secrets(["HF_TOKEN"]) # Models with a user agreement often require a token
    .add_commands(["huggingface-cli download meta-llama/Llama-3.2-3B"])
)

Note: Adding secrets and environment variables to the build environment does not make them available in the runtime environment.

Runtime environment variables and secrets must be specified in the function decorator directly:

from beam import function

@function(env_vars={"HF_HOME": "/models"}, secrets=["HF_TOKEN"])
def download_model():
    return "Hello, world!"

Using a Dockerfile

You also have the option to build your own custom base image using a Dockerfile.

The from_dockerfile() command accepts a path to a valid Dockerfile as well as an optional path to a context directory:

from beam import Image, endpoint

image = Image().from_dockerfile("./Dockerfile").add_python_packages(["numpy"])


@endpoint(image=image, name="test_dockerfile")
def handler():
  return {}

The context directory serves as the root for any paths used in commands like COPY and ADD, meaning all relative paths are relative to this directory.
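
For example, a minimal sketch of passing a context directory (the context_dir parameter name is an assumption; consult the SDK reference for the exact signature):

from beam import Image

# context_dir is assumed here; COPY and ADD paths in the Dockerfile would
# then resolve relative to ./app.
image = Image().from_dockerfile("./Dockerfile", context_dir="./app")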

The image built from your Dockerfile will be used as the base image for a Beam application.

Ports will not be exposed in the runtime environment, and the entrypoint will be overridden.

Conda Environments

Beam supports using Anaconda environments via micromamba. To get started, you can chain the micromamba method to your Image definition and then specify packages and channels via the add_micromamba_packages method.

from beam import Image


image = (
    Image(python_version="python3.11")
    .micromamba()
    .add_micromamba_packages(packages=["pandas", "numpy"], channels=["conda-forge"])
    .add_python_packages(packages=["huggingface-hub[cli]"])
    .add_commands(commands=["micromamba run -n beta9 huggingface-cli download gpt2 config.json"])
)

You can still use pip to install additional packages in the conda environment, and you can run shell commands too.

If you need to run a shell command inside the conda environment, you should prepend the command with micromamba run -n beta9 as shown above.