SDK Reference
Environment
Image
Defines a custom container image that your code will run in.
An Image object encapsulates the configuration of a custom container image that will be used as the runtime environment for executing tasks.
The Python version to be used in the image. Defaults to Python 3.8.
A list of Python packages to install in the container image. Alternatively, a string containing a path to a requirements.txt can be provided. Default is [].
A list of shell commands to run when building your container image. These commands can be used for setting up the environment, installing dependencies, etc. Default is [].
A custom base image to replace the default ubuntu20.04 image used in your container. This can be a public or private image from Docker Hub, Amazon ECR, Google Cloud Artifact Registry, or NVIDIA GPU Cloud Registry. The formats for these registries are, respectively, docker.io/my-org/my-image:0.1.0, 111111111111.dkr.ecr.us-east-1.amazonaws.com/my-image:latest, us-east4-docker.pkg.dev/my-project/my-repo/my-image:0.1.0, and nvcr.io/my-org/my-repo:0.1.0. Default is None.
A dict of key/value pairs, or a list of keys, naming the environment variables that contain credentials to a private registry. When provided as a dict, you must supply the correct keys and values. When provided as a list, the keys are used to look up the environment variable values for you. Default is None.
List of Base Image Creds
Dict of Base Image Creds
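The difference between the list form and the dict form can be sketched in plain Python. The helper `resolve_registry_creds` and the variable names `REGISTRY_USERNAME` and `REGISTRY_PASSWORD` are hypothetical and not part of the Beam SDK; the sketch only illustrates the lookup behavior described above, assuming a missing variable should be treated as an error.

```python
import os
from typing import Dict, List, Mapping, Optional, Union

Creds = Union[Dict[str, str], List[str], None]


def resolve_registry_creds(
    creds: Creds, environ: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Hypothetical helper: reduce either credential form to key/value pairs."""
    environ = os.environ if environ is None else environ
    if creds is None:
        return {}
    if isinstance(creds, dict):
        # Dict form: the caller supplies both keys and values.
        return dict(creds)
    # List form: each key is looked up in the environment.
    missing = [key for key in creds if key not in environ]
    if missing:
        raise KeyError(f"missing environment variables: {missing}")
    return {key: environ[key] for key in creds}


if __name__ == "__main__":
    # A fake environment keeps the demo independent of the real shell.
    fake_env = {"REGISTRY_USERNAME": "me", "REGISTRY_PASSWORD": "s3cret"}
    print(resolve_registry_creds(["REGISTRY_USERNAME", "REGISTRY_PASSWORD"], fake_env))
    print(resolve_registry_creds({"REGISTRY_USERNAME": "me"}, fake_env))
```

The list form keeps secrets out of your source file, since only the variable names appear in code.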
Adds environment variables to an image. These will be available when building the image and when the container is running. This can be a string, a list of strings, or a dictionary of strings. The string must be in the format of KEY=VALUE. If a list of strings is provided, each element should be in the same format. Default is None.
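As a rough illustration of the three accepted shapes, the sketch below reduces a string, a list of strings, or a dictionary to a single dictionary. The function `normalize_env_vars` is a hypothetical helper written for this example, not the SDK's own parsing code, and splitting on the first `=` is an assumption.

```python
from typing import Dict, List, Union

EnvVars = Union[str, List[str], Dict[str, str], None]


def normalize_env_vars(env_vars: EnvVars) -> Dict[str, str]:
    """Hypothetical helper: reduce the three accepted shapes to one dict."""
    if env_vars is None:
        return {}
    if isinstance(env_vars, dict):
        return dict(env_vars)
    if isinstance(env_vars, str):
        env_vars = [env_vars]
    result: Dict[str, str] = {}
    for item in env_vars:
        # Split on the first "=" only, so values may contain "=" themselves.
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {item!r}")
        result[key] = value
    return result


if __name__ == "__main__":
    print(normalize_env_vars("MODE=production"))
    print(normalize_env_vars(["MODE=production", "LOG_LEVEL=debug"]))
    print(normalize_env_vars({"MODE": "production"}))
```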
Builds the image on a GPU.
Context
Context is a dataclass used to store various useful fields you might want to access in your entry point logic.
Field Name | Type | Default Value | Purpose |
---|---|---|---|
container_id | Optional[str] | None | Unique identifier for a container |
stub_id | Optional[str] | None | Identifier for a stub |
stub_type | Optional[str] | None | Type of the stub (function, endpoint, taskqueue, etc) |
callback_url | Optional[str] | None | URL called when the task status changes |
task_id | Optional[str] | None | Identifier for the specific task |
timeout | Optional[int] | None | Maximum time allowed for the task to run (seconds) |
on_start_value | Optional[Any] | None | Any values returned from the on_start function |
bind_port | int | 0 | Port number to bind a service to |
python_version | str | "" | Version of Python to be used |
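The table above maps directly onto a Python dataclass. The sketch below is an illustrative mirror of the documented fields and defaults, not the SDK's own class definition, and `describe_task` is a hypothetical helper showing how entry point logic might read a few fields.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Context:
    """Illustrative mirror of the documented fields, not the SDK class."""

    container_id: Optional[str] = None
    stub_id: Optional[str] = None
    stub_type: Optional[str] = None
    callback_url: Optional[str] = None
    task_id: Optional[str] = None
    timeout: Optional[int] = None
    on_start_value: Optional[Any] = None
    bind_port: int = 0
    python_version: str = ""


def describe_task(context: Context) -> str:
    """Hypothetical helper that reads a few fields from the context."""
    limit = "no timeout set" if context.timeout is None else f"{context.timeout}s"
    return f"task {context.task_id} ({context.stub_type}), limit: {limit}"


if __name__ == "__main__":
    print(describe_task(Context(task_id="t1", stub_type="endpoint", timeout=30)))
```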
Callables
Function
Decorator for defining a remote function.
This method allows you to run the decorated function in a remote container.
The number of CPU cores allocated to the container.
The amount of memory allocated to the container. It should be specified in MiB, or as a string with units (e.g., “1Gi”).
The type or name of the GPU device to be used for GPU-accelerated tasks. If not applicable or no GPU required, leave it empty. Multiple GPUs can be specified as a list.
The container image used for task execution.
The maximum number of seconds a task can run before timing out. Set to -1 to disable the timeout.
The maximum number of times a task will be retried if the container crashes.
An optional URL to send a callback to when a task is completed, timed out, or cancelled.
A list of storage volumes to be associated with the function.
A list of secrets that are injected into the container as environment variables.
An optional name for this function, used during deployment. If not specified, you must specify the name at deploy time with the --name argument.
The task policy for the function. This helps manage the lifecycle of an individual task. Setting values here will override timeout and retries.
A list of exceptions that will trigger a retry.
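To make the interplay between a retry limit and a list of retryable exceptions concrete, here is a small local model. It is only a sketch: `run_with_retries` is a hypothetical function, `retry_for` is an assumed parameter name, and Beam's actual retry handling happens on the platform rather than in a loop like this.

```python
from typing import Callable, Sequence, Tuple, Type, TypeVar

T = TypeVar("T")


def run_with_retries(
    task: Callable[[], T],
    retries: int,
    retry_for: Sequence[Type[BaseException]] = (),
) -> Tuple[T, int]:
    """Run task; retry only on listed exception types.

    Returns the result together with the number of attempts made.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return task(), attempts
        except tuple(retry_for):
            # One initial attempt plus at most `retries` further attempts.
            if attempts > retries:
                raise


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky() -> str:
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("transient failure")
        return "ok"

    print(run_with_retries(flaky, retries=3, retry_for=[ConnectionError]))
```

Exceptions that are not listed propagate immediately, which is the behavior this model assumes for non-retryable errors.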
Remote
You can run any function remotely on Beam by using the .remote() method:
The code above is invoked by running python example.py:
Map
You can scale out workloads to many containers using the .map() method. You might use this for parallelizing computationally heavy tasks, such as batch inference or data processing jobs.
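The fan-out shape of a map call can be illustrated locally with the standard library. This is an analogy only: Beam is not imported here, threads stand in for remote containers, and `local_map` and `square` are hypothetical names. The local version returns results in input order, which should not be assumed of the remote method.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def square(i: int) -> int:
    """The unit of work applied to each input."""
    return i ** 2


def local_map(fn: Callable[[A], B], inputs: Iterable[A], workers: int = 4) -> List[B]:
    """Apply fn to every input in parallel; threads stand in for containers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, inputs))


if __name__ == "__main__":
    print(local_map(square, range(10)))
```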
Schedule
This method allows you to schedule the decorated function to run at specific intervals defined by a cron expression.
The cron expression or predefined schedule that determines when the task will run. This parameter defines the interval or specific time when the task should execute.
The number of CPU cores allocated to the container.
The amount of memory allocated to the container. It should be specified in megabytes (e.g., 128 for 128 megabytes).
The type or name of the GPU device to be used for GPU-accelerated tasks. If not applicable or no GPU required, leave it empty.
The container image used for the task execution.
The maximum number of seconds a task can run before it times out. Default is 180. Set it to -1 to disable the timeout.
The number of concurrent tasks to handle per container. Modifying this parameter can improve throughput for certain workloads. Workers will share the CPU, Memory, and GPU defined. You may need to increase these values to increase concurrency.
The duration in seconds to keep the task queue warm even if there are no pending tasks. Keeping the queue warm helps to reduce the latency when new tasks arrive. Default is 10s.
The maximum number of tasks that can be pending in the queue. If the number of pending tasks exceeds this value, the task queue will stop accepting new tasks.
An optional URL to send a callback to when a task is completed, timed out, or cancelled.
The maximum number of times a task will be retried if the container crashes.
A list of volumes to be mounted to the container.
A list of secrets that are injected into the container as environment variables.
An optional name for this endpoint, used during deployment. If not specified, you must specify the name at deploy time with the --name argument.
Scheduling Options
Predefined Schedule | Description | Cron Expression |
---|---|---|
@yearly (or @annually ) | Run once a year at midnight on January 1st | 0 0 1 1 * |
@monthly | Run once a month at midnight on the first day of the month | 0 0 1 * * |
@weekly | Run once a week at midnight on Sunday | 0 0 * * 0 |
@daily (or @midnight ) | Run once a day at midnight | 0 0 * * * |
@hourly | Run once an hour at the beginning of the hour | 0 * * * * |
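The table above can be expressed as a lookup from predefined schedule names to cron expressions. The sketch below is a hypothetical helper (`to_cron` is not an SDK function) that accepts either form and returns a five-field cron expression; the check that raw expressions have exactly five fields is an assumption based on the expressions in the table.

```python
PREDEFINED_SCHEDULES = {
    "@yearly": "0 0 1 1 *",
    "@annually": "0 0 1 1 *",
    "@monthly": "0 0 1 * *",
    "@weekly": "0 0 * * 0",
    "@daily": "0 0 * * *",
    "@midnight": "0 0 * * *",
    "@hourly": "0 * * * *",
}


def to_cron(when: str) -> str:
    """Hypothetical helper: resolve a predefined schedule or pass through a cron expression."""
    when = when.strip()
    if when.startswith("@"):
        if when not in PREDEFINED_SCHEDULES:
            raise ValueError(f"unknown predefined schedule: {when!r}")
        return PREDEFINED_SCHEDULES[when]
    if len(when.split()) != 5:
        raise ValueError(f"expected 5 cron fields, got: {when!r}")
    return when


if __name__ == "__main__":
    print(to_cron("@daily"))
    print(to_cron("*/5 * * * *"))
```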
Endpoint
Decorator used for deploying a web endpoint.
The number of CPU cores allocated to the container.
The amount of memory allocated to the container. It should be specified in megabytes (e.g., 128 for 128 megabytes).
The type or name of the GPU device to be used for GPU-accelerated tasks. If not applicable or no GPU required, leave it empty.
The container image used for the task execution.
The maximum number of seconds a task can run before it times out. Default is 180. Set it to -1 to disable the timeout.
The number of concurrent tasks to handle per container. Modifying this parameter can improve throughput for certain workloads. Workers will share the CPU, Memory, and GPU defined. You may need to increase these values to increase concurrency.
The duration in seconds to keep the task queue warm even if there are no pending tasks. Keeping the queue warm helps to reduce the latency when new tasks arrive. Default is 10s.
The maximum number of tasks that can be pending in the queue. If the number of pending tasks exceeds this value, the task queue will stop accepting new tasks.
A function that runs when the container first starts. The return values of the on_start function can be retrieved by passing a context argument to your handler function.
A list of volumes to be mounted to the container.
A list of secrets that are injected into the container as environment variables.
An optional name for this endpoint, used during deployment. If not specified, you must specify the name at deploy time with the --name argument.
If false, allows the endpoint to be invoked without an auth token.
The maximum number of times a task will be retried if the container crashes.
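The relationship between the on_start function and the handler's context argument can be modeled in a few lines. This is a local simulation under stated assumptions: `FakeContext`, `load_model`, and `simulate_container` are hypothetical names, and the only behavior taken from this reference is that the on_start return value reaches the handler through the context (the `on_start_value` field listed in the Context table).

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class FakeContext:
    """Stand-in for the real context; only the field used here is modeled."""

    on_start_value: Optional[Any] = None


def load_model() -> dict:
    # Placeholder for expensive setup, such as loading model weights.
    return {"greeting": "hello"}


def handler(context: FakeContext, name: str) -> str:
    # The handler reads whatever on_start returned from the context.
    model = context.on_start_value
    return f"{model['greeting']}, {name}"


def simulate_container(
    on_start: Callable[[], Any],
    handle: Callable[[FakeContext, str], str],
    requests: List[str],
) -> List[str]:
    """Run on_start once, then serve every request with the same context."""
    context = FakeContext(on_start_value=on_start())
    return [handle(context, request) for request in requests]


if __name__ == "__main__":
    print(simulate_container(load_model, handler, ["ada", "grace"]))
```

The point of the pattern is that setup cost is paid once per container start rather than once per request.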
Serve
beam serve monitors changes in your local file system, live-reloads the remote environment as you work, and forwards remote container logs to your local shell.
Serve is great for prototyping. You can develop in a containerized cloud environment in real-time, with adjustable CPU, memory, and GPU resources.
It’s also great for testing an app before deploying it. Served functions are orchestrated identically to deployments, which means you can test your Beam workflow end-to-end before deploying.
To start an ephemeral serve session, you’ll use the serve command:
By default, Beam will sync all the files in your working directory to the remote container. This allows you to use the files you have locally while developing. If you want to prevent some files from getting uploaded, you can create a .beamignore file.
Task Queue
Decorator for defining a task queue.
This method allows you to create a task queue out of the decorated function.
The tasks are executed asynchronously. You can interact with the task queue either through an API (when deployed), or directly in Python through the .put() method.
The number of CPU cores allocated to the container.
The amount of memory allocated to the container. It should be specified in megabytes (e.g., 128 for 128 megabytes).
The type or name of the GPU device to be used for GPU-accelerated tasks. If not applicable or no GPU required, leave it empty.
The container image used for the task execution.
The maximum number of seconds a task can run before it times out. Default is 180. Set it to -1 to disable the timeout.
The number of concurrent tasks to handle per container. Modifying this parameter can improve throughput for certain workloads. Workers will share the CPU, Memory, and GPU defined. You may need to increase these values to increase concurrency.
The duration in seconds to keep the task queue warm even if there are no pending tasks. Keeping the queue warm helps to reduce the latency when new tasks arrive. Default is 10s.
The maximum number of tasks that can be pending in the queue. If the number of pending tasks exceeds this value, the task queue will stop accepting new tasks.
An optional URL to send a callback to when a task is completed, timed out, or cancelled.
The maximum number of times a task will be retried if the container crashes.
A list of volumes to be mounted to the container.
A list of secrets that are injected into the container as environment variables.
An optional name for this endpoint, used during deployment. If not specified, you must specify the name at deploy time with the --name argument.
A list of exceptions that will trigger a retry.
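The pending-task limit described in the parameters above behaves like a bounded queue: once the limit is reached, further submissions are turned away. The sketch below models that with the standard library, with no consumer draining the queue. `submit_all` is a hypothetical function and `max_pending_tasks` is an assumed parameter name; how the real task queue reports a rejected task is not covered here.

```python
import queue
from typing import Iterable, List, Tuple


def submit_all(tasks: Iterable[int], max_pending_tasks: int) -> Tuple[List[int], List[int]]:
    """Offer every task to a bounded queue; return (accepted, rejected)."""
    if max_pending_tasks <= 0:
        # queue.Queue treats maxsize <= 0 as unbounded, so reject it explicitly.
        raise ValueError("max_pending_tasks must be positive in this model")
    pending: "queue.Queue[int]" = queue.Queue(maxsize=max_pending_tasks)
    accepted: List[int] = []
    rejected: List[int] = []
    for task in tasks:
        try:
            pending.put_nowait(task)
            accepted.append(task)
        except queue.Full:
            rejected.append(task)
    return accepted, rejected


if __name__ == "__main__":
    print(submit_all(range(5), max_pending_tasks=3))
```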
ASGI
Decorator used for creating and deploying an ASGI application.
The number of CPU cores allocated to the container.
The amount of memory allocated to the container. It should be specified in MiB, or as a string with units (e.g., “1Gi”).
The type or name of the GPU device to be used for GPU-accelerated tasks. If not applicable or no GPU required, leave it empty.
The container image used for task execution.
A list of volumes to be mounted to the container.
The maximum number of seconds a task can run before timing out. Set to -1 to disable the timeout.
The maximum number of times a task will be retried if the container crashes.
The number of processes handling tasks per container. Workers share CPU, memory, and GPU resources.
The maximum number of concurrent requests the ASGI application can handle.
The duration in seconds to keep the task queue warm when there are no pending tasks.
The maximum number of tasks that can be pending in the queue.
A list of secrets injected into the container as environment variables.
An optional name for this ASGI application, used during deployment.
If false, allows the ASGI application to be invoked without an auth token.
Configure deployment autoscaling using various strategies.
An optional URL to send a callback when a task is completed, timed out, or canceled.
The task policy for the function, overriding timeout and retries.
Realtime
Decorator for creating a real-time application built on top of ASGI/websockets.
The handler function runs every time a message is received over the websocket.
The number of CPU cores allocated to the container.
The amount of memory allocated to the container. It should be specified in MiB, or as a string with units (e.g., “1Gi”).
The type or name of the GPU device to be used for GPU-accelerated tasks. If not applicable or no GPU is required, leave it empty.
The container image used for task execution.
A list of volumes to be mounted to the ASGI application.
The maximum number of seconds a task can run before timing out. Set to -1 to disable the timeout.
The number of processes handling tasks per container. Workers share CPU, memory, and GPU resources.
The maximum number of concurrent requests the ASGI application can handle. This allows processing multiple requests concurrently.
The duration in seconds to keep the task queue warm even if there are no pending tasks.
The maximum number of tasks that can be pending in the queue.
A list of secrets injected into the container as environment variables.
An optional name for this ASGI application, used during deployment. If not specified, you must provide the name during deployment.
If false, allows the ASGI application to be invoked without an auth token.
Configure a deployment autoscaler to scale the function horizontally using various autoscaling strategies.
An optional URL to send a callback to when a task is completed, timed out, or canceled.
Bot
Decorator for defining a bot with multiple states and transitions.
The bot decorator allows you to define a bot with specific states (locations) and transitions. These bots run as distributed, stateful workflows, where each transition is executed in a remote container.
The underlying language model (e.g., "gpt-4o") used by the bot.
The OpenAI API key used to authenticate requests to OpenAI.
A list of BotLocation objects defining the bot’s states. Each location corresponds to a type (e.g., BaseModel) that the bot operates on.
A human-readable description of the bot’s purpose.
Specifies whether an auth token must be passed to invoke the bot.
The number of CPU cores allocated to the container.
The amount of memory allocated to the container. It should be specified in megabytes (e.g., 128 for 128 megabytes).
The type or name of the GPU device to be used for GPU-accelerated tasks. If not applicable or no GPU required, leave it empty.
The container image used for the task execution.
The maximum number of seconds a task can run before it times out. Default is 180. Set it to -1 to disable the timeout.
The number of concurrent tasks to handle per container. Modifying this parameter can improve throughput for certain workloads. Workers will share the CPU, Memory, and GPU defined. You may need to increase these values to increase concurrency.
The duration in seconds to keep the task queue warm even if there are no pending tasks. Keeping the queue warm helps to reduce the latency when new tasks arrive. Default is 10s.
The maximum number of tasks that can be pending in the queue. If the number of pending tasks exceeds this value, the task queue will stop accepting new tasks.
A function that runs when the container first starts. The return values of the on_start function can be retrieved by passing a context argument to your handler function.
A list of volumes to be mounted to the container.
A list of secrets that are injected into the container as environment variables.
An optional name for this endpoint, used during deployment. If not specified, you must specify the name at deploy time with the --name argument.
If false, allows the endpoint to be invoked without an auth token.
The maximum number of times a task will be retried if the container crashes.
Autoscaling
QueueDepthAutoscaler
Adds an autoscaler to an app.
The number of containers to keep running at baseline. The containers will continue running until the deployment is stopped.
The maximum number of tasks that can be queued up to a single container. This can help manage throughput and cost of compute. When max_tasks_per_container is 0, a container can process any number of tasks.
The maximum number of containers that the autoscaler can create. It defines an upper limit to avoid excessive resource consumption.
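One way to picture how these three settings interact is a simple queue-depth calculation. The formula below is an assumption made for illustration, not Beam's documented scaling algorithm: it divides pending tasks by the per-container limit, then clamps the result between the baseline and the upper limit. The names `min_containers` and `max_containers` are assumed; `max_tasks_per_container` is the name used above.

```python
import math


def desired_containers(
    pending_tasks: int,
    min_containers: int,
    max_tasks_per_container: int,
    max_containers: int,
) -> int:
    """Illustrative queue-depth scaling rule (an assumption, not the SDK's)."""
    if max_tasks_per_container == 0:
        # 0 means one container may take any number of tasks.
        needed = 1 if pending_tasks > 0 else 0
    else:
        needed = math.ceil(pending_tasks / max_tasks_per_container)
    # Clamp to the upper limit, but never drop below the baseline.
    return max(min_containers, min(needed, max_containers))


if __name__ == "__main__":
    for depth in (0, 10, 100):
        print(depth, desired_containers(depth, 1, 3, 5))
```

Under this rule a lower per-container limit scales out sooner, trading compute cost for throughput.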
Data Structures
Simple Queue
Creates a Queue instance.
Use this as a concurrency-safe distributed queue, accessible both locally and within remote containers.
Serialization is done using cloudpickle, so any object that cloudpickle supports should work here. The interface is that of a standard Python queue.
Because this is backed by a distributed queue, it will persist between runs.
The name of the queue (any arbitrary string).
Map
Creates a Map Instance.
Use this as a concurrency-safe key/value store, accessible both locally and within remote containers.
Serialization is done using cloudpickle, so any object that cloudpickle supports should work here. The interface is that of a standard Python dictionary.
Because this is backed by a distributed dictionary, it will persist between runs.
The name of the map (any arbitrary string).
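A practical consequence of a serialized, dictionary-like store is that reads hand back copies rather than live references. The toy class below illustrates this in-process. It is a sketch under stated assumptions: `PickledMap` is hypothetical, it uses the standard library's `pickle` as a stand-in for cloudpickle, and it does not persist or distribute anything.

```python
import pickle
from typing import Any, Dict


class PickledMap:
    """Toy in-process store that serializes values on write and read."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def __setitem__(self, key: str, value: Any) -> None:
        self._data[key] = pickle.dumps(value)

    def __getitem__(self, key: str) -> Any:
        return pickle.loads(self._data[key])

    def __contains__(self, key: str) -> bool:
        return key in self._data

    def __len__(self) -> int:
        return len(self._data)


if __name__ == "__main__":
    store = PickledMap()
    store["config"] = {"learning_rate": 0.1}

    local_copy = store["config"]
    local_copy["learning_rate"] = 0.5  # changes the copy only

    print(store["config"])  # the stored value is unchanged
    store["config"] = local_copy  # write back to update the store
    print(store["config"])
```

If the same holds for a distributed map, updates to a retrieved value need to be written back explicitly.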
Storage
Beam allows you to create highly-available storage volumes that can be used across tasks. You might use volumes for things like storing model weights or large datasets.
Volume
Creates a Volume instance.
When your container runs, your volume will be available at ./{name} and /volumes/{name}.
The name of the volume, a descriptive identifier for the data volume.
The path where the volume is mounted within the container environment.
CloudBucket
Creates a CloudBucket instance.
When your container runs, your cloud bucket will be available at ./{name} and /volumes/{name}.
The name of the cloud bucket, must be the same as the bucket name in the cloud provider.
The path where the cloud bucket is mounted within the container environment.
Configuration for the cloud bucket.
CloudBucketConfig
Configuration for a cloud bucket.
Whether the volume is read-only.
The beam secret name for the S3 access key for the external provider.
The beam secret name for the S3 secret key for the external provider.
The S3 endpoint for the external provider.
The region for the external provider.
Output
A file that a task has created.
Use this to save a file that you may want to share later.
When you run this function, it will return a pre-signed URL to the image:
The length of time the pre-signed URL will be available for. The file will be automatically deleted after this period.
Files
Saving a file and generating a public URL.
PIL Images
Saving a PIL.Image object.
Directories
Saving a directory.
Experimental
Signal
Creates a Signal instance. Signals can be used to notify a container to perform specific actions using a flag.
For example, signals can reload global state, send a webhook, or terminate the container.
This is a great tool for automated retraining and deployment.
The name of the signal.
A function to be called when the signal is set. If not provided, no handler will be executed.
The number of seconds after which the signal will be automatically cleared if both handler and clear_after_interval are set.
Integrations
vllm
A wrapper around the vLLM library that allows you to deploy it as an ASGI app.
The number of CPU cores allocated to the container.
The amount of memory allocated to the container. It should be specified in MiB, or as a string with units (e.g., “1Gi”).
The type or name of the GPU device to be used for GPU-accelerated tasks. If not applicable or no GPU is required, leave it empty.
The container image used for task execution. This will include an add_python_packages call with ["fastapi", "vllm", "huggingface_hub"] added to ensure vLLM can run.
The number of workers to run in the container.
The maximum number of concurrent requests the container can handle.
The number of seconds to keep the container warm after the last request.
The maximum number of pending tasks allowed in the container.
The maximum number of seconds to wait for the container to start.
Whether the endpoints require authorization.
The name of the container. If not specified, you must provide it during deployment.
The volumes to mount into the container. Default is a single volume named “vllm_cache” mounted to ”./vllm_cache”, used as the download directory for vLLM models.
A list of secrets to pass to the container. To enable Hugging Face authentication for downloading models, set the HF_TOKEN in the secrets.
The autoscaler to use for scaling container deployments.
The arguments to configure the vLLM model.
Utils
env
You can use env.is_remote() to only import Python packages when your app is running remotely. This is used to avoid import errors, since your Beam app might be using Python packages that aren’t installed on your local computer.
The alternative to env.is_remote() is to import packages inline in your functions. For more information on this topic, visit this page.
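Both import styles can be sketched without the SDK. In the example below, `is_remote` is a local stand-in for env.is_remote(), and the environment variable `EXAMPLE_RUNNING_REMOTELY` is invented for this illustration; how Beam actually detects the remote environment is not described here. The `json` module stands in for a package that would only be installed in the remote image.

```python
import os


def is_remote() -> bool:
    """Local stand-in for env.is_remote(); the variable name is invented."""
    return os.environ.get("EXAMPLE_RUNNING_REMOTELY") == "1"


if is_remote():
    # Guarded import: runs only in the remote environment, so a package
    # missing on the local machine never triggers an ImportError there.
    import json  # stands in for a remote-only package


def handler(payload: dict) -> str:
    # Inline import: resolved only when the function is actually called.
    import json

    return json.dumps(payload, sort_keys=True)


if __name__ == "__main__":
    print(is_remote())
    print(handler({"b": 1, "a": 2}))
```

The guarded form keeps imports at the top of the file, while the inline form keeps each dependency next to the code that uses it.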