Beam is serverless, which means your apps scale to zero by default. Billing is based on the lifecycle of your containers: you are only charged while your containers are running.

Price List

| Resource | Price (Per Second) | Price (Per Hour) |
| --- | --- | --- |
| CPU | $0.0000528 per core | $0.190 per core |
| RAM | $0.0000056 per GB | $0.020 per GB |
| T4 GPU | $0.000150 | $0.54 |
| RTX 4090 GPU | $0.000192 | $0.69 |
| A10G GPU | $0.000292 | $1.05 |
| A100-40 GPU | $0.000764 | $2.75 |
| A100-80 GPU | $0.000955 | $3.44 |
| H100 GPU | $0.001222 | $4.40 |
| File Storage | Included | Included |
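A container's hourly price is the sum of its CPU, RAM, and GPU rates. Here is a minimal sketch using the hourly rates from the table above; the container shape (2 cores, 4 GB, one T4) is a made-up example:

```python
# Hourly rates copied from the price list above (USD)
CPU_PER_CORE_HOUR = 0.190
RAM_PER_GB_HOUR = 0.020
T4_PER_HOUR = 0.54


def hourly_cost(cores: int, ram_gb: int, gpu_rate_per_hour: float = 0.0) -> float:
    """Estimated hourly cost of a running container."""
    return cores * CPU_PER_CORE_HOUR + ram_gb * RAM_PER_GB_HOUR + gpu_rate_per_hour


# Example: 2 cores, 4 GB RAM, one T4 GPU
cost = hourly_cost(2, 4, T4_PER_HOUR)  # 0.38 + 0.08 + 0.54 = 1.00/hour
```

Because billing is per-second, the actual charge is this rate prorated over the seconds your container runs.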

What am I charged for?

You are charged whenever a container is running. This includes:

  • Booting your application and running your on_start function
  • Running your tasks
  • Keeping the container warm after a request (keep_warm_seconds)

What am I not charged for?

  • Waiting for a machine to start
  • Pulling your container image

Default Container Spin-down Times

After handling a request, Beam keeps containers running (“warm”) for a certain amount of time in order to quickly handle future requests. By default, these are the container “keep warm” times for each deployment type:

| Deployment Type | Container Keep Warm Duration |
| --- | --- |
| Endpoints/ASGI/Realtime | 180s |
| Task Queues | 10s |
| Pods | 600s |
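Because keep-warm time is billed like any other running time, each default spin-down window translates into a fixed per-request idle cost. A rough sketch, using the per-second rates from the price list (the T4 is an arbitrary example):

```python
# Per-second rate from the price list above (USD)
T4_PER_SECOND = 0.000150

# Default keep-warm durations (seconds) per deployment type
KEEP_WARM = {"endpoint": 180, "task_queue": 10, "pod": 600}


def idle_cost(deployment: str, rate_per_second: float) -> float:
    """Cost of one full keep-warm window at the given per-second rate."""
    return KEEP_WARM[deployment] * rate_per_second


# A T4 endpoint idling for its full 180s window costs about $0.027
endpoint_idle = idle_cost("endpoint", T4_PER_SECOND)
```

If a new request arrives before the window expires, the container stays warm and the clock keeps running, so frequently-hit endpoints pay for one continuous stretch rather than repeated cold starts.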

Real-World Example

You’ve deployed a REST API. You’ve added two Python packages to your Image(), which are loaded when your app first starts.

You’ve also set keep_warm_seconds=300, which keeps the container alive for 300 seconds (5 minutes) after each request.

app.py
```python
from beam import endpoint


# This runs once when the container first starts
def load_models():
    return {}


@endpoint(keep_warm_seconds=300, on_start=load_models)
def predict():
    return {}
```

Let’s pretend you deploy this and call the API. Suppose it takes:

  • 1s to boot your application and run your on_start function.
  • 100ms to run your task.
  • 300s to keep the container alive, based on the keep_warm_seconds argument.

You would be billed for a total of 301.1 seconds.
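The arithmetic behind that number is just the sum of the three durations:

```python
boot_s = 1.0         # booting the app and running on_start
task_s = 0.1         # running the task itself (100ms)
keep_warm_s = 300.0  # keep_warm_seconds window after the request

billed_s = boot_s + task_s + keep_warm_s  # 301.1 seconds billed
```

Multiply the billed seconds by your container's per-second resource rates from the price list to get the dollar cost of the request.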