Beam is serverless, which means apps scale to zero by default. We bill by the second of compute used, and we don’t charge for idle time. That said, it’s worth spelling out what counts as compute usage on Beam.

Usage-Based Pricing


  • CPU: $0.00002778/s per core
  • Memory: $0.00000069/s per GB


  • A10G: $0.00046716/s
  • T4: $0.00017222/s
  • A100: $0.00091388/s
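To turn these rates into a dollar figure, multiply each one by the number of billed seconds. The sketch below is a rough estimator, not an official calculator: the function name estimate_cost is hypothetical, and it assumes that CPU, memory, and GPU charges are simply added together.

```python
# Per-second rates in USD, copied from the price list above.
CPU_PER_CORE_S = 0.00002778
MEMORY_PER_GB_S = 0.00000069
GPU_PER_S = {"A10G": 0.00046716, "T4": 0.00017222, "A100": 0.00091388}


def estimate_cost(seconds, cpu_cores, memory_gb, gpu=None):
    """Estimate the charge in USD for one container.

    Assumption: CPU, memory, and GPU charges are summed linearly.
    """
    rate = cpu_cores * CPU_PER_CORE_S + memory_gb * MEMORY_PER_GB_S
    if gpu is not None:
        rate += GPU_PER_S[gpu]  # raises KeyError for an unlisted GPU type
    return seconds * rate


# One hour on a hypothetical 1-core, 8 GB container with a T4 attached.
print(f"${estimate_cost(3600, cpu_cores=1, memory_gb=8, gpu='T4'):.4f}")
```

Because billing is per second, the same function works for a 100ms request or a multi-hour job; only the seconds argument changes.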

What am I charged for?

You’re charged for the time it takes to boot your application and run your code, plus any keep warm time you’ve set. For REST APIs, your container stays alive for 90s after the last request by default.

You can override the default keep warm period by passing keep_warm_seconds to your function decorator. For example, @app.rest_api(keep_warm_seconds=0) turns keep warm off.


Am I charged for cold start?

We only charge for the time spent loading your application code. Here’s a breakdown of a cold start:

  • Node start time. Not charged. This is typically minimal, but can take longer if the system is busy.
  • Image load time. Not charged. Pulling your container image from our image service.
  • Application start time. Charged. Running your code. This includes running any loaders you’ve added to your app.
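The breakdown above can be written as a small lookup that filters out the phases that are not charged. This is only an illustration: the phase names mirror the list, while the function name and the timings are made up for the example.

```python
# Phases of a cold start, in order, with whether each one is billed.
COLD_START_PHASES = [
    ("node start", False),
    ("image load", False),
    ("application start", True),
]


def billable_cold_start_seconds(durations):
    """Sum the durations (in seconds) of the phases that are charged."""
    return sum(
        durations[name] for name, charged in COLD_START_PHASES if charged
    )


# Hypothetical timings for a single cold start.
example = {"node start": 2.0, "image load": 8.0, "application start": 1.5}
print(billable_cold_start_seconds(example))  # 1.5
```

In this example the cold start takes 11.5 seconds end to end, but only the 1.5 seconds of application start time are billed.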

Real-World Example

You’ve deployed a REST API. You’ve added two Python packages to your Image(), which are loaded when your app first starts.

You’ve also set keep_warm_seconds=300, which keeps the container alive for 300 seconds (5 minutes) after each request.

from beam import App, Runtime, Image

app = App(
    name="inference",  # placeholder app name
    runtime=Runtime(
        image=Image(
            python_packages=["transformers", "torch"],
        ),
    ),
)

# This runs once when the container first starts
def load_models():
    return {}

# Keep the container warm for 300 seconds after each request
@app.rest_api(keep_warm_seconds=300, loader=load_models)
def predict():
    return {}

Let’s pretend you deploy this and call the API. Suppose it takes:

  • 1s to boot your application and run your loader function.
  • 100ms to run your task.
  • 300s to keep the container alive, based on the keep_warm_seconds argument.

You would be billed for a total of 301.1 seconds.
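The arithmetic behind that total is a plain sum of the three durations. The sketch below reproduces it and, for comparison, shows the same request with keep warm turned off. The function name billed_seconds is hypothetical, and the sketch assumes a single request hits a cold container and the keep warm window runs its full length.

```python
def billed_seconds(boot_s, run_s, keep_warm_s):
    """Billed time for one request that hits a cold container."""
    return boot_s + run_s + keep_warm_s


with_keep_warm = billed_seconds(boot_s=1.0, run_s=0.1, keep_warm_s=300)
without_keep_warm = billed_seconds(boot_s=1.0, run_s=0.1, keep_warm_s=0)

print(f"{with_keep_warm:.1f} s")     # 301.1 s
print(f"{without_keep_warm:.1f} s")  # 1.1 s
```

Here the keep warm period accounts for nearly all of the billed time, so it is the first setting to revisit for endpoints that receive infrequent traffic.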