# Building AI Agents
Source: https://docs.beam.cloud/v2/agents/introduction

Beam is launching a new type of agent framework that is stateful and has concurrency built in.

<CardGroup>
  <Card title="Other agent frameworks can't..." icon="thumbs-down">
    • Multi-task <br />
    • Automatically synchronize state <br />
    • Run each task in an isolated environment <br />
    • Scale to 100s of GPUs
  </Card>

  <Card title="Beam gives you everything you need" icon="thumbs-up">
    • Sandboxed compute environments <br />
    • Concurrency <br />
    • Task management and queuing <br />
    • Edge deployment and autoscaling <br />
    • Authentication <br />
    • Lots of GPUs
  </Card>
</CardGroup>

## Introduction

Today, most agent frameworks are based on directed acyclic graphs (DAGs). While useful for simple tasks, this limits you to performing one action at a time (i.e. using one tool at a time).

Beam uses a new agentic concurrency model, based on [*Petri nets*](https://en.wikipedia.org/wiki/Petri_net), which can model complex workflows where several tasks run at the same time.

By combining this agent framework with Beam's cloud compute, you can build powerful, parallelized applications.

<Frame>
  <img />
</Frame>

## Core Concepts

Our agent framework has three important components: locations, transitions, and markers.

For this example, suppose we're modeling an eCommerce store.

* **Locations** -- these are specific *states* or *conditions* that hold markers. For example, `in_shopping_cart` or `in_queue`.
* **Transitions** -- events or actions that cause state changes. For example, `place_order` or `accept_payment`. Each transition has:
  * **Inputs** -- the locations the data is consumed from.
  * **Outputs** -- the locations the data is sent to.
* **Markers** -- the typed pieces of data that move between locations. For example, `order_12345_red_shoes`.

<Tip>
  Think of a Petri net as a factory assembly line, where parts (markers) move
  between workstations (locations), and tasks (transitions) are performed when
  all required parts are in place.
</Tip>
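
To make the three concepts concrete, here is a minimal, framework-free sketch of a Petri net in plain Python. It is not Beam's implementation: the `PetriNet` class and its methods are hypothetical names chosen for illustration, and only the location, transition, and marker names come from the eCommerce example above.

```python
from collections import defaultdict


class PetriNet:
    """A toy Petri net: locations hold markers, transitions consume and produce them."""

    def __init__(self):
        self.locations = defaultdict(list)  # location name -> list of markers
        self.transitions = []  # (name, inputs, outputs, action)

    def add_transition(self, name, inputs, outputs, action):
        # inputs: {location: number of markers required}; outputs: [location, ...]
        self.transitions.append((name, inputs, outputs, action))

    def put(self, location, marker):
        self.locations[location].append(marker)

    def enabled(self, inputs):
        # A transition may fire only when every input location holds enough markers.
        return all(len(self.locations[loc]) >= n for loc, n in inputs.items())

    def step(self):
        """Fire the first enabled transition; return its name, or None if none can fire."""
        for name, inputs, outputs, action in self.transitions:
            if not self.enabled(inputs):
                continue
            # Consume the required markers from each input location.
            consumed = {
                loc: [self.locations[loc].pop(0) for _ in range(n)]
                for loc, n in inputs.items()
            }
            # The action returns {output location: [new markers]}.
            produced = action(consumed) or {}
            for loc in outputs:
                self.locations[loc].extend(produced.get(loc, []))
            return name
        return None


net = PetriNet()
net.add_transition(
    "place_order",
    inputs={"in_shopping_cart": 1},
    outputs=["in_queue"],
    action=lambda consumed: {"in_queue": consumed["in_shopping_cart"]},
)

fired_before = net.step()  # nothing in the cart yet, so nothing fires
net.put("in_shopping_cart", "order_12345_red_shoes")
fired_after = net.step()  # the marker moves from the cart to the queue
```

The key design point is the `enabled` check: a transition is driven by the markers available in its input locations rather than by a fixed position in a graph, which is what lets several transitions become ready independently.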

## Initial Setup

Let's set up a simple `hello world` chatbot. This chatbot will respond to messages from a user. It will ask the user for their name, and attempt to update the status of their order.

### Pre-requisites

* A free Beam account, and [Beam installed on your computer](/v2/getting-started/installation)
* An [OpenAI API key](https://platform.openai.com/api-keys)

### Hello World

We'll start by creating a `bot`. The `locations`, `markers`, and `transition` are added in the sections that follow:

```python theme={null}
from beam import Bot


# Create the bot -- make sure to add your own OpenAI API key
bot = Bot(
    model="gpt-4o",
    api_key="YOUR_OPENAI_API_KEY",
    locations=[],
    description="A simple bot to cancel orders.",
)
```

## Managing State

Now we'll add `locations` and `markers`, which represent *state*.

```python app.py {2,6-11,19-20} theme={null}
from beam import Bot, BotContext, BotLocation
from pydantic import BaseModel


# Marker states for the bot
class UserName(BaseModel):
    name: str


class OrderStatus(BaseModel):
    message: str


# Create the bot -- make sure to add your own OpenAI API key
bot = Bot(
    model="gpt-4o",
    api_key="YOUR_OPENAI_API_KEY",
    locations=[
        BotLocation(marker=UserName),
        BotLocation(marker=OrderStatus),
    ],
    description="A simple bot to cancel orders.",
)
```

## Adding Transitions

Let's add our first **transition**. A transition is a state change. It consumes a marker from the `UserName` location and produces a marker in the `OrderStatus` location.

```python app.py {27-33} theme={null}
from beam import Bot, BotContext, BotLocation
from pydantic import BaseModel


# Marker states for the bot
class UserName(BaseModel):
    name: str


class OrderStatus(BaseModel):
    message: str


# Create the bot -- make sure to add your own OpenAI API key
bot = Bot(
    model="gpt-4o",
    api_key="YOUR_OPENAI_API_KEY",
    locations=[
        BotLocation(marker=UserName),
        BotLocation(marker=OrderStatus),
    ],
    description="A simple bot to cancel orders.",
)


# This transition prompts the user for their name and cancels their orders
@bot.transition(
    cpu=1,
    memory=128,
    inputs={UserName: 1},
    outputs=[OrderStatus],
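    description="Cancels a user's order.",
)
def cancel_order(context: BotContext, inputs):
    # The transition logic is added in the next section
    ...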
```

## Interacting with User Input

Let's add basic logic in the transition. We'll accept a username, and update a dict with the user's order status.

### Adding Prompts

We'll introduce a new concept, called `context`, which is a class that provides various convenience methods for your bot.

```python app.py {14-15, 37-46} theme={null}
from beam import Bot, BotContext, BotLocation
from pydantic import BaseModel


# Marker states for the bot
class UserName(BaseModel):
    name: str


class OrderStatus(BaseModel):
    message: str


# Hardcoded user data (mock database)
USER_DATA = {"Alice": "processing", "Bob": "shipped"}

# Create the bot -- make sure to add your own OpenAI API key
bot = Bot(
    model="gpt-4o",
    api_key="YOUR_OPENAI_API_KEY",
    locations=[
        BotLocation(marker=UserName),
        BotLocation(marker=OrderStatus),
    ],
    description="A simple bot to cancel orders.",
)


# This transition prompts the user for their name and cancels their orders
@bot.transition(
    cpu=1,
    memory=128,
    inputs={UserName: 1},
    outputs=[OrderStatus],
    description="Cancels a user's order.",
)
def cancel_order(context: BotContext, inputs):
    # Get the name provided by the user
    user_name = inputs[UserName][0].name

    # Update the user's order status
    USER_DATA[user_name] = "cancelled"

    # Send a message in the chat
    context.say(f"Order cancelled for {user_name}")

    # Return a marker state
    return {OrderStatus: [OrderStatus(message="order_cancelled")]}
```

### Human-in-the-loop

We can also add a confirmation prompt, so that user input is required before the bot can proceed to the next step.

Let's add the `confirm` flag to the transition:

```python app.py {48-51} theme={null}
@bot.transition(
    cpu=1,
    memory=128,
    inputs={UserName: 1},
    outputs=[OrderStatus],
    description="Cancels a user's order.",
    confirm=True
)
```

The bot will only proceed if the user confirms the request.

### Adding a Second Transition

Let's add a second transition, which will issue a refund to the user after they cancel their order.

This transition will fire when an `OrderStatus` marker is created.

```python theme={null}
class RefundStatus(BaseModel):
    message: str

@bot.transition(
    cpu=1,
    memory=128,
    inputs={OrderStatus: 1},
    outputs=[RefundStatus],
    description="Offers a refund to the user after cancelling their order.",
    expose=False,  # The bot won't take this into account when asking the user for input
)
def offer_refund(context: BotContext, inputs):
    # Process the refund (mock logic)
    refund_message = "Your refund for the cancelled order has been processed. You should see it in your account within 3-5 business days."

    # Send a message in the chat
    context.say(refund_message)
```

You'll notice that we're not returning a marker, because this transition marks the end of our workflow. After this transition runs, there's no state left to update.

## Advanced Usage

### Controlling Bot Awareness

Based on the [system prompt](https://github.com/beam-cloud/beta9/blob/main/pkg/abstractions/experimental/bot/prompt.yaml), the bot automatically knows about all the locations and transitions defined in the network. This means that the bot will understand its role based on the data you add to your transitions.

However, you might not want the bot to know about certain transitions or locations!

<Tip>
  If you want certain things hidden from the bot's context, you can pass
  `expose=False` to locations and transitions.
</Tip>

Think of hidden transitions as 'backstage actions' -- users can still interact with them, but the bot won't take them into account in its reasoning.

```python {7} theme={null}
@bot.transition(
    cpu=1,
    memory=128,
    inputs={OrderStatus: 1},
    outputs=[RefundStatus],
    description="Offers a refund to the user after cancelling their order.",
    expose=False,  # This prevents the bot from using this transition in its reasoning
)
```

### Using Context Commands

We provide a number of helper commands using a class called `context`.

Context methods can be used for prompting the user for input, creating blocking requests to the bot, and sending messages to the user.

***Available Commands***

| Method                | Description                                                                                                                                                                   |
| --------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `context.confirm()`   | Pause a transition until a user says yes or no.                                                                                                                               |
| `context.prompt()`    | Send a blocking or non-blocking request to the model (e.g., "summarize these reviews"). You can pass an optional `wait_for_response=False` boolean to make this non-blocking. |
| `context.remember()`  | Add an arbitrary JSON-serializable object to the conversation memory.                                                                                                         |
| `context.say()`       | Output text to the user's chat window.                                                                                                                                        |
| `context.send_file()` | Send a file created during a transition to the user.                                                                                                                          |
| `context.get_file()`  | Retrieve a file from the user during a transition.                                                                                                                            |

## Development Workflow

### Testing

We'll start by running the bot from our shell, as a temporary development server.

```sh theme={null}
$ beam serve app.py:bot
```

This command will spin up a container in the cloud for the bot `transition`, and create an interactive dialogue in your shell.

```sh theme={null}
=> Building image
=> Using cached image
=> Syncing files
=> Files synced
=> Invocation details
websocat 'wss://979563f1-b569-4f2c-8113-dc6ebca007d1.app.beam.cloud'

=> Session started: 062a4f
=> Chat with your bot below...
```

You can interact with your bot by typing into the shell. In your shell, you'll see responses from the bot, as well as event logs from each transition that fires.

```sh theme={null}
{
  "type": "agent_message",
  "value": "Hello! How can I assist you today? If you would like to cancel an order, please provide the user's name to get started.",
  "metadata": {
    "request_id": "b2f58de9-7c93-4eda-9a8d-fc42d4e40561",
    "session_id": "062a4f"
  }
}
# hi - please cancel alice's order
#
{
  "type": "agent_message",
  "value": "I've noted the request to cancel Alice's order. If there's anything else you need, just let me know!",
  "metadata": {
    "request_id": "93c0f077-b663-46a0-b110-d63376c8821f",
    "session_id": "062a4f"
  }
}

{
  "type": "transition_fired",
  "value": "cancel_order",
  "metadata": {
    "session_id": "062a4f",
    "task_id": "bf45f661-daaa-4d70-83f1-0479587fafe9",
    "transition_name": "cancel_order"
  }
}

{
  "type": "transition_started",
  "value": "cancel_order",
  "metadata": {
    "session_id": "062a4f",
    "task_id": "bf45f661-daaa-4d70-83f1-0479587fafe9",
    "transition_name": "cancel_order"
  }
}

{
  "type": "agent_message",
  "value": "\u2705 Order cancelled for alice",
  "metadata": {
    "session_id": "062a4f",
    "task_id": "bf45f661-daaa-4d70-83f1-0479587fafe9",
    "transition_name": "cancel_order"
  }
}

{
  "type": "transition_completed",
  "value": "cancel_order",
  "metadata": {
    "session_id": "062a4f",
    "task_id": "bf45f661-daaa-4d70-83f1-0479587fafe9",
    "transition_name": "cancel_order"
  }
}
```
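
If you capture this output, the event stream is straightforward to process. The sketch below is a stdlib-only illustration, assuming the events arrive as whitespace-separated JSON objects shaped like the ones above; `iter_events` is a hypothetical helper, not part of Beam.

```python
import json


def iter_events(stream: str):
    """Yield each JSON object from a string of whitespace-separated objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(stream):
        # Skip whitespace between objects; raw_decode does not skip it itself.
        if stream[pos].isspace():
            pos += 1
            continue
        event, pos = decoder.raw_decode(stream, pos)
        yield event


# Two events copied (and shortened) from the log above.
log = """
{"type": "transition_started", "value": "cancel_order",
 "metadata": {"session_id": "062a4f", "transition_name": "cancel_order"}}

{"type": "transition_completed", "value": "cancel_order",
 "metadata": {"session_id": "062a4f", "transition_name": "cancel_order"}}
"""

events = list(iter_events(log))
completed = [e["value"] for e in events if e["type"] == "transition_completed"]
```

Filtering on the `type` field like this is one way to tell when a given transition has finished.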

### Deployment

```sh theme={null}
$ beam deploy app.py:bot --name order-bot
```

You can log in to the Beam Dashboard and use the web UI to chat with your bot, view the network graph, and view the event logs for each task.

<Frame>
  <img />
</Frame>

## Creating Public Chatbots

You can also create sharable pages for your chatbot by adding `authorized=False` to your `bot`:

```python app.py {11} theme={null}
from beam import Bot, BotContext, BotLocation

bot = Bot(
    model="gpt-4o",
    api_key="YOUR_OPENAI_API_KEY",
    locations=[
        BotLocation(marker=UserName),
        BotLocation(marker=OrderStatus),
    ],
    description="A simple bot to cancel orders.",
    authorized=False,
)
```

When deployed, you can access a public URL for your bot, which looks like this:

<Frame>
  <img />
</Frame>


# Example: Research Assistant
Source: https://docs.beam.cloud/v2/agents/synchronization


Beam's agent framework is designed for concurrency and synchronization. In this example, we'll show how you can deploy an app that scrapes online product reviews.

You can follow along with the tutorial in the video below.

<iframe title="YouTube video player" />

## Why Beam?

Beam's Petri Net framework is ideal for workflows that require concurrency and scalability. This app uses Beam to:

* **Retrieve Google Shopping URLs** for a product name you provide to the bot.
* **Scrape review pages** for those products.
* **Summarize reviews** into a report.

## Pre-requisites

You'll need three API keys to run the example below:

* [Firecrawl API key](https://docs.firecrawl.dev/introduction) (free), used for scraping product pages
* [SerpApi API key](https://serpapi.com/) (free for 100 searches a month), used to retrieve Google Shopping URLs
* [OpenAI API Key](https://platform.openai.com/docs/quickstart)

Set up your environment variables by adding these keys to a `.env` file in your project directory.

```
OPEN_AI_API_KEY=your_openai_api_key
SERPAPI_API_KEY=your_serpapi_api_key
FIRECRAWL_API_KEY=your_firecrawl_api_key
```
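
A missing key tends to surface only when a transition runs, so it can help to check for all three up front. The sketch below uses only the standard library and the variable names from the `.env` file above; `missing_keys` is a hypothetical helper, and it assumes the variables have already been loaded into the environment (for example with `load_dotenv()`).

```python
import os

REQUIRED_KEYS = ["OPEN_AI_API_KEY", "SERPAPI_API_KEY", "FIRECRAWL_API_KEY"]


def missing_keys(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Example with an explicit mapping instead of the real environment.
example_env = {"OPEN_AI_API_KEY": "your_openai_api_key"}
still_missing = missing_keys(example_env)
```

Calling `missing_keys()` with no arguments checks the real process environment.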

## Setup

### Defining Locations

Locations represent the states of data flowing through the network. In this app, we'll use three states:

* **ProductName**: The product to search for (e.g., "headphones")
* **URL**: URLs of product pages retrieved from Google Shopping
* **ReviewPage**: Online product pages with customer reviews

Define these locations in your code:

```python theme={null}
from pydantic import BaseModel

class ProductName(BaseModel):
    product_name: str

class URL(BaseModel):
    url: str

class ReviewPage(BaseModel):
    review_page: str
```

### Create the Bot

Let's set up the bot, which manages the workflow. Add your API keys and define the locations (states) it will manage.

```python theme={null}
from beam import Bot, BotLocation

bot = Bot(
    model="gpt-4o",
    api_key=OPEN_AI_API_KEY,
    locations=[
        BotLocation(marker=ProductName),
        BotLocation(marker=URL, expose=False),
        BotLocation(marker=ReviewPage, expose=False),
    ],
    description="This bot will take a product category as input, search for reviews, and summarize them.",
)
```

## Adding Transitions

Transitions are events or actions in your bot, triggered by changes to the locations (state).

### Retrieve Product URLs

The first transition takes a product category (e.g., "headphones") and uses SerpApi to retrieve Google Shopping URLs for the product.

<Frame>
  <img />
</Frame>

```python theme={null}
from beam import Image
from serpapi import GoogleSearch

@bot.transition(
    inputs={ProductName: 1},
    outputs=[URL],
    description="Retrieve Google Shopping results for a product.",
    cpu=1,
    memory=128,
    # `GoogleSearch` is provided by the google-search-results package
    image=Image(python_packages=["serpapi", "google-search-results", "python-dotenv"]),
)
def get_product_urls(context, inputs):
    product_name = inputs[ProductName][0].product_name

    params = {
        "engine": "google_shopping",
        "q": product_name,
        "api_key": SERPAPI_API_KEY,
    }

    search = GoogleSearch(params)
    results = search.get_dict()
    urls = results["shopping_results"][:3]

    return {URL: [URL(url=url["product_link"]) for url in urls]}
```

### Scrape Review Pages

The second transition scrapes review pages from each product URL using Firecrawl.

```python theme={null}
from firecrawl import FirecrawlApp
import json

@bot.transition(
    inputs={URL: 1},
    outputs=[ReviewPage],
    description="Scrape review pages for product URLs.",
    cpu=1,
    memory=128,
    image=Image(python_packages=["firecrawl-py", "python-dotenv"]),
    expose=False,
)
def scrape_reviews(context, inputs):
    url = inputs[URL][0].url
    app = FirecrawlApp(api_key=FIRECRAWL_API_KEY)

    scrape_result = app.scrape_url(url, params={"formats": ["markdown"]})
    return {ReviewPage: [ReviewPage(review_page=json.dumps(scrape_result))]}
```

### Summarize Reviews

The final transition summarizes reviews from all the scraped pages into a markdown file.

Pay close attention to the `inputs` field below. **This transition will not begin running until 3 `ReviewPage` markers have been created from the previous transition.**

```python {2} theme={null}
@bot.transition(
    inputs={ReviewPage: 3},
    outputs=[],
    description="Summarize product reviews.",
    cpu=1,
    memory=128,
    image=Image(python_packages=["python-dotenv"]),
    expose=False,
)
def summarize_reviews(context, inputs):
    # Join the review text from every ReviewPage marker (avoids shadowing the built-in `input`)
    all_review_pages = "\n".join([page.review_page for page in inputs[ReviewPage]])

    prompt = f"""
        The following pages contain markdown reviews for products.
        Summarize the key takeaways, including 1-3 direct quotes from reviewers.
        Ensure the product name and URL are included:
        {all_review_pages}
    """

    event = context.prompt(msg=prompt, timeout_seconds=30)
    summary = event.value

    file_path = "/tmp/product-reviews.md"
    with open(file_path, "w") as f:
        f.write(summary)

    context.say("Product reviews summarized successfully!")
    if context.confirm(description="Do you want a sharable link to the summary?"):
        context.send_file(path=file_path, description="Product Review Summary")
```

Once deployed, you'll be able to see the tasks in the dashboard, with the transition waiting until all `ReviewPage` markers have been emitted.

<Frame>
  <img />
</Frame>
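
The fan-out and fan-in behaviour can be pictured with a local analogy. The sketch below is not how Beam schedules transitions: it uses Python threads in place of Beam containers, and `scrape` and `summarize` are stand-in functions with no network access. Only the threshold of 3 comes from the `inputs={ReviewPage: 3}` setting above.

```python
from concurrent.futures import ThreadPoolExecutor

REQUIRED_PAGES = 3  # mirrors inputs={ReviewPage: 3}


def scrape(url: str) -> str:
    # Stand-in for the scrape_reviews transition; no network access.
    return f"reviews for {url}"


def summarize(pages: list) -> str:
    # Stand-in for summarize_reviews; it only runs once enough pages exist.
    return f"summary of {len(pages)} pages"


urls = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]

# Fan out: each URL marker is handled by its own task, concurrently.
with ThreadPoolExecutor(max_workers=len(urls)) as pool:
    review_pages = list(pool.map(scrape, urls))

# Fan in: the summary step is gated on the marker count.
summary = summarize(review_pages) if len(review_pages) >= REQUIRED_PAGES else None
```

Each URL is processed independently, and the summary step is held back until the required number of results exists.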

## Deploying the Bot

```sh theme={null}
$ beam deploy app.py:bot --name product-review-bot
```

Deploying the bot gives you access to a dashboard, where you can interact with the bot using a Chat UI.

<Frame>
  <img />
</Frame>

## What's next?

With the bot deployed, there are a few things you can try:

### Create a Public Chat Page

You can create a public, sharable Chat Page for your bot by adding an `authorized=False` argument to the `bot`:

```python {6} theme={null}
from beam import Bot, BotLocation

bot = Bot(
    model="gpt-4o",
    api_key=OPEN_AI_API_KEY,
    authorized=False,
    locations=[
        BotLocation(marker=ProductName),
        BotLocation(marker=URL, expose=False),
        BotLocation(marker=ReviewPage, expose=False),
    ],
    description="This bot will take a product category as input, search for reviews, and summarize them.",
)
```

When deployed, this gives you a sharable Chat UI. You can retrieve the URL to the Chat UI by clicking next to the "lock" icon.

<Frame>
  <img />
</Frame>

Here's what the Chat UI looks like:

<Frame>
  <img />
</Frame>

### Add Interactivity

We provide a number of helper commands using a class called `context`.

Context methods can be used for prompting the user for input, creating blocking requests to the bot, and sending messages to the user.

***Available Commands***

| Method                | Description                                                                                                                                                                   |
| --------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `context.confirm()`   | Pause a transition until a user says yes or no.                                                                                                                               |
| `context.prompt()`    | Send a blocking or non-blocking request to the model (e.g., "summarize these reviews"). You can pass an optional `wait_for_response=False` boolean to make this non-blocking. |
| `context.remember()`  | Add an arbitrary JSON-serializable object to the conversation memory.                                                                                                         |
| `context.say()`       | Output text to the user's chat window.                                                                                                                                        |
| `context.send_file()` | Send a file to the user from a transition.                                                                                                                                    |
| `context.get_file()`  | Retrieve a file from the user during a transition.                                                                                                                            |

## View The Code

You can see the full code for this example below.

<Accordion title="View The Code">
  ```python theme={null}
  from beam import Bot, BotContext, BotLocation, Image
  from pydantic import BaseModel

  from dotenv import load_dotenv
  import os

  load_dotenv()

  OPEN_AI_API_KEY = os.getenv("OPEN_AI_API_KEY")
  SERPAPI_API_KEY = os.getenv("SERPAPI_API_KEY")
  FIRECRAWL_API_KEY = os.getenv("FIRECRAWL_API_KEY")

  NUMBER_OF_PRODUCT_REVIEWS_TO_SUMMARIZE = 3


  # Define Locations (States)
  class ProductName(BaseModel):
    product_name: str


  class URL(BaseModel):
    url: str


  class ReviewPage(BaseModel):
    review_page: str


  # Create the Bot
  bot = Bot(
    model="gpt-4o",
    api_key=OPEN_AI_API_KEY,
    locations=[
        BotLocation(marker=ProductName),
        BotLocation(marker=URL, expose=False),
        BotLocation(marker=ReviewPage, expose=False),
    ],
    description="This bot will take a product category as input (i.e. 'headphones') and search Google shopping for those products, lookup reviews for each of them, and then summarize the reviews of all products in a summary.",
  )


  # Transition 1: Retrieve 3 Google shopping URLs for each product
  @bot.transition(
    inputs={ProductName: 1},
    outputs=[URL],
    description="Takes a product name and retrieves 3 Google shopping results",
    cpu=1,
    memory=128,
    image=Image(python_packages=["serpapi", "google-search-results", "python-dotenv"]),
  )
  def get_product_urls(context: BotContext, inputs):
    product_name = inputs[ProductName][0].product_name

    from serpapi import GoogleSearch

    params = {
        "engine": "google_shopping",
        "q": product_name,
        "api_key": SERPAPI_API_KEY,
    }

    search = GoogleSearch(params)
    results = search.get_dict()
    urls = results["shopping_results"][:NUMBER_OF_PRODUCT_REVIEWS_TO_SUMMARIZE]

    # Return one URL marker per product
    return {URL: [URL(url=url["product_link"]) for url in urls]}


  # Transition 2: Scrape review page
  @bot.transition(
    inputs={URL: 1},
    outputs=[ReviewPage],
    description="Scrapes the review page for each URL provided.",
    cpu=1,
    memory=128,
    image=Image(python_packages=["firecrawl-py", "python-dotenv"]),
    expose=False,
  )
  def scrape_reviews(context: BotContext, inputs):
    url = inputs[URL][0].url

    import json

    from firecrawl import FirecrawlApp

    app = FirecrawlApp(api_key=FIRECRAWL_API_KEY)

    # Scrape reviews from the product page
    scrape_result = app.scrape_url(url, params={"formats": ["markdown"]})
    print(scrape_result)

    return {ReviewPage: [ReviewPage(review_page=json.dumps(scrape_result))]}


  # Transition 3: Summarize the product reviews
  @bot.transition(
    inputs={ReviewPage: NUMBER_OF_PRODUCT_REVIEWS_TO_SUMMARIZE},
    outputs=[],
    description="Summarizes the reviews.",
    cpu=1,
    memory=128,
    image=Image().add_python_packages(["python-dotenv"]),
    expose=False,
  )
  def summarize_reviews(context: BotContext, inputs):
    try:
        all_review_pages = "\n".join(
            [input.review_page for input in inputs[ReviewPage]]
        )

        print(all_review_pages)

        prompt = f"""
            The following page contains markdown with a review for a product.
            Please highlight the key takeaways from all the reviews,
            and include 1-3 direct quotes from reviewers to support your points.
            In each quote, make sure to cite the name of the reviewer (if available).
            Make sure to include the name of the product, and a URL to buy it, in your response:
            {all_review_pages}
            """

        event = context.prompt(
            msg=prompt,
            timeout_seconds=30,
        )

        context.say("I've summarized product reviews like so: " + event.value)

        file_path = "/tmp/product-reviews.md"
        with open(file_path, "w") as f:
            f.write(event.value)

        if context.confirm(description="Do you want a sharable link to the summary?"):
            context.send_file(path=file_path, description="Summary of product reviews")

    except AttributeError:
        context.say("Review not found.")
  ```
</Accordion>


# Mounting S3 Buckets
Source: https://docs.beam.cloud/v2/data/external-storage

Attach S3 buckets to your apps

Beam allows you to mount your own S3 buckets to your apps. Buckets are mounted using AWS's [mountpoint-s3](https://github.com/awslabs/mountpoint-s3). In general, any provider with an S3-compatible API should work. For instance, [AWS S3](https://aws.amazon.com/s3/), [Cloudflare R2](https://www.cloudflare.com/developer-platform/r2/), and [Tigris](https://www.tigrisdata.com/) all work out of the box.

Mountpoint is optimized for reading large files with high throughput and writing new files from a single client at a time. It does not provide full POSIX compliance. For instance, it does not support appending to files.
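
Because appending is not supported, code that would normally append to a log or results file needs a different pattern. One possible workaround, sketched below, is to write each chunk as a new, uniquely named file and concatenate the parts when reading. The helper names are hypothetical, and a temporary directory stands in for the bucket's mount path so the example runs anywhere.

```python
import os
import tempfile
import uuid


def write_part(directory: str, index: int, text: str) -> str:
    """Write `text` to a brand-new file instead of appending to an existing one."""
    # The zero-padded index keeps parts sortable; the UUID keeps names unique.
    name = f"part-{index:06d}-{uuid.uuid4().hex}.txt"
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write(text)
    return path


def read_parts(directory: str) -> str:
    """Concatenate all parts in index order."""
    names = sorted(n for n in os.listdir(directory) if n.startswith("part-"))
    chunks = []
    for name in names:
        with open(os.path.join(directory, name)) as f:
            chunks.append(f.read())
    return "".join(chunks)


# A temporary directory stands in for the bucket's mount path.
with tempfile.TemporaryDirectory() as mount_path:
    write_part(mount_path, 0, "first line\n")
    write_part(mount_path, 1, "second line\n")
    combined = read_parts(mount_path)
```

Each write creates a new object, which fits the write-new-files access pattern described above.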

<Tip>
  Cloud buckets allow you to expose your own S3-compatible storage as a file
  system.
</Tip>

### Mounting an S3 Bucket

External S3 buckets are special cases of the [volume abstraction](/v2/data/volume) with some extra configuration. To connect your bucket to your app, you need to provide the following:

1. The bucket name
2. An AWS access key
3. An AWS secret key
4. An S3 endpoint (if you're using a non-AWS S3-compatible provider)

These credentials need permission to read, write, and list objects in the bucket. Within AWS, you can use [IAM policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies.html) to control these permissions. Other S3-compatible providers (like Cloudflare R2 and Tigris) often provide these keys when you sign up for their service.
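
As an illustration of the permissions involved, the sketch below builds an AWS IAM policy document for a single bucket. Treat it as a starting point rather than a complete policy: `bucket_policy` is a hypothetical helper, the bucket name `weights` is the example name used on this page, and Mountpoint's own documentation lists further actions (such as `s3:DeleteObject` and `s3:AbortMultipartUpload`) that some operations may need.

```python
import json


def bucket_policy(bucket_name: str) -> dict:
    """Build an IAM policy document granting list, read, and write access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket_name}"],
            },
            {
                # Reads and writes apply to the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            },
        ],
    }


policy_json = json.dumps(bucket_policy("weights"), indent=2)
```

The resulting JSON can be attached to the IAM user or role whose access key you store in Beam.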

You will need to store the access key and secret key in Beam's [secret manager](/v2/environment/secrets). You can do this using the Beam CLI:

```bash theme={null}
beam secret create S3_KEY "your-access-key"
beam secret create S3_SECRET "your-secret-key"
```

When a request is received to start the container, Beam looks up these secrets and uses them to mount the bucket. You can use any names you like for the secrets, as long as they match the names you pass to `CloudBucketConfig`.

<Warning>
  The secrets' names must match the values you enter in the `CloudBucketConfig`.
</Warning>

The endpoint is optional. If you're using AWS S3, you can omit it, but if you're using a non-AWS S3-compatible provider, you will need to provide it.

```python theme={null}
from beam import CloudBucket, CloudBucketConfig, function


mount_path = "./weights"

weights = CloudBucket(
    name="weights",
    mount_path=mount_path,
    config=CloudBucketConfig(
        access_key="S3_KEY",
        secret_key="S3_SECRET",
    ),
)


@function(volumes=[weights])
def sandbox():
    import os
    import uuid

    # Write to the bucket.
    file_name = f"{uuid.uuid4()}.txt"
    file_path = os.path.join(weights.mount_path, file_name)

    try:
        with open(file_path, "w") as f:
            f.write("hello world")
    except Exception as e:
        print(e)

    # Read from the bucket.
    with open(file_path, "r") as f:
        print(f.read())


if __name__ == "__main__":
    sandbox.remote()
```

<Warning>
  The `name` field in the `CloudBucket` constructor must be the name of the
  bucket you created in the cloud provider.
</Warning>

### Read Only Buckets

You can mount your bucket as read only by setting the `read_only` flag to `True`. This will prevent any writes to the bucket.

```python theme={null}
weights = CloudBucket(
    name="weights",
    mount_path="./weights",
    config=CloudBucketConfig(
        access_key="S3_KEY",
        secret_key="S3_SECRET",
        read_only=True,
    ),
)
```

### Specifying a Region

You can specify a region for your bucket by setting the `region` field in the `CloudBucketConfig`. This option can be important when mounting AWS buckets.

From mountpoint's [documentation](https://github.com/awslabs/mountpoint-s3/blob/main/doc/CONFIGURATION.md#region-detection):

> Amazon S3 buckets are associated with a single AWS Region. Mountpoint attempts to automatically detect the region for your S3 bucket at startup time and directs all S3 requests to that region. However, in some scenarios like cross-region mount with a directory bucket, this region detection may fail, preventing your bucket from being mounted and displaying Access Denied or No Such Bucket errors.

```python theme={null}
weights = CloudBucket(
    name="weights",
    mount_path="./weights",
    config=CloudBucketConfig(
        access_key="S3_KEY",
        secret_key="S3_SECRET",
        region="us-east-1",
    ),
)
```

### Egress Fees

If your bucket is in a different region than your Beam container, you might get charged egress fees by your cloud provider. You can read more about AWS S3 egress fees [here](https://aws.amazon.com/s3/pricing/).

<Tip>Tigris and Cloudflare R2 do not charge egress fees.</Tip>


# Ephemeral Files and Images
Source: https://docs.beam.cloud/v2/data/output

Storing ephemeral files for images, audio files, and more.

You may want to save data produced by your tasks. Beam provides an abstraction called `Output`, which allows you to save files or directories and generate public URLs to access them.

## Saving Files

To save an `Output`, you can write any filetype to Beam's `/tmp` directory.

Here's what your code might look like:

```python theme={null}
from beam import function, Output


@function()
def save_output():
    # File is saved to /tmp directory
    file_name = "/tmp/my_output.txt"

    # Write to new text file
    with open(file_name, "w") as f:
        f.write("This is an output, a glorious text file.")

    # Save output
    output_file = Output(path=file_name)
    output_file.save()

    # Generate and return a public URL
    public_url = output_file.public_url(expires=400)
    return {"output_url": public_url}
```

### Directories

You can also create public URLs for directories, by passing in a directory path:

```python theme={null}
# Generate a public URL for a directory
file_path = "./tmp/waveforms"
output = Output(path=file_path)
output.save()

# Returns https://app.beam.cloud/output/id/abe0c95a-2cd1-40b3-bace-9225f2c79c6d
output_url = output.public_url()
```

### PIL Images

If your app uses PIL, `Output` includes a wrapper around PIL to simplify the process of generating a public URL for the PIL image file:

```python theme={null}
# Generate a PIL image (for example, from a diffusion pipeline)
image = pipe(...)

# Persist the PIL image to an Output
output = Output.from_pil_image(image).save()
```

Here's a full example:

```python theme={null}
from beam import Image as BeamImage, Output, function


@function(
    image=BeamImage(
        python_packages=[
            "pillow",
        ],
    ),
)
def save_image():
    from PIL import Image as PILImage

    # Generate PIL image
    pil_image = PILImage.new(
        "RGB", (100, 100), color="white"
    )  # Creating a 100x100 white image

    # Save image file
    output = Output.from_pil_image(pil_image)
    output.save()

    # Retrieve pre-signed URL for output file
    url = output.public_url(expires=400)
    print(url)

    # Print other details about the output
    print(f"Output ID: {output.id}")
    print(f"Output Path: {output.path}")
    print(f"Output Stats: {output.stat()}")
    print(f"Output Exists: {output.exists()}")

    return {"image": url}


if __name__ == "__main__":
    save_image()
```

When you run this function, it will return a pre-signed URL to the image:

```bash theme={null}
https://app.beam.cloud/output/id/abe0c95a-2cd1-40b3-bace-9225f2c79c6d
```

## Generating Public URLs

Your app might return files from the API, such as images or MP3s. You can use `Output` to generate a public URL to access the content.

<Frame>
  <img />
</Frame>

### Expiring Public URLs

You can pass an optional `expires` parameter to `output.public_url` to control how long to persist the file before it is deleted.

<Info>By default, public URLs are automatically deleted after 1 hour.</Info>

```python theme={null}
# Delete public URL after 5 minutes
output.public_url(expires=300)
```
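
Judging by the example above (`expires=300` for 5 minutes), `expires` is a number of seconds. The small helper below is a sketch for stating the duration in more readable units. The name `expires_in` is hypothetical and not part of the Beam SDK.

```python
from datetime import timedelta


def expires_in(**duration) -> int:
    """Convert a duration such as minutes=5 into whole seconds."""
    seconds = int(timedelta(**duration).total_seconds())
    if seconds <= 0:
        raise ValueError("expiry must be a positive duration")
    return seconds


print(expires_in(minutes=5))  # 300
print(expires_in(hours=1))    # 3600
```

With this helper, the call above could be written as `output.public_url(expires=expires_in(minutes=5))`.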


# Distributed Storage Volumes
Source: https://docs.beam.cloud/v2/data/volume

Attach distributed storage volumes to your apps

Beam allows you to create highly-available storage volumes that can be used across tasks. You might use volumes for things like storing model weights or large datasets.

<Tip>
  Beam Volumes are mounted directly to the containers that run your code, so they are more performant than using cloud object storage.

  We strongly recommend storing your data in Beam Volumes for any data you plan to access from your Beam functions.
</Tip>

## How to Write Files in Beam Containers

There are two use cases for saving files: *persistent* files that you want to access between tasks, and *temporary* files that are deleted when your container spins down.

1. **Persisting Files**: write to a volume.
2. **Temporary Files**: write to the `/tmp` directory in your Beam container. For example, you could save an image to `/tmp/myimage.png`.

## Reading and Writing to Volumes

You can read and write files in a Volume just as you would any ordinary file in Python:

```python theme={null}
from beam import function, Volume


VOLUME_PATH = "./model_weights"


@function(
    volumes=[Volume(name="model-weights", mount_path=VOLUME_PATH)],
)
def access_files():
    # Write files to a volume
    with open(f"{VOLUME_PATH}/somefile.txt", "w") as f:
        f.write("This is being written to a file in the volume")

    # Read files from a volume
    with open(f"{VOLUME_PATH}/somefile.txt", "r") as f:
        print(f.readlines())


if __name__ == "__main__":
    access_files()
```

<Warning>
  It can take up to 60 seconds for any files written to a distributed volume to
  become available to other containers.
</Warning>

To run this code, run `python [filename].py`. You'll see it print the text we just wrote to the file.

```
(.venv) $ python reading_and_writing_data.py

=> Building image
=> Using cached image
=> Syncing files
Reading .beamignore file
Collecting files from /Users/beta9/beam/examples/06_volume
=> Files synced
=> Running function: <reading_and_writing_data:access_files>

['This is being written to a file in the volume']

=> Function complete <e1526222-f665-47a5-9377-6f9036de3951>
```
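
Because of the propagation delay noted in the warning above, a container that reads a file written by a different container may need to wait for it to appear. One way to handle this is to poll for the file, as in the sketch below. The helper name `wait_for_file` and its default values are assumptions chosen to match the 60-second figure, not part of the Beam SDK.

```python
import os
import time


def wait_for_file(path: str, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll until `path` exists or `timeout` seconds elapse. Returns True if found."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, `wait_for_file("./model_weights/somefile.txt")` returns `True` as soon as the file is visible, and `False` if it has not appeared after roughly a minute.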

## Creating a Volume

Volumes can be attached to anything you run on Beam.

<Info>
  By default, Volumes are shared across all apps in your Beam account.
</Info>

```python theme={null}
from beam import function, Volume


VOLUME_PATH = "./model_weights"


@function(
    volumes=[Volume(name="model-weights", mount_path=VOLUME_PATH)],
)
def load_model():
    from transformers import AutoModel

    # Load model from cloud storage cache
    AutoModel.from_pretrained(VOLUME_PATH)
```

If you add a volume to your app, it will be created automatically. You can also create volumes manually in the CLI, by using:

```sh theme={null}
$ beam volume create my-volume

  Name         Created At   Updated At   Workspace Name
 ───────────────────────────────────────────────────────
  my-volume    just now     just now     f6fa28
```

## Uploading Data

You can upload files with the CLI using the `beam cp` command.

```sh theme={null}
beam cp [local-file] beam://[volume-name]
```

### Files

```bash theme={null}
beam cp file.txt beam://myvol/              # ./file.txt => beam://myvol/file.txt
beam cp file.txt beam://myvol/file.txt      # ./file.txt => beam://myvol/file.txt
beam cp file.txt beam://myvol/file.new      # ./file.txt => beam://myvol/file.new
beam cp file.txt beam://myvol/hello         # ./file.txt => beam://myvol/hello.txt (keeps the extension)
```

### Directories

```bash theme={null}
beam cp mydir beam://myvol                  # ./mydir/file.txt => beam://myvol/file.txt
beam cp mydir beam://myvol/mydir            # ./mydir/file.txt => beam://myvol/mydir/file.txt
beam cp mydir beam://myvol/newdir           # ./mydir/file.txt => beam://myvol/newdir/file.txt
```

## Downloading Data

### Files

```bash theme={null}
beam cp beam://myvol/file.txt .             # beam://myvol/file.txt => ./file.txt
beam cp beam://myvol/file.txt file.new      # beam://myvol/file.txt => ./file.new
```

### Directories

```bash theme={null}
beam cp beam://myvol/mydir .                # beam://myvol/mydir/file.txt => ./file.txt
```

## CLI Management Commands

### Create a Volume

```bash theme={null}
beam volume create [VOLUME-NAME]
```

```bash theme={null}
$ beam volume create weights

  Name       Created At    Updated At    Workspace Name
 ───────────────────────────────────────────────────────
  weights   May 07 2024   May 07 2024   cf2db0
```

### Delete a Volume

```bash theme={null}
beam volume delete [VOLUME-NAME]
```

```bash theme={null}
$ beam volume delete model-weights

Any apps (functions, endpoints, task queue, etc) that
refer to this volume should be updated before it is deleted.
Are you sure? (y/n) [n]: y

Deleted volume model-weights
```

### List Volumes

```bash theme={null}
beam volume list
```

```bash theme={null}
$ beam volume list

  Name                                Size   Created At   Updated At   Workspace Name
 ─────────────────────────────────────────────────────────────────────────────────────
  weights                       240.23 MiB   2 days ago   2 days ago   cf2db0

  1 volumes | 240.23 MiB used
```

### List Volume Contents

```bash theme={null}
beam ls [VOLUME-NAME]
```

```bash theme={null}
$ beam ls weights

  Name                               Size   Modified Time    IsDir
 ──────────────────────────────────────────────────────────────────
  .locks                           0.00 B   29 minutes ago   Yes
  models--facebook--opt-125m   240.23 MiB   28 minutes ago   Yes

  2 items | 240.23 MiB used
```

### Copy Files to Volumes

```bash theme={null}
beam cp [LOCAL-PATH] beam://[VOLUME-NAME]
```

```bash theme={null}
$ beam cp my-file beam://my-volume

[beam://my-volume/my-file] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.0/10.0 MiB 1.29 MiB/s 0:00:07
```

### Move Files in Volumes

```bash theme={null}
beam mv [SOURCE] [DEST]
```

```bash theme={null}
$ beam mv file.txt files/text-files

Moved file.txt to files/text-files/file.txt
```

### Remove Files from Volumes

```bash theme={null}
beam rm [FILE]
```

```bash theme={null}
=> weights/app.py (1 object deleted)
app.py
```


# Keeping Containers Warm
Source: https://docs.beam.cloud/v2/endpoint/keep-warm

Control how long your apps stay running before shutting down.

By default, Beam is serverless, which means your applications will shut off automatically when they're not being used.

## Configuring Keep Warm

You can control how long your containers are kept alive by using the `keep_warm_seconds` flag in your deployment trigger.

For example, by adding a `keep_warm_seconds=300` argument to an endpoint, your app will stay running for 5 minutes before shutting off:

```python theme={null}
from beam import endpoint


# Container stays alive for 5 min before shutting down automatically
@endpoint(keep_warm_seconds=300)
def handler():
    return {}
```

<Warning>
  When `keep_warm_seconds` is set in your deployment, the time your container
  is kept warm counts as billable usage.
</Warning>

## Setting Always-On Containers

<Note>
  Any running containers count towards billable usage. Take care to avoid
  setting `min_containers` unless you're comfortable paying for usage 24/7.
</Note>

You can configure the number of containers running at baseline using the `min_containers` field.

By setting `min_containers=1`, 1 container will *always* remain running until the deployment is stopped.

<Warning>
  If you redeploy an app that has `min_containers` set, make sure to explicitly
  stop the previous deployment versions in order to avoid running containers
  that you are no longer using.
</Warning>

```python theme={null}
from beam import endpoint, QueueDepthAutoscaler


@endpoint(
    autoscaler=QueueDepthAutoscaler(
        min_containers=1, max_containers=3, tasks_per_container=1
    ),
)
def handler():
    return {"success": "true"}
```

## Pre-Warming Your Container

You can pre-warm your containers by adding `/warmup` to the end of your deployment URL:

```sh theme={null}
curl -X POST 'https://hello-world-a4bdc39-v1.app.beam.cloud/warmup' \
     -H 'Authorization: Bearer [YOUR_TOKEN]'
```

When invoked, this endpoint will send a request to the container to warm up.

You can add `/warmup` to the end of any of your deployment APIs to warm up your container:

```
/id/:stubId/warmup
/:deploymentName/warmup
/:deploymentName/latest/warmup
/:deploymentName/v:version/warmup
```
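
The same warm-up call can be made from Python. The sketch below builds the request with the standard library and does not send it. The URL and token placeholder are taken from the curl example above, and the helper name `build_warmup_request` is hypothetical.

```python
import urllib.request


def build_warmup_request(deployment_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to a deployment's /warmup route."""
    url = deployment_url.rstrip("/") + "/warmup"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


# URL taken from the curl example above; the token is a placeholder
req = build_warmup_request("https://hello-world-a4bdc39-v1.app.beam.cloud", "[YOUR_TOKEN]")
print(req.get_method(), req.full_url)
```

To actually send the request, pass it to `urllib.request.urlopen(req, timeout=30)`.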

## Default Container Spin-down Times

After handling a request, Beam keeps containers running ("warm") for a certain amount of time in order to quickly handle future requests. By default, these are the container "keep warm" times for each deployment type:

| Deployment Type         | Container Keep Warm Duration |
| ----------------------- | ---------------------------- |
| Endpoints/ASGI/Realtime | 180s                         |
| Task Queues             | 10s                          |
| Pods                    | 600s                         |
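
As a rough mental model of the table above, the sketch below checks whether a container would still be warm after a given idle period. It is a simplification and not Beam's scheduling logic: it ignores `min_containers`, and the exact behavior at the boundary is an assumption. The names `DEFAULT_KEEP_WARM` and `is_still_warm`, and the dictionary keys, are hypothetical.

```python
from typing import Optional

# Defaults copied from the table above (seconds)
DEFAULT_KEEP_WARM = {"endpoint": 180, "task_queue": 10, "pod": 600}


def is_still_warm(
    deployment_type: str,
    idle_seconds: float,
    keep_warm_seconds: Optional[int] = None,
) -> bool:
    """Return True if a container idle for `idle_seconds` would still be warm."""
    if keep_warm_seconds is None:
        keep_warm_seconds = DEFAULT_KEEP_WARM[deployment_type]
    return idle_seconds < keep_warm_seconds


print(is_still_warm("endpoint", 120))                         # True
print(is_still_warm("task_queue", 120))                       # False
print(is_still_warm("endpoint", 240, keep_warm_seconds=300))  # True
```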


# Pre-Loading Models
Source: https://docs.beam.cloud/v2/endpoint/loaders

This guide shows how you can optimize performance by pre-loading models when your container first starts.

Beam includes an optional `on_start` lifecycle hook which you can add to your functions. The `on_start` function will be run exactly once when your container first starts.

```python app.py theme={null}
from beam import endpoint


def download_models():
    # Do something that only needs to happen once
    return {}


# The on_start function runs once when the container starts
@endpoint(on_start=download_models)
def handler():
    return {}
```

Anything returned from `on_start` can be retrieved in the `context` variable that is automatically passed to your handler:

```python theme={null}
from beam import endpoint


def download_models():
    # Do something that only needs to happen once
    x = 10
    return {"x": x}


# The on_start function runs once when the container starts
@endpoint(on_start=download_models)
def handler(context):
    # Retrieve cached values from on_start
    on_start_value = context.on_start_value
    return {}
```
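
The relationship between `on_start` and `context` can be pictured with a plain-Python simulation. This is only a mental model, not Beam's implementation: the `Context` class here is a hypothetical stand-in, and the loop stands in for three requests arriving at one container.

```python
class Context:
    """Stand-in for the object passed to the handler (hypothetical, for illustration)."""

    def __init__(self, on_start_value):
        self.on_start_value = on_start_value


calls = {"on_start": 0}


def download_models():
    # Count invocations so we can confirm this runs only once
    calls["on_start"] += 1
    return {"x": 10}


def handler(context):
    # Each request reuses the value produced at startup
    return {"x_doubled": context.on_start_value["x"] * 2}


# Simulate one container start followed by three requests
context = Context(download_models())
responses = [handler(context) for _ in range(3)]

print(calls["on_start"])  # 1
print(responses[0])       # {'x_doubled': 20}
```

The expensive work happens once per container start, and every request after that only reads the cached result.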

## Example: Downloading Model Weights

```python theme={null}
from beam import Image, endpoint, Volume


CACHE_PATH = "./weights"


def download_models():
    from transformers import AutoTokenizer, OPTForCausalLM

    model = OPTForCausalLM.from_pretrained("facebook/opt-125m", cache_dir=CACHE_PATH)
    tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m", cache_dir=CACHE_PATH)

    return model, tokenizer


@endpoint(
    on_start=download_models,
    volumes=[Volume(name="weights", mount_path=CACHE_PATH)],
    cpu=1,
    memory="16Gi",
    gpu="T4",
    image=Image(
        python_version="python3.8",
        python_packages=[
            "transformers",
            "torch",
        ],
    ),
)
def predict(context, prompt):
    # Retrieve cached model from on_start function
    model, tokenizer = context.on_start_value

    # Generate
    inputs = tokenizer(prompt, return_tensors="pt")
    generate_ids = model.generate(inputs.input_ids, max_length=30)
    result = tokenizer.batch_decode(
        generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False
    )[0]

    print(result)

    return {"prediction": result}
```

## Using Loaders with Multiple Workers

<Tip>
  If you are scaling vertically with
  [workers](/v2/scaling/concurrency#increasing-throughput-in-a-single-container),
  the loader function will run once for each worker that starts up.
</Tip>


# Creating a Web Endpoint
Source: https://docs.beam.cloud/v2/endpoint/overview

Deploying and invoking web endpoints on Beam

Beam allows you to deploy web endpoints that can be invoked via HTTP requests. These endpoints can be used to run arbitrary code. For instance, you could perform inference using one of our GPUs, or just run a simple function that multiplies two numbers.

```python theme={null}
from beam import endpoint

@endpoint(
    cpu=1.0,
    memory=128,
)
def multiply(**inputs):
    result = inputs["x"] * 2
    return {"result": result}
```

<Tip>
  **Endpoints vs. Task Queues**

  Endpoints are RESTful APIs, designed for synchronous tasks that can complete in 180 seconds or less. For longer running tasks, you'll want to use an asynchronous [`task_queue`](/v2/task-queue/running-tasks) instead.
</Tip>

### Launch a Preview Environment (Optional)

[`beam serve`](/v2/reference/cli#serve) monitors changes in your local file system, live-reloads the remote environment as you work, and forwards remote container logs to your local shell.

Serve is great for prototyping. You can develop in a containerized cloud environment in real time, with adjustable CPU, memory, and GPU resources.

It's also great for testing an app before deploying it. Served functions are orchestrated identically to deployments, which means you can test your Beam workflow end-to-end before deploying.

To start an ephemeral `serve` session, you'll use the `serve` command:

```sh theme={null}
beam serve [FILE.PY]:[ENTRY-POINT]
```

For example, to start a session for the `multiply` function in `app.py`, run:

```sh theme={null}
beam serve app.py:multiply
```

To end the session, you can use `Ctrl + C` in the terminal where you started the session.

<Warning>
  Serve sessions end automatically after 10 minutes of inactivity. The entire
  duration of the session is counted towards billable usage, even if the session
  is not receiving requests.
</Warning>

<Tip>
  By default, Beam will sync all the files in your working directory to the
  remote container. This allows you to use the files you have locally while
  developing. If you want to prevent some files from getting uploaded, you can
  create a [`.beamignore`](/v2/reference/cli#ignore-local-files).
</Tip>

### Deploying the Endpoint

When you're finished with prototyping and want to make a persistent deployment of the endpoint, enter your shell and run this command from the working directory:

```bash theme={null}
beam deploy [FILE.PY]:[ENTRY-POINT]
```

After running this command, you'll see some logs in the console that show the progress of your deployment.

<Accordion title="Show Logs">
  ```bash theme={null}
  $ beam deploy app.py:multiply

  => Building image
  => Using cached image
  => Syncing files
  Reading .beamignore file
  => Files synced
  => Deploying endpoint
  => Deployed
  => Invocation details

  curl -X POST 'https://multiply-712408b-v1.app.beam.cloud' \
  -H 'Accept: */*' \
  -H 'Accept-Encoding: gzip, deflate' \
  -H 'Connection: keep-alive' \
  -H 'Authorization: Bearer [YOUR_AUTH_TOKEN]' \
  -H 'Content-Type: application/json' \
  -d '{}'

  ```
</Accordion>

<Info>
  The container handling the endpoint will spin down after 180 seconds of inactivity by default. You can change this duration with the `keep_warm_seconds` parameter. The container will be billed for the time it is active and handling requests.
</Info>

### Calling the Endpoint

After deploying the API, you'll be able to make a web request to hit the API with cURL or libraries of your choice.

<Tabs>
  <Tab title="cURL">
    Open another terminal window to invoke the API:

    ### Example Request

    ```sh theme={null}
    curl -X POST 'https://multiply-712408b-v1.app.beam.cloud' \
    -H 'Accept: */*' \
    -H 'Accept-Encoding: gzip, deflate' \
    -H 'Connection: keep-alive' \
    -H 'Authorization: Bearer [YOUR_AUTH_TOKEN]' \
    -H 'Content-Type: application/json' \
    -d '{"x": 10}'
    ```

    ### Example Response

    ```json theme={null}
    {
      "result": 20
    }
    ```
  </Tab>

  <Tab title="Python">
    In Python, you can use the `requests` library to make a POST request to the endpoint:

    ```python theme={null}
    import requests

    url = "https://multiply-712408b-v1.app.beam.cloud"
    headers = {
      "Connection": "keep-alive",
      "Content-Type": "application/json",
      "Authorization": "Bearer [YOUR_AUTH_TOKEN]",
    }
    data = {"x": 10}

    response = requests.post(url, headers=headers, json=data)
    print(response.json())
    ```

    ### Example Response

    ```json theme={null}
    { "result": 20 }
    ```
  </Tab>
</Tabs>

To send payloads other than JSON, you can encode the data as a base64 string and include it in the JSON payload, or upload the file to an S3 bucket and mount the bucket to the endpoint.

For more detailed examples, check out the [Sending File Payloads](/v2/endpoint/sending-file-payloads) documentation.



# Realtime and Streaming
Source: https://docs.beam.cloud/v2/endpoint/realtime


## Deploying a Realtime App

This is a simple example of a realtime streaming app. When deployed, this app will be exposed as a public websocket endpoint.

The `realtime` handler accepts a single parameter, called `event`, with the event payload.

<Tip>
  The `realtime` decorator is an abstraction on top of `asgi`.

  This means that additional parameters in `asgi`, such as [`concurrent_requests`](/v2/endpoint/web-server#concurrent-requests), can be used too.
</Tip>

```python app.py theme={null}
from beam import realtime


@realtime(
    cpu=1,
    memory="1Gi",
    concurrent_requests=10, # Process 10 requests at a time 
    authorized=False, # Don't require auth to invoke 
)
def stream(event):
    # Echo back the event payload sent to the websocket
    return {"response": event}
```

This app can be deployed in traditional Beam fashion:

```sh theme={null}
beam deploy app.py:stream
```

## Streaming Responses from the Client

Any websocket client can connect to a realtime endpoint.

<Tabs>
  <Tab title="Beam Javascript SDK">
    The code below uses the Beam Javascript SDK to send requests to the realtime app.

    Make sure to add an `.env` file to your project with your `BEAM_DEPLOYMENT_ID` and `BEAM_TOKEN`:

    ```javascript client.js theme={null}
    import beam from "@beamcloud/beam-js";

    const streamResponse = async () => {
      const client = await beam.init(process.env.BEAM_TOKEN);
      const deployment = await client.deployments.get({ id: process.env.BEAM_DEPLOYMENT_ID });

      const connection = await deployment.realtime();
      
      const payload = {
        "event": "Echo this back",
      };

      connection.onmessage = (message) => {
        console.log(`Response: ${message.data}`);
      };

      connection.send(JSON.stringify(payload));

      setTimeout(() => {
        connection.close();
      }, 1000);
    };

    streamResponse();
    ```
  </Tab>

  <Tab title="Javascript">
    The code below uses the native WebSocket API to send requests to the realtime app.

    ```javascript client.js theme={null}
    const socket = new WebSocket("wss://1c0f0cbe-e0d1-49ae-a556-5daffe23eb4c.app.beam.cloud");

    // Connection opened
    socket.addEventListener("open", (event) => {
      socket.send("Hello Server!");
    });

    // Listen for messages
    socket.addEventListener("message", (event) => {
      console.log(event.data); // {"response":"Hello Server!"}
    });
    ```
  </Tab>
</Tabs>


# Sending File Payloads
Source: https://docs.beam.cloud/v2/endpoint/sending-file-payloads

Sending file payloads to Endpoints and Web Servers

There are two easy ways to send files to your Beam endpoints and ASGI web servers.

## Sending Files to Endpoints Using Base64

The simplest way to send files to your Beam endpoint is to use Base64 encoding. In the example below, we will use this method to send an image to an endpoint. The first step is to define an endpoint that accepts an encoded string.

```python theme={null}
import base64
import io
from beam import endpoint
from beam import Image as BeamImage
from PIL import Image

@endpoint(name="image_endpoint", image=BeamImage().add_python_packages(["pillow"]))
def image_endpoint(image: str):
    image = base64.b64decode(image)
    image = Image.open(io.BytesIO(image))
    # do something with the image
    return {"message": "Image processed successfully"}
```

We can then deploy our endpoint with the command `beam deploy app.py:image_endpoint`. The simple script below can be used to send an image to the endpoint.

<CodeGroup>
  ```python Python theme={null}
  import base64
  import requests

  with open("./cool-picture.png", "rb") as image_file:
      encoded_string = base64.b64encode(image_file.read())
      b64_image = encoded_string.decode("utf-8")

  url = "https://image-endpoint-53b4230-v1.app.beam.cloud"
  headers = {
      "Connection": "keep-alive",
      "Content-Type": "application/json",
      "Authorization": "Bearer <your-token>",
  }
  data = {"image": b64_image}

  response = requests.post(url, headers=headers, json=data)
  ```

  ```bash Curl theme={null}
  export B64_FILE=$(base64 -i ./cool-picture.png)
  curl -X POST "https://image-endpoint-53b4230-v1.app.beam.cloud" \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <your-token>' \
  -d "{\"image\": \"$B64_FILE\"}"
  ```
</CodeGroup>
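
To make the encoding step concrete, the sketch below runs the full round trip locally with the standard library: bytes are Base64-encoded and wrapped in JSON as the client does, then decoded as the endpoint does. The byte string is placeholder data standing in for an image file.

```python
import base64
import json

# Placeholder bytes standing in for the contents of an image file
original_bytes = b"\x89PNG\r\n\x1a\n fake image data"

# Client side: bytes -> Base64 text -> JSON body
body = json.dumps({"image": base64.b64encode(original_bytes).decode("utf-8")})

# Server side: JSON body -> Base64 text -> bytes
received = json.loads(body)
decoded_bytes = base64.b64decode(received["image"])

print(decoded_bytes == original_bytes)  # True
```

Base64 represents every 3 bytes as 4 characters, so the encoded payload is roughly a third larger than the original file.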

## Using S3 to Send Files

With Beam, you can easily [mount S3 buckets](/v2/data/external-storage) to your endpoints and web servers. This allows you to upload files to S3 and access them in your endpoint or web server. This method is recommended if you are sending large payloads (20+ MB). Another benefit of using S3 is that you will not need to include decoding logic in your endpoint.

We can modify our previous example by accepting a filename and reading the image from a mounted S3 bucket. Our frontend will need to upload the image to the S3 bucket and then pass the filename to our endpoint.

```python theme={null}
import os
from beam import CloudBucket, CloudBucketConfig, endpoint
from beam import Image as BeamImage
from PIL import Image

mount_path = "./uploads"
uploads = CloudBucket(
    name="uploads",
    mount_path=mount_path,
    config=CloudBucketConfig(
        access_key="BEAM_S3_KEY",
        secret_key="BEAM_S3_SECRET",
    ),
)

@endpoint(name="image_endpoint", image=BeamImage().add_python_packages(["pillow"]), volumes=[uploads])
def image_endpoint(image_name: str):
    image_path = os.path.join(uploads.mount_path, image_name)
    image = Image.open(image_path)
    # do something with the image
    return {"message": "Image processed successfully"}
```

In order to correctly mount the S3 bucket, we need to make sure that our secrets are set. We can do this using the Beam CLI.

```bash theme={null}
beam secret create BEAM_S3_KEY "your-access-key"
beam secret create BEAM_S3_SECRET "your-secret-key"
```

Once again, we can deploy our endpoint with the command `beam deploy app.py:image_endpoint`.

To test this method, we can upload an image to the S3 bucket using the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/s3/cp.html) and then pass the filename to our endpoint.

```bash theme={null}
aws s3 cp ./test.png s3://uploads/
```

The image will be uploaded to the S3 bucket and the endpoint will be able to read it. We can verify this by invoking our endpoint with the filename.

```bash theme={null}
curl -X POST "https://image-endpoint-53b4230-v1.app.beam.cloud" \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <your-token>' \
-d '{"image_name": "test.png"}'
```
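
Because `image_name` comes from the request body, it may be worth confirming that the joined path stays inside the mounted bucket before opening it. The sketch below shows one way to do that with `pathlib`. The helper name `resolve_upload` is hypothetical, and this check is an optional hardening step rather than something Beam requires.

```python
from pathlib import Path


def resolve_upload(mount_path: str, image_name: str) -> Path:
    """Resolve image_name inside mount_path, rejecting paths that escape it."""
    root = Path(mount_path).resolve()
    candidate = (root / image_name).resolve()
    # The mount directory must be an ancestor of the resolved file path
    if root not in candidate.parents:
        raise ValueError(f"{image_name!r} is outside of {mount_path!r}")
    return candidate


print(resolve_upload("./uploads", "test.png").name)  # test.png
```

In the endpoint above, `resolve_upload(uploads.mount_path, image_name)` could replace the `os.path.join` call.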


# Versioning
Source: https://docs.beam.cloud/v2/endpoint/versioning


Deployment URLs are versioned in this format:

`https://[APP-NAME]-[APP-ID]-[VERSION].app.beam.cloud`

### Accessing the Latest Version

The latest version of your app will always be available at the root URL. For example, removing the `-v1` suffix from the URL invokes the latest version:

```
https://multiply-712408b.app.beam.cloud
```

### Invoking Specific Versions

You can invoke specific versions of your apps by specifying the version in the app URL.

Here are some examples:

* To invoke latest: `https://multiply-712408b.app.beam.cloud`
* To invoke version `3`: `https://multiply-712408b-v3.app.beam.cloud`
* To invoke version `17`: `https://multiply-712408b-v17.app.beam.cloud`
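
The URL format above can be expressed as a small helper. This is a sketch based only on the examples in this section. The function name `deployment_url` is hypothetical and not part of the Beam SDK.

```python
from typing import Optional


def deployment_url(app_name: str, app_id: str, version: Optional[int] = None) -> str:
    """Build a deployment URL; omit `version` to target the latest version."""
    suffix = f"-v{version}" if version is not None else ""
    return f"https://{app_name}-{app_id}{suffix}.app.beam.cloud"


print(deployment_url("multiply", "712408b"))     # https://multiply-712408b.app.beam.cloud
print(deployment_url("multiply", "712408b", 3))  # https://multiply-712408b-v3.app.beam.cloud
```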


# Hosting a Web Server
Source: https://docs.beam.cloud/v2/endpoint/web-server

Deploying web servers on Beam

With Beam, you can deploy web servers that use the [ASGI](https://asgi.readthedocs.io/en/latest/introduction.html) protocol. This means that you can deploy applications built with popular frameworks like FastAPI and Django.

## Multiple Endpoints Per App

In the example below, we are deploying a FastAPI web server that uses the Hugging Face Transformers library to perform sentiment analysis and text generation.

We also included a warmup endpoint so that we can preemptively get our container ready for incoming requests.

<Info>
  This example uses Pydantic to serialize request inputs. [You can read more
  about it here](https://fastapi.tiangolo.com/tutorial/body/).
</Info>

```python app.py theme={null}
from beam import Image, asgi
from pydantic import BaseModel


# Request payload for API, declared with Pydantic
class GenerateInput(BaseModel):
    text: str
    max_length: int


class SentimentInput(BaseModel):
    text: str


def init_models():
    from transformers import pipeline

    model = "gpt2"

    # Initialize two simple models
    sentiment_analyzer = pipeline("sentiment-analysis")
    text_generator = pipeline("text-generation", model="gpt2")

    return sentiment_analyzer, text_generator, model


@asgi(
    name="sentiment-and-generation",
    image=Image(python_packages=["transformers", "torch", "fastapi", "pydantic"]),
    on_start=init_models,
    memory=2048,
)
def web_server(context):
    import asyncio

    from fastapi import FastAPI, Query

    app = FastAPI()

    sentiment_analyzer, text_generator, generate_model = context.on_start_value

    @app.post("/sentiment")
    async def analyze_sentiment(input: SentimentInput):
        # Unpack request input and send to ML model
        result = sentiment_analyzer(input.text)
        return result

    @app.post("/generate")
    async def generate_text(input: GenerateInput):
        result = text_generator(input.text, max_length=input.max_length)
        return result

    @app.post("/warmup")
    async def warmup():
        return {"status": "warm"}

    return app
```

<Note>
  As shown above, the handler function for the web server must return the ASGI
  app object.
</Note>

## Launch a Preview Environment (Optional)

Just like an endpoint, you can prototype your web server using [`beam serve`](/v2/reference/cli#serve). This command will monitor changes in your local file system, live-reload the remote environment as you work, and forward remote container logs to your local shell.

```sh theme={null}
beam serve app.py:web_server
```

<Warning>
  Serve sessions end automatically after 10 minutes of inactivity. The entire
  duration of the session is counted towards billable usage, even if the session
  is not receiving requests.
</Warning>

## Deploying the Web Server

When you are ready to deploy your web server, run the following command:

```bash theme={null}
beam deploy app.py:web_server
```

You'll see some logs in the console that show the progress of your deployment.

```bash theme={null}
=> Building image
=> Syncing files
...
=> Invocation details
curl -X POST 'https://sentiment-and-generation-53b4230-v1.app.beam.cloud' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer [YOUR_AUTH_TOKEN]' \
-d '{}'
```

<Info>
  The container handling the app will spin down after 180 seconds of inactivity
  by default. You can change this duration with the `keep_warm_seconds` parameter. The
  container will be billed for the time it is active and handling requests.
</Info>

## Sending Requests

If we wanted to generate text using our deployed example from above, we would send a POST request to the `/generate` route like this:

```bash theme={null}
curl -X POST 'https://sentiment-and-generation-53b4230-v1.app.beam.cloud/generate' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer [YOUR_AUTH_TOKEN]' \
-d '{"text": "The meaning of life is ", "max_length": 50}'
```

## Concurrent Requests

When building an ASGI app, you can specify the number of concurrent requests your app can handle using the `concurrent_requests` parameter in the `@asgi` decorator.

```python theme={null}
@asgi(
    name="sentiment-and-generation",
    image=Image(python_packages=["transformers", "torch", "fastapi", "pydantic"]),
    on_start=init_models,
    memory=1024,
    concurrent_requests=10
)
```

This allows you to increase the number of requests your app can handle at once, which can help you achieve higher throughput. For instance, if your app is doing I/O-bound work, additional requests can be handled while your I/O operations complete in the background.

We can simulate this by adding a `model` endpoint that pretends to do some expensive I/O to our example from above.

```python theme={null}
@app.get("/model")
async def model(model: str = Query(...)):
    # Pretend we're doing expensive I/O here to demonstrate the value of concurrent requests
    await asyncio.sleep(10)
    return {"model": model}
```

Now, if you send a request to `model` and then send another request to `generate`, you will see that the second request will complete before the first.

<CodeGroup>
  ```bash Model Request theme={null}
  curl -X GET 'https://sentiment-and-generation-53b4230-v1.app.beam.cloud/model?model=gpt2' \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer [YOUR_AUTH_TOKEN]'
  ```

  ```bash Generate Request theme={null}
  curl -X POST 'https://sentiment-and-generation-53b4230-v1.app.beam.cloud/generate' \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer [YOUR_AUTH_TOKEN]' \
  -d '{"text": "Bananas are ", "max_length": 50}'
  ```
</CodeGroup>
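
The ordering described above can be reproduced locally with plain `asyncio`, without deploying anything. In this sketch the two coroutines stand in for the `/model` and `/generate` routes. The sleep durations are shortened placeholders, and the sleep in `generate` is an assumption standing in for a faster request.

```python
import asyncio

completed = []


async def model():
    # Stands in for the slow /model route (shortened from 10s for the demo)
    await asyncio.sleep(0.2)
    completed.append("model")


async def generate():
    # Stands in for a faster /generate request
    await asyncio.sleep(0.05)
    completed.append("generate")


async def main():
    # Start the slow request first, then the fast one, and let both run concurrently
    slow = asyncio.create_task(model())
    fast = asyncio.create_task(generate())
    await asyncio.gather(slow, fast)


asyncio.run(main())
print(completed)  # ['generate', 'model']
```

Even though `model` starts first, `generate` finishes first because the event loop is free to run it while `model` is waiting on I/O.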

## Response Types

Beam supports various response types, including any FastAPI response type. [You can find a list of FastAPI response types here](https://fastapi.tiangolo.com/advanced/custom-response/).

## Uploading Local Files

If your web server needs access to local files like model weights or other resources, you can use [Beam volumes](/v2/data/volume).

To add files to a volume, you can use the `beam cp` command.

```bash theme={null}
beam cp [local-file] beam://[volume-name]
```

Then, you can define a volume and pass it into your `@asgi` decorator like this:

```python theme={null}
from beam import asgi, Volume, Image

@asgi(
    name="sentiment-analysis",
    image=Image(python_packages=["fastapi"]),
    volumes=[Volume(name="model-weights", mount_path="./model_weights")],
)
def web_server():
    from fastapi import FastAPI

    app = FastAPI()

    @app.get("/")
    async def root():
        with open("./model_weights/somefile.txt", "r") as f:
            return {"message": f.read()}

    return app
```


# Container Images
Source: https://docs.beam.cloud/v2/environment/custom-images


Applications on Beam are run inside *containers*. A container is like a lightweight VM that packages a set of software packages required by your application. The benefit of using containers is portability. The required runtime environment is packaged alongside the application.

Containers are based on container *images*, which are instructions for how a container should be built.

Because you are building a custom application, it is likely that your application depends on some custom software to run. This could include custom Python packages, libraries, binaries, and drivers.

You can customize the container image used to run your Beam application with the [`Image`](/v2/reference/py-sdk#image) class. The options specified in the `Image` class will influence how the image is built.

## Exploring the Beam Image Class

Every application that runs on Beam instantiates the [`Image`](/v2/reference/py-sdk#image) class. This class provides a variety of methods for customizing the container image used to run your application.

It exposes options for:

* Installing a specific version of Python
* Adding custom shell commands that run during the build process
* Adding custom Python packages to install in the container
* Choosing a custom base image to build on top of
* Using a custom Dockerfile to build your own base image
* Setting up a custom conda environment using micromamba

<Tip>
  The default Beam image uses `ubuntu:22.04` as its base and installs Python
  3.10.
</Tip>

```python theme={null}
from beam import function, Image

image = Image()

# This function will use ubuntu:22.04 with Python 3.10
@function(image=image)
def hello_world():
    return "Hello, world!"

hello_world.remote()
```

## Adding Python Packages

The most common way to customize your image is to add the Python packages required by your application. This is done by calling the `add_python_packages` method on the `Image` object with a list of package names.

<Tip>
  Pinning the version of the package is recommended. This ensures that when you
  re-deploy your application, you won't accidentally pick up a new version that
  breaks your application.
</Tip>

```python theme={null}
from beam import Image, endpoint

image = Image(python_version="python3.11").add_python_packages(["numpy==2.2.0"])

@endpoint(image=image)
def handler():
    return {}
```
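
As a quick sanity check before deploying, you could scan your package list for entries that are not pinned. The sketch below is plain Python and does not use the Beam SDK; `find_unpinned` is a hypothetical helper, and it treats only an exact `==` specifier as a pin.

```python
def find_unpinned(packages: list) -> list:
    # A package counts as pinned here only if it uses an exact "==" specifier.
    return [pkg for pkg in packages if "==" not in pkg]


packages = ["numpy==2.2.0", "Pillow", "torch==2.2.1"]
print(find_unpinned(packages))
```

Any name the helper returns is a candidate for adding an explicit version.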

### Importing `requirements.txt`

If you already have a `requirements.txt` file, you can pass its path directly to the `Image` constructor's `python_packages` parameter:

```python theme={null}
from beam import Image, endpoint

image = Image(python_version="python3.11", python_packages="requirements.txt")

@endpoint(image=image)
def handler():
    return {}
```

## Adding Shell Commands

Sometimes, it is necessary to run additional shell commands while building your image. This can be achieved by calling the `add_commands` method on the `Image` object with a list of commands.

For instance, you might need to install `libjpeg-dev` when using the `Pillow` library. In the example below, we'll install `libjpeg-dev` and then install `Pillow`.

```python theme={null}
from beam import Image, endpoint

image = (
    Image(python_version="python3.11")
    .add_commands(["apt-get update", "apt-get install libjpeg-dev -y"])
    .add_python_packages(["Pillow"])
)

@endpoint(image=image)
def handler():
    return {}
```

## Customizing the Base Image

Some applications and libraries require specific dependencies that are not available in the default Beam image. In these cases, you can use a custom base image.

Some of the most common custom base images are the CUDA development images from NVIDIA (e.g. `nvcr.io/nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04`). These images come with additional libraries, debugging tools, and `nvcc` installed.

The image below will use a custom CUDA image as the base.

```python theme={null}
from beam import Image, function

image = Image(
    base_image="nvcr.io/nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04"
)

@function(image=image)
def hello_world():
    return "Hello, world!"

hello_world.remote()
```

### CUDA Drivers & NVIDIA Kernel Drivers

When choosing a custom base image, it is important to understand the difference between the NVIDIA Kernel Driver and the CUDA Runtime & Libraries.

| **Component**                | **Location**     | **Role**                                                 |
| ---------------------------- | ---------------- | -------------------------------------------------------- |
| **NVIDIA Kernel Driver**     | **Host Machine** | Low-level GPU management, talks directly to hardware.    |
| **CUDA Runtime & Libraries** | **Container**    | Provides high-level APIs and libraries for applications. |

The NVIDIA Kernel Driver on the host must support the CUDA version used by the container.

In general, if the CUDA version on the host is greater than or equal to the CUDA version in the container, then the NVIDIA Kernel Driver on the host will support the CUDA version used by the container.

<Tip>
  For example, using a CUDA 12.2 image on a host with a CUDA 12.4 driver will
  work. However, using a CUDA 12.8 image on a host with a CUDA 12.4 driver *will
  not* work.
</Tip>

You can consult the table below to help you choose a compatible base image.

| GPU     | Driver Version | CUDA Version |
| ------- | -------------- | ------------ |
| A10G    | 550.90.12      | 12.4         |
| RTX4090 | 550.127.05     | 12.4         |
| H100    | 550.127.05     | 12.4         |

## Using a Specific Python Version

To install a specific version of Python, you can use the `python_version` parameter:

```python theme={null}
from beam import function, Image


# This function will use ubuntu:22.04 with Python 3.11
@function(image=Image(python_version="python3.11"))
def hello_world():
    return "Hello, world!"

hello_world.remote()
```

This function will use the CUDA image as the base and install Python 3.10 because no `python_version` is specified and the CUDA image has no Python version installed.

```python theme={null}
from beam import Image, function


@function(
    image=Image(
        base_image="nvcr.io/nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04",
    )
)
def custom_image_no_python():
    return "Hello, world!"
```

This function will use the CUDA image as the base and install Python 3.11 because a `python_version` *is* specified.

```python theme={null}
from beam import Image, function


@function(
    image=Image(
        base_image="nvcr.io/nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04",
        python_version="python3.11",
    )
)
def custom_image_python_requested():
    return "Hello, world!"
```

If your image comes with a pre-installed version of Python 3, it will be used by default *as long as* you don't specify a `python_version` in your `Image` constructor. The function below uses the PyTorch image as the base and the Python version that already exists in that image.

```python theme={null}
from beam import Image, function


@function(
    image=Image(
        base_image="docker.io/pytorch/pytorch:2.2.1-cuda12.1-cudnn8-devel"
    )
)
def custom_image_pytorch():
    return "Hello, world!"
```
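
The cases above boil down to a simple precedence order. The sketch below models that order in plain Python; `resolve_python` is a hypothetical function written for illustration and is not how Beam implements the selection internally.

```python
from typing import Optional

DEFAULT_PYTHON = "python3.10"  # Default version stated earlier on this page.


def resolve_python(requested: Optional[str], image_python: Optional[str]) -> str:
    # Hypothetical model of the rules above; not Beam's actual implementation.
    if requested is not None:
        return requested  # An explicit python_version always wins.
    if image_python is not None:
        return image_python  # Otherwise reuse the Python already in the base image.
    return DEFAULT_PYTHON  # Otherwise fall back to the default.


# CUDA image without Python, no python_version given.
print(resolve_python(None, None))
# CUDA image without Python, python_version="python3.11".
print(resolve_python("python3.11", None))
# Image that ships its own Python (placeholder label), no python_version given.
print(resolve_python(None, "python-from-image"))
```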

## Building on GPU

By default, Beam builds your images on CPU-only machines. However, sometimes you might need the build to occur on a machine with a GPU.

For instance, some libraries might compile CUDA kernels during installation. In these cases, you can use the `build_with_gpu()` method to run your build on the GPU of your choice.

```python theme={null}
from beam import Image

image = (
    Image()
    .add_commands(
        [
            "apt-get update -y",
            "apt-get install ffmpeg -y",
            "apt-get install nvidia-cuda-toolkit -y", # Requires GPU to install
        ]
    )
    .build_with_gpu(gpu="T4") # Install on a T4
)
```

## Building with Environment Variables

Often, shell commands require certain environment variables to be set. You can set these using the `with_envs` method:

```python theme={null}
from beam import Image

image = (
    Image()
    .add_python_packages(["huggingface_hub[cli]", "accelerate"])
    .with_envs(["HF_HUB_ENABLE_HF_TRANSFER=1", "HF_HOME=/models"])
    .add_commands(["huggingface-cli download meta-llama/Llama-3.2-3B"])
)
```

### Injecting Secrets

Sometimes, you might not want the environment variables to be set in plain text. In these cases, you can use Beam secrets and the `with_secrets` method:

<Tip>
  You can create secrets like this, using the CLI: `beam secret create HF_TOKEN <your_token>`.
</Tip>

```python theme={null}
from beam import Image

image = (
    Image()
    .add_python_packages(["huggingface_hub[cli]", "accelerate"])
    .with_envs(["HF_HUB_ENABLE_HF_TRANSFER=1", "HF_HOME=/models"])
    .with_secrets(["HF_TOKEN"]) # Models with a user agreement often require a token
    .add_commands(["huggingface-cli download meta-llama/Llama-3.2-3B"])
)
```

**Note:** Adding secrets and environment variables to the build environment *does not* make them available in the runtime environment.

Runtime environment variables and secrets must be specified in the function decorator directly:

```python theme={null}
from beam import function

@function(env_vars={"HF_HOME": "/models"}, secrets=["HF_TOKEN"])
def download_model():
    return "Hello, world!"
```

## Using a Dockerfile

You also have the option to build your own custom base image using a Dockerfile.

The `from_dockerfile()` method accepts a path to a valid Dockerfile as well as an optional path to a context directory:

```python theme={null}
from beam import Image, endpoint

image = Image().from_dockerfile("./Dockerfile").add_python_packages(["numpy"])


@endpoint(image=image, name="test_dockerfile")
def handler():
    return {}
```

The context directory serves as the root for any paths used in commands like `COPY` and `ADD`, meaning all relative paths are relative to this directory.

The image built from your Dockerfile will be used as the base image for a Beam application.

<Info>
  Ports *will not* be exposed in the runtime environment, and the entrypoint
  will be overridden.
</Info>

## Conda Environments

Beam supports using Anaconda environments via [micromamba](https://mamba.readthedocs.io/en/latest/user_guide/micromamba.html). To get started, you can chain the `micromamba` method to your `Image` definition and then specify packages and channels via the `add_micromamba_packages` method.

```python theme={null}
from beam import Image


image = (
    Image(python_version="python3.11")
    .micromamba()
    .add_micromamba_packages(packages=["pandas", "numpy"], channels=["conda-forge"])
    .add_python_packages(packages=["huggingface-hub[cli]"])
    .add_commands(commands=["micromamba run -n beta9 huggingface-cli download gpt2 config.json"])
)
```

You can still use `pip` to install additional packages in the `conda` environment and you can run shell commands too.

<Tip>
  If you need to run a shell command inside the conda environment, you should
  prepend the command with `micromamba run -n beta9` as shown above.
</Tip>


# Custom Registries
Source: https://docs.beam.cloud/v2/environment/custom-registries


Beam supports importing images from custom public and private registries.

## Public Docker Registries

You can import existing images from remote Docker registries, like [Docker Hub](https://hub.docker.com/search?q=), [Google Artifact Registry](https://cloud.google.com/artifact-registry), [ECR](https://aws.amazon.com/ecr/), [GitHub Container Registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry), [NVIDIA](https://catalog.ngc.nvidia.com/containers) and more.

Just supply a `base_image` argument to [`Image`](/v2/reference/py-sdk#image).

```python theme={null}
from beam import endpoint, Image

image = (
    Image(
        base_image="docker.io/nvidia/cuda:12.3.1-runtime-ubuntu20.04",
        python_version="python3.9",
    )
    .add_commands(["apt-get update -y", "apt-get install neovim -y"])
    .add_python_packages(["torch"])
)


@endpoint(image=image)
def handler():
    import torch

    return {"torch_version": torch.__version__}
```

<Warning>
  Beam only supports Debian-based images. In addition, make sure your image is
  built for the correct x86 architecture.
</Warning>

## Private Docker Registries

Beam supports importing images from the following private registries: [AWS ECR](https://aws.amazon.com/ecr/), [Google Artifact Registry](https://cloud.google.com/artifact-registry), [Docker Hub](https://hub.docker.com/), [GitHub Container Registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry), and [NVIDIA Container Registry](https://catalog.ngc.nvidia.com/containers).

Private registries require credentials, and you can pass the credentials to Beam in two ways: as a dictionary, or exported from your shell so Beam can automatically look up the values.

**Passing Credentials as a Dictionary**

You can provide the values for the registry as a dictionary directly, like this:

```python theme={null}
from beam import Image


image = Image(
    base_image="111111111111.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
    base_image_creds={
      "AWS_ACCESS_KEY_ID": "xxxx",
      "AWS_SECRET_ACCESS_KEY": "xxxx",
      "AWS_REGION": "xxxx"
    },
)
```

**Passing Credentials from your Environment**

Alternatively, you can export your credentials in your shell and pass the environment variable names to `base_image_creds` as a list:

```python theme={null}
from beam import Image


image = Image(
    base_image="111111111111.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
    base_image_creds=[
        "AWS_ACCESS_KEY_ID",
        "AWS_SECRET_ACCESS_KEY",
        "AWS_SESSION_TOKEN",
        "AWS_REGION"
    ],
)
```
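
Conceptually, the list form is shorthand for building the dictionary from your shell environment. The sketch below illustrates that idea in plain Python and can double as a pre-flight check that every variable is exported. `resolve_creds` is a hypothetical helper, and how Beam performs the lookup internally is an assumption.

```python
import os


def resolve_creds(names: list) -> dict:
    # Hypothetical helper: look up each variable name in the local environment.
    missing = [name for name in names if name not in os.environ]
    if missing:
        raise KeyError(f"Missing environment variables: {missing}")
    return {name: os.environ[name] for name in names}


os.environ["AWS_REGION"] = "us-east-1"  # Placeholder value for the demo.
print(resolve_creds(["AWS_REGION"]))
```

Raising on a missing name surfaces a forgotten `export` locally, before the image build starts.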

### AWS ECR

To use a private image from Amazon ECR, export your AWS environment variables. Then configure the Image object with those environment variables.

<Tip>
  You can authenticate with either your static AWS credentials or an AWS STS
  token. If you use the AWS STS token, your `AWS_SESSION_TOKEN` key must also be
  set.
</Tip>

```python theme={null}
from beam import Image, endpoint

image = Image(
    python_version="python3.12",
    base_image="111111111111.dkr.ecr.us-east-1.amazonaws.com/my-image:latest",
    base_image_creds=["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION"]
)

@endpoint(image=image)
def handler():
    pass
```

### GCP Artifact Registry

To use a private image from Google Artifact Registry, export your access token.

```sh theme={null}
export GCP_ACCESS_TOKEN=$(gcloud auth print-access-token --project=my-project)
```

Then configure the Image object to use the environment variable.

```python theme={null}
from beam import Image, endpoint

image = Image(
    python_version="python3.12",
    base_image="us-east4-docker.pkg.dev/my-project/my-repo/my-image:0.1.0",
    base_image_creds=["GCP_ACCESS_TOKEN"]
)

@endpoint(image=image)
def handler():
    pass
```

### NVIDIA GPU Cloud (NGC)

To use a private image from NVIDIA GPU Cloud, export your API key.

```sh theme={null}
export NGC_API_KEY=abc123
```

Then configure the Image object to use the environment variable.

```python theme={null}
from beam import Image, endpoint

image = Image(
    python_version="python3.12",
    base_image="nvcr.io/nvidia/tensorrt:24.10-py3",
    base_image_creds=["NGC_API_KEY"]
)

@endpoint(image=image)
def handler():
    pass
```

### Docker Hub

To use a private image from Docker Hub, export your Docker Hub credentials.

```sh theme={null}
export DOCKERHUB_USERNAME=user123
export DOCKERHUB_PASSWORD=pass123
```

Then configure the Image object with those environment variables.

```python theme={null}
from beam import Image, endpoint

image = Image(
    python_version="python3.12",
    base_image="docker.io/my-org/my-image:0.1.0",
    base_image_creds=["DOCKERHUB_USERNAME", "DOCKERHUB_PASSWORD"]
)

@endpoint(image=image)
def handler():
    pass
```

### GitHub Container Registry

To use a private image from GitHub Container Registry, export your GitHub credentials. You will need a [personal access token](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry#authenticating-to-the-container-registry).

```sh theme={null}
export GITHUB_USERNAME=user123
export GITHUB_TOKEN=token123
```

Then configure the Image object with those environment variables.

```python theme={null}
from beam import Image, endpoint

image = Image(
    python_version="python3.12",
    base_image="ghcr.io/my-username/my-image:0.1.0",
    base_image_creds=["GITHUB_USERNAME", "GITHUB_TOKEN"]
)

@endpoint(image=image)
def handler():
    pass
```


# GPU Acceleration
Source: https://docs.beam.cloud/v2/environment/gpu


## Running Tasks on GPU

You can run any code on a cloud GPU by passing a `gpu` argument in your function decorator.

```python theme={null}
from beam import endpoint


@endpoint(gpu="H100")
def handler():
    # Print the output of nvidia-smi, which lists the GPU and driver version
    import subprocess
    print(subprocess.check_output(["nvidia-smi"]).decode())

    return {"gpu":"true"}
```

### Available GPUs

Currently available GPU options are:

* `A10G` (24Gi)
* `RTX4090` (24Gi)
* `H100` (80Gi)

### Check GPU Availability

Run `beam machine list` to check whether a machine is available.

```bash theme={null}
$ beam machine list

  GPU Type   Available
 ──────────────────────
  A10G          Yes
  RTX4090       Yes
```

## Prioritizing GPU Types

You can split traffic across multiple GPUs by passing a list to the `gpu` parameter.

The list is ordered by priority. You can choose which GPUs to prioritize by specifying them at the front of the list.

```python theme={null}
gpu=["T4", "A10G", "H100"]
```

In this example, the `T4` is prioritized over the `A10G`, followed by the `H100`.
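
The ordering can be pictured as a first-match search over the list. This is only an illustrative sketch in plain Python: `pick_gpu` is a hypothetical function, the `available` set is made up for the demo, and Beam's scheduler makes the real decision on the platform side and may take other factors into account.

```python
from typing import Optional


def pick_gpu(priority: list, available: set) -> Optional[str]:
    # Walk the list front to back and return the first GPU type with capacity.
    for gpu in priority:
        if gpu in available:
            return gpu
    return None  # Nothing in the list is currently available.


print(pick_gpu(["T4", "A10G", "H100"], available={"A10G", "H100"}))
```

With no `T4` capacity in this made-up scenario, the next entry in the list, `A10G`, is chosen.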

## Using Multiple GPUs

You can run workloads across multiple GPUs by using the `gpu_count` parameter.

<Warning>
  This feature is available *by request only*. Please send us a message in
  Slack, and we'll enable it on your account.
</Warning>

```python theme={null}
from beam import endpoint


@endpoint(gpu="A10G", gpu_count=2)
def handler():
    return {"hello": "world"}
```

## GPU Regions

Beam runs on servers distributed around the world, with primary locations in the United States, Europe, and Asia. If you would like your workloads to run in a specific region of the globe, [please reach out](https://join.slack.com/t/beam-cloud/shared_invite/zt-3enuvj3r7-OeAzVPYvyqQHy9avNrLL0w).


# Working in Jupyter Notebooks
Source: https://docs.beam.cloud/v2/environment/jupyter-notebook


You can run Beam functions from Jupyter Notebook cells, which is useful for outsourcing heavy computation to Beam's serverless cloud.

Beam works in local notebooks and cloud notebooks like Google Colab.

<video />

<Card title="View the Example Notebook" icon="github" href="https://github.com/beam-cloud/examples/blob/main/jupyter_notebooks/beam-notebook.ipynb">
  Try out an example Jupyter Notebook
</Card>

## Initial Setup

The first step is installing `beam-client` and adding your Beam credentials to the notebook:

```
# Colab Setup: Install beam-client
!pip install beam-client

# Import the Beam client
import beam

# Add your Beam API key
!beam configure default --token [YOUR-BEAM-TOKEN]

!beam config select default
```

## Running Functions

Your local notebook server has access to the Beam credentials on your computer, so you can run Beam functions in the notebook cells like you normally would.

You can run GPU-accelerated functions, mount storage volumes, and use the full functionality of Beam from the notebook.

<Frame>
  <img />
</Frame>

## Launching a Local Notebook Server

You can spin up a local Jupyter notebook server using the `jupyter` CLI.

**If you already have a local Jupyter environment, you can skip this step.**

If you don't have it installed yet, you can do it with `pip`:

```sh theme={null}
pip3 install --upgrade pip && pip3 install jupyter
```

Launch the notebook server. This typically opens the server on `localhost:8888`.

```sh theme={null}
jupyter notebook
```


# Remote vs. Local Environment
Source: https://docs.beam.cloud/v2/environment/remote-versus-local


## Differences Between the Remote and Local Environments

Apps that run on Beam often use packages that you don't have installed locally.

When that's the case, you need to make sure your local Python interpreter doesn't try to import those packages.

## Avoiding Import Errors

There are two ways to avoid import errors when using packages that aren't installed locally.

### Import Packages Inline

Importing packages inline is safe because the functions will only be invoked in the remote Beam environment that has these packages installed.

```python theme={null}
from beam import endpoint, Image


@endpoint(image=Image(python_packages=["torch", "pandas", "numpy"]))
def handler():
    import torch
    import pandas
    import numpy
```
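
The reason this works is that Python executes an `import` statement inside a function only when the function is called, not when it is defined. The sketch below demonstrates this with plain Python and no Beam decorator; `package_not_installed_locally` is a deliberately missing, hypothetical module name.

```python
def handler():
    # The import runs only when handler() is called, not when it is defined.
    import package_not_installed_locally  # Hypothetical, deliberately missing module.

    return package_not_installed_locally.__name__


# Defining the function above succeeded even though the package is missing.
try:
    handler()
except ImportError as exc:
    print("Import was deferred until the call:", exc)
```

Locally the function body never needs to run, so the missing package only matters in the remote environment, where it is installed.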

### Use `env.is_remote()`

An alternative to using inline imports is to use a special check called `env.is_remote()` to conditionally import packages *only* when inside the remote environment.

```python theme={null}
from beam import env


if env.is_remote():
    import torch
    import pandas 
    import numpy 
```

This command checks whether the Python script is running remotely on Beam, and will only try to import the packages in its scope if it is.

<Warning>
  While it might be tempting to use the `env.is_remote()` flag for other logic in your app, this command should only be used for package imports.
</Warning>


# CPU and RAM
Source: https://docs.beam.cloud/v2/environment/resources


## Configuring CPU and Memory

In addition to choosing a GPU, you can choose the amount of CPU and Memory to allocate:

```python theme={null}
from beam import function

@function(cpu=2, memory="2Gi")
def some_function():
    pass
```

*GPU graphics cards* have VRAM and run on *servers* with RAM.

### RAM vs. VRAM

VRAM is the amount of memory available on the GPU device. For example, if you are running inference on a 13B parameter LLM, you'll usually need at least 40Gi of VRAM in order for the model to be loaded onto the GPU.

In contrast, RAM is the memory available to the CPU on the server. For example, if you try downloading a 20Gi file, you'll need sufficient disk space and RAM.

In the context of LLMs, here are some approximate guidelines for resources to use in your apps:

| LLM Parameters | Recommended CPU | Recommended Memory (RAM) | Recommended GPU  |
| -------------- | --------------- | ------------------------ | ---------------- |
| 0-7B           | 2               | 32Gi                     | A10G (24Gi VRAM) |
| 7-14B+         | 4               | 32Gi                     | H100 (80Gi VRAM) |
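
A rough lower bound on VRAM can be worked out from the parameter count alone. The sketch below is a back-of-the-envelope estimate in plain Python; `weights_gib` is a hypothetical helper, and the 2-bytes-per-parameter default assumes fp16 or bf16 weights.

```python
def weights_gib(num_params: float, bytes_per_param: int = 2) -> float:
    # Memory for the weights alone: parameter count x bytes per parameter.
    # 2 bytes per parameter corresponds to fp16/bf16 weights, 4 bytes to fp32.
    return num_params * bytes_per_param / 2**30


print(f"7B  fp16: {weights_gib(7e9):.1f} GiB")
print(f"13B fp16: {weights_gib(13e9):.1f} GiB")
print(f"13B fp32: {weights_gib(13e9, bytes_per_param=4):.1f} GiB")
```

This covers the weights only. Activations, the KV cache, and framework overhead come on top, which is why the recommendations above leave headroom.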

### Monitoring Resource Usage

In the web dashboard, you can monitor the amount of CPU, Memory, and GPU memory used for your tasks.

On a deployment, click the `Metrics` button.

<Frame>
  <img />
</Frame>

On this page, you can see the resource usage over time. The graph will also show the periods when your resource usage exceeded the resource limits set on your app:

<Frame>
  <img />
</Frame>


# Storing Secrets
Source: https://docs.beam.cloud/v2/environment/secrets

How to store secrets and environment variables in Beam

### Storing Secrets and Environment Variables

Secrets and environment variables can be injected into the containers that run your apps.

You can manage secrets through the CLI:

```bash theme={null}
$ beam secret create AWS_ACCESS_KEY ASIAY34FZKBOKMUTVV7A

=> Created secret with name: 'AWS_ACCESS_KEY'
```

### Using Secrets

Once created, you can access a secret like an environment variable:

```python theme={null}
from beam import function


@function(secrets=["AWS_ACCESS_KEY"])
def handler():
    import os

    my_secret = os.environ["AWS_ACCESS_KEY"]
    print(f"Secret: {my_secret}")
```
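
If a secret is missing from the decorator, `os.environ[...]` raises a bare `KeyError`. One option is to wrap the lookup so the failure message says what to fix. The sketch below is plain Python; `require_secret` is a hypothetical helper, and the assignment to `os.environ` only simulates the value Beam would inject.

```python
import os


def require_secret(name: str) -> str:
    # Hypothetical helper: fail fast with a clear message if a secret is missing.
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"Secret {name!r} is not set; add it to secrets=[...]")
    return value


os.environ["AWS_ACCESS_KEY"] = "example-value"  # Simulates the injected secret.
print(len(require_secret("AWS_ACCESS_KEY")))
```

Printing only the length avoids writing the secret value itself to the logs.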

### Passing Secrets to `on_start`

If your app uses an `on_start` function, secrets can be passed to that function as well.

```python theme={null}
from beam import endpoint


# This has access to secrets passed down in the handler
def load_models():
    import os

    my_secret = os.environ["AWS_ACCESS_KEY"]
    print("The function can read secrets:", my_secret)


@endpoint(
    secrets=["AWS_ACCESS_KEY"],
    on_start=load_models,
)
def handler(context):
    return {}
```

## CLI Commands

### List Secrets

```bash theme={null}
beam secret list
```

```bash theme={null}
$ beam secret list

  Name             Last Updated     Created
 ──────────────────────────────────────────────────
  AWS_KEY          19 hours ago     19 hours ago
  AWS_ACCESS_KEY   20 seconds ago   20 seconds ago
  AWS_REGION       7 seconds ago    7 seconds ago

  3 items
```

### Create a Secret

```bash theme={null}
beam secret create [KEY] [VALUE]
```

```bash theme={null}
$ beam secret create AWS_ACCESS_KEY ASIAY34FZKBOKMUTVV7A

=> Created secret with name: 'AWS_ACCESS_KEY'
```

<Warning>
  If your secret contains special characters, you may need to escape them with a
  backslash. For example, `a$b` would need to be `a\$b`.
</Warning>
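
If you build the command from a script, Python's standard-library `shlex.quote` can do the quoting for you. It wraps the value in single quotes, which POSIX shells treat literally, as an alternative to backslash escaping. In this sketch, `MY_SECRET` and the value are placeholders.

```python
import shlex

secret_value = "a$b"
# shlex.quote wraps the value in single quotes so the shell does not expand "$b".
command = f"beam secret create MY_SECRET {shlex.quote(secret_value)}"
print(command)
```

The resulting string can be pasted into a POSIX shell without the `$b` being expanded.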

### Show a Secret

```bash theme={null}
beam secret show [KEY]
```

```bash theme={null}
$ beam secret show AWS_ACCESS_KEY

=> Secret 'AWS_ACCESS_KEY': ASIAY34FZKBOKMUTVV7A
```

### Modify a Secret

```bash theme={null}
beam secret modify [KEY] [VALUE]
```

```bash theme={null}
$ beam secret modify AWS_ACCESS_KEY ASIAY34FZKBOKMUTVV7A

=> Modified secret 'AWS_ACCESS_KEY'
```

### Delete a Secret

```bash theme={null}
beam secret delete [KEY]
```

```bash theme={null}
$ beam secret delete AWS_ACCESS_KEY

=> Deleted secret 'AWS_ACCESS_KEY'
```


# Serverless ComfyUI
Source: https://docs.beam.cloud/v2/examples/comfy-ui


This guide shows how to deploy a ComfyUI server on Beam using [`Pod`](/v2/pod/web-service). We'll set up a server to generate images with [Flux1 Schnell](https://huggingface.co/Comfy-Org/flux1-schnell), but you can easily adapt it to use other models like Stable Diffusion v1.5.

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/blob/main/image_generation/comfy_ui/app.py">
  See the code for this example on GitHub.
</Card>

<Frame>
  <img />
</Frame>

## Setting Up the ComfyUI Server

1. **Create the Deployment Script**

   Create a file named `app.py` with the following code. This script sets up a Beam `Pod` with ComfyUI, installs dependencies, downloads the Flux1 Schnell model, and launches the server.

   ```python theme={null}
   from beam import Image, Pod

   ORG_NAME = "Comfy-Org"
   REPO_NAME = "flux1-schnell"
   WEIGHTS_FILE = "flux1-schnell-fp8.safetensors"
   COMMIT = "f2808ab17fe9ff81dcf89ed0301cf644c281be0a"

   image = (
       Image()
       .add_commands(["apt update && apt install git -y"])
       .add_python_packages(
           [
               "fastapi[standard]==0.115.4",
               "comfy-cli==1.3.5",
               "huggingface_hub[hf_transfer]==0.26.2",
           ]
       )
       .add_commands(
           [
               "comfy --skip-prompt install --nvidia --version 0.3.10",
               "comfy node install was-node-suite-comfyui@1.0.2",
               "mkdir -p /root/comfy/ComfyUI/models/checkpoints/",
               f"huggingface-cli download {ORG_NAME}/{REPO_NAME} {WEIGHTS_FILE} --cache-dir /comfy-cache",
               f"ln -s /comfy-cache/models--{ORG_NAME}--{REPO_NAME}/snapshots/{COMMIT}/{WEIGHTS_FILE} /root/comfy/ComfyUI/models/checkpoints/{WEIGHTS_FILE}",
           ]
       )
   )

   comfyui_server = Pod(
       image=image,
       ports=[8000],
       cpu=12,
       memory="32Gi",
       gpu="H100",
       entrypoint=["sh", "-c", "comfy launch -- --listen 0.0.0.0 --port 8000"],
   )

   res = comfyui_server.create()
   print("ComfyUI hosted at:", res.url)
   ```

2. **Start ComfyUI**

   ```bash theme={null}
   python app.py
   ```

   This deploys the ComfyUI server to Beam. After deployment, you'll see a URL (e.g., `https://pod-12345.apps.beam.cloud`) where your server is hosted.

   <Warning>
     ComfyUI takes a minute or two to start after deploying it for the first
     time.
   </Warning>

3. **Accessing the Server**

   * Open the URL from your terminal in a browser to access the ComfyUI interface.
   * Use the web UI to load workflows or generate images.

## Using Different Models

You can swap the Flux1 Schnell model for another, such as Stable Diffusion v1.5, by updating the model variables in `app.py`. Here’s how:

1. **Update the Model Variables**

   Define the organization, repository, weights file, and commit ID for your desired model. For example, to use Stable Diffusion v1.5:

   ```python theme={null}
   ORG_NAME = "Comfy-Org"
   REPO_NAME = "stable-diffusion-v1-5-archive"
   WEIGHTS_FILE = "v1-5-pruned-emaonly-fp16.safetensors"
   COMMIT = "21e044065c0b2d82dafd35397a553847c70c0445"
   ```

2. **Apply to the Image Commands**

   The rest of the script uses these variables, so no further changes are needed to the `image` section:

   ```python theme={null}
   image = (
       Image()
       .add_commands(["apt update && apt install git -y"])
       .add_python_packages(
           [
               "fastapi[standard]==0.115.4",
               "comfy-cli==1.3.5",
               "huggingface_hub[hf_transfer]==0.26.2",
           ]
       )
       .add_commands(
           [
               "comfy --skip-prompt install --nvidia --version 0.3.10",
               "comfy node install was-node-suite-comfyui@1.0.2",
               "mkdir -p /root/comfy/ComfyUI/models/checkpoints/",
               f"huggingface-cli download {ORG_NAME}/{REPO_NAME} {WEIGHTS_FILE} --cache-dir /comfy-cache",
               f"ln -s /comfy-cache/models--{ORG_NAME}--{REPO_NAME}/snapshots/{COMMIT}/{WEIGHTS_FILE} /root/comfy/ComfyUI/models/checkpoints/{WEIGHTS_FILE}",
           ]
       )
   )
   ```

3. **Find Model Details**

   To use any other model:

   * Visit [Comfy-Org Hugging Face](https://huggingface.co/Comfy-Org) and find your desired model.
   * Update `ORG_NAME`, `REPO_NAME`, `WEIGHTS_FILE`, and `COMMIT` with values from the model’s repository. Check the "Files and versions" tab for the weights file and commit hash.
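
To double-check the four values before rebuilding, you can print the path that the `ln -s` command in `app.py` will link from. The sketch below rebuilds that path with plain string formatting; `snapshot_path` is a hypothetical helper that mirrors the layout used in the script.

```python
def snapshot_path(org: str, repo: str, commit: str, weights_file: str,
                  cache_dir: str = "/comfy-cache") -> str:
    # Mirrors the source path used by the `ln -s` command in app.py.
    return f"{cache_dir}/models--{org}--{repo}/snapshots/{commit}/{weights_file}"


print(snapshot_path(
    "Comfy-Org",
    "stable-diffusion-v1-5-archive",
    "21e044065c0b2d82dafd35397a553847c70c0445",
    "v1-5-pruned-emaonly-fp16.safetensors",
))
```

If the commit hash or file name is wrong, the symlink will point at a path that does not exist, and ComfyUI will not find the checkpoint.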

## Running Workflows as APIs

You can also expose ComfyUI workflows as APIs using Beam’s ASGI support. This allows you to programmatically generate images by sending requests with prompts. Below is an example of how to set this up:

1. **Create the API Script**

   ```python theme={null}
   from beam import Image, asgi, Output

   image = (
       Image()
       .add_commands(["apt update && apt install git -y"])
       .add_python_packages(
           [
               "fastapi[standard]==0.115.4",
               "comfy-cli",
               "huggingface_hub[hf_transfer]==0.26.2",
           ]
       )
       .add_commands(
           [
               "yes | comfy install --nvidia --version 0.3.10",
               "comfy node install was-node-suite-comfyui@1.0.2",
               "mkdir -p /root/comfy/ComfyUI/models/checkpoints/",
               "huggingface-cli download Comfy-Org/flux1-schnell flux1-schnell-fp8.safetensors --cache-dir /comfy-cache",
               "ln -s /comfy-cache/models--Comfy-Org--flux1-schnell/snapshots/f2808ab17fe9ff81dcf89ed0301cf644c281be0a/flux1-schnell-fp8.safetensors /root/comfy/ComfyUI/models/checkpoints/flux1-schnell-fp8.safetensors",
           ]
       )
   )

   def init_models():
       import subprocess

       cmd = "comfy launch --background"
       subprocess.run(cmd, shell=True, check=True)

   @asgi(
       name="comfy",
       image=image,
       on_start=init_models,
       cpu=8,
       memory="32Gi",
       gpu="H100",
       timeout=-1,
   )
   def handler():
       from fastapi import FastAPI, HTTPException
       import subprocess
       import json
       from pathlib import Path
       import uuid
       from typing import Dict

       app = FastAPI()

       # This is where you specify the path to your workflow file.
       # Make sure "workflow_api.json" exists in the same directory as this script.
       WORKFLOW_FILE = Path(__file__).parent / "workflow_api.json"
       OUTPUT_DIR = Path("/root/comfy/ComfyUI/output")

       @app.post("/generate")
       async def generate(item: Dict):
           if not WORKFLOW_FILE.exists():
               raise HTTPException(status_code=500, detail="Workflow file not found.")

           workflow_data = json.loads(WORKFLOW_FILE.read_text())
           workflow_data["6"]["inputs"]["text"] = item["prompt"]
           request_id = uuid.uuid4().hex
           workflow_data["9"]["inputs"]["filename_prefix"] = request_id

           new_workflow_file = Path(f"{request_id}.json")
           new_workflow_file.write_text(json.dumps(workflow_data, indent=4))

           # Run inference
           cmd = f"comfy run --workflow {new_workflow_file} --wait --timeout 1200 --verbose"
           subprocess.run(cmd, shell=True, check=True)

           # Only look at files written for this request; their names start with request_id
           image_files = list(OUTPUT_DIR.glob(f"{request_id}*"))

           # Find the latest image
           latest_image = max(
               (f for f in image_files if f.suffix.lower() in {".png", ".jpg", ".jpeg"}),
               key=lambda f: f.stat().st_mtime,
               default=None
           )

           if not latest_image:
               raise HTTPException(status_code=404, detail="No output image found.")

           output_file = Output(path=latest_image)
           output_file.save()
           public_url = output_file.public_url(expires=-1)
           print(public_url)
           return {"output_url": public_url}

       return app
   ```

2. **Prepare a Workflow File**

   * Create a `workflow_api.json` file in the same directory as your API script. This file should contain your ComfyUI workflow, which you can export from the ComfyUI web interface.

   <Frame>
     <img />
   </Frame>

   * You can also store your `workflow_api.json` file in your Volume and use it like `WORKFLOW_FILE = Path("/your_volume/workflow_api.json")`

3. **Deploy the API**

   ```bash theme={null}
   beam deploy api.py:handler
   ```

4. **Use the API**

   Send a POST request to the `/generate` endpoint with a JSON payload containing a `prompt`:

   ```bash theme={null}
    curl -X POST https://12345.apps.beam.cloud/generate \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer YOUR_BEAM_API' \
    -d '{"prompt": "A cat image"}'
   ```

   The response will include a public URL to the generated image:

   ```json theme={null}
   {
     "output_url": "https://app.beam.cloud/output/id/9a003889-8345-4969-bdf8-2808eebc1c4b"
   }
   ```

   <Frame>
     <img />
   </Frame>
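
The same request can be sent from Python. This is a minimal sketch using only the standard library. The function names are hypothetical, and the 1200-second default timeout is an assumption chosen to match the `--timeout 1200` passed to `comfy run` in the handler.

```python
import json
import urllib.request


def build_generate_request(base_url: str, token: str, prompt: str) -> urllib.request.Request:
    """Build the POST request for the /generate route defined in the API script."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/generate",
        data=json.dumps({"prompt": prompt}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def generate_image(base_url: str, token: str, prompt: str, timeout: float = 1200.0) -> str:
    """Send the request and return the public URL of the generated image."""
    request = build_generate_request(base_url, token, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["output_url"]
```

Calling `generate_image("https://12345.apps.beam.cloud", "YOUR_BEAM_API", "A cat image")` would mirror the cURL command above, with the placeholder URL and token replaced by your own.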


# Chat with DeepSeek R1
Source: https://docs.beam.cloud/v2/examples/deepseek-r1


In this example we are going to use [vLLM](https://github.com/vllm-project/vllm) to host an API for `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B` on Beam.

<Warning>
  This example requires our multi-GPU feature, which needs to be enabled on your
  Beam account. Please send us a message in our [Slack
  Community](https://join.slack.com/t/beam-cloud/shared_invite/zt-3enuvj3r7-OeAzVPYvyqQHy9avNrLL0w)
  and we'll enable it for you!
</Warning>

<video />

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/tree/main/vllm">
  See the code for this example on Github.
</Card>

## Initial Setup

First, clone the vLLM example to your computer.

```sh theme={null}
$ beam example download vllm && cd vllm
```

We'll use our vLLM abstraction to host an OpenAI compatible DeepSeek API on Beam.

From inside the vLLM directory, run the following command to deploy the API:

```sh theme={null}
$ beam deploy models.py:deepseek_r1

=> Building image
=> Using cached image
=> Syncing files
=> Uploading
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 28.7/28.7 kB 0:00:00
=> Files synced
=> Deploying
=> Deployed
=> Invocation details

curl -X POST 'https://deepseek-r1-distill-qwen-7b-54b3408-v4.app.beam.cloud' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-d '{}'
```

This command deploys a DeepSeek R1 API on Beam and prints the API URL.

## Running the API

We provide an interactive command line interface to run the API. You'll be prompted to enter the API URL from the deployment output above. If you select stream mode, the API will stream the response to the console.

```sh theme={null}
$ python chat.py

Welcome to the CLI Chat Application!

Type 'quit' to exit the conversation.

Enter the app URL: https://deepseek-r1-distill-qwen-7b-54b3408-v4.app.beam.cloud

Stream mode? (y/n): y

Model deepseek-ai/DeepSeek-R1-Distill-Qwen-7B is ready
```

<Warning>
  The first time you run the API, the model weights will be downloaded from
  Hugging Face. This may take a few minutes, but will be cached for future runs.
</Warning>

## Interacting with DeepSeek R1

You can now interact with the DeepSeek R1 API. The API will stream the response to the console, and print out the tokens generated and the time taken.

```sh theme={null}
**Question:** What’s the meaning of life?

<think>
</think>

The question of the meaning of life is a deep and complex one, and different people and cultures may have different perspectives and answers. Some common themes across various philosophies and belief systems include:

1. **Philosophical Views**
   - **Existentialism**: The belief that life is inherently meaningless, and individuals must create their own meaning.
   - **Mysticism**: The idea that the meaning of life is found within oneself, through spiritual or divine connection.
   - **Buddhism**: The concept of "suffering" (dukkha) and the goal of achieving liberation from suffering, often seen as the ultimate meaning of life.

2. **Cultural and Religious Views**
   - **Religion**: Many religions—such as Christianity, Islam, and Buddhism—propose a higher power or purpose that gives life meaning.
   - **Science and Empiricism**: A materialistic or scientific view often suggests that life’s meaning is derived from personal fulfillment, relationships, or contributing to the greater good.

3. **Personal Perspectives**
   - **Purpose**: Many people find meaning in life by aligning their actions with their personal values, goals, and aspirations.
   - **Relationships**: Building meaningful connections with others can provide a sense of purpose and fulfillment.
   - **Creativity**: Engaging in creative activities, such as art, music, or writing, can bring meaning to life.

Ultimately, the meaning of life is often left open to interpretation, as it can vary greatly depending on individual experiences, beliefs, and contexts. Some find meaning through achieving their personal goals, while others find it in helping others or contributing to the world in some way.

Tokens Generated: 350
Time Taken: 12.66s
Tokens Per Second: 27.64
```
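
The transcript above starts with a `<think>` block, which is where R1-style models place their reasoning before the final answer. If you only want to show the answer, the two parts can be separated. The helper below is a hypothetical sketch, assuming the tags appear literally in the response text as they do in the transcript.

```python
import re

# Matches the first <think>...</think> block, including newlines inside it
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a response that may contain a <think> block."""
    match = THINK_BLOCK.search(text)
    if match is None:
        # No reasoning block: the whole text is the answer
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[: match.start()] + text[match.end():]).strip()
    return reasoning, answer


reasoning, answer = split_reasoning("<think>\n</think>\n\nThe question of the meaning of life...")
```

Here `reasoning` is an empty string, because the `<think>` block in the example is empty, and `answer` holds the rest of the response.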


# Fine-tuning Gemma with LoRA
Source: https://docs.beam.cloud/v2/examples/gemma-fine-tune


In this example we are fine-tuning [Gemma 2B](https://huggingface.co/google/gemma-2b), an open source model from Google.

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/tree/main/finetuning/gemma">
  See the code for this example on Github.
</Card>

## Fine-Tuning

In this example, we are using Low-Rank Adaptation (LoRA) to fine-tune the [Gemma language model](https://blog.google/technology/developers/gemma-open-models/) using the [Open Assistant dataset](https://huggingface.co/datasets/OpenAssistant/oasst1).

The goal is to use this dataset to improve Gemma's ability to engage in helpful conversations, making it more suitable for assistant-like apps.

### LoRA

You can read more about LoRA [here](https://arxiv.org/abs/2106.09685). However, let's briefly discuss what exactly it does and why we chose to use it here.

At a high level, LoRA introduces a small new set of weights to the model, and these are the only weights we train. By limiting our training to these additional weights, we can fine-tune the model much more quickly. Additionally, since we are not touching the original weights, the model's initial knowledge base should remain intact.
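
To get a feel for how small the additional weights are, the sketch below compares the parameter count of one full weight matrix with that of a LoRA update for the same matrix. The layer sizes and rank are illustrative assumptions, not Gemma's actual dimensions or the settings used in this example.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Compare trainable parameters for a full weight matrix vs. a LoRA update.

    LoRA keeps the original d_out x d_in matrix W frozen and learns two small
    matrices, B (d_out x rank) and A (rank x d_in), whose product is added to W.
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


# Illustrative sizes only; these are not Gemma's actual layer dimensions
full, lora = lora_param_counts(d_out=2048, d_in=2048, rank=8)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

With these numbers, the LoRA update trains 32,768 parameters instead of 4,194,304, which is under 1% of the full matrix.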

### Initial Setup

We are training on an [H100](https://www.nvidia.com/en-us/data-center/h100/) GPU with mixed precision (FP16) to optimize for speed and memory usage. We only train for one epoch here. In practice, you can probably train longer and continue to see improved results.

No surprise here, but we are getting our compute via Beam. We are using the `function` decorator so that we can run our fine-tuning application as if it were on our local machine.

```python theme={null}
from beam import Volume, Image, function


# The mount path is the location on the beam volume with the model weights
MOUNT_PATH = "./gemma-ft"
@function(
    volumes=[Volume(name="gemma-ft", mount_path=MOUNT_PATH)],
    image=Image(
        python_packages=["transformers", "torch", "datasets", "peft", "bitsandbytes"]
    ),
    gpu="H100",
    cpu=4,
)
```

### Mounting Storage Volumes

We're using Beam's persistent [storage volumes](/v2/data/volume) to store model weights and training data. This allows us to download the necessary files directly to the volume, streamlining the setup process.

Here's a simple script to handle the downloads:

```python theme={null}
from beam import function, Volume, Image, env

if env.is_remote():
    from huggingface_hub import snapshot_download
    from datasets import load_dataset

VOLUME_PATH = "./gemma-ft"

@function(
    image=Image(python_version="python3.11")
    .add_python_packages(
        [
            "huggingface_hub",
            "datasets",
            "huggingface_hub[hf-transfer]",
        ]
    )
    .with_envs("HF_HUB_ENABLE_HF_TRANSFER=1"),
    memory="32Gi",
    cpu=4,
    secrets=["HF_TOKEN"],
    volumes=[Volume(name="gemma-ft", mount_path=VOLUME_PATH)],
)
def upload():
    snapshot_download(
        repo_id="google/gemma-2b",
        local_dir=f"{VOLUME_PATH}/weights"
    )

    dataset = load_dataset("OpenAssistant/oasst1", split="train")
    dataset.save_to_disk(f"{VOLUME_PATH}/data")
    print("Files uploaded successfully")

if __name__ == "__main__":
    upload()
```

This script will download the Gemma 2B model weights and the Open Assistant dataset directly to your Beam volume.

First, let's create our volume:

```bash theme={null}
beam volume create gemma-ft
```

Next, we can run our script to populate it with the model and dataset:

```bash theme={null}
python upload.py
```

Once those uploads are complete, we can move on to training.

### Start Training

We can start our training by running `python finetune.py`. After beginning training, you should see something like the following in your terminal:

```bash theme={null}
=> Building image
=> Syncing files
...
=> Running function: <finetune:gemma_fine_tune>
Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]
...
Generating train split: 12947 examples [00:00, 114393.80 examples/s]
...
Map:  93%|#########2| 12000/12947 [00:13<00:01, 921.12 examples/s]
...
  1%|          | 6/809 [00:08<16:35,  1.24s/it]
...
{'loss': 1.617, 'grad_norm': 0.4805833399295807, 'learning_rate': 0.00019752781211372064, 'epoch': 0.01}
...
```

Once training has finished, we can use the Beam CLI to look at the resulting files. You should see something like this:

```bash theme={null}
$ beam ls gemma-ft/gemma-2b-finetuned

  Name                                                Size   Modified Time   IsDir
 ──────────────────────────────────────────────────────────────────────────────────
  gemma-2b-finetuned/README.md                    4.97 KiB   Aug 10 2024     No
  gemma-2b-finetuned/adapter_config.json          644.00 B   Aug 10 2024     No
  gemma-2b-finetuned/adapter_model.safetensors   12.20 MiB   Aug 10 2024     No
  gemma-2b-finetuned/checkpoint-700              36.70 MiB   Aug 01 2024     Yes
  gemma-2b-finetuned/checkpoint-800              36.70 MiB   Aug 01 2024     Yes
  gemma-2b-finetuned/checkpoint-809              36.70 MiB   Aug 01 2024     Yes
  gemma-2b-finetuned/special_tokens_map.json      555.00 B   Aug 10 2024     No
  gemma-2b-finetuned/tokenizer.json              16.71 MiB   Aug 10 2024     No
  gemma-2b-finetuned/tokenizer_config.json       45.21 KiB   Aug 10 2024     No

  9 items | 139.06 MiB used
```

## Inference

In `inference.py`, we are loading up our model with the additional fine-tuned weights and setting up an endpoint to send it requests.

Here, we make use of Beam's `on_start` functionality so that we only load the model when the container starts instead of every time we receive a request. Let's explore the `endpoint` decorator below.

```python theme={null}
from beam import Volume, Image, endpoint


# The mount path is the location on the beam volume with the model weights
MOUNT_PATH = "./gemma-ft"
@endpoint(
    name="gemma-inference",
    on_start=load_finetuned_model,
    volumes=[Volume(name="gemma-ft", mount_path=MOUNT_PATH)],
    cpu=1,
    memory="16Gi",
    gpu="T4",
    image=Image(
        python_version="python3.9",
        python_packages=["transformers==4.42.0", "torch", "peft"],
    ),
)
```

Once again, we are mounting our storage volume named "gemma-ft". Since we have already run training, this volume will now contain our fine-tuned weights alongside the base weights we got from Hugging Face.

### Choosing a GPU For Inference

Now that we've trained the model, we can run it on a machine with a weaker GPU.

Training requires more memory than inference because it must store gradients and optimizer states for all parameters, in addition to activations, whereas inference only needs to maintain the current layer's activations during a forward pass. Be sure to keep this in mind as you work on your own applications.
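
As a back-of-the-envelope illustration of that gap, the sketch below estimates memory for the weights alone versus the state needed when every parameter is trained. All byte sizes are assumptions (FP16 weights and gradients, two FP32 Adam moments), the parameter count is a round illustrative figure, and activations are ignored. LoRA training needs far less than the full fine-tuning figure, because gradients and optimizer state are only kept for the adapter weights.

```python
GIB = 1024 ** 3


def inference_weights_gib(n_params: int, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone at inference (FP16 = 2 bytes per parameter)."""
    return n_params * bytes_per_param / GIB


def full_finetune_state_gib(
    n_params: int,
    weight_bytes: int = 2,     # FP16 weights
    grad_bytes: int = 2,       # FP16 gradients
    optimizer_bytes: int = 8,  # two FP32 Adam moments per parameter
) -> float:
    """Weights + gradients + optimizer state when every parameter is trained."""
    return n_params * (weight_bytes + grad_bytes + optimizer_bytes) / GIB


N_PARAMS = 2_000_000_000  # round illustrative figure, not an exact parameter count
inference_gib = inference_weights_gib(N_PARAMS)
training_gib = full_finetune_state_gib(N_PARAMS)
print(f"inference: {inference_gib:.2f} GiB, full fine-tuning: {training_gib:.2f} GiB")
```

Under these assumptions the weights take roughly 3.7 GiB, while full fine-tuning state takes roughly 22.4 GiB before activations are counted.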

You can use the [Beam dashboard](https://platform.beam.cloud/) to get a sense of GPU utilization in real-time. With this information, you can make a more informed choice about how much compute you require. For this example, we use a T4 GPU. It has 16GB of VRAM and is a good choice for inference with a model this small.

<Frame>
  <img />
</Frame>

### Using Signals to Reload Model Weights Automatically

We use a [`Signal`](/v2/topics/signal) abstraction to fire an event to the inference app when the model has finished training.

This allows us to communicate between apps on Beam. In this example, we have it set up to re-run our `on_start` method when a signal is received. This way, if we re-train our model, we can load the newest weights without restarting the container.

```python theme={null}
from beam import experimental

# Register a signal
s = experimental.Signal(
    name="reload-model",
    handler=load_finetuned_model,
)
```

### Deploying The Endpoint

Let's deploy our endpoint! We can do this with the `beam` CLI.

```bash theme={null}
beam deploy inference.py:predict --name gemma-ft
```

The output will look something like this:

```bash theme={null}
=> Building image
=> Syncing files
=> Deploying
=> Deployed
=> Invocation details
curl -X POST 'https://app.beam.cloud/endpoint/gemma-ft/v2' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer {YOUR_AUTH_TOKEN}' \
-d '{}'
```

When calling our inference endpoint, we'll need to include a prompt. For example, we can call the deployed endpoint with `-d '{"prompt": "hi"}'`. The response we get back will be in the following format:

```bash theme={null}
{"text":"Hello! How can I help you today?<|im_end|>"}
```

Note that the returned response includes the stop token `<|im_end|>`. You could strip this token in the endpoint logic if you would like, but it is worth keeping if you will be appending the response to a longer-running conversation.
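
If you do want to remove the token before displaying a response, a small helper is enough. This is a hypothetical sketch, not code from the example repository. It removes a single trailing stop token and leaves everything else untouched.

```python
STOP_TOKEN = "<|im_end|>"


def strip_stop_token(text: str, stop_token: str = STOP_TOKEN) -> str:
    """Remove one trailing stop token, leaving the rest of the text unchanged."""
    if text.endswith(stop_token):
        return text[: -len(stop_token)]
    return text


display_text = strip_stop_token("Hello! How can I help you today?<|im_end|>")
```

Keeping the raw response for the conversation history and using the stripped version only for display lets you do both.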


# Hugging Face Models
Source: https://docs.beam.cloud/v2/examples/inference

A beginner's guide to running highly performant inference workloads on Beam.

This tutorial introduces several key concepts:

* Creating a container image
* Running a custom ML model
* Developing your app using Beam's live reloading workflow
* Pre-loading models and caching them in storage volumes
* Autoscaling and concurrency

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/blob/main/huggingface_inference/app.py">
  See the code for this example on Github.
</Card>

## Setup your app

You'll start by adding an `endpoint` decorator with an [`Image`](/v2/reference/py-sdk#image):

* `endpoint` is the wrapper for your inference function.
* Inside the `endpoint` is an `Image`, which defines the container image your app will run on.

<Tip>
  If you'd like to make further customizations to your image -- such as adding
  shell commands -- you can do so using the `commands` argument. [Read more
  about custom images.](/v2/environment/custom-images)
</Tip>

```python theme={null}
from beam import Image, endpoint


@endpoint(
    name="inference-quickstart",
    cpu=1,
    memory="16Gi",
    gpu="T4",
    image=Image(python_version="python3.9")
    .add_python_packages(["transformers", "torch", "huggingface_hub[hf-transfer]"])
    .with_envs("HF_HUB_ENABLE_HF_TRANSFER=1"),
)
```

## Remote vs. Local Environment

Typically, your apps that run on Beam will be using packages that you don't have installed locally.

Some of the Python packages this app needs -- like Transformers -- aren't installed locally, so we'll use `env.is_remote()` to import them only when the code is running in the remote cloud environment.

```python theme={null}
from beam import env


if env.is_remote():
    import transformers
    import torch
```

This check is true only when the Python script is running remotely on Beam, so the imports inside the `if` block are skipped on your local machine.

## Running a custom ML model

We'll create a new function to run inference on `facebook/opt-125m` via Huggingface Transformers.

Since we'll deploy this as a REST API, we add an `@endpoint` decorator above the inference function:

```python theme={null}
from beam import Image, endpoint, env

if env.is_remote():
    from transformers import AutoTokenizer, OPTForCausalLM
    import torch

@endpoint(
    name="inference-quickstart",
    cpu=1,
    memory="16Gi",
    gpu="T4",
    image=Image(python_version="python3.9")
    .add_python_packages(["transformers", "torch", "huggingface_hub[hf-transfer]"])
    .with_envs("HF_HUB_ENABLE_HF_TRANSFER=1"),
)
def predict(prompt):

    model = OPTForCausalLM.from_pretrained("facebook/opt-125m")
    tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")

    # Generate
    inputs = tokenizer(prompt, return_tensors="pt")
    generate_ids = model.generate(inputs.input_ids, max_length=30)
    result = tokenizer.batch_decode(
        generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False
    )[0]

    print(result)

    return {"prediction": result}
```

## Developing your app on Beam

Beam includes a live-reloading feature that allows you to run your code on the same environment you'll be running in production.

<Info>
  By default, Beam will sync all the files in your working directory to the
  remote container. This allows you to use the files you have locally while
  developing. If you want to prevent some files from getting uploaded, you can
  create a [`.beamignore`](/v2/reference/cli#ignore-local-files).
</Info>

In your shell, run `beam serve app.py:predict`. This will:

1. Spin up a container
2. Run it on a GPU
3. Print a cURL request to invoke the API
4. Stream the logs to your shell

You should keep this terminal window open while developing.

```sh theme={null}
(.venv) user@MacBook demo % beam serve app.py:predict
=> Building image
=> Using cached image
=> Syncing files
=> Invocation details

curl -X POST \
'https://app.beam.cloud/endpoint/id/bc55068e-b648-4dbc-9cb7-183e1789e011' \
    -H 'Accept: */*' \
    -H 'Accept-Encoding: gzip, deflate' \
    -H 'Connection: keep-alive' \
    -H 'Authorization: Bearer [YOUR_AUTH_TOKEN]' \
    -H 'Content-Type: application/json' \
    -d '{}'

=> Watching ./inference-app for changes...
```

Now, head back to your IDE, and change a line of code. Hit save.

If you look closely at the shell running `beam serve`, you'll notice the server reloading with your code changes.

You'll use this workflow anytime you're developing an app on Beam. Trust us -- it makes the development process uniquely fast and painless.

## Performance Optimizations

If you called the API via the cURL command, you'll notice that your model was downloaded each time you invoked the API.

To improve performance, we'll set up a function to pre-load your models and store them on disk between API calls.

### Pre-loading

Beam includes an `on_start` method, which you can pass to your function decorators. `on_start` runs exactly once when the container first starts.

The value returned by the `on_start` function can be retrieved from `context.on_start_value`:

```python theme={null}
from beam import Image, endpoint, env

if env.is_remote():
    from transformers import AutoTokenizer, OPTForCausalLM
    import torch

def download_models():
    from transformers import AutoTokenizer, OPTForCausalLM

    model = OPTForCausalLM.from_pretrained("facebook/opt-125m")
    tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")

    return model, tokenizer


@endpoint(
    name="inference-quickstart",
    on_start=download_models,
    image=Image(
        python_version="python3.9",
        python_packages=[
            "transformers",
            "torch",
        ],
    ),
)
def predict(context):
    # Retrieve cached model from on_start function
    model, tokenizer = context.on_start_value

    # Do something with the model and tokenizer...
```

### Cache in a storage volume

The `on_start` method saves us from loading the model on every request, but each new container still has to download it. We can avoid that repeated download by caching the model in a [Storage Volume](/v2/data/volume).

Beam allows you to create highly-available storage volumes that can be used across tasks. You might use volumes for things like storing model weights or large datasets.

```python theme={null}
from beam import Image, endpoint, Volume


# Model weights will be cached in this folder
CACHE_PATH = "./weights"


# This function runs once when the container first starts
def download_models():
    from transformers import AutoTokenizer, OPTForCausalLM

    model = OPTForCausalLM.from_pretrained("facebook/opt-125m", cache_dir=CACHE_PATH)
    tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m", cache_dir=CACHE_PATH)

    return model, tokenizer


@endpoint(
    name="inference-quickstart",
    on_start=download_models,
    volumes=[Volume(name="weights", mount_path=CACHE_PATH)],
    cpu=1,
    memory="16Gi",
    gpu="T4",
    image=Image(
        python_version="python3.9",
        python_packages=[
            "transformers",
            "torch",
        ],
    ),
)
```

Now, these models can be automatically downloaded to the volume by using the `cache_dir` argument in transformers:

```python theme={null}
model = OPTForCausalLM.from_pretrained("facebook/opt-125m", cache_dir=CACHE_PATH)
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m", cache_dir=CACHE_PATH)
```

These volumes are mounted directly into the container running your app, so you can read and write files on them as you would in any normal directory.
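
Since the volume behaves like a normal directory, you can check whether a model is already cached before the first request. The helpers below are a hypothetical sketch that relies on the Hugging Face cache layout, in which a repo such as `facebook/opt-125m` is stored under `models--facebook--opt-125m/snapshots/`.

```python
from pathlib import Path


def cached_repo_dir(cache_path: str, repo_id: str) -> Path:
    """Directory where the Hugging Face cache stores a model repo."""
    # "facebook/opt-125m" becomes "models--facebook--opt-125m"
    return Path(cache_path) / ("models--" + repo_id.replace("/", "--"))


def is_cached(cache_path: str, repo_id: str) -> bool:
    """True if at least one snapshot of the repo exists in the cache directory."""
    snapshots = cached_repo_dir(cache_path, repo_id) / "snapshots"
    return snapshots.is_dir() and any(snapshots.iterdir())
```

Logging `is_cached(CACHE_PATH, "facebook/opt-125m")` at the top of `download_models` is one way to confirm that later containers are reading from the volume rather than downloading again.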

## Configure Autoscaling (Optional)

You can control your autoscaling behavior with `QueueDepthAutoscaler`.

`QueueDepthAutoscaler` takes two parameters:

* `max_containers`
* `tasks_per_container`

```python theme={null}
from beam import endpoint, QueueDepthAutoscaler


@endpoint(autoscaler=QueueDepthAutoscaler(max_containers=5, tasks_per_container=1))
def function():
    pass
```
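
To build intuition for how the two parameters interact, here is a simplified model of queue-depth scaling. It is an assumption made for illustration, not Beam's actual scaling logic: the function name is hypothetical, and details such as cooldowns or minimum containers are left out.

```python
import math


def desired_containers(queue_depth: int, max_containers: int, tasks_per_container: int) -> int:
    """Simplified model of queue-depth scaling (an assumption, not Beam's actual code)."""
    if queue_depth <= 0:
        return 0
    # Enough containers to give each one at most tasks_per_container queued tasks
    needed = math.ceil(queue_depth / tasks_per_container)
    # Never exceed the configured ceiling
    return min(needed, max_containers)


print(desired_containers(queue_depth=3, max_containers=5, tasks_per_container=1))
print(desired_containers(queue_depth=12, max_containers=5, tasks_per_container=1))
```

In this model, 3 queued tasks call for 3 containers, while 12 queued tasks are capped at the `max_containers` value of 5.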

## Deployment

With these performance optimizations in place, it's time to deploy your API to create a persistent endpoint. In your shell, run this command to deploy your app:

```sh theme={null}
beam deploy app.py:predict
```

## Monitoring Logs and Task Status

In the dashboard, you can view the status of the task and the logs from the container:

<Frame>
  <img />
</Frame>

## Summary

You've successfully created a highly performant serverless API for your ML model!


# LLaMA 3.1 8B
Source: https://docs.beam.cloud/v2/examples/llama3


This guide demonstrates how to run the Meta Llama 3.1 8B Instruct model on Beam.

<Warning>
  You need an access token from Huggingface to run this example. You can sign up
  for Huggingface and access your token on [the settings
  page](https://huggingface.co/settings/tokens), and store it in the [Beam
  Secrets Manager](/v2/environment/secrets).
</Warning>

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/tree/main/language_models/llama3_8b">
  See the code for this example on Github.
</Card>

## Prerequisites

1. **Request Access**: Request access to the model [here](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct).
2. **Retrieve HF Token**: Get your Huggingface token from [this page](https://huggingface.co/settings/tokens).
3. **Save HF Token on Beam**: Use the command `beam secret create HF_TOKEN [TOKEN]` to save your token.

## Setup Remote Environment

The first thing we'll do is set up an `Image` with the Python packages required for this app.

We use the `if env.is_remote()` flag to conditionally import the Python packages only when the script is running remotely on Beam.

```python theme={null}
from beam import endpoint, Image, Volume, env

# This ensures that these packages are only loaded when the script is running remotely on Beam
if env.is_remote():
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

# Model parameters
MODEL_NAME = "meta-llama/Meta-Llama-3.1-8B-Instruct"
MAX_LENGTH = 512
TEMPERATURE = 0.7
TOP_P = 0.9
TOP_K = 50
REPETITION_PENALTY = 1.05
NO_REPEAT_NGRAM_SIZE = 2
DO_SAMPLE = True
NUM_BEAMS = 1
EARLY_STOPPING = True

BEAM_VOLUME_PATH = "./cached_models"

# This runs once when the container first starts
def load_models():
    tokenizer = AutoTokenizer.from_pretrained(
        MODEL_NAME,
        cache_dir=BEAM_VOLUME_PATH,
        padding_side='left'
    )
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME,
        device_map="auto",
        torch_dtype=torch.float16,
        cache_dir=BEAM_VOLUME_PATH,
        use_cache=True,
        low_cpu_mem_usage=True
    )
    model.eval()
    return model, tokenizer
```

## Inference Function

Here’s the inference function. By adding the `@endpoint` decorator to it, we can expose this function as a RESTful API.

Note the `secrets` argument which ensures the Huggingface token is loaded into the environment.

```python theme={null}
@endpoint(
    secrets=["HF_TOKEN"],
    on_start=load_models,
    name="meta-llama-3.1-8b-instruct",
    cpu=2,
    memory="16Gi",
    gpu="A10G",
    image=Image(python_version="python3.9")
    .add_python_packages(
        [
            "torch",
            "transformers",
            "accelerate",
            "huggingface_hub[hf-transfer]",
        ]
    )
    .with_envs({
        "HF_HUB_ENABLE_HF_TRANSFER": "1",
        "TOKENIZERS_PARALLELISM": "false",
        "CUDA_VISIBLE_DEVICES": "0",
    }),
    volumes=[
        Volume(
            name="cached_models",
            mount_path=BEAM_VOLUME_PATH,
        )
    ],
)
def generate_text(context, **inputs):
    # Retrieve model and tokenizer from on_start
    model, tokenizer = context.on_start_value

    # Inputs passed to API
    messages = inputs.pop("messages", None)
    if not messages:
        return {"error": "Please provide messages for text generation."}

    generate_args = {
        "max_new_tokens": inputs.get("max_tokens", MAX_LENGTH),
        "temperature": inputs.get("temperature", TEMPERATURE),
        "top_p": inputs.get("top_p", TOP_P),
        "top_k": inputs.get("top_k", TOP_K),
        "repetition_penalty": inputs.get("repetition_penalty", REPETITION_PENALTY),
        "no_repeat_ngram_size": inputs.get("no_repeat_ngram_size", NO_REPEAT_NGRAM_SIZE),
        "num_beams": inputs.get("num_beams", NUM_BEAMS),
        "early_stopping": inputs.get("early_stopping", EARLY_STOPPING),
        "do_sample": inputs.get("do_sample", DO_SAMPLE),
        "use_cache": True,
        "eos_token_id": tokenizer.eos_token_id,
        "pad_token_id": tokenizer.pad_token_id,
    }

    model_inputs_str = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

    # Tokenize inputs with truncation
    tokenized_inputs = tokenizer(
        model_inputs_str,
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=2048
    )
    input_ids = tokenized_inputs["input_ids"].to("cuda")
    attention_mask = tokenized_inputs["attention_mask"].to("cuda")
    input_ids_length = input_ids.shape[-1]

    with torch.no_grad():
        outputs = model.generate(
            input_ids=input_ids, attention_mask=attention_mask, **generate_args
        )

        new_tokens = outputs[0][input_ids_length:]
        output_text = tokenizer.decode(new_tokens, skip_special_tokens=True)

        return {"output": output_text}
```
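
The slice `outputs[0][input_ids_length:]` is needed because a causal language model's `generate` call returns the prompt tokens followed by the newly generated ones. The sketch below shows the same idea with plain lists and made-up token IDs.

```python
def new_tokens_only(output_ids: list[int], prompt_length: int) -> list[int]:
    """Drop the prompt tokens that are echoed at the start of the generated sequence."""
    return output_ids[prompt_length:]


prompt_ids = [101, 7592, 2088]          # made-up token IDs for the prompt
output_ids = prompt_ids + [2023, 2003]  # prompt followed by the continuation
continuation = new_tokens_only(output_ids, len(prompt_ids))
```

Without the slice, the decoded output would begin with the chat template and the user's own messages.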

## Deploy to Production

The following command deploys our code to Beam, and hosts it as a REST API:

```sh theme={null}
beam deploy app.py:generate_text
```

## Invoking the API

Once the API is running, you can invoke it using the following cURL command:

```sh theme={null}
curl -X POST 'https://app.beam.cloud/endpoint/id/[ENDPOINT-ID]' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer [AUTH-TOKEN]' \
-d '{
    "messages": [
        {"role": "system", "content": "You are a yoda chatbot who always responds in yoda speak!"},
        {"role": "user", "content": "Who are you?"}
    ]
}'
```

Replace `[ENDPOINT-ID]` with your actual endpoint ID and `[AUTH-TOKEN]` with your authentication token. You'll see a response from the API, like this:

```json theme={null}
{
  "output": "A Jedi I am. In the ways of the Force, trained I have been."
}
```
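
The endpoint also reads optional generation settings from the request body through `inputs.get(...)`, so they can be sent alongside `messages`. The helper below is a hypothetical sketch that builds such a payload and rejects keys the endpoint would silently ignore.

```python
import json

# Optional keys that generate_text reads with inputs.get(...)
ALLOWED_OVERRIDES = {
    "max_tokens",
    "temperature",
    "top_p",
    "top_k",
    "repetition_penalty",
    "no_repeat_ngram_size",
    "num_beams",
    "early_stopping",
    "do_sample",
}


def build_payload(messages: list[dict], **overrides) -> str:
    """Return the JSON body for the endpoint, with optional generation overrides."""
    if not messages:
        raise ValueError("messages must not be empty")
    unknown = set(overrides) - ALLOWED_OVERRIDES
    if unknown:
        raise ValueError(f"Unsupported options: {sorted(unknown)}")
    return json.dumps({"messages": messages, **overrides})


body = build_payload(
    [{"role": "user", "content": "Who are you?"}],
    max_tokens=64,
    temperature=0.2,
)
```

The resulting string can be passed as the `-d` argument of the cURL command above, or as the request body in any HTTP client.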

## Summary

You've successfully set up a highly performant serverless API for generating text using the Meta Llama 3.1 8B Instruct model on Beam.


# Stable Diffusion with LoRAs
Source: https://docs.beam.cloud/v2/examples/lora


## Introduction

This guide demonstrates how to run Stable Diffusion with custom LoRAs.

<video />

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/blob/main/image_generation/stable_diffusion_lora/app.py">
  See the code for this example on Github.
</Card>

## Setup Remote Environment

The first thing we'll do is set up an `Image` with the Python packages required for this app.

Because this script will run remotely, we need to make sure our local Python interpreter doesn't try to import these packages locally, where they may not be installed.

We'll use the `if env.is_remote()` check to import the Python packages only when the script is running remotely on Beam.

```python app.py theme={null}
from beam import Image, Volume, endpoint, Output, env


# This check ensures that the packages are only imported when running this script remotely on Beam
if env.is_remote():
    from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler
    import torch
    from huggingface_hub import hf_hub_download
    from safetensors.torch import load_file
    import os
    import uuid


# The container image for the remote runtime
image = (
    Image(python_version="python3.9")
    .add_python_packages(
        [
            "diffusers[torch]>=0.10",
            "transformers",
            "huggingface_hub",
            "huggingface_hub[hf-transfer]",
            "torch",
            "peft",
            "pillow",
            "accelerate",
            "safetensors",
            "xformers",
        ]
    )
    .with_envs("HF_HUB_ENABLE_HF_TRANSFER=1")
)
```

## Pre-Load Models

Next, we'll set up a function to run once when the container first starts up. This allows us to cache the model in memory between requests and ensures we don't unnecessarily re-load the model.

```python app.py theme={null}
CACHE_PATH = "./models"
MODEL_URL = "https://huggingface.co/martyn/sdxl-turbo-mario-merge-top-rated/blob/main/topRatedTurboxlLCM_v10.safetensors"

LORA_WEIGHT_NAME = "raw.safetensors"
LORA_REPO = "ntc-ai/SDXL-LoRA-slider.raw"


# This function runs once when the container first boots
def load_models():

    hf_hub_download(repo_id=LORA_REPO, filename=LORA_WEIGHT_NAME, cache_dir=CACHE_PATH)

    pipe = StableDiffusionXLPipeline.from_single_file(
        MODEL_URL,
        torch_dtype=torch.float16,
        safety_checker=None,
        cache_dir=CACHE_PATH,
    ).to("cuda")

    return pipe
```
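
To make the load-once behavior concrete, here is a plain-Python sketch of the pattern. It is not Beam's implementation: `Context` and `start_container` are hypothetical stand-ins for what the runtime does, and the loader returns a small dictionary instead of a real pipeline.

```python
from dataclasses import dataclass
from typing import Any, Callable

load_count = 0


def load_models() -> dict:
    """Stand-in for an expensive model load; counts how often it runs."""
    global load_count
    load_count += 1
    return {"name": "fake-pipeline"}


@dataclass
class Context:
    """Hypothetical stand-in for the context object passed to the handler."""

    on_start_value: Any


def start_container(on_start: Callable[[], Any]) -> Context:
    # The loader runs exactly once, when the "container" boots.
    return Context(on_start_value=on_start())


def generate(context: Context, prompt: str) -> str:
    # Every request reuses the object that was loaded at startup.
    pipe = context.on_start_value
    return f"{pipe['name']} handled: {prompt}"


context = start_container(load_models)
results = [generate(context, p) for p in ("a cat", "a dog", "a bird")]
print(load_count)  # 1: three requests, one load
```

The expensive work happens in the loader, so the per-request function only pays for inference.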

## Inference Function

Here's our inference function. By adding the `@endpoint` decorator to it, we can expose this function as a RESTful API.

There are a few things to take note of:

* an `image` with the Python requirements we defined above
* an `on_start` function that runs once when the container first boots. The value from `on_start` (in this case, our `pipe` handler) is available in the inference function using the `context` value: `pipe = context.on_start_value`
* `volumes`, which are used to store the downloaded LoRAs and model weights on Beam
* `keep_warm_seconds`, which tells Beam how long to keep the container running between requests

```python app.py theme={null}
@endpoint(
    image=image,
    on_start=load_models,
    keep_warm_seconds=60,
    cpu=2,
    memory="32Gi",
    gpu="A10G",
    volumes=[Volume(name="models", mount_path=CACHE_PATH)],
)
def generate(context, prompt="medieval rich kingpin sitting in a tavern, raw"):
    # Retrieve pre-loaded model from loader
    pipe = context.on_start_value

    pipe.enable_sequential_cpu_offload()
    pipe.enable_attention_slicing("max")

    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

    # Use a unique adapter name
    adapter_name = f"raw_{uuid.uuid4().hex}"

    # Load the LoRA weights from the Hugging Face repo under the unique adapter name
    pipe.load_lora_weights(
        LORA_REPO, weight_name=LORA_WEIGHT_NAME, adapter_name=adapter_name
    )

    # Activate the adapter that was just loaded
    pipe.set_adapters([adapter_name], adapter_weights=[2.0])

    # Generate image
    image = pipe(
        prompt,
        negative_prompt="nsfw",
        width=512,
        height=512,
        guidance_scale=2,
        num_inference_steps=10,
    ).images[0]

    # Save image file
    output = Output.from_pil_image(image).save()

    # Retrieve pre-signed URL for output file
    url = output.public_url()

    return {"image": url}
```

## Saving Image Outputs

Notice the `Output.from_pil_image(image).save()` call in the inference function. It saves the generated image as a Beam `Output`.

Calling `public_url()` on the saved output then generates a shareable URL for the image:

```python app.py theme={null}
from beam import Output

# Save image file
output = Output.from_pil_image(image).save()

# Retrieve pre-signed URL for output file
url = output.public_url()
```

## Create a Preview Deployment

You can spin up a temporary REST API to test this endpoint on Beam, using the `beam serve` command:

```bash theme={null}
beam serve app.py:generate
```

When you run this command, Beam will spin up a GPU-backed container to test your code on the cloud:

```bash theme={null}
=> Building image
=> Using cached image
=> Syncing files
Reading .beamignore file
=> Files synced
=> Invocation details
curl -X POST 'https://app.beam.cloud/endpoint/id/bcaa198b-2556-4c8c-9429-46d3202dbc95' \
-H 'Connection: keep-alive' \
-H 'Authorization: Bearer [YOUR_AUTH_TOKEN]' \
-H 'Content-Type: application/json' \
-d '{}'
=> Watching '/Users/beta9/beam/examples/07_image_generation' for changes...
```

You can paste the `curl` command in your shell to call the API.

The API will return a pre-signed URL with the image generated:

```json theme={null}
{"image":"https://app.beam.cloud/output/id/09cb70bf-b5e8-4679-9da2-71611a1c3b57"}
```

<Frame>
  <img />
</Frame>
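
On the client side, the response can be parsed to pull out the image URL. This is a minimal sketch using the standard library; `extract_image_url` and `download` are hypothetical helper names, and the local file name is arbitrary.

```python
import json
import urllib.request


def extract_image_url(response_body: str) -> str:
    """Pull the `image` URL out of the endpoint's JSON response."""
    payload = json.loads(response_body)
    if "image" not in payload:
        raise KeyError("response has no 'image' field")
    return payload["image"]


def download(url: str, destination: str) -> None:
    """Save the file behind the URL to a local path (performs a network call)."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as f:
        f.write(response.read())


body = '{"image":"https://app.beam.cloud/output/id/09cb70bf-b5e8-4679-9da2-71611a1c3b57"}'
image_url = extract_image_url(body)

# Uncomment to fetch the file; the destination name is an arbitrary choice.
# download(image_url, "generated_image")
```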

## Deploy to Production

The `beam serve` command is used for temporary APIs. When you're ready to move to production, deploy a persistent endpoint:

```bash theme={null}
beam deploy app.py:generate
```


# Text-to-Video with Mochi
Source: https://docs.beam.cloud/v2/examples/mochi-1


This guide demonstrates how to run the Mochi-1 text-to-video model on Beam. Mochi-1 is a powerful model for generating high-quality videos based on text prompts.

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/tree/main/video_models/mochi1">
  See the code for this example on GitHub.
</Card>

## Introduction

Mochi-1 is a state-of-the-art text-to-video model. This guide will help you deploy and use the model as a serverless API on Beam.

## Upload Model Weights

Before using the Mochi-1 model, you need to upload its weights to Beam. This is handled by the `upload.py` script:

```python theme={null}
from beam import function, Volume, Image, env

if env.is_remote():
    from huggingface_hub import snapshot_download

VOLUME_PATH = "./mochi-1-preview"

@function(
    image=Image(
        python_packages=["huggingface_hub", "huggingface_hub[hf_xet]"]
    ),
    memory="32Gi",
    cpu=4,
    secrets=["HF_TOKEN"],
    volumes=[Volume(name="mochi-1-preview", mount_path=VOLUME_PATH)],
)
def upload():
    snapshot_download(
        repo_id="genmo/mochi-1-preview",
        local_dir=f"{VOLUME_PATH}/weights"
    )
    print("Files uploaded successfully")

if __name__ == "__main__":
    upload()
```

### Steps to Run the Script

Run the script locally to upload the weights:

```bash theme={null}
python upload.py
```

Once the weights are uploaded, the `generate_video` endpoint can access them for inference.

## Setup Remote Environment

The endpoint imports its dependencies only when running remotely, and loads the pipeline from the volume in `load_models`. Here’s how it’s configured:

```python theme={null}
from beam import endpoint, env, Volume, Image, Output

VOLUME_PATH = "./mochi-1-preview"

if env.is_remote():
    import torch
    from diffusers import MochiPipeline
    from diffusers.utils import export_to_video
    import uuid

def load_models():
    pipe = MochiPipeline.from_pretrained(
        f"{VOLUME_PATH}/weights", variant="bf16", torch_dtype=torch.bfloat16)
    return pipe
```

The `mochi_image` includes all necessary Python packages and system dependencies:

```python theme={null}
mochi_image = (
    Image(
        python_version="python3.11",
        python_packages=["torch", "transformers", "accelerate",
                         "sentencepiece", "imageio-ffmpeg", "imageio", "ninja"]
    )
    .add_commands(["apt update && apt install git -y", "pip install git+https://github.com/huggingface/diffusers.git"])
)
```

## Inference Function

The `generate_video` function processes text prompts and generates a video:

```python theme={null}
@endpoint(
    name="mochi-1-preview",
    on_start=load_models,
    cpu=4,
    memory="32Gi",
    gpu="A10G",
    gpu_count=2,
    image=mochi_image,
    volumes=[Volume(name="mochi-1-preview", mount_path=VOLUME_PATH)],
    timeout=-1
)
def generate_video(context, **inputs):
    pipe = context.on_start_value

    prompt = inputs.pop("prompt", None)

    if not prompt:
        return {"error": "Please provide a prompt"}

    pipe.enable_model_cpu_offload()
    pipe.enable_vae_tiling()
    frames = pipe(prompt, num_frames=40).frames[0]

    file_name = f"/tmp/mochi_out_{uuid.uuid4()}.mp4"

    export_to_video(frames, file_name, fps=15)

    output_file = Output(path=file_name)
    output_file.save()
    public_url = output_file.public_url(expires=-1)
    print(public_url)
    return {"output_url": public_url}
```
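
The length of the exported clip follows from the two numbers in the function above: `num_frames=40` and `fps=15`. A quick worked calculation, with `clip_duration_seconds` as a hypothetical helper name:

```python
def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Length of the exported clip: frames divided by frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


duration = clip_duration_seconds(num_frames=40, fps=15)
print(f"{duration:.2f} seconds")  # 2.67 seconds
```

With these settings, each request produces a clip a little under three seconds long.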

## Deployment

Deploy the API to Beam:

```bash theme={null}
beam deploy app.py:generate_video
```

## Invoking the API

To invoke the API, send a POST request with the following payload:

```json theme={null}
{
  "prompt": "The camera follows behind a rugged green Jeep with a black snorkel as it speeds along a narrow dirt trail cutting through a dense jungle. Thick vines hang from towering trees with sprawling canopies, their leaves forming a vibrant green tunnel above the vehicle. Mud splashes up from the Jeep’s tires as it powers through a shallow stream crossing the path. Sunlight filters through gaps in the trees, casting dappled golden light over the scene. The dirt trail twists sharply into the distance, overgrown with wild ferns and tropical plants. The vehicle is seen from the rear, leaning into the curve as it maneuvers through the untamed terrain, emphasizing the adventure of the rugged journey. The surrounding jungle is alive with texture and color, with distant mountains barely visible through the mist and an overcast sky heavy with the promise of rain."
}
```

Here’s an example of a cURL request:

```bash theme={null}
curl -X POST 'https://app.beam.cloud/endpoint/id/[ENDPOINT-ID]' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer [AUTH-TOKEN]' \
-d '{
    "prompt": "Your text prompt for video generation."
}'
```

## Example Output

The API will return a generated video URL. Here’s an example:

```json theme={null}
{
  "output_url": "https://app.beam.cloud/output/id/dc443a80-7fcc-42bc-928b-4605e41b0825"
}
```

## Example Video

Here is an example video generated by the Mochi-1 model:

<video>
  <source type="video/mp4" />
</video>

## Summary

You’ve successfully deployed and tested a Mochi-1 text-to-video generation API using Beam.


# Examples
Source: https://docs.beam.cloud/v2/examples/overview

End-to-end examples for running real workloads on Beam

Browse complete, runnable examples grouped by what you're building. Each one shows a real workload, from container image to deployment, that you can copy and adapt.

New to Beam? Start with the [Quickstart](/v2/getting-started/quickstart) and [Core Concepts](/v2/getting-started/core-concepts) first.

## Large Language Models

Serve and run inference with open and custom LLMs.

<CardGroup>
  <Card title="Hugging Face Models" icon="bolt" href="/v2/examples/inference">
    A beginner's guide to running performant inference workloads on Beam.
  </Card>

  <Card title="LLaMA 3.1 8B" icon="comments" href="/v2/examples/llama3">
    Serve Meta's LLaMA 3.1 8B model on a GPU.
  </Card>

  <Card title="Run an OpenAI-Compatible vLLM Server" icon="server" href="/v2/examples/vllm">
    Host an OpenAI-compatible inference server with vLLM.
  </Card>

  <Card title="Chat with DeepSeek R1" icon="comment-dots" href="/v2/examples/deepseek-r1">
    Run the DeepSeek R1 reasoning model.
  </Card>

  <Card title="Qwen2.5-7B with SGLang" icon="gauge" href="/v2/examples/sglang">
    Serve Qwen2.5-7B with the SGLang runtime.
  </Card>
</CardGroup>

## Image and Video

Generate and transform images and video on GPUs.

<CardGroup>
  <Card title="Serverless ComfyUI" icon="image" href="/v2/examples/comfy-ui">
    Host ComfyUI for image generation workflows.
  </Card>

  <Card title="Text-to-Video with Mochi" icon="film" href="/v2/examples/mochi-1">
    Generate video from text with the Mochi model.
  </Card>

  <Card title="Stable Diffusion with LoRAs" icon="wand-magic-sparkles" href="/v2/examples/lora">
    Run Stable Diffusion with custom LoRA adapters.
  </Card>
</CardGroup>

## Audio and Transcription

Transcribe and synthesize speech.

<CardGroup>
  <Card title="Faster Whisper" icon="microphone" href="/v2/examples/whisper">
    Transcribe audio with Faster Whisper.
  </Card>

  <Card title="Parler TTS" icon="volume-high" href="/v2/examples/parler-tts">
    Synthesize speech with Parler TTS.
  </Card>

  <Card title="Zonos" icon="music" href="/v2/examples/zonos">
    Generate speech with the Zonos model.
  </Card>
</CardGroup>

## Web Apps

Host interactive apps and scrape the web.

<CardGroup>
  <Card title="Web Scraping with Beam Functions" icon="spider" href="/v2/examples/web-scraping">
    Build a web scraper that runs on Beam functions.
  </Card>

  <Card title="Running Streamlit Apps" icon="chart-line" href="/v2/examples/streamlit">
    Host a Streamlit app behind a public URL.
  </Card>
</CardGroup>

## Agents

Build and coordinate AI agents.

<CardGroup>
  <Card title="Building AI Agents" icon="robot" href="/v2/agents/introduction">
    Build stateful agents with concurrency built in.
  </Card>

  <Card title="Research Assistant" icon="magnifying-glass" href="/v2/agents/synchronization">
    A research assistant that synchronizes state across tasks.
  </Card>
</CardGroup>

## Fine-Tuning

Fine-tune open models on GPUs.

<CardGroup>
  <Card title="Fine-tuning Gemma with LoRA" icon="dumbbell" href="/v2/examples/gemma-fine-tune">
    Fine-tune Google's Gemma model with LoRA.
  </Card>

  <Card title="Fine-Tuning Llama 3.1 8B with Unsloth" icon="gauge-high" href="/v2/examples/unsloth">
    Fast fine-tuning of Llama 3.1 8B with Unsloth.
  </Card>
</CardGroup>


# Parler TTS
Source: https://docs.beam.cloud/v2/examples/parler-tts


This guide demonstrates how to set up and run the Parler TTS text-to-speech model as a serverless API on Beam.

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/tree/main/audio_and_transcription/parler-tts">
  See the code for this example on GitHub.
</Card>

## Introduction

Parler-TTS Mini is a lightweight text-to-speech (TTS) model, trained on 45K hours of audio data, that can generate high-quality, natural-sounding speech with features that can be controlled using a simple text prompt. This guide explains how to deploy and use it on Beam.

## Deployment Setup

Define the model and its dependencies using the `parlertts_image`:

```python theme={null}
from beam import endpoint, env, Image, Output

if env.is_remote():
    from parler_tts import ParlerTTSForConditionalGeneration
    from transformers import AutoTokenizer
    import soundfile as sf
    import uuid

def load_models():
    model = ParlerTTSForConditionalGeneration.from_pretrained(
        "parler-tts/parler-tts-mini-v1").to("cuda:0")
    tokenizer = AutoTokenizer.from_pretrained("parler-tts/parler-tts-mini-v1")
    return model, tokenizer

parlertts_image = (
    Image(
        python_version="python3.10",
        python_packages=[
            "torch",
            "transformers",
            "soundfile",
            "Pillow",
            "wheel",
            "packaging",
            "ninja",
            "huggingface_hub[hf-transfer]",
        ],
    )
    .add_commands(
        [
            "apt update && apt install git -y",
            "pip install git+https://github.com/huggingface/parler-tts.git",
        ]
    )
    .with_envs("HF_HUB_ENABLE_HF_TRANSFER=1")
)
```

## Inference Function

The `generate_speech` function processes text and generates speech audio:

```python theme={null}
@endpoint(
    name="parler-tts",
    on_start=load_models,
    cpu=2,
    memory="32Gi",
    gpu="A10G",
    gpu_count=2,
    image=parlertts_image
)
def generate_speech(context, **inputs):
    model, tokenizer = context.on_start_value

    prompt = inputs.pop("prompt", None)
    description = inputs.pop("description", None)

    if not prompt or not description:
        return {"error": "Please provide a prompt and description"}

    device = "cuda:0"

    input_ids = tokenizer(
        description, return_tensors="pt").input_ids.to(device)
    prompt_input_ids = tokenizer(
        prompt, return_tensors="pt").input_ids.to(device)

    generation = model.generate(
        input_ids=input_ids, prompt_input_ids=prompt_input_ids)
    audio_arr = generation.cpu().numpy().squeeze()

    file_name = f"/tmp/parler_tts_out_{uuid.uuid4()}.wav"

    sf.write(file_name, audio_arr, model.config.sampling_rate)

    output_file = Output(path=file_name)
    output_file.save()
    public_url = output_file.public_url(expires=1200000000)
    print(public_url)
    return {"output_url": public_url}
```

## Deployment

Deploy the API to Beam:

```bash theme={null}
beam deploy app.py:generate_speech
```

## API Usage

Send a `POST` request with the following JSON payload:

```json theme={null}
{
  "prompt": "Your text to convert to speech",
  "description": "Description of the voice/style"
}
```
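
Because the endpoint returns an error when either field is missing, it can help to check the inputs before sending them. The sketch below mirrors that check on the client side; `build_tts_payload` is a hypothetical helper name, not part of Beam.

```python
import json


def build_tts_payload(prompt: str, description: str) -> str:
    """Serialize the request body, rejecting inputs the endpoint would refuse."""
    if not prompt or not description:
        raise ValueError("Please provide a prompt and description")
    return json.dumps({"prompt": prompt, "description": description})


payload = build_tts_payload(
    prompt="Your text to convert to speech",
    description="Description of the voice/style",
)
```

The resulting string can be used as the body of the `POST` request.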

### Example Request

```json theme={null}
{
  "prompt": "On Beam run AI workloads anywhere with zero complexity. One line of Python, global GPUs, full control!!!",
  "description": "A female speaker delivers a slightly expressive and animated speech with a moderate speed and pitch. The recording is of very high quality, with the speaker's voice sounding clear and very close up."
}
```

### Example Response

The API returns a URL to the generated audio file:

```json theme={null}
{
  "output_url": "https://app.beam.cloud/output/id/dc443a80-7fcc-42bc-928b-4605e41b0825"
}
```

## Audio Example

Here’s an example of the generated audio output:

<audio>
  <source type="audio/mpeg" />
</audio>

## Summary

You’ve successfully deployed a Parler TTS text-to-speech API using Beam.


# Qwen2.5-7B with SGLang
Source: https://docs.beam.cloud/v2/examples/sglang


This guide demonstrates how to deploy a high-performance language model server using [SGLang](https://github.com/sgl-project/sglang) with the [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) model from Qwen. The server runs on Beam, providing an OpenAI-compatible API endpoint for text generation.

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/tree/main/sglang/">
  See the full code for this example on GitHub.
</Card>

## Overview

SGLang is a fast inference framework for large language models, optimized for low latency and high throughput. We use it to serve the Qwen2.5-7B-Instruct model, a 7-billion-parameter instruction-tuned model, on an H100 GPU via Beam’s `Pod` abstraction.

A test script demonstrates interaction with the server using the OpenAI Python client.

## Setup

First, create a file named `app.py`:

```python theme={null}
from beam import Image, Pod

# Image of SGLang and dependencies
image = (
    Image(python_version="python3.11")
    .add_python_packages([
        "transformers==4.47.1",
        "numpy<2",
        "fastapi[standard]==0.115.4",
        "pydantic==2.9.2",
        "starlette==0.41.2",
        "torch==2.4.0",
    ])
    .add_commands([
        'pip install "sglang[all]==0.4.1" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer/'
    ])
)

# Define the SGLang server Beam Pod
sglang_server = Pod(
    image=image,
    ports=[8080],
    cpu=12,
    memory="32Gi",
    gpu="H100",
    secrets=["HF_TOKEN"],
    entrypoint=[
        "python",
        "-m",
        "sglang.launch_server",
        "--model-path",
        "Qwen/Qwen2.5-7B-Instruct",
        "--port",
        "8080",
        "--host",
        "0.0.0.0",
    ],
)

# Deploy the pod
res = sglang_server.create()

print("SGLang server hosted at:", res.url)
```

## Deployment

Deploy the server by running the script with Python:

```bash theme={null}
python app.py
```

Here's the expected output, with the URL of the deployed app:

```bash theme={null}
=> Files synced
=> Creating container
=> Container created successfully ===> pod-b451fa2f-3c4a-47e0-bb37-333434fds22b66-add2d058
=> This container will timeout after 600 seconds.
=> Invocation details
curl -X POST 'https://b451fa2f-3c4a-47e0-bb37-333434fds22b66-8080.app.beam.cloud' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-d '{}'
SGLang server hosted at: https://b451fa2f-3c4a-47e0-bb37-333434fds22b66-8080.app.beam.cloud
```

## API Usage

The SGLang server exposes an OpenAI-compatible API at `/v1`. You can interact with it using the OpenAI Python client or any HTTP client.

### Test Script

Create a file named `test.py` to test the deployed server:

```python theme={null}
import openai

# Initialize OpenAI client with Beam endpoint and Beam API key
client = openai.Client(
    base_url="https://35b937b9-1a70-4343-89d9-1125b1290e4d-8080.app.beam.cloud/v1",
    api_key="BEAM_API_KEY",  # Replace with your actual Beam API key
)

# Send a chat completion request
response = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",
    messages=[
        {"role": "user", "content": "List 3 countries and their capitals."},
    ],
    temperature=0,
    max_tokens=64,
)

# Print the response
print(response.choices[0].message.content)
```

#### Running the Test

1. Replace `BEAM_API_KEY` with your actual Beam API key.
2. Update the `base_url` with your deployed pod’s URL.
3. Install the OpenAI client locally:

```bash theme={null}
pip install openai
```

4. Run the script:

```bash theme={null}
python test.py
```

Expected output for the prompt *"List 3 countries and their capitals."*:

```
1. France - Paris
2. Japan - Tokyo
3. Brazil - Brasília
```
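
If you prefer a plain HTTP client, the request made by the test script can be sketched with the standard library. This assumes the server follows the OpenAI chat-completions wire format (a `POST` to `/chat/completions` under the `/v1` base URL); `build_completion_request` is a hypothetical helper name, and the URL and key are placeholders.

```python
import json
import urllib.request


def build_completion_request(base_url: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the HTTP equivalent of the chat completion call."""
    body = {
        "model": "Qwen/Qwen2.5-7B-Instruct",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,
        "max_tokens": 64,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_completion_request(
    base_url="https://35b937b9-1a70-4343-89d9-1125b1290e4d-8080.app.beam.cloud/v1",  # replace with your pod URL
    api_key="BEAM_API_KEY",  # replace with your actual Beam API key
    prompt="List 3 countries and their capitals.",
)

# Sending the request and reading the reply, assuming the OpenAI response shape:
# with urllib.request.urlopen(request) as response:
#     data = json.loads(response.read())
#     print(data["choices"][0]["message"]["content"])
```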


# Running Streamlit Apps
Source: https://docs.beam.cloud/v2/examples/streamlit


You can easily deploy Streamlit apps on Beam. In this guide, we'll show you how to deploy a basic Streamlit app that visualizes a small dataset.

<Frame>
  <img />
</Frame>

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/tree/main/web_servers/streamlit_server">
  See the code for this example on GitHub.
</Card>

## App Structure

There are two components to deploying a Streamlit app on Beam:

1. A `start_server.py` file with your Beam code. You can view the source code [here](https://github.com/beam-cloud/examples/blob/main/web_servers/streamlit_server/app.py).
2. An `app.py` file that hosts the Streamlit app

Here's what the Beam-specific code looks like:

```python start_server.py theme={null}
from beam import Image, Pod

streamlit_server = Pod(
    image=Image().add_python_packages(["streamlit", "pandas", "altair", "requests"]),
    ports=[8501],  # Default port for streamlit
    cpu=1,
    memory=1024,
    entrypoint=["streamlit", "run", "app.py"],
)

res = streamlit_server.create()

print("Streamlit server hosted at:", res.url)
```

## Deployment

To run the app, run the script directly with Python:

```bash theme={null}
python start_server.py
```

Running this command will print the URL of the Streamlit app to the console.

```shell theme={null}
=> Creating container
=> Container created successfully ===> pod-15fba0c6-3fe8-408e-a0f8-c99cf166dcc9-97b6207e
=> Invocation details

curl -X GET 'https://15fba0c6-3fe8-408e-a0f8-c99cf166dcc9-8888.app.beam.cloud' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-d '{}'
```

You can enter the URL in your browser to view the Streamlit app!


# Fine-Tuning Meta Llama 3.1 8B with Unsloth
Source: https://docs.beam.cloud/v2/examples/unsloth


In this guide, we fine-tune the [Meta-Llama-3.1-8B-bnb-4bit](https://huggingface.co/unsloth/Meta-Llama-3.1-8B-bnb-4bit) model, optimized by Unsloth, using Low-Rank Adaptation (LoRA) on the [Alpaca-cleaned dataset](https://huggingface.co/datasets/yahma/alpaca-cleaned). We leverage Beam's infrastructure for compute and storage, then deploy an inference endpoint. Throughout the process, we'll track and evaluate our fine-tuning performance using Weights & Biases (wandb).

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/tree/main/unsloth">
  See the full code for this example on GitHub.
</Card>

## Setup

### Environment Configuration

We define a shared `Image` configuration for both fine-tuning and inference, ensuring consistency. The image includes necessary dependencies and installs Unsloth from its GitHub repository.

<Note>
  To use Weights & Biases (wandb) for tracking, you'll need your API key. You
  can find it in your [wandb dashboard](https://wandb.ai/settings#api) under the
  "API keys" section. Copy the key and use it in place of `YOUR_WANDB_KEY`, which
  is passed to the `wandb login` command.
</Note>

```python finetune.py theme={null}
from beam import Image

# Weights & Biases API Key (replace with your key)
WANDB_API_KEY = "YOUR_WANDB_KEY"

image = (
    Image(python_version="python3.11")
    .add_python_packages([
        "ninja",
        "packaging",
        "wheel",
        "torch",
        "xformers",
        "trl",
        "peft",
        "accelerate",
        "bitsandbytes",
        "wandb"
    ])
    .add_commands([
        "pip uninstall unsloth -y",
        'pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"',
        f"wandb login {WANDB_API_KEY}"
    ])
)

# Constants
MODEL_NAME = "unsloth/Meta-Llama-3.1-8B-bnb-4bit"
MAX_SEQ_LENGTH = 2048
VOLUME_PATH = "./model_storage"
```

## Fine-Tuning

The fine-tuning script (`finetune.py`) uses Unsloth to adapt the model to the Alpaca-cleaned dataset while tracking metrics with Weights & Biases.

```python theme={null}
from beam import endpoint, Image, Volume, env

# Weights & Biases API Key (replace with your key)
WANDB_API_KEY = "YOUR_WANDB_KEY"

if env.is_remote():
    import torch
    from unsloth import FastLanguageModel
    from transformers import TrainingArguments
    from trl import SFTTrainer
    from datasets import load_dataset
    import os
    import wandb

MODEL_NAME = "unsloth/Meta-Llama-3.1-8B-bnb-4bit"
MAX_SEQ_LENGTH = 2048
VOLUME_PATH = "./model_storage"
TRAIN_CONFIG = {
    "batch_size": 2,
    "grad_accumulation": 4,
    "max_steps": 60,
    "learning_rate": 2e-4,
    "seed": 3407,
}

image = (
    Image(python_version="python3.11")
    .add_python_packages(
        [
            "ninja",
            "packaging",
            "wheel",
            "torch",
            "xformers",
            "trl",
            "peft",
            "accelerate",
            "bitsandbytes",
            "wandb"
        ]
    )
    .add_commands(
        [
            "pip uninstall unsloth -y",
            'pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"',
        ]
    )
    .add_commands(
        [
            'echo "127.0.0.1 localhost" >> /etc/hosts',
            f"wandb login {WANDB_API_KEY}"
        ]
    )
)


@endpoint(
    name="unsloth-fine-tune",
    cpu=12,
    memory="32Gi",
    gpu="H100",
    image=image,
    volumes=[Volume(name="model-storage", mount_path=VOLUME_PATH)],
    timeout=-1,
)
def fine_tune_model():
    import os
    import wandb

    os.environ["WANDB_PROJECT"] = "llama-3.1-finetuning"
    os.environ["WANDB_LOG_MODEL"] = "checkpoint"

    output_dir = os.path.join(VOLUME_PATH, "fine_tuned_model")
    os.makedirs(output_dir, exist_ok=True)

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=MODEL_NAME, max_seq_length=MAX_SEQ_LENGTH, load_in_4bit=True
    )

    def format_alpaca_prompt(instruction, input_text, output):
        template = (
            "Below is an instruction that describes a task, paired with an input that "
            "provides further context. Write a response that appropriately completes the request.\n"
            "### Instruction:\n{}\n### Input:\n{}\n### Response:\n{}"
        )
        return template.format(instruction, input_text, output) + tokenizer.eos_token

    def format_dataset(examples):
        texts = [
            format_alpaca_prompt(instruction, input_text, output)
            for instruction, input_text, output in zip(
                examples["instruction"], examples["input"], examples["output"]
            )
        ]
        return {"text": texts}

    dataset = load_dataset("yahma/alpaca-cleaned", split="train")
    dataset = dataset.map(format_dataset, batched=True)

    model = FastLanguageModel.get_peft_model(
        model,
        r=16,
        target_modules=[
            "q_proj",
            "k_proj",
            "v_proj",
            "o_proj",
            "gate_proj",
            "up_proj",
            "down_proj",
        ],
        lora_alpha=16,
        lora_dropout=0,
        use_gradient_checkpointing="unsloth",
        random_state=TRAIN_CONFIG["seed"],
    )

    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=dataset,
        dataset_text_field="text",
        max_seq_length=MAX_SEQ_LENGTH,
        dataset_num_proc=2,
        packing=False,
        args=TrainingArguments(
            per_device_train_batch_size=TRAIN_CONFIG["batch_size"],
            gradient_accumulation_steps=TRAIN_CONFIG["grad_accumulation"],
            max_steps=TRAIN_CONFIG["max_steps"],
            learning_rate=TRAIN_CONFIG["learning_rate"],
            fp16=False,
            bf16=True,
            logging_steps=1,
            output_dir=output_dir,
            seed=TRAIN_CONFIG["seed"],
            report_to="wandb",
            save_steps=100,
        ),
    )

    with torch.autograd.set_detect_anomaly(True):
        trainer.train()

    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
    wandb.finish()

    return {
        "status": "success",
        "message": "Fine-tuning complete",
        "model_path": output_dir,
    }
```
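
To see what the formatting step produces, the two helper functions can be run on their own against a tiny in-memory batch. This sketch copies the template from the script; the `EOS_TOKEN` value is a placeholder (the real script appends `tokenizer.eos_token`), and the sample rows are made up for illustration.

```python
EOS_TOKEN = "</s>"  # placeholder; the real script uses tokenizer.eos_token

TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes the request.\n"
    "### Instruction:\n{}\n### Input:\n{}\n### Response:\n{}"
)


def format_alpaca_prompt(instruction: str, input_text: str, output: str) -> str:
    """Fill the Alpaca template and append the end-of-sequence token."""
    return TEMPLATE.format(instruction, input_text, output) + EOS_TOKEN


def format_dataset(examples: dict) -> dict:
    """Batched map function: three column lists in, one `text` column out."""
    texts = [
        format_alpaca_prompt(instruction, input_text, output)
        for instruction, input_text, output in zip(
            examples["instruction"], examples["input"], examples["output"]
        )
    ]
    return {"text": texts}


# Made-up rows in the same column layout as the dataset
batch = {
    "instruction": ["Add the numbers.", "Name a color."],
    "input": ["2 and 3", ""],
    "output": ["5", "Blue"],
}
formatted = format_dataset(batch)
print(formatted["text"][0])
```

Appending the end-of-sequence token to every example gives each training sequence an explicit end marker.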

### Running Fine-Tuning

Execute the script:

```bash theme={null}
python finetune.py
```

After completion, verify that the files are saved in your Beam Volume:

```bash theme={null}
beam ls model-storage/fine_tuned_model
```

Here's the expected output with the fine-tuned files:

```
  Name                           Size   Modified Time   IsDir
 ─────────────────────────────────────────────────────────────
  fine_tuned_model/README.md                  4.99 KiB   1 hour ago      No
  fine_tuned_model/adapter_config.json        805.00 B   1 hour ago      No
  fine_tuned_model/adapter_model.safeten…   160.06 MiB   1 hour ago      No
  fine_tuned_model/checkpoint-60/                        1 hour ago      Yes
  fine_tuned_model/special_tokens_map.js…     459.00 B   1 hour ago      No
  fine_tuned_model/tokenizer.json            16.41 MiB   1 hour ago      No
  fine_tuned_model/tokenizer_config.json     49.46 KiB   1 hour ago      No
  ...
```

### Training Performance Metrics

We tracked our fine-tuning process using Weights & Biases, which provided detailed metrics on training progress. The dashboard showed that the training loss started at approximately 1.85 and, despite significant fluctuations, exhibited a general downward trend, ending at around 0.95 by step 60. This suggests that the model was learning patterns from the Alpaca-cleaned dataset over the 60 training steps.

<Frame>
  <img />
</Frame>

Although the curve is noisy from step to step, the overall downward trend is consistent with the model learning from the Alpaca-cleaned dataset.
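
For a sense of scale, the numbers in `TRAIN_CONFIG` determine how much data the run actually sees. The arithmetic below assumes a single GPU, as requested by the endpoint, and uses the approximate start and end loss values quoted above.

```python
TRAIN_CONFIG = {"batch_size": 2, "grad_accumulation": 4, "max_steps": 60}

# Sequences contributing to each optimizer step on one GPU
effective_batch = TRAIN_CONFIG["batch_size"] * TRAIN_CONFIG["grad_accumulation"]

# Sequences processed over the whole run
sequences_seen = effective_batch * TRAIN_CONFIG["max_steps"]

# Relative drop between the approximate loss values read off the dashboard
start_loss, end_loss = 1.85, 0.95
relative_drop = (start_loss - end_loss) / start_loss

print(effective_batch)         # 8
print(sequences_seen)          # 480
print(f"{relative_drop:.0%}")  # 49%
```

This is a short run: 60 steps cover 480 training sequences, which is worth keeping in mind when reading the evaluation results.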

## Evaluation

To understand the impact of fine-tuning the Meta Llama 3.1 8B model with Unsloth on the Alpaca-cleaned dataset, we evaluated both the base model and the fine-tuned model on two widely used benchmarks: **HellaSwag** (a commonsense reasoning task) and **MMLU** (Massive Multitask Language Understanding, covering a broad range of subjects). The results highlight the fine-tuned model's improvements over the base model, demonstrating the effectiveness of our fine-tuning process.

### Overall Performance

The table below summarizes the overall performance on HellaSwag and MMLU. The fine-tuned model shows modest but consistent gains across both benchmarks.

| Benchmark             | Base Model | Fine-tuned Model | Improvement |
| --------------------- | ---------- | ---------------- | ----------- |
| HellaSwag (acc)       | 59.09%     | 60.37%           | +1.28%      |
| HellaSwag (acc\_norm) | 77.93%     | 78.75%           | +0.82%      |
| MMLU (overall)        | 61.42%     | 62.33%           | +0.91%      |

* **HellaSwag**: The fine-tuned model improves accuracy (acc) by 1.28 percentage points and normalized accuracy (acc\_norm) by 0.82 percentage points, indicating somewhat better commonsense reasoning.
* **MMLU**: An overall improvement of 0.91 percentage points suggests a small gain in general knowledge and reasoning across diverse topics.
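
The improvements in the table are absolute differences between two percentages. As a quick sanity check, the sketch below recomputes them from the reported scores and also derives the relative gain. The function and variable names are illustrative only and are not part of the evaluation code.

```python
# Scores (in percent) copied from the table above: (base, fine-tuned).
scores = {
    "HellaSwag (acc)": (59.09, 60.37),
    "HellaSwag (acc_norm)": (77.93, 78.75),
    "MMLU (overall)": (61.42, 62.33),
}


def improvement(base, tuned):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = round(tuned - base, 2)
    relative = round((tuned - base) / base * 100, 2)
    return absolute, relative


for name, (base, tuned) in scores.items():
    absolute, relative = improvement(base, tuned)
    print(f"{name}: +{absolute:.2f} points ({relative:.2f}% relative)")
```

Reporting both numbers makes it clear that a gain of about one point on a score near 60% is a relative change of only a few percent.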

### Analysis

The fine-tuned model demonstrates consistent improvements over the base model, particularly in tasks requiring logical reasoning, ethical judgment, and commonsense understanding. These gains align with the Alpaca-cleaned dataset's focus on instruction-following and coherent responses.

## Inference

### Inference Script

The inference script (`inference.py`) loads the fine-tuned model and exposes an endpoint for generating responses.

```python theme={null}
from beam import Image, endpoint, Volume, env

if env.is_remote():
    from unsloth import FastLanguageModel
    from unsloth.chat_templates import get_chat_template

image = (
    Image(python_version="python3.11")
    .add_python_packages(
        [
            "ninja",
            "packaging",
            "wheel",
            "torch",
            "xformers",
            "trl",
            "peft",
            "accelerate",
            "bitsandbytes",
        ]
    )
    .add_commands(
        [
            "pip uninstall unsloth -y",
            'pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"',
        ]
    )
)

MAX_SEQ_LENGTH = 2048
VOLUME_PATH = "./model_storage"


@endpoint(
    name="unsloth-inference",
    image=image,
    cpu=12,
    memory="32Gi",
    gpu="H100",
    timeout=-1,
    volumes=[Volume(name="model-storage", mount_path=VOLUME_PATH)],
)
def generate(**inputs):
    prompt = inputs.pop("prompt", None)

    if not prompt:
        return {"error": "Please provide a prompt"}

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=f"{VOLUME_PATH}/fine_tuned_model",
        max_seq_length=MAX_SEQ_LENGTH,
        load_in_4bit=True,
    )

    tokenizer = get_chat_template(
        tokenizer,
        chat_template="llama-3.1",
    )
    FastLanguageModel.for_inference(model)

    messages = [
        {
            "role": "user",
            "content": prompt,
        },
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to("cuda")

    outputs = model.generate(
        input_ids=inputs, max_new_tokens=64, use_cache=True, temperature=1.5, min_p=0.1
    )
    res = tokenizer.batch_decode(outputs)
    return {"output": res}
```

### Deploying the Endpoint

Run this command to deploy the inference endpoint:

```bash theme={null}
beam deploy inference.py:generate
```

You'll get back a URL with the endpoint:

```bash theme={null}
=> Building image
=> Syncing files
=> Deploying
=> Deployed
=> Invocation details
curl -X POST 'https://app.beam.cloud/endpoint/unsloth-inference/v1' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer {YOUR_AUTH_TOKEN}' \
-d '{"prompt": "Your prompt"}'
```
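
If you would rather call the endpoint from Python than from curl, the request can be built with the standard library. This is a minimal sketch: the URL and token are placeholders to replace with the values from your own deployment, and `build_request` and `call_endpoint` are helper names introduced here for illustration.

```python
import json
import urllib.request


def build_request(url, token, prompt):
    """Build the same POST request as the curl command above."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def call_endpoint(url, token, prompt, timeout=300):
    """Send the request and return the decoded JSON response."""
    request = build_request(url, token, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Placeholders: substitute the URL and token from your own deployment.
request = build_request(
    "https://app.beam.cloud/endpoint/unsloth-inference/v1",
    "YOUR_AUTH_TOKEN",
    "Your prompt",
)
print(request.get_method(), request.full_url)
```

Calling `call_endpoint(...)` with real credentials should return the dictionary produced by `generate`, either `{"output": [...]}` or `{"error": ...}`.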


# Run an OpenAI-Compatible vLLM Server
Source: https://docs.beam.cloud/v2/examples/vllm


In this example, we are going to use [vLLM](https://github.com/vllm-project/vllm) to host an OpenAI-compatible InternVL3 8B API on Beam.

<video />

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/tree/main/vllm">
  See the code for this example on GitHub.
</Card>

## Introduction to vLLM

[vLLM](https://github.com/vllm-project/vllm) is a high-performance, easy-to-use library for LLM inference. It can be up to 24 times faster than Hugging Face's Transformers library, and it lets you easily set up an OpenAI-compatible API for your LLM. Additionally, a number of LLMs (like Llama 3.1) support LoRA, so you can follow our [LoRA guide](/v2/examples/gemma-fine-tune) and host the resulting model using vLLM.

The key to vLLM's performance is Paged Attention. In LLMs, input tokens produce attention key and value tensors, which are typically stored in contiguous GPU memory. Paged Attention instead stores these keys and values in non-contiguous memory by partitioning them into blocks that are fetched only when they are needed.

> Because the blocks do not need to be contiguous in memory, we can manage the keys and values in a more flexible way as in OS’s virtual memory: one can think of blocks as pages, tokens as bytes, and sequences as processes. The contiguous logical blocks of a sequence are mapped to non-contiguous physical blocks via a block table. - [vLLM Explainer Doc](https://blog.vllm.ai/2023/06/20/vllm.html)
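
To make the block-table idea concrete, here is a toy sketch in plain Python. It is not vLLM's implementation: `ToyPagedCache`, `BLOCK_SIZE`, and the allocation order are all assumptions chosen for illustration. It only shows how a sequence's logical blocks can map to scattered physical blocks.

```python
BLOCK_SIZE = 4  # tokens per block (illustrative value)


class ToyPagedCache:
    """Toy block table: maps each sequence's logical blocks to physical blocks."""

    def __init__(self, num_physical_blocks):
        self.free_blocks = list(range(num_physical_blocks))
        self.block_tables = {}  # seq_id -> list of physical block ids
        self.lengths = {}       # seq_id -> number of tokens stored

    def append_token(self, seq_id):
        """Reserve space for one more token; allocate a block only when needed."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.lengths.get(seq_id, 0)
        if length % BLOCK_SIZE == 0:  # last block is full, or none exists yet
            if not self.free_blocks:
                raise MemoryError("no free physical blocks")
            table.append(self.free_blocks.pop())
        self.lengths[seq_id] = length + 1

    def locate(self, seq_id, token_index):
        """Translate a logical token position to (physical block, offset)."""
        if not 0 <= token_index < self.lengths.get(seq_id, 0):
            raise IndexError(token_index)
        block = self.block_tables[seq_id][token_index // BLOCK_SIZE]
        return block, token_index % BLOCK_SIZE


cache = ToyPagedCache(num_physical_blocks=8)
# Interleave two sequences so their blocks end up scattered in "memory".
for _ in range(6):
    cache.append_token("seq_a")
    cache.append_token("seq_b")

print(cache.block_tables)
print(cache.locate("seq_a", 5))
```

Each sequence sees a contiguous run of token positions, while the blocks backing it are allocated on demand and need not be adjacent, which mirrors the pages-and-processes analogy in the quote.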

# Hosting an OpenAI-Compatible Chat API with vLLM

With vLLM, we can host a fully functional chat API that works with existing OpenAI-compatible SDKs. You could build this functionality yourself, but vLLM provides a solid out-of-the-box solution.

## Initial Setup

To get started with vLLM on Beam, we can use the `VLLM` class from the Beam SDK. This class accepts the flags and arguments of the vLLM command-line tool through its `vllm_args` parameter.

### Setup Compute Environment

Let's take a look at the code required to deploy the `OpenGVLab/InternVL3-8B-AWQ` model with an efficient configuration. We start by defining the environment and the necessary arguments for our vLLM server.

```python models.py theme={null}
from beam.integrations import VLLM, VLLMArgs

MODEL_ID = "OpenGVLab/InternVL3-8B-AWQ"

# `beam deploy models.py:internvl` targets this variable by name
internvl = VLLM(
    name=MODEL_ID.split("/")[-1],
    cpu=4,
    memory="16Gi",
    gpu="A10G",
    gpu_count=1,
    workers=1,
    vllm_args=VLLMArgs(
        model=MODEL_ID,
        served_model_name=[MODEL_ID],
        trust_remote_code=True,
        max_model_len=4096,
        gpu_memory_utilization=0.90,
        limit_mm_per_prompt={"image": 2},
        quantization="awq",
        max_num_batched_tokens=8192,
    )
)
```

**Key Configuration Parameters:**

* `name`: A descriptive name for your Beam application.
* `cpu`: Number of CPU cores allocated (e.g., 4).
* `memory`: Amount of memory allocated (e.g., "16Gi").
* `gpu`: Type of GPU to use (e.g., "A10G").
* `gpu_count`: Number of GPUs (e.g., 1).
* `workers`: Number of worker processes for vLLM (e.g., 1).
* `vllm_args`: Arguments passed directly to the vLLM engine:
  * `model`: The Hugging Face model identifier.
  * `served_model_name`: Name under which the model is served.
  * `trust_remote_code`: Allows the model to execute custom code if required.
  * `max_model_len`: Maximum token sequence length for the model.
  * `gpu_memory_utilization`: Target GPU memory utilization (e.g., 0.90 for 90%).
  * `limit_mm_per_prompt`: (If applicable) Limits for multi-modal inputs.
  * `quantization`: Quantization method of the model weights (e.g., "awq"). Setting it explicitly tells vLLM how to load the checkpoint, even when the model name already indicates a pre-quantized model.
  * `max_num_batched_tokens`: Sets the capacity for tokens in a batch for dynamic batching (e.g., 8192).

**Equivalent vLLM Command Line (for reference):**

The `VLLM` integration in Beam simplifies deployment. If you were to run a similar configuration using the `vllm serve` command-line tool directly, some of the corresponding arguments would be:

```bash theme={null}
vllm serve OpenGVLab/InternVL3-8B-AWQ \
    --trust-remote-code \
    --max-model-len 4096 \
    --limit-mm-per-prompt image=2 \
    --quantization awq \
    --max-num-batched-tokens 8192 \
    --gpu-memory-utilization 0.90
# Note: Parameters like cpu, memory, gpu_count, and workers are managed by Beam's infrastructure.
```
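
The correspondence between the Python arguments and the command-line flags follows a simple naming pattern: underscores become hyphens, and booleans become bare switches. The sketch below illustrates that pattern with a hypothetical helper, `to_cli_flags`. It is not how Beam translates `VLLMArgs` internally, and it only covers the value types used in this example.

```python
def to_cli_flags(args):
    """Convert keyword-style engine arguments into `vllm serve`-style flags."""
    flags = []
    for key, value in args.items():
        flag = "--" + key.replace("_", "-")
        if value is True:
            flags.append(flag)  # boolean switch, no value
        elif value is False or value is None:
            continue  # omitted entirely
        elif isinstance(value, dict):
            pairs = ",".join(f"{k}={v}" for k, v in value.items())
            flags.extend([flag, pairs])
        else:
            flags.extend([flag, str(value)])
    return flags


engine_args = {
    "trust_remote_code": True,
    "max_model_len": 4096,
    "limit_mm_per_prompt": {"image": 2},
    "quantization": "awq",
    "max_num_batched_tokens": 8192,
    "gpu_memory_utilization": 0.90,
}

print("vllm serve OpenGVLab/InternVL3-8B-AWQ " + " ".join(to_cli_flags(engine_args)))
```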

## Deploying the API

To deploy our model, we can run the following command:

```bash theme={null}
beam deploy models.py:internvl
```

The output will look like this:

```bash theme={null}
=> Building image
=> Using cached image
=> Syncing files
Reading .beamignore file
Collecting files from /Users/minzi/Dev/beam/ex-repo/vllm
Added /Users/minzi/Dev/beam/ex-repo/vllm/models.py
Added /Users/minzi/Dev/beam/ex-repo/vllm/tool_chat_template_mistral.jinja
Added /Users/minzi/Dev/beam/ex-repo/vllm/README.md
Added /Users/minzi/Dev/beam/ex-repo/vllm/chat.py
Added /Users/minzi/Dev/beam/ex-repo/vllm/inference.py
Collected object is 14.46 KB
=> Files already synced
=> Deploying
=> Deployed
=> Invocation details
curl -X POST 'https://internvl-15c4487-v4.app.beam.cloud' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer YOUR_TOKEN' \
-d '{}'
```

## Using the API

### Pre-requisites

Once your function is deployed, you can interact with it using the OpenAI Python client.

To get started, you can clone the [example repository](https://github.com/beam-cloud/examples/tree/main/vllm) and run the `chat.py` script.

<Warning>
  Make sure you have the `openai` library installed locally, since that is how
  we interact with the deployed API.
</Warning>

```bash theme={null}
git clone https://github.com/beam-cloud/examples.git
cd examples/vllm
pip install openai
python chat.py
```

### Starting a Dialogue

You will be greeted with a prompt to enter the URL of your deployed function.

Once you enter the URL, the container will initialize on Beam and you will be able to interact with the model.

```bash theme={null}
Welcome to the CLI Chat Application!

Type 'quit' to exit the conversation.

Enter the app URL: https://internvl-instruct-15c4487-v3.app.beam.cloud

Model OpenGVLab/InternVL2_5-8B is ready

Question: What is in this image?

Image link (press enter to skip): https://upload.wikimedia.org/wikipedia/commons/7/74/White_domesticated_duck,_stretching.jpg

Assistant:  The image you've shared is of a white duck standing on a grassy field. The duck, with its distinctive orange beak and feet, is facing to the left.
```
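
If you want to call the deployment from your own code instead of `chat.py`, the request is a standard OpenAI chat completion. The sketch below is hedged: it assumes the server exposes the OpenAI-compatible routes under `/v1` and that your Beam token is passed as the API key. `build_messages` and `ask` are illustrative helper names, not part of the example repository.

```python
def build_messages(question, image_url=None):
    """Build an OpenAI-style chat message list, optionally with one image."""
    if not image_url:
        return [{"role": "user", "content": question}]
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]


def ask(app_url, token, model, question, image_url=None):
    """Send one chat turn to the deployed server and return the reply text."""
    from openai import OpenAI  # imported lazily; requires `pip install openai`

    # Assumption: the OpenAI-compatible API is served under /v1.
    client = OpenAI(api_key=token, base_url=f"{app_url.rstrip('/')}/v1")
    response = client.chat.completions.create(
        model=model,
        messages=build_messages(question, image_url),
    )
    return response.choices[0].message.content


messages = build_messages(
    "What is in this image?",
    "https://upload.wikimedia.org/wikipedia/commons/7/74/White_domesticated_duck,_stretching.jpg",
)
print(messages[0]["content"][0])
```

The `model` argument passed to `ask` should match the `served_model_name` configured in `VLLMArgs`.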

To host other models, you can simply change the arguments you pass into the `VLLM` class.

<CodeGroup>
  ```python Yi Coder 9B Chat theme={null}
  from beam.integrations import VLLM, VLLMArgs

  YI_CODER_CHAT = "01-ai/Yi-Coder-9B-Chat"

  yicoder_chat = VLLM(
      name=YI_CODER_CHAT.split("/")[-1],
      cpu=8,
      memory="16Gi",
      gpu="H100",
      vllm_args=VLLMArgs(
          model=YI_CODER_CHAT,
          served_model_name=[YI_CODER_CHAT],
          task="chat",
          trust_remote_code=True,
          max_model_len=8096,
      ),
  )
  ```

  ```python Mistral 7B Instruct v0.3 theme={null}
  from beam.integrations import VLLM, VLLMArgs

  MISTRAL_INSTRUCT = "mistralai/Mistral-7B-Instruct-v0.3"

  mistral_instruct = VLLM(
      name=MISTRAL_INSTRUCT.split("/")[-1],
      cpu=8,
      memory="16Gi",
      gpu="H100",
      secrets=["HF_TOKEN"],
      vllm_args=VLLMArgs(
          model=MISTRAL_INSTRUCT,
          served_model_name=[MISTRAL_INSTRUCT],
          chat_template="./tool_chat_template_mistral.jinja",
          enable_auto_tool_choice=True,
          tool_call_parser="mistral",
      ),
  )
  ```
</CodeGroup>


# Web Scraping with Beam Functions
Source: https://docs.beam.cloud/v2/examples/web-scraping


In this example, we'll demonstrate how to build a Wikipedia web scraper using Beam functions. While you could run this on a local computer, Beam provides access to more powerful computational resources, allowing you to add advanced features to your webscraper using large language models or OCR models.

<video />

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/tree/main/web_scraping">
  See the code for this example on GitHub.
</Card>

## Defining our Scraping Function

We will start by defining our scraping function. This is the Beam function that will be invoked remotely. We use the [`Image` class](/v2/environment/custom-images) from the `beam` SDK to install the `requests` and `beautifulsoup4` packages in the container running your code.

```python theme={null}
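import json  # used when writing the scraped data to a file
from urllib.parse import urljoin, urlparse  # urlparse is used by the crawler class

from beam import Image, function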


@function(image=Image().add_python_packages(["requests", "beautifulsoup4"]))
def scrape_page(url):
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(url)
    if response.status_code != 200:
        return {"url": url, "title": "", "content": "", "links": []}

    soup = BeautifulSoup(response.text, "html.parser")
    title = soup.find(id="firstHeading").text
    content = soup.find(id="mw-content-text").find(class_="mw-parser-output")

    if not content:
        return {"url": url, "title": title, "content": "", "links": []}

    paragraphs = [p.text for p in content.find_all("p", recursive=False)]
    links = [urljoin(url, link["href"]) for link in content.find_all("a", href=True)]

    return {
        "url": url,
        "title": title,
        "content": "\n\n".join(paragraphs),
        "links": links,
    }
```

Our function takes in a URL, fetches the page's HTML, and then uses [BeautifulSoup](https://beautiful-soup-4.readthedocs.io/en/latest/) to extract the page's title, content, and links. It returns that content in a dictionary so that our crawler can invoke new functions with the links found on the page. If we wanted, we could add more functionality to this function to extract or process the content in a variety of ways. For example, we could add a language model to summarize the content or use an OCR model to extract text from an image.
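
Most links in a Wikipedia article are relative, which is why `scrape_page` passes every `href` through `urljoin`. The short example below shows how the standard library resolves a few typical `href` values against the page URL. The sample values are made up for illustration.

```python
from urllib.parse import urljoin

page_url = "https://en.wikipedia.org/wiki/Web_scraping"

# Typical href values found in an article body (illustrative samples).
hrefs = [
    "/wiki/Data_scraping",                      # root-relative link
    "#History",                                 # fragment on the same page
    "//commons.wikimedia.org/wiki/File:X.png",  # protocol-relative link
    "https://example.com/",                     # already absolute
]

absolute_links = [urljoin(page_url, href) for href in hrefs]
for link in absolute_links:
    print(link)
```

Every result is an absolute URL, so the crawler can filter and queue links without knowing which page they came from.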

## Building a Batch Crawler with Beam's Function Map

Next, we'll build a crawler that will use Beam's `map` method to invoke our `scrape_page` function on a list of URLs. Below is the `__init__` method for the crawler.

```python theme={null}
class WikipediaCrawler:
    def __init__(self, start_url, max_pages=100, batch_size=5):
        self.start_url = start_url
        self.max_pages = max_pages
        self.batch_size = batch_size
        self.visited_pages = set()
        self.pages_to_visit = [start_url]
        self.scraped_data = {}
```

Our crawler takes in a starting URL, a maximum number of pages to scrape, and a batch size. The batch size determines how many remote function invocations we will make at a time.

Next, we'll define the actual `crawl` method along with a helper method to determine if a URL is a valid Wikipedia URL.

```python theme={null}
    def is_wikipedia_url(self, url):
        parsed_url = urlparse(url)
        return parsed_url.netloc.endswith(
            "wikipedia.org"
        ) and parsed_url.path.startswith("/wiki/")

    def crawl(self):
        while len(self.scraped_data) < self.max_pages and self.pages_to_visit:
            # Create a batch of up to `batch_size` pages that we have not yet visited
            batch = []
            while len(batch) < self.batch_size and self.pages_to_visit:
                p = self.pages_to_visit.pop(0)
                if p not in self.visited_pages:
                    # Mark the page as visited so it is never scraped twice
                    self.visited_pages.add(p)
                    batch.append(p)

            for result in scrape_page.map(batch):
                # Save the result and collect new links
                self.scraped_data[result["url"]] = result
                if len(self.scraped_data) < self.max_pages:
                    new_links = [
                        link
                        for link in result["links"]
                        if self.is_wikipedia_url(link)
                        and link not in self.visited_pages
                        and link not in self.pages_to_visit
                    ]
                    self.pages_to_visit.extend(new_links)

        print(f"Crawling completed. Scraped {len(self.scraped_data)} pages.")

    def get_scraped_data(self):
        return self.scraped_data
```

The crawl method runs continuously until we have scraped the maximum number of pages or there are no more pages to visit. It creates a batch of URLs to scrape and then passes them to the `scrape_page` function's `map` method. This allows us to scrape multiple pages in parallel. After the pages are scraped, we collect any new links that we want to visit and add them to the `pages_to_visit` list.
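
To see which links the crawler will follow, it can help to run the URL filter on a few samples. The sketch below restates the helper as a plain function. The sample URLs and the `strip_fragment` helper are additions for illustration and are not part of the crawler.

```python
from urllib.parse import urldefrag, urlparse


def is_wikipedia_url(url):
    """Same check as the crawler's helper method, written as a plain function."""
    parsed_url = urlparse(url)
    return parsed_url.netloc.endswith("wikipedia.org") and parsed_url.path.startswith(
        "/wiki/"
    )


def strip_fragment(url):
    """Hypothetical helper: drop '#section' so one page is not queued twice."""
    return urldefrag(url).url


samples = [
    "https://en.wikipedia.org/wiki/Data_scraping",
    "https://en.wikipedia.org/w/index.php?title=Web_scraping&action=edit",
    "https://commons.wikimedia.org/wiki/File:X.png",
    "https://en.wikipedia.org/wiki/Web_scraping#History",
]

for url in samples:
    print(is_wikipedia_url(url), strip_fragment(url))
```

Note that a link with a fragment passes the filter and is treated as a distinct URL, so normalizing links before queuing them is one way to avoid scraping the same article more than once.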

## Running the Batch Crawler

Finally, we can run our crawler. Below is the code for our `main` function which initializes the crawler and runs the crawl method.

```python theme={null}
if __name__ == "__main__":
    start_url = "https://en.wikipedia.org/wiki/Web_scraping"
    crawler = WikipediaCrawler(start_url, max_pages=20)
    crawler.crawl()

    # Write the scraped data to a file
    with open("scraped_data.json", "w") as f:
        json.dump(crawler.get_scraped_data(), f)
```

This code initializes the crawler with a starting URL and a maximum number of pages to scrape. It then runs the crawl method and writes the scraped data to a file. You can run this code like any other Python script:

```bash theme={null}
python batch_crawl.py
```

When you run this code, you should see output that looks like the following:

```bash theme={null}
=> Building image
=> Using cached image
=> Syncing files
...
=> Uploading
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 727.9/727.9 kB 0:00:00
=> Files synced
=> Running function: <app2:scrape_page>
=> Function complete <21f88938-8b82-465c-8b16-8bc0259e1997>
=> Running function: <app2:scrape_page>
=> Running function: <app2:scrape_page>
=> Running function: <app2:scrape_page>
=> Running function: <app2:scrape_page>
=> Running function: <app2:scrape_page>
=> Function complete <0f384b7a-98da-400e-bcc5-abacf7f239ef>
=> Function complete <16da6df3-955d-4ad7-a8ec-c6456ab6de1e>
=> Function complete <2dd4c91b-a48f-4d7e-ada3-485633539ee5>
=> Function complete <04452bb5-f642-43e3-9d0f-9cb7532c0d4b>
=> Function complete <7ba94632-1907-415a-acad-c37a2cddd97e>
```

The output shows five function invocations in parallel. Once the scraping is complete, you can see the results in the `scraped_data.json` file. It will look something like this:

```json theme={null}
{
    "https://en.wikipedia.org/wiki/Web_scraping": {
        "url": "https://en.wikipedia.org/wiki/Web_scraping",
        "title": "Web scraping",
        "content": "Web scraping, web harvesting, or web data extraction ...",
        "links": [
            "https://en.wikipedia.org/wiki/Data_scraping",
            ...
        ]
    },
    "https://en.wikipedia.org/wiki/Wikipedia:Verifiability": {
        "url": "https://en.wikipedia.org/wiki/Wikipedia:Verifiability",
        "title": "Wikipedia:Verifiability",
        "content": "\n\n\nIn the English Wikipedia, ...",
        "links": [
            ...
        ]
    },
    ...
}
```

## Building a Continuous Crawler with Beam Functions and Threads

The batched web crawler is a good starting point, but it requires waiting for a full batch to finish before starting any new jobs. If we want to keep our concurrency limit continuously saturated, we can use Beam functions in conjunction with Python threads.

To do this, we will use the same `scrape_page` function, but instead of using the `map` method, we will use a thread pool to invoke the function in parallel. Below is the code for our `WikipediaCrawler` class with a continuous crawl method.

```python theme={null}
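    # Requires `import concurrent.futures` at the top of the file
    def process_scraped_page(self, result):
        if not result or len(self.scraped_data) >= self.max_pages:
            return

        self.scraped_data[result["url"]] = result
        if len(self.scraped_data) < self.max_pages:
            # Queue only Wikipedia links we have not already visited or queued
            new_links = [
                link
                for link in result["links"]
                if self.is_wikipedia_url(link)
                and link not in self.visited_pages
                and link not in self.pages_to_visit
            ]
            self.pages_to_visit.extend(new_links)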

    def crawl(self):
        with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
            futures = {}
            while len(self.scraped_data) < self.max_pages and (
                self.pages_to_visit or futures
            ):
                # Start new tasks if we have capacity and pages to visit
                while len(futures) < 5 and self.pages_to_visit:
                    url = self.pages_to_visit.pop(0)
                    self.visited_pages.add(url)
                    future = executor.submit(scrape_page.remote, url)
                    futures[future] = url

                # Wait for any task to complete
                if futures:
                    done, _ = concurrent.futures.wait(
                        futures, return_when=concurrent.futures.FIRST_COMPLETED
                    )
                    for future in done:
                        url = futures.pop(future)
                        try:
                            result = future.result()
                            self.process_scraped_page(result)
                        except Exception as e:
                            print(f"Error processing {url}: {str(e)}")

        print(f"Crawling completed. Scraped {len(self.scraped_data)} pages.")
```

This code is more complex than the batch crawler, but it allows us to better utilize our compute resources. Instead of having containers sitting idle while other containers complete their work, we immediately send a new function invocation as soon as another one completes. To do this, we track the futures returned by the `executor.submit` method and wait for any of them to complete using the `concurrent.futures.wait` method. We specify that we only want to wait for one of the futures to complete using the `concurrent.futures.FIRST_COMPLETED` constant. This means that as soon as any future completes, we will process the result and add new work to the pool.
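
The refill pattern is easier to see in isolation. The following sketch runs entirely on your machine: `fake_scrape` is a stand-in for `scrape_page.remote` that just sleeps, and the job names and durations are invented for the demonstration.

```python
import concurrent.futures
import time


def fake_scrape(job):
    """Stand-in for scrape_page.remote: sleeps for the job's duration."""
    name, seconds = job
    time.sleep(seconds)
    return name


def run_saturated(jobs, max_in_flight=2):
    """Keep up to max_in_flight jobs running; refill as soon as one finishes."""
    pending = list(jobs)
    finished = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_in_flight) as executor:
        futures = {}
        while pending or futures:
            # Top up the pool whenever there is spare capacity.
            while len(futures) < max_in_flight and pending:
                job = pending.pop(0)
                futures[executor.submit(fake_scrape, job)] = job
            # Block until at least one in-flight job completes.
            done, _ = concurrent.futures.wait(
                futures, return_when=concurrent.futures.FIRST_COMPLETED
            )
            for future in done:
                futures.pop(future)
                finished.append(future.result())
    return finished


jobs = [("slow", 0.3), ("fast-1", 0.05), ("fast-2", 0.05), ("fast-3", 0.05)]
print(run_saturated(jobs))
```

With two slots, the short jobs cycle through one slot while the slow job occupies the other. A batch-based approach would instead leave a slot idle until the slow job in each batch finished.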

## Running the Continuous Crawler

To run the continuous crawler, you can use the same `main` function as before. When you run this code, you should see output that looks like the following:

```bash theme={null}
=> Building image
=> Using cached image
=> Syncing files
...
=> Files already synced
=> Running function: <app:scrape_page>
=> Function complete <5c059d2c-2570-4ac1-8c8b-11d96543d197>
=> Running function: <app:scrape_page>
=> Running function: <app:scrape_page>
=> Running function: <app:scrape_page>
=> Running function: <app:scrape_page>
=> Running function: <app:scrape_page>
=> Function complete <4f0ff5f6-12db-448f-acad-66362485c988>
=> Running function: <app:scrape_page>
=> Function complete <b4ada76f-a567-4b43-9d40-0481f3fccd6f>
=> Running function: <app:scrape_page>
=> Function complete <44ae4cbe-6de1-4442-a28c-9adb32937a03>
=> Running function: <app:scrape_page>
=> Function complete <5788c95d-afb7-428d-b906-d0378f099d58>
=> Running function: <app:scrape_page>
=> Function complete <dedfaf60-9430-4ece-83b8-58c73ad88f30>
```

As you can see, as soon as one function invocation completes, we immediately start a new one.


# Faster Whisper
Source: https://docs.beam.cloud/v2/examples/whisper


This guide will walk you through deploying and invoking a transcription API using the Faster Whisper model on Beam. The API can be invoked with either a URL to an `.mp3` file or a base64-encoded audio file.

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/tree/main/audio_and_transcription/faster_whisper">
  See the code for this example on GitHub.
</Card>

## Initial Setup

In your Python file, add the following code to define your endpoint and handle the transcription:

```python app.py theme={null}
from beam import endpoint, Image, Volume, env
import base64
import requests
from tempfile import NamedTemporaryFile

BEAM_VOLUME_PATH = "./cached_models"

# These packages are installed in the remote container, so import them only there
if env.is_remote():
    from faster_whisper import WhisperModel, download_model

# This runs once when the container first starts
def load_models():
    model_path = download_model("large-v3", cache_dir=BEAM_VOLUME_PATH)
    model = WhisperModel(model_path, device="cuda", compute_type="float16")
    return model

@endpoint(
    on_start=load_models,
    name="faster-whisper",
    cpu=2,
    memory="32Gi",
    gpu="A10G",
    image=Image(
        base_image="nvidia/cuda:12.2.2-cudnn8-runtime-ubuntu22.04",
        python_version="python3.10",
    )
    .add_python_packages(["git+https://github.com/SYSTRAN/faster-whisper.git", "huggingface_hub[hf-transfer]"])
    .with_envs("HF_HUB_ENABLE_HF_TRANSFER=1"),
    volumes=[
        Volume(
            name="cached_models",
            mount_path=BEAM_VOLUME_PATH,
        )
    ],
)
def transcribe(context, **inputs):
    # Retrieve cached model from on_start
    model = context.on_start_value

    # Inputs passed to API
    language = inputs.get("language")
    audio_base64 = inputs.get("audio_file")
    url = inputs.get("url")

    if audio_base64 and url:
        return {"error": "Only a base64 audio file OR a URL can be passed to the API."}
    if not audio_base64 and not url:
        return {
            "error": "Please provide either an audio file in base64 string format or a URL."
        }

    binary_data = None

    if audio_base64:
        binary_data = base64.b64decode(audio_base64.encode("utf-8"))
    elif url:
        resp = requests.get(url)
        binary_data = resp.content

    text = ""

    with NamedTemporaryFile() as temp:
        try:
            # Write the audio data to the temporary file
            temp.write(binary_data)
            temp.flush()

            segments, _ = model.transcribe(temp.name, beam_size=5, language=language)

            for segment in segments:
                text += segment.text + " "

            print(text)
            return {"text": text}

        except Exception as e:
            return {"error": f"Something went wrong: {e}"}
```

## Deployment

To deploy the app, run the following command:

<Info>
  If you named your file something different than `app.py`, make sure to
  customize the command with your correct file name.
</Info>

```bash theme={null}
beam deploy app.py:transcribe
```

This command will deploy your app as a web endpoint. The endpoint URL will be printed out in the shell.

## Invoking the API

Once the API is running, you can invoke it with a URL to an `.mp3` file using the following cURL command:

<Tip>
  If you want to test with sample `.mp3` files, you can find many samples on
  [this website](https://audio-samples.github.io/).
</Tip>

```sh theme={null}
curl -X POST 'https://faster-whisper-7157fd0-v1.app.beam.cloud' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer [YOUR-AUTH-TOKEN]' \
-d '{"url":"https://audio-samples.github.io/samples/mp3/blizzard_unconditional/sample-0.mp3"}'
```

Replace the URL with the URL printed in your shell, and `[YOUR-AUTH-TOKEN]` with your authentication token.
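
The endpoint also accepts the audio itself, base64-encoded under the `audio_file` key, along with an optional `language`. Here is a minimal sketch of building that request body in Python. `build_payload` is an illustrative helper, and the byte string stands in for the contents of a real audio file.

```python
import base64
import json


def build_payload(audio_bytes, language=None):
    """Build the JSON body for sending raw audio instead of a URL."""
    payload = {"audio_file": base64.b64encode(audio_bytes).decode("utf-8")}
    if language:
        payload["language"] = language
    return json.dumps(payload)


# Stand-in bytes; in practice, read them with open("sample.mp3", "rb").read()
# ("sample.mp3" is a hypothetical file name).
fake_audio = b"\x00\x01\x02not-real-audio"
body = build_payload(fake_audio, language="en")

# The endpoint reverses the encoding with base64.b64decode.
decoded = base64.b64decode(json.loads(body)["audio_file"].encode("utf-8"))
print(decoded == fake_audio)
```

Send the resulting string as the body of the same POST request shown above, in place of the `{"url": ...}` payload. Do not include both `url` and `audio_file`, since the endpoint rejects requests that contain both.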

## Summary

You've successfully set up a highly performant serverless API for transcribing audio files using the Faster Whisper model on Beam. The API can handle both URLs to audio files and base64-encoded audio files. With the provided setup, you can easily serve, invoke, and develop your transcription API.


# Zonos
Source: https://docs.beam.cloud/v2/examples/zonos


This guide demonstrates how to deploy a Text-to-Speech (TTS) API using the [Zonos model](https://github.com/Zyphra/Zonos) from Zyphra. The API converts input text into spoken audio, leveraging a pre-trained transformer model and speaker embeddings derived from an example audio file. We use Beam’s infrastructure for compute and file output handling.

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/tree/main/audio_and_transcription/zonos">
  See the full code for this example on GitHub.
</Card>

## Setup

### Environment Configuration

First, create a file named `app.py`:

```python theme={null}
from beam import Image, endpoint, Output, env

if env.is_remote():
    import torchaudio
    from zonos.model import Zonos
    from zonos.conditioning import make_cond_dict
    from zonos.utils import DEFAULT_DEVICE as device
    import os
    import uuid

# Custom image configuration
image = (
    Image(
        base_image="nvidia/cuda:12.4.1-devel-ubuntu22.04",
        python_version="python3.11"
    )
    .add_commands(["apt update && apt install -y espeak-ng git"])
    .add_commands([
        "pip install -U uv",
        "git clone https://github.com/Zyphra/Zonos.git /tmp/Zonos",
        "cd /tmp/Zonos && pip install setuptools wheel && pip install -e .",
    ])
)

@endpoint(
    name="zonos-tts",
    image=image,
    cpu=12,
    memory="32Gi",
    gpu="H100",
    timeout=-1
)
def generate(**inputs):
    text = inputs.get("text")

    if not text:
        return {"error": "Please provide a text"}

    os.chdir("/tmp/Zonos")

    model = Zonos.from_pretrained("Zyphra/Zonos-v0.1-transformer", device=device)

    wav, sampling_rate = torchaudio.load("assets/exampleaudio.mp3")
    speaker = model.make_speaker_embedding(wav, sampling_rate)

    cond_dict = make_cond_dict(text=text, speaker=speaker, language="en-us")
    conditioning = model.prepare_conditioning(cond_dict)

    codes = model.generate(conditioning)

    # Save generated audio
    file_name = f"/tmp/zonos_out_{uuid.uuid4()}.wav"
    wavs = model.autoencoder.decode(codes).cpu()
    torchaudio.save(file_name, wavs[0], model.autoencoder.sampling_rate)

    # Upload and get public URL
    output_file = Output(path=file_name)
    output_file.save()
    public_url = output_file.public_url(expires=1200000000)

    return {"output_url": public_url}

if __name__ == "__main__":
    generate()
```

## Deployment

Run this command to deploy the endpoint:

```bash theme={null}
beam deploy app.py:generate
```

It will return a URL with the endpoint:

```bash theme={null}
=> Building image
=> Syncing files
=> Deploying
=> Deployed
=> Invocation details
curl -X POST 'https://app.beam.cloud/endpoint/zonos-tts/v1' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer {YOUR_AUTH_TOKEN}' \
-d '{"text": "On Beam run AI workloads anywhere with zero complexity."}'
```

## API Usage

The deployed endpoint accepts POST requests with a JSON payload containing the text to convert to speech.

### Request Format

```json theme={null}
{
  "text": "Your text to convert to speech"
}
```

### Example Request

```bash theme={null}
curl -X POST 'https://app.beam.cloud/endpoint/zonos-tts/v1' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer {YOUR_AUTH_TOKEN}' \
-d '{"text": "On Beam run AI workloads anywhere with zero complexity. One line of Python, global GPUs, full control"}'
```

### Example Response

The API returns a JSON object with a URL to the generated audio file:

```json theme={null}
{
  "output_url": "https://app.beam.cloud/output/id/704defd0-9370-4499-9124-677925e64961"
}
```
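
A client will usually want to pull the URL out of the response and save the audio locally. The sketch below shows one way to do that with the standard library. The helper names and the `speech.wav` destination are assumptions, and it presumes the returned URL can be fetched with a plain GET request.

```python
import json
import urllib.request


def extract_output_url(response_body):
    """Pull the audio URL out of the endpoint's JSON response."""
    data = json.loads(response_body)
    if "error" in data:
        raise ValueError(data["error"])
    return data["output_url"]


def download_audio(output_url, destination="speech.wav"):
    """Fetch the generated audio and write it to a local file."""
    with urllib.request.urlopen(output_url) as resp, open(destination, "wb") as f:
        f.write(resp.read())
    return destination


sample = '{"output_url": "https://app.beam.cloud/output/id/704defd0-9370-4499-9124-677925e64961"}'
print(extract_output_url(sample))
```

Checking for the `error` key first matters because the endpoint returns `{"error": "Please provide a text"}` when the `text` field is missing.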


# Distributed Maps
Source: https://docs.beam.cloud/v2/function/maps

Using Beam's distributed Map

Beam includes a concurrency-safe distributed map, accessible both locally and within remote containers. Serialization is done using cloudpickle, so any pickleable object will work. The interface is that of a standard Python dictionary, but unlike a typical dictionary, it persists between runs.
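
If you are unsure whether a value can be stored, you can test it locally before writing it to the map. The sketch below uses the standard `pickle` module as a conservative stand-in for cloudpickle, on the assumption that anything `pickle` accepts will also serialize with cloudpickle. `is_pickleable` is an illustrative helper, not part of the Beam SDK.

```python
import pickle
import threading


def is_pickleable(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


# Plain data structures serialize without trouble.
print(is_pickleable({"scores": [0.1, 0.2], "tags": ("a", "b")}))

# Objects tied to OS-level resources, such as locks, do not.
print(is_pickleable(threading.Lock()))
```

Because cloudpickle supports more types than `pickle` (lambdas, for example), a `False` result here does not always mean the value is unusable, but a `True` result is a good sign.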

## Example: Accessing a map locally and remotely

In the following example, we create a distributed map. Our first function is invoked remotely using `.remote()`, and it sets the value of a key in our map. The second function is invoked locally using `.local()`, and it sets another value. Finally, we print the result of our third, remotely invoked function, which retrieves the values we just set.

```python theme={null}
from beam import Map, function


@function()
def first():
    m = Map(name="m")
    m["beam"] = "me up"
    return


@function()
def second():
    m = Map(name="m")
    m["speed"] = "of light"
    return


@function()
def third():
    m = Map(name="m")
    return [m["beam"], m["speed"]]


if __name__ == '__main__':
    first.remote()
    second.local()
    print(third.remote())
```

You can run the example above with `python app.py`. The output will be:

```bash theme={null}
['me up', 'of light']
```


# Queues
Source: https://docs.beam.cloud/v2/function/queues

Using Beam's distributed Queue to coordinate between tasks

Beam includes a concurrency-safe distributed queue, accessible both locally and within remote containers.

Serialization is done using cloudpickle, so any object it supports should work here. The interface is that of a standard Python queue.

Because this is backed by a distributed queue, it will persist between runs.

In the example below, we run one function remotely on Beam and another locally. The remote function puts a value in the queue, and the local function pops it out and prints it. The output will be `beam me up`.

```python Simple Queue theme={null}
from beam import Queue, function


@function()
def first():
    q = Queue(name="q")
    q.put("beam me up")
    return


@function()
def second():
    q = Queue(name="q")
    print(q.pop())
    return


if __name__ == '__main__':
    first.remote()
    second.local()
```


# Running Functions Remotely
Source: https://docs.beam.cloud/v2/function/running-functions

A short guide on using Beam to run one-off functions in the cloud

You can add a decorator to any Python function to run it remotely on Beam:

```python app.py theme={null}
from beam import function


@function()
def handler():
    return {"message": "hello world"}

if __name__ == "__main__":
    handler.remote()
```

Just run this like a normal Python file, and the code will run on Beam's cloud and stream the response back to your shell.

```sh theme={null}
$ python app.py

=> Building image
=> Using cached image
=> Syncing files
=> Uploading
=> Files synced
=> Running function: <app:handler>
Loading image <d055bc4ee4ad0e61>...
Loaded image <d055bc4ee4ad0e61>, took: 3.131485ms
=> Function complete <b9ba6b86-6dfa-4bf3-89d0-75262bcc06f0>
```

<Info>
  By default, Beam will sync all the files in your working directory to the
  remote container. This allows you to use the files you have locally while
  developing. If you want to prevent some files from getting uploaded, you can
  create a [`.beamignore`](/v2/reference/cli#ignore-local-files).
</Info>

## Passing Function Args

You can also pass arguments to your function just like normal Python functions:

```python app.py theme={null}
from beam import function

@function()
def greet(name: str):
    return f"Hello {name}"

if __name__ == "__main__":
    print(greet.remote("World"))  # "Hello World"
```

## Task Timeouts

You can set timeouts on tasks. Timeouts are set in seconds:

```python theme={null}
from beam import function


# Set a 24 hour timeout
@function(timeout=86400)
def long_timeout():
    return {"message": "hello world"}


# Disable timeouts completely
@function(timeout=-1)
def no_timeout():
    return {"message": "hello world"}
```
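
Because timeouts are given in seconds, it can be easier to derive them from a readable duration. This is a small sketch using the standard library; `timeout_seconds` and `NO_TIMEOUT` are hypothetical names, not part of the Beam SDK.

```python
from datetime import timedelta


def timeout_seconds(**duration):
    """Convert a duration such as hours=24 or minutes=30 to whole seconds."""
    return int(timedelta(**duration).total_seconds())


# The value the example above uses to disable timeouts.
NO_TIMEOUT = -1

print(timeout_seconds(hours=24))    # 86400, the 24 hour timeout used above
print(timeout_seconds(minutes=30))  # 1800
```

With this helper, `timeout=timeout_seconds(hours=24)` is equivalent to the `timeout=86400` shown above.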

## Running Tasks in the Background

By default, remote functions will stop when you close your local Python process or exit your shell.

You can override this behavior and keep the function running in the background by setting `headless=True` in
your function decorator.

```python theme={null}
import time
from beam import function


# Run the function in the background
@function(headless=True)
def handler():
    for i in range(100):
        print(i)
        time.sleep(1)

    return {"message": "This is running in the background"}


if __name__ == "__main__":
    handler.remote()
```


# Scheduled Jobs
Source: https://docs.beam.cloud/v2/function/scheduled-job

How to run workloads on a schedule.

## Run Scheduled Jobs

Use the `@schedule` decorator to define a scheduled job.

```python theme={null}
from beam import schedule


@schedule(when="@weekly", name="weekly-task")
def task():
    print("Hi, from your weekly scheduled task!")
```

To schedule it, run `beam deploy`:

```sh theme={null}
beam deploy app.py:task
```

You'll see the upcoming jobs listed in the console. The sample output below was captured from a job scheduled with `@hourly`, so it lists hourly runs:

```sh theme={null}
=> Deployed
=> Schedule details
Schedule: @hourly
Upcoming:
  1. 2024-08-30 18:00:00 UTC (2024-08-30 14:00:00 EDT)
  2. 2024-08-30 19:00:00 UTC (2024-08-30 15:00:00 EDT)
  3. 2024-08-30 20:00:00 UTC (2024-08-30 16:00:00 EDT)
```

## Scheduling Options

The following predefined schedules can be used in the `when` parameter:

| **Predefined Schedule**    | **Description**                                            | **Cron Expression** |
| -------------------------- | ---------------------------------------------------------- | ------------------- |
| `@yearly` (or `@annually`) | Run once a year at midnight on January 1st                 | `0 0 1 1 *`         |
| `@monthly`                 | Run once a month at midnight on the first day of the month | `0 0 1 * *`         |
| `@weekly`                  | Run once a week at midnight on Sunday                      | `0 0 * * 0`         |
| `@daily` (or `@midnight`)  | Run once a day at midnight                                 | `0 0 * * *`         |
| `@hourly`                  | Run once an hour at the beginning of the hour              | `0 * * * *`         |
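
To make the cron column easier to read, the sketch below copies the table into a dictionary and expands an expression into named fields. It is an illustration only: `to_cron_fields` is a hypothetical helper, and it says nothing about how Beam parses `when` internally.

```python
# Predefined schedules and their cron equivalents, copied from the table above.
PREDEFINED_SCHEDULES = {
    "@yearly": "0 0 1 1 *",
    "@annually": "0 0 1 1 *",
    "@monthly": "0 0 1 * *",
    "@weekly": "0 0 * * 0",
    "@daily": "0 0 * * *",
    "@midnight": "0 0 * * *",
    "@hourly": "0 * * * *",
}

# Standard order of the five cron fields.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def to_cron_fields(when):
    """Expand a predefined schedule (or a 5-field cron string) into named fields."""
    expression = PREDEFINED_SCHEDULES.get(when, when)
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 cron fields, got {len(parts)}: {when!r}")
    return dict(zip(CRON_FIELDS, parts))


print(to_cron_fields("@weekly"))
# {'minute': '0', 'hour': '0', 'day_of_month': '*', 'month': '*', 'day_of_week': '0'}
```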

## Stopping Scheduled Jobs

You can stop a scheduled job from running by using the `beam deployment stop` CLI command.

First, list the upcoming jobs with `beam deployment list`:

```sh theme={null}
  ID                       Name                   Active   Version   Created At      Updated At      Stub Name                 Workspace Name
 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
  10c192b6-6489-42c9-a3…   schedule               Yes            2   9 minutes ago   9 minutes ago   schedule/deployment/ap…   f6fa28
```

Then reference the **Deployment ID** to stop a job:

```sh theme={null}
$ beam deployment stop 10c192b6-6489-42c9-a3

Stopped 10c192b6-6489-42c9-a3bf-75c52ad1816b
```

## Gotchas

<Tip>
  If you deploy a new version of your scheduled job, the previous schedule will be disabled.
</Tip>


# Add to Cursor or Claude
Source: https://docs.beam.cloud/v2/getting-started/add-to-cursor-claude

Connect the Beam docs to your AI tools with MCP

Beam's documentation is available as an [MCP](https://modelcontextprotocol.io) server, so your AI tools can search the docs and answer questions with accurate, up-to-date context while you build.

The server is hosted at:

```text theme={null}
https://docs.beam.cloud/mcp
```

## Add to Cursor

Click the button to install the Beam docs MCP server in Cursor:

[![Add to Cursor](https://cursor.com/deeplink/mcp-install-dark.svg)](cursor://anysphere.cursor-deeplink/mcp/install?name=Beam\&config=eyJ1cmwiOiJodHRwczovL2RvY3MuYmVhbS5jbG91ZC9tY3AifQ==)

Or add it manually. In **Cursor Settings → MCP & Integrations → New MCP Server**, add:

```json theme={null}
{
  "mcpServers": {
    "beam": {
      "url": "https://docs.beam.cloud/mcp"
    }
  }
}
```
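
If you want to confirm that the install button and the manual configuration point at the same server, you can decode the `config` parameter from the deeplink. A short sketch using only the standard library:

```python
import base64
import json

# The `config` query parameter from the "Add to Cursor" link above.
DEEPLINK_CONFIG = "eyJ1cmwiOiJodHRwczovL2RvY3MuYmVhbS5jbG91ZC9tY3AifQ=="

# The parameter is base64-encoded JSON.
decoded = json.loads(base64.b64decode(DEEPLINK_CONFIG))
print(decoded)  # {'url': 'https://docs.beam.cloud/mcp'}

# The manual configuration wraps the same entry under a server name.
manual_config = {"mcpServers": {"beam": decoded}}
print(json.dumps(manual_config, indent=2))
```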

## Add to Claude

<Tabs>
  <Tab title="Claude Code">
    Add the server from your terminal:

    ```bash theme={null}
    claude mcp add --transport http beam https://docs.beam.cloud/mcp
    ```
  </Tab>

  <Tab title="Claude Desktop">
    Go to **Settings → Connectors → Add custom connector**, then enter:

    * **Name:** Beam
    * **URL:** `https://docs.beam.cloud/mcp`

    Restart Claude Desktop and the Beam docs tools will be available.
  </Tab>
</Tabs>

<Note>
  Looking for more ways to use the docs with AI tools, including `llms.txt` and `.md` pages? See [Using Beam Docs with AI Tools](/v2/resources/ai-tools).
</Note>


# Core Concepts
Source: https://docs.beam.cloud/v2/getting-started/core-concepts


## How Beam Works

Beam is a new kind of cloud provider that makes using the cloud feel almost the same as using your local machine. You write plain Python, add a decorator, and run your file. Beam packages your code into a container, launches it in the cloud in under a second, runs it, and scales out automatically when traffic increases.

It's powered by an [open-source container orchestrator](https://github.com/beam-cloud/beta9) that launches containers in less than 1 second.

```mermaid theme={null}
flowchart LR
    localCode["Your Python code + decorator"] --> beam["Beam"]
    beam --> container["Container in the cloud"]
    container --> autoscale["Autoscales to thousands of containers"]
    container --> shutdown["Shuts down when idle"]
```

There's no infrastructure to provision and no YAML to write. You only pay for the compute you use, billed by the millisecond.

<Note>
  The examples below use the [Python SDK](/v2/reference/py-sdk), which provides Beam's decorator programming model. The [TypeScript SDK](/v2/reference/ts-sdk) offers programmatic access for creating sandboxes and calling deployed endpoints.
</Note>

## Functions

You can run functions on the cloud, either once, or on a schedule. [Learn more about Functions](/v2/function/running-functions).

<CardGroup>
  <Card title="Functions" icon="code">
    One-off Python functions, like training runs, scraping, or batch jobs.

    ```python theme={null}
    from beam import function

    @function()
    def handler():
        return {}

    if __name__ == "__main__":
        # Runs locally
        handler.local()
        # Runs on the cloud
        handler.remote()
    ```
  </Card>

  <Card title="Scheduled Jobs" icon="clock">
    Functions that run based on a schedule you specify.

    ```python theme={null}
    from beam import schedule

    @schedule(when="@daily")
    def handler():
        return {}

    if __name__ == "__main__":
        # Runs locally
        handler.local()
        # Runs on the cloud
        handler.remote()
    ```
  </Card>
</CardGroup>

<CardGroup>
  <Card title="Run Your Function" icon="bolt">
    You'll run your functions like a normal Python function: `python app.py`.
    Even though it *feels* like the code is running locally, it's running on a
    container in the cloud.
  </Card>
</CardGroup>

## Endpoints

You can also deploy synchronous and asynchronous web endpoints. Learn more about [Endpoints](/v2/endpoint/overview) and [Task Queues](/v2/task-queue/running-tasks).

<CardGroup>
  <Card title="Endpoints" icon="bolt">
    Synchronous REST API endpoints, for tasks that run in 60s or less.

    ```python theme={null}
    from beam import endpoint

    @endpoint(name="quickstart")
    def handler():
      return {}
    ```
  </Card>

  <Card title="Task Queues" icon="layer-group">
    Asynchronous REST API endpoints, for heavier tasks that take a long time to run.

    ```python theme={null}
    from beam import task_queue

    @task_queue(name="quickstart")
    def handler():
      print(48393 * 39383)
    ```
  </Card>
</CardGroup>

<CardGroup>
  <Card title="Testing Your Code (Optional)" icon="loader">
    Beam provides a temporary cloud environment to test your code.

    <br />

    These environments hot-reload with your code changes. You can test your workflow end-to-end before deploying to production.

    ```bash theme={null}
    beam serve app.py:handler
    ```
  </Card>
</CardGroup>

<CardGroup>
  <Card title="Deploying to Production" icon="check-double">
    When you're ready to deploy a persistent endpoint, you'll use `beam deploy`:

    ```bash theme={null}
    beam deploy app.py:handler
    ```
  </Card>
</CardGroup>

## Web Services

You can also bring your own container and host web services, like Jupyter Notebooks, Node.js apps, and much more. [Learn more about Pods](/v2/pod/web-service).

<Card title="Pods" icon="bolt">
  Run any container behind an SSL-backed REST API.

  ```python theme={null}
  from beam import Pod

  pod = Pod(
    name="my-pod",
    cpu=2,
    memory="1Gi",
    ports=[8000],
    entrypoint=["python", "-m", "http.server", "--bind", "::", "8000"],
  )

  # Run the container as an API
  pod.deploy()
  ```
</Card>


# Installation
Source: https://docs.beam.cloud/v2/getting-started/installation


## Mac and Linux

Install the Beam SDK and CLI. The Python SDK provides the decorator programming model (functions, endpoints, task queues, and sandboxes); the TypeScript SDK provides programmatic access for creating sandboxes and calling deployed endpoints.

<CodeGroup>
  ```bash Python theme={null}
  uv tool install beam-client
  ```

  ```bash TypeScript theme={null}
  npm install @beamcloud/beam-js@rc
  ```
</CodeGroup>

Beam will create a credentials file in `~/.beam/config.ini`. When you run `beam config create`, your API keys will be saved to this file.

## Homebrew

You can install the CLI separately from the SDK using Homebrew:

```bash theme={null}
brew tap beam-cloud/beam

brew install beam
```

## Windows

You can install Beam on Windows using [Windows Subsystem for Linux](https://learn.microsoft.com/en-us/windows/wsl/install) (WSL).

These steps assume you're starting fresh, but note that some systems (e.g. with Docker Desktop) may already have WSL distributions installed.

<Steps>
  <Step title="Install WSL with Ubuntu 22.04">
    Run `wsl --install Ubuntu-22.04`. After installation, you may be prompted
    to set up a new user for the Ubuntu environment.
  </Step>

  <Step title="Set WSL Version to 1 (Optional)">
    Only do this if you explicitly need WSL 1. Most users should stick with WSL
    2: `wsl --set-version Ubuntu-22.04 1`
  </Step>

  <Step title="Launch Ubuntu">
    This ensures you’re using the correct distribution (not docker-desktop or
    others): `wsl -d Ubuntu-22.04`
  </Step>

  <Step title="Install pip">
    `sudo apt update && sudo apt install python3-pip -y`
  </Step>

  <Step title="Install Beam SDK">`uv tool install beam-client`</Step>
</Steps>

## Upgrading

Once installed, you can upgrade the CLI by running:

```bash theme={null}
uv tool upgrade beam-client
```

## Uninstalling

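If you installed the SDK with `uv tool install`, uninstall it the same way:

```bash theme={null}
uv tool uninstall beam-client
```

If you installed it with `pip` instead, run `python3 -m pip uninstall beam-client`.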


# Introduction
Source: https://docs.beam.cloud/v2/getting-started/introduction

The open-source serverless cloud for AI and ML workloads

Beam lets you run functions, REST APIs, task queues, and sandboxes on CPUs and GPUs, end to end. There's no infrastructure to manage and no YAML to write: you define everything in code, and Beam runs it in containers that launch in under a second.

## Get Started

Create a free account on [Beam](https://platform.beam.cloud) to get \$30 in credit, then pick a path to start building. Most people start with the SDK and add the docs MCP if their editor speaks it (Cursor, Claude).

<CardGroup>
  <Card title="Run Your First App" icon="rocket" href="/v2/getting-started/quickstart">
    Install the SDK, run a function, and deploy a web endpoint in minutes.
  </Card>

  <Card title="Add to Cursor or Claude" icon="plug" href="/v2/getting-started/add-to-cursor-claude">
    Connect the Beam docs MCP so your AI editor has live context.
  </Card>
</CardGroup>

## What You Can Build

<CardGroup>
  <Card title="Endpoints and Web APIs" icon="bolt" href="/v2/endpoint/overview">
    Deploy functions as autoscaling REST APIs on CPUs and GPUs.
  </Card>

  <Card title="Sandboxes" icon="box" href="/v2/sandbox/overview">
    Run untrusted or LLM-generated code in secure, isolated environments.
  </Card>

  <Card title="GPU Inference" icon="microchip" href="/v2/environment/gpu">
    Serve ML and LLM inference on on-demand GPUs.
  </Card>

  <Card title="Task Queues" icon="layer-group" href="/v2/task-queue/running-tasks">
    Process heavy or long-running jobs asynchronously.
  </Card>

  <Card title="Functions and Scheduled Jobs" icon="cloud" href="/v2/function/running-functions">
    Run one-off functions, batch jobs, and cron schedules with no timeouts.
  </Card>

  <Card title="Host Any Container" icon="server" href="/v2/pod/web-service">
    Deploy any existing Docker image as a web service.
  </Card>
</CardGroup>

## Keep Exploring

<CardGroup>
  <Card title="Core Concepts" icon="compass" href="/v2/getting-started/core-concepts">
    Understand how functions, endpoints, and sandboxes work.
  </Card>

  <Card title="Examples" icon="grid" href="/v2/examples/overview">
    Browse end-to-end examples for real workloads.
  </Card>
</CardGroup>

<Note>
  The [Python SDK](/v2/reference/py-sdk) provides Beam's decorator programming model (functions, endpoints, task queues, and sandboxes). The [TypeScript SDK](/v2/reference/ts-sdk) provides programmatic access for creating sandboxes and calling deployed endpoints.
</Note>

## Community

Beam is completely open source. Star the repo, ask questions, and share what you build.

<CardGroup>
  <Card title="GitHub" icon="github" href="https://github.com/beam-cloud/beta9">
    Star the repo and contribute.
  </Card>

  <Card title="Slack" icon="slack" href="https://join.slack.com/t/beam-cloud/shared_invite/zt-3enuvj3r7-OeAzVPYvyqQHy9avNrLL0w">
    Join the community and get help.
  </Card>
</CardGroup>

## Enterprise

Running Beam at scale or need self-hosting and dedicated support? [Get in touch](https://calendly.com/elimernit/30min) or reach us at [founders@beam.cloud](mailto:founders@beam.cloud).


# Quickstart
Source: https://docs.beam.cloud/v2/getting-started/quickstart


## Run a Function in the Cloud

The simplest way to run code on Beam is to add the `@function` decorator to any Python function. Save this to `app.py`:

```python app.py theme={null}
from beam import function


@function(cpu=1, memory="1Gi")
def square(x: int):
    return {"result": x**2}


if __name__ == "__main__":
    print(square.remote(x=12))
```

Run it like any other Python file:

```sh theme={null}
python app.py
```

Beam syncs your code, launches a container, runs the function, and streams the result back to your shell:

```
=> Building image
=> Using cached image
=> Syncing files
=> Files synced
=> Running function: <app:square>
{'result': 144}
=> Function complete
```

The container spins up in seconds, runs your code, and shuts itself down. No idle costs, no infrastructure to clean up.

## Deploy a Web Endpoint

To turn your code into a live web API, swap `@function` for `@endpoint`. We'll include `numpy` in the image to show how easily you can add Python packages.

* `Image()` defines your container environment. You can add Python packages, system dependencies, or even custom Dockerfiles.
* `@endpoint` turns your function into a real, live web API that runs in the cloud.

```python app.py theme={null}
from beam import endpoint, Image


@endpoint(
    name="quickstart",
    cpu=1,
    memory="1Gi",
    image=Image().add_python_packages(["numpy"]),
)
def predict(**inputs):
    x = inputs.get("x", 256)
    return {"result": x**2}
```

### Deployment

Deploy the endpoint to the cloud:

```sh theme={null}
beam deploy app.py:predict
```

### Call the API

When the deploy finishes, Beam prints your endpoint URL along with a ready-to-run `curl` command. Replace `[TOKEN]` with your token and use the URL from your deploy output:

<CodeGroup>
  ```sh curl theme={null}
  curl -X POST 'https://app.beam.cloud/endpoint/quickstart' \
    -H 'Authorization: Bearer [TOKEN]' \
    -H 'Content-Type: application/json' \
    -d '{"x": 12}'
  ```

  ```typescript TypeScript theme={null}
  import { beamOpts, Deployments } from "@beamcloud/beam-js";

  beamOpts.token = process.env.BEAM_TOKEN!;
  beamOpts.workspaceId = process.env.BEAM_WORKSPACE_ID!;

  const deployment = await Deployments.get({
    name: "quickstart",
    stubType: "endpoint/deployment",
  });

  const response = await deployment.call({ x: 12 });
  console.log(response);
  ```
</CodeGroup>

Either way, you'll get back:

```json theme={null}
{ "result": 144 }
```

The container spins up in seconds, runs your code, and shuts itself down. No idle costs. No infrastructure to clean up.

## What Next?

Here are some other things you can try:

* [Customize your container image](/v2/environment/custom-images)
* [Add a GPU to your app](/v2/environment/gpu)
* [Run a scheduled job](/v2/function/scheduled-job)
* [Parallelize a function across 10 containers](/v2/scaling/parallelizing-functions)


# Networking
Source: https://docs.beam.cloud/v2/pod/networking


## Exposing Ports

You can expose TCP ports to the outside world by specifying the ports you want to expose in the `ports` parameter.

`ports` accepts a list, so you can expose multiple ports too.

In the example below, we expose two ports:

* `8888` for a Jupyter Notebook server
* `3000` for a separate application or web server

```python theme={null}
from beam import Image, Pod

pod = Pod(
    image=Image(base_image="jupyter/base-notebook:latest"),
    ports=[8888, 3000],
    entrypoint=["start-notebook.py"],
)
```

Once your Pod is running, both ports will be available at a public URL.

## Network Security

### Blocking Outbound Traffic

You can block all outbound network access from your Pod while still allowing inbound connections to exposed ports. This is useful for security-sensitive workloads that shouldn't communicate with external services.

```python theme={null}
from beam import Image, Pod

pod = Pod(
    image=Image(base_image="python:3.11-slim"),
    ports=[8000],
    block_network=True,  # Block all outbound traffic
    entrypoint=["python", "-m", "http.server", "8000"],
)
```

With `block_network=True`, the Pod can receive requests on exposed ports but cannot make outbound connections to external services.

### Allow Lists (CIDR Ranges)

For more fine-grained control, you can specify an allow list of CIDR ranges that your Pod is permitted to connect to. All other outbound traffic will be blocked.

```python theme={null}
from beam import Image, Pod

pod = Pod(
    image=Image(base_image="python:3.11-slim"),
    ports=[8000],
    allow_list=[
        "8.8.8.8/32",      # Allow Google DNS
        "10.0.0.0/8",      # Allow private network range
        "2001:db8::/32",   # Allow IPv6 range
    ],
    entrypoint=["python", "app.py"],
)
```

**Important Notes:**

* Maximum of 10 CIDR entries per Pod
* Supports both IPv4 and IPv6 addresses
* Must use proper CIDR notation (e.g., `"8.8.8.8/32"` for a single IP)
* Cannot use `allow_list` and `block_network` together - they are mutually exclusive
* Invalid CIDR values will trigger an error at creation time
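
These rules can be checked locally before you create a Pod. The sketch below uses Python's standard `ipaddress` module. `check_network_config` is a hypothetical helper, not part of the Beam SDK, and Beam's own validation may differ in detail.

```python
import ipaddress

MAX_CIDR_ENTRIES = 10  # limit stated in the notes above


def check_network_config(allow_list=None, block_network=False):
    """Client-side pre-check mirroring the rules listed above.

    Returns the parsed networks, or raises ValueError on the first violation.
    """
    allow_list = list(allow_list or [])
    if allow_list and block_network:
        raise ValueError("allow_list and block_network are mutually exclusive")
    if len(allow_list) > MAX_CIDR_ENTRIES:
        raise ValueError(f"at most {MAX_CIDR_ENTRIES} CIDR entries are allowed")
    networks = []
    for entry in allow_list:
        if "/" not in entry:
            raise ValueError(f"{entry!r} is missing a prefix length, e.g. /32")
        # strict=False tolerates host bits being set (e.g. "10.0.0.1/8");
        # whether Beam accepts such values is an open assumption.
        networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


for net in check_network_config(["8.8.8.8/32", "10.0.0.0/8", "2001:db8::/32"]):
    print(f"IPv{net.version}: {net}")
```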

## Static IPs

Pods are served in a static IP range, making it possible to whitelist the Beam IP range from the client.

For the static IP range, send us a message in [Slack](https://join.slack.com/t/beam-cloud/shared_invite/zt-3enuvj3r7-OeAzVPYvyqQHy9avNrLL0w).


# Host a Web Service
Source: https://docs.beam.cloud/v2/pod/web-service


[`Pod`](/v2/reference/py-sdk#pod) provides a way to run serverless containers on the cloud. It enables you to quickly launch a container as an HTTPS server that you can access from a web browser.

Pods run in containers that are isolated from your host system, so you can safely run untrusted code in them.

This can be used for a variety of use cases, such as:

* Hosting GUIs, like Jupyter Notebooks, Streamlit or Reflex apps, and ComfyUI
* Testing code in an isolated environment as part of a CI/CD pipeline
* Securely executing code generated by LLMs

...and much more (if you've got a cool use case, [let us know!](https://join.slack.com/t/beam-cloud/shared_invite/zt-3enuvj3r7-OeAzVPYvyqQHy9avNrLL0w))

## Launching Cloud Containers

Containers can be launched programmatically through the Python SDK, or with the Beam CLI.

For example, the following code is used to launch a cloud-hosted Jupyter Notebook:

<CodeGroup>
  ```python Python theme={null}
  from beam import Image, Pod

  notebook = Pod(
      image=Image(base_image="jupyter/base-notebook:latest"),
      ports=[8888],
      cpu=1,
      memory=1024,
      env={
          "NOTEBOOK_ARGS": "--ip='' --NotebookApp.token='' --NotebookApp.notebook_dir=/tmp"
      },
      entrypoint=["start-notebook.py"],
  )

  nb = notebook.create()

  print("Container hosted at:", nb.url)
  ```

  ```shell CLI theme={null}
  beam run --image jupyter/base-notebook:latest --ports 8888 \
    --env NOTEBOOK_ARGS="--ip='' --NotebookApp.token='' --NotebookApp.notebook_dir=/tmp" \
    --entrypoint "start-notebook.py"
  ```
</CodeGroup>

When this code is executed, Beam will launch a container and expose it as a publicly available HTTPS server:

```
$ python app.py

=> Building image
=> Using cached image
=> Syncing files
=> Creating container
=> Container created successfully ===> pod-2929b184-b445-4f23-abc6-7c4b151001da-ec86d9ac

Container hosted at: https://2929b184-b445-4f23-abc6-7c4b151001da-8888.app.beam.cloud
```

### Accessing Containers via HTTP

You can then enter this URL in the browser to interact with your hosted container instance:

<Frame>
  <img />
</Frame>

### Securely Executing Untrusted Code

Beam's containers are launched in environments isolated from your host system, making it safe to execute untrusted or LLM-generated code.

## Parameters

Pods can be heavily customized to fit your needs.

### Using Custom Images

You can customize the container image using the [`Image`](/v2/reference/py-sdk#image) object. This can be customized with Python packages, shell commands, Conda packages, and much more.

<CodeGroup>
  ```python Python theme={null}
  from beam import Image, Pod

  pod = Pod(
      image=Image(base_image="jupyter/base-notebook:latest"),
      entrypoint=["start-notebook.py"],
  )
  ```

  ```shell CLI theme={null}
  beam run --image jupyter/base-notebook:latest --entrypoint "start-notebook.py"
  ```
</CodeGroup>

### Specifying Entry Points

An *entry point* is the command or script that will run when the container starts. You can interact with Pods using the CLI or the Python SDK.

<CodeGroup>
  ```python Python theme={null}
  from beam import Image, Pod

  pod = Pod(
      image=Image(base_image="jupyter/base-notebook:latest"),
      entrypoint=["start-notebook.py"],
  )

  pod.create()
  ```

  ```shell CLI theme={null}
  beam run \
    --image jupyter/base-notebook:latest \
    --entrypoint "start-notebook.py"
  ```
</CodeGroup>

### Passing Environment Variables

You can pass environment variables into your container for credentials or other parameters. Like entry points, environment variables can be defined in either the CLI or the Python SDK:

<CodeGroup>
  ```python Python theme={null}
  from beam import Image, Pod

  Pod(
      image=Image(base_image="jupyter/base-notebook:latest"),
      env={"NOTEBOOK_ARGS": "--ip='' --NotebookApp.token='' --NotebookApp.notebook_dir=/tmp"},
      entrypoint=["start-notebook.py"],
  )
  ```

  ```shell CLI theme={null}
  beam run \
    --image jupyter/base-notebook:latest \
    --env NOTEBOOK_ARGS="--ip='' --NotebookApp.token='' --NotebookApp.notebook_dir=/tmp" \
    --entrypoint "start-notebook.py"
  ```
</CodeGroup>

## Deploying a Pod

Pods can be deployed as persistent endpoints using the `beam deploy` command.

<Warning>
  When deploying a Pod, don't forget to include the `name` field.
</Warning>

```python app.py theme={null}
from beam import Pod

pod = Pod(
    name="my-deployed-pod",
    cpu=2,
    memory="1Gi",
    ports=[8000],
    entrypoint=["python", "-m", "http.server", "8000"],
)
```

```sh theme={null}
beam deploy app.py:pod
```

## Terminating a Pod

Pod instances can be terminated directly using the `terminate()` method.

Alternatively, you can terminate the container the Pod is running on by using the `beam container stop <container-id>` command.

```python theme={null}
from beam import Pod

# Initialize a pod
notebook = Pod()

# Launch the pod
notebook.create()

# Terminate the pod
notebook.terminate()
```

## Lifecycle

### Timeouts

Pods are serverless and automatically scale-to-zero.

By default, a Pod is terminated after 10 minutes without any active connections to the hosted URL, or earlier if its process exits on its own. Making a connection request (e.g. opening the URL in your browser) keeps the container alive until the idle timeout elapses.

You can set a custom timeout by passing the `keep-warm-seconds` parameter when creating a pod. If you specify `-1`, the pod will not spin down due to inactivity, and will remain up until either the entrypoint process exits or you explicitly stop the container.

**Keep Alive for 5 minutes**

```sh theme={null}
beam run --image jupyter/base-notebook:latest --keep-warm-seconds 300
```

**Keep Alive Indefinitely**

<Tip>*There is no upper limit on the duration of a session*.</Tip>

```sh theme={null}
beam run --image jupyter/base-notebook:latest --keep-warm-seconds -1
```

### List Running Pods

You can list all running Pods using the `beam container list` command.

```bash theme={null}
$ beam container list

  ID                                                  Status    Stub ID                                Deployment ID   Scheduled At    Uptime
 ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
  pod-58a38dba-8b7b-4db3-b002-436c6d9b4858-a613eecd   RUNNING   58a38dba-8b7b-4db3-b002-436c6d9b4858                   5 seconds ago   4 seconds

  1 items
```

You can kill any container by running `beam container stop <container-id>`.


# Reference
Source: https://docs.beam.cloud/v2/reference/api


### Authentication

All APIs are authenticated through [Bearer Authentication](https://swagger.io/docs/specification/authentication/bearer-authentication/).

<Note>
  All API responses follow standard HTTP semantics. In case of an error, you
  will receive a non-2xx response with a JSON body describing the error.
</Note>

## Tasks

The Tasks API lets you interact with tasks. See the left navigation for detailed parameters and responses for each endpoint.

## Pods - Beta

<Note>
  The Pods API is currently in beta. It's an experimental API that allows you to create and interact with runtime containers and is subject to change. Please contact us if you have any feedback or questions.
</Note>

The Pods API lets you create and interact with runtime containers. Interactive reference pages, generated from our OpenAPI definition, are listed under Pods in the left navigation; each one documents the endpoint's parameters and responses and includes a playground.


# List Containers/Pods/Sandboxes
Source: https://docs.beam.cloud/v2/reference/api-docs/gatewayservice/get-containers

v2/reference/api-docs/gateway.swagger.json GET /containers
List containers/pods/sandboxes associated with your workspace.


# Stop Container/Pod/Sandbox
Source: https://docs.beam.cloud/v2/reference/api-docs/gatewayservice/post-containers-stop

v2/reference/api-docs/gateway.swagger.json POST /containers/{containerId}/stop
Stop a running container/pod/sandbox by its ID.


# Create a Pod
Source: https://docs.beam.cloud/v2/reference/api-docs/podservice/post-pods

v2/reference/api-docs/pods.swagger.json POST /pods
Create a new pod and return its identifiers and initial state. Provide exactly one of `stubId` or `checkpointId`.


# Cancel Task
Source: https://docs.beam.cloud/v2/reference/api-docs/tasks/tasks-cancel


### Cancelling Tasks

Tasks can be cancelled through the `api.beam.cloud/v2/task/cancel/` endpoint.

#### Request

```bash theme={null}
curl -X DELETE --compressed 'https://api.beam.cloud/v2/task/cancel/' \
  -H 'Authorization: Bearer [YOUR_TOKEN]' \
  -H 'Content-Type: application/json' \
  -d '{"task_ids": ["TASK_ID"]}'
```

This API accepts a list of tasks, which can be passed in like this:

```json theme={null}
{
  "task_ids": [
    "70101e46-269c-496b-bc8b-1f7ceeee2cce",
    "81bdd7a3-3622-4ee0-8024-733227d511cd",
    "7679fb12-94bb-4619-9bc5-3bd9c4811dca"
  ]
}
```

#### Response

`200`

```json theme={null}
{}
```
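
For reference, here is a sketch of the same cancel request built in Python with the standard library. `build_cancel_request` is an illustrative helper name, not part of the Beam SDK; the URL, method, and payload shape are taken from the curl example above.

```python
import json
import urllib.request

CANCEL_URL = "https://api.beam.cloud/v2/task/cancel/"


def build_cancel_request(task_ids, token):
    """Build (but do not send) a DELETE request that cancels the given tasks."""
    body = json.dumps({"task_ids": list(task_ids)}).encode("utf-8")
    return urllib.request.Request(
        CANCEL_URL,
        data=body,
        method="DELETE",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen(...)` sends the request.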


# Get Task Status
Source: https://docs.beam.cloud/v2/reference/api-docs/tasks/tasks-status


### Query Task Status

You can check the status of any task by querying the `task` API:

```sh theme={null}
https://api.beam.cloud/v2/task/{TASK_ID}/
```

### Task Statuses

The response payload includes the status of the task. These are the possible statuses:

| Status      | Description                                                                                                                                                                               |
| ----------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `PENDING`   | The task is enqueued and has not started yet.                                                                                                                                             |
| `RUNNING`   | The task is running.                                                                                                                                                                      |
| `COMPLETE`  | The task completed without any errors.                                                                                                                                                    |
| `RETRY`     | The task is being retried. The number of retries defaults to 3, unless `max_retries` is provided in the function decorator.                                                              |
| `CANCELLED` | The task was cancelled by the client.                                                                                                                                                     |
| `TIMEOUT`   | The task timed out, based on the `timeout` provided in the function decorator.                                                                                                            |
| `EXPIRED`   | The task remained in the queue and was never picked up by a worker. **For endpoints, this usually occurs when the task does not start running before the request timeout (180 seconds).** |
| `FAILED`    | The task did not complete successfully.                                                                                                                                                   |
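
When polling a task, it helps to know which statuses mean the task is finished. The sketch below groups the statuses from the table. Treating `COMPLETE`, `CANCELLED`, `TIMEOUT`, `EXPIRED`, and `FAILED` as terminal is an inference from the descriptions above, not something the API states explicitly. `wait_for_task` and its `fetch_status` callback are hypothetical names; in practice the callback would query the task status endpoint.

```python
import time

# Assumption: these statuses are final, based on the descriptions in the table.
TERMINAL_STATUSES = {"COMPLETE", "CANCELLED", "TIMEOUT", "EXPIRED", "FAILED"}


def wait_for_task(fetch_status, interval=1.0, max_polls=60):
    """Poll `fetch_status()` until it returns a terminal status.

    `fetch_status` is any zero-argument callable returning a status string.
    Raises TimeoutError if no terminal status is seen within `max_polls` polls.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(interval)
    raise TimeoutError("task did not reach a terminal status")


# Usage with a stand-in for the real API call:
fake_statuses = iter(["PENDING", "RUNNING", "COMPLETE"])
final_status = wait_for_task(lambda: next(fake_statuses), interval=0)
```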

#### Request

```sh theme={null}
curl -X GET \
  'https://api.beam.cloud/v2/task/{TASK_ID}/' \
  -H 'Authorization: Bearer [YOUR_AUTH_TOKEN]' \
  -H 'Content-Type: application/json'
```

#### Response

The response to `/task` returns the following data:

| Field                     | Type    | Description                                                                                                 |
| ------------------------- | ------- | ----------------------------------------------------------------------------------------------------------- |
| `id`                      | string  | The unique identifier of the task.                                                                          |
| `started_at`              | string  | The timestamp when the task started, in ISO 8601 format. Null if the task hasn't started yet.               |
| `ended_at`                | string  | The timestamp when the task ended, in ISO 8601 format. Null if the task is still running or hasn't started. |
| `status`                  | string  | The current status of the task (e.g., COMPLETE, RUNNING, etc.).                                             |
| `container_id`            | string  | The identifier of the container running the task.                                                           |
| `updated_at`              | string  | The timestamp when the task was last updated, in ISO 8601 format.                                           |
| `created_at`              | string  | The timestamp when the task was created, in ISO 8601 format.                                                |
| `outputs`                 | array   | An array containing the outputs of the task.                                                                |
| `stats`                   | object  | An object containing statistics about the task's execution environment.                                     |
| `stats.active_containers` | integer | The number of active containers for the task.                                                               |
| `stats.queue_depth`       | integer | The depth of the queue for the deployment.                                                                  |
| `stub`                    | object  | An object containing detailed information about the task's configuration and deployment.                    |
| `stub.id`                 | string  | The identifier of the deployment stub.                                                                      |
| `stub.name`               | string  | The name of the deployment stub.                                                                            |
| `stub.type`               | string  | The type of the deployment stub.                                                                            |
| `stub.config`             | string  | The full runtime configuration for the deployment, returned as a JSON string.                               |
| `stub.config_version`     | integer | The version number of the deployment stub configuration.                                                    |
| `stub.object_id`          | integer | The object identifier associated with the deployment stub.                                                  |
| `stub.created_at`         | string  | The timestamp when the deployment stub was created, in ISO 8601 format.                                     |
| `stub.updated_at`         | string  | The timestamp when the deployment stub was last updated, in ISO 8601 format.                                |

<Tip>
  To parse `stub.config`, use `json.loads()` or an equivalent JSON decoder in
  your language.
</Tip>

Here's what the response payload looks like as JSON:

```json theme={null}
{
  "id": "07ce4078-bccc-4a42-b530-5f2653484a6a",
  "started_at": "2024-07-22T14:02:28.466278Z",
  "ended_at": "2024-07-22T14:02:28.475954Z",
  "status": "COMPLETE",
  "container_id": "endpoint-d327e987-759d-493e-b3e4-005774bcf998-8b747792",
  "updated_at": "2024-07-22T14:02:28.477026Z",
  "created_at": "2024-07-22T14:02:28.413232Z",
  "outputs": [],
  "stats": {
    "active_containers": 0,
    "queue_depth": 0
  },
  "stub": {
    "id": "d327e987-759d-493e-b3e4-005774bcf998",
    "name": "endpoint/deployment/app:squared",
    "type": "",
    "config": "{\"runtime\":{\"cpu\":1000,\"gpu\":\"\",\"memory\":128,\"image_id\":\"4724a2a2dfb601d8\"},\"handler\":\"app:squared\",\"on_start\":\"\",\"python_version\":\"python3.10\",\"keep_warm_seconds\":180,\"max_pending_tasks\":100,\"callback_url\":\"\",\"task_policy\":{\"max_retries\":0,\"timeout\":180,\"expires\":\"0001-01-01T00:00:00Z\"},\"workers\":1,\"authorized\":false,\"volumes\":null,\"autoscaler\":{\"type\":\"queue_depth\",\"max_containers\":1,\"tasks_per_container\":1}}",
    "config_version": 0,
    "object_id": 0,
    "created_at": "0001-01-01T00:00:00Z",
    "updated_at": "0001-01-01T00:00:00Z"
  }
}
```
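
Because `stub.config` arrives as a JSON string nested inside the JSON response, it has to be decoded a second time. A minimal sketch, using a shortened copy of the payload above (the variable names are illustrative):

```python
import json

# Shortened version of the response payload shown above.
response_text = json.dumps({
    "id": "07ce4078-bccc-4a42-b530-5f2653484a6a",
    "status": "COMPLETE",
    "stub": {
        "name": "endpoint/deployment/app:squared",
        "config": '{"runtime":{"cpu":1000,"gpu":"","memory":128},"handler":"app:squared","keep_warm_seconds":180}',
    },
})

task = json.loads(response_text)             # first decode: the response body
config = json.loads(task["stub"]["config"])  # second decode: the nested string

cpu = config["runtime"]["cpu"]
handler = config["handler"]
```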


# CLI Reference
Source: https://docs.beam.cloud/v2/reference/cli


The `beam` CLI is a command-line utility that lets you work with Beam using `beam` commands, from uploading files to volumes to deploying your applications.

You can use the `--help` flag to get information about any command.

### Most Common Commands

You'll be using these a lot!

* `beam deploy` – deploy an app to the cloud
* `beam shell` – SSH into a container to debug it interactively
* `beam serve` – create a temporary live preview of your app
* `beam logs` – stream logs from a task, container, or deployment

## Installation

This installs the Beam SDK and CLI in your Python environment.

```bash theme={null}
uv tool install beam-client
```

You can find instructions for installing the CLI on Windows [here](/v2/getting-started/installation#windows).

## Setup Credentials

Beam will create a credentials file in `~/.beam/config.ini`. When you run `beam config create`, your API keys will be saved to this file.

## Config

Configures your Beam [API keys](/v2/reference/cli#setup-credentials) and saves a profile to `~/.beam/config.ini`.

```bash theme={null}
beam config
```

```bash theme={null}
$ beam config create prod

Context Name [prod]:
Gateway Host [gateway.beam.cloud]:
Gateway Port [443]:
Token:
```

* `Context Name` (required) -- the name of the profile, e.g. `prod` or `staging`.
* `Gateway Host` (optional) -- used only for self-hosting. If you are using beam.cloud, you can leave this blank.
* `Gateway Port` (optional) -- used only for self-hosting. If you are using beam.cloud, you can leave this blank.
* `Token` (required) -- your API token, found on [this page of the dashboard](https://platform.beam.cloud/settings/api-keys).

### Create

Create a new context.

```bash theme={null}
beam config create [NAME]
```

```bash theme={null}
$ beam config create prod-env

Context Name [prod-env]:
Gateway Host [gateway.beam.cloud]:
Gateway Port [443]:
Token: [YOUR-TOKEN]
Added new context!
```

<Info>
  If you are prompted to enter a value for `Gateway Host` or `Gateway Port`, you
  can leave both fields blank.
</Info>

### Delete

Delete a saved context.

```bash theme={null}
beam config delete [NAME]
```

```bash theme={null}
$ beam config delete prod-env

Do you want to continue? [y/N]: y
Deleted context prod-env.
```

### List

Lists saved contexts.

```bash theme={null}
beam config list
```

```bash theme={null}
$ beam config list

  Name      Host                       Port   Token
 ───────────────────────────────────────────────────────
  default   gateway.beam.cloud   443    qONcMO...
  staging   gateway.beam.cloud   443    qONcMO...
```

### Select

Set the default context.

```bash theme={null}
beam config select [NAME]
```

```bash theme={null}
$ beam config select staging-env

Default context updated with 'staging-env'.
```

### Specifying Context

Most commands support the `--context` (or `-c`) flag, which allows you to specify which config profile to use for that command. This is useful when working with multiple environments (e.g., development, staging, production).

```bash theme={null}
beam deploy app.py:handler --context staging
beam task list -c production
beam volume list --context dev
```

If no context is specified, the default context (set via `beam config select`) will be used.

## Deployment

### Create

Deploys your app and creates a persistent web endpoint to access it.

<Note>
  You can run this command with `beam deploy [...]` or `beam deploy create [...]`.
</Note>

```bash theme={null}
beam deploy [FILE:FUNCTION] --name [APP-NAME]
```

```bash theme={null}
$ beam deploy create app.py:handler --name inference-app

=> Building image
=> Using cached image
=> Syncing files
=> Files synced
=> Deploying endpoint
=> Deployed
```

### List

Lists all deployments.

```bash theme={null}
beam deployment list
```

```bash theme={null}
$ beam deployment list

  ID                 Name              Active   Version   Created At    Updated At    Stub Name         Workspace Name
 ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
  9c9e99aa-a64e-4…   whisper-stt       Yes            1   1 day ago     1 day ago     endpoint/deplo…   cf2db0
  e1831baa-e4a9-4…   inference-app     Yes            2   5 days ago    5 days ago    endpoint/deplo…   cf2db0
  6983cfe2-abf7-4…   vllm-app          Yes            2   7 days ago    7 days ago    endpoint/deplo…   cf2db0

  3 items
```

### Stop

Stops a deployment.

```bash theme={null}
beam deployment stop [DEPLOYMENT-ID]
```

```bash theme={null}
$ beam deployment stop c7b9fdaa-a25a-4db0-a825-c31f94c91c3f

Stopped c7b9fdaa-a25a-4db0-a825-c31f94c91c3f
```

### Start

Starts an inactive deployment.

```bash theme={null}
beam deployment start [DEPLOYMENT-ID]
```

```bash theme={null}
$ beam deployment start c555edd8-3f10-4b54-ac1c-4e1e5e10eabd

Starting c555edd8-3f10-4b54-ac1c-4e1e5e10eabd
```

### Delete

Deletes a deployment.

```bash theme={null}
beam deployment delete [DEPLOYMENT-ID]
```

```bash theme={null}
$ beam deployment delete c7b9fdaa-a25a-4db0-a825-c31f94c91c3f

Deleted deployment c7b9fdaa-a25a-4db0-a825-c31f94c91c3f
```

## Shell

### SSH Into Containers

Allows you to interactively access a container on Beam.

```bash theme={null}
$ beam shell app.py:handler

=> Building image
=> Using cached image
=> Syncing files
Reading .beamignore file
=> Files synced

Welcome to Ubuntu 22.04.5 LTS (GNU/Linux 6.8.0-51-generic x86_64)

root@runc:/mnt/code#
```

You can also shell into a running container by passing its container ID with `--container-id`:

```bash theme={null}
beam shell --container-id <CONTAINER_ID>
```

## Serve

### Create a Preview Environment

Creates a temporary deployment preview.

```bash theme={null}
beam serve [FILE:FUNCTION]
```

```bash theme={null}
$ beam serve app.py:predict

=> Building image
=> Using cached image
=> Syncing files
Reading .beamignore file
=> Files synced

=> Invocation details

curl -X POST 'https://app.beam.cloud/endpoint/id/55108039-e3bf-409b-bad5-f4982b2f1c02' \
-H 'Accept: */*' \
-H 'Accept-Encoding: gzip, deflate' \
-H 'Connection: keep-alive' \
-H 'Authorization: Bearer [YOUR_AUTH_TOKEN]' \
-H 'Content-Type: application/json' \
-d '{}'

=> Watching /Users/beam for changes...
⠇ Serving endpoint...
```

## Container

Manage the containers running in your account.

### List

Lists all containers.

```bash theme={null}
beam container list
```

```bash theme={null}
$ beam container list

  ID                                                       Status    Stub Id                                Scheduled At
 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
  function-ee5c046f-985f-40b7-a0fa-477794e0a052-6c0d5340   RUNNING   27d567fe-8bd3-41b4-bd5b-2e6ce1afb454   3 seconds ago

  1 item(s)
```

### Stop

Terminate a running container.

```bash theme={null}
beam container stop [CONTAINER-ID]
```

```bash theme={null}
$ beam container stop function-ee5c046f-985f-40b7-a0fa-477794e0a052-6c0d5340

Stopped container.
```

## Task

Any code you run on Beam creates a task: each time you run a function or invoke an API, a new task is created.

### List Tasks

Lists all tasks.

```bash theme={null}
beam task list
```

```bash theme={null}
$ beam task list

  Task ID             Status     Started At       Ended At         Container ID        Stub Name           Workspace Name
 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
  29d739d5-624f-41…   COMPLETE   29 minutes ago   29 minutes ago   endpoint-b4e9a64…   endpoint/deploym…   cf2db0
  f45fe342-0bff-4e…   COMPLETE   35 minutes ago   35 minutes ago   endpoint-c1dd0b6…   endpoint/deploym…   cf2db0
  4b9a1b6f-e34d-4d…   COMPLETE   1 day ago        1 day ago        endpoint-05fb7c9…   endpoint/deploym…   cf2db0
  38910b63-8c8a-42…   COMPLETE   1 day ago        1 day ago        endpoint-05fb7c9…   endpoint/deploym…   cf2db0
  cf051d10-fa28-42…   COMPLETE   1 day ago        1 day ago        endpoint-05fb7c9…   endpoint/deploym…   cf2db0
```

### Stop a Task

Stops a task.

```bash theme={null}
beam task stop [TASK-ID]
```

```bash theme={null}
$ beam task stop c6d9e4a3-9262-485a-a7bb-a72980008c02

Stopped task c6d9e4a3-9262-485a-a7bb-a72980008c02.
```

## Volume

Beam allows you to create highly-available storage volumes that can be used across tasks. You might use volumes for things like storing model weights or large datasets.

### Create a Volume

Creates a volume.

```bash theme={null}
beam volume create [VOLUME-NAME]
```

```bash theme={null}
$ beam volume create weights

  Name       Created At    Updated At    Workspace Name
 ───────────────────────────────────────────────────────
  weights   May 07 2024   May 07 2024   cf2db0
```

### Delete a Volume

```bash theme={null}
beam volume delete [VOLUME-NAME]
```

```bash theme={null}
$ beam volume delete model-weights

Any apps (functions, endpoints, task queue, etc) that
refer to this volume should be updated before it is deleted.
Are you sure? (y/n) [n]: y

Deleted volume model-weights
```

### List Volumes

List all volumes mounted to your apps.

```bash theme={null}
beam volume list
```

```bash theme={null}
$ beam volume list

  Name                                Size   Created At   Updated At   Workspace Name
 ─────────────────────────────────────────────────────────────────────────────────────
  weights                       240.23 MiB   2 days ago   2 days ago   cf2db0

  1 volumes | 240.23 MiB used
```

### List Volume Contents

List all contents of a volume.

```bash theme={null}
beam ls [VOLUME-NAME]
```

```bash theme={null}
$ beam ls weights

  Name                               Size   Modified Time    IsDir
 ──────────────────────────────────────────────────────────────────
  .locks                           0.00 B   29 minutes ago   Yes
  models--facebook--opt-125m   240.23 MiB   28 minutes ago   Yes

  2 items | 240.23 MiB used
```

### Copy Files to Volumes

Copies a file to a volume.

```bash theme={null}
beam cp [LOCAL-PATH] beam://[VOLUME-NAME]
```

```bash theme={null}
=> weights (copying 1 object)
[LennonBeatlemania.pth] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 kB 0:00:00
```

### Move Files in Volumes

Move files around a volume.

```bash theme={null}
beam mv [SOURCE] [DEST]
```

```bash theme={null}
$ beam mv file.txt files/text-files

Moved file.txt to files/text-files/file.txt
```

### Remove Files from Volumes

Remove a file from a volume.

```bash theme={null}
beam rm [FILE]
```

```bash theme={null}
=> weights/app.py (1 object deleted)
app.py
```

### Downloading Data

You can download directories and individual files.

```bash theme={null}
beam cp beam://myvol/file.txt .             # beam://myvol/file.txt => ./file.txt
beam cp beam://myvol/file.txt file.new      # beam://myvol/file.txt => ./file.new
beam cp beam://myvol/mydir .                # beam://myvol/mydir/file.txt => ./file.txt
```

## Secret

Secrets and environment variables can be injected into the containers that run your apps. After a secret is saved, it can be used in your application code like this:

```python theme={null}
from beam import function


@function(secrets=["AWS_ACCESS_KEY"])
def handler():
    import os

    my_secret = os.environ["AWS_ACCESS_KEY"]
    print(f"Secret: {my_secret}")
```

### List Secrets

Lists all secrets that exist.

```bash theme={null}
beam secret list
```

```bash theme={null}
$ beam secret list

  Name             Last Updated     Created
 ──────────────────────────────────────────────────
  AWS_KEY          19 hours ago     19 hours ago
  AWS_ACCESS_KEY   20 seconds ago   20 seconds ago
  AWS_REGION       7 seconds ago    7 seconds ago

  3 items
```

### Create a Secret

Creates a secret.

```bash theme={null}
beam secret create [KEY] [VALUE]
```

```bash theme={null}
$ beam secret create AWS_ACCESS_KEY ASIAY34FZKBOKMUTVV7A

=> Created secret with name: 'AWS_ACCESS_KEY'
```

### Show a Secret

Shows the value of a secret.

```bash theme={null}
beam secret show [KEY]
```

```bash theme={null}
$ beam secret show AWS_ACCESS_KEY

=> Secret 'AWS_ACCESS_KEY': ASIAY34FZKBOKMUTVV7A
```

### Modify a Secret

Modifies the value of a secret.

```bash theme={null}
beam secret modify [KEY] [VALUE]
```

```bash theme={null}
$ beam secret modify AWS_ACCESS_KEY ASIAY34FZKBOKMUTVV7A

=> Modified secret 'AWS_ACCESS_KEY'
```

### Delete a Secret

Permanently deletes a secret.

```bash theme={null}
beam secret delete [KEY]
```

```bash theme={null}
$ beam secret delete AWS_ACCESS_KEY

=> Deleted secret 'AWS_ACCESS_KEY'
```

## Logs

You can stream logs from a task, deployment, or a container to your shell.

### Deployment

Streams logs for a deployment.

```bash theme={null}
beam logs --deployment-id [DEPLOYMENT-ID]
```

You can find the deployment ID by running `beam deployment list`.

```bash theme={null}
$ beam logs --deployment-id 2089b5b5-9b2a-450e-9a56-a50d4f0a8d4c

Starting task worker[0]
Worker[0] ready
Running task <25156946-2c3a-4900-8a2c-568f162493a7>
Task completed <25156946-2c3a-4900-8a2c-568f162493a7>, took 3.4570693s
Spinning down taskqueue
```

### Task

Streams logs for a task.

```bash theme={null}
beam logs --task-id [TASK-ID]
```

You can find the task ID by running `beam task list`.

### Container

Streams logs for a container.

```bash theme={null}
beam logs --container-id [CONTAINER-ID]
```

You can find the container ID by running `beam container list`.

### Stub ID

Streams logs for a stub ID.

```bash theme={null}
beam logs --stub-id [STUB-ID]
```

## Machine

Manage the machines available on Beam.

### List

List the available GPUs at any given moment.

```bash theme={null}
$ beam machine list

  GPU Type   Available
 ──────────────────────
  A10G          Yes
  H100          Yes
```

## Helpers and utils

### Ignore Local Files

You can create a `.beamignore` file in your project's root directory to tell Beam which local files and directories to ignore when syncing to Beam.

This follows the conventions of [`.gitignore`](https://git-scm.com/docs/gitignore).

**Ignoring Files**

```
do_not_sync.txt
.DS_Store
```

**Ignoring Folders**

```
images/*
node_modules/*
```
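
To preview which files a set of patterns would skip, here is a simplified sketch built on Python's `fnmatch`. It only approximates `.gitignore` matching (no negation with `!`, no `**`, no anchoring rules), and `is_ignored` is a hypothetical helper, not part of the Beam CLI.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def is_ignored(path, patterns):
    """Return True if `path` matches any pattern (simplified gitignore rules)."""
    posix_path = PurePosixPath(path)
    for pattern in patterns:
        if "/" in pattern:
            # Patterns containing a slash are matched against the whole path.
            if fnmatch(str(posix_path), pattern):
                return True
        elif any(fnmatch(part, pattern) for part in posix_path.parts):
            # Patterns without a slash match a file or folder name at any depth.
            return True
    return False


patterns = ["do_not_sync.txt", ".DS_Store", "images/*", "node_modules/*"]
```

With these patterns, `is_ignored("images/cat.png", patterns)` returns `True` and `is_ignored("app.py", patterns)` returns `False`.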


# Python SDK Reference
Source: https://docs.beam.cloud/v2/reference/py-sdk


Beam's Python SDK is the heart of the Beam platform. Unlike traditional cloud providers, Beam apps are defined entirely in code — no YAML, no config files. All infrastructure and runtime configuration is expressed in Python.

This reference outlines every available decorator, object, and configuration option in the SDK. For a quickstart or high-level overview, check out the [Getting Started guide](/v2/getting-started/quickstart).

# Environment

### `Image`

Defines a custom container image that your code will run in.

An Image object encapsulates the configuration of a custom container image that will be used as the runtime environment for executing tasks.

```python theme={null}
from beam import endpoint, Image


image = (
    Image(
        base_image="docker.io/nvidia/cuda:12.3.1-runtime-ubuntu20.04",
        python_version="python3.9",
    )
    .add_commands(["apt-get update -y", "apt-get install ffmpeg -y"])
    .add_python_packages(["transformers", "torch"])
    .build_with_gpu(gpu="A10G")
)


@endpoint(image=image)
def handler():
    return {}
```

<ParamField type="string">
  The Python version to be used in the image. Defaults to Python 3.8.
</ParamField>

<ParamField type="list">
  A list of Python packages to install in the container image. Alternatively, a
  string containing a path to a requirements.txt can be provided. Default is \[].
</ParamField>

<ParamField type="list">
  A list of shell commands to run when building your container image. These
  commands can be used for setting up the environment, installing dependencies,
  etc. Default is \[].
</ParamField>

<ParamField type="string">
  A custom base image to replace the default ubuntu20.04 image used in your container. This can be a public or private image from Docker Hub, Amazon ECR, Google Cloud Artifact Registry, or
  NVIDIA GPU Cloud Registry. The formats for these registries are respectively `docker.io/my-org/my-image:0.1.0`,
  `111111111111.dkr.ecr.us-east-1.amazonaws.com/my-image:latest`,
  `us-east4-docker.pkg.dev/my-project/my-repo/my-image:0.1.0`, and `nvcr.io/my-org/my-repo:0.1.0`. Default is None.
</ParamField>

<ParamField type="dict">
  A key/value pair or key list of environment variables that contain credentials to
  a private registry. When provided as a dict, you must supply the correct keys and values.
  When provided as a list, the keys are used to look up the environment variable value
  for you. Default is None.

  #### List of Base Image Creds

  ```python theme={null}
  image = Image(
      base_image="111111111111.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
      base_image_creds=[
          "AWS_ACCESS_KEY_ID",
          "AWS_SECRET_ACCESS_KEY",
          "AWS_SESSION_TOKEN",
          "AWS_REGION",
      ],
  )
  ```

  #### Dict of Base Image Creds

  ```python theme={null}
  image = Image(
      base_image="111111111111.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
      base_image_creds={
          "AWS_ACCESS_KEY_ID": "xxxx",
          "AWS_SECRET_ACCESS_KEY": "xxxx",
          "AWS_REGION": "xxxx",
      },
  )
  ```
</ParamField>

<ParamField type="dict">
  Adds environment variables to an image. These will be available when building the image
  and when the container is running. This can be a string, a list of strings, or a
  dictionary of strings. The string must be in the format of `KEY=VALUE`. If a list of
  strings is provided, each element should be in the same format. Default is None.
</ParamField>

<ParamField type="string">
  Builds the image on a GPU.
</ParamField>

### `Image.from_registry()`

Create an Image from a remote container registry.

<ParamField type="string">
  The full URI of the registry image.
</ParamField>

<ParamField type="list">
  Credentials for private registries. Either a dict of key to value, or a list
  of env var keys to read at build time.
</ParamField>

```python theme={null}
from beam import Image, endpoint

image = Image.from_registry("docker.io/library/python:3.11-slim")

@endpoint(image=image)
def handler():
    return {}
```

### `Image.from_id()`

Create an image from a filesystem snapshot.

<ParamField type="string">
  Snapshot to use as the base.
</ParamField>

```python theme={null}
Image.from_id("snapshot-123")
```

### `Image.from_dockerfile()`

Build the base image using a local Dockerfile.

<ParamField type="string">
  Path to Dockerfile.
</ParamField>

<ParamField type="list">
  Directory to sync as build context. Defaults to the Dockerfile directory.
</ParamField>

```python theme={null}
image = Image.from_dockerfile("./Dockerfile", context_dir="./app").add_python_packages(["uvicorn"])
```

### `Image.add_python_packages()`

Queue pip packages to install during the build. Accepts a list or a path to requirements.txt.

<ParamField type="list">
  Package names or a `requirements.txt` path.
</ParamField>

```python theme={null}
image = Image().add_python_packages(["transformers==4.44.0", "torch==2.4.0"])
```

### `Image.add_commands()`

Shell commands to run during the build in the order added.

<ParamField type="list">
  Shell commands.
</ParamField>

```python theme={null}
image = Image().add_commands(["apt-get update -y", "apt-get install -y ffmpeg"])
```

### `Image.with_envs()`

Add environment variables available during build and at runtime.

<ParamField type="list">
  One `KEY=VALUE`, a list of them, or a dict.
</ParamField>

```python theme={null}
image = Image().with_envs({"HF_HOME":"/models","HF_HUB_ENABLE_HF_TRANSFER":"1"})
```
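
The three accepted shapes (a single `KEY=VALUE` string, a list of such strings, or a dict) all describe the same mapping. The sketch below shows that equivalence by normalizing each shape to a dict. `normalize_envs` is a hypothetical helper for illustration and is not how the SDK is implemented.

```python
def normalize_envs(envs):
    """Convert a `KEY=VALUE` string, a list of them, or a dict into a dict."""
    if isinstance(envs, dict):
        return dict(envs)
    if isinstance(envs, str):
        envs = [envs]
    result = {}
    for item in envs:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {item!r}")
        result[key] = value
    return result


as_string = normalize_envs("HF_HOME=/models")
as_list = normalize_envs(["HF_HOME=/models", "HF_HUB_ENABLE_HF_TRANSFER=1"])
as_dict = normalize_envs({"HF_HOME": "/models", "HF_HUB_ENABLE_HF_TRANSFER": "1"})
```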

### `Image.with_secrets()`

Expose platform secrets to the build environment.

<ParamField type="List[str]">
  Secret names created via the platform.
</ParamField>

```python theme={null}
image = Image().with_secrets(["HF_TOKEN"])
```

### `Image.micromamba()`

Switch package management to micromamba and target a micromamba Python.

```python theme={null}
image = Image(python_version="python3.11").micromamba()
```

### `Image.add_micromamba_packages()`

Install micromamba packages and optional channels.

<ParamField type="Union[Sequence[str], str]">
  Package names or a `requirements.txt` path.
</ParamField>

<ParamField type="Sequence[str]">
  Micromamba channels.
</ParamField>

```python theme={null}
image = Image().micromamba().add_micromamba_packages(packages=["pandas","numpy"], channels=["conda-forge"])
```

### `Image.build_with_gpu()`

Request that the build run on a GPU node. This is useful when an installer detects a GPU and compiles CUDA components during installation.

<ParamField type="string">
  GPU type such as `T4`, `A10G`, `H100`, `4090`.
</ParamField>

```python theme={null}
image = Image().add_commands(["pip install xformers"]).build_with_gpu("A10G")
```

## `Context`

Context is a dataclass used to store various useful fields you might want to access in your entry point logic.

| Field Name       | Type           | Default Value | Purpose                                                |
| ---------------- | -------------- | ------------- | ------------------------------------------------------ |
| `container_id`   | Optional\[str] | None          | Unique identifier for a container                      |
| `stub_id`        | Optional\[str] | None          | Identifier for a stub                                  |
| `stub_type`      | Optional\[str] | None          | Type of the stub (function, endpoint, task queue, etc) |
| `callback_url`   | Optional\[str] | None          | URL called when the task status changes                |
| `task_id`        | Optional\[str] | None          | Identifier for the specific task                       |
| `timeout`        | Optional\[int] | None          | Maximum time allowed for the task to run (seconds)     |
| `on_start_value` | Optional\[Any] | None          | Any values returned from the `on_start` function       |
| `bind_port`      | int            | 0             | Port number to bind a service to                       |
| `python_version` | str            | ""            | Version of Python to be used                           |

## `Client`

You can use this to track the state of tasks and deployments.

```python theme={null}
from beam import Client, function

@function()
def handler():
    client = Client(
        token="YOUR_TOKEN"
    )

    # Get a deployment by its ID
    deployment = client.get_deployment_by_id("YOUR_DEPLOYMENT_ID")

    # Submit a task
    file = client.upload_file("./app.py")
    task = deployment.submit(input={"audio_file": file})

    task = deployment.submit(input={"text": "Hello world"})

    # Get a deployment by its stub ID
    deployment = client.get_deployment_by_stub_id("YOUR_STUB_ID")
    task = deployment.submit(input={"data": "example"})

    # Get a task by its ID
    task = client.get_task_by_id("YOUR_TASK_ID")
    result = task.result(wait=True)


if __name__ == "__main__":
    handler.remote()
```

<ParamField type="string">
  Authentication token for the Beam API. If not provided, will use the
  `BEAM_TOKEN` environment variable.
</ParamField>

### `Client.upload_file()`

Upload a local file to be used as input to a function or deployment.

<ParamField type="string">
  The path to the local file to upload.
</ParamField>

```python theme={null}
client = Client(token="YOUR_TOKEN")
file_url = client.upload_file("./data.csv")
```

### `Client.get_task_by_id()`

Retrieve a task by its task ID.

<ParamField type="string">
  The task ID to retrieve.
</ParamField>

```python theme={null}
client = Client(token="YOUR_TOKEN")
task = client.get_task_by_id("YOUR_TASK_ID")
result = task.result(wait=True)
```

### `Client.get_deployment_by_id()`

Retrieve a deployment using its deployment ID.

<ParamField type="string">
  The deployment ID to retrieve.
</ParamField>

```python theme={null}
client = Client(token="YOUR_TOKEN")
deployment = client.get_deployment_by_id("YOUR_DEPLOYMENT_ID")
task = deployment.submit(input={"text": "Hello world"})
```

### `Client.get_deployment_by_stub_id()`

Retrieve a deployment using the associated stub ID.

<ParamField type="string">
  The stub ID associated with the deployment.
</ParamField>

```python theme={null}
client = Client(token="YOUR_TOKEN")
deployment = client.get_deployment_by_stub_id("YOUR_STUB_ID")
task = deployment.submit(input={"data": "example"})
```

## Sandbox

A sandboxed container for running Python code or arbitrary processes.

You can use this to create isolated environments where you can execute code,
manage files, and run processes.

<ParamField type="bool">
  Whether to block all outbound network access from the sandbox. When enabled,
  the sandbox cannot make outbound connections to external services, but inbound
  connections to exposed ports are still allowed. Cannot be used together with
  `allow_list`.
</ParamField>

<ParamField type="List[str]">
  List of CIDR ranges that the sandbox is allowed to connect to. All other
  outbound network access will be blocked. Must use CIDR notation (e.g.,
  `"8.8.8.8/32"` for a single IP, `"10.0.0.0/8"` for a range). Supports both
  IPv4 and IPv6. Maximum of 10 CIDR entries. Cannot be used together with
  `block_network`.
</ParamField>

<ParamField type="Optional[List[int]]">
  List of ports to expose from the sandbox. When specified, these ports will
  be accessible via public URLs immediately upon sandbox creation. You can also
  dynamically expose additional ports at runtime using the `expose_port()` method.
  Default is an empty list.
</ParamField>

### `Sandbox.connect()`

Connect to an existing sandbox instance by ID.

<ParamField type="<class 'str'>">
  The container ID of the existing sandbox instance.
</ParamField>

```python theme={null}
# Connect to an existing sandbox
instance = Sandbox().connect("sandbox-123")
```

### `Sandbox.create()`

Create a new sandbox instance.

This method creates a new containerized sandbox environment with the
specified configuration.

```python theme={null}
# Create a new sandbox
instance = Sandbox().create()
print(f"Sandbox created with ID: {instance.sandbox_id()}")
```

### `Sandbox.create_from_memory_snapshot()`

Create a new sandbox instance from a memory snapshot.

This method creates a new containerized sandbox environment with the
specified configuration from a memory snapshot.

```python theme={null}
# Create a new sandbox from a memory snapshot
instance = Sandbox().create_from_memory_snapshot(snapshot_id)
print(f"Sandbox created with ID: {instance.sandbox_id()}")
```

### `Sandbox.debug()`

Print the debug buffer contents to stdout.

This method outputs any debug information that has been collected
during sandbox operations.

## SandboxInstance

A sandbox instance that provides access to the sandbox internals.

This class represents an active sandboxed container and provides methods for
process management, file system operations, preview URLs, and lifecycle
management.

### `SandboxInstance.expose_port()`

Dynamically expose a port to the internet.

This method creates a public URL that allows external access to a specific
port within the sandbox. The URL is SSL-terminated and provides secure
access to services running in the sandbox.

<ParamField type="<class 'int'>">
  The port number to expose within the sandbox.
</ParamField>

```python theme={null}
# Expose port 8000 for a web service
url = instance.expose_port(8000)
print(f"Web service available at: {url}")
```

### `SandboxInstance.list_urls()`

List the URLs / ports that are exposed on the sandbox.

This method returns the preview URLs for the exposed ports, keyed by port number.

```python theme={null}
urls = instance.list_urls()
for port, url in urls.items():
    print(f"Port {port} available at: {url}")

print(f"Available URLs: {urls}")
```

### `SandboxInstance.update_network_permissions()`

Update the network permissions of a running sandbox without restart.

This method allows you to dynamically change network access policies while
the sandbox is running. You can block all outbound traffic or specify an
allowlist of CIDR ranges. Exposed ports remain accessible regardless of
these restrictions.

<ParamField type="bool">
  Whether to block all outbound network access from the sandbox. When enabled,
  the sandbox cannot make outbound connections to external services, but inbound
  connections to exposed ports are still allowed. Cannot be used together with
  `allow_list`.
</ParamField>

<ParamField type="Optional[List[str]]">
  List of CIDR ranges that the sandbox is allowed to connect to. All other
  outbound network access will be blocked. Must use CIDR notation (e.g.,
  `"8.8.8.8/32"` for a single IP, `"10.0.0.0/8"` for a range). Supports both
  IPv4 and IPv6. Maximum of 10 CIDR entries. Cannot be used together with
  `block_network=True`.
</ParamField>

```python theme={null}
# Block all outbound traffic
instance.update_network_permissions(block_network=True)

# Allow only specific destinations
instance.update_network_permissions(
    allow_list=["8.8.8.8/32", "10.0.0.0/8"]
)

# Remove all restrictions
instance.update_network_permissions(block_network=False, allow_list=[])
```
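The constraints described above (CIDR notation, at most 10 entries, and no combining `block_network` with an allowlist) can be checked locally before making the call. The following is a minimal sketch using only the standard library; `validate_network_options` is a hypothetical helper, not part of the Beam SDK, and the server may apply additional rules.

```python
import ipaddress
from typing import List, Optional

MAX_CIDR_ENTRIES = 10  # limit stated in the parameter description above


def validate_network_options(
    block_network: bool = False,
    allow_list: Optional[List[str]] = None,
) -> List[str]:
    """Hypothetical client-side check; returns the normalized CIDR strings."""
    entries = allow_list or []
    if block_network and entries:
        raise ValueError("block_network and allow_list cannot be combined")
    if len(entries) > MAX_CIDR_ENTRIES:
        raise ValueError(f"allow_list accepts at most {MAX_CIDR_ENTRIES} entries")
    # ip_network raises ValueError for malformed addresses or prefixes, and for
    # entries with host bits set (e.g. "10.0.0.1/8"). A bare address such as
    # "8.8.8.8" is accepted and normalized to "8.8.8.8/32".
    return [str(ipaddress.ip_network(entry)) for entry in entries]


print(validate_network_options(allow_list=["8.8.8.8/32", "10.0.0.0/8", "2001:db8::/32"]))
```

The returned list could then be passed as `allow_list` to `update_network_permissions()`.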

### `SandboxInstance.sandbox_id()`

Get the ID of the sandbox.

```python theme={null}
sandbox_id = instance.sandbox_id()
print(f"Working with sandbox: {sandbox_id}")
```

### `SandboxInstance.terminate()`

Terminate the container associated with this sandbox instance.

This method stops the sandbox container and frees up associated resources.
Once terminated, the sandbox instance cannot be used for further operations.

```python theme={null}
# Terminate the sandbox
success = instance.terminate()
if success:
    print("Sandbox terminated successfully")
```

### `SandboxInstance.update_ttl()`

Update the keep warm setting of the sandbox.

This method allows you to change how long the sandbox will remain active
before automatically shutting down.

<ParamField type="<class 'int'>">
  The number of seconds to keep the sandbox alive. Use -1 for sandboxes that
  never timeout.
</ParamField>

```python theme={null}
# Keep the sandbox alive for 1 hour
instance.update_ttl(3600)

# Make the sandbox never timeout
instance.update_ttl(-1)
```

### `SandboxInstance.create_image_from_filesystem()`

Create a filesystem snapshot of the current sandbox.

This method captures the filesystem state of the sandbox as an immutable artifact.

You can later restore this snapshot into a new sandbox instance.

```python theme={null}
# Take a filesystem snapshot of the sandbox
image_id = instance.create_image_from_filesystem()
print(f"Created image: {image_id}")
```

### `SandboxInstance.snapshot_memory()`

Create a memory snapshot of the current sandbox.

This method captures the memory state of the sandbox as an immutable artifact.

You can later restore this snapshot into a new sandbox instance.

```python theme={null}
# Take a memory snapshot of the sandbox
snapshot_id = instance.snapshot_memory()
print(f"Created snapshot: {snapshot_id}")
```

## SandboxProcess

Represents a running process within a sandbox.

This class provides control and monitoring capabilities for processes
running in the sandbox. It allows you to wait for completion, kill
processes, check status, and access output streams.

### `SandboxProcess.kill()`

Kill the process.

This method forcefully terminates the running process. Use this
when you need to stop a process that is not responding or when
you want to cancel a long-running operation.

```python theme={null}
import time

process = pm.exec("sleep", "100")

# Kill the process after 5 seconds
time.sleep(5)
process.kill()
```

### `SandboxProcess.logs()`

Returns a combined stream of both stdout and stderr.

This is a convenience property that combines both output streams.
The streams are read concurrently, so if one stream is empty, it won't block
the other stream from being read.

```python theme={null}
process = pm.exec("python3", "-c", "import sys; print('stdout'); print('stderr', file=sys.stderr)")

# Read combined output
for line in process.logs:
    print(f"LOG: {line.strip()}")

# Or read all at once
all_logs = process.logs.read()
```

### `SandboxProcess.status()`

Get the status of the process.

This method returns the current exit code and status string of the process.
An exit code of -1 indicates the process is still running.

```python theme={null}
import time

process = pm.exec("sleep", "5")

# Check status periodically
while True:
    exit_code, status = process.status()
    if exit_code >= 0:
        print(f"Process finished with exit code: {exit_code}")
        break
    time.sleep(1)
```
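A polling loop like the one above runs forever if the process never exits. One way to bound it is to add a deadline. The sketch below assumes only what this section states: `status()` returns an `(exit_code, status)` pair and a negative exit code means "still running". `wait_for_exit` and `FakeProcess` are hypothetical names used for illustration, and the status strings are made up.

```python
import time
from typing import Tuple


def wait_for_exit(process, timeout: float = 30.0, interval: float = 0.5) -> int:
    """Poll process.status() until the exit code is non-negative or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        exit_code, _status = process.status()
        if exit_code >= 0:
            return exit_code
        if time.monotonic() >= deadline:
            raise TimeoutError(f"process still running after {timeout} seconds")
        time.sleep(interval)


class FakeProcess:
    """Stand-in for a sandbox process, used only to exercise the helper locally."""

    def __init__(self, polls_until_done: int) -> None:
        self._remaining = polls_until_done

    def status(self) -> Tuple[int, str]:
        if self._remaining > 0:
            self._remaining -= 1
            return -1, "running"  # illustrative status string
        return 0, "exited"  # illustrative status string


print(wait_for_exit(FakeProcess(polls_until_done=3), timeout=5.0, interval=0.01))
```

With a real sandbox process, the same helper could be called as `wait_for_exit(process, timeout=60)`.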

### `SandboxProcess.stderr`

Get a handle to a stream of the process's stderr.

```python theme={null}
process = pm.exec("python3", "-c", "import sys; print('Error', file=sys.stderr)")
stderr_content = process.stderr.read()
print(f"STDERR: {stderr_content}")
```

### `SandboxProcess.stdout`

Get a handle to a stream of the process's stdout.

```python theme={null}
process = pm.exec("echo", "Hello World")
stdout_content = process.stdout.read()
print(f"STDOUT: {stdout_content}")
```

### `SandboxProcess.wait()`

Wait for the process to complete.

This method blocks until the process finishes execution and returns
the exit code. It polls the process status until completion.

```python theme={null}
process = pm.exec("long_running_command")
exit_code = process.wait()
if exit_code == 0:
    print("Command completed successfully")
```

## SandboxProcessManager

Manager for executing and controlling processes within a sandbox.

This class provides a high-level interface for running commands and Python
code within the sandbox environment. It supports both blocking and non-blocking
execution, environment variable configuration, and working directory specification.

### `SandboxProcessManager.exec()`

Run an arbitrary command in the sandbox.

This method executes shell commands within the sandbox environment.
The command is executed using the shell available in the sandbox.

<ParamField type="string">
  The command and its arguments to execute.
</ParamField>

<ParamField type="Optional[str]">
  The working directory to run the command in. Default is None.
</ParamField>

<ParamField type="Optional[Dict[str, str]]">
  Environment variables to set for the command. Default is None.
</ParamField>

```python theme={null}
# Run a simple command
process = pm.exec("ls", "-la")
process.wait()

# Run with custom environment
process = pm.exec("echo", "$CUSTOM_VAR", env={"CUSTOM_VAR": "hello"})

# Run in specific directory
process = pm.exec("pwd", cwd="/tmp")
```

### `SandboxProcessManager.get_process()`

Get a process by its PID.

<ParamField type="<class 'int'>">
  The process ID to look up.
</ParamField>

```python theme={null}
try:
    process = pm.get_process(12345)
    print(f"Found process: {process.pid}")
except SandboxProcessError:
    print("Process not found")
```

### `SandboxProcessManager.list_processes()`

List all processes running in the sandbox.

```python theme={null}
processes = pm.list_processes()
for pid, process in processes.items():
    print(f"Process {pid} is running")
```

### `SandboxProcessManager.run_code()`

Run Python code in the sandbox.

This method executes Python code within the sandbox environment. The code
is executed using the Python interpreter available in the sandbox.

<ParamField type="<class 'str'>">
  The Python code to execute.
</ParamField>

<ParamField type="<class 'bool'>">
  Whether to wait for the process to complete. If True, returns
  SandboxProcessResponse. If False, returns SandboxProcess.
</ParamField>

<ParamField type="Optional[str]">
  The working directory to run the code in. Default is None.
</ParamField>

<ParamField type="Optional[Dict[str, str]]">
  Environment variables to set for the process. Default is None.
</ParamField>

```python theme={null}
# Run blocking Python code
result = pm.run_code("print('Hello from sandbox!')")
print(result.result)

# Run non-blocking Python code
process = pm.run_code("import time; time.sleep(10)", blocking=False)
# Do other work while process runs
process.wait()
```
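Because `run_code()` takes the code as a string, multi-line snippets are easier to manage if they are dedented and syntax-checked before being sent. The following is a small sketch of that idea; `build_snippet` is a hypothetical helper, and the local `compile()` check uses your local Python version, which may differ from the interpreter inside the sandbox.

```python
import textwrap


def build_snippet(source: str) -> str:
    """Dedent a multi-line snippet and fail early if it is not valid Python."""
    code = textwrap.dedent(source).strip() + "\n"
    compile(code, "<sandbox-snippet>", "exec")  # raises SyntaxError on bad input
    return code


snippet = build_snippet(
    """
    import json

    payload = {"status": "ok", "items": [1, 2, 3]}
    print(json.dumps(payload))
    """
)
print(snippet)
# The resulting string could then be passed along, e.g. pm.run_code(snippet)
```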

## SandboxProcessResponse

Response object containing the results of a completed process execution.

This class encapsulates the output and status information from a process
that has finished running in the sandbox.

## SandboxProcessStream

A stream-like interface for reading process output in real-time.

This class provides an iterator interface for reading stdout or stderr
from a running process. It buffers output and provides both line-by-line
iteration and bulk reading capabilities.

Example:

```python theme={null}
# Get a process stream
process = pm.exec("echo", "Hello World")

# Read line by line
for line in process.stdout:
    print(f"Output: {line.strip()}")

# Read all output at once
all_output = process.stdout.read()
```

### `SandboxProcessStream()`

### `SandboxProcessStream.read()`

Read all remaining output from the stream.

## SandboxProcessError

## SandboxConnectionError

## SandboxFileInfo

Metadata of a file in the sandbox.

This class provides detailed information about files and directories
within the sandbox filesystem, including permissions, ownership,
and modification times.

## SandboxFileSystem

File system interface for managing files within a sandbox.

This class provides a comprehensive API for file operations within
the sandbox, including uploading, downloading, listing, and managing
files and directories.

### `SandboxFileSystem.create_directory()`

Create a directory in the sandbox.

Note: This method is not yet implemented.

<ParamField type="<class 'str'>">
  The path where the directory should be created.
</ParamField>

### `SandboxFileSystem.delete_directory()`

Delete a directory in the sandbox.

Note: This method is not yet implemented.

<ParamField type="<class 'str'>">
  The path of the directory to delete.
</ParamField>

### `SandboxFileSystem.delete_file()`

Delete a file in the sandbox.

This method removes a file from the sandbox filesystem.

<ParamField type="<class 'str'>">
  The path to the file within the sandbox.
</ParamField>

```python theme={null}
# Delete a temporary file
fs.delete_file("/tmp/temp_file.txt")

# Delete a log file
fs.delete_file("/var/log/old_log.log")
```

### `SandboxFileSystem.download_file()`

Download a file from the sandbox to a local path.

This method downloads a file from the sandbox filesystem and
saves it to the specified local path.

<ParamField type="<class 'str'>">
  The path to the file within the sandbox.
</ParamField>

<ParamField type="<class 'str'>">
  The destination path on the local filesystem.
</ParamField>

```python theme={null}
# Download a log file
fs.download_file("/var/log/app.log", "local_app.log")

# Download to a specific directory
fs.download_file("/output/result.txt", "./results/result.txt")
```

### `SandboxFileSystem.find_in_files()`

Find files matching a pattern in the sandbox.

This method searches for files within the specified directory
that match the given pattern.

<ParamField type="<class 'str'>">
  The directory path to search in.
</ParamField>

<ParamField type="<class 'str'>">
  The pattern to match files against.
</ParamField>

```python theme={null}
# Find all Python files
python_files = fs.find_in_files("/workspace", "*.py")

# Find all log files
log_files = fs.find_in_files("/var/log", "*.log")
```

### `SandboxFileSystem.list_files()`

List the files in a directory in the sandbox.

This method returns information about all files and directories
within the specified directory in the sandbox.

<ParamField type="<class 'str'>">
  The path to the directory within the sandbox.
</ParamField>

```python theme={null}
# List files in the root directory
files = fs.list_files("/")
for file_info in files:
    if file_info.is_dir:
        print(f"Directory: {file_info.name}")
    else:
        print(f"File: {file_info.name} ({file_info.size} bytes)")

# List files in a specific directory
workspace_files = fs.list_files("/workspace")
```
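The entries returned by a listing can be aggregated with ordinary Python. The sketch below relies only on the `name`, `is_dir`, and `size` attributes used in the example above; `summarize_listing` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins so the snippet runs without a sandbox.

```python
from types import SimpleNamespace
from typing import Iterable, Tuple


def summarize_listing(entries: Iterable) -> Tuple[int, int, int]:
    """Return (directory_count, file_count, total_file_bytes) for a listing."""
    dirs = files = total = 0
    for entry in entries:
        if entry.is_dir:
            dirs += 1
        else:
            files += 1
            total += entry.size
    return dirs, files, total


# Stand-in entries; in practice these would come from fs.list_files(path).
listing = [
    SimpleNamespace(name="workspace", is_dir=True, size=0),
    SimpleNamespace(name="app.log", is_dir=False, size=2048),
    SimpleNamespace(name="main.py", is_dir=False, size=512),
]
print(summarize_listing(listing))  # (1, 2, 2560)
```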

### `SandboxFileSystem.replace_in_files()`

Replace a string in all files in a directory.

This method performs a find-and-replace operation on all files
within the specified directory, replacing occurrences of the
old string with the new string.

<ParamField type="<class 'str'>">
  The directory path to search in.
</ParamField>

<ParamField type="<class 'str'>">
  The string to find and replace.
</ParamField>

<ParamField type="<class 'str'>">
  The string to replace with.
</ParamField>

```python theme={null}
# Replace a configuration value
fs.replace_in_files("/config", "old_host", "new_host")

# Update version numbers
fs.replace_in_files("/app", "1.0.0", "1.1.0")
```

### `SandboxFileSystem.stat_file()`

Get the metadata of a file in the sandbox.

This method retrieves detailed information about a file or directory
within the sandbox, including size, permissions, ownership, and
modification time.

<ParamField type="<class 'str'>">
  The path to the file within the sandbox.
</ParamField>

```python theme={null}
# Get file information
file_info = fs.stat_file("/path/to/file.txt")
print(f"File size: {file_info.size} bytes")
print(f"Is directory: {file_info.is_dir}")
print(f"Modified: {file_info.mod_time}")
```

### `SandboxFileSystem.upload_file()`

Upload a local file to the sandbox.

This method reads a file from the local filesystem and uploads
it to the specified path within the sandbox.

<ParamField type="<class 'str'>">
  The path to the local file to upload.
</ParamField>

<ParamField type="<class 'str'>">
  The destination path within the sandbox.
</ParamField>

```python theme={null}
# Upload a Python script
fs.upload_file("my_script.py", "/workspace/script.py")

# Upload to a subdirectory
fs.upload_file("config.json", "/app/config/config.json")
```

## SandboxFileSystemError

## SandboxFilePosition

A position in a file.

### `SandboxFilePosition()`

## SandboxFileSearchMatch

A match in a file.

### `SandboxFileSearchMatch()`

## SandboxFileSearchRange

A range in a file.

### `SandboxFileSearchRange()`

## `Pod`

A **Pod** is an object that allows you to run arbitrary services in a fast, scalable, and secure remote container on Beam.

You can think of a Pod as a lightweight compute environment that you fully control—complete with a custom container, ports you can expose, environment variables, volumes, secrets, and GPUs.

```python theme={null}
from beam import Pod, Image

# Create a Pod that runs a simple HTTP server
pod = Pod(
    name="web-server",
    cpu=2,
    memory="512Mi",
    image=Image(
        base_image="python:3.9-slim",
        python_packages=["requests"],
    ),
    ports=[8000],  # We'll expose port 8000
)

# Spin up the Pod container, running `python -m http.server 8000`
result = pod.create(entrypoint=["python", "-m", "http.server", "8000"])

print("Container ID:", result.container_id)
print("URL:", result.url)
```

When you run this snippet (e.g., `python app.py`), Beam will:

* Build your container (if necessary) and sync your local files to the remote environment.
* Create a Pod container with the specified resources (2 CPU cores, 512 MiB memory).
* Run `python -m http.server 8000` inside that remote container.
* Expose the container on port 8000. You’ll get back a container ID and a URL to access it.

Once the Pod is running, you can perform additional operations, such as opening an interactive shell inside the container or deploying the Pod as a named app.

<ParamField type="List[str]">
  The command to run in the container. By default, nothing is specified, so you
  must provide an entrypoint to actually run anything. You can override or
  provide this entrypoint at creation time using `pod.create(entrypoint=...)`.
</ParamField>

<ParamField type="Optional[List[int]]">
  A list of ports to expose. If provided, the container will be accessible
  through an HTTP URL for each port opened. For example, if `[8000]` is
  specified, you'll get `<Pod URL>:8000`.
</ParamField>

<ParamField type="Optional[str]">
  An optional name for the pod. If you plan to deploy this Pod (i.e., treat it
  as a persistent app), you should specify a name. If you do not specify a name,
  Beam will generate a random name at deploy time, or you must specify
  `--name=...` in the CLI.
</ParamField>

<ParamField type="Union[int, float, str]">
  The amount of CPU allocated to the container. For example, `2` means 2 CPU
  cores, `"500m"` means half a CPU core (500 millicores), and `1.0` means 1 CPU core.
</ParamField>

<ParamField type="Union[int, str]">
  The amount of memory (in MiB) allocated to the container. You can also specify
  this as a string with units (e.g., `"512Mi"`, `"2Gi"`).
</ParamField>

<ParamField type="Union[GpuTypeAlias, List[GpuTypeAlias]]">
  The type or name of the GPU device to be used for GPU-accelerated tasks. You
  can specify multiple GPUs by providing a list (in which case the scheduler
  prioritizes their selection based on the order in the list). If no GPU is
  required, leave it empty.
</ParamField>

<ParamField type="int">
  The number of GPUs allocated to the container. If a GPU is specified but this
  value is set to 0, it will automatically be updated to 1.
</ParamField>

<ParamField type="Image">
  The container image to be used for running the Pod. Defaults to a basic Beam
  `Image` object, which can be customized (e.g., `base_image=`,
  `python_packages=`, and more).
</ParamField>

<ParamField type="Optional[List[Volume]]">
  A list of volumes to be mounted into the container. Volumes allow you to
  persist data or mount external storage services, such as S3-compatible
  buckets.
</ParamField>

<ParamField type="Optional[List[str]]">
  A list of secrets that are injected into the container as environment
  variables. Each secret must be configured in your Beam project.
</ParamField>

<ParamField type="Optional[Dict[str, str]]">
  A dictionary of environment variables to inject into the container.
  For example: `{"MY_API_KEY": "abc123"}`.
</ParamField>

<ParamField type="int">
  The number of seconds to keep the container alive after the last request. A
  value of `-1` means never scale down to zero (i.e., keep the container running
  indefinitely). This only applies if you deploy the Pod.
</ParamField>

<ParamField type="bool">
  If `False`, allows the container to be accessed without an auth token. This is
  useful for public-facing services. If you need to secure it behind an auth
  token, set it to `True`.
</ParamField>

<ParamField type="bool">
  Whether to block all outbound network access from the pod. When enabled, the
  pod cannot make outbound connections to external services, but inbound
  connections to exposed ports are still allowed. Cannot be used together with
  `allow_list`.
</ParamField>

<ParamField type="List[str]">
  List of CIDR ranges that the pod is allowed to connect to. All other outbound
  network access will be blocked. Must use CIDR notation (e.g., `"8.8.8.8/32"`
  for a single IP, `"10.0.0.0/8"` for a range). Supports both IPv4 and IPv6.
  Maximum of 10 CIDR entries. Cannot be used together with `block_network`.
</ParamField>
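The CPU and memory parameters above accept several spellings (`2`, `1.0`, `"500m"`; `512`, `"512Mi"`, `"2Gi"`). To compare or log such values, it can help to normalize them to a single unit. The sketch below is illustrative only: `memory_to_mib` and `cpu_to_millicores` are hypothetical helpers, the `m` suffix is assumed to mean millicores, and Beam's own parsing may accept other units or enforce limits not shown here.

```python
from typing import Union

# Only the suffixes that appear in the parameter descriptions above.
_MEMORY_UNITS_MIB = {"Mi": 1, "Gi": 1024}


def memory_to_mib(value: Union[int, str]) -> int:
    """Convert 512, "512Mi", or "2Gi" to an integer number of MiB."""
    if isinstance(value, int):
        return value  # bare integers are already MiB
    for suffix, factor in _MEMORY_UNITS_MIB.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    raise ValueError(f"unsupported memory value: {value!r}")


def cpu_to_millicores(value: Union[int, float, str]) -> int:
    """Convert 2, 1.0, or "500m" to millicores (1000 millicores = 1 core)."""
    if isinstance(value, str):
        if not value.endswith("m"):
            raise ValueError(f"unsupported CPU value: {value!r}")
        return int(value[:-1])
    return int(round(value * 1000))


print(memory_to_mib("2Gi"), memory_to_mib("512Mi"), memory_to_mib(128))
print(cpu_to_millicores(2), cpu_to_millicores("500m"), cpu_to_millicores(1.0))
```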

#### `Create`

Create a new container that runs until it completes or is explicitly killed.

```python theme={null}
from beam import Pod

pod = Pod(cpu=2, memory="1Gi", ports=[8080])
result = pod.create(entrypoint=["python", "-m", "http.server", "8080"])

if result.ok:
    print("Pod created successfully!")
    print("Container ID:", result.container_id)
    print("URL:", result.url)
else:
    print("Failed to create Pod")
```

#### `Deploy`

Deploy the Pod as a named persistent service. Pods can be deployed programmatically via Python or through the CLI.

**Deploying via Python**

```python app.py theme={null}
from beam import Pod

pod = Pod(
    name="my-deployed-pod",
    cpu=2,
    memory="1Gi",
    ports=[8000],
    entrypoint=["python", "-m", "http.server", "8000"],
)

# Deploy the Pod
ok = pod.deploy()
if ok:
    print("Pod successfully deployed!")
else:
    print("Pod deployment failed!")
```

```sh theme={null}
python app.py
```

**Deploying via CLI**

```python app.py theme={null}
from beam import Pod

pod = Pod(
    name="my-deployed-pod",
    cpu=2,
    memory="1Gi",
    ports=[8000],
    entrypoint=["python", "-m", "http.server", "8000"],
)
```

```sh theme={null}
beam deploy app.py:pod
```

## `Function`

Decorator for defining a remote function.

This method allows you to run the decorated function in a remote container.

```python Function theme={null}
from beam import Image, function


@function(
    cpu=1.0,
    memory=128,
    gpu="T4",
    image=Image(python_packages=["torch"]),
    keep_warm_seconds=1000,
)
def transcribe(filename: str):
    print(filename)
    return "some_result"


# Call the decorated function in a remote container
transcribe.remote("some_file.mp4")
# Map the function over several inputs
# Each of these inputs will be routed to remote containers
for result in transcribe.map(["file1.mp4", "file2.mp4"]):
    print(result)
```

<ParamField type="float">
  The number of CPU cores allocated to the container.
</ParamField>

<ParamField type="int">
  The amount of memory allocated to the container. It should be specified in
  MiB, or as a string with units (e.g., "1Gi").
</ParamField>

<ParamField type="string">
  The type or name of the GPU device to be used for GPU-accelerated tasks. If
  not applicable or no GPU required, leave it empty. Multiple GPUs can be
  specified as a list.
</ParamField>

<ParamField type="string">
  The container image used for task execution.
</ParamField>

<ParamField type="int">
  The maximum number of seconds a task can run before timing out. Set to -1 to
  disable the timeout.
</ParamField>

<ParamField type="int">
  The maximum number of times a task will be retried if the container crashes.
</ParamField>

<ParamField type="string">
  An optional URL to send a callback to when a task is completed, timed out, or
  cancelled.
</ParamField>

<ParamField type="list">
  A list of storage volumes to be associated with the function.
</ParamField>

<ParamField type="list">
  A list of secrets that are injected into the container as environment
  variables.
</ParamField>

<ParamField type="string">
  An optional name for this function, used during deployment. If not specified,
  you must specify the name at deploy time with the `--name` argument.
</ParamField>

<ParamField type="TaskPolicy">
  The task policy for the function. This helps manage the lifecycle of an
  individual task. Setting values here will override timeout and retries.
</ParamField>

<ParamField type="list">
  A list of exceptions that will trigger a retry.
</ParamField>

#### `Remote`

You can run any function remotely on Beam by using the `.remote()` method:

```python theme={null}
from beam import function


@function(cpu=8)
def square(i: int):
    return i**2


if __name__ == "__main__":
    # Run the `square` function remotely on Beam
    result = square.remote(i=12)
    print(result)
```

The code above is invoked by running `python example.py`:

```bash theme={null}
(.venv) user@MacBook % python example.py
=> Building image
=> Using cached image
=> Syncing files
=> Files synced
=> Running function: <example:square>
=> Function complete <908c76b1-ee68-4b33-ac3a-026ae646625f>
144
```

#### `Map`

You can scale out workloads to many containers using the `.map()` method. You might use this for parallelizing computationally heavy tasks, such as batch inference or data processing jobs.

```python theme={null}
from beam import function


@function(cpu=8)
def square(i: int):
    return i**2


def main():
    numbers = list(range(10))
    squared = []

    # Run a remote container for every item in list
    for result in square.map(numbers):
        squared.append(result)

    print(squared)


if __name__ == "__main__":
    main()
```
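To get a feel for the fan-out pattern without starting remote containers, the same loop can be mimicked locally with the standard library. This is only an analogy: threads stand in for containers, and nothing here reflects how Beam schedules work or in what order it returns results.

```python
from concurrent.futures import ThreadPoolExecutor


def square(i: int) -> int:
    return i**2


def main() -> list:
    numbers = list(range(10))
    # Each input is handed to a worker thread; ThreadPoolExecutor.map yields
    # results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(square, numbers))


if __name__ == "__main__":
    print(main())
```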

## `Schedule`

This method allows you to schedule the decorated function to run at specific intervals defined by a cron expression.

```python theme={null}
from beam import schedule


@schedule(when="*/5 * * * *", name="scheduled-job")
def task():
    print("Hi, from scheduled task!")
```

<ParamField type="string">
  The cron expression or predefined schedule that determines when the task will run.
  This parameter defines the interval or specific time when the task should execute.
</ParamField>

<ParamField type="float">
  The number of CPU cores allocated to the container.
</ParamField>

<ParamField type="int">
  The amount of memory allocated to the container. It should be specified in
  megabytes (e.g., 128 for 128 megabytes).
</ParamField>

<ParamField type="string">
  The type or name of the GPU device to be used for GPU-accelerated tasks. If
  not applicable or no GPU required, leave it empty.
</ParamField>

<ParamField type="string">
  The container image used for task execution.
</ParamField>

<ParamField type="float">
  The maximum number of seconds a task can run before it times out. The default
  is 180. Set it to -1 to disable the timeout.
</ParamField>

<ParamField type="int">
  The number of concurrent tasks to handle per container. Modifying this
  parameter can improve throughput for certain workloads. Workers will share the
  CPU, Memory, and GPU defined. You may need to increase these values to
  increase concurrency.
</ParamField>

<ParamField type="int">
  The maximum number of tasks that can be pending in the queue. If the number of
  pending tasks exceeds this value, the task queue will stop accepting new
  tasks.
</ParamField>

<ParamField type="string">
  An optional URL to send a callback to when a task is completed, timed out, or
  cancelled.
</ParamField>

<ParamField type="int">
  The maximum number of times a task will be retried if the container crashes.
</ParamField>

<ParamField type="list">
  A list of volumes to be mounted to the container.
</ParamField>

<ParamField type="list">
  A list of secrets that are injected into the container as environment
  variables.
</ParamField>

<ParamField type="string">
  An optional name for this scheduled function, used during deployment. If not
  specified, you must specify the name at deploy time with the `--name` argument.
</ParamField>

**Scheduling Options**

| **Predefined Schedule**    | **Description**                                            | **Cron Expression** |
| -------------------------- | ---------------------------------------------------------- | ------------------- |
| `@yearly` (or `@annually`) | Run once a year at midnight on January 1st                 | `0 0 1 1 *`         |
| `@monthly`                 | Run once a month at midnight on the first day of the month | `0 0 1 * *`         |
| `@weekly`                  | Run once a week at midnight on Sunday                      | `0 0 * * 0`         |
| `@daily` (or `@midnight`)  | Run once a day at midnight                                 | `0 0 * * *`         |
| `@hourly`                  | Run once an hour at the beginning of the hour              | `0 * * * *`         |

## `Endpoint`

Decorator used for deploying a web endpoint.

```python theme={null}
from beam import endpoint, Image


@endpoint(
    cpu=1.0,
    memory=128,
    gpu="T4",
    image=Image(python_packages=["torch"]),
    keep_warm_seconds=1000,
)
def multiply(**inputs):
    result = inputs["x"] * 2
    return {"result": result}
```

<ParamField type="float">
  The number of CPU cores allocated to the container.
</ParamField>

<ParamField type="int">
  The amount of memory allocated to the container. It should be specified in
  megabytes (e.g., 128 for 128 megabytes).
</ParamField>

<ParamField type="string">
  The type or name of the GPU device to be used for GPU-accelerated tasks. If
  not applicable or no GPU required, leave it empty.
</ParamField>

<ParamField type="string">
  The container image used for task execution.
</ParamField>

<ParamField type="float">
  The maximum number of seconds a task can run before it times out. The default
  is 180. Set it to -1 to disable the timeout.
</ParamField>

<ParamField type="int">
  The number of concurrent tasks to handle per container. Modifying this
  parameter can improve throughput for certain workloads. Workers will share the
  CPU, Memory, and GPU defined. You may need to increase these values to
  increase concurrency.
</ParamField>

<ParamField type="int">
  The duration in seconds to keep the task queue warm even if there are no
  pending tasks. Keeping the queue warm helps to reduce the latency when new
  tasks arrive. Default is 10s.
</ParamField>

<ParamField type="int">
  The maximum number of tasks that can be pending in the queue. If the number of
  pending tasks exceeds this value, the task queue will stop accepting new
  tasks.
</ParamField>

<ParamField type="Function">
  A function that runs when the container first starts. The return values of the
  `on_start` function can be retrieved by passing a `context` argument to your
  handler function.
</ParamField>

<ParamField type="list">
  A list of volumes to be mounted to the container.
</ParamField>

<ParamField type="list">
  A list of secrets that are injected into the container as environment
  variables.
</ParamField>

<ParamField type="string">
  An optional name for this endpoint, used during deployment. If not specified,
  you must specify the name at deploy time with the `--name` argument.
</ParamField>

<ParamField type="boolean">
  If false, allows the endpoint to be invoked without an auth token.
</ParamField>

<ParamField type="int">
  The maximum number of times a task will be retried if the container crashes.
</ParamField>

<ParamField type="boolean">
  Capture a memory snapshot of the running container after `on_start` completes,
  speeding up cold boot. Initial checkpoints can take up to 3 minutes to
  capture, and 5 minutes to distribute among our servers.
</ParamField>
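
Once deployed, an endpoint is invoked over HTTP. The sketch below only builds the request so that it can be inspected without a deployment. The URL and token are placeholders, and the `Authorization: Bearer` header is the scheme assumed here for authorized endpoints. The JSON body is assumed to arrive as the handler's keyword arguments (`x` in the `multiply` example above).

```python
import json
import urllib.request

# Placeholders: substitute your deployment's URL and your own auth token.
ENDPOINT_URL = "https://example.com/endpoint/multiply"
AUTH_TOKEN = "YOUR_AUTH_TOKEN"


def build_request(url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for an endpoint."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(ENDPOINT_URL, AUTH_TOKEN, {"x": 21})
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```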

#### `Serve`

[`beam serve`](/v2/reference/cli#serve) monitors changes in your local file system, live-reloads the remote environment as you work, and forwards remote container logs to your local shell.

Serve is great for prototyping. You can develop in a containerized cloud environment in real time, with adjustable CPU, memory, and GPU resources.

It's also great for testing an app before deploying it. Served functions are orchestrated identically to deployments, which means you can test your Beam workflow end-to-end before deploying.

To start an ephemeral `serve` session, you'll use the `serve` command:

```sh theme={null}
beam serve app.py
```

<Info>Sessions end automatically after 10 minutes of inactivity.</Info>

<Tip>
  By default, Beam will sync all the files in your working directory to the
  remote container. This allows you to use the files you have locally while
  developing. If you want to prevent some files from getting uploaded, you can
  create a [`.beamignore`](/v2/reference/cli#ignore-local-files).
</Tip>

## `Task Queue`

Decorator for defining a task queue.

This method allows you to create a task queue out of the decorated function.

The tasks are executed asynchronously. You can interact with the task queue either through an API (when deployed), or directly in Python through the `.put()` method.

```python Task Queue theme={null}
from beam import Image, task_queue


# Define the task queue
@task_queue(
    cpu=1.0,
    memory=128,
    gpu="T4",
    image=Image(python_packages=["torch"]),
    keep_warm_seconds=1000,
)
def transcribe(filename: str):
    return {}


transcribe.put("some_file.mp4")
```

<ParamField type="float">
  The number of CPU cores allocated to the container.
</ParamField>

<ParamField type="int">
  The amount of memory allocated to the container. It should be specified in
  megabytes (e.g., 128 for 128 megabytes).
</ParamField>

<ParamField type="string">
  The type or name of the GPU device to be used for GPU-accelerated tasks. If
  not applicable or no GPU required, leave it empty.
</ParamField>

<ParamField type="string">
  The container image used for task execution.
</ParamField>

<ParamField type="float">
  The maximum number of seconds a task can run before it times out. Default is
  180\. Set it to -1 to disable the timeout.
</ParamField>

<ParamField type="int">
  The number of concurrent tasks to handle per container. Modifying this
  parameter can improve throughput for certain workloads. Workers will share the
  CPU, Memory, and GPU defined. You may need to increase these values to
  increase concurrency.
</ParamField>

<ParamField type="int">
  The duration in seconds to keep the task queue warm even if there are no
  pending tasks. Keeping the queue warm helps to reduce the latency when new
  tasks arrive. Default is 10s.
</ParamField>

<ParamField type="int">
  The maximum number of tasks that can be pending in the queue. If the number of
  pending tasks exceeds this value, the task queue will stop accepting new
  tasks.
</ParamField>

<ParamField type="string">
  An optional URL to send a callback to when a task is completed, timed out, or
  cancelled.
</ParamField>

<ParamField type="int">
  The maximum number of times a task will be retried if the container crashes.
</ParamField>

<ParamField type="list">
  A list of volumes to be mounted to the container.
</ParamField>

<ParamField type="list">
  A list of secrets that are injected into the container as environment
  variables.
</ParamField>

<ParamField type="string">
  An optional name for this task queue, used during deployment. If not specified,
  you must specify the name at deploy time with the `--name` argument.
</ParamField>

<ParamField type="list">
  A list of exceptions that will trigger a retry.
</ParamField>

<ParamField type="boolean">
  Capture a memory snapshot of the running container after `on_start` completes,
  speeding up cold boot. Initial checkpoints can take up to 3 minutes to
  capture, and 5 minutes to distribute among our servers.
</ParamField>
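
The pending-task limit (`max_pending_tasks`) behaves like backpressure on a bounded queue. The sketch below is an in-process analogy built on the standard library, not how Beam implements its queue. The `submit` helper and the limit of 2 are made up for illustration.

```python
import queue

# In-process stand-in for a task queue with a pending-task limit of 2.
pending: "queue.Queue[str]" = queue.Queue(maxsize=2)


def submit(filename: str) -> bool:
    """Try to enqueue a task; return False when the queue is already full."""
    try:
        pending.put_nowait(filename)
        return True
    except queue.Full:
        return False


results = [submit(name) for name in ["a.mp4", "b.mp4", "c.mp4"]]
print(results)  # [True, True, False]
```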

#### `Serve`

[`beam serve`](/v2/reference/cli#serve) monitors changes in your local file system, live-reloads the remote environment as you work, and forwards remote container logs to your local shell.

Serve is great for prototyping. You can develop in a containerized cloud environment in real time, with adjustable CPU, memory, and GPU resources.

It's also great for testing an app before deploying it. Served functions are orchestrated identically to deployments, which means you can test your Beam workflow end-to-end before deploying.

To start an ephemeral `serve` session, you'll use the `serve` command:

```sh theme={null}
beam serve app.py
```

<Info>Sessions end automatically after 10 minutes of inactivity.</Info>

<Tip>
  By default, Beam will sync all the files in your working directory to the
  remote container. This allows you to use the files you have locally while
  developing. If you want to prevent some files from getting uploaded, you can
  create a [`.beamignore`](/v2/reference/cli#ignore-local-files).
</Tip>

## `ASGI`

Decorator used for creating and deploying an ASGI application.

```python theme={null}
from beam import asgi, Image


@asgi(
    cpu=1.0,
    memory=128,
    gpu="T4",
    image=Image(python_packages=["fastapi"]),
    keep_warm_seconds=10,
    max_pending_tasks=100,
)
def web_server(context):
    from fastapi import FastAPI

    app = FastAPI()

    @app.post("/hello")
    async def something():
        return {"hello": True}

    @app.post("/warmup")
    async def warmup():
        return {"status": "warm"}

    return app
```

<ParamField type="float">
  The number of CPU cores allocated to the container.
</ParamField>

<ParamField type="int">
  The amount of memory allocated to the container. It should be specified in
  MiB, or as a string with units (e.g., "1Gi").
</ParamField>

<ParamField type="string">
  The type or name of the GPU device to be used for GPU-accelerated tasks. If
  not applicable or no GPU required, leave it empty.
</ParamField>

<ParamField type="string">
  The container image used for task execution.
</ParamField>

<ParamField type="list">
  A list of volumes to be mounted to the container.
</ParamField>

<ParamField type="int">
  The maximum number of seconds a task can run before timing out. Set to -1 to
  disable the timeout.
</ParamField>

<ParamField type="int">
  The maximum number of times a task will be retried if the container crashes.
</ParamField>

<ParamField type="int">
  The number of processes handling tasks per container. Workers share CPU,
  memory, and GPU resources.
</ParamField>

<ParamField type="int">
  The maximum number of concurrent requests the ASGI application can handle.
</ParamField>

<ParamField type="int">
  The duration in seconds to keep the task queue warm when there are no pending
  tasks.
</ParamField>

<ParamField type="int">
  The maximum number of tasks that can be pending in the queue.
</ParamField>

<ParamField type="list">
  A list of secrets injected into the container as environment variables.
</ParamField>

<ParamField type="string">
  An optional name for this ASGI application, used during deployment.
</ParamField>

<ParamField type="boolean">
  If false, allows the ASGI application to be invoked without an auth token.
</ParamField>

<ParamField type="Autoscaler">
  Configure deployment autoscaling using various strategies.
</ParamField>

<ParamField type="string">
  An optional URL to send a callback when a task is completed, timed out, or
  canceled.
</ParamField>

<ParamField type="TaskPolicy">
  The task policy for the function, overriding timeout and retries.
</ParamField>
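
For orientation, the object returned by the decorated function is an ASGI application: an async callable that takes `scope`, `receive`, and `send`. FastAPI builds one for you. The sketch below writes one by hand and drives it with a fake request, using only the standard library. The names `hello_app` and `call_app` are illustrative and not part of Beam.

```python
import asyncio
import json


async def hello_app(scope, receive, send):
    """A hand-written ASGI app that answers every HTTP request with JSON."""
    assert scope["type"] == "http"
    body = json.dumps({"hello": True}).encode("utf-8")
    await send(
        {
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"application/json")],
        }
    )
    await send({"type": "http.response.body", "body": body})


async def call_app(app, path: str) -> list:
    """Drive an ASGI app with a minimal fake POST and collect what it sends."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "POST", "path": path, "headers": []}
    await app(scope, receive, send)
    return sent


messages = asyncio.run(call_app(hello_app, "/hello"))
print(messages[0]["status"], messages[1]["body"])
```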

#### `Serve`

[`beam serve`](/v2/reference/cli#serve) monitors changes in your local file system, live-reloads the remote environment as you work, and forwards remote container logs to your local shell.

Serve is great for prototyping. You can develop in a containerized cloud environment in real time, with adjustable CPU, memory, and GPU resources.

It's also great for testing an app before deploying it. Served functions are orchestrated identically to deployments, which means you can test your Beam workflow end-to-end before deploying.

To start an ephemeral `serve` session, you'll use the `serve` command:

```sh theme={null}
beam serve app.py
```

<Info>Sessions end automatically after 10 minutes of inactivity.</Info>

<Tip>
  By default, Beam will sync all the files in your working directory to the
  remote container. This allows you to use the files you have locally while
  developing. If you want to prevent some files from getting uploaded, you can
  create a [`.beamignore`](/v2/reference/cli#ignore-local-files).
</Tip>

## `Realtime`

Decorator for creating a real-time application built on top of ASGI/websockets.\
The handler function runs every time a message is received over the websocket.

```python theme={null}
from beam import realtime

def generate_text():
    return ["this", "could", "be", "anything"]

@realtime(
    cpu=1.0,
    memory=128,
    gpu="T4"
)
def handler(context):
    return generate_text()
```

<ParamField type="float">
  The number of CPU cores allocated to the container.
</ParamField>

<ParamField type="string">
  The amount of memory allocated to the container. It should be specified in
  MiB, or as a string with units (e.g., "1Gi").
</ParamField>

<ParamField type="string">
  The type or name of the GPU device to be used for GPU-accelerated tasks. If
  not applicable or no GPU is required, leave it empty.
</ParamField>

<ParamField type="string">
  The container image used for task execution.
</ParamField>

<ParamField type="list">
  A list of volumes to be mounted to the ASGI application.
</ParamField>

<ParamField type="int">
  The maximum number of seconds a task can run before timing out. Set to -1 to
  disable the timeout.
</ParamField>

<ParamField type="int">
  The number of processes handling tasks per container. Workers share CPU,
  memory, and GPU resources.
</ParamField>

<ParamField type="int">
  The maximum number of concurrent requests the ASGI application can handle.
  This allows processing multiple requests concurrently.
</ParamField>

<ParamField type="int">
  The duration in seconds to keep the task queue warm even if there are no
  pending tasks.
</ParamField>

<ParamField type="int">
  The maximum number of tasks that can be pending in the queue.
</ParamField>

<ParamField type="list">
  A list of secrets injected into the container as environment variables.
</ParamField>

<ParamField type="string">
  An optional name for this ASGI application, used during deployment. If not
  specified, you must provide the name during deployment.
</ParamField>

<ParamField type="boolean">
  If false, allows the ASGI application to be invoked without an auth token.
</ParamField>

<ParamField type="Autoscaler">
  Configure a deployment autoscaler to scale the function horizontally using
  various autoscaling strategies.
</ParamField>

<ParamField type="string">
  An optional URL to send a callback to when a task is completed, timed out, or
  canceled.
</ParamField>

#### `Serve`

[`beam serve`](/v2/reference/cli#serve) monitors changes in your local file system, live-reloads the remote environment as you work, and forwards remote container logs to your local shell.

Serve is great for prototyping. You can develop in a containerized cloud environment in real time, with adjustable CPU, memory, and GPU resources.

It's also great for testing an app before deploying it. Served functions are orchestrated identically to deployments, which means you can test your Beam workflow end-to-end before deploying.

To start an ephemeral `serve` session, you'll use the `serve` command:

```sh theme={null}
beam serve app.py
```

<Info>Sessions end automatically after 10 minutes of inactivity.</Info>

<Tip>
  By default, Beam will sync all the files in your working directory to the
  remote container. This allows you to use the files you have locally while
  developing. If you want to prevent some files from getting uploaded, you can
  create a [`.beamignore`](/v2/reference/cli#ignore-local-files).
</Tip>

## `Function`

Decorator for defining a remote function.

This method allows you to run the decorated function in a remote container.

```python Function theme={null}
from beam import Image, function


@function(
    cpu=1.0,
    memory=128,
    gpu="T4",
    image=Image(python_packages=["torch"]),
)
def transcribe(filename: str):
    print(filename)
    return "some_result"


# Call the function in a remote container
transcribe.remote("some_file.mp4")

# Map the function over several inputs
# Each of these inputs will be routed to remote containers
for result in transcribe.map(["file1.mp4", "file2.mp4"]):
    print(result)
```

<ParamField type="float">
  The number of CPU cores allocated to the container.
</ParamField>

<ParamField type="int">
  The amount of memory allocated to the container. It should be specified in
  MiB, or as a string with units (e.g., "1Gi").
</ParamField>

<ParamField type="string">
  The type or name of the GPU device to be used for GPU-accelerated tasks. If
  not applicable or no GPU required, leave it empty. Multiple GPUs can be
  specified as a list.
</ParamField>

<ParamField type="string">
  The container image used for task execution.
</ParamField>

<ParamField type="int">
  The maximum number of seconds a task can run before timing out. Set to -1 to
  disable the timeout.
</ParamField>

<ParamField type="int">
  The maximum number of times a task will be retried if the container crashes.
</ParamField>

<ParamField type="string">
  An optional URL to send a callback to when a task is completed, timed out, or
  cancelled.
</ParamField>

<ParamField type="list">
  A list of storage volumes to be associated with the function.
</ParamField>

<ParamField type="list">
  A list of secrets that are injected into the container as environment
  variables.
</ParamField>

<ParamField type="string">
  An optional name for this function, used during deployment. If not specified,
  you must specify the name at deploy time with the `--name` argument.
</ParamField>

<ParamField type="TaskPolicy">
  The task policy for the function. This helps manage the lifecycle of an
  individual task. Setting values here will override timeout and retries.
</ParamField>

<ParamField type="list">
  A list of exceptions that will trigger a retry.
</ParamField>

<ParamField type="boolean">
  Determines whether the function continues running in the background after the
  client disconnects.
</ParamField>
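
The retry parameters work together: a task is re-run only when it fails with one of the listed exceptions, and only up to the retry limit. The local sketch below illustrates that rule, assuming the limit counts re-runs after the first attempt. `run_with_retries` and `flaky` are hypothetical names, and Beam applies its policy on the platform side rather than inside your process.

```python
from typing import Callable, Tuple, Type


def run_with_retries(
    task: Callable[[], str],
    retries: int,
    retry_for: Tuple[Type[BaseException], ...],
) -> str:
    """Run `task`, re-running it up to `retries` times on the listed exceptions."""
    attempt = 0
    while True:
        try:
            return task()
        except retry_for:
            if attempt >= retries:
                raise  # retry budget exhausted: surface the error
            attempt += 1


calls = {"count": 0}


def flaky() -> str:
    """Fail twice with a retryable error, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "some_result"


result = run_with_retries(flaky, retries=3, retry_for=(ConnectionError,))
print(result, calls["count"])  # some_result 3
```

Exceptions that are not listed in `retry_for` propagate immediately in this sketch, without consuming any retries.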

## `Bot`

Decorator for defining a bot with multiple states and transitions.

The `bot` decorator allows you to define a bot with specific states (locations) and transitions. These bots run as distributed, stateful workflows, where each transition is executed in a remote container.

```python theme={null}
from beam import Bot, BotContext, BotLocation, Image
from pydantic import BaseModel


# Define input and output types for the bot
class ProductName(BaseModel):
    product_name: str


class URL(BaseModel):
    url: str


class ReviewPage(BaseModel):
    review_page: str


# Create the bot
bot = Bot(
    model="gpt-4o",
    api_key="YOUR_API_KEY",
    locations=[
        BotLocation(marker=ProductName),
        BotLocation(marker=URL, expose=False),
        BotLocation(marker=ReviewPage, expose=False),
    ],
    description="This bot retrieves product reviews and summarizes them.",
)


# Define a transition
@bot.transition(
    inputs={ProductName: 1},
    outputs=[URL],
    description="Retrieve Google Shopping URLs for a product",
    cpu=1,
    memory=128,
    image=Image(python_packages=["serpapi", "google-search-results"]),
)
def get_product_urls(context: BotContext, inputs):
    product_name = inputs[ProductName][0].product_name
    # Perform some action
    return {URL: [URL(url="https://example.com")]}
```

<ParamField type="string">
  The underlying language model (e.g., `"gpt-4o"`) used by the bot.
</ParamField>

<ParamField type="string">
  The OpenAI API key used to authenticate requests to OpenAI.
</ParamField>

<ParamField type="list">
  A list of `BotLocation` objects defining the bot's states. Each location
  corresponds to a type (e.g., `BaseModel`) that the bot operates on.
</ParamField>

<ParamField type="string">
  A human-readable description of the bot's purpose.
</ParamField>

<ParamField type="bool">
  Specifies whether an auth token must be passed to invoke the bot.
</ParamField>

<ParamField type="float">
  The number of CPU cores allocated to the container.
</ParamField>

<ParamField type="int">
  The amount of memory allocated to the container. It should be specified in
  megabytes (e.g., 128 for 128 megabytes).
</ParamField>

<ParamField type="string">
  The type or name of the GPU device to be used for GPU-accelerated tasks. If
  not applicable or no GPU required, leave it empty.
</ParamField>

<ParamField type="string">
  The container image used for task execution.
</ParamField>

<ParamField type="float">
  The maximum number of seconds a task can run before it times out. Default is
  180\. Set it to -1 to disable the timeout.
</ParamField>

<ParamField type="int">
  The number of concurrent tasks to handle per container. Modifying this
  parameter can improve throughput for certain workloads. Workers will share the
  CPU, Memory, and GPU defined. You may need to increase these values to
  increase concurrency.
</ParamField>

<ParamField type="int">
  The duration in seconds to keep the task queue warm even if there are no
  pending tasks. Keeping the queue warm helps to reduce the latency when new
  tasks arrive. Default is 10s.
</ParamField>

<ParamField type="int">
  The maximum number of tasks that can be pending in the queue. If the number of
  pending tasks exceeds this value, the task queue will stop accepting new
  tasks.
</ParamField>

<ParamField type="Function">
  A function that runs when the container first starts. The return values of the
  `on_start` function can be retrieved by passing a `context` argument to your
  handler function.
</ParamField>

<ParamField type="list">
  A list of volumes to be mounted to the container.
</ParamField>

<ParamField type="list">
  A list of secrets that are injected into the container as environment
  variables.
</ParamField>

<ParamField type="string">
  An optional name for this endpoint, used during deployment. If not specified,
  you must specify the name at deploy time with the `--name` argument.
</ParamField>

<ParamField type="boolean">
  If false, allows the endpoint to be invoked without an auth token.
</ParamField>

<ParamField type="int">
  The maximum number of times a task will be retried if the container crashes.
</ParamField>
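
The firing rule behind `inputs={ProductName: 1}` can be sketched locally. A transition is enabled once each input location holds at least the required number of markers, and firing it consumes those markers and deposits the outputs. This is a plain-Python illustration that uses dataclasses in place of pydantic models, not the Beam runtime. The `locations` store, the `fire` function, and the stub transition body are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ProductName:
    product_name: str


@dataclass
class URL:
    url: str


# Each location is keyed by its marker type and holds a list of markers.
locations = defaultdict(list)


def get_product_urls(inputs):
    """Stub transition body: turn one ProductName into one URL."""
    name = inputs[ProductName][0].product_name
    return {URL: [URL(url=f"https://example.com/search?q={name}")]}


def fire(transition, required):
    """Fire `transition` if every input location holds enough markers."""
    if any(len(locations[kind]) < count for kind, count in required.items()):
        return False  # not enabled yet
    # Consume the required markers from each input location.
    inputs = {
        kind: [locations[kind].pop(0) for _ in range(count)]
        for kind, count in required.items()
    }
    # Deposit the produced markers in their output locations.
    for kind, markers in transition(inputs).items():
        locations[kind].extend(markers)
    return True


fired_before = fire(get_product_urls, {ProductName: 1})  # no markers yet
locations[ProductName].append(ProductName(product_name="shoes"))
fired_after = fire(get_product_urls, {ProductName: 1})
print(fired_before, fired_after, locations[URL])
```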

# Autoscaling

### `QueueDepthAutoscaler`

Adds an autoscaler to an app.

```python theme={null}
from beam import endpoint, QueueDepthAutoscaler


@endpoint(
    autoscaler=QueueDepthAutoscaler(
        min_containers=1, max_containers=3, tasks_per_container=1
    ),
)
def handler():
    return {"success": "true"}
```

<ParamField type="number">
  The number of containers to keep running at baseline. The containers will
  continue running until the deployment is stopped.
</ParamField>

<ParamField type="number">
  The maximum number of tasks that can be queued up to a single container. This can
  help manage throughput and cost of compute. When `tasks_per_container` is
  0, a container can process any number of tasks.
</ParamField>

<ParamField type="number">
  The maximum number of containers that the autoscaler can create. It defines an
  upper limit to avoid excessive resource consumption.
</ParamField>
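
A queue-depth autoscaler can be pictured as a simple calculation: divide the number of pending tasks by the tasks allowed per container, round up, and clamp the result between the minimum and the maximum. The function below is a sketch under that assumption. It is not Beam's actual scaling algorithm, and `desired_containers` is a hypothetical name.

```python
import math


def desired_containers(
    queue_depth: int,
    min_containers: int = 1,
    max_containers: int = 3,
    tasks_per_container: int = 1,
) -> int:
    """Estimate the container count a queue-depth policy would ask for."""
    if tasks_per_container <= 0:
        # Assumed meaning of 0: no per-container limit, so one container
        # is enough whenever there is any work at all.
        wanted = 1 if queue_depth > 0 else 0
    else:
        wanted = math.ceil(queue_depth / tasks_per_container)
    # Clamp between the baseline and the upper limit.
    return max(min_containers, min(max_containers, wanted))


counts = {depth: desired_containers(depth) for depth in (0, 2, 10)}
print(counts)  # {0: 1, 2: 2, 10: 3}
```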

# Data Structures

### `Simple Queue`

Creates a Queue instance.

Use this as a concurrency-safe distributed queue, accessible both locally and within remote containers.

Serialization is done using cloudpickle, so any object that cloudpickle supports should work here. The interface is that of a standard Python queue.

Because this is backed by a distributed queue, it will persist between runs.

```python Simple Queue theme={null}
from beam import Queue

val = [1, 2, 3]

# Initialize the Queue
q = Queue(name="myqueue")

for i in range(100):
    # Insert something to the queue
    q.put(val)
while not q.empty():
    # Remove something from the queue
    val = q.pop()
    print(val)
```

<ParamField type="string">
  The name of the queue (any arbitrary string).
</ParamField>
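
Because values are serialized on the way in and deserialized on the way out, what you pop is a copy of what you put, not the same object. The sketch below imitates that behaviour locally, with the standard library's `pickle` standing in for cloudpickle. `LocalQueue` is a hypothetical stand-in that shares nothing with Beam's `Queue` beyond the `put`, `pop`, and `empty` method names used above.

```python
import pickle
from collections import deque


class LocalQueue:
    """In-memory FIFO that serializes items, imitating a distributed queue."""

    def __init__(self, name: str):
        self.name = name
        self._items = deque()

    def put(self, value) -> None:
        self._items.append(pickle.dumps(value))  # store bytes, not the object

    def pop(self):
        return pickle.loads(self._items.popleft())

    def empty(self) -> bool:
        return not self._items


q = LocalQueue(name="myqueue")
val = [1, 2, 3]
q.put(val)
val.append(4)  # mutating the original does not affect the queued copy

popped = q.pop()
print(popped, q.empty())  # [1, 2, 3] True
```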

### `Map`

Creates a Map instance.

Use this as a concurrency-safe key/value store, accessible both locally and within
remote containers.

Serialization is done using cloudpickle, so any object that cloudpickle supports should work here. The interface is that of a standard Python dictionary.

Because this is backed by a distributed dictionary, it will persist between runs.

```python Map theme={null}
from beam import Map

# Name the map
m = Map(name="test")

# Set a key
m["some_key"] = True

# Delete a key
del m["some_key"]

# Iterate through the map
for k, v in m.items():
    print("key: ", k)
    print("value: ", v)
```

<ParamField type="string">
  The name of the map (any arbitrary string).
</ParamField>

# Storage

Beam allows you to create highly-available storage volumes that can be used across tasks. You might use volumes for things like storing model weights or large datasets.

### `Volume`

Creates a Volume instance.

When your container runs, your volume will be available at `./{mount_path}` and `/volumes/{name}`.

```python theme={null}
from beam import function, Volume


VOLUME_PATH = "./model_weights"


@function(
    volumes=[Volume(name="model-weights", mount_path=VOLUME_PATH)],
)
def load_model():
    from transformers import AutoModel

    # Load model from cloud storage cache
    AutoModel.from_pretrained(VOLUME_PATH)
```

<ParamField type="string">
  The name of the volume, a descriptive identifier for the data volume.
</ParamField>

<ParamField type="string">
  The path where the volume is mounted within the container environment.
</ParamField>

### `CloudBucket`

Creates a CloudBucket instance.

When your container runs, your cloud bucket will be available at `./{mount_path}` and `/volumes/{name}`.

```python theme={null}
from beam import CloudBucket, CloudBucketConfig, function

# Cloud Bucket
weights = CloudBucket(
    name="weights",
    mount_path="./weights",
    config=CloudBucketConfig(
        access_key="my-access-key",
        secret_key="my-secret-key",
        endpoint="https://s3-endpoint.com",
    ),
)

@function(volumes=[weights])
def my_function():
    pass
```

<ParamField type="string">
  The name of the cloud bucket, must be the same as the bucket name in the cloud
  provider.
</ParamField>

<ParamField type="string">
  The path where the cloud bucket is mounted within the container environment.
</ParamField>

<ParamField type="CloudBucketConfig">
  Configuration for the cloud bucket.
</ParamField>

## `CloudBucketConfig`

Configuration for a cloud bucket.

```python theme={null}
from beam import CloudBucketConfig

config = CloudBucketConfig(
    read_only=False,
    access_key="my-access-key",
    secret_key="my-secret-key",
    endpoint="https://s3-endpoint.com",
    region="us-west-2"
)
```

<ParamField type="boolean">
  Whether the volume is read-only.
</ParamField>

<ParamField type="string">
  The Beam secret name for the S3 access key for the external provider.
</ParamField>

<ParamField type="string">
  The Beam secret name for the S3 secret key for the external provider.
</ParamField>

<ParamField type="string">
  The S3 endpoint for the external provider.
</ParamField>

<ParamField type="string">
  The region for the external provider.
</ParamField>

## `Output`

A file that a task has created.

Use this to save a file that you may want to retrieve or share later.

```python theme={null}
from beam import Image as BeamImage, Output, function


@function(
    image=BeamImage(
        python_packages=[
            "pillow",
        ],
    ),
)
def save_image():
    from PIL import Image as PILImage

    # Generate PIL image
    pil_image = PILImage.new(
        "RGB", (100, 100), color="white"
    )  # Creating a 100x100 white image

    # Save image file
    output = Output.from_pil_image(pil_image)
    output.save()

    # Retrieve pre-signed URL for output file
    url = output.public_url(expires=400)
    print(url)

    # Print other details about the output
    print(f"Output ID: {output.id}")
    print(f"Output Path: {output.path}")
    print(f"Output Stats: {output.stat()}")
    print(f"Output Exists: {output.exists()}")

    return {"image": url}


if __name__ == "__main__":
    save_image()
```

When you run this function, it will return a pre-signed URL to the image:

```bash theme={null}
https://app.stage.beam.cloud/output/id/abe0c95a-2cd1-40b3-bace-9225f2c79c6d
```

<ParamField type="int">
  The length of time the pre-signed URL will be available for. The file will be
  automatically deleted after this period.
</ParamField>

#### Files

Saving a file and generating a public URL.

```python theme={null}
myfile = "path/to/my.txt"
output = Output(path=myfile)
output.save()
output_url = output.public_url()
```

#### PIL Images

Saving a `PIL.Image` object.

```python theme={null}
image = pipe( ... )
output = Output.from_pil_image(image)
output.save()
```

#### Directories

Saving a directory.

```python theme={null}
from pathlib import Path
from beam import Output

mydir = Path("/volumes/myvol/mydir")  # or use a str
output = Output(path=mydir)
output.save()
```

## Experimental

### `Signal`

Creates a Signal instance. Signals can be used to notify a container to perform specific actions using a flag.

For example, signals can reload global state, send a webhook, or terminate the container.

<Info>This is a great tool for automated retraining and deployment.</Info>

```python theme={null}
some_global_model = None


def load_model():
    global some_global_model
    some_global_model = LoadIt()


# Setting up a consumer of a signal
s = Signal(name="reload-model", handler=load_model, clear_after_interval=5)

@endpoint(on_start=load_model)
def handler(**kwargs):
    global some_global_model
    return some_global_model(kwargs["param1"])

# Trigger load_model to execute again while the container is still running
s = Signal(name="reload-model")
s.set(ttl=60)
```

<ParamField type="string">
  The name of the signal.
</ParamField>

<ParamField type="Callable">
  A function to be called when the signal is set. If not provided, no handler
  will be executed.
</ParamField>

<ParamField type="int">
  The number of seconds after which the signal will be automatically cleared if
  both `handler` and `clear_after_interval` are set.
</ParamField>

## Integrations

### `vllm`

A wrapper around the vLLM library that allows you to deploy it as an ASGI app.

```python theme={null}
from beam import integrations

e = integrations.VLLMArgs()
e.device = "cpu"
e.chat_template = "./chatml.jinja"

vllm_app = integrations.VLLM(name="vllm-abstraction-1", vllm_args=e)
```

<ParamField type="float">
  The number of CPU cores allocated to the container.
</ParamField>

<ParamField type="string">
  The amount of memory allocated to the container. It should be specified in
  MiB, or as a string with units (e.g., "1Gi").
</ParamField>

<ParamField type="string">
  The type or name of the GPU device to be used for GPU-accelerated tasks. If
  not applicable or no GPU is required, leave it empty.
</ParamField>

<ParamField type="string">
  The container image used for task execution. This will include an
  `add_python_packages` call with `["fastapi", "vllm", "huggingface_hub"]` added
  to ensure vLLM can run.
</ParamField>

<ParamField type="int">
  The number of workers to run in the container.
</ParamField>

<ParamField type="int">
  The maximum number of concurrent requests the container can handle.
</ParamField>

<ParamField type="int">
  The number of seconds to keep the container warm after the last request.
</ParamField>

<ParamField type="int">
  The maximum number of pending tasks allowed in the container.
</ParamField>

<ParamField type="int">
  The maximum number of seconds to wait for the container to start.
</ParamField>

<ParamField type="boolean">
  Whether the endpoints require authorization.
</ParamField>

<ParamField type="string">
  The name of the container. If not specified, you must provide it during
  deployment.
</ParamField>

<ParamField type="list">
  The volumes to mount into the container. Default is a single volume named
  "vllm\_cache" mounted to "./vllm\_cache", used as the download directory for
  vLLM models.
</ParamField>

<ParamField type="list">
  A list of secrets to pass to the container. To enable Hugging Face
  authentication for downloading models, set the `HF_TOKEN` in the secrets.
</ParamField>

<ParamField type="Autoscaler">
  The autoscaler to use for scaling container deployments.
</ParamField>

<ParamField type="VLLMArgs">
  The arguments to configure the vLLM model.
</ParamField>

# Utils

### `env`

You can use `env.is_remote()` to only import Python packages when your app is running remotely. This is used to avoid import errors, since your Beam app might be using Python packages that aren't installed on your local computer.

```python theme={null}
from beam import env

if env.is_remote():
    import torch
```

The alternative to `env.is_remote()` is to import packages inline in your functions. For more information on this topic, [visit this page](/v2/environment/remote-versus-local).
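
The inline-import alternative looks like the sketch below. A standard-library module stands in for a heavy dependency such as `torch` so that the example runs anywhere. The `summarize` function is illustrative only.

```python
def summarize(values):
    """Import the dependency only when the function actually runs."""
    # Stand-in for a package that is installed only in the remote image.
    import statistics

    return {"mean": statistics.mean(values), "max": max(values)}


summary = summarize([1, 2, 3, 6])
print(summary)
```

Because the import happens inside the function body, merely loading the module on a machine without the dependency does not raise an `ImportError`.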


# TypeScript SDK Reference
Source: https://docs.beam.cloud/v2/reference/ts-sdk


Beam's TypeScript SDK provides a powerful client library for interacting with the Beam platform. Unlike decorators and frameworks in the Python SDK, the TypeScript SDK focuses on programmatic access to Beam's infrastructure and resources.

This reference outlines every available class, method, and configuration option in the TypeScript SDK.

# Installation

Install the package with `npm`:

```bash theme={null}
npm install @beamcloud/beam-js@rc
```

...or using `yarn`:

```bash theme={null}
yarn add @beamcloud/beam-js@rc
```

# Configuration

Locate your Beam Token (API Key) and Workspace ID in the [dashboard](https://platform.beam.cloud/settings/api-keys) and set them as environment variables.

```bash theme={null}
export BEAM_TOKEN=YOUR_BEAM_TOKEN
export BEAM_WORKSPACE_ID=YOUR_WORKSPACE_ID
```

## `beamOpts`

Global configuration object for the Beam client.

```typescript theme={null}
import { beamOpts } from "@beamcloud/beam-js";

beamOpts.token = process.env.BEAM_TOKEN!;
beamOpts.workspaceId = process.env.BEAM_WORKSPACE_ID!;
beamOpts.gatewayUrl = "https://app.beam.cloud"; // Optional, defaults to https://app.beam.cloud
```

**Required Configuration:**

* `token`: Your Beam authentication token
* `workspaceId`: Your Beam workspace ID

**Optional Configuration:**

* `gatewayUrl`: The Beam gateway URL (defaults to `https://app.beam.cloud`)

## Quickstart

Run a simple Node.js server in a sandbox. This example uses the `Image` class to create a custom container image and the `Sandbox` class to create a sandbox instance.

```typescript theme={null}
import { beamOpts, Image, Sandbox } from "@beamcloud/beam-js";

beamOpts.token = process.env.BEAM_TOKEN!;
beamOpts.workspaceId = process.env.BEAM_WORKSPACE_ID!;

async function main() {
  const image = new Image({
    baseImage: "node:20",
    commands: [
      "apt update",
      "apt install -y nodejs npm",
      "git clone https://github.com/beam-cloud/quickstart-node.git /app",
    ],
  });

  const sandbox = new Sandbox({
    name: "quickstart",
    image: image,
    cpu: 2,
    memory: 1024,
    keepWarmSeconds: 300,
  });

  const instance = await sandbox.create();

  const serverProcess = await instance.exec([
    "sh",
    "-c",
    "cd /app && node server.js",
  ]);

  const url = await instance.exposePort(3000);
  console.log(`Server is running at ${url}`);
}

main();
```

# Environment

## `Image`

Defines a custom container image that your code will run in.

An Image object encapsulates the configuration of a custom container image that will be used as the runtime environment for executing tasks.

```typescript theme={null}
import {
  Image,
  PythonVersion,
  PythonVersionAlias,
  GpuType,
  GpuTypeAlias,
} from "@beamcloud/beam-js";

const image = new Image({
  baseImage: "docker.io/nvidia/cuda:12.3.1-runtime-ubuntu20.04",
  pythonVersion: PythonVersion.Python311, // Type-safe enum approach
  commands: ["apt-get update -y", "apt-get install ffmpeg -y"],
  pythonPackages: ["transformers", "torch"],
  gpu: GpuType.A10G, // Type-safe enum approach
});

await image.build();

// Alternative using string literals for convenience
const imageWithStringLiterals = new Image({
  baseImage: "docker.io/nvidia/cuda:12.3.1-runtime-ubuntu20.04",
  pythonVersion: "python3.11", // String literal for Python version
  commands: ["apt-get update -y", "apt-get install ffmpeg -y"],
  pythonPackages: ["transformers", "torch"],
  gpu: "A10G", // String literal for GPU
});

await imageWithStringLiterals.build();
```

**Constructor Parameters:**

<ParamField type="PythonVersionAlias">
  The Python version to be used in the image. Can be a PythonVersion enum (e.g.,
  PythonVersion.Python311) or string literal (e.g., "python3.11"). Defaults to
  Python 3.
</ParamField>

<ParamField type="string[] | string">
  A list of Python packages to install in the container image. Alternatively, a
  string containing a path to a requirements.txt can be provided.
</ParamField>

<ParamField type="string[]">
  A list of shell commands to run when building your container image. These
  commands can be used for setting up the environment, installing dependencies,
  etc.
</ParamField>

<ParamField type="string">
  A custom base image to replace the default ubuntu20.04 image used in your
  container. This can be a public or private image from Docker Hub, Amazon ECR,
  Google Cloud Artifact Registry, or NVIDIA GPU Cloud Registry.
</ParamField>

<ParamField type="ImageCredentials">
  Credentials for accessing private registries. Can be a dictionary of key/value
  pairs or an array of environment variable names.
</ParamField>

<ParamField type="string[]">
  Environment variables to add to the image. These will be available when
  building the image and when the container is running.
</ParamField>

<ParamField type="string[]">
  A list of secrets that are injected into the container as environment
  variables.
</ParamField>

<ParamField type="GpuTypeAlias">
  Builds the image on a GPU node. Can be a GpuType enum (e.g., GpuType.H100) or
  string literal (e.g., "H100").
</ParamField>

### `Image.fromDockerfile()`

Create an Image from a local Dockerfile.

```typescript theme={null}
import { Image } from "@beamcloud/beam-js";

const image = await Image.fromDockerfile("./Dockerfile", "./context");
```

<ParamField type="string">
  Path to the Dockerfile.
</ParamField>

<ParamField type="string">
  Directory to sync as build context. Defaults to the Dockerfile directory.
</ParamField>

### `Image.addPythonPackages()`

Add pip packages to install during the build. Accepts a list or a path to requirements.txt.

```typescript theme={null}
// Using an array of package names
image.addPythonPackages(["transformers==4.44.0", "torch==2.4.0"]);

// Using a requirements.txt file
image.addPythonPackages("./requirements.txt");
```

<ParamField type="string[] | string">
  Package names or a `requirements.txt` path.
</ParamField>

### `Image.withEnvs()`

Add environment variables available during build and at runtime.

```typescript theme={null}
image.withEnvs({ HF_HOME: "/models", HF_HUB_ENABLE_HF_TRANSFER: "1" });
```

<ParamField type="string[] | Record<string, string> | string">
  Environment variables as key/value pairs, array of "KEY=VALUE" strings, or
  single string.
</ParamField>

### `Image.withSecrets()`

Expose platform secrets to the build environment.

```typescript theme={null}
image.withSecrets(["HF_TOKEN"]);
```

<ParamField type="string[]">
  Secret names created via the platform.
</ParamField>

### `Image.micromamba()`

Switch package management to micromamba and target a micromamba Python.

```typescript theme={null}
image.micromamba();
```

### `Image.addMicromambaPackages()`

Install micromamba packages and optional channels.

```typescript theme={null}
image.addMicromambaPackages(["pandas", "numpy"], ["conda-forge"]);
```

<ParamField type="string[] | string">
  Package names or a `requirements.txt` path.
</ParamField>

<ParamField type="string[]">
  Micromamba channels.
</ParamField>

### `Image.build()`

Build the image and return the result.

```typescript theme={null}
const result = await image.build();
console.log("Build successful:", result.success);
```

# Deployments

## `Deployments`

You can use this to manage and interact with deployed Beam applications.

```typescript theme={null}
import { Deployments } from "@beamcloud/beam-js";

// List all deployments
const deployments = await Deployments.list();

// Get a deployment by ID
const deployment = await Deployments.get({ id: "deployment-id" });

// Get a deployment by name and stub type
const deploymentByName = await Deployments.get({
  name: "my-app",
  stubType: "endpoint/deployment",
});

// Call the deployment
const response = await deployment.call({ message: "Hello World" });

// Connect to realtime deployment
const ws = await deployment.realtime("/", (event) => {
  console.log("Received:", event.data);
});
```

### `Deployments.list()`

List all deployments in your workspace.

```typescript theme={null}
const deployments = await Deployments.list({
  stubType: "endpoint/deployment",
  active: true,
  limit: 10,
});
```

### `Deployments.get()`

Retrieve a deployment by ID, name, or URL.

```typescript theme={null}
// By ID
const deployment = await Deployments.get({ id: "deployment-id" });

// By name and stub type
const deploymentByName = await Deployments.get({
  name: "my-app",
  stubType: "endpoint/deployment",
});
```

<ParamField type="string">
  The deployment ID to retrieve.
</ParamField>

<ParamField type="string">
  The deployment name (must be used with stubType).
</ParamField>

<ParamField type="string">
  The stub type (must be used with name).
</ParamField>

<ParamField type="string">
  The deployment URL.
</ParamField>

## `Deployment`

A deployment instance with methods for interaction.

### `Deployment.call()`

Call the deployment with data.

```typescript theme={null}
const response = await deployment.call(
  { message: "Hello World" },
  "/endpoint-path", // optional path
  "POST", // optional HTTP method
);
```

<ParamField type="any">
  The data to send to the deployment.
</ParamField>

<ParamField type="string">
  Optional path to append to the deployment URL.
</ParamField>

<ParamField type="'GET' | 'POST'">
  HTTP method to use for the request.
</ParamField>

### `Deployment.realtime()`

Connect to a realtime deployment via WebSocket.

```typescript theme={null}
const ws = await deployment.realtime("/", (event) => {
  console.log("Message received:", event.data);
});

// Send a message
ws.send(JSON.stringify({ message: "Hello" }));
```

<ParamField type="string">
  Optional path to append to the WebSocket URL.
</ParamField>

<ParamField type="(event: MessageEvent) => void">
  Optional message handler function.
</ParamField>

### `Deployment.httpUrl()` / `Deployment.websocketUrl()`

Get the HTTP or WebSocket URL for the deployment.

```typescript theme={null}
const httpUrl = deployment.httpUrl("/api/predict");
const wsUrl = deployment.websocketUrl("/realtime");
```

# Sandbox

A sandboxed container for running Python code or arbitrary processes.

You can use this to create isolated environments where you can execute code,
manage files, and run processes.

```typescript theme={null}
import { Sandbox, Image, GpuType } from "@beamcloud/beam-js";

const sandbox = new Sandbox({
  name: "my-sandbox",
  cpu: 2,
  memory: "1Gi",
  gpu: GpuType.T4, // Using enum
  image: new Image({
    pythonPackages: ["numpy", "pandas"],
  }),
  keepWarmSeconds: 300,
});

// Alternative with string literal
const sandboxWithStringGpu = new Sandbox({
  name: "my-sandbox-2",
  cpu: 2,
  memory: "1Gi",
  gpu: "T4", // Using string literal
  image: new Image({
    pythonPackages: ["numpy", "pandas"],
  }),
  keepWarmSeconds: 300,
});

// Create a new sandbox instance
const instance = await sandbox.create();

// Or connect to an existing one
const existingInstance = await Sandbox.connect("sandbox-id");
```

You can also configure sandbox networking at creation time to pre-expose ports
or restrict outbound traffic.

```typescript theme={null}
const networkedSandbox = new Sandbox({
  name: "networked-sandbox",
  image: new Image({
    baseImage: "node:20",
  }),
  ports: [3000, 8080],
  allowList: ["8.8.8.8/32"],
});

const networkedInstance = await networkedSandbox.create();
const urls = await networkedInstance.listUrls();
console.log("Known URLs:", urls);
```

<ParamField type="number[]">
  `ports`: Ports to expose immediately when the sandbox is created. These ports
  will be available via public URLs as soon as the sandbox starts.
</ParamField>

<ParamField type="boolean">
  `blockNetwork`: Blocks all outbound network access from the sandbox while
  still allowing inbound traffic to exposed ports. Cannot be used together with
  `allowList`.
</ParamField>

<ParamField type="string[]">
  `allowList`: CIDR ranges that the sandbox is allowed to connect to. When
  specified, all other outbound traffic is blocked. Cannot be used together with
  `blockNetwork`.
</ParamField>

### `Sandbox.create()`

Create a new sandbox instance.

```typescript theme={null}
const instance = await sandbox.create(["python", "app.py"]); // optional entrypoint
console.log(`Sandbox created with ID: ${instance.sandboxId()}`);
```

### `Sandbox.connect()`

Connect to an existing sandbox instance by ID.

```typescript theme={null}
const instance = await Sandbox.connect("sandbox-123");
```

<ParamField type="string">
  The container ID of the existing sandbox instance.
</ParamField>

### `Sandbox.createFromSnapshot()`

Create a sandbox instance from a snapshot, such as one returned by `SandboxInstance.snapshot()`.

```typescript theme={null}
const instance = await Sandbox.createFromSnapshot("snapshot-123");
```

<ParamField type="string">
  The ID of the snapshot to create the sandbox from.
</ParamField>

## SandboxInstance

A sandbox instance that provides access to the sandbox internals.

### `SandboxInstance.runCode()`

Run Python code in the sandbox.

```typescript theme={null}
const result = await instance.runCode(`
import numpy as np
print("NumPy version:", np.__version__)
result = np.array([1, 2, 3, 4, 5])
print("Array:", result)
`);

console.log("Output:", result.result);
```

<ParamField type="string">
  The Python code to execute.
</ParamField>

<ParamField type="boolean">
  Whether to wait for the process to complete.
</ParamField>

### `SandboxInstance.exec()`

Run an arbitrary command in the sandbox.

```typescript theme={null}
// Using an array of command and arguments
const process = await instance.exec(["ls", "-la", "/workspace"]);
const exitCode = await process.wait();

// Using a single string command
const process2 = await instance.exec("ls -la /workspace");

// With execution options
const process3 = await instance.exec(["node", "server.js"], {
  cwd: "/app",
  env: { NODE_ENV: "production" },
});

// Get the process ID
const pid = process.pid;

// Read output
const stdout = await process.stdout.read();
const stderr = await process.stderr.read();
```

<ParamField type="string | string[]">
  The command to execute. Can be a single string or an array of strings (command
  and arguments).
</ParamField>

<ParamField type="ExecOptions">
  Optional execution options.
</ParamField>

<ParamField type="string">
  The working directory for the command.
</ParamField>

<ParamField type="Record<string, string>">
  Environment variables to set for the command.
</ParamField>

### `SandboxInstance.exposePort()`

Dynamically expose a port to the internet.

```typescript theme={null}
const url = await instance.exposePort(8000);
console.log(`Web service available at: ${url}`);
```

<ParamField type="number">
  The port number to expose within the sandbox.
</ParamField>

Use `listUrls()` to inspect every currently exposed URL, including ports
exposed at creation time with `ports`.

### `SandboxInstance.listUrls()`

List the currently exposed preview/public URLs for the sandbox. Returns
`Promise<Record<number, string>>`, keyed by port number.

```typescript theme={null}
const urls = await instance.listUrls();

for (const [port, url] of Object.entries(urls)) {
  console.log(`Port ${port} available at: ${url}`);
}
```

### `SandboxInstance.updateNetworkPermissions()`

Dynamically update outbound network permissions for the sandbox. This method
returns `Promise<void>`.

Because the method signature is positional, pass `false` as the first argument
when updating to an allow list without fully blocking outbound traffic.

```typescript theme={null}
// Block all outbound traffic
await instance.updateNetworkPermissions(true);

// Allow only specific CIDR ranges
await instance.updateNetworkPermissions(false, ["8.8.8.8/32", "10.0.0.0/8"]);

// Remove all outbound restrictions
await instance.updateNetworkPermissions(false, []);
```

<ParamField type="boolean">
  `blockNetwork`: If `true`, blocks all outbound network access from the
  sandbox. Cannot be used together with `allowList`.
</ParamField>

<ParamField type="string[]">
  `allowList`: Optional list of allowed CIDR ranges. Passing `[]` removes
  outbound restrictions. Cannot be used together with `blockNetwork=true`.
</ParamField>

`allowList` entries must use CIDR notation such as `"8.8.8.8/32"` or
`"10.0.0.0/8"`. `blockNetwork=true` and `allowList` are mutually exclusive, and
exposed ports remain reachable regardless of outbound restrictions.

### `SandboxInstance.snapshot()`

Create a memory snapshot of the current sandbox. This method captures the memory state of the sandbox as an immutable artifact. You can later restore this snapshot into a new sandbox instance using `createFromSnapshot()`.

```typescript theme={null}
const snapshotId = await instance.snapshot();
console.log(`Created memory snapshot: ${snapshotId}`);

// Restore from memory snapshot
const restoredInstance = await Sandbox.createFromSnapshot(snapshotId);
```

### `SandboxInstance.createImageFromFilesystem()`

Create an image from the sandbox filesystem. This method returns an image ID that can be used to create new sandboxes with the same filesystem state. You can use the `Image.fromId()` method to create a new image instance.

```typescript theme={null}
const imageId = await instance.createImageFromFilesystem();
console.log(`Created image from filesystem: ${imageId}`);

// Use the filesystem image as the base image for new sandboxes
const image = Image.fromId(imageId);
const newSandbox = new Sandbox({
  name: "from-filesystem",
  image: image,
  // ... other config
});
```

### `SandboxInstance.updateTtl()`

Update the keep warm setting of the sandbox.

```typescript theme={null}
// Keep alive for 1 hour
await instance.updateTtl(3600);

// Never time out
await instance.updateTtl(-1);
```

<ParamField type="number">
  The number of seconds to keep the sandbox alive. Use -1 for sandboxes that
  never time out.
</ParamField>

### `SandboxInstance.sandboxId()`

Get the ID of the sandbox.

```typescript theme={null}
const sandboxId = instance.sandboxId();
```

### `SandboxInstance.terminate()`

Terminate the container associated with this sandbox instance.

```typescript theme={null}
const success = await instance.terminate();
```

## SandboxProcess

Represents a running process within a sandbox.

### `SandboxProcess.wait()`

Wait for the process to complete.

```typescript theme={null}
const process = await instance.exec(["sleep", "5"]);
const exitCode = await process.wait();
console.log("Process exited with code:", exitCode);
```

### `SandboxProcess.kill()`

Kill the process.

```typescript theme={null}
const process = await instance.exec(["sleep", "100"]);
await process.kill();
```

### `SandboxProcess.status()`

Get the status of the process.

```typescript theme={null}
const [exitCode, status] = await process.status();
if (exitCode >= 0) {
  console.log("Process finished with exit code:", exitCode);
}
```

### `SandboxProcess.stdout` / `SandboxProcess.stderr`

Get handles to the process output streams.

```typescript theme={null}
const process = await instance.exec(["echo", "Hello World"]);
const output = await process.stdout.read();
console.log("Output:", output);
```

## SandboxFileSystem

File system interface for managing files within a sandbox.

### `SandboxFileSystem.uploadFile()`

Upload a local file to the sandbox.

```typescript theme={null}
await instance.fs.uploadFile("./local-file.txt", "workspace/uploaded-file.txt");
```

<ParamField type="string">
  The path to the local file to upload.
</ParamField>

<ParamField type="string">
  The destination path within the sandbox.
</ParamField>

### `SandboxFileSystem.downloadFile()`

Download a file from the sandbox to a local path.

```typescript theme={null}
await instance.fs.downloadFile(
  "workspace/output.txt",
  "./downloaded-output.txt",
);
```

<ParamField type="string">
  The path to the file within the sandbox.
</ParamField>

<ParamField type="string">
  The destination path on the local filesystem.
</ParamField>

### `SandboxFileSystem.listFiles()`

List the files in a directory in the sandbox.

```typescript theme={null}
const files = await instance.fs.listFiles("/workspace");
for (const file of files) {
  console.log(`${file.name} (${file.isDir ? "directory" : "file"})`);
}
```

<ParamField type="string">
  The path to the directory within the sandbox.
</ParamField>

### `SandboxFileSystem.deleteFile()`

Delete a file in the sandbox.

```typescript theme={null}
await instance.fs.deleteFile("/tmp/temp-file.txt");
```

<ParamField type="string">
  The path to the file within the sandbox.
</ParamField>

### `SandboxFileSystem.statFile()`

Get metadata of a file in the sandbox.

```typescript theme={null}
const fileInfo = await instance.fs.statFile("/path/to/file.txt");
console.log(`File size: ${fileInfo.size} bytes`);
console.log(`Is directory: ${fileInfo.isDir}`);
```

<ParamField type="string">
  The path to the file within the sandbox.
</ParamField>

# Pod

A **Pod** is an object that allows you to run arbitrary services in a fast, scalable, and secure remote container on Beam.

You can think of a Pod as a lightweight compute environment that you fully control—complete with a custom container, ports you can expose, environment variables, volumes, secrets, and GPUs.

```typescript theme={null}
import { Pod, Image } from "@beamcloud/beam-js";

// Create a Pod that runs a simple HTTP server
const pod = new Pod({
  name: "web-server",
  cpu: 2,
  memory: "512Mi",
  image: new Image({
    baseImage: "python:3.9-slim",
    pythonPackages: ["requests"],
  }),
  ports: [8000],
});

// Create the pod container
const result = await pod.create(["python", "-m", "http.server", "8000"]);

console.log("Container ID:", result.containerId);
console.log("URL:", result.url);
```

### `Pod.create()`

Create a new container that runs until it completes or is explicitly killed.

```typescript theme={null}
const result = await pod.create(["python", "app.py"]);
console.log("Pod created successfully:", result.url);
```

<ParamField type="string[]">
  The command to run in the container.
</ParamField>

# Storage

## `Volume`

Creates a Volume instance.

When your container runs, your volume will be available at `./{mountPath}` and `/volumes/{name}`.

```typescript theme={null}
import { Pod, Volume } from "@beamcloud/beam-js";

const volume = new Volume("model-weights", "./weights");
await volume.getOrCreate();

// Use with Pod or Sandbox
const pod = new Pod({
  name: "my-pod",
  volumes: [volume],
  // ... other config
});
```

<ParamField type="string">
  The name of the volume, a descriptive identifier for the data volume.
</ParamField>

<ParamField type="string">
  The path where the volume is mounted within the container environment.
</ParamField>

### `Volume.getOrCreate()`

Get or create the volume in the platform.

```typescript theme={null}
const success = await volume.getOrCreate();
console.log("Volume ready:", volume.ready);
```

# Utils

## `TaskPolicy`

Task policy for managing the lifecycle of individual tasks.

```typescript theme={null}
import { TaskPolicy } from "@beamcloud/beam-js";

const policy = new TaskPolicy({
  maxRetries: 3,
  timeout: 300,
  ttl: 3600,
});
```

<ParamField type="number">
  The maximum number of times a task will be retried if the container crashes.
</ParamField>

<ParamField type="number">
  The maximum number of seconds a task can run before it times out. Set it to -1
  to disable the timeout.
</ParamField>

<ParamField type="number">
  The expiration time for a task in seconds. Must be greater than 0 and less
  than 24 hours (86400 seconds).
</ParamField>

# Types

The TypeScript SDK includes comprehensive type definitions for all resources:

* `GpuType`: Enum of available GPU types (T4, A10G, A100, H100, etc.)
* `GpuTypeAlias`: Union type allowing both enum values and string literals for GPU specification
* `PythonVersion`: Enum of supported Python versions
* `PythonVersionAlias`: Union type allowing both enum values and string literals for Python version specification
* `TaskStatus`: Enum of task statuses (PENDING, RUNNING, COMPLETE, etc.)
* `DeploymentData`: Interface for deployment data
* `TaskData`: Interface for task data
* `ExecOptions`: Options for sandbox command execution (cwd, env)
* `PodInstanceData`: Interface for pod instance data
* And many more...

```typescript theme={null}
import {
  GpuType,
  GpuTypeAlias,
  PythonVersion,
  PythonVersionAlias,
  TaskStatus,
  DeploymentData,
} from "@beamcloud/beam-js";

// Use types for better TypeScript support
const gpu: GpuType = GpuType.A10G; // Enum approach
const gpuAlias: GpuTypeAlias = "A10G"; // String literal approach
const anotherGpu: GpuTypeAlias = GpuType.H100; // Both work with GpuTypeAlias

const python: PythonVersion = PythonVersion.Python311; // Enum approach
const pythonAlias: PythonVersionAlias = "python3.11"; // String literal approach
const anotherPython: PythonVersionAlias = PythonVersion.Python310; // Both work with PythonVersionAlias

// Both approaches work seamlessly for any alias type
function createResource(gpu: GpuTypeAlias, python: PythonVersionAlias) {
  console.log(`Using GPU: ${gpu}, Python: ${python}`);
}

createResource(GpuType.H100, PythonVersion.Python311); // Works with enums
createResource("H100", "python3.11"); // Works with strings
createResource("A10G", PythonVersion.Python310); // Works with mixed approaches
```

## `ExecOptions`

Options for configuring command execution in a sandbox.

```typescript theme={null}
import { ExecOptions } from "@beamcloud/beam-js";

const opts: ExecOptions = {
  cwd: "/app",
  env: { NODE_ENV: "production", DEBUG: "true" },
};

const process = await instance.exec(["node", "server.js"], opts);
```

<ParamField type="string">
  The working directory for the command.
</ParamField>

<ParamField type="Record<string, string>">
  Environment variables to set for the command.
</ParamField>


# Using Beam Docs with AI Tools
Source: https://docs.beam.cloud/v2/resources/ai-tools

Bring the Beam documentation into your LLMs, IDEs, and agents

The Beam docs are built to work well with AI assistants and coding agents. You can feed them to an LLM, open them in your editor, or connect them as a tool.

## llms.txt

We publish machine-readable indexes of the documentation that follow the [llms.txt](https://llmstxt.org) standard:

* [llms.txt](https://docs.beam.cloud/llms.txt) — a concise, structured index of every page, ideal for giving a model a map of the docs.
* [llms-full.txt](https://docs.beam.cloud/llms-full.txt) — the full text of the documentation in a single file, ideal for pasting into a model with a large context window.

## Markdown for any page

Append `.md` to the URL of any docs page to get its raw Markdown. For example:

```text theme={null}
https://docs.beam.cloud/v2/getting-started/quickstart.md
```

This is handy for piping a single page into an LLM or referencing it from an agent.
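A small helper can build the Markdown URL from a regular page URL. The sketch below is illustrative only: `markdown_url` is a hypothetical name, and dropping the query string and fragment before appending `.md` is an assumption about how you would want such links normalized.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url: str) -> str:
    """Return the raw-Markdown URL for a docs page by appending `.md` to its path."""
    parts = urlsplit(page_url)
    path = parts.path.rstrip("/")
    if not path.endswith(".md"):
        path += ".md"
    # Assumption: query strings and fragments are not needed for the raw page.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(markdown_url("https://docs.beam.cloud/v2/getting-started/quickstart"))
```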

## Copy and open in your assistant

Every page has a menu in the top-right corner that lets you:

* **Copy page** as Markdown to paste into any chat.
* **Open in ChatGPT, Claude, or Perplexity** with the page preloaded.
* **Open in Cursor or VS Code** to use the page as context while you build.
* **Connect via MCP** so your agent can query the docs directly.

## Building on Beam with an agent

When working with a coding agent, point it at the resources above so it has accurate, up-to-date context:

* Share `llms.txt` so the agent knows the structure of the docs.
* Share the relevant `.md` pages (for example, the [Python SDK reference](/v2/reference/py-sdk.md) or [Quickstart](/v2/getting-started/quickstart.md)) for the task at hand.
* Connect the docs MCP server so the agent can search the documentation on demand.


# FAQ
Source: https://docs.beam.cloud/v2/resources/faq

This is an ongoing list of issues people sometimes encounter while using Beam. If you're having an issue, check this list first.

## `concurrency_limit_reached` or `cpu quota exceeded`

Each pricing plan has its own CPU and GPU quotas.

| Plan       | CPU Quota | GPU Quota |
| ---------- | --------- | --------- |
| Free Trial | 10        | 5         |
| Developer  | 10        | 5         |
| Team       | 1,000     | 20        |
| Growth     | 10,000+   | 100+      |

If you get this message, make sure you've added a payment method to your account and [selected the pay-as-you-go developer plan on this page](https://platform.beam.cloud/settings/plans).

## `Unable to connect to gateway`

Make sure you're on the latest version of the `beam-client` CLI.

```bash theme={null}
uv tool upgrade beam-client
```

Run this command to validate your version of the CLI:

```bash theme={null}
beam --version
```

<Tip>
  [You can see the latest CLI releases
  here](https://github.com/beam-cloud/beam-client/releases).
</Tip>

## `No space left on device`

This error typically occurs when your app runs out of disk space. For example, if you're downloading a 30Gi file and your app only has 8Gi of memory, you might see this error.

For more information on configuring RAM for your apps, [read more on this page](/v2/environment/gpu#configuring-cpu-and-memory).
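One way to fail early is to check free space before starting a large download. This is a rough sketch using only the Python standard library; `has_free_space` is a hypothetical helper, and checking the current directory (`"."`) is an assumption about where your app writes the file.

```python
import shutil

GIB = 1024**3


def has_free_space(path: str, required_bytes: int) -> bool:
    """Return True if the filesystem containing `path` has at least `required_bytes` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes


download_size = 30 * GIB  # size of the file in the example above

if not has_free_space(".", download_size):
    print("Not enough free disk space for a 30Gi download")
```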

## `cannot import name 'App' from 'beam'`

If you're seeing this error, it's because you're trying to use Beam V2 with a V1 app. There is no `App` class in Beam V2.

For more information on using Beam V2, [read more on this page](/v2/releases/v2-upgrade).

## `Unable to locate config file`

This typically happens when there are multiple Python environments on your computer.

<Warning>
  If you are using Conda, we recommend exiting Conda and using a standard Python
  virtual environment instead: `python3 -m virtualenv .venv && source .venv/bin/activate`
</Warning>

The most common way of solving this is by running `which python` and installing `beam-client` to that specific path.

For example:

```bash theme={null}
$ which python

python: aliased to /usr/bin/python3 # gotcha!

$ /usr/bin/python3 -m virtualenv .venv && source .venv/bin/activate

(.venv) $ python -m pip install --upgrade beam-client
```
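To confirm from inside Python which interpreter and environment are active, a short check like the one below can help. It is a sketch using only the standard library; `interpreter_info` is a hypothetical helper name, and comparing `sys.prefix` with `sys.base_prefix` is the usual way to detect a virtual environment rather than anything specific to Beam.

```python
import sys


def interpreter_info() -> dict:
    """Describe the Python interpreter that is currently running."""
    return {
        "executable": sys.executable,  # full path to the running interpreter
        "prefix": sys.prefix,  # root of the active environment
        # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
    }


print(interpreter_info())
```

If `executable` does not point at the environment where `beam-client` is installed, the CLI and your Python code are likely using different environments.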

## TensorFlow Can't Find GPUs

If you're using TensorFlow, you might run into an issue where `tf` doesn't recognize the available GPUs on the device.

<Warning>
  Make sure to install `tensorflow[and-cuda]`, otherwise the regular version of
  `tf` won't have access to the GPU device.
</Warning>

```python theme={null}
from beam import Image, endpoint, env

if env.is_remote():
    import tensorflow as tf


@endpoint(
    name="tensorflow-gpu",
    cpu=1,
    memory="4Gi",
    gpu="A10G",
    # Make sure to use `tensorflow[and-cuda]` in order to access GPU resources
    image=Image().add_python_packages(["tensorflow[and-cuda]"]),
)
def predict():
    # Show available GPUs
    gpus = tf.config.list_physical_devices("GPU")

    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        print(e)

    print("Is built with CUDA:", tf.test.is_built_with_cuda())
    # tf.test.is_gpu_available() is deprecated; check the device list instead
    print("Is GPU available:", len(gpus) > 0)
    print("GPUs available:", tf.config.list_physical_devices("GPU"))
```


# Pricing and Billing
Source: https://docs.beam.cloud/v2/resources/pricing-and-billing


Beam is serverless, which means your apps will scale-to-zero by default. Billing is based on the lifecycle of your containers. You are only charged when your containers are running.

## What am I charged for?

You are charged whenever a container is running. This includes:

* Running any code you've defined in an [`on_start`](/v2/endpoint/loaders) function
* Running your application code
* Any [`keep_warm_seconds`](/v2/endpoint/keep-warm) you've set

## What am I *not* charged for?

* Waiting for a machine to start
* Pulling your container image

## Default Container Spin-down Times

After handling a request, Beam keeps containers running ("warm") for a certain amount of time in order to quickly handle future requests. By default, these are the container "keep warm" times for each deployment type:

| Deployment Type         | Container Keep Warm Duration |
| ----------------------- | ---------------------------- |
| Endpoints/ASGI/Realtime | 180s                         |
| Task Queues             | 10s                          |
| Pods                    | 600s                         |

## Real-World Example

You've deployed a REST API. You've added two Python packages in your `Image()`, which are loaded when your app first starts.

You've also set `keep_warm_seconds=300`, which keeps the container alive for 300 seconds (5 minutes) after each request.

```python app.py theme={null}
from beam import endpoint


# This runs once when the container first starts
def load_models():
    return {}

@endpoint(keep_warm_seconds=300, on_start=load_models)
def predict():
    return {}
```

Let's pretend you deploy this and call the API. Suppose it takes:

* 1s to boot your application and run your `on_start` function.
* 100ms to run your task.
* 300s to keep the container alive, based on the `keep_warm_seconds` argument.

You would be billed for a total of 301.1 seconds.
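The same arithmetic can be written as a short sketch. `billed_seconds` is a hypothetical helper, not part of the Beam SDK, and it assumes a single request with no further requests arriving during the keep-warm window.

```python
def billed_seconds(startup_s: float, task_s: float, keep_warm_s: float) -> float:
    """Total container running time for a single request, in seconds."""
    return startup_s + task_s + keep_warm_s


# Values from the example above: 1s startup, 100ms task, 300s keep-warm
total = billed_seconds(startup_s=1.0, task_s=0.1, keep_warm_s=300.0)
print(f"Billed time: {total:.1f}s")
```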


# Configuration
Source: https://docs.beam.cloud/v2/sandbox/configuration

Learn how to configure and customize your sandbox environment

Sandboxes are configurable cloud environments. You control CPU, memory, GPU, dependencies, environment variables, and storage so each sandbox can be customized exactly to your needs.

## Basic

Start with the simplest configuration:

```python theme={null}
from beam import Sandbox, Image

# Default: 1 CPU, 128MB RAM, Python 3.11
sandbox = Sandbox()
sb = sandbox.create()
```

This gives you a minimal environment that works well for simple scripts and running untrusted code. For most real work, you'll want to customize the resources.

## Advanced compute settings

You can increase CPU, memory, or assign a GPU - depending on your use case:

```python theme={null}
# Simple scripts, web scraping
sandbox = Sandbox(cpu=1.0)

# Data processing, API development
sandbox = Sandbox(cpu=2.0, memory="16Gi")

# Machine learning, complex analysis, rendering
sandbox = Sandbox(cpu=4.0, memory="16Gi", gpu="A10G")
```

## Customize your environment

See the [Images](/v2/environment/custom-images) section for more information on how to customize the runtime.

## Persistent Storage

Beam supports two types of persistent storage: fast distributed volumes and cloud buckets you already manage.

### Distributed Storage Volumes

Mount fast storage volumes to persist files between sessions:

```python theme={null}
from beam import Volume

# Mount a storage volume to your sandbox
volume = Volume(name="documents", mount_path="/workspace/documents")

sandbox = Sandbox(volumes=[volume])
```

Use volumes when you:

* Are working on a project that spans multiple sessions
* Need to share data between different sandbox instances
* Want to keep work safe even if the sandbox crashes

### Cloud Buckets

For large datasets or team sharing, you can use your own buckets:

```python theme={null}
from beam import CloudBucket

# Connect to your S3 bucket
bucket = CloudBucket(
    bucket_name="my-data-bucket",
    mount_path="/data"
)

sandbox = Sandbox(volumes=[bucket])
```

Use cloud buckets for:

* Sensitive data
* Connecting existing object storage
* Long-term data storage in your own infrastructure

## Session Management

### Timeout Configuration

Set timeouts to control costs:

```python theme={null}
# Quick tasks (testing, simple scripts)
sandbox = Sandbox(keep_warm_seconds=1800)  # 30 minutes

# Development sessions
sandbox = Sandbox(keep_warm_seconds=3600)  # 1 hour

# Long-running tasks (training, processing)
sandbox = Sandbox(keep_warm_seconds=7200)  # 2 hours

# Manual termination only
sandbox = Sandbox(keep_warm_seconds=-1)
```

Start with shorter timeouts and increase as needed. You can always create a new sandbox if you need more time.

### Manual vs Automatic Termination

```python theme={null}
# Auto-terminate after 1 hour
sandbox = Sandbox(keep_warm_seconds=3600)

# Manual termination only (you control when it stops)
sandbox = Sandbox(keep_warm_seconds=-1)
```

Use manual termination for:

* Long-running training jobs
* Collaborative development sessions
* When you need to pause and resume work

## Environment Variables and Secrets

### Environment Variables

Pass configuration to your applications:

```python theme={null}
sandbox = Sandbox(
    env={
        "DATABASE_URL": "postgresql://user:pass@host:5432/db",
        "API_KEY": "your-api-key",
        "DEBUG": "true",
        "ENVIRONMENT": "development"
    }
)
```

Environment variables are good for:

* Keeping sensitive data out of your code
* Configuring applications for different environments
* Sharing configuration across team members

### Secrets Management

You can attach secrets to your sandbox using Beam's secret management system - they will be exposed as environment variables inside the Sandbox:

```python theme={null}
sandbox = Sandbox(secrets=["OPENAI_API_KEY"])
```

You add secrets using the [Beam CLI](https://docs.beam.cloud/v2/reference/cli#create-a-secret):

```sh theme={null}
$ beam secret create OPENAI_API_KEY ASIAY34FZKBOKMUTVV7A

=> Created secret with name: 'OPENAI_API_KEY'
```

Use secrets for:

* Database passwords
* API keys and tokens
* Private keys and certificates

## Best Practices

### Start Small, Scale Up

```python theme={null}
# Start with minimal resources
sandbox = Sandbox(cpu=1.0, memory="1Gi")

# If you need more power, create a new sandbox
powerful_sandbox = Sandbox(cpu=4.0, memory="8Gi")
```

You only pay for what you use. Start small and scale up as needed.

## Common Mistakes

### Over-provisioning

```python theme={null}
# Don't do this for simple scripts
sandbox = Sandbox(cpu=8.0, memory="32Gi", gpu="A10G")  # Overkill!
```

Start with minimal resources and scale up as needed.

### Including Unnecessary Packages

```python theme={null}
# Don't include packages you don't need
from beam import Image, PythonVersion
image = Image(python_version=PythonVersion.Python311).add_python_packages([
    "flask", "django", "fastapi", "tornado", "bottle"  # Pick one!
])
```

Only include packages you actually use.

### Long Timeouts for Short Tasks

```python theme={null}
# Don't set 2-hour timeout for 5-minute tasks
sandbox = Sandbox(keep_warm_seconds=7200)  # Wasteful!
```

Try to match timeout to expected task duration. You can always extend the timeout dynamically like so:

```python theme={null}
# Create a running sandbox instance with a short timeout
sb = Sandbox(keep_warm_seconds=300).create()

# Do some stuff

sb.update_ttl(300)  # Reset TTL back to 300 seconds
```

## What's Next?

Now that you understand configuration, let's put it to work:

* **[Process Management](/v2/sandbox/processes)**: Run code and commands in your configured environment
* **[File System Operations](/v2/sandbox/filesystem)**: Upload, download, and manage files
* **[Networking](/v2/sandbox/networking)**: Deploy web services and expose them to the internet
* **[Examples](/v2/sandbox/overview)**: See real-world configurations in action


# File System Operations
Source: https://docs.beam.cloud/v2/sandbox/filesystem

Upload, download, and manage files within your sandbox environment

Each sandbox has a built-in file system API available at `sb.fs`. You can upload local files, download files from the sandbox, list directories, and manage files with full metadata access.

## Uploading Files

### Basic File Upload

```python theme={null}
from beam import Sandbox, Image, PythonVersion

sandbox = Sandbox(image=Image(python_version=PythonVersion.Python311))
sb = sandbox.create()

# Upload a local file to the sandbox
sb.fs.upload_file("my_script.py", "/workspace/my_script.py")

```

### Uploading Multiple Files

```python theme={null}
# Upload several files
files_to_upload = [
    ("main.py", "/workspace/main.py"),
    ("requirements.txt", "/workspace/requirements.txt"),
    ("data.csv", "/workspace/data/data.csv"),
    ("config.yaml", "/workspace/config/config.yaml")
]

for local_path, sandbox_path in files_to_upload:
    sb.fs.upload_file(local_path, sandbox_path)
    print(f"Uploaded {local_path} to {sandbox_path}")
```
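
If your files already live in one local folder, you can build the list of `(local_path, sandbox_path)` pairs instead of typing it out. The sketch below is plain Python and makes no Beam calls; `plan_uploads` is a hypothetical helper, and whether `upload_file` creates missing parent directories in the sandbox is not covered here.

```python
import posixpath
from pathlib import Path


def plan_uploads(local_dir: str, remote_root: str = "/workspace") -> list[tuple[str, str]]:
    """Map every file under local_dir to a path under remote_root."""
    base = Path(local_dir)
    pairs = []
    for path in sorted(base.rglob("*")):
        if path.is_file():
            # Keep the relative layout, using forward slashes on the sandbox side
            relative = path.relative_to(base).as_posix()
            pairs.append((str(path), posixpath.join(remote_root, relative)))
    return pairs
```

You could then loop over `plan_uploads("my_project")` and pass each pair to `sb.fs.upload_file(local_path, sandbox_path)`, exactly as in the loop above.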

## Downloading Files

### Basic File Download

```python theme={null}
# Download a file from the sandbox
sb.fs.download_file("/workspace/output.txt", "local_output.txt")

# Download to a specific directory
sb.fs.download_file("/workspace/results/data.csv", "downloads/data.csv")
```

### Downloading Multiple Files

```python theme={null}
# Download all files in a directory
files = sb.fs.list_files("/workspace/results")
for file in files:
    if not file.is_dir:
        local_path = f"downloads/{file.name}"
        sb.fs.download_file(f"/workspace/results/{file.name}", local_path)
        print(f"Downloaded {file.name}")
```

## File Management

### Listing Files and Directories

```python theme={null}
# List files in workspace
files = sb.fs.list_files("/workspace")
for file in files:
    if file.is_dir:
        print(f"{file.name}/")
    else:
        print(f"{file.name} ({file.size} bytes)")
```

### File Information

```python theme={null}
# Get detailed information about a file
file_info = sb.fs.stat_file("/workspace/my_script.py")

print(f"Name: {file_info.name}")
print(f"Size: {file_info.size} bytes")
print(f"Is Directory: {file_info.is_dir}")
print(f"Permissions: {oct(file_info.permissions)}")
print(f"Owner: {file_info.owner}")
print(f"Group: {file_info.group}")
print(f"Modified: {file_info.mod_time}")
```


# Networking
Source: https://docs.beam.cloud/v2/sandbox/networking

Expose ports dynamically for services running inside your sandbox

The Sandbox provides some basic network tools. You can run web services and expose them to the internet behind SSL-terminated endpoints. This is useful for web development, API testing, and running interactive applications with LLMs (think v0, reflex.build, etc).

## Exposing Ports

You can expose ports from your Sandbox in two ways: statically at creation time, or dynamically at runtime.

### Static Port Exposure

Specify ports when creating the sandbox to have them exposed immediately with public URLs:

```python theme={null}
from beam import Sandbox, Image, PythonVersion

# Create a sandbox with pre-exposed ports
sandbox = Sandbox(
    image=Image(python_version=PythonVersion.Python311),
    ports=[8000, 8080, 3000]  # Ports exposed at creation
)

sb = sandbox.create()

# URLs are immediately available
urls = sb.list_urls()
for port, url in urls.items():
    print(f"Port {port} exposed at: {url}")
```

### Dynamic Port Exposure

You can also expose ports dynamically after the sandbox is created using the `expose_port()` method.

#### Expose a port

```python theme={null}
from beam import Sandbox, Image, PythonVersion

sandbox = Sandbox(image=Image(python_version=PythonVersion.Python311))
sb = sandbox.create()

# Expose port 8000
url = sb.expose_port(8000)
print(f"Port 8000 exposed at: {url}")

# The URL will be something like:
# https://384ced3c-f837-4429-bada-39e0b965c9f4-8000.app.beam.cloud
```

#### Expose multiple ports

```python theme={null}
# Expose multiple ports
ports = [8000, 8080, 3000]
urls = {}

for port in ports:
    url = sb.expose_port(port)
    urls[port] = url
    print(f"Port {port} exposed at: {url}")

# Access different services
print(f"Main app: {urls[8000]}")
print(f"Admin panel: {urls[8080]}")
print(f"API server: {urls[3000]}")
```

#### List exposed ports/preview URLs

```python theme={null}
# List exposed ports & preview URLs
urls = sb.list_urls()
for port, url in urls.items():
    print(f"Port {port} exposed at: {url}")
```

## Network Security

### Blocking Outbound Traffic

You can block all outbound network access from your Sandbox while still allowing inbound connections to exposed ports. This is useful for security-sensitive workloads or when executing untrusted code.

```python theme={null}
from beam import Sandbox, Image, PythonVersion

# Create a sandbox with blocked outbound network
sandbox = Sandbox(
    image=Image(python_version=PythonVersion.Python311),
    block_network=True,  # Block all outbound traffic
)

sb = sandbox.create()

# The sandbox can still receive requests on exposed ports
url = sb.expose_port(8000)
print(f"Port 8000 exposed at: {url}")

# But it cannot make outbound connections to external services
```

With `block_network=True`, the Sandbox can receive requests on exposed ports but cannot initiate outbound connections to external services.

### Allow Lists (CIDR Ranges)

For more fine-grained control, you can specify an allow list of CIDR ranges that your Sandbox is permitted to connect to. All other outbound traffic will be blocked.

```python theme={null}
from beam import Sandbox, Image, PythonVersion

# Create a sandbox with an allow list
sandbox = Sandbox(
    image=Image(python_version=PythonVersion.Python311),
    allow_list=[
        "8.8.8.8/32",      # Allow Google DNS
        "10.0.0.0/8",      # Allow private network range
        "2001:db8::/32",   # Allow IPv6 range
    ],
)

sb = sandbox.create()

# The sandbox can only connect to addresses in the allow list
# All other outbound traffic is blocked
```

**Important Notes:**

* Maximum of 10 CIDR entries per Sandbox
* Supports both IPv4 and IPv6 addresses
* Must use proper CIDR notation (e.g., `"8.8.8.8/32"` for a single IP, `"10.0.0.0/8"` for a range)
* Cannot use `allow_list` and `block_network` together - they are mutually exclusive
* Invalid CIDR values will trigger an error at creation time
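
If you build the allow list from user input, it can help to check it locally before creating the Sandbox. The sketch below applies the rules listed above using Python's standard `ipaddress` module. `validate_allow_list` is a hypothetical helper, not part of the Beam SDK, and rejecting entries with host bits set (such as `10.0.0.1/8`) is an assumption about what counts as "proper" CIDR notation.

```python
import ipaddress

MAX_ALLOW_LIST_ENTRIES = 10  # limit stated in the notes above


def validate_allow_list(allow_list: list[str], block_network: bool = False) -> list[str]:
    """Return the normalized CIDR strings, or raise ValueError."""
    if block_network and allow_list:
        raise ValueError("allow_list and block_network are mutually exclusive")
    if len(allow_list) > MAX_ALLOW_LIST_ENTRIES:
        raise ValueError(f"at most {MAX_ALLOW_LIST_ENTRIES} CIDR entries are allowed")

    normalized = []
    for entry in allow_list:
        if "/" not in entry:
            raise ValueError(f"{entry!r} is missing a prefix length, e.g. '8.8.8.8/32'")
        # ip_network handles both IPv4 and IPv6; strict=True rejects host bits
        network = ipaddress.ip_network(entry, strict=True)
        normalized.append(str(network))
    return normalized


print(validate_allow_list(["8.8.8.8/32", "10.0.0.0/8", "2001:db8::/32"]))
```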

### Updating Network Permissions at Runtime

You can dynamically update the network permissions of a running Sandbox without restarting it. This allows you to change access policies during the sandbox's lifetime.

```python theme={null}
from beam import Sandbox, Image, PythonVersion

# Create a sandbox with no network restrictions
sandbox = Sandbox(image=Image(python_version=PythonVersion.Python311))
sb = sandbox.create()

# Later, block all outbound traffic
sb.update_network_permissions(block_network=True)

# Or update to use an allowlist instead
sb.update_network_permissions(
    allow_list=[
        "8.8.8.8/32",      # Allow Google DNS
        "10.0.0.0/8",      # Allow private network range
    ]
)

# Remove all restrictions
sb.update_network_permissions(block_network=False, allow_list=[])
```

**Important Notes:**

* Cannot use `block_network=True` and `allow_list` together - they are mutually exclusive
* Exposed ports remain accessible regardless of network restrictions
* Changes take effect immediately without requiring a restart


# Overview
Source: https://docs.beam.cloud/v2/sandbox/overview

Run anything in secure code execution environments

Sandboxes are ultra-fast, Python-native environments for running any workload - with GPUs, networking, and persistent storage - in seconds.

## Features

* **Ultra Fast Boot Times**: Sandboxes cold boot in 1–3 seconds, even with dependencies included.
* **Image Caching**: Beam caches dependencies in your base image, so subsequent sandboxes boot faster. You can also build custom images for each app.
* **Snapshots**: Create Snapshots of the filesystem and restart Sandboxes from a previous state.
* **Preview URLs**: Dynamically expose ports behind SSL-terminated, authenticated endpoints.
* **Session Management**: Keep sandboxes running indefinitely, or configure them to shut down automatically after any period you choose.

## Quick Start

Create a sandbox, run some code, and see the results:

```python theme={null}
from beam import PythonVersion, Image, Sandbox

# Create a sandbox with the tools you need
sandbox = Sandbox(image=Image(python_version=PythonVersion.Python311))

# Launch it into the cloud
sb = sandbox.create()

# Run some code - this happens in the cloud, not on your machine!
result = sb.process.run_code("print('hello from the sandbox!')").result

print(result)

# Clean up - shut down the sandbox
sb.terminate()
```

## Running a Node.js server

You can run arbitrary code on Beam. It doesn't need to be Python!

For example, let's run a Node server. We'll track the startup time too:

```python theme={null}
import time
from beam import Image, Sandbox

start = time.time()

# Create a sandbox on port 3000
sb = Sandbox(image=Image().from_registry("node:20")).create()
url = sb.expose_port(3000)

# Terminate sandbox after 5 minutes
sb.update_ttl(300)

# Run some code
sb.process.exec("sh", "-c", "npx http-server -p 3000 -c-1")

elapsed = time.time() - start
print(f"Node app running at: {url}")
print(f"Sandbox started in {elapsed:.2f} seconds")
```

## Core Features

### Process Management

Run Python code, shell commands, or start long-running processes:

```python theme={null}
# Run some Python code
result = sb.process.run_code("print('Hello from sandbox!')")
print(result.result)

# Execute arbitrary shell commands
process = sb.process.exec("ls", "-la", "/workspace")
print(process.logs.read())
process.wait()

# Expose a port to the internet
url = sb.expose_port(8000)
print(f"Sandbox running here: {url}")

# Start a web server in the background
server_process = sb.process.exec("python3", "-m", "http.server", "8000")

try:
    for line in server_process.logs:
        print(line, end="")
finally:
    sb.terminate()
```

### File System Operations

Upload local files, download results, and manage your workspace:

```python theme={null}
# Upload local files to the sandbox
sb.fs.upload_file("my_script.py", "/workspace/my_script.py")

# Run it
result = sb.process.run_code("exec(open('/workspace/my_script.py').read())")

# Download a file from the sandbox to your local machine
sb.fs.download_file("/workspace/output.csv", "local_results.csv")
```

### Dynamic Preview URLs

Expose ports to make your services accessible over the internet:

```python theme={null}
# Start a Flask app
process = sb.process.exec("python3", "app.py", cwd="/workspace")

# Expose it to the world
url = sb.expose_port(5000)
print(f"Your app is live at: {url}")
```

## Key Concepts

### SandboxInstance

When you create a sandbox, you get a `SandboxInstance` object that provides:

* `process`: Run commands and code with real-time output
* `fs`: Upload, download, and manage files
* `expose_port()`: Make your services accessible to the internet
* `terminate()`: Clean up when you're done

### Lifecycle

1. **Create**: Configure your environment (CPU, memory, packages, etc.)
2. **Launch**: Start the sandbox with `create()`
3. **Use**: Execute code, manage files, expose services
4. **Terminate**: Clean up with `terminate()` (or let it auto-terminate)
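
One way to make sure the last step always happens is to wrap the lifecycle in a context manager. This is a minimal sketch, not something the Beam SDK provides: `running` is a hypothetical helper, and `FakeSandbox` / `FakeInstance` are stand-ins so the example runs without a Beam account. With the real SDK you would pass a configured `Sandbox(...)` instead.

```python
from contextlib import contextmanager


@contextmanager
def running(sandbox):
    """Create a sandbox instance and always terminate it on exit."""
    sb = sandbox.create()      # Launch
    try:
        yield sb               # Use
    finally:
        sb.terminate()         # Terminate, even if an exception was raised


# Stand-ins so the sketch runs locally (hypothetical classes)
class FakeInstance:
    def __init__(self):
        self.terminated = False

    def terminate(self):
        self.terminated = True


class FakeSandbox:
    def create(self):
        return FakeInstance()


with running(FakeSandbox()) as sb:
    print("terminated inside block:", sb.terminated)
print("terminated after block:", sb.terminated)
```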

## What's Next?

Now that you understand what Sandbox can do, let's dive deeper into each capability:

* **[Configuration](/v2/sandbox/configuration)**: Learn how to customize your sandbox for different use cases
* **[Process Management](/v2/sandbox/processes)**: Master running code and commands with real-time feedback
* **[File System Operations](/v2/sandbox/filesystem)**: Upload, download, and manage files inside your Sandbox
* **[Networking](/v2/sandbox/networking)**: Deploy web services and expose them to the internet
* **[Examples](/v2/sandbox/overview)**: See real-world patterns and workflows


# Process Management
Source: https://docs.beam.cloud/v2/sandbox/processes

Execute code and commands with real-time output streaming in your sandbox

The Sandbox provides process management through the `process` property. You can execute Python code, run shell commands, and manage long-running processes with real-time output streaming.

## Running Python Code

### Basic Code Execution

```python theme={null}
from beam import Sandbox, Image, PythonVersion

sandbox = Sandbox(image=Image(python_version=PythonVersion.Python311))
sb = sandbox.create()

# Run simple Python code
result = sb.process.run_code("print('Hello from sandbox!')")
print(result.result)  # Hello from sandbox!
print(f"Exit code: {result.exit_code}")  # 0
```

### Complex Python Scripts

```python theme={null}
# Multi-line Python code
code = """
import numpy as np
import pandas as pd

# Generate sample data
data = np.random.randn(1000, 3)
df = pd.DataFrame(data, columns=['A', 'B', 'C'])

# Calculate statistics
stats = df.describe()
print("Data Statistics:")
print(stats)

# Save results
df.to_csv('/workspace/data.csv', index=False)
print("Data saved to /workspace/data.csv")
"""

response = sb.process.run_code(code)
print(response.result)
```

### Error Handling

```python theme={null}
# Code with errors
response = sb.process.run_code("""
import nonexistent_module
print("This won't execute")
""")

print(f"Exit code: {response.exit_code}")  # Non-zero exit code
print(f"Error output: {response.result}")  # Error message
```

## Executing Commands

### Basic Command Execution

```python theme={null}
# Run a simple command
process = sb.process.exec("ls", "-la", "/workspace")

# Wait for completion
exit_code = process.wait()
print(f"Command completed with exit code: {exit_code}")

# Get all output
for line in process.logs:
    print(line, end="")
```

### Shell Commands

```python theme={null}
# Use shell features by running the command through a shell
process = sb.process.exec("sh", "-c", "echo $HOME && pwd && whoami")

# Wait and get output
process.wait()
for line in process.logs:
    print(line, end="")
```

### Working Directory

```python theme={null}
# Execute in specific directory
process = sb.process.exec("ls", "-la", cwd="/workspace")
process.wait()

# Create directory and work in it
sb.process.run_code("import os; os.makedirs('/workspace/myproject', exist_ok=True)")
process = sb.process.exec("touch", "test.txt", cwd="/workspace/myproject")
```

### Environment Variables

```python theme={null}
# Set environment variables for command
env = {
    "DATABASE_URL": "postgresql://localhost/mydb",
    "DEBUG": "true",
    "API_KEY": "secret-key"
}

# The pipe is a shell feature, so run the command through a shell
process = sb.process.exec("sh", "-c", "env | grep DATABASE", env=env)
process.wait()
```

## Non-blocking Execution

### Background Processes

```python theme={null}
# Start a long-running process without waiting
process = sb.process.run_code("""
import time
for i in range(10):
    print(f"Processing {i}...")
    time.sleep(1)
""", blocking=False)

print(f"Process started with PID: {process.pid}")

# Do other work while it runs
print("Process is running in background...")

# Check if still running
print(f"Exit code: {process.exit_code}")  # -1 if still running

# Wait for completion when ready
process.wait()
print("Process completed!")
```

### Real-time Output Streaming

```python theme={null}
# Start process and stream output in real-time
process = sb.process.run_code("""
import time
for i in range(5):
    print(f"Step {i}: Processing...")
    time.sleep(1)
print("Done!")
""", blocking=False)

# Stream output as it happens
for line in process.logs:
    print(f"[REAL-TIME] {line}", end="")
```

## Process Control

### Process Management

```python theme={null}
# Start multiple processes
process1 = sb.process.exec("sleep", "30", blocking=False)
process2 = sb.process.exec("sleep", "60", blocking=False)

print(f"Process 1 PID: {process1.pid}")
print(f"Process 2 PID: {process2.pid}")

# List all running processes
for p in sb.process.list_processes():
    print(f"PID {p.pid}: {p.status()}")

# Kill specific process
process1.kill()
print("Process 1 killed")

# Get process by PID
specific_process = sb.process.get_process(process2.pid)
print(f"Process 2 status: {specific_process.status()}")
```

### Process Status and Monitoring

```python theme={null}
import time

# Start a process
process = sb.process.exec("sleep", "10", blocking=False)

# Monitor status
while True:
    exit_code, status = process.status()
    print(f"PID {process.pid}: Exit code {exit_code}, Status: {status}")

    if exit_code >= 0:
        print("Process completed")
        break

    time.sleep(1)
```
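
The loop above polls forever if the process never exits. Below is a sketch of the same polling pattern with a deadline. It assumes, as in the examples on this page, that the status call returns an `(exit_code, status)` pair and that a negative exit code means "still running". `wait_for_exit` is a hypothetical helper, and the status strings in the stand-in are made up for illustration.

```python
import time
from typing import Callable, Tuple


def wait_for_exit(
    get_status: Callable[[], Tuple[int, str]],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
) -> int:
    """Poll get_status() until the exit code is non-negative or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        exit_code, _status = get_status()
        if exit_code >= 0:
            return exit_code
        if time.monotonic() >= deadline:
            raise TimeoutError(f"process still running after {timeout_s}s")
        time.sleep(interval_s)


# Stand-in for process.status: reports "running" twice, then exits with 0
responses = iter([(-1, "running"), (-1, "running"), (0, "exited")])
print(wait_for_exit(lambda: next(responses), timeout_s=5, interval_s=0.01))
```

With a real process you would call it as `wait_for_exit(process.status, timeout_s=60)`.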

### Process Output Streams

```python theme={null}
# Start process with output
process = sb.process.run_code("""
import sys
print("This goes to stdout")
print("This also goes to stdout", file=sys.stdout)
print("This goes to stderr", file=sys.stderr)
""", blocking=False)

# Read stdout
print("=== STDOUT ===")
for line in process.stdout:
    print(f"STDOUT: {line}", end="")

# Read stderr
print("=== STDERR ===")
for line in process.stderr:
    print(f"STDERR: {line}", end="")

# Read combined logs
print("=== COMBINED LOGS ===")
for line in process.logs:
    print(f"LOG: {line}", end="")
```

### List running processes

```python theme={null}
# List running processes
processes = sb.process.list_processes()
for pid, process in processes.items():
    print(f"PID {process.pid}: {process.exit_code}")
```


# Snapshots
Source: https://docs.beam.cloud/v2/sandbox/snapshots


Snapshots let you capture the filesystem and/or memory of a Sandbox as an immutable artifact.
You can then use this artifact to create new Sandboxes with that same captured state.

Use snapshots when you want to:

* Fork Sandboxes to test different variations of code
* Initialize Sandboxes with existing state for faster cold starts
* Save a reproducible environment you can return to later

## Creating a Filesystem Snapshot

```python theme={null}
from beam import Image, Sandbox

# Create a sandbox and make some changes to the filesystem
sandbox = Sandbox(cpu=1).create()
p = sandbox.process.exec("sh", "-c", "mkdir -p /something && touch /something/file.txt")

# Read the logs
p.wait()
print(p.logs.read())

# Generate a filesystem snapshot and terminate the sandbox
image_id = sandbox.create_image_from_filesystem()
sandbox.terminate()
```

## Using Filesystem Snapshots

You can use Snapshots as a base image for any other abstraction or Sandbox on Beam, using `Image.from_id`:

```python theme={null}
from beam import Image, Sandbox

# Creates an image from a filesystem snapshot
image = Image.from_id(image_id)

sandbox = Sandbox(image=image).create()
p = sandbox.process.exec("ls", "-l", "/something")

p.wait()
print(p.logs.read())

sandbox.terminate()
```

## Creating a Memory Snapshot

You can also create a memory snapshot of a running Sandbox, which will capture the state of the sandbox's memory - including all running processes and exposed ports.

```python theme={null}
from beam import Image, Sandbox

# Create a sandbox, expose a port, and start a web server
sandbox = Sandbox(cpu=1).create()
sandbox.expose_port(8000)

p = sandbox.process.exec("python", "-c", "import http.server; http.server.HTTPServer(('', 8000), http.server.SimpleHTTPRequestHandler).serve_forever()")

# Generate a memory snapshot and terminate the sandbox
snapshot_id = sandbox.snapshot_memory()

print(sandbox.list_urls())

sandbox.terminate()
```

## Using Memory Snapshots

You can use memory snapshots as a starting point for a new Sandbox, using `Sandbox().create_from_memory_snapshot`:

```python theme={null}
from beam import Image, Sandbox

# Creates a new sandbox from a memory snapshot

sandbox = Sandbox().create_from_memory_snapshot(snapshot_id)

print(sandbox.list_urls())

sandbox.terminate()
```


# Scaling Out
Source: https://docs.beam.cloud/v2/scaling/concurrency


You can scale out your app to multiple containers by adding autoscaling.

## Scaling Horizontally (Adding More Containers)

When you deploy a Task Queue or endpoint, Beam creates a queueing system that manages each task that's created when your API is called.

You can configure how Beam scales based on how many tasks are in the queue.

<Frame>
  <img />
</Frame>

### Scale by Queue Depth

Our simplest autoscaling strategy allows you to scale by the number of tasks in the queue.

This allows you to control how many tasks each container can process before scaling up. For example, you could set up an autoscaler to run 30 tasks per container. When you pass 30 tasks in your queue, we will add a container. When you pass 60, we'll add another container (up until `max_containers` is reached).

```python theme={null}
from beam import QueueDepthAutoscaler, endpoint

autoscaling_config = QueueDepthAutoscaler(
    max_containers=5,
    tasks_per_container=30,
)

@endpoint(autoscaler=autoscaling_config)
def function():
    ...
```
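
To get a feel for how queue depth maps to container count, here is a rough sketch of the arithmetic described above. It is an approximation for intuition only: `desired_containers` is a hypothetical function, and the real autoscaler may sample the queue or apply other rules not described on this page.

```python
import math


def desired_containers(
    queue_depth: int,
    tasks_per_container: int,
    max_containers: int,
    min_containers: int = 0,
) -> int:
    """Estimate how many containers a queue-depth policy would ask for."""
    needed = math.ceil(queue_depth / tasks_per_container)
    # Clamp to the configured bounds
    return max(min_containers, min(needed, max_containers))


for depth in (0, 30, 31, 61, 500):
    print(depth, desired_containers(depth, tasks_per_container=30, max_containers=5))
```

With `tasks_per_container=30` and `max_containers=5`, this estimate gives 1 container at 30 tasks, 2 at 31, 3 at 61, and stays at 5 no matter how deep the queue gets.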

## Setting Always-On Containers

<Note>
  Any running containers count towards billable usage. Take care to avoid
  setting `min_containers` unless you're comfortable paying for usage 24/7.
</Note>

You can configure the number of containers running at baseline using the `min_containers` field.

By setting `min_containers=1`, 1 container will *always* remain running until the deployment is stopped.

```python theme={null}
from beam import endpoint, QueueDepthAutoscaler


@endpoint(
    autoscaler=QueueDepthAutoscaler(
        min_containers=1, max_containers=3, tasks_per_container=1
    ),
)
def handler():
    return {"success": "true"}
```

<Warning>
  If you redeploy an app that has `min_containers` set, make sure to explicitly
  stop the previous deployment versions in order to avoid running containers
  that you are no longer using.
</Warning>


# Concurrent Inputs
Source: https://docs.beam.cloud/v2/scaling/concurrent-inputs


## Increasing Throughput in a Single Container

You can increase throughput for your workloads by configuring the number of workers to launch per container. For example, if you have 4 workers on 1 container, you can run 4 tasks at once.

Workers are especially useful for CPU workloads, since you can increase throughput by adding workers and additional CPU cores, rather than autoscaling out to additional containers.

```python theme={null}
from beam import Image, QueueDepthAutoscaler, task_queue


@task_queue(
    cpu=4, # 1 CPU core per worker is a good rule of thumb
    workers=4, # Launch 4 workers per container to increase throughput
    image=Image(python_version="python3.8", python_packages=["pandas", "csaps"]),
    autoscaler=QueueDepthAutoscaler(max_containers=5, tasks_per_container=1),
)
def handler():
    import pandas as pd

    print(pd)
    import time

    time.sleep(5)
    return {"result": True}
```

Workers are always orchestrated together. When the container launches, all the workers launch.

This can result in higher throughput than using multiple containers with horizontal autoscaling.

### Worker Use-Cases

<Tip>
  Workers allow you to increase your *per container* throughput, vertically.
  Autoscaling allows you to scale the *number of containers* and increase
  throughput horizontally.
</Tip>

Each worker will share the CPU, Memory, and GPU defined in your app. This means that you'll usually need to increase these values in order to utilize more workers.

For example, let's say your model uses 3Gi of GPU memory, and your app is deployed on a T4 GPU with 16Gi of GPU memory. You can safely deploy with 4 workers, since 4 × 3Gi = 12Gi fits within the 16Gi of GPU memory available.

It's not always possible to fit multiple workers onto a single machine. In order to use workers effectively, you'll need to know how much compute is consumed by your task.
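
As a quick sanity check, you can estimate an upper bound on the worker count from memory alone. The sketch below uses the figures from the example above (3Gi per model copy on a 16Gi GPU). `max_workers_by_memory` is a hypothetical helper, and it ignores runtime overhead and per-request memory, so treat the result as a ceiling rather than a recommendation.

```python
def max_workers_by_memory(available_gib: float, per_worker_gib: float) -> int:
    """Largest worker count whose combined memory fits in the available memory."""
    if per_worker_gib <= 0:
        raise ValueError("per_worker_gib must be positive")
    return int(available_gib // per_worker_gib)


# Upper bound for 3Gi per worker on a 16Gi GPU
print(max_workers_by_memory(16, 3))

# 4 workers need 12Gi, which leaves headroom below that bound
print(4 * 3 <= 16)
```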

When you've added multiple workers, you'll be able to see each time a new worker is started in your logs:

<Frame>
  <img />
</Frame>


# Parallelizing Functions
Source: https://docs.beam.cloud/v2/scaling/parallelizing-functions

How to parallelize your functions

## Fanning Out Workloads

You can scale out individual Python functions to many containers using the `.map()` method.

You might use this for parallelizing computationally heavy tasks, such as batch inference or data processing jobs.

```python theme={null}
from beam import function


@function(cpu=0.1)
def square(i: int):
    return i**2


def main():
    numbers = list(range(10))
    squared = []

    # Run a remote container for every item in list
    for result in square.map(numbers):
        print(result)
        squared.append(result)


if __name__ == "__main__":
    main()
```

When we run this Python module, 10 containers will be spawned to run the workload:

```bash theme={null}
$ python math-app.py

=> Building image
=> Using cached image
=> Syncing files

=> Running function: <map-example:square>
=> Running function: <map-example:square>
=> Running function: <map-example:square>
=> Running function: <map-example:square>
=> Running function: <map-example:square>
=> Running function: <map-example:square>
=> Running function: <map-example:square>
=> Running function: <map-example:square>
=> Running function: <map-example:square>
=> Running function: <map-example:square>

=> Function complete <a6a1c063-b0d7-4c62-b6b1-a7940b19fde9>
=> Function complete <531e1f2c-a4f2-4edf-9cb9-6240df959815>
=> Function complete <bc421f5a-e09b-42d4-8035-d3d13ca5c238>
=> Function complete <2a3dde03-20df-4805-8fb0-ec9743f2bde3>
=> Function complete <59b64517-7b4a-4260-8c65-d0fbb9b98a76>
=> Function complete <f0ab7790-e2fb-441f-8278-74856719a457>
=> Function complete <1256a9ac-c035-412a-ac65-c94248f1ce99>
=> Function complete <476189dd-ba28-4646-9911-96ef8794cb58>
=> Function complete <04ef44cd-ff64-4ef2-a087-00c01ce5a2e4>
=> Function complete <104a602c-93a7-43d5-983c-071f64d91a2c>
```

## Passing Multiple Arguments

The `.map()` method can also parallelize functions that require multiple parameters. Simply pass a list of tuples, where each tuple corresponds to a set of arguments for your function.

Below is an example that counts how many prime numbers appear between a start and a stop index for each tuple in ranges:

```python theme={null}
from beam import function

def is_prime(n: int) -> bool:
    if n < 2:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

@function(cpu=0.1)
def count_primes_in_range(start: int, stop: int) -> int:
    """
    Returns the number of prime numbers in the range [start, stop).
    """
    return sum(is_prime(i) for i in range(start, stop))

def main():
    # Each tuple represents (start, stop)
    ranges = [
        (0, 10),
        (10, 20),
        (20, 30)
    ]

    # .map() will launch a remote container for each tuple
    for result in count_primes_in_range.map(ranges):
        print(result)

if __name__ == "__main__":
    main()
```

In this example:

1. `ranges` is a list of tuples `(start, stop)`.
2. Calling `count_primes_in_range.map(ranges)` spawns a remote execution for each tuple, passing `(start, stop)` to the function.
3. Each remote call returns the number of prime numbers in that sub-range, which we print out.

With `.map()`, Beam takes care of distributing each item (or tuple of items) to separate containers for parallel processing. This approach makes it easy to scale out CPU-heavy or data-intensive tasks with minimal code.
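
When the work is one large interval, the list of `(start, stop)` tuples can be generated rather than written by hand. The sketch below uses a hypothetical helper, `make_ranges`, to split an interval into chunks, then checks the per-chunk counts locally with `itertools.starmap`, which unpacks each tuple into positional arguments. It runs without Beam: `count_primes_in_range` here is an undecorated local copy of the function above.

```python
from itertools import starmap


def is_prime(n: int) -> bool:
    if n < 2:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True


def count_primes_in_range(start: int, stop: int) -> int:
    # Local, undecorated copy of the remote function, used to check results
    return sum(is_prime(i) for i in range(start, stop))


def make_ranges(start: int, stop: int, step: int) -> list[tuple[int, int]]:
    """Hypothetical helper: split [start, stop) into chunks of at most `step` numbers."""
    return [(lo, min(lo + step, stop)) for lo in range(start, stop, step)]


if __name__ == "__main__":
    ranges = make_ranges(0, 30, 10)
    print(ranges)  # [(0, 10), (10, 20), (20, 30)]
    print(list(starmap(count_primes_in_range, ranges)))  # [4, 4, 2]
```

The generated list has the same shape as the hand-written `ranges` in the example above, so it could be passed to `count_primes_in_range.map(ranges)` unchanged.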


# Compute Pools
Source: https://docs.beam.cloud/v2/scaling/pools


Compute pools are groups of dedicated machines reserved for your workloads. Instead of running on shared, on-demand capacity, your containers are scheduled onto nodes that belong only to you.

Pools are useful when you need guaranteed capacity, consistent hardware, or want to run Beam workloads on your own machines.

## Creating a Pool

You can reserve dedicated hardware from the dashboard. Open the **Add Hardware** dialog, choose an instance type, and give your pool a name. Any nodes you reserve under the same pool name will be added to that pool.

<Frame>
  <img />
</Frame>

After clicking **Reserve & install**, the nodes are provisioned and joined to your pool automatically.

<Note>
  Pools remain active until they are explicitly terminated. You will continue
  to be billed for reserved nodes until you remove them from your account.
</Note>

## Using a Pool

Once your pool is set up, route workloads to it by passing the `pool` argument to your function decorator:

```python theme={null}
from beam import function

@function(gpu="H200", pool="H200-pool")
def handler():
    return {}
```

Any tasks for this function will be scheduled onto machines in the pool, rather than on-demand serverless capacity.

## Bring Your Own Hardware

You can also connect your own machines to a pool. From the **Add Hardware** dialog, choose **Bring your own hardware**, enter a pool name, and run the generated command on your machine:

<Frame>
  <img />
</Frame>

Machines that run this command join the pool and appear in your fleet automatically.


# Privacy Policy
Source: https://docs.beam.cloud/v2/security/privacy-policy


**Your privacy is important to us.** This Privacy Policy document explains the
collection, use and disclosure of information that we receive through Slai. This
Privacy Policy does not apply to any third-party websites, services or
applications, even if they are accessible through our Services.

**Effective Date**: October 3, 2024

This policy describes how Smartshare Inc. (“we,” “us,” or “Company”) collects,
aggregates, stores, safeguards and uses the data and information (including
non-public personal information, or “NPI”) provided by users through our
websites, [www.beam.cloud](http://www.beam.cloud) and [www.slai.io](http://www.slai.io/) (the “Site”), as well as
information collected by us through other means, including by email, over the
phone, or in offline communications. This Site is operated by the Company and
has been created to provide information about our company, products, and
services (together, the “Services”). This policy applies to the Site, the
Services, and our mobile, tablet and other smart device applications, and
application program interfaces (collectively, "Application"). The Site,
Application and Services together are hereinafter collectively referred to as
the “Site.”

**We take your privacy and the security of your information seriously.**

This policy explains:

* What information we collect

* How we use the information we collect

* Choices you can make about the way your information is collected and used

* How we protect personal information electronically and physically

This policy is incorporated into and a material term of your registration and/or
use of Company’s products and services, including our website,
[www.slai.io](http://www.slai.io/) and [www.beam.cloud](http://www.beam.cloud/). By using the Site, you consent to the
practices set forth in this Privacy Policy.

**INFORMATION WE COLLECT**

**Information You Provide to Us**

Company collects information from you when you choose to provide it to us
through the Site or through any other means. This may include when you create an
account on the Site, register or request products or services, request
information from us, sign up for newsletters or our email lists, use our Site,
or otherwise contact us.

The information we collect may include your name, address, email address,
telephone or mobile phone number, and information from other third-party
applications you connect to the Site. You may be required to provide certain
personal and/or business information to apply for and receive Company products
or services.

**Information We Automatically Collect**

We may use cookies or other technologies to automatically collect certain
information when you visit our Site or interact with our emails. For example, if
you are responding to an offer, promotional email or other email from us, we may
automatically populate your personal information into our system once you enter
your offer code or similar identifying device or otherwise accept your offer or
promotion. Additionally, we may automatically collect certain non-personal
information from you such as your browser type, operating system, software
version, and Internet Protocol ("IP") address. We also may collect information
about your use of the Site, including the date and time of access, the areas or
pages that you visit, the amount of time you spend using the Site, the number of
times you return, whether you open, forward, or click-through emails, and other
Site usage data.

You may adjust your browser or operating system settings to limit this tracking
or to decline cookies, but by doing so, you may not be able to use certain
features on the Site or take full advantage of all of our offerings. Check the
"Help" menu of your browser or operating system to learn how to adjust your
tracking settings or cookie preferences. Please note that our system may not
respond to Do Not Track requests or headers from some or all browsers.

**HOW WE USE INFORMATION WE COLLECT**

Company uses the data and information you provide in a manner that is consistent
with this Privacy Policy and applicable law. If you provide personal data for a
certain reason, we may use the personal data in connection with the reason for
which it was provided. For instance, if you contact us by email, we will use the
personal data you provide to answer your question or resolve your problem. Also,
if you provide personal data in order to obtain access to the Site, we will use
your personal data to provide you with access to the Site and to monitor your
use of the Site.

Company may also use your personal data and other personally non-identifiable
information collected through the Site or the provision of the Services to help
us improve the content and functionality of the Site or the Services, to better
understand our users and to improve the Site and the Services. Company and its
affiliates may use this information to contact you in the future to tell you
about services we believe will be of interest to you. If at any time you wish
not to receive any future marketing communications or you wish to have your name
deleted from our mailing lists, please contact us as indicated below.

**SHARING OF INFORMATION WE COLLECT**

Company is not in the business of selling your information. There are, however,
certain circumstances in which we may share your personal data with certain
third parties without further notice to you, as set forth below:

**Agents, Consultants and Third Party Service Providers:**

Company, like many businesses, sometimes hires other companies to perform
certain business-related functions. Examples of such functions include mailing
information, maintaining databases and processing payments. When we employ
another entity to perform a function of this nature, we only provide them with
the information that they need to perform their specific function.

**Business Transfers:**

As we develop our business, we might sell or buy businesses or assets. In the
event of a corporate sale, merger, reorganization, dissolution or similar event,
personal data may be part of the transferred assets.

**Related Companies:**

We may also share your personal data with our corporate affiliates and
subsidiaries, if any, for purposes consistent with this Privacy Policy.

**Legal Requirements:**

Company may disclose your personal data if required to do so by law or in the
good faith belief that such action is necessary to (i) comply with a legal
obligation, (ii) protect and defend the rights or property of Company, (iii) act
in urgent circumstances to protect the personal safety of users of the Site, the
Services or the public, or (iv) protect against legal liability.

**LINKS TO OTHER WEBSITES**

The Site may have links to third-party websites, which may have privacy policies
that differ from our own. We are not responsible for the practices of such
sites, nor does any such link imply that Company endorses or has reviewed the
third-party site subject to such link. We suggest contacting those sites
directly for information on their privacy policies.

**CHILDREN AND MINORS**

Company does not knowingly collect personal data from minors under the age of
18\. If you are under the age of 18, please do not submit any personal data
through the Site. We encourage parents and legal guardians to monitor their
children’s Internet usage and to help enforce our Privacy Policy by instructing
their children never to provide personal data without the parent’s permission.
If you have reason to believe that a minor under the age of 18 has provided
personal data to Company through the Site, please contact us, and we will
endeavor to delete that information from our databases.

**DATA SECURITY**

We have taken certain physical, administrative, and technical steps to safeguard
the information we collect from and about our customers and Site visitors. While
we make reasonable efforts to help ensure the integrity and security of our
network and systems, we cannot guarantee our security measures. Therefore, you
should take special care in deciding what information you send to us via email.
Please keep this in mind when disclosing any personal data to the Company via
the Internet.

**ACCESS TO YOUR PERSONAL INFORMATION**

To keep your personal data accurate, current, and complete, please contact us as
specified below. We will take reasonable steps to update or correct personal
information in our possession that you have previously submitted via the Site.

**INFORMATION FOR CALIFORNIA RESIDENTS**

This section applies to the personal information we may collect from California
residents that use the Site and our compliance with the California Consumer
Privacy Act (“CCPA”).

The CCPA provides California residents that use the Site, subject to
limitations, the right to request more details about the types or specific
pieces of personal information we collect (as described in the “Information We
Collect” section), to delete their personal information, to opt out of any
“sales” that may be occurring, and to not be discriminated against for making
requests protected by the CCPA.

California residents that use the Site may make a request pursuant to their
rights under the CCPA by contacting us at [support@slai.io](mailto:support@slai.io). Please mark your
inquiry “CCPA Request”. We will verify your request using the information you
have provided to us in use of the Site, including email address.
Government-issued identification may be required in order to process your
request. California residents that use the Site can also designate an authorized
agent to exercise these rights on their behalf.

**INFORMATION FOR EEA USERS**

If you live outside of the United States, you understand and agree that we may
transfer your information to the United States. This site is subject to U.S.
laws, which may not afford the same level of protection as those in your
country. If you do not want your information transferred to the United States,
do not use the Site.

***What Rights Do I Have?***

Individuals located in the European Economic Area (EEA) have certain rights in
respect of your NPI, including the right of access, rectification, restriction,
opposition, erasure and data portability. Where possible, we rely on user
consent as a lawful basis for processing personal data and obtain such consent
in compliance with applicable laws. In some cases, Company may process NPI
pursuant to legal obligation or to protect your vital interests or those of
another person.

***How May I Exercise My Individual Rights?***

Company users may access and update their NPI by sending an email to
[support@slai.io](mailto:support@slai.io). Users located within the EEA may contact us with questions or
requests regarding their NPI using the contact information below. Please note
that Company may request additional information from you to verify your identity
before we disclose any personal or account information. We only send marketing
communications to users we believe to be located in the EEA with your prior
consent, and you may opt out of such communications at any time by clicking the
“unsubscribe” link found within Company email updates and changing your contact
preferences. Please note, you will continue to receive essential account-related
information, even if you unsubscribe from promotional emails.

***Who Can I Contact at Company Regarding Data Protection Issues?***

Company has designated a Data Protection Officer to assist with data privacy and
data protection issues. You may contact him or her by emailing
[support@slai.io](mailto:support@slai.io) and addressing your questions or
concerns to the Data Protection Officer.

**IF YOU HAVE QUESTIONS**

If you have any questions about this Privacy Statement or the practices
described herein, you may contact us at [support@slai.io](mailto:support@slai.io).

**CHANGES TO THIS STATEMENT**

Company reserves the right to revise this Privacy Policy at any time. When we
do, we will post the change(s) on the Site. This Privacy Policy was last updated
on the date indicated above. Your continued use of the Site after any changes or
revisions to this Privacy Policy shall indicate your agreement with the terms of
such revised Privacy Policy.


# Terms and Conditions
Source: https://docs.beam.cloud/v2/security/terms-and-conditions

These are the terms the Beam Platform is provided under.

**Date last updated**: January 23, 2025

**IMPORTANT – CAREFULLY READ ALL THE TERMS AND CONDITIONS OF THIS BEAM TERMS OF
SERVICE AND THE BEAM [PRIVACY POLICY](/v2/security/privacy-policy), WHICH IS
INCORPORATED BY REFERENCE, (COLLECTIVELY THE "AGREEMENT"). BY CREATING AN
ACCOUNT TO USE THE BEAM PLATFORM AS A SERVICE ("SERVICE"),
CLICKING "I ACCEPT", OR PROCEEDING WITH THE USE OF THE SERVICE, INDIVIDUALLY,
AND/OR YOU AS AN AUTHORIZED REPRESENTATIVE OF YOUR COMPANY ON WHOSE BEHALF YOU
USE THE SERVICE ("YOU"), ARE INDICATING THAT YOU HAVE READ, UNDERSTOOD AND
ACCEPT THIS AGREEMENT WITH SMARTSHARE, INC., A DELAWARE CORPORATION ("Beam"),
AND THAT YOU AGREE TO BE BOUND BY THE TERMS. YOU AGREE THAT YOU WILL (A) INFORM
ANY EMPLOYEES OR CONTRACTORS AT YOUR COMPANY OF THE POLICIES AND PRACTICES THAT
ARE RELEVANT TO THEIR USE OF THE SERVICES AND OF ANY SETTINGS THAT MAY IMPACT
THE PROCESSING OF THEIR DATA; AND (B) ENSURE THE TRANSFER AND PROCESSING OF ANY
SUCH EMPLOYEE'S OR CONTRACTOR'S DATA UNDER THIS AGREEMENT IS LAWFUL. IF YOU DO
NOT AGREE WITH ALL OF THE TERMS OF THIS AGREEMENT, YOU MAY NOT USE THE SERVICE.
THE EFFECTIVE DATE OF THIS AGREEMENT SHALL BE THE DATE THAT YOU REGISTER TO USE
THE SERVICE**

**PLEASE NOTE: THESE TERMS OF SERVICE CONTAIN AN ARBITRATION CLAUSE AND CLASS
ACTION WAIVER THAT APPLIES TO ALL USERS. If You reside in the United States,
this provision applies to all disputes with Beam. If You reside outside of the
United States, this provision applies to any action You bring against Beam in
the United States. It affects how disputes with Beam are resolved. By accepting
these Terms of Service, You agree to be bound by the arbitration clause and
class action waiver. Please read it carefully.**

## SERVICE

1. Provision of Service. Beam grants You the right to access and use the Service
   in accordance with this Agreement and Your applicable subscription
   ("Subscription") indicated on the order form and/or online checkout. You will
   comply with all user documentation and all laws, rules, and regulations
   applicable to the use of Service.
2. Restrictions on use of the Service. You may not: (i) modify, alter, tamper
   with, repair, or otherwise create derivative works of the Service; (ii)
   reverse engineer, disassemble, or decompile the Service or apply any other
   process or procedure to derive the source code of the Service; (iii) access
   or use the Service in a way intended to avoid incurring fees or exceeding
   usage limits or quotas; (iv) rent, transfer, resell, or sublicense the
   Service; (v) attempt to disable or circumvent any security, billing, or
   monitoring mechanisms used by the Service; (vi) use the Service to perform a
   malicious activity; (vii) upload or otherwise process any malicious content
   to or through the Service; or (viii) benchmark or perform competitive
   analysis on the Service. The specific Subscription You select may have
   limitations as outlined in the applicable Subscription order form and/or
   online checkout.
3. Updates to the Service. Beam may from time to time make updates to the
   Service as it deems reasonably necessary, and this Agreement shall apply to
   such updated Service. Your continued use of the updated Service indicates
   Your acceptance of the updates.
4. Use of the Services may require the use of certain third party products and
   services ("**Third Party Services**"). Use of any Third Party Services is at
   your sole risk and will be governed by separate terms and conditions,
   separate privacy policies relating to usage of data you may share through the
   Third Party Services in the course of using the Services, other applicable
   policies, and may include separate fees and charges. Beam may display content
   from third parties through the Services or may provide information about or
   links to Third Party Services. Your interactions with any such third parties,
   and any terms, conditions, warranties, or representations associated with
   such interactions, are solely between you and the applicable third parties.
   Beam is not responsible or liable for any loss or damage of any sort incurred
   as the result of any such interactions or as the result of the presence of
   such third-party information made available through the Services.

## REGISTRATION; SUBSCRIPTION AND FEES

1. Registration. To register to use the Service, You must provide Beam with the
   information requested in the registration process, including Your name and
   work email address. You are responsible for all activities that occur under
   Your account; Beam and Beam's affiliates are not responsible for unauthorized
   access to Your account. You will contact Beam immediately if You believe an
   unauthorized third party may be using Your account or if Your account
   information is lost or stolen. You will provide complete and accurate
   information during the registration process and will update it to ensure it
   remains accurate.
2. Some parts of the Services are billed on a Subscription basis. You will be
   billed on a recurring and periodic basis ("Billing Cycle") with payment terms
   as set forth on the applicable order form and/or online checkout. Billing
   cycles are set either on calendar month or annual basis, depending on the
   type of Subscription plan You select when purchasing a Subscription. At the
   end of each Billing Cycle, Your Subscription will automatically renew for
   additional successive periods of equal duration to the initial Subscription
   term unless You cancel it before the end of the then current Subscription
   period. If a free trial period applies to You, Your Subscription will be
   charged upon the expiration of any applicable free trial period.
   Subscriptions canceled prior to the expiration of any trial period will not
   be charged. You may cancel Your Subscription renewal by contacting the Beam
   customer support team at [support@slai.io](mailto:support@slai.io), or
   through the account management portal where applicable.
3. A valid payment method is required to process the payment for Your
   Subscription. You shall provide Beam with accurate and complete billing
   information including full name, address, state, zip code, telephone number,
   and valid payment method information. By submitting such payment
   information, You automatically authorize Beam to charge all Subscription fees
   incurred through Your account to any such payment instruments. Should
   automatic billing fail to occur for any reason, Beam will issue an electronic
   invoice indicating that You must proceed manually, within a certain deadline
   date, with the full payment corresponding to the billing period as indicated
   on the invoice.
4. Beam, in its sole discretion and at any time, may modify the Subscription
   fees for the Subscriptions. Any Subscription fee change will become effective
   at the end of the then-current Billing Cycle. Beam will provide You with a
   reasonable prior notice of any change in Subscription fees to give You an
   opportunity to terminate Your Subscription before such change becomes
   effective. Your continued use of the Services after the Subscription fee
   change comes into effect constitutes Your agreement to pay the modified
   Subscription fee amount.
5. Unless otherwise agreed to in the applicable order form, all fees are payable
   in the currency of the United States through our payment processor
   ("Stripe"). You will be responsible for all taxes resulting from the
   performance of the Service other than taxes on Beam's income. If all or any
   part of any payment owed to Beam under this Agreement is withheld, based upon
   a claim that such withholding is required pursuant to the tax laws of any
   country or its political subdivisions and/or any tax treaty between the U.S.
   and any such country, such payment shall be increased by the amount necessary
   to result in a net payment to Beam of the amounts otherwise payable under
   this Agreement. All fees paid or payable under this Agreement are
   non-refundable and Subscriptions are non-cancelable during the Subscription
   term. Beam may change its fees and payment terms at its discretion.
6. Payments through Stripe. In order to make payments to Beam, You may be
   required to provide Your credit card details to Stripe. Payment processing
   services by Stripe are subject to the Stripe Security Policy, found
   [here](https://stripe.com/docs/security/stripe), and the Stripe Privacy
   Policy, found [here](https://stripe.com/privacy), which Stripe may update
   from time to time. As a condition of Beam enabling payment processing
   services through Stripe, You agree to provide Beam accurate and complete
   information about You and Your business, and You authorize Beam to share it
   and transaction information (exclusive of any credit or debit card numbers,
   details or associated passwords) related to Your use of the payment
   processing services provided by Stripe.
7. Communications. You expressly agree that Beam, or its payment processor, is
   permitted to bill You any applicable fees, any applicable tax and any other
   charges You may incur with Beam in connection with Your use of the Service.
   The fees will be billed to the credit card or other payment account You
   provide in accordance with the billing terms in effect at the time the fees
   are due and payable. You acknowledge and agree that Beam will automatically
   charge Your credit card or other payment account on record with Beam. If
   payment is not received or cannot be charged to Your credit card account for
   any reason, Beam reserves the right to either suspend or terminate Your
   access to the Service and terminate this Agreement. By using the Service, You
   consent to receiving electronic communications from Beam. These electronic
   communications may include notices about applicable fees and charges related
   to the Service and transactional or other information concerning or related
   to the Service. These electronic communications are part of Your relationship
   with Beam and You receive them as part of Your use of the Service. You agree
   that any notices, agreements, disclosures or other communications that we
   send You electronically will satisfy any legal communication requirements,
   including that such communications be in writing.
8. Acceptable Use. In addition to the prohibitions set forth in Section 1(b)
   above, You agree not to, and not to allow third parties to use the Service:
   to violate, or encourage the violation of, the legal rights of others (for
   example, infringing or misappropriating the intellectual property rights of
   others in violation of the Digital Millennium Copyright Act); to engage in,
   promote or encourage illegal activity; for any unlawful, invasive,
   infringing, defamatory or fraudulent purpose (for example, this may include
   phishing, creating a pyramid scheme or mirroring a website); to intentionally
   distribute viruses, worms, Trojan horses, corrupted files, hoaxes, or other
   items of a destructive or deceptive nature; to interfere with the use of the
   Service, or the equipment used to provide the Service, by customers,
   authorized resellers, or other authorized users; to disable, interfere with
   or circumvent any aspect of the Service; to generate, distribute, publish or
   facilitate unsolicited mass email, promotions, advertisements or other
   solicitations ("spam"); or to use the Service, or any interfaces provided
   with the Service in a manner that violates the terms of this Agreement. If
   You become aware of any use or content that is in violation of the foregoing
   Acceptable Use restrictions, You agree to promptly remedy such use or
   content. If You fail to do so, Beam or its providers may suspend or disable
   access to the Service (including Your Data) until You comply.

## INTELLECTUAL PROPERTY RIGHTS AND OWNERSHIP

1. Beam Rights. This Agreement does not transfer any right, title or interest
   in any intellectual property right to each other, except as expressly set
   forth in this Agreement. Beam owns all rights, title and interest in and to
   the Service. There are no implied rights. Beam reserves all rights not
   expressly granted herein.
2. We welcome and encourage You to provide feedback, comments and suggestions
   for improvements to the Service ("Feedback"). You may submit Feedback by
   emailing us through the Contact section of the website, or by other means of
   communication. Any Feedback You submit to us will be considered
   non-confidential and non-proprietary to You. By submitting Feedback to us,
   You grant us a non-exclusive, worldwide, royalty-free, irrevocable,
   sub-licensable, perpetual license to use and publish those ideas and
   materials for any purpose, without compensation to You and without the
   obligation to identify You.
3. Your Rights in Your Data. You represent and warrant to Beam that: (1) You or
   Your licensors own all right, title, and interest in and to any and all
   permitted electronic data uploaded and stored by You in the Service ("Your
   Data"); (2) You have all rights in Your Data necessary to grant the rights
   contemplated by this Agreement; and (3) none of Your Data violates this
   Agreement, any applicable law or regulation or any third party's
   intellectual property or other right. For the avoidance of doubt, as between
   Beam and You, You will retain all right, title and interest in all Your Data
   and to all models and analyses created by You or Your authorized personnel
   using the Services.

## YOUR DATA

1. You are solely responsible for the development, content, operation,
   maintenance, and use of Your Data. You will ensure that Your Data, and Your
   use of it, complies with this Agreement and any applicable laws and
   regulations. You are responsible for properly configuring and using the
   Service and taking Your own steps to maintain appropriate security,
   protection and backup of Your Data. You hereby consent that Beam may use Your
   Data, the queries and models You submit to the Service, and metadata about
   Your usage of the Service to measure and improve the Service and support Your
   usage of the Service. If You include any data about any individual in Your
   use of the Service, (1) Beam will hold and store Your Data on Your behalf,
   and You are the data controller of such data; (2) Beam will process personal
   data in compliance with this Section, Your instructions and in accordance
   with Beam's privacy policy; (3) You agree to follow all applicable
   instructions to parameterize Your Data as set forth in the
   Beam [documentation](https://docs.slai.io) ("Documentation"); and (4) You
   warrant that: (a) Your instructions to Beam comply with applicable privacy
   and data protection laws and regulations, (b) You have all appropriate
   consents and an appropriate lawful basis to provide the data to the Service,
   and (c) You have provided proper privacy notifications to individuals as
   required by applicable laws and regulations. If You are located in the
   European Union or will transmit any of Your Data that includes personal data
   regarding a resident of the European Union, You may contact us at [dpa@slai.io](mailto:dpa@slai.io)
   to request a data processing addendum that is pre-signed by Beam and You
   agree that under this Agreement, Beam is merely a data processor. Beam will
   use commercially reasonable efforts designed to prevent the unauthorized
   disclosure or destruction of Your Data stored with Beam in accordance with
   our [Security Policy](/v2/security/terms-and-conditions).

2. HIPAA Data. You agree not to upload to any Service any HIPAA data unless You
   have entered into a BAA with Beam. Unless a BAA is in place, Beam will have no
   liability under this Agreement for HIPAA data, notwithstanding anything to
   the contrary in this Agreement or in HIPAA or any similar federal or state
   laws, rules or regulations. If You are permitted to submit HIPAA data to a
   Service, then You may submit HIPAA data to Beam and/or the Service only by
   uploading it as Customer Data. Upon mutual execution of the BAA, the BAA is
   incorporated by reference into this Agreement and is subject to its terms.

## CONFIDENTIAL INFORMATION

"Confidential Information" means any proprietary information that is marked
"confidential" or "proprietary" or any other similar term or in relation to
which its confidentiality should by its nature be inferred or, if disclosed
orally, is identified as being confidential at the time of disclosure and,
within two (2) weeks thereafter, is summarized, appropriately labeled and
provided in tangible form, received by the other party during, or prior to,
entering into this Agreement including, without limitation, the Service and any
non-public technical and business information. Confidential Information does not
include information that (i) is or becomes generally known to the public through
no fault of or breach of this Agreement by the receiving party; (ii) is
rightfully known by the receiving party at the time of disclosure without an
obligation of confidentiality; (iii) is independently developed by the receiving
party without the use of the disclosing party's Confidential Information; or
(iv) the receiving party rightfully obtains from a third party without
restriction on use or disclosure. You and Beam will maintain the confidentiality
of Confidential Information. The receiving party of any Confidential Information
of the other party agrees not to use such Confidential Information for any
purpose except as necessary to fulfill its obligations and exercise its rights
under this Agreement. The receiving party shall protect the secrecy of and
prevent disclosure and unauthorized use of the disclosing party's Confidential
Information using the same degree of care that it takes to protect its own
confidential information and in no event shall use less than reasonable care.
The receiving party may disclose the Confidential Information of the disclosing
party if required by judicial or administrative process, provided that the
receiving party first provides to the disclosing party prompt notice of such
required disclosure (to the extent allowed) to enable the disclosing party to
seek a protective order. Upon termination or expiration of this Agreement, the
receiving party will destroy (and provide written certification of such
destruction) the disclosing party's Confidential Information.

### TERM; TERMINATION

1. Term; Termination. The term of this Agreement commences when You accept this
   Agreement (such as by creating an account or proceeding with the use of the
   Service) and will remain in effect until terminated in accordance with this
   Agreement. You may terminate this Agreement at any time by canceling Your
   account by contacting us at [support@slai.io](mailto:support@slai.io) or
   through the account management portal where applicable. Beam may terminate
   this Agreement at any time on thirty (30) days advance notice. Beam may also
   terminate Your account and this Agreement, or suspend Your account,
   immediately if (i) Beam changes the way Beam provides or discontinues the
   Service; (ii) Your account was suspended under Section 7 of this Agreement
   and You have not remediated the reason for the suspension; or (iii) Beam
   determines that: (1) Your use of the Service poses a security risk to the
   Service or any third party; (2) Your use of the Service may adversely impact
   other users of the Service; (3) Your use of the Service may subject Beam,
   Beam's affiliates, or any third party to liability; (4) Your use of the
   Service may be fraudulent; (5) You are in breach of this Agreement; or (6)
   You have ceased to operate in the ordinary course, made an assignment for the
   benefit of creditors or similar disposition of Your assets, or become the
   subject of any bankruptcy, reorganization, liquidation, dissolution or
   similar proceeding.
2. Effect of Termination. Upon termination of this Agreement (i) all Your rights
   under this Agreement immediately terminate and You must cease using the
   Service, (ii) You are solely responsible for deleting or retrieving Your Data
   from the Service prior to termination for any reason, and (iii) You must pay
   all unpaid fees to Beam. If either party terminates Your account or this
   Agreement, Beam will provide You with a reasonable opportunity to retrieve
   Your Data from the Service, if You so request. Such a request must be sent by
   email to Beam at [support@slai.io](mailto:support@slai.io) within seven (7)
   days after You receive notice regarding the termination. In any event, Your
   Data will be deleted from the Service no earlier than thirty (30) days after
   the termination notice regarding Your account has been sent to You.
3. You understand and agree that Beam may change, suspend or discontinue any
   part of the Service or the Service as a whole. Beam will notify You of any
   material change to or discontinuation of the Service by email or via Beam's
   website. If Beam discontinues the Service (excluding for Your breach), You
   will receive a pro-rata refund for any pre-paid but unused fees.

### SUSPENSION

Without limiting other available remedies included in this Agreement or
otherwise, Beam may suspend Your access to the Service if You are in
non-compliance with this Agreement.

### WARRANTY AND WARRANTY DISCLAIMER

   1. Beam warrants that the Service will materially conform to the
      specifications set forth in the applicable Documentation for the duration
      of Your Subscription term. If Beam is unable to correct any reported
      non-conformity with this warranty, Beam may terminate the applicable
      Subscription and as Your sole remedy, You will be entitled to receive a
      pro-rata refund of any prepaid but unused Subscription fees. This warranty
      will not apply if the error or non-conformance was caused by misuse of the
      Service, or third-party hardware, software, or services used in connection
      with the Service.
   2. You should regularly back up Your Data while using the Service. Beam
      PROVIDES THE SERVICE ON AN "AS IS" BASIS. Beam DOES NOT MAKE ANY
      WARRANTIES REGARDING THE PERFORMANCE OF THE SERVICE OR UPTIME OF THE
      SERVICE, OR THAT YOUR USE OF THE SERVICES WILL BE SECURE, UNINTERRUPTED OR
      ERROR FREE, WHETHER EXPRESS, IMPLIED, STATUTORY OR OTHERWISE. Beam
      EXPRESSLY DISCLAIMS ALL WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE
      IMPLIED WARRANTIES OF NON-INFRINGEMENT OF THIRD PARTY RIGHTS, TITLE,
      MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Beam HAS NO
      RESPONSIBILITY FOR LOSS OF YOUR DATA OR INABILITY TO USE THE SERVICE FOR
      ANY REASONS, INCLUDING, WITHOUT LIMITATION, IF DUE TO THE ACTS OR
      OMISSIONS OF ITS THIRD PARTY HOSTING PROVIDERS.

### LIMITATION OF LIABILITY

NEITHER SLAI, ITS AFFILIATES OR THEIR LICENSORS ARE LIABLE FOR SPECIAL,
INCIDENTAL, CONSEQUENTIAL OR INDIRECT DAMAGES, INCLUDING WITHOUT LIMITATION,
LOST PROFITS, LOST SAVINGS, OR DAMAGES ARISING FROM LOSS OF USE, LOSS OF
QUERIES, CONTENT OR DATA OR ANY ACTUAL OR ANTICIPATED DAMAGES, REGARDLESS OF THE
LEGAL THEORY ON WHICH SUCH DAMAGES MAY BE BASED, AND EVEN IF BEAM HAS BEEN
ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. SLAI’S AND SLAI’S AFFILIATES' AND
LICENSORS' AGGREGATE LIABILITY FOR ANY PERMITTED DIRECT DAMAGES UNDER THIS
AGREEMENT WILL BE LIMITED TO THE GREATER OF (i) THE AMOUNT OF ONE HUNDRED
DOLLARS; OR (ii) THE FEES THAT YOU HAVE ACTUALLY PAID OR PAYABLE TO BEAM FOR THE
RELEVANT SERVICES WITHIN THE SIX (6) MONTH PERIOD IMMEDIATELY PRECEDING THE
EVENT GIVING RISE TO THE CLAIM FOR DAMAGES. SECTION 9 ON LIMITATION OF LIABILITY
AND SECTION 8 ABOVE ON WARRANTY DISCLAIMER FAIRLY ALLOCATE THE RISKS IN THIS
AGREEMENT. THIS ALLOCATION IS AN ESSENTIAL ELEMENT OF THE BASIS OF THE BARGAIN
BETWEEN THE PARTIES AND THAT THE LIMITATIONS SPECIFIED IN THIS SECTION 9 SHALL
APPLY NOTWITHSTANDING ANY FAILURE OF THE ESSENTIAL PURPOSE OF THESE TERMS OR ANY
LIMITED REMEDY HEREUNDER.

### INDEMNIFICATION

You will, at Beam's option, defend, indemnify, and hold Beam, Beam's affiliates
and licensors, and each of their respective employees, officers, directors, and
representatives harmless from and against any claims, damages, losses,
liabilities, costs, and expenses (including reasonable legal fees) arising out
of or relating to any third party claim concerning: (a) breach of this Agreement
or violation of applicable law or regulation by You; (b) Your Data or the
combination of Your Data with other applications, content or processes,
including any claim involving alleged infringement or misappropriation of
third-party rights by Your Data or by the use, development, design, production,
advertising or marketing of Your Data; or (c) the use of the Services. Beam will
promptly notify You of any claim subject to this Section, but Beam's failure to
promptly notify You will only affect Your obligations to the extent that Beam's
failure prejudices Your ability to defend the claim. You may: (a) use counsel of
Your own choosing (subject to Beam's written consent) to defend against any
claim; and (b) settle the claim as You deem appropriate, provided that You
obtain Beam's prior written consent before entering into any settlement.

### GENERAL

1. 1. Miscellaneous. Beam and You are independent contractors, and neither
      party, nor any of their respective affiliates, is an agent of the other
      for any purpose or has the authority to bind the other. This Agreement
      does not create any third party beneficiary rights in any individual or
      entity that is not a party to this Agreement. You may not assign this
      Agreement, or delegate or sublicense any of Your rights under this
      Agreement, without Beam's prior written consent. Beam may without
      restriction, assign, transfer or delegate this Agreement and any rights
      and obligations hereunder, at its sole discretion, with 30 days prior
      notice. Your right to terminate this Agreement at any time remains
      unaffected. A party's failure to enforce any provision of this Agreement
      will not constitute a present or future waiver of such provision nor
      limit that party's right to enforce such provision at a later time. If
      any portion of this Agreement is held to be invalid or unenforceable,
      the remaining portions of this Agreement will remain in full force and
      effect. In any action or proceeding to enforce rights under this
      Agreement, the prevailing party shall be entitled to recover costs and
      attorneys' fees. 2. Entire Agreement. This Agreement is the entire
      agreement between You and Beam regarding the subject matter of this
      Agreement. This Agreement supersedes all prior or contemporaneous
      representations, understandings, agreements, or communications between
      You and Beam, whether written or verbal, regarding the subject matter of
      this Agreement. 3. Notice. All communications and notices to be made or
      given pursuant to this Agreement must be in English. Beam may provide
      any notice to You under this Agreement by posting a notice in the
      Service or sending a message to the email address associated with Your
      account. You will be deemed to have received any email sent to the email
      address then associated with Your account when Beam sends the email,
      whether or not You actually receive the email. To give Beam notice under
      this Agreement, You must (1) email Beam at [legal@slai.io](mailto:legal@slai.io), or (2) send
      Beam Your notice by certified mail, return receipt requested, to Beam at
      1 Broadway 14th Floor, Cambridge MA 02142, Attn: Smartshare.
   2. Dispute Resolution and Arbitration Agreement and Choice of Law and
      Jurisdiction
   3. This Dispute Resolution and Arbitration Agreement shall apply if You (i)
      reside in the United States; or (ii) do not reside in the United States,
      but bring any claim against Beam in the United States.
   4. AGREEMENT TO ARBITRATE. ANY CONTROVERSY OR CLAIM ARISING OUT OF OR
      RELATING TO THIS AGREEMENT, OR THE BREACH THEREOF, SHALL BE SETTLED BY
      ARBITRATION ADMINISTERED BY THE AMERICAN ARBITRATION ASSOCIATION IN
      ACCORDANCE WITH ITS COMMERCIAL ARBITRATION RULES, AND JUDGMENT ON THE
      AWARD RENDERED BY THE ARBITRATOR MAY BE ENTERED IN ANY COURT HAVING
      JURISDICTION THEREOF. IF THERE IS A DISPUTE ABOUT WHETHER THIS
      ARBITRATION AGREEMENT CAN BE ENFORCED OR APPLIES TO OUR DISPUTE, YOU AND
      BEAM AGREE THAT THE ARBITRATOR WILL DECIDE THAT ISSUE.
   5. 1. Pre-Arbitration Dispute Resolution and Notification. Prior to
         initiating an arbitration, You and Beam each agree to notify the
         other party of the dispute and attempt to negotiate an informal
         resolution to it first. We will contact You at the email address You
         have provided to us; You can contact Beam's customer service team by
         emailing us at the contact addresses provided on the Site. If after a
         good faith effort to negotiate, one of us feels the dispute has not
         and cannot be resolved informally, the party intending to pursue
         arbitration agrees to notify the other party via email prior to
         initiating the arbitration. In order to initiate arbitration, a claim
         must be filed with the AAA and the written Demand for Arbitration
         (available at
         [www.adr.org](http://www.adr.org/cs/idcplg?IdcService=GET%5FFILE\&dDocName=ADRSTAGE2034889\&RevisionSelectionMethod=LatestReleased))
         provided to the other party, as specified in the AAA Rules.
      2. Exceptions to Arbitration Agreement. You and Beam each agree that the
         following claims are exceptions to the Arbitration Agreement and will
         be brought in a judicial proceeding in a court of competent
         jurisdiction: (i) Any claim related to actual or threatened
         infringement, misappropriation or violation of a party's copyrights,
         trademarks, trade secrets, patents, or other intellectual property
         rights; (ii) Any claim seeking emergency injunctive relief based on
         exigent circumstances (e.g., imminent danger or commission of a
         crime, hacking, cyber-attack).
      3. Arbitration Rules and Governing Law. This Arbitration Agreement
         evidences a transaction in interstate commerce and thus the Federal
         Arbitration Act governs the interpretation and enforcement of this
         provision. The arbitration will be administered by AAA in accordance
         with the Commercial Arbitration Rules (the "AAA Rules") then in
         effect, except as modified here. The AAA Rules are available at
         [www.adr.org](http://www.adr.org/) or by calling the AAA at
         1–800–778–7879.
      4. Modification to AAA Rules - Arbitration Hearing/Location. You agree
         that any required arbitration hearing will be conducted in the
         English language by one (1) mutually agreed upon arbitrator, at
         Beam's sole and complete discretion, (a) in Delaware or in any other
         location to which You and Beam both agree; (b) via phone or video
         conference; or (c) for any claim or counterclaim under \$25,000, by
         solely the submission of documents to the arbitrator.
      5. **JURY TRIAL WAIVER. YOU AND BEAM ACKNOWLEDGE AND AGREE THAT WE ARE
         EACH WAIVING THE RIGHT TO A TRIAL BY JURY AS TO ALL ARBITRABLE
         DISPUTES.**
      6. **NO CLASS ACTIONS OR REPRESENTATIVE PROCEEDINGS. YOU AND SLAI
         ACKNOWLEDGE AND AGREE THAT WE ARE EACH WAIVING THE RIGHT TO
         PARTICIPATE AS A PLAINTIFF OR CLASS USER IN ANY PURPORTED CLASS
         ACTION LAWSUIT, CLASS-WIDE ARBITRATION, PRIVATE ATTORNEY-GENERAL
         ACTION, OR ANY OTHER REPRESENTATIVE PROCEEDING AS TO ALL DISPUTES.
         FURTHER, UNLESS YOU AND BEAM BOTH OTHERWISE AGREE IN WRITING, THE
         ARBITRATOR MAY NOT CONSOLIDATE MORE THAN ONE PARTY'S CLAIMS AND MAY
         NOT OTHERWISE PRESIDE OVER ANY FORM OF ANY CLASS OR REPRESENTATIVE
         PROCEEDING. IF THIS PARAGRAPH IS HELD UNENFORCEABLE WITH RESPECT TO
         ANY DISPUTE, THEN THE ENTIRETY OF THE ARBITRATION AGREEMENT WILL BE
         DEEMED VOID WITH RESPECT TO SUCH DISPUTE.**
      7. Severability. Except as provided in the immediately preceding
         paragraph, in the event that any portion of this Arbitration
         Agreement is deemed illegal or unenforceable, such provision shall be
         severed and the remainder of the Arbitration Agreement shall be given
         full force and effect.
      8. Changes. Notwithstanding the provisions of Section 3 ("Modification
         of These Terms"), if Beam changes this Section ("Dispute Resolution
         and Arbitration Agreement") after the date You last accepted these
         Terms (or accepted any subsequent changes to these Terms), You may
         reject any such change by sending us written notice (including by
         email) within thirty (30) days of the date such change became
         effective. By rejecting any change, You are agreeing that You will
         arbitrate any Dispute between You and Beam in accordance with the
         provisions of the "Dispute Resolution and Arbitration Agreement"
         section as of the date You last accepted these Terms (or accepted any
         subsequent changes to these Terms).
      9. Choice of Law; Jurisdiction. If You reside in the United States,
         these Terms will be interpreted in accordance with the laws of the
         State of Delaware and the United States of America, without regard to
         conflict-of-law provisions. Judicial proceedings (other than small
         claims actions) that are excluded from the Arbitration Agreement
         above must be brought in state or federal court in Delaware, unless
         we both agree to some other location. You and we both consent to
         venue and personal jurisdiction in Delaware.
2. Survival. All provisions of this Agreement which by their nature should
   survive termination shall survive termination, including, without
   limitation, accrued payment obligations, ownership provisions, warranty
   disclaimers, indemnity, limitations of liability and dispute resolution.
3. 1. Force Majeure. Beam is not liable for any delay or failure to perform any
      obligation under this Agreement where the delay or failure results from
      any cause beyond Beam's reasonable control, including acts of God, labor
      disputes or other industrial disturbances, systemic electrical,
      telecommunications, or other utility failures, earthquake, storms or
      other elements of nature, blockages, embargoes, riots, acts or orders of
      government, acts of terrorism, or war.
   2. Government Licensees. The Service is a commercial computer software
      program developed solely at private expense. As defined in U.S. Federal
      Acquisition Regulations (FAR) section 2.101 and U.S. Defense Federal
      Acquisition Regulations (DFAR) sections 252.227-7014(a)(1) and
      252.227-7014(a)(5) (or otherwise as applicable to You), the Service
      licensed in this Agreement is deemed to be "commercial items" and
      "commercial computer software" and "commercial computer software
      documentation." Consistent with FAR section 12.212 and DFAR section
      227.7202, (or such other similar provisions as may be applicable to You),
      any use, modification, reproduction, release, performance, display, or
      disclosure of such service commercial item, or service commercial
      computer software, or service commercial documentation by the U.S.
      government (or any agency or contractor thereof) shall be governed solely
      by the terms of this Agreement and shall be prohibited except to the
      extent expressly permitted by the terms of this Agreement.
   3. Changes to the Terms. Beam reserves the right to modify this Agreement at
      any time in accordance with this provision. If we make changes to this
      Agreement, we will post this Agreement on the Beam website. If You
      disagree with the revised Agreement, You may terminate this Agreement
      with immediate effect by following the procedure described in the "Term
      and Termination" section. If You do not terminate Your Agreement before
      the date the revised Agreement becomes effective, Your continued access
      to or use of the Services will constitute acceptance of the revised
      Agreement.


# Amazon Web Services
Source: https://docs.beam.cloud/v2/self-hosting/aws

Learn how to deploy Beam OSS (Beta9) to Amazon EKS.

## Prerequisites

* Amazon EKS
* Karpenter
* Helm and kubectl
* Beta9 CLI

## Dependencies

Beta9 uses an S3-compatible object storage system for its file system. In this example, we'll deploy LocalStack.

<Note>
  Without a LocalStack license, its data is temporary: if its pod is deleted, the data is lost. For anything you need to keep, we recommend AWS S3 or a similar durable object store.
</Note>

```sh theme={null}
helm repo add localstack https://localstack.github.io/helm-charts
helm install localstack localstack/localstack --values=- <<EOF
extraEnvVars:
- name: SERVICES
  value: "s3"
enableStartupScripts: true
startupScriptContent: |
  #!/bin/bash
  awslocal s3 mb s3://juicefs
  awslocal s3 mb s3://logs
persistence:
  enabled: true
  storageClass: local-path
  accessModes:
  - ReadWriteOnce
  size: 50Gi
EOF
```

## Install Helm Chart

Install the helm chart and open connections to the service.

```sh theme={null}
# Step 1: Install the chart
helm install beta9 oci://public.ecr.aws/n4e0e1y0/beta9-chart --version 0.1.166

# Step 2: Confirm the pods are running
kubectl get pods -w

# Step 3: Open ports to the http and grpc services
kubectl port-forward svc/beta9-gateway 1993 1994
```

## Configure the CLI

Create a new config.

```sh theme={null}
./beta9
=> Welcome to Beta9! Let's get started

           ,#@@&&&&&&&&&@&/
        @&&&&&&&&&&&&&&&&&&&&@#
         *@&&&&&&&&&&&&&&&&&&&&&@/
   ##      /&&&&&&&&&&&&&@&&&&&&&&@,
  @&&&&&.    (&&&&&&@/    &&&&&&&&&&/
 &&&&&&&&&@*   %&@.      @& ,@&&&&&&&,
.@&&&&&&&&&&&&#        &&*  ,@&&&&&&&&
*&&&&&&&&&&&@,   %&@/@&*    @&&&&&&&&@
.@&&&&&&&&&*      *&@     .@&&&&&&&&&&
 %&&&&&&&&     /@@*     .@&&&&&&&&&&@,
  &&&&&&&/.#@&&.     .&&&    %&&&&&@,
   /&&&&&&&@%*,,*#@&&(         ,@&&
     /&&&&&&&&&&&&&&,
        #@&&&&&&&&&&,
            ,(&@@&&&,

Context Name [default]:
Gateway Host [0.0.0.0]:
Gateway Port [1993]:
Token:
Added new context!
```

Confirm the config was created and has a token set.

```sh theme={null}
cat ~/.beta9/config.ini
[default]
token = <token should be here>
gateway_host = localhost
gateway_port = 1993
```
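The same check can be scripted. The sketch below is only an illustration: it assumes the file is plain INI with one section per context, as the `cat` output above suggests, and the helper names `read_context` and `has_token` are hypothetical, not part of the Beta9 CLI.

```python
import configparser


def read_context(config_text: str, context: str = "default") -> dict:
    """Parse Beta9 CLI config text and return the settings of one context."""
    # Interpolation is disabled so a token containing '%' is read verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if context not in parser:
        raise KeyError(f"context {context!r} not found in config")
    return dict(parser[context])


def has_token(settings: dict) -> bool:
    """Return True when the context carries a non-empty token."""
    return bool(settings.get("token", "").strip())
```

To run it against the real file, pass the contents of `~/.beta9/config.ini` (for example via `pathlib.Path.home() / ".beta9" / "config.ini"`) to `read_context` and check `has_token` on the result.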

## Setting Configuration Values

Set up your [config file](https://github.com/beam-cloud/beta9/blob/main/pkg/common/config.default.yaml). You will need to set a few values in it and create a secret in your cluster, under the `beta9` namespace.

### Recommended Settings

```yaml theme={null}
gateway:
  externalURL: https://app.example.com

imageService:
  registryStore: s3
  registryCredentialProvider: aws
  registries:
    s3:
      bucketName: <your-image-bucket-name>
      region: <your-aws-region>
      # keys not needed if using iam with k8s service account (irsa)
      accessKey:
      secretKey:
  runner:
    baseImageTag: 0.1.10
    baseImageName: beta9-runner
    baseImageRegistry: public.ecr.aws/n4e0e1y0

worker:
  imageTag: 0.1.143
  imageName: beta9-worker
  imageRegistry: public.ecr.aws/n4e0e1y0
  serviceAccountName: <k8s service account to use - should be able to access juicefs s3 bucket>

storage:
  mode: juicefs
  juicefs:
    awsS3Bucket: <your-juicefs-bucket>
    # keys not needed if using iam with k8s service account (irsa)
    awsAccessKey:
    awsSecretKey:
```

## Mounting Secrets

Once you've filled in the config and created the secret in K8s, two steps remain:

1. Mount the secret to the gateway by modifying the persistence value in the `values.yaml` file.
2. Add an env var to the gateway called `CONFIG_PATH` that points to where you are mounting the secret.

## IAM Policies

To access the S3 bucket referenced in your config/secret, you'll also need to set up an IAM role that a K8s service account can authenticate with.

This is called [EKS IRSA](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html). Once the role exists, add an annotation to the K8s service account that points to the IAM role.

Here is an example in the `values.yaml` file:

```yaml theme={null}
serviceAccount:
  create: true
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<account-id>:role/beta9-role
  name: beta9-role
```
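For the annotation to take effect, the IAM role's trust policy has to allow that service account to assume it through the cluster's OIDC provider. The sketch below builds such a trust policy following the general IRSA pattern from the AWS documentation; the function name is hypothetical, and the account ID, OIDC provider ID, namespace, and service account name are placeholders you would replace with your own values.

```python
import json


def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """Build an IAM trust policy letting one K8s service account assume a role.

    oidc_provider is the cluster's OIDC issuer without the https:// prefix,
    e.g. "oidc.eks.<region>.amazonaws.com/id/<provider-id>".
    """
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    f"{oidc_provider}:aud": "sts.amazonaws.com",
                    f"{oidc_provider}:sub":
                        f"system:serviceaccount:{namespace}:{service_account}",
                }
            },
        }],
    }


def to_json(policy: dict) -> str:
    """Serialize the policy so it can be saved and attached to the role."""
    return json.dumps(policy, indent=2)
```

The role also needs a permissions policy granting access to your S3 buckets; that part depends on your bucket names and is not shown here.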

<Tip>
  We recommend saving secrets with the [External Secrets Operator](https://github.com/external-secrets/external-secrets), but you can also create secrets manually in the cluster.

  To create a secret manually, create your secrets file on disk and run `kubectl apply` like you would normally.
</Tip>
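If you take the manual route, the secrets file is an ordinary Kubernetes `Secret` manifest. As a rough sketch, the helper below wraps your config file's contents in one; the secret name `beta9-config` and the key `config.yaml` are assumptions for illustration, so match them to whatever your `values.yaml` mounts and your `CONFIG_PATH` points to.

```python
import base64
import json


def build_secret_manifest(config_yaml: str, name: str = "beta9-config",
                          namespace: str = "beta9",
                          key: str = "config.yaml") -> dict:
    """Wrap config file contents in a Kubernetes Secret manifest."""
    # Values under "data" must be base64-encoded strings.
    encoded = base64.b64encode(config_yaml.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {key: encoded},
    }


def manifest_json(manifest: dict) -> str:
    """Serialize the manifest; kubectl accepts JSON as well as YAML."""
    return json.dumps(manifest, indent=2)
```

Write the output of `manifest_json` to a file such as `secret.json` and apply it with `kubectl apply -f secret.json`.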

## Gotchas

* Make sure your ingress supports both gRPC and HTTP.
* Your IAM permissions need to be set correctly. You will need to create S3 buckets manually or in Terraform.
* If you are using [Karpenter](https://karpenter.sh/) for your autoscaler, you'll need to add a label to the nodes which you want the Beta9 scheduler to pick up.


# Local Machine
Source: https://docs.beam.cloud/v2/self-hosting/local-machine

Learn how to deploy Beam OSS (Beta9) to your local machine.

## Prerequisites

* Kubernetes
* Helm and kubectl
* Beta9 CLI

## Dependencies

Beta9 uses an S3-compatible object storage system for its file system. In this example, we'll deploy LocalStack.

<Note>
  Without a LocalStack license, its data is temporary: if its pod is deleted, the data is lost. For anything you need to keep, we recommend AWS S3 or a similar durable object store.
</Note>

```sh theme={null}
helm repo add localstack https://localstack.github.io/helm-charts
helm install localstack localstack/localstack --values=- <<EOF
extraEnvVars:
- name: SERVICES
  value: "s3"
enableStartupScripts: true
startupScriptContent: |
  #!/bin/bash
  awslocal s3 mb s3://juicefs
  awslocal s3 mb s3://logs
persistence:
  enabled: true
  storageClass: local-path
  accessModes:
  - ReadWriteOnce
  size: 50Gi
EOF
```

## Install Helm Chart

Install the helm chart and open connections to the service.

```sh theme={null}
# Step 1: Install the chart
helm install beta9 oci://public.ecr.aws/n4e0e1y0/beta9-chart --version 0.1.166

# Step 2: Confirm the pods are running
kubectl get pods -w

# Step 3: Open ports to the http and grpc services
kubectl port-forward svc/beta9-gateway 1993 1994
```
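Before configuring the CLI, it can help to confirm that the port-forward from Step 3 is actually accepting connections. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical, and it checks only TCP reachability on ports 1993 and 1994, not that the gateway itself is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_gateway(host: str = "localhost", ports=(1993, 1994)) -> dict:
    """Map each forwarded port to whether it currently accepts connections."""
    return {port: port_is_open(host, port) for port in ports}
```

Calling `check_gateway()` while `kubectl port-forward` is running should report both ports as reachable; a `False` entry usually means the port-forward has exited or the gateway pod is not ready yet.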

## Configure the CLI

Create a new config.

```sh theme={null}
./beta9
=> Welcome to Beta9! Let's get started

           ,#@@&&&&&&&&&@&/
        @&&&&&&&&&&&&&&&&&&&&@#
         *@&&&&&&&&&&&&&&&&&&&&&@/
   ##      /&&&&&&&&&&&&&@&&&&&&&&@,
  @&&&&&.    (&&&&&&@/    &&&&&&&&&&/
 &&&&&&&&&@*   %&@.      @& ,@&&&&&&&,
.@&&&&&&&&&&&&#        &&*  ,@&&&&&&&&
*&&&&&&&&&&&@,   %&@/@&*    @&&&&&&&&@
.@&&&&&&&&&*      *&@     .@&&&&&&&&&&
 %&&&&&&&&     /@@*     .@&&&&&&&&&&@,
  &&&&&&&/.#@&&.     .&&&    %&&&&&@,
   /&&&&&&&@%*,,*#@&&(         ,@&&
     /&&&&&&&&&&&&&&,
        #@&&&&&&&&&&,
            ,(&@@&&&,

Context Name [default]:
Gateway Host [0.0.0.0]:
Gateway Port [1993]:
Token:
Added new context!
```

Confirm the config was created and has a token set.

```sh theme={null}
cat ~/.beta9/config.ini
[default]
token = <token should be here>
gateway_host = localhost
gateway_port = 1993
```


# Overview
Source: https://docs.beam.cloud/v2/self-hosting/overview

Beta9 is the open-source project that powers Beam.

## Beam vs. Beta9

**Beam and Beta9 have similar functionality.**

You can switch between the two by changing the SDK imports and CLI commands used:

|              | [beam.cloud](https://beam.cloud) | [Beta9](https://github.com/beam-cloud/beta9/) |
| ------------ | -------------------------------- | --------------------------------------------- |
| Installation | `uv tool install beam-client`    | `pip install beta9`                           |
| Imports      | `from beam import endpoint`      | `from beta9 import endpoint`                  |
| CLI Commands | `beam serve app.py:function`     | `beta9 serve app.py:function`                 |
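
Because only the package name differs, code that has to run against either product can pick the SDK at import time. The snippet below is a sketch of that idea, not an official pattern: `load_sdk` is a hypothetical helper, and it assumes the decorators you use (such as `endpoint`) behave the same in both packages.

```python
import importlib


def load_sdk(candidates=("beam", "beta9")):
    """Return the first importable SDK module from candidates, or None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not installed in this environment; try the next candidate.
            continue
    return None
```

With one of the packages installed, `sdk = load_sdk()` followed by `endpoint = sdk.endpoint` gives you the same decorator name as in the table, regardless of which product is in use.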

## Self-Hosting Beta9

<CardGroup>
  <Card title="Self-Host on AWS" icon="aws" href="/v2/self-hosting/aws" />

  <Card title="Self-Host Locally" icon="computer" href="/v2/self-hosting/local-machine" />
</CardGroup>

## Architecture

<Frame>
  <img />
</Frame>

## Contributor Guide

We welcome contributions, big or small! These are the most helpful things for us:

* Rank features in our roadmap
* Open a PR
* Submit a [feature request](https://github.com/beam-cloud/beta9/issues/new?assignees=\&labels=\&projects=\&template=feature-request.md\&title=) or [bug report](https://github.com/beam-cloud/beta9/issues/new?assignees=\&labels=\&projects=\&template=bug-report.md\&title=)


# Querying Task Status
Source: https://docs.beam.cloud/v2/task-queue/query-status


You can check the status of any task by querying the `task` API:

```sh theme={null}
https://api.beam.cloud/v2/task/{TASK_ID}/
```

## Task Statuses

The response payload includes the task's status. These are the possible statuses for a task:

| Status      | Description                                                                                                                                                                               |
| ----------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `PENDING`   | The task is enqueued and has not started yet.                                                                                                                                             |
| `RUNNING`   | The task is running.                                                                                                                                                                      |
| `COMPLETE`  | The task completed without any errors.                                                                                                                                                    |
| `RETRY`     | The task is being retried. The retry limit defaults to 3 unless `max_retries` is set in the function decorator.                                                                           |
| `CANCELLED` | The task was cancelled by the client.                                                                                                                                                     |
| `TIMEOUT`   | The task timed out, based on the `timeout` provided in the function decorator.                                                                                                            |
| `EXPIRED`   | The task remained in the queue and was never picked up by a worker. **For endpoints, this usually occurs when the task does not start running before the request timeout (180 seconds).** |
| `FAILED`    | The task did not complete successfully.                                                                                                                                                   |

### Request

```sh theme={null}
curl -X GET \
  'https://api.beam.cloud/v2/task/{TASK_ID}/' \
  -H 'Authorization: Bearer [YOUR_AUTH_TOKEN]' \
  -H 'Content-Type: application/json'
```
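
The same request can be issued from Python and repeated until the task finishes. This is a minimal polling sketch using only the standard library; treating `COMPLETE`, `CANCELLED`, `TIMEOUT`, `EXPIRED`, and `FAILED` as final statuses is an assumption based on the descriptions in the table above, and the function names, polling interval, and poll limit are illustrative choices.

```python
import json
import time
import urllib.request

API_BASE = "https://api.beam.cloud/v2/task"

# Assumption: these statuses are final; PENDING, RUNNING and RETRY are not.
TERMINAL_STATUSES = {"COMPLETE", "CANCELLED", "TIMEOUT", "EXPIRED", "FAILED"}


def fetch_task(task_id: str, token: str) -> dict:
    """GET the task record with the same URL and headers as the curl request."""
    request = urllib.request.Request(
        f"{API_BASE}/{task_id}/",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_task(task_id, token, fetch=fetch_task, interval=2.0,
                  max_polls=150, sleep=time.sleep):
    """Poll until the task reaches a terminal status or max_polls is used up."""
    for _ in range(max_polls):
        task = fetch(task_id, token)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        sleep(interval)
    raise TimeoutError(f"task {task_id} not finished after {max_polls} polls")
```

The `fetch` and `sleep` parameters exist so the loop can be exercised without network access; in normal use you would call `wait_for_task(task_id, token)` and inspect the returned `status`.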

### Response

The response to `/task` returns the following data:

| Field                     | Type    | Description                                                                                                 |
| ------------------------- | ------- | ----------------------------------------------------------------------------------------------------------- |
| `id`                      | string  | The unique identifier of the task.                                                                          |
| `started_at`              | string  | The timestamp when the task started, in ISO 8601 format. Null if the task hasn't started yet.               |
| `ended_at`                | string  | The timestamp when the task ended, in ISO 8601 format. Null if the task is still running or hasn't started. |
| `status`                  | string  | The current status of the task (e.g., COMPLETE, RUNNING, etc.).                                             |
| `container_id`            | string  | The identifier of the container running the task.                                                           |
| `updated_at`              | string  | The timestamp when the task was last updated, in ISO 8601 format.                                           |
| `created_at`              | string  | The timestamp when the task was created, in ISO 8601 format.                                                |
| `outputs`                 | array   | An array containing the outputs of the task.                                                                |
| `stats`                   | object  | An object containing statistics about the task's execution environment.                                     |
| `stats.active_containers` | integer | The number of active containers for the task.                                                               |
| `stats.queue_depth`       | integer | The depth of the queue for the deployment.                                                                  |
| `stub`                    | object  | An object containing detailed information about the task's configuration and deployment.                    |
| `stub.id`                 | string  | The identifier of the deployment stub.                                                                      |
| `stub.name`               | string  | The name of the deployment stub.                                                                            |
| `stub.type`               | string  | The type of the deployment stub.                                                                            |
| `stub.config`             | string  | The configuration details of the deployment stub in JSON format.                                            |
| `stub.config_version`     | integer | The version number of the deployment stub configuration.                                                    |
| `stub.object_id`          | integer | The object identifier associated with the deployment stub.                                                  |
| `stub.created_at`         | string  | The timestamp when the deployment stub was created, in ISO 8601 format.                                     |
| `stub.updated_at`         | string  | The timestamp when the deployment stub was last updated, in ISO 8601 format.                                |

Here's what the response payload looks like as JSON:

```json theme={null}
{
  "id": "c5f01c46-4eb3-4021-9d5f-eae9a08c4aad",
  "started_at": "2025-05-22T22:49:03.839612Z",
  "ended_at": "2025-05-22T22:49:03.913964Z",
  "status": "COMPLETE",
  "container_id": "taskqueue-da2e6878-e202-40d4-9b7a-21706f3a2b13-c23f1166",
  "updated_at": "2025-05-22T22:49:03.915891Z",
  "created_at": "2025-05-22T22:49:03.832363Z",
  "outputs": [],
  "stats": {
    "active_containers": 1,
    "queue_depth": 0
  },
  "stub": {
    "id": "da2e6878-e202-40d4-9b7a-21706f3a2b13",
    "name": "taskqueue/serve/app:handler",
    "type": "taskqueue/serve",
    "config": {
      "runtime": {
        "cpu": 1000,
        "gpu": "",
        "gpu_count": 1,
        "memory": 128,
        "image_id": "d055bc4ee4ad0e61",
        "gpus": ["A10G"]
      },
      "handler": "app:handler",
      "on_start": "",
      "on_deploy": "",
      "on_deploy_stub_id": "",
      "python_version": "python3",
      "keep_warm_seconds": 10,
      "max_pending_tasks": 100,
      "callback_url": "",
      "task_policy": {
        "max_retries": 3,
        "timeout": 3600,
        "expires": "0001-01-01T00:00:00Z",
        "ttl": 7200
      },
      "workers": 1,
      "concurrent_requests": 1,
      "authorized": true,
      "volumes": null,
      "autoscaler": {
        "type": "queue_depth",
        "max_containers": 1,
        "tasks_per_container": 1,
        "min_containers": 0
      },
      "extra": {},
      "checkpoint_enabled": false,
      "work_dir": "",
      "entry_point": null,
      "ports": null
    },
    "config_version": 0,
    "created_at": "2025-05-22T22:48:57.156033Z",
    "updated_at": "2025-05-22T22:48:57.156033Z"
  },
  "deployment": {
    "name": null,
    "version": null
  }
}
```
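
As a rough illustration of how these fields fit together, the sketch below parses a trimmed task payload, checks whether the task has finished, and derives the run time from `started_at` and `ended_at`. The helper names and the `TERMINAL_STATUSES` set are assumptions made for this example; they are not part of the Beam SDK, and the set only covers statuses named on this page.

```python
import json
from datetime import datetime
from typing import Optional

# Assumption: statuses after which a task no longer changes state.
TERMINAL_STATUSES = {"COMPLETE", "CANCELLED", "TIMEOUT", "EXPIRED", "FAILED"}


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp such as 2025-05-22T22:49:03.839612Z."""
    if value is None:
        return None
    # Older Python versions reject a trailing "Z", so normalise it first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def run_seconds(task: dict) -> Optional[float]:
    """Return how long the task ran, or None if it has not finished."""
    started = parse_timestamp(task.get("started_at"))
    ended = parse_timestamp(task.get("ended_at"))
    if started is None or ended is None:
        return None
    return (ended - started).total_seconds()


def is_finished(task: dict) -> bool:
    """Return True once the task has reached a terminal status."""
    return task.get("status") in TERMINAL_STATUSES


payload = json.loads(
    """
    {
      "id": "c5f01c46-4eb3-4021-9d5f-eae9a08c4aad",
      "started_at": "2025-05-22T22:49:03.839612Z",
      "ended_at": "2025-05-22T22:49:03.913964Z",
      "status": "COMPLETE"
    }
    """
)
print(is_finished(payload), run_seconds(payload))
```

Because `started_at` and `ended_at` can be null, `run_seconds` returns `None` rather than raising when a task is still queued or running.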

## Cancelling Tasks

Tasks can be cancelled through the `api.beam.cloud/v2/task/cancel/` endpoint.

### Request

```bash theme={null}
curl -X DELETE --compressed 'https://api.beam.cloud/v2/task/cancel/' \
  -H 'Authorization: Bearer [YOUR_TOKEN]' \
  -H 'Content-Type: application/json' \
  -d '{"task_ids": ["TASK_ID"]}'
```

This API accepts a list of tasks, which can be passed in like this:

```json theme={null}
{
  "task_ids": [
    "70101e46-269c-496b-bc8b-1f7ceeee2cce",
    "81bdd7a3-3622-4ee0-8024-733227d511cd",
    "7679fb12-94bb-4619-9bc5-3bd9c4811dca"
  ]
}
```

### Response

`200`

```json theme={null}
{}
```
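
The same request can be built from Python. The sketch below uses only the standard library and stops short of sending anything; `build_cancel_request` is a hypothetical helper name, and `YOUR_TOKEN` is a placeholder.

```python
import json
import urllib.request

CANCEL_URL = "https://api.beam.cloud/v2/task/cancel/"


def build_cancel_request(task_ids, token):
    """Build (but do not send) the DELETE request that cancels the given tasks."""
    body = json.dumps({"task_ids": list(task_ids)}).encode("utf-8")
    return urllib.request.Request(
        CANCEL_URL,
        data=body,
        method="DELETE",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_cancel_request(
    [
        "70101e46-269c-496b-bc8b-1f7ceeee2cce",
        "81bdd7a3-3622-4ee0-8024-733227d511cd",
    ],
    token="YOUR_TOKEN",
)
print(request.get_method(), request.full_url)

# To actually send it:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```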


# Running Async Tasks
Source: https://docs.beam.cloud/v2/task-queue/running-tasks


### What Are Task Queues?

Task Queues are great for deploying resource-intensive functions on Beam.

Instead of processing tasks immediately, the task queue enables you to add tasks to a queue and process them later, either sequentially or concurrently.

### Creating a Task Queue

You can run any function as a task queue by using the `task_queue` decorator:

```python theme={null}
from beam import task_queue, Output


@task_queue(cpu=1.0, memory=128)
def handler():
    result = 839 * 18

    # Save the result to a text file
    file_name = "result.txt"
    with open(file_name, "w") as f:
        f.write(f"The result is: {result}")

    # Upload task result to Beam to retrieve later
    Output(path=file_name).save()
```

You’ll be able to access the `result.txt` file when the task completes.

<Tip>
  **Endpoints vs. Task Queues**

  Endpoints are RESTful APIs, designed for synchronous tasks that can complete in 180 seconds or less. For longer running tasks, you'll want to use an async [`task_queue`](/v2/task-queue/running-tasks) instead.
</Tip>

### Sending Async Requests

Because task queues run asynchronously, the API will return a Task ID.

**Example Request**

```bash Request theme={null}
  curl -X POST "https://9655d778-58c2-4c5d-8c55-03735b63607e.app.beam.cloud" \
   -H 'Authorization: Bearer [YOUR_AUTH_TOKEN]' \
   -H 'Content-Type: application/json' \
   -d '{}'
```

**Example Response**

```json Response theme={null}
{ "task_id": "edbcf7ff-e8ce-4199-8661-8e15ed880481" }
```

### Viewing Task Responses

Because `task_queue` is async, you will need to make a separate API call to retrieve the task output.

### Saving and Returning Output Files

You can save files using Beam's [Output](/v2/reference/py-sdk#output) class.

The code below saves a file, wraps it in an `Output`, and generates a URL that can be retrieved later:

```python app.py theme={null}
from beam import task_queue, Output


@task_queue(
    cpu=1.0,
    memory=128,
    gpu="A10G",
    callback_url="https://webhook.site/9b74f73d-9ec1-4c8e-adcc-07c78aafab6d",
)
def handler():
    result = 839 * 18

    # Create a new text file with the result
    file_name = "sum.txt"

    # Write to new text file
    with open(file_name, "w") as f:
        f.write(f"The result is: {result}")

    # Save output
    output_file = Output(path=file_name)
    # Uploads the file to Beam storage
    output_file.save()
```

### Retrieving Results

There are two ways to retrieve response payloads:

1. Receive a webhook request from Beam at the [`callback_url`](/v2/topics/callbacks) set in your decorator
2. Save an `Output` and call the `/task` API

#### Webhooks

If you've added a [`callback_url`](/v2/topics/callbacks) to your decorator, Beam will fire a webhook to your server with the task response when it completes:

```json theme={null}
{
  "data": {
    "url": "https://app.beam.cloud/output/id/00894876-38df-42c8-a098-879db17e1bf8"
  }
}
```

<Tip>
  For testing purposes, you can set up a temporary webhook URL using
  [https://webhook.site](https://webhook.site)
</Tip>

#### Polling for Results

`Output` payloads can be retrieved by polling the `task` API:

```bash theme={null}
curl -X GET \
  'https://api.beam.cloud/v2/task/{TASK_ID}/' \
  -H 'Authorization: Bearer [YOUR_AUTH_TOKEN]' \
  -H 'Content-Type: application/json'
```

Your Output will be available in the `outputs` list in the response:

```json theme={null}
{
  "id": "828a5f6b-0852-44cb-97dc-3c2105b745d3",
  "started_at": "2025-05-22T23:19:58.995396Z",
  "ended_at": "2025-05-22T23:19:59.061813Z",
  "status": "COMPLETE",
  "container_id": "taskqueue-2365b036-39df-408f-946f-b25025d1251a-bf09bf62",
  "updated_at": "2025-05-22T23:19:59.063168Z",
  "created_at": "2025-05-22T23:19:58.950594Z",
  "outputs": [
    {
      "name": "sum.txt",
      "url": "https://app.beam.cloud/output/id/c339b459-34de-4f0c-adb9-8be7c20951ce",
      "expires_in": 3600
    }
  ],
  "stats": {
    "active_containers": 1,
    "queue_depth": 0
  }
}
```
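
A polling loop around this call might look like the sketch below. It is a minimal illustration, not part of the Beam SDK: `wait_for_task`, `output_urls`, and the `TERMINAL_STATUSES` set are assumptions, and the HTTP call is replaced by a stand-in function so the example runs without a network connection.

```python
import time
from typing import Callable

# Assumption: statuses after which polling can stop.
TERMINAL_STATUSES = {"COMPLETE", "CANCELLED", "TIMEOUT", "EXPIRED", "FAILED"}


def wait_for_task(
    fetch_task: Callable[[str], dict],
    task_id: str,
    interval: float = 2.0,
    max_attempts: int = 30,
) -> dict:
    """Poll fetch_task(task_id) until the task reaches a terminal status."""
    for attempt in range(max_attempts):
        task = fetch_task(task_id)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if attempt < max_attempts - 1:
            time.sleep(interval)
    raise TimeoutError(f"Task {task_id} did not finish after {max_attempts} polls")


def output_urls(task: dict) -> list:
    """Collect the download URLs from the task's outputs list."""
    return [output["url"] for output in task.get("outputs") or []]


# Stand-in for the HTTP call: reports RUNNING twice, then COMPLETE.
_responses = iter(
    [
        {"status": "RUNNING", "outputs": []},
        {"status": "RUNNING", "outputs": []},
        {
            "status": "COMPLETE",
            "outputs": [
                {
                    "name": "sum.txt",
                    "url": "https://app.beam.cloud/output/id/c339b459-34de-4f0c-adb9-8be7c20951ce",
                    "expires_in": 3600,
                }
            ],
        },
    ]
)


def fake_fetch(task_id: str) -> dict:
    return next(_responses)


finished = wait_for_task(fake_fetch, "828a5f6b-0852-44cb-97dc-3c2105b745d3", interval=0)
print(output_urls(finished))
```

In a real client, `fetch_task` would issue the `GET` request shown above and return the decoded JSON body. Passing it in as a function keeps the loop easy to test.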

### Retry Behavior

Task Queues include a built-in retry system. If a task fails for any reason,
such as an out-of-memory error or an application exception, it will be
retried three times before automatically moving to a failed state.

### Programmatically Enqueuing Tasks

You can interact with the task queue either through an API (when deployed), or directly in Python through the `.put()` method.

<Tip>
  This is useful for queueing tasks programmatically without exposing an
  endpoint.
</Tip>

```python app.py theme={null}
from beam import task_queue, Image


@task_queue(
    cpu=1.0,
    memory=128,
    gpu="T4",
    image=Image(python_packages=["torch"]),
    keep_warm_seconds=1000,
)
def multiply(x):
    result = x * 2
    return {"result": result}

# Manually insert task into the queue
multiply.put(x=10)
```

If invoked directly from your local computer, the code above will produce this output:

```
$ python app.py

=> Building image
=> Using cached image
=> Syncing files
=> Files synced

Enqueued task: f0d205da-e74b-47ba-b7c3-8e1b9a3c0669
```


# Task Callbacks
Source: https://docs.beam.cloud/v2/topics/callbacks

Set up a callback to your server when a task finishes running

If you supply a `callback_url` argument to your function decorator, Beam will make a POST request to your server whenever a task finishes running. *Callbacks fire for both successful and failed tasks.*

Callbacks include the Beam Task ID in the request headers, and the task response URL-encoded in the request body.

<Tip>
  For testing purposes, you can set up a temporary webhook URL using
  [https://webhook.site](https://webhook.site)
</Tip>

## Registering a Callback URL

Callbacks can be added onto endpoints, functions, and task queues:

```python theme={null}
from beam import function


@function(callback_url="https://your-server.io")
def handler(x):
    return {"result": x}

if __name__ == "__main__":
    handler.remote(x=10)
```

## Callback format

### Data Payload

The callback will send the response from your function as JSON, in the `data` field:

```json theme={null}
{
  "data": {
    "result": 10
  }
}
```

## Request headers

The request headers include the following fields:

* `x-task-timestamp` -- timestamp of when the task was created.
* `x-task-signature` -- signature to verify that the request was sent from Beam.
* `x-task-status` -- status of the task.
* `x-task-id` -- the task ID.

## Request Level Callbacks

There are cases where you might want to define a different `callback_url` for each request, for example if you have different environments for staging and prod.

You can pass `callback_url` as a payload to anything you're running on Beam, and we'll use that as the callback for the request:

```sh theme={null}
curl -X POST \
  --compressed 'https://multiply-712408b-v1.app.beam.cloud' \
  -H 'Authorization: Bearer [YOUR_AUTH_TOKEN]' \
  -H 'Content-Type: application/json' \
  -d '{"callback_url": "https://webhook.site/341d3777-cdd0-4c7e-82cb-dcc06ea4f774"}'
```

<Warning>
  When using request-level callbacks, you must include either the `callback_url` value or kwargs (`**inputs`) as input to the handler function:

  ```python theme={null}
  from beam import endpoint


  @endpoint()
  def handler(callback_url): # Make sure to pass this value!
      return {"response": "true"}

  @endpoint()
  def handler(**inputs): # You can use kwargs too
      return {"response": "true"}
  ```
</Warning>

## Verifying Requests

### Timestamp Verification

To secure your server against replay attacks, a **timestamp** and **signature** are included in the callback request headers.

As a best practice, check the timestamp header of each callback request. If the timestamp is more than 5 seconds old, there is a risk that the callback was not fired from Beam.
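
A freshness check along these lines can be sketched as follows. It assumes the `x-task-timestamp` header carries Unix epoch seconds, which this page does not state explicitly; `is_fresh` and `MAX_AGE_SECONDS` are names made up for the example.

```python
import time
from typing import Optional

MAX_AGE_SECONDS = 5  # threshold suggested above


def is_fresh(
    timestamp_header: Optional[str],
    now: Optional[float] = None,
    max_age: float = MAX_AGE_SECONDS,
) -> bool:
    """Return True if the callback timestamp is within max_age seconds of now."""
    try:
        sent_at = float(timestamp_header)
    except (TypeError, ValueError):
        return False  # missing or malformed header
    current = time.time() if now is None else now
    # abs() also rejects timestamps that claim to come from the far future.
    return abs(current - sent_at) <= max_age


print(is_fresh("1700000000", now=1700000003))  # True
print(is_fresh("1700000000", now=1700000010))  # False
```

If your server clock can drift, you may want a slightly larger `max_age` than 5 seconds.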

### Signature Verification

The most secure way of verifying a callback request is through **signature verification**.

Your Signature Token can be found in the dashboard, on the `Settings` -> `General` page.

<Frame>
  <img />
</Frame>

#### Validating a Signature

The callback request will include a header field called `x-task-signature`.

`x-task-signature` is a unique signature generated by converting the request body to base64, concatenating it with the timestamp, and signing it with your Beam **signature token**.

The code below shows how to validate a callback signature:

```python theme={null}
import base64
import hashlib
import hmac


def verify_signature(
    request_body: bytes, secret_key: str, timestamp: int, signature: str
) -> bool:
    # Encode request body to Base64
    base64_payload = base64.b64encode(request_body).decode()

    # Create data to sign by concatenating base64 payload with timestamp
    data_to_sign = f"{base64_payload}:{timestamp}"

    # Initialize HMAC with SHA256 and secret key
    h = hmac.new(secret_key.encode(), data_to_sign.encode(), hashlib.sha256)

    # Compute the HMAC signature
    computed_signature = h.hexdigest()

    # Compare in constant time; returns False if the signatures differ
    return hmac.compare_digest(signature, computed_signature)
```
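
To exercise the verification logic locally, you can generate a signature yourself and check that it round-trips. The sketch below follows the scheme described above (Base64 body, `:`, timestamp, HMAC-SHA256); `sign_payload` and `is_valid_signature` are hypothetical helpers, and the secret and timestamp are placeholders rather than real values.

```python
import base64
import hashlib
import hmac


def sign_payload(request_body: bytes, secret_key: str, timestamp: int) -> str:
    """Compute the hex signature for a body/timestamp pair."""
    base64_payload = base64.b64encode(request_body).decode()
    data_to_sign = f"{base64_payload}:{timestamp}"
    return hmac.new(
        secret_key.encode(), data_to_sign.encode(), hashlib.sha256
    ).hexdigest()


def is_valid_signature(
    request_body: bytes, secret_key: str, timestamp: int, signature: str
) -> bool:
    """Compare signatures in constant time instead of with ==."""
    expected = sign_payload(request_body, secret_key, timestamp)
    return hmac.compare_digest(expected, signature)


# Placeholder values for a local test.
body = b'{"data": {"result": 10}}'
secret = "test-signature-token"
timestamp = 1700000000

signature = sign_payload(body, secret, timestamp)
print(is_valid_signature(body, secret, timestamp, signature))  # True
print(is_valid_signature(b'{"data": {}}', secret, timestamp, signature))  # False
```

Any change to the body, the timestamp, or the token produces a different signature, so a tampered or replayed request with an altered timestamp fails the check.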


# Integrate into CI/CD
Source: https://docs.beam.cloud/v2/topics/ci

You can integrate Beam into an existing CI/CD process to deploy your code automatically.

## Automated Deploys

It's fairly straightforward to set up automation for deploying your code to Beam. At a high level, the following steps are all you need:

```sh theme={null}
pip3 install --upgrade pip
pip3 install beam-client
beam configure default --token $BEAM_TOKEN
beam deploy file.py:function
```

## Example: GitHub Actions

You can set up a GitHub workflow to deploy your code whenever a new commit is made to your Git repo.

### Setup Environment Variables

First, add your `BEAM_TOKEN` to your [GitHub Secrets](https://docs.github.com/en/actions/security-guides/using-secrets-in-github-actions#creating-secrets-for-a-repository):

<Frame>
  <img />
</Frame>

<Frame>
  <img />
</Frame>

### Create Actions file

<Info>
  For a detailed walk-through of this step, [GitHub's
  documentation](https://docs.github.com/en/actions/quickstart) is the best
  resource.
</Info>

1. Create a directory called `.github/workflows` in your project.
2. In the `.github/workflows` directory, create a file named `beam-actions.yml`.

### Deploying to Different Environments

You might want to set up separate Beam apps for your `staging` and `prod` environments.

In your Beam app, you can set your app name to update dynamically based on the GitHub branch you've deployed to. `BEAM_DEPLOY_ENV` is set in the GitHub Actions script, based on the branch name:

```python app.py theme={null}
from beam import endpoint
import os

@endpoint(name=f'app-{os.getenv("BEAM_DEPLOY_ENV", "staging")}')
def handler():
  return {}
```

If you push to the `main` branch, the app `app-prod` will be deployed. If you push to the `staging` branch, `app-staging` will be deployed. You can customize this with your own branch names.
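
The branch-to-app mapping can be expressed in a few lines of Python, which is handy if you want to check the naming logic before wiring it into CI. This is only a sketch: `BRANCH_ENVIRONMENTS`, `deploy_env_for_ref`, and `app_name` are hypothetical names, and the mapping mirrors the `main` and `staging` branches used in this example.

```python
import os
from typing import Mapping, Optional

# Assumption: branch-to-environment mapping used in this example.
BRANCH_ENVIRONMENTS = {"main": "prod", "staging": "staging"}


def deploy_env_for_ref(git_ref: str) -> Optional[str]:
    """Map a Git ref such as refs/heads/main to a deploy environment."""
    prefix = "refs/heads/"
    if not git_ref.startswith(prefix):
        return None
    return BRANCH_ENVIRONMENTS.get(git_ref[len(prefix):])


def app_name(environ: Mapping[str, str] = os.environ) -> str:
    """Build the app name the same way the decorator above does."""
    return f'app-{environ.get("BEAM_DEPLOY_ENV", "staging")}'


print(deploy_env_for_ref("refs/heads/main"))   # prod
print(app_name({"BEAM_DEPLOY_ENV": "prod"}))   # app-prod
print(app_name({}))                            # app-staging
```

To support more branches, add entries to the mapping and list the same branches under `on.push.branches` in the workflow file.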

Here's what the GitHub Action looks like. Make sure you've added a `BEAM_TOKEN` to your [GitHub Secrets](https://docs.github.com/en/actions/security-guides/using-secrets-in-github-actions#creating-secrets-for-a-repository):

```yaml beam-actions.yml theme={null}
name: Deploy to Beam

on:
  push:
    branches:
      - main
      - staging

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Set environment variables
        run: |
          if [[ "${{ github.ref }}" == 'refs/heads/main' ]]; then
            echo "Setting environment variables: PROD"
            echo "BEAM_DEPLOY_ENV=prod" >> $GITHUB_ENV
          elif [[ "${{ github.ref }}" == 'refs/heads/staging' ]]; then
            echo "Setting environment variables: STAGING"
            echo "BEAM_DEPLOY_ENV=staging" >> $GITHUB_ENV
          fi

      - name: Authenticate and deploy to Beam
        env:
          BEAM_TOKEN: ${{ secrets.BEAM_TOKEN }}
        run: |
          pip3 install --upgrade pip
          pip3 install beam-client
          pip3 install fastapi

          # Avoid echoing the token so it never ends up in the build logs
          beam configure default --token "$BEAM_TOKEN"
          beam deploy app.py:handler
```

When you push to either `main` or `staging`, a new app will be deployed for each push:

<Frame>
  <img />
</Frame>


# Cold Start Performance
Source: https://docs.beam.cloud/v2/topics/cold-start


This page covers a list of optimizations to make your containers boot up as fast as possible.

# Cold Start Optimizations

## Cache Models in Volumes

To avoid downloading your models from the internet on each request, you can cache them in Beam's Volumes.

In the example below, the models are saved to the Volume by passing the `cache_dir` argument to the Hugging Face Transformers `from_pretrained` method:

```python theme={null}
from beam import Image, endpoint, Volume

# Path to cache model weights
CACHE_PATH = "./weights"

@endpoint(
    volumes=[Volume(name="weights", mount_path=CACHE_PATH)],
    cpu=1,
    memory="16Gi",
    gpu="T4",
    image=Image(
        python_version="python3.9",
        python_packages=[
            "transformers",
            "torch",
        ],
    ),
)
def predict():
    from transformers import AutoTokenizer, OPTForCausalLM
    import torch

    model = OPTForCausalLM.from_pretrained("facebook/opt-125m", cache_dir=CACHE_PATH)

    # Run inference
    model.generate(...)
    return {"text": ""}
```

Alternatively, if you want to use Transformers' pipeline abstraction, you can pass the `cache_dir` argument to the underlying models using the `model_kwargs` argument of the pipeline:

```python theme={null}
from beam import Image, endpoint, Volume

# Path to cache model weights
CACHE_PATH = "./weights"

@endpoint(
    volumes=[Volume(name="weights", mount_path=CACHE_PATH)],
    ...
)
def predict():
    from transformers import pipeline

    # Load the model
    generator = pipeline(
        "text-generation",
        model="facebook/opt-125m",
        model_kwargs={"cache_dir": CACHE_PATH},
    )

    # Run inference
    generator(...)
    return {"text": ""}
```

## Load Models Using `on_start`

In addition to using a Volume, it's best practice to ensure models are only loaded *once* when the container first starts. Beam lets you define an `on_start` function that will run exactly *once* when the container first starts.

This example combines the `on_start` functionality with the Volume caching:

```python theme={null}
from beam import Image, endpoint, Volume

# Path to cache model weights
CACHE_PATH = "./weights"


# This runs once when the container first starts
def download_models():
    from transformers import AutoTokenizer, OPTForCausalLM
    import torch

    model = OPTForCausalLM.from_pretrained("facebook/opt-125m", cache_dir=CACHE_PATH)
    return model


@endpoint(
    on_start=download_models,
    volumes=[Volume(name="weights", mount_path=CACHE_PATH)],
    cpu=1,
    memory="16Gi",
    gpu="T4",
    image=Image(
        python_version="python3.9",
        python_packages=[
            "transformers",
            "torch",
        ],
    ),
)
def predict(context):
    # Retrieve the cached model returned by the on_start function
    model = context.on_start_value

    # Run inference
    model.generate(...)
    return {"text": ""}
```

## Enable Checkpoint Restore (New)

Checkpoint restore lets you set a `checkpoint_enabled` flag on your decorator, which captures a memory snapshot of the running container after `on_start` completes. This means that you can load a model onto a GPU, run some setup logic, and when the app cold starts, it will start *right from that point*.

```python theme={null}
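from beam import endpoint


def load_models():
    ...  # Load your model here; runs once, before the snapshot is captured


@endpoint(
    secrets=["HF_TOKEN"],
    on_start=load_models,
    name="meta-llama-3.1-8b-instruct",
    cpu=2,
    memory="16Gi",
    gpu="H100",
    keep_warm_seconds=30,
    checkpoint_enabled=True,  # Add this field!
)
def predict(context):
    model = context.on_start_value  # Value returned by load_models
    return {}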
```

Checkpoint restore is available on these GPU types:

* RTX4090
* H100
* A10G

### Notes

* If checkpointing fails, please forward us any errors that appear in the logs. The most likely cause is a missing volume; to resolve it, make sure the cache path is set properly for the model.
* If checkpointing fails, the deployment will revert to standard cold boots. To try checkpointing again, you will need to redeploy.

<Info>
  Checkpoints can take up to 3 minutes to capture, and 5 minutes to distribute
  among our servers. To properly benchmark the cold start improvement, you need
  to call the app after it has been spun down for a few minutes. Otherwise it
  may block as the checkpoint is syncing.
</Info>

## Measuring Cold Start

We've made it easier to optimize your cold starts by adding a cold start profile to each task.

You can view the cold start profile of a task by clicking on any task in the tasks table.

<Frame>
  <img />
</Frame>

This breakdown shows the entire lifecycle of your task: spinning up a container, running your `on_start` function, and running the task itself.

Here's a breakdown of a serverless cold start:

* **Container Start Time**. This is typically under 1s.
* **Image Load Time**. Pulling your container image from our image cache. This varies based on the size of your model and the dependencies you've added.
* **Application Start Time**. Running your code. This is the time spent running your `on_start` function and loading your model onto the GPU.


# Runtime Variables
Source: https://docs.beam.cloud/v2/topics/context

Accessing information about the runtime while running tasks

## Available Runtime Variables

In order to access information about the runtime while running a task, you can use the `context` value.

`context` includes important contextual information about the runtime, like the current `task_id` and `callback_url`.

| Field Name       | Purpose                                                |
| ---------------- | ------------------------------------------------------ |
| `container_id`   | Unique identifier for a container                      |
| `stub_id`        | Identifier for a stub                                  |
| `stub_type`      | Type of the stub (function, endpoint, task queue, etc) |
| `callback_url`   | URL called when the task status changes                |
| `task_id`        | Identifier for the specific task                       |
| `timeout`        | Maximum time allowed for the task to run (seconds)     |
| `on_start_value` | Any values returned from the `on_start` function       |
| `bind_port`      | Port number to bind a service to                       |
| `python_version` | Version of Python to be used                           |

## Using a Runtime Variable

Any of the fields above can be accessed on the `context` variable:

```python theme={null}
from beam import task_queue

@task_queue()
def handler(context):
    task_id = context.task_id
    return {}
```
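
Because `context` is passed in as a plain argument, logic that reads from it can be tried out locally with a stand-in object. The sketch below is an illustration only: `describe_task` is a hypothetical helper, and the `SimpleNamespace` with placeholder values stands in for the object Beam provides at runtime.

```python
from types import SimpleNamespace


def describe_task(context) -> dict:
    """Read a few runtime fields from a context-like object."""
    return {
        "task_id": context.task_id,
        "container_id": context.container_id,
        "timeout": context.timeout,
    }


# Stand-in for the runtime context; all values are placeholders.
fake_context = SimpleNamespace(
    task_id="edbcf7ff-e8ce-4199-8661-8e15ed880481",
    container_id="taskqueue-example",
    timeout=3600,
)
print(describe_task(fake_context))
```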


# Public Endpoints
Source: https://docs.beam.cloud/v2/topics/public-endpoints

Deploying public web endpoints on Beam

## Creating a Public Endpoint

By default, endpoints are private and require a bearer token to access. You can remove the authentication requirement for endpoints using the `authorized=False` argument:

```python auth.py theme={null}
from beam import endpoint


@endpoint(authorized=False)  # Disable authentication
def create_public_endpoint():

    print("This API can be invoked without an auth token")
    return {"success": "true"}
```

## Invoking a Public Endpoint

Public endpoints have slightly different URL schemes than private ones:

```
https://app.beam.cloud/endpoint/public/[STUB-ID]
```

```
https://app.beam.cloud/endpoint/public/4f78aaae-f35c-4eb0-9236-cdd34509bad8
```

<Tip>
  You can find your **Stub ID** on the deployment detail page in the web dashboard.
</Tip>

You can view the API URL by clicking the `Call API` button on the deployment detail page in the web dashboard.

A full request to a public endpoint might look something like this:

```bash theme={null}
curl -X POST \
--compressed 'https://app.beam.cloud/endpoint/public/4f78aaae-f35c-4eb0-9236-cdd34509bad8' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-d '{}'
```
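
The equivalent request can be assembled in Python with the standard library. The sketch below only builds the request; `build_public_request` is a hypothetical helper, and the stub ID is the example value from above.

```python
import json
import urllib.request

PUBLIC_BASE_URL = "https://app.beam.cloud/endpoint/public"


def build_public_request(stub_id: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request to a public endpoint."""
    return urllib.request.Request(
        f"{PUBLIC_BASE_URL}/{stub_id}",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        # No Authorization header: public endpoints do not require a token.
        headers={"Content-Type": "application/json"},
    )


public_request = build_public_request("4f78aaae-f35c-4eb0-9236-cdd34509bad8", {})
print(public_request.get_method(), public_request.full_url)

# To actually send it:
# with urllib.request.urlopen(public_request) as response:
#     print(response.read())
```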


# Send Events Between Apps
Source: https://docs.beam.cloud/v2/topics/signal


There are certain cases where you'll want to send events between different apps running on Beam.

A common scenario is if you have a model inference and retraining pipeline, where the inference app (App #1) needs to use the latest version of a trained model (App #2).

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/tree/main/experimental/signals">
  See the code for this example on GitHub.
</Card>

## Invoking Functions in Other Apps

This example demonstrates how to invoke functions in other apps on Beam. Specifically, we cover the scenario with an inference and a retraining function.

The retraining function needs a way to tell the inference function to use the latest model.

We use an `experimental.Signal()`, which is a special type of event listener that can be triggered from the retrain function.

### App 1: Retraining App

This is the retraining app. Below, we register a `Signal` that will fire an event to our inference app, which is subscribed to this Signal event.

```python theme={null}
from beam import endpoint, experimental

@endpoint(name="trainer")
def train():
    # Send a signal to another app letting it know that it needs to reload the models
    s = experimental.Signal(name="reload-model")
    s.set(ttl=60)
```

### App 2: Inference App

Below is the inference app, which needs to re-run the `on_start` function when retraining is finished.

You'll notice that a Signal is registered with a handler that tells us which function to run when an event is fired.

```python theme={null}
from beam import endpoint, Volume, experimental, Image

VOLUME_NAME = "brand_classifier"
CACHE_PATH = f"./{VOLUME_NAME}-cache"


def load_latest_model():
    # Load the latest model and tokenizer into module-level globals
    global model, tokenizer
    print("Loading latest...")
    model = lambda x: x + 1  # This is just example code

    s.clear()  # Clear the signal so it doesn't fire again


# Set a signal handler - when invoked, it will run the handler function
s = experimental.Signal(
    name="reload-model",
    handler=load_latest_model,
)


@endpoint(
    name="inference",
    volumes=[Volume(name=VOLUME_NAME, mount_path=CACHE_PATH)],
    image=Image(python_packages=["transformers", "torch"]),
    on_start=load_latest_model,
)
def predict(**inputs):
    global model, tokenizer  # These will have the latest values

    return {"success": "true"}
```

To test this example, you can open two terminal windows:

* In window 1, serve and invoke the inference function
* In window 2, serve and invoke the retrain function

Look at the logs in window 1 -- you'll notice that the signal has fired, and `load_latest_model` ran again.

## Clearing Signals

Signals will refresh every 1 second while a container is running, until `signal.clear()` is called. It is recommended to run `signal.clear()` after each signal invocation.


# Timeouts and Retries
Source: https://docs.beam.cloud/v2/topics/timeouts-and-retries


You can customize the default timeout and retry behavior for your tasks.

# Timeouts

### Default Timeouts

Tasks automatically time out after 20 minutes *if they haven't started running*. This default exists to prevent stuck tasks from consuming compute resources and potentially blocking other tasks in the queue.

### Customizing Timeouts

You can specify your own timeouts. Timeouts can be used for endpoints, task queues, and functions:

```python timeout.py theme={null}
from beam import function
import time


@function(timeout=600) # Override default timeout
def timeout():
    # Without the timeout specified above, this function would time out at 300s
    time.sleep(350)


if __name__ == "__main__":
    timeout()
```

# Retries

Beam includes retry logic, which can be customized using the parameters below.

### Max Retries

You can configure tasks to automatically retry based on a specific exception in your app.

In the example below, we'll specify `retries` and `retry_for`:

```python timeout.py theme={null}
from beam import task_queue


@task_queue(retries=2, retry_for=[Exception])  # Override default retry logic
def handler():
    raise Exception("Something went wrong, retry please!")
```

### Retry for Exceptions

When the task is invoked, we'll see the exception get caught and the task automatically retry:

```sh theme={null}
Running task <87067d0e-5900-413b-a3a3-5ee4c85706ad>
Traceback (most recent call last):
  File "/mnt/code/app.py", line 6, in handler
    raise Exception("Something went wrong, retry please!")

Exception: Something went wrong, retry please!
retry_for error caught: Exception('Something went wrong, retry please!')
Retrying task <87067d0e-5900-413b-a3a3-5ee4c85706ad> after Exception exception

Running task <87067d0e-5900-413b-a3a3-5ee4c85706ad>
```
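
To make the semantics concrete, here is a small local model of `retries` and `retry_for`. It is an illustration of the behavior described above, not Beam's implementation: `run_with_retries` is a hypothetical helper, and it assumes `retries` counts the extra attempts made after the first one.

```python
from typing import Callable, Sequence, Type


def run_with_retries(
    task: Callable[[], object],
    retries: int = 3,
    retry_for: Sequence[Type[BaseException]] = (Exception,),
):
    """Run task, retrying up to `retries` times on the listed exception types.

    Returns a (result, attempts) tuple. Exceptions that are not listed in
    retry_for propagate immediately without a retry.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return task(), attempts
        except tuple(retry_for) as exc:
            if attempts > retries:
                raise  # retries exhausted, surface the last error
            print(f"retry_for error caught: {exc!r}")


calls = {"count": 0}


def flaky():
    """Fail on the first two calls, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise Exception("Something went wrong, retry please!")
    return "ok"


result, attempts = run_with_retries(flaky, retries=2, retry_for=[Exception])
print(result, attempts)  # ok 3
```

With `retries=2`, the task runs at most three times: the original attempt plus two retries.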