# Text-to-Video with Mochi

This guide demonstrates how to run the Mochi-1 text-to-video model on Beam. Mochi-1 generates short video clips from text prompts.

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/tree/main/video_models/mochi1">
  See the code for this example on GitHub.
</Card>

## Introduction

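Mochi-1 is published on Hugging Face as `genmo/mochi-1-preview`. The sections below cover uploading its weights to a Beam Volume, defining the container image, writing the inference endpoint, and calling the deployed serverless API.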

## Upload Model Weights

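Before running inference, the Mochi-1 weights need to be stored in a Beam Volume. The `upload.py` script below downloads the `genmo/mochi-1-preview` repository from Hugging Face into a volume named `mochi-1-preview`. The function references a Beam secret called `HF_TOKEN`, so that secret must exist in your account before you run it: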

```python theme={null}
from beam import function, Volume, Image, env

if env.is_remote():
    from huggingface_hub import snapshot_download

VOLUME_PATH = "./mochi-1-preview"

@function(
    image=Image(
        python_packages=["huggingface_hub", "huggingface_hub[hf_xet]"]
    ),
    memory="32Gi",
    cpu=4,
    secrets=["HF_TOKEN"],
    volumes=[Volume(name="mochi-1-preview", mount_path=VOLUME_PATH)],
)
def upload():
    snapshot_download(
        repo_id="genmo/mochi-1-preview",
        local_dir=f"{VOLUME_PATH}/weights"
    )
    print("Files uploaded successfully")

if __name__ == "__main__":
    upload()
```

### Steps to Run the Script

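Run the script from your local machine. The `huggingface_hub` import is guarded by `env.is_remote()`, so the download itself happens inside the Beam container, which writes the weights to the volume: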

```bash theme={null}
python upload.py
```

Once the weights are uploaded, the `generate_video` endpoint can access them for inference.
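Before loading the pipeline, it can be useful to confirm that the download actually landed in the volume. The following is a minimal sketch, assuming the weights directory follows the usual diffusers layout with a `model_index.json` file at its root. The helper name `weights_ready` is hypothetical and not part of the example repository.

```python
from pathlib import Path


def weights_ready(weights_dir: str) -> bool:
    """Return True if the directory looks like a downloaded diffusers pipeline.

    Assumption: a diffusers pipeline directory contains model_index.json at
    its root. This only checks for that file, not for every weight shard.
    """
    return (Path(weights_dir) / "model_index.json").is_file()
```

Inside the container this could be called as `weights_ready(f"{VOLUME_PATH}/weights")` at the top of the model-loading function to fail early with a clear message instead of a loader error.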

## Setup Remote Environment

In `app.py`, the inference dependencies are imported only inside the remote container (guarded by `env.is_remote()`), and `load_models` loads the Mochi pipeline from the mounted volume:

```python theme={null}
from beam import endpoint, env, Volume, Image, Output

VOLUME_PATH = "./mochi-1-preview"

if env.is_remote():
    import torch
    from diffusers import MochiPipeline
    from diffusers.utils import export_to_video
    import uuid

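def load_models():
    # Runs once when the container starts; the return value is exposed
    # to the endpoint as context.on_start_value.
    pipe = MochiPipeline.from_pretrained(
        f"{VOLUME_PATH}/weights",
        variant="bf16",
        torch_dtype=torch.bfloat16,
    )
    return pipe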
```

The `mochi_image` defines the container: Python 3.11, the required Python packages, `git`, and `diffusers` installed from its GitHub source:

```python theme={null}
mochi_image = (
    Image(
        python_version="python3.11",
        python_packages=["torch", "transformers", "accelerate",
                         "sentencepiece", "imageio-ffmpeg", "imageio", "ninja"]
    )
    .add_commands([
        "apt update && apt install git -y",
        "pip install git+https://github.com/huggingface/diffusers.git",
    ])
)
```

## Inference Function

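The `generate_video` endpoint reads a `prompt` from the request, generates 40 frames, writes them to an MP4 file at 15 fps, and returns a public URL for that file. If no prompt is supplied, it returns an `error` field instead: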

```python theme={null}
@endpoint(
    name="mochi-1-preview",
    on_start=load_models,
    cpu=4,
    memory="32Gi",
    gpu="A10G",
    gpu_count=2,
    image=mochi_image,
    volumes=[Volume(name="mochi-1-preview", mount_path=VOLUME_PATH)],
    timeout=-1
)
def generate_video(context, **inputs):
    pipe = context.on_start_value

    prompt = inputs.pop("prompt", None)

    if not prompt:
        return {"error": "Please provide a prompt"}

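    # Reduce GPU memory use: keep idle sub-models on the CPU and
    # decode the VAE output in tiles rather than all at once.
    pipe.enable_model_cpu_offload()
    pipe.enable_vae_tiling()
    frames = pipe(prompt, num_frames=40).frames[0]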

    file_name = f"/tmp/mochi_out_{uuid.uuid4()}.mp4"

    export_to_video(frames, file_name, fps=15)

    output_file = Output(path=file_name)
    output_file.save()
    public_url = output_file.public_url(expires=-1)
    print(public_url)
    return {"output_url": public_url}
```
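The length of the resulting clip follows directly from the two numbers in the endpoint: 40 frames exported at 15 fps gives roughly 2.7 seconds of video. The sketch below makes that arithmetic explicit. Both helper names are hypothetical, and the model may impose its own constraints on valid frame counts that this sketch does not check.

```python
def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Duration of a clip with num_frames frames played back at fps."""
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


def frames_for_duration(seconds: float, fps: int) -> int:
    """Nearest frame count for a target duration, with at least one frame."""
    if seconds <= 0 or fps <= 0:
        raise ValueError("seconds and fps must be positive")
    return max(1, round(seconds * fps))
```

For example, `frames_for_duration(4, 15)` gives the `num_frames` value to pass for a clip of about four seconds at the same frame rate. More frames mean longer generation time and higher memory use.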

## Deployment

Deploy the API to Beam:

```bash theme={null}
beam deploy app.py:generate_video
```

## Invoking the API

To invoke the API, send a POST request with the following payload:

```json theme={null}
{
  "prompt": "The camera follows behind a rugged green Jeep with a black snorkel as it speeds along a narrow dirt trail cutting through a dense jungle. Thick vines hang from towering trees with sprawling canopies, their leaves forming a vibrant green tunnel above the vehicle. Mud splashes up from the Jeep’s tires as it powers through a shallow stream crossing the path. Sunlight filters through gaps in the trees, casting dappled golden light over the scene. The dirt trail twists sharply into the distance, overgrown with wild ferns and tropical plants. The vehicle is seen from the rear, leaning into the curve as it maneuvers through the untamed terrain, emphasizing the adventure of the rugged journey. The surrounding jungle is alive with texture and color, with distant mountains barely visible through the mist and an overcast sky heavy with the promise of rain."
}
```

Here’s an example of a cURL request. Replace `[ENDPOINT-ID]` and `[AUTH-TOKEN]` with the values for your deployment:

```bash theme={null}
curl -X POST 'https://app.beam.cloud/endpoint/id/[ENDPOINT-ID]' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer [AUTH-TOKEN]' \
-d '{
    "prompt": "Your text prompt for video generation."
}'
```
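The same request can be sent from Python. This is a minimal sketch using only the standard library, mirroring the URL shape and headers of the cURL example above. The function names `build_request` and `request_video` are hypothetical, and the 900-second default timeout is an arbitrary assumption since generation time depends on the hardware and prompt.

```python
import json
import urllib.request

# URL prefix copied from the cURL example.
BEAM_ENDPOINT_BASE = "https://app.beam.cloud/endpoint/id"


def build_request(endpoint_id: str, auth_token: str, prompt: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{BEAM_ENDPOINT_BASE}/{endpoint_id}",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {auth_token}",
        },
        method="POST",
    )


def request_video(endpoint_id: str, auth_token: str, prompt: str,
                  timeout: float = 900.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    req = build_request(endpoint_id, auth_token, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Splitting request construction from sending keeps the payload and headers easy to inspect or log before any network call is made.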

## Example Output

The API returns a JSON object containing the URL of the generated video. Here’s an example:

```json theme={null}
{
  "output_url": "https://app.beam.cloud/output/id/dc443a80-7fcc-42bc-928b-4605e41b0825"
}
```
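A client will usually want to pull the URL out of that response and save the file. The sketch below handles both response shapes the endpoint can produce: `output_url` on success and `error` when the prompt is missing. The function names are hypothetical, and it assumes the public URL can be fetched without authentication headers.

```python
import json
import shutil
import urllib.request
from pathlib import Path


def extract_output_url(body: str) -> str:
    """Parse the endpoint's JSON response and return the video URL."""
    data = json.loads(body)
    if "error" in data:
        # The endpoint returns {"error": ...} when no prompt was provided.
        raise ValueError(data["error"])
    return data["output_url"]


def download_video(url: str, dest: str = "mochi_out.mp4") -> Path:
    """Stream the video at url to a local file and return its path."""
    path = Path(dest)
    with urllib.request.urlopen(url) as resp, path.open("wb") as fh:
        shutil.copyfileobj(resp, fh)
    return path
```

Streaming with `shutil.copyfileobj` avoids holding the whole video in memory while it is written to disk.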

## Example Video

Here is an example video generated by the Mochi-1 model:

<video controls>
  <source src="https://app.beam.cloud/output/id/dc443a80-7fcc-42bc-928b-4605e41b0825" type="video/mp4" />
</video>

## Summary

You’ve successfully deployed and tested a Mochi-1 text-to-video generation API using Beam.
