# Faster Whisper

This guide walks you through deploying and invoking a transcription API that runs the Faster Whisper model on Beam. The API accepts either a URL to an `.mp3` file or a base64-encoded audio file.

<Card title="View the Code" icon="github" href="https://github.com/beam-cloud/examples/tree/main/audio_and_transcription/faster_whisper">
  See the code for this example on GitHub.
</Card>

## Initial Setup

In your Python file, add the following code to define your endpoint and handle the transcription:

```python app.py theme={null}
from beam import endpoint, Image, Volume, env
import base64
import requests
from tempfile import NamedTemporaryFile

BEAM_VOLUME_PATH = "./cached_models"

# These packages are only installed in the remote container, so import them there
if env.is_remote():
    from faster_whisper import WhisperModel, download_model

# This runs once when the container first starts
def load_models():
    model_path = download_model("large-v3", cache_dir=BEAM_VOLUME_PATH)
    model = WhisperModel(model_path, device="cuda", compute_type="float16")
    return model

@endpoint(
    on_start=load_models,
    name="faster-whisper",
    cpu=2,
    memory="32Gi",
    gpu="A10G",
    image=Image(
        base_image="nvidia/cuda:12.2.2-cudnn8-runtime-ubuntu22.04",
        python_version="python3.10",
    )
    .add_python_packages(["git+https://github.com/SYSTRAN/faster-whisper.git", "huggingface_hub[hf-transfer]"])
    .with_envs("HF_HUB_ENABLE_HF_TRANSFER=1"),
    volumes=[
        Volume(
            name="cached_models",
            mount_path=BEAM_VOLUME_PATH,
        )
    ],
)
def transcribe(context, **inputs):
    # Retrieve cached model from on_start
    model = context.on_start_value

    # Inputs passed to API
    language = inputs.get("language")
    audio_base64 = inputs.get("audio_file")
    url = inputs.get("url")

    if audio_base64 and url:
        return {"error": "Only a base64 audio file OR a URL can be passed to the API."}
    if not audio_base64 and not url:
        return {
            "error": "Please provide either an audio file in base64 string format or a URL."
        }

    binary_data = None

    if audio_base64:
        binary_data = base64.b64decode(audio_base64.encode("utf-8"))
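    elif url:
        # Fail early on a bad download instead of transcribing an error page
        resp = requests.get(url, timeout=60)
        if not resp.ok:
            return {"error": f"Could not download audio from URL (HTTP {resp.status_code})."}
        binary_data = resp.content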

    text = ""

    with NamedTemporaryFile() as temp:
        try:
            # Write the audio data to the temporary file
            temp.write(binary_data)
            temp.flush()

            segments, _ = model.transcribe(temp.name, beam_size=5, language=language)

            for segment in segments:
                text += segment.text + " "

            print(text)
            return {"text": text}

        except Exception as e:
            return {"error": f"Something went wrong: {e}"}
```
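The input checks at the top of `transcribe` can be tried locally, without a GPU or a Beam container, by pulling them into a small helper. The sketch below is illustrative only: `resolve_audio_input` is a hypothetical name that does not appear in the example repository, and it assumes the same `audio_file` and `url` input keys used above.

```python
import base64


def resolve_audio_input(inputs):
    """Hypothetical helper mirroring the input checks in `transcribe`.

    Returns a (kind, value, error) tuple, where kind is "bytes" or "url"
    and error is None when the input is valid.
    """
    audio_base64 = inputs.get("audio_file")
    url = inputs.get("url")

    # Exactly one of the two inputs must be provided
    if audio_base64 and url:
        return None, None, "Only a base64 audio file OR a URL can be passed to the API."
    if not audio_base64 and not url:
        return None, None, "Please provide either an audio file in base64 string format or a URL."

    if audio_base64:
        # Decode the base64 string into raw audio bytes
        return "bytes", base64.b64decode(audio_base64), None

    # Otherwise the caller supplied a URL to download
    return "url", url, None
```

Keeping the checks in a plain function like this makes it possible to unit test the request handling separately from the model.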

## Deployment

To deploy the app, run the following command:

<Info>
  If you named your file something other than `app.py`, replace `app.py` in
  the command with your file name.
</Info>

```sh theme={null}
beam deploy app.py:transcribe
```

This command deploys your app as a web endpoint and prints the endpoint URL in your shell.

## Invoking the API

Once the API is running, you can invoke it with a URL to an `.mp3` file using the following cURL command:

<Tip>
  If you want to test with sample `.mp3` files, you can find many samples on
  [this website](https://audio-samples.github.io/).
</Tip>

```sh theme={null}
curl -X POST 'https://faster-whisper-7157fd0-v1.app.beam.cloud' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer [YOUR-AUTH-TOKEN]' \
-d '{"url":"https://audio-samples.github.io/samples/mp3/blizzard_unconditional/sample-0.mp3"}'
```

Replace the endpoint URL with the one printed in your shell, and `[YOUR-AUTH-TOKEN]` with your authentication token.
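The endpoint also accepts a base64-encoded audio file through the `audio_file` input. A minimal client sketch using only the Python standard library might look like the following. The helper names (`build_base64_payload`, `build_request`, `transcribe_file`) are hypothetical, and the endpoint URL and token are values you supply from your own deployment.

```python
import base64
import json
import urllib.request


def build_base64_payload(audio_bytes, language=None):
    """Build the JSON body for the `audio_file` input of the endpoint."""
    payload = {"audio_file": base64.b64encode(audio_bytes).decode("utf-8")}
    if language is not None:
        payload["language"] = language
    return payload


def build_request(endpoint_url, auth_token, payload):
    """Create a POST request carrying the JSON payload and bearer token."""
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {auth_token}",
        },
        method="POST",
    )


def transcribe_file(endpoint_url, auth_token, path, language=None):
    """Read a local audio file, send it to the endpoint, and return the parsed JSON."""
    with open(path, "rb") as f:
        payload = build_base64_payload(f.read(), language)
    request = build_request(endpoint_url, auth_token, payload)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `transcribe_file(endpoint_url, auth_token, "sample.mp3")` with your own endpoint URL and token should return the same `{"text": ...}` or `{"error": ...}` dictionary that `transcribe` produces.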

## Summary

You've deployed a serverless API that transcribes audio with the Faster Whisper model on Beam. The API accepts either a URL to an audio file or a base64-encoded audio file, and you can invoke it from any HTTP client using the endpoint URL and your authentication token.
