Examples

Browse complete, runnable examples grouped by what you’re building. Each one shows a real workload, from container image to deployment, that you can copy and adapt. New to Beam? Start with the Quickstart and Core Concepts first.

Large Language Models

Serve and run inference with open and custom LLMs.

Hugging Face Models

A beginner’s guide to running performant inference workloads on Beam.

LLaMA 3.1 8B

Serve Meta’s Llama 3.1 8B model on a GPU.

Run an OpenAI-Compatible vLLM Server

Host an OpenAI-compatible inference server with vLLM.

Chat with DeepSeek R1

Run the DeepSeek R1 reasoning model.

Qwen2.5-7B with SGLang

Serve Qwen2.5-7B with the SGLang runtime.

Image and Video

Generate and transform images and video on GPUs.

Serverless ComfyUI

Host ComfyUI for image generation workflows.

Text-to-Video with Mochi

Generate video from text with the Mochi model.

Stable Diffusion with LoRAs

Run Stable Diffusion with custom LoRA adapters.

Audio and Transcription

Transcribe and synthesize speech.

Faster Whisper

Transcribe audio with Faster Whisper.

Parler TTS

Synthesize speech with Parler TTS.

Zonos

Generate speech with the Zonos model.

Web Apps

Host interactive apps and scrape the web.

Web Scraping with Beam Functions

Build a web scraper that runs on Beam functions.

Running Streamlit Apps

Host a Streamlit app behind a public URL.

Agents

Build and coordinate AI agents.

Building AI Agents

Build stateful agents with concurrency built in.

Research Assistant

A research assistant that synchronizes state across tasks.

Fine-Tuning

Fine-tune open models on GPUs.

Fine-tuning Gemma with LoRA

Fine-tune Google’s Gemma model with LoRA.

Fine-Tuning Llama 3.1 8B with Unsloth

Fine-tune Llama 3.1 8B quickly with Unsloth.
