Browse complete, runnable examples grouped by what you’re building. Each one shows a real workload, from container image to deployment, that you can copy and adapt. New to Beam? Start with the Quickstart and Core Concepts first.

Large Language Models

Serve and run inference with open and custom LLMs.

Hugging Face Models

A beginner’s guide to running performant inference with Hugging Face models on Beam.

Llama 3.1 8B

Serve Meta’s Llama 3.1 8B model on a GPU.

Run an OpenAI-Compatible vLLM Server

Host an OpenAI-compatible inference server with vLLM.

Chat with DeepSeek R1

Run the DeepSeek R1 reasoning model.

Qwen2.5-7B with SGLang

Serve Qwen2.5-7B with the SGLang runtime.

Image and Video

Generate and transform images and video on GPUs.

Serverless ComfyUI

Host ComfyUI for image generation workflows.

Text-to-Video with Mochi

Generate video from text with the Mochi model.

Stable Diffusion with LoRAs

Run Stable Diffusion with custom LoRA adapters.

Audio and Transcription

Transcribe and synthesize speech.

Faster Whisper

Transcribe audio with Faster Whisper.

Parler TTS

Synthesize speech with Parler TTS.

Zonos

Generate speech with the Zonos model.

Web Apps

Host interactive apps and scrape the web.

Web Scraping with Beam Functions

Build a web scraper that runs on Beam functions.

Running Streamlit Apps

Host a Streamlit app behind a public URL.

Agents

Build and coordinate AI agents.

Building AI Agents

Build stateful agents with concurrency built in.

Research Assistant

A research assistant that synchronizes state across tasks.

Fine-Tuning

Fine-tune open models on GPUs.

Fine-Tuning Gemma with LoRA

Fine-tune Google’s Gemma model with LoRA.

Fine-Tuning Llama 3.1 8B with Unsloth

Fast fine-tuning of Llama 3.1 8B with Unsloth.