Running Tasks on GPU
You can run any code on a cloud GPU by passing a gpu argument to your function decorator.
from beam import endpoint

@endpoint(gpu="H100")
def handler():
    # Print the driver version and GPU status reported by nvidia-smi
    import subprocess
    print(subprocess.check_output(["nvidia-smi"]).decode())
    return {"gpu": "true"}
Available GPUs
Currently available GPU options are:
A10G (24Gi)
RTX4090 (24Gi)
H100 (80Gi)
Check GPU Availability
Run beam machine list to check which GPU types are currently available.
$ beam machine list
GPU Type    Available
─────────────────────
A10G        ✅
RTX4090     ✅
Prioritizing GPU Types
You can split traffic across multiple GPU types by passing a list to the gpu parameter.
The list is ordered by priority: GPUs at the front of the list are tried first.
gpu=["T4", "A10G", "H100"]
In this example, the T4 is prioritized over the A10G, followed by the H100.
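As a minimal sketch, the same list can be passed directly to the endpoint decorator shown earlier; the fallback order is simply the order of the list.

from beam import endpoint

@endpoint(gpu=["T4", "A10G", "H100"])
def handler():
    # Runs on the first GPU type in the list with available capacity
    return {"hello": "world"}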
Using Multiple GPUs
You can run workloads across multiple GPUs by using the gpu_count parameter.
This feature is available by request only. Please send us a message in Slack, and we’ll enable it on your account.
from beam import endpoint

@endpoint(gpu="A10G", gpu_count=2)
def handler():
    return {"hello": "world"}
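To confirm that both GPUs are attached, you can list them from inside the handler. The sketch below assumes nvidia-smi is on the container's PATH and uses its --list-gpus flag, which prints one line per visible GPU.

from beam import endpoint

@endpoint(gpu="A10G", gpu_count=2)
def handler():
    import subprocess
    # One output line per GPU visible to the container
    gpus = subprocess.check_output(["nvidia-smi", "--list-gpus"]).decode().strip().splitlines()
    return {"gpu_count": len(gpus)}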
GPU Regions
Beam runs on servers distributed around the world, with primary locations in the United States, Europe, and Asia. If you would like your workloads to run in a specific region, please reach out.