Pre-Loading Models
This guide shows how you can optimize performance by pre-loading models when your container first starts.
Beam includes an optional on_start lifecycle hook that you can add to your functions. The on_start function runs exactly once, when your container first starts.
app.py
Anything returned from on_start can be retrieved from the context variable that is automatically passed to your handler.
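The hook-and-context pattern described above can be illustrated without the Beam SDK. The following is a minimal sketch in plain Python, not Beam's implementation: the with_on_start decorator, the Context class, and its on_start_value attribute are hypothetical stand-ins, so check the Beam SDK reference for the real decorator and attribute names.

```python
import functools
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Context:
    """Hypothetical stand-in for the context object passed to a handler."""
    on_start_value: Any = None


def with_on_start(on_start: Callable[[], Any]):
    """Hypothetical decorator: run on_start once and share its return value."""
    def decorate(handler):
        # Runs once, when the handler is defined. In this sketch that moment
        # plays the role of "container start".
        context = Context(on_start_value=on_start())

        @functools.wraps(handler)
        def wrapper(**inputs):
            # Every request reuses the same context; on_start is not re-run.
            return handler(context, **inputs)

        return wrapper
    return decorate


load_calls = 0


def load_model():
    """Pretend to do expensive setup and count how often it runs."""
    global load_calls
    load_calls += 1
    return {"name": "toy-model", "scale": 2}


@with_on_start(load_model)
def predict(context, x: int):
    # Retrieve whatever load_model returned from the context.
    model = context.on_start_value
    return {"result": x * model["scale"]}


first = predict(x=3)
second = predict(x=5)
```

Both calls to predict reuse the object returned by load_model, so the expensive setup is paid once rather than on every request.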
Example: Downloading Model Weights
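A loader for model weights usually benefits from checking local storage before downloading. The sketch below assumes a hypothetical ensure_weights helper, a hypothetical cache path, and a fake_fetch function that stands in for a real download; none of these names come from Beam.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_weights(cache_path: Path, fetch: Callable[[], bytes]) -> bytes:
    """Return the cached weights, calling fetch() only if the file is missing."""
    if not cache_path.exists():
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_bytes(fetch())
    return cache_path.read_bytes()


fetch_calls = 0


def fake_fetch() -> bytes:
    """Stand-in for a real download; counts how often it is called."""
    global fetch_calls
    fetch_calls += 1
    return b"\x00\x01\x02\x03"


# Hypothetical cache location; a real deployment would use persistent storage.
weights_file = Path(tempfile.mkdtemp()) / "models" / "weights.bin"


def on_start() -> bytes:
    # The value returned here is what the handler would read from its context.
    return ensure_weights(weights_file, fake_fetch)


weights_a = on_start()  # downloads and writes the cache file
weights_b = on_start()  # reads the cache file, no second download
```

Keeping the download behind a cache check means a restarted container that still sees the cached file can skip the download step.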
Using Loaders with Multiple Workers
If you scale vertically by running multiple workers, the loader function (here, on_start) runs once for each worker that starts up.
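The once-per-worker behavior can be simulated in plain Python. This is a sketch under stated assumptions: the Worker class is hypothetical, and real workers would typically be separate processes rather than objects in one process.

```python
from typing import Any, Callable


class Worker:
    """Hypothetical worker that runs the loader when it starts."""

    def __init__(self, worker_id: int, loader: Callable[[], Any]):
        self.worker_id = worker_id
        # Each worker calls the loader itself and keeps its own copy.
        self.value = loader()

    def handle(self, x: int) -> int:
        return x * self.value["scale"]


loader_runs = 0


def loader():
    """Pretend to load a model and count how often this runs."""
    global loader_runs
    loader_runs += 1
    return {"scale": 2}


# Starting three workers runs the loader three times.
workers = [Worker(i, loader) for i in range(3)]
answer = workers[2].handle(4)
```

Because every worker holds its own loaded copy, memory use for the loaded model grows with the number of workers.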