Pre-Loading Models
This guide shows how you can optimize performance by pre-loading models when your container first starts.
Beam includes an optional on_start lifecycle hook which you can add to your functions. The on_start function runs exactly once, when your container first starts.
app.py
Anything returned from on_start is available through the context variable that is automatically passed to your handler.
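The pattern described above can be sketched in plain Python. This is a minimal illustration, not Beam's implementation: the `endpoint` decorator below is a local stand-in defined in the snippet itself, and the attribute name `on_start_value` on the context object is an assumption made for this example. Check the Beam SDK reference for the actual decorator and context fields.

```python
from types import SimpleNamespace
from typing import Any, Callable


def endpoint(on_start: Callable[[], Any]):
    """Local stand-in for a serving decorator (hypothetical, not Beam's code).

    It calls `on_start` once, up front, and hands the result to every
    handler invocation through a `context` object.
    """
    def decorator(handler: Callable[..., Any]) -> Callable[..., Any]:
        loaded = on_start()  # runs exactly once, simulating container start

        def wrapper(**inputs: Any) -> Any:
            # `on_start_value` is an assumed attribute name for this sketch.
            context = SimpleNamespace(on_start_value=loaded)
            return handler(context, **inputs)

        return wrapper
    return decorator


load_calls = 0  # counts how many times the loader actually ran


def load_model() -> dict:
    """Pretend to do expensive work, such as reading weights from disk."""
    global load_calls
    load_calls += 1
    return {"name": "toy-model", "scale": 2}


@endpoint(on_start=load_model)
def predict(context: SimpleNamespace, x: int) -> dict:
    model = context.on_start_value  # reuse the pre-loaded object
    return {"result": x * model["scale"]}


first = predict(x=3)
second = predict(x=5)
```

Both calls to `predict` reuse the object returned by `load_model`, so the expensive load happens once rather than on every request.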
Example: Downloading Model Weights
Using Loaders with Multiple Workers
If you scale vertically by running multiple workers in a container, the loader function runs once for each worker that starts up.
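The per-worker behavior can be illustrated with a small simulation. This sketch assumes nothing about how Beam starts workers; `Worker` and `load_model` are hypothetical names, and the workers here are ordinary objects in one process rather than real worker processes. It only shows the consequence stated above: one loader call per worker, with each worker holding its own loaded object.

```python
from typing import Callable, Dict, List

load_log: List[str] = []  # one entry per loader invocation


def load_model() -> Dict[str, List[int]]:
    """Hypothetical loader; records each call so the count is visible."""
    load_log.append("loaded")
    return {"weights": [1, 2, 3]}


class Worker:
    """Simulated worker: runs the loader once at startup, then serves calls."""

    def __init__(self, on_start: Callable[[], Dict[str, List[int]]]) -> None:
        self.state = on_start()  # each worker loads its own copy

    def handle(self, x: int) -> int:
        return sum(w * x for w in self.state["weights"])


# Start three simulated workers, then send several requests to each.
workers = [Worker(load_model) for _ in range(3)]
results = [w.handle(2) for w in workers for _ in range(4)]
```

Twelve requests are served, but the loader runs only three times, once per worker. Because each worker holds a separate copy of whatever the loader returns, memory use for the loaded model would be expected to grow with the number of workers.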