cache_dir
argument in the Huggingface Transformers method:
cache_dir
argument to the underlying models using the model_kwargs
argument of the pipeline:
on_start
on_start
function that will run exactly once when the container first starts:
This example combines the on_start
functionality with the Volume caching:
on_start
function, and running the task itself.
Here’s a breakdown of a serverless cold start:
on_start
, and loading it on the GPU.