# Deploying Huggingface Models

## Define the environment
The first thing we’ll do is define the environment that our app will run on. For this example, we’re building a sentiment analysis model using Huggingface.

First, you’ll define a `Runtime` with an `Image`. Here we specify which packages to install in the runtime, and the hardware this code will run on.
```python
from beam import App, Runtime, Image

app = App(
    name="sentiment-analysis",
    runtime=Runtime(
        cpu=2,
        memory="8Gi",
        image=Image(
            python_version="python3.9",
            python_packages=["transformers", "torch"],
        ),
    ),
)
```
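If your model needs a GPU, the runtime can request one as well. The `gpu` field below is an assumption based on Beam's runtime options rather than something shown in this guide — check the current Beam reference for the exact parameter name and supported values.

```python
from beam import App, Runtime, Image

# Hypothetical GPU variant of the runtime above; the gpu="T4" field
# is an assumption — verify it against the Beam documentation.
app = App(
    name="sentiment-analysis",
    runtime=Runtime(
        cpu=2,
        memory="8Gi",
        gpu="T4",
        image=Image(
            python_version="python3.9",
            python_packages=["transformers", "torch"],
        ),
    ),
)
```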
## Inference function
Now, we’ll write some code to predict the sentiment of a given text prompt. Our function takes keyword arguments, as `**inputs`.
```python
from transformers import pipeline


def predict_sentiment(**inputs):
    model = pipeline(
        "sentiment-analysis", model="siebert/sentiment-roberta-large-english"
    )
    result = model(inputs["text"], truncation=True, top_k=1)
    prediction = {i["label"]: i["score"] for i in result}
    print(prediction)
    return {"prediction": prediction}
```
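To see how the pieces fit together without downloading the model, here is a sketch that simulates the pipeline's return shape: a text-classification pipeline with `top_k=1` returns a list of `{"label": ..., "score": ...}` dicts, and the comprehension flattens that list into a single mapping. The score below is made up for illustration, not real model output.

```python
# Stand-in for predict_sentiment: same structure, but with a hardcoded
# result in place of the transformers pipeline call.
def predict_sentiment_stub(**inputs):
    # Simulated pipeline output: a list of {"label", "score"} dicts.
    result = [{"label": "POSITIVE", "score": 0.9987}]
    # Flatten into {label: score}, exactly as in the real function.
    prediction = {i["label"]: i["score"] for i in result}
    return {"prediction": prediction}


# The REST layer passes the JSON request body as keyword arguments:
payload = {"text": "I love this!"}
print(predict_sentiment_stub(**payload))
# {'prediction': {'POSITIVE': 0.9987}}
```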
## Adding a REST API
To prepare to deploy the API, we’ll add a `rest_api` decorator to our inference function. Add the following decorator to your `predict_sentiment` function:
```python
@app.rest_api()
def predict_sentiment(**inputs):
    ...
```
The complete `app.py` file will look like this:
```python
from beam import App, Runtime, Image
from transformers import pipeline

app = App(
    name="sentiment-analysis",
    runtime=Runtime(
        cpu=2,
        memory="8Gi",
        image=Image(
            python_version="python3.9",
            python_packages=["transformers", "torch"],
        ),
    ),
)


@app.rest_api()
def predict_sentiment(**inputs):
    model = pipeline(
        "sentiment-analysis", model="siebert/sentiment-roberta-large-english"
    )
    result = model(inputs["text"], truncation=True, top_k=1)
    prediction = {i["label"]: i["score"] for i in result}
    print(prediction)
    return {"prediction": prediction}
```
## Deploying the app
To deploy the model, open your terminal and `cd` into the directory you’re working in. Then, run the following:
```sh
beam deploy app.py
```
After running this command, you’ll see some logs in the console that show the progress of your deployment.
At the bottom of the console, you’ll see a URL for invoking your function. Here’s what a cURL request would look like:
```sh
curl -X POST --compressed "https://apps.beam.cloud/kajru" \
  -H 'Accept: */*' \
  -H 'Accept-Encoding: gzip, deflate' \
  -H 'Authorization: Basic [YOUR_AUTH_TOKEN]' \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: application/json' \
  -d '{"text": "If we override the bandwidth, we can get to the SMTP capacitor through the cross-platform RSS alarm!"}'
```
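The same request can be made from Python using only the standard library. The URL and token below are the placeholders from the cURL example, not real credentials — substitute your own deployment URL and auth token.

```python
import json
import urllib.request

# Placeholders from the cURL example; replace with your deployment's values.
url = "https://apps.beam.cloud/kajru"
token = "[YOUR_AUTH_TOKEN]"

# JSON body matching the keyword arguments the function expects.
payload = json.dumps(
    {"text": "If we override the bandwidth, we can get to the SMTP capacitor!"}
).encode()

req = urllib.request.Request(
    url,
    data=payload,
    headers={
        "Authorization": f"Basic {token}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Uncomment to send the request against a live deployment:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```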