This example demonstrates a Question Answering app, built using LangChain. Beam makes it easy to iterate on this code in a remote GPU environment, and you can deploy the app as a REST API with a single command when you’re finished.

How this app works

This app lets you ask questions about a list of URLs. For example, you could supply the URL for apple.com and ask, “What kinds of products does this company sell?”
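
Concretely, the deployed API accepts a JSON payload containing a list of URLs and a question, matching the inputs the handler code below reads. The values here are just an illustration:

{
  "urls": ["https://apple.com"],
  "query": "What kinds of products does this company sell?"
}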

Setting up the environment

First, you’ll set up your compute environment. You’ll specify:

  • Compute requirements
  • Python packages to install in the runtime

For this example, you’ll need an OpenAI API Key saved in your Beam Secrets Manager, saved as OPENAI_API_KEY.
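
Secrets saved in the Beam Secrets Manager are exposed to your app as environment variables at runtime. As a quick sanity check, you can confirm the key is visible from inside your code (a minimal sketch; the print is only for illustration):

import os

# The secret saved as OPENAI_API_KEY in the Beam dashboard is
# injected into the container as an environment variable
openai_api_key = os.environ["OPENAI_API_KEY"]
print("OPENAI_API_KEY is set:", bool(openai_api_key))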

app.py
from beam import App, Runtime, Image

app = App(
    name="conversational-ai",
    runtime=Runtime(
        cpu=1,
        gpu="T4",
        memory="16Gi",
        image=Image(
            python_packages=[
                "langchain",
                "openai",
                "unstructured",
                "pdf2image",
                "tabulate",
            ],
        ),
    ),
)

Answering questions about URLs

This is the meat of our application. It retrieves the content from the URLs you provide, splits it into chunks, and passes those chunks to a GPT model along with your question to generate an answer. Here’s the complete app.py:

app.py
from beam import App, Runtime, Image

from langchain.text_splitter import CharacterTextSplitter
from langchain.chains.question_answering import load_qa_chain
from langchain.document_loaders import UnstructuredURLLoader
from langchain.llms import OpenAI

# Add your OpenAI API Key to the Beam Secrets Manager:
# beam.cloud/dashboard/settings/secrets so that it is accessible through:
# os.environ["OPENAI_API_KEY"]


app = App(
    name="conversational-ai",
    runtime=Runtime(
        cpu=1,
        gpu="T4",
        memory="16Gi",
        image=Image(
            python_packages=[
                "langchain",
                "openai",
                "unstructured",
                "pdf2image",
                "tabulate",
            ],
        ),
    ),
)


@app.rest_api()
def start_conversation(**inputs):
    # Grab inputs passed to the API
    urls = inputs["urls"]
    query = inputs["query"]

    # Download and parse the contents of each URL
    loader = UnstructuredURLLoader(urls=urls)
    data = loader.load()

    # Split the documents into overlapping chunks so each one fits
    # within the model's context window
    doc_splitter = CharacterTextSplitter(
        separator="\n\n",
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len,
    )

    docs = doc_splitter.split_documents(data)

    # The "stuff" chain type inserts all of the chunks into a single prompt
    chain = load_qa_chain(
        OpenAI(temperature=0),
        chain_type="stuff",
    )
    res = chain({"input_documents": docs, "question": query}, return_only_outputs=True)
    print(res)
    return {"pred": res}


if __name__ == "__main__":
    # You can customize this query however you want:
    urls = ["https://www.nutribullet.com"]
    query = "What are some use cases I can use this product for?"
    start_conversation(urls=urls, query=query)

Running inference

Beam includes a live-reloading feature that lets you run your code in the same environment it will run in when deployed to production.

By default, Beam will sync all the files in your working directory to the remote container, so you can use your local files while developing. If you want to prevent some files from being uploaded, you can create a .beamignore file, like the example below.
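
For example, a .beamignore that excludes a local virtual environment and cached files might look like this (hypothetical patterns, using the same syntax as a .gitignore):

.venv/
__pycache__/
*.log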

In your shell, run beam serve app.py. This will:

  1. Spin up a container
  2. Run it on a GPU
  3. Print a cURL request to invoke the API
  4. Stream the logs to your shell

You should keep this terminal window open while developing.

(.venv) user@MacBook demo % beam serve app.py
 i  Using cached image.
 ✓  App initialized.
 i  Uploading files...
 ✓  Container scheduled, logs will appear below.
⠴ Starting container... 5s (Estimated: 3m20s)

================= Call the API =================

curl -X POST 'https://apps.beam.cloud/serve/3dpga/650b636542ef2e000aef54fa' \
-H 'Accept: */*' \
-H 'Accept-Encoding: gzip, deflate' \
-H 'Connection: keep-alive' \
-H 'Authorization: Basic [YOUR_AUTH_TOKEN]' \
-H 'Content-Type: application/json' \
-d '{}'

============= Logs Streamed Below ==============

INFO:     | Starting app...
INFO:     | Loading handler in 'app.py:start_conversation'...
INFO:     | Ready for tasks.

Now, head back to your IDE, and change a line of code. Hit save.

If you look closely at the shell running beam serve, you’ll notice the server reloading with your code changes.

You’ll use this workflow anytime you’re developing an app on Beam. Trust us: it makes the development process uniquely fast and painless.

Deployment

You might want to deploy this as a persistent REST API. It’s simple to do so. Just run beam deploy:

beam deploy app.py:start_conversation

Your Beam Dashboard will open in a browser window, and you can monitor the deployment status in the web UI.

Calling the API

You’ll call the API by pasting in the cURL command displayed in the browser window.

 curl -X POST --compressed "https://apps.beam.cloud/cjm9u" \
   -H 'Accept: */*' \
   -H 'Accept-Encoding: gzip, deflate' \
   -H 'Authorization: Basic [ADD_YOUR_AUTH_TOKEN]' \
   -H 'Connection: keep-alive' \
   -H 'Content-Type: application/json' \
   -d '{"urls": "[\"https://apple.com\"]", "query": "What kind of products does this company sell?"}'

The API will return an answer, like so:

{"pred":{"output_text":" This company sells iPhones, Apple Watches, iPads, MacBook Pros, Apple Trade In, Apple Card, and Apple TV+."}}```