This app uses OCR to remove text from an image. You might use this as a stand-alone microservice, or as a pre-processing step in a computer vision pipeline. This tutorial is an adaptation of this post.

Define the environment

You’ll start by creating a Beam app definition. In this file, you’re defining a few things:

  • The libraries you want installed in the environment
  • The compute settings (some of the CV operations are heavy, so 16Gi of memory is a safe choice)
app.py
from beam import App, Runtime, Image, Output

app = App(
    name="rmtext",
    runtime=Runtime(
        cpu=1,
        memory="16Gi",
        image=Image(
            python_packages=[
                "numpy",
                "matplotlib",
                "opencv-python",
                "keras_ocr",
                "tensorflow",
            ],
            commands=["apt-get update && apt-get install -y libgl1"],
        ),
    ),
)

Removing the text from an image

You’ll use the code below to accomplish the following:

  • Identify text in the base-64 encoded image and create bounding boxes around each block of text
  • Add a mask around each box of text
  • Paint over each text-mask to remove the text
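
The masking and painting steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the tutorial's original listing: it assumes each text box is given as four corner points ordered top-left, top-right, bottom-right, bottom-left, and it uses OpenCV's `cv2.inpaint` to paint over the mask. The helper names (`midpoint`, `box_to_stroke`, `inpaint_text`) are hypothetical, and the Beam decorator and the OCR call are left out so the masking logic stands on its own.

```python
import math


def midpoint(p, q):
    """Integer midpoint of two (x, y) points."""
    return (int((p[0] + q[0]) / 2), int((p[1] + q[1]) / 2))


def box_to_stroke(box):
    """Turn a 4-corner text box into a thick line segment that covers it.

    Assumes corners are ordered top-left, top-right, bottom-right,
    bottom-left. Returns (start, end, thickness).
    """
    top_left, top_right, bottom_right, bottom_left = box
    start = midpoint(top_left, bottom_left)    # middle of the left edge
    end = midpoint(top_right, bottom_right)    # middle of the right edge
    # Stroke thickness is the length of the right edge, i.e. the box height.
    height = math.hypot(bottom_right[0] - top_right[0],
                        bottom_right[1] - top_right[1])
    return start, end, max(1, int(height))


def inpaint_text(img, boxes):
    """Mask every box in `boxes` and paint over the masked pixels."""
    # Imported here so the geometry helpers above work without OpenCV.
    import cv2
    import numpy as np

    mask = np.zeros(img.shape[:2], dtype="uint8")
    for box in boxes:
        start, end, thickness = box_to_stroke(box)
        cv2.line(mask, start, end, 255, thickness)
    # Radius 7 and INPAINT_NS are illustrative choices, not tuned values.
    return cv2.inpaint(img, mask, 7, cv2.INPAINT_NS)
```

With keras_ocr, the boxes would typically come from `keras_ocr.pipeline.Pipeline().recognize([img])[0]`, which yields `(word, box)` pairs for the image; passing only the `box` part of each pair to `inpaint_text` gives the cleaned image.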

We’ve added an app.run() decorator to remove_text. This decorator lets you run the function on Beam instead of on your laptop.

You can run this code on Beam by running beam run:

beam run app.py:remove_text

Make sure to include a sample image in your working directory, and update the script with the path. In this example, I’m using this image as a sample:

Deployment

If you’re satisfied with this function and want to deploy it as an API, update the decorator: replace @app.run() with @app.task_queue().

# This function will be exposed as a web API when deployed!
@app.task_queue()
def remove_text(**inputs):
    ...

You can deploy this app by running:

beam deploy app.py

You’ll call the API by copying the task queue URL from the dashboard.
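
A request to the deployed task queue could be assembled along these lines. This is only a sketch: the input key `image`, the URL passed in, and the `Authorization` header value are assumptions and should be replaced with the values shown in your dashboard. `build_request` is a hypothetical helper that builds the request without sending it.

```python
import base64
import json
import urllib.request


def build_request(url, image_bytes, auth_header):
    """Build (but do not send) a POST request carrying a base64 image."""
    # The "image" key is an assumed input name, not taken from the tutorial.
    payload = {"image": base64.b64encode(image_bytes).decode("ascii")}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": auth_header,
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request. Keeping construction separate from sending makes the payload easy to inspect before anything goes over the network.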

Since this task runs asynchronously, you’ll use the /v1/task/{task_id}/status/ API to retrieve the task status and a link to download the image output.

This will return a response, which contains:

  • Task ID
  • The start and end time
  • A dictionary with pre-signed URLs to download the outputs
{
  "task_id": "edbcf7ff-e8ce-4199-8661-8e15ed880481",
  "started_at": "2023-04-24T22:44:06.911920Z",
  "ended_at": "2023-04-24T22:44:07.184763Z",
  "outputs": {
    "my-output-1": {
      "path": "output_path",
      "name": "my-output-1",
      "url": "http://data.beam.cloud/outputs/6446df99cf455a04e0335d9b/hw6hx/hw6hx-0001/edbcf7ff-e8ce-4199-8661-8e15ed880481/my-output-1.zip?..."
    }
  },
  "status": "COMPLETE"
}
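
Because the task is asynchronous, a client usually polls the status endpoint until `status` is `COMPLETE` and then reads the pre-signed URLs. The sketch below assumes a caller-supplied `fetch_status(task_id)` function (hypothetical) that performs the authenticated GET against `/v1/task/{task_id}/status/` and returns the parsed JSON. `wait_for_task` and `output_urls` are illustrative names, and `COMPLETE` is the only status value taken from the response shown above.

```python
import time


def wait_for_task(fetch_status, task_id, interval=2.0, timeout=300.0):
    """Poll until the task reports COMPLETE, then return the response dict.

    Any other status is treated as "not finished yet", so a task that
    never completes ends in a TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        response = fetch_status(task_id)
        if response.get("status") == "COMPLETE":
            return response
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not complete in time")
        time.sleep(interval)


def output_urls(response):
    """Map each output name to its pre-signed download URL."""
    outputs = response.get("outputs", {})
    return {name: info["url"] for name, info in outputs.items()}
```

Each URL returned by `output_urls` can then be opened in a browser or fetched with any HTTP client.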

Open the url listed under outputs in your browser to download the image. You’ll see that the text has been removed: