Whisper Video Transcription
Deploying Whisper as a REST API to transcribe YouTube videos
In this example, we’ll build a simple app that transcribes YouTube videos using Whisper, a state-of-the-art speech recognition model.
Setting up the environment
First, you’ll set up your compute environment. You’ll specify:
- Compute requirements, including a GPU
- Python and system-level packages to install in the runtime
Transcribing YouTube Videos
We’ll write a basic function that takes a YouTube video URL, downloads the video as an Output with the youtube_dl library, and runs it through Whisper to generate a text transcript.
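As a rough illustration of that flow, here is a minimal sketch that uses only the youtube_dl and Whisper Python packages. It is not the code from this example: the function names `build_download_options` and `transcribe_youtube`, the `"base"` model size, and the choice to download into a temporary directory (rather than a Beam Output) are all assumptions made for illustration. Whisper also needs `ffmpeg` available on the system to decode the downloaded media.

```python
import os
import tempfile


def build_download_options(out_dir):
    """Hypothetical helper: youtube_dl options that fetch the best audio stream."""
    return {
        "format": "bestaudio/best",
        # Save the file as <video id>.<extension> inside out_dir.
        "outtmpl": os.path.join(out_dir, "%(id)s.%(ext)s"),
        "quiet": True,
    }


def transcribe_youtube(video_url, model_name="base"):
    """Download a video's audio and return Whisper's transcript text."""
    # Imported lazily so this module still loads where the packages are absent.
    import whisper
    import youtube_dl

    with tempfile.TemporaryDirectory() as out_dir:
        with youtube_dl.YoutubeDL(build_download_options(out_dir)) as ydl:
            info = ydl.extract_info(video_url, download=True)
            media_path = ydl.prepare_filename(info)

        # The model size is an assumption; larger models trade speed for accuracy.
        model = whisper.load_model(model_name)
        result = model.transcribe(media_path)

    return result["text"]
```

Calling `transcribe_youtube("<your video URL>")` would return the transcript as a single string. Loading the model inside the function keeps the sketch short; in a deployed app you would likely load it once and reuse it across requests.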
Deployment
To deploy the app, open your shell and run:
Your Beam Dashboard will open in a browser window, and you can monitor the deployment status in the web GUI.
Calling the API
You’ll call the API by pasting in the cURL command displayed in the browser window.
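If you would rather call the API from Python than from cURL, the same request could be sketched roughly as follows. Everything specific here is an assumption: the helper name `build_transcription_request`, the `video_url` payload key, and the JSON content type are hypothetical, and the endpoint URL and `Authorization` header value should be copied from the cURL command shown in your dashboard.

```python
import json
import urllib.request


def build_transcription_request(api_url, authorization, video_url):
    """Hypothetical helper: build a POST request carrying the video URL as JSON."""
    # The payload key "video_url" is an assumption; match it to your function's input.
    body = json.dumps({"video_url": video_url}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={
            # Full header value copied from the generated cURL command.
            "Authorization": authorization,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` sends it, and reading the response body gives whatever the API returned.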
The API will return a transcript of the video: