
Zac Liu provides a tutorial on how to use the A360 AI Platform to run OpenAI’s Whisper model without installing it yourself.

Introduction

OpenAI has released Whisper, its state-of-the-art speech recognition (speech-to-text) model. OpenAI states that Whisper approaches human-level robustness and accuracy on English speech recognition. The Whisper model is now open source and freely available for anyone to use.

Model

Installing the Whisper model in your local environment might not be straightforward. It requires Python 3.7+, a recent version of PyTorch, and FFmpeg, an audio-processing library. Here we provide an easy way to run Whisper in a cloud environment, without any complicated installation process, by using the A360 AI Platform Community Edition.
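
If you do want to try a local install, a quick check of the three prerequisites listed above can save some guesswork. The following is a minimal sketch, not part of Whisper or the A360 AI Platform: the function name check_whisper_prerequisites is hypothetical, and it only tests whether PyTorch is importable, not whether its version is recent enough.

```python
import importlib.util
import shutil
import sys


def check_whisper_prerequisites():
    """Report which of Whisper's stated prerequisites are present locally.

    Hypothetical helper for illustration; it does not verify the PyTorch version.
    """
    return {
        # Python 3.7 or newer, as stated in the text
        "python>=3.7": sys.version_info >= (3, 7),
        # PyTorch is importable as the top-level package "torch"
        "torch": importlib.util.find_spec("torch") is not None,
        # The ffmpeg executable is discoverable on PATH
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_whisper_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

Any entry reported as missing has to be installed before Whisper will run locally, which is the setup work the cloud route avoids.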

A360 AI Community Edition is freely available to data scientists. You can sign up here.

A360 AI Platform

Once your Jupyter server is configured, you will be able to use the starter notebook to begin transcribing your audio files into text!
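
For readers curious what transcription code looks like, here is a minimal sketch using the open-source whisper Python package. It is not the contents of the starter notebook, which may differ. The helper names list_audio_files and transcribe_folder, the extension list, and the "base" model size are assumptions chosen for illustration. Passing fp16=False requests full-precision inference, which suits CPU-only compute.

```python
from pathlib import Path

# Assumed set of audio extensions for this sketch; extend as needed.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac"}


def list_audio_files(folder):
    """Return the audio files in `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )


def transcribe_folder(folder, model_name="base"):
    """Transcribe every audio file in `folder` and return {file name: text}.

    Hypothetical helper; requires the openai-whisper package and FFmpeg.
    """
    import whisper  # imported lazily so the listing helper works without it

    model = whisper.load_model(model_name)
    transcripts = {}
    for path in list_audio_files(folder):
        # fp16=False runs inference in full precision, appropriate on CPU
        result = model.transcribe(str(path), fp16=False)
        transcripts[path.name] = result["text"].strip()
    return transcripts
```

Calling transcribe_folder on a folder of recordings returns a dictionary of transcripts keyed by file name; the model is loaded once and reused for every file.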

Note: Because the Community Edition provides CPU compute only, each 10-second audio file takes about 5-7 seconds to transcribe. If you are interested in using a GPU on the A360 AI Platform, please reach out to us here.
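
To get a rough feel for longer recordings, the figures in the note can be turned into a back-of-the-envelope estimate. The sketch below assumes transcription time scales linearly with audio length, which the note does not claim and which may not hold in practice. The function name is hypothetical.

```python
# Ratios derived from the note: 5-7 seconds of compute per 10 seconds of audio.
LOW_RATIO = 5 / 10
HIGH_RATIO = 7 / 10


def estimate_transcription_seconds(audio_seconds):
    """Return a (low, high) estimate of CPU transcription time in seconds.

    Assumes linear scaling with audio length, an assumption not made by the source.
    """
    return audio_seconds * LOW_RATIO, audio_seconds * HIGH_RATIO


if __name__ == "__main__":
    low, high = estimate_transcription_seconds(60)
    print(f"Estimated time for 60 s of audio: {low:.0f}-{high:.0f} s")
```

Under that assumption, a one-minute clip would take somewhere around 30 to 42 seconds on the Community Edition's CPU compute.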

