Based on https://github.com/morioka/tiny-openai-whisper-api, using whisper-mlx instead of openai-whisper.
OpenAI Whisper API-style local server, running on FastAPI. This is for companies behind proxies or security firewalls.
This API is intended to be compatible with the OpenAI Whisper (speech-to-text) API. See also Create transcription - API Reference - OpenAI API.
Some of the code has been copied from whisper-ui.
This was built and tested on Python 3.10.8, Ubuntu 20.04/WSL2, but should also work on Python 3.9+.
sudo apt install ffmpeg
pip install fastapi python-multipart pydantic uvicorn ffmpeg-python
# or pip install -r requirements.txt
or
docker compose build
export PYTHONPATH=.
uvicorn main:app --host 0.0.0.0
or
docker compose up
note: the Authorization header is ignored.
example 1: typical use case, identical to the OpenAI Whisper API example
curl http://127.0.0.1:8000/v1/audio/transcriptions \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: multipart/form-data" \
-F model="whisper-1" \
-F file="@/path/to/file/openai.mp3"
example 2: set the output format to text, as described in the quickstart.
curl http://127.0.0.1:8000/v1/audio/transcriptions \
-H "Content-Type: multipart/form-data" \
-F model="whisper-1" \
-F file="@/path/to/file/openai.mp3" \
-F response_format=text
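The two curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library, not code from this repository: the helper names `build_multipart` and `transcribe` are hypothetical, the base URL assumes the server was started with uvicorn on its default port 8000 as shown earlier, and the Authorization header is omitted because the server ignores it.

```python
import urllib.request
import uuid
from pathlib import Path


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # The file part carries a filename and a generic binary content type.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, base_url="http://127.0.0.1:8000", response_format=None):
    """POST an audio file to /v1/audio/transcriptions and return the raw body.

    base_url is an assumption (local uvicorn default); adjust as needed.
    """
    audio = Path(path)
    fields = {"model": "whisper-1"}
    if response_format is not None:
        # Mirrors `-F response_format=text` in example 2.
        fields["response_format"] = response_format
    body, content_type = build_multipart(
        fields, "file", audio.name, audio.read_bytes()
    )
    request = urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server running, `transcribe("/path/to/file/openai.mp3")` corresponds to example 1 and `transcribe("/path/to/file/openai.mp3", response_format="text")` to example 2. The function returns the response body unparsed, since its shape depends on the requested format.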
Whisper is licensed under MIT. Everything else by morioka is licensed under MIT.