Audio transcriptions with WebUI and transformers serve
This guide shows how to set up audio transcription for chat, using transformers serve and Open WebUI. It assumes you have Open WebUI installed on your machine and ready to run. To use the text functionalities of transformers serve with Open WebUI, refer to the examples above; the instructions are the same.
To start, let’s launch the server. Some of Open WebUI’s requests require CORS, which is disabled by default for security reasons, so you need to enable it:
transformers serve --enable-cors
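As a quick sanity check that CORS is active, you can send the server a request that carries an `Origin` header, as a browser would, and inspect the response headers. The sketch below is illustrative only: it assumes the server exposes an OpenAI-style `/v1/models` route at the default address, and the `ORIGIN` value is a hypothetical Open WebUI address that you should replace with your own.

```python
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # default server address used in this guide
ORIGIN = "http://localhost:3000"       # hypothetical Open WebUI origin; adjust to yours


def build_cors_probe(base_url=BASE_URL, origin=ORIGIN):
    """Build a GET request that carries an Origin header, as a browser would."""
    # Assumption: the server lists models at an OpenAI-style /models route.
    return urllib.request.Request(f"{base_url}/models", headers={"Origin": origin})


def allows_origin(response_headers, origin=ORIGIN):
    """Return True if the CORS response header admits the given origin."""
    allowed = response_headers.get("Access-Control-Allow-Origin")
    return allowed in ("*", origin)


def check_cors():
    """Send the probe to a running server and report whether CORS looks enabled."""
    with urllib.request.urlopen(build_cors_probe()) as response:
        return allows_origin(response.headers)
```

With the server running, calling `check_cors()` should return `True` when it was started with `--enable-cors`.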
Before you can speak into Open WebUI, you need to update its settings to use your server for speech-to-text (STT) tasks. Launch Open WebUI, and navigate to the audio tab inside the admin settings. If you’re using Open WebUI with the default ports, this link (default) or this link (python deployment) will take you there. Make the following changes there:
- Change the type of “Speech-to-Text Engine” to “OpenAI”;
- Update the address to your server’s address (http://localhost:8000/v1 by default);
- Type your model of choice into the “STT Model” field, e.g. openai/whisper-large-v3 (available models).
If you’ve done everything correctly, the audio tab should show “OpenAI” as the Speech-to-Text Engine, your server’s address, and your chosen STT model.
You’re now ready to speak! Open a new chat, hit the microphone button, utter a few words, and you should see the corresponding text in the chat input after the model transcribes it.
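If the text does not appear, it can help to test the server without Open WebUI in the loop. The following is a minimal sketch, not the official client: it assumes that the “OpenAI” engine setting means requests go to an OpenAI-style `/audio/transcriptions` route under the address configured above, and that the server accepts a multipart form with `model` and `file` fields. The helper names and the `sample.wav` path are hypothetical.

```python
import os
import urllib.request
import uuid

BASE_URL = "http://localhost:8000/v1"  # default address from the settings above
MODEL = "openai/whisper-large-v3"      # example model from the settings above


def build_transcription_request(audio_bytes, filename, model=MODEL, base_url=BASE_URL):
    """Build a multipart/form-data POST for an OpenAI-style transcription route."""
    boundary = uuid.uuid4().hex
    # First form field: the model name. Second form field: the audio file.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/audio/transcriptions",  # assumed route, see lead-in
        data=head + audio_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def transcribe(path):
    """Send an audio file to a running server and return the raw response body."""
    with open(path, "rb") as audio_file:
        request = build_transcription_request(audio_file.read(), os.path.basename(path))
    with urllib.request.urlopen(request) as response:
        # An OpenAI-style server is expected to answer with JSON holding a "text" field.
        return response.read().decode("utf-8")
```

With the server running, `print(transcribe("sample.wav"))` should print the server's response for that file. If this works but Open WebUI still shows nothing, the audio settings or CORS are the more likely culprits.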