Listen to DeepSeek's thinking process in real-time! This script converts DeepSeek's thinking tags (`<think>...</think>`) to speech using Kokoro TTS, allowing you to hear the model's "thoughts" as it reasons through your questions.
Special thanks to Kris @AllAboutAI-YT for the inspiration behind this project.
- Streams DeepSeek responses through Ollama
- Detects and processes thinking tags in real-time
- Converts "thoughts" to speech using Kokoro TTS
- Supports multiple voice combinations (e.g., af_sky+af_bella); see the request sketch after this list
- Real-time audio playback of AI reasoning
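Voice blending works by joining voice names with `+` in the request's `voice` field. Below is a minimal sketch of such a request against the Kokoro-FastAPI server set up in the installation steps; it assumes the server's OpenAI-compatible `/v1/audio/speech` endpoint, and exact payload fields may vary between server versions:

```python
import requests

# Illustrative request to a local Kokoro-FastAPI server (started in the
# installation steps below). The "voice" field blends two voices with '+'.
response = requests.post(
    "http://localhost:8880/v1/audio/speech",
    json={
        "model": "kokoro",
        "input": "Testing a blended voice.",
        "voice": "af_sky+af_bella",  # combined voices
        "response_format": "mp3",
    },
    timeout=60,
)
response.raise_for_status()
with open("blended.mp3", "wb") as f:
    f.write(response.content)
```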
- Python 3.10+
- Ollama with a DeepSeek model installed (e.g., deepseek-r1:14b)
- Docker for running Kokoro TTS
- For GPU support: NVIDIA GPU + CUDA
- Clone the repository:
```bash
git clone https://github.com/yourusername/deepseek-thinking-tts.git
cd deepseek-thinking-tts
```
- Install Python requirements:
```bash
pip install -r requirements.txt
```
- Start Kokoro TTS server:
For CPU:
```bash
docker run -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-cpu:v0.1.0post1
```
For GPU (requires NVIDIA GPU + CUDA):
```bash
docker run --gpus all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu:v0.1.0post1
```
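Once the container is running, you can optionally confirm it is reachable before wiring it up to the script. This sketch assumes the server exposes a voice-listing endpoint at `/v1/audio/voices`, as the Kokoro-FastAPI project describes; adjust for your server version if needed:

```python
import requests

# List the voices the running Kokoro server exposes; a JSON response
# confirms the container is up on port 8880.
print(requests.get("http://localhost:8880/v1/audio/voices", timeout=10).json())
```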
- Start Ollama with the DeepSeek model:
```bash
ollama run deepseek-r1:14b
```
Run the script:
```bash
python deepseek-think-tts.py
```
The script will detect DeepSeek's thinking tags and convert them to speech in real-time, letting you hear the AI's reasoning process out loud.
- The script connects to Ollama running the DeepSeek model
- It monitors the output stream for thinking tags (`<think>...</think>`)
- When thinking content is detected, it's sent to Kokoro TTS
- The generated speech is played in real-time through your speakers (a condensed sketch of this loop follows)
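In condensed form, the loop looks roughly like the sketch below. It assumes the `ollama` Python package, a local Kokoro-FastAPI server on port 8880, and that the model emits its reasoning inline as `<think>...</think>` tags; playback is omitted for brevity, and the real script's buffering strategy differs:

```python
import re
import requests
import ollama  # pip install ollama

def speak(text: str) -> None:
    """Send one chunk of 'thinking' text to Kokoro TTS and save the audio."""
    audio = requests.post(
        "http://localhost:8880/v1/audio/speech",
        json={"model": "kokoro", "input": text, "voice": "af_sky+af_bella"},
        timeout=120,
    ).content
    with open("thought.mp3", "wb") as f:
        f.write(audio)  # a real script would play this instead of saving it

buffer = ""
for chunk in ollama.chat(
    model="deepseek-r1:14b",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    stream=True,
):
    buffer += chunk["message"]["content"]
    # Speak each <think>...</think> block once it has fully closed.
    for thought in re.findall(r"<think>(.*?)</think>", buffer, re.DOTALL):
        speak(thought.strip())
        buffer = buffer.replace(f"<think>{thought}</think>", "")
```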
- Kokoro-FastAPI - TTS server
- Ollama - Local LLM runner
- DeepSeek - Language model