Listen to DeepSeek's thinking process in real-time! This script converts DeepSeek's thinking tags (<think>...</think>) to speech using Kokoro TTS, allowing you to hear the model's "thoughts" as it reasons through your questions.

dwain-barnes/DeepSeek-Thinking-TTS

DeepSeek Thinking TTS

Listen to DeepSeek's thinking process in real-time! This script converts DeepSeek's thinking tags (<think>...</think>) to speech using Kokoro TTS, allowing you to hear the model's "thoughts" as it reasons through your questions.

Thanks

Special thanks to Kris @AllAboutAI-YT for the inspiration behind this project.

Features

  • Streams DeepSeek responses through Ollama
  • Detects and processes thinking tags in real-time
  • Converts "thoughts" to speech using Kokoro TTS
  • Supports multiple voice combinations (e.g., af_sky+af_bella)
  • Real-time audio playback of AI reasoning
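
To illustrate the voice-combination feature, here is a minimal sketch of how a request to the Kokoro TTS server might be built. The README only confirms the port (8880) and the `af_sky+af_bella` voice syntax; the `/v1/audio/speech` path, the `model` field, and the helper names `build_speech_request` and `synthesize` are assumptions based on Kokoro-FastAPI's OpenAI-style interface, not code taken from this repository.

```python
import json
import urllib.request

# Assumed endpoint: Kokoro-FastAPI's OpenAI-compatible speech route on the port used in this README.
KOKORO_URL = "http://localhost:8880/v1/audio/speech"


def build_speech_request(text, voice="af_sky+af_bella", fmt="mp3"):
    """Build (but do not send) a POST request asking Kokoro to speak `text`.

    Voices joined with "+" request a blend, e.g. "af_sky+af_bella".
    """
    payload = {
        "model": "kokoro",        # assumed model name
        "input": text,
        "voice": voice,
        "response_format": fmt,
    }
    return urllib.request.Request(
        KOKORO_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, **kwargs):
    """Send the request and return the raw audio bytes (requires the server to be running)."""
    with urllib.request.urlopen(build_speech_request(text, **kwargs), timeout=60) as resp:
        return resp.read()
```

Keeping request construction separate from the network call makes it easy to inspect the payload or swap in a different voice blend without a running server.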

Requirements

  • Python 3.10+
  • Ollama with DeepSeek model installed
  • Docker for running Kokoro TTS
  • For GPU support: NVIDIA GPU + CUDA

Installation

  1. Clone the repository:
git clone https://github.com/dwain-barnes/DeepSeek-Thinking-TTS.git
cd DeepSeek-Thinking-TTS
  2. Install Python requirements:
pip install -r requirements.txt
  3. Start Kokoro TTS server:

For CPU:

docker run -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-cpu:v0.1.0post1

For GPU (requires NVIDIA GPU + CUDA):

docker run --gpus all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu:v0.1.0post1
  4. Start Ollama with DeepSeek model:
ollama run deepseek-r1:14b
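
Once the model is available, a quick way to confirm the Ollama side is to stream a prompt through its HTTP API. The sketch below assumes Ollama's default address (`http://localhost:11434`) and its `/api/generate` endpoint, which streams newline-delimited JSON objects carrying a `response` fragment and a `done` flag. It also assumes the reasoning arrives inline in `response`. The function names are illustrative and are not taken from this repository's script.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def iter_response_text(lines):
    """Yield text fragments from an iterable of NDJSON lines (bytes or str)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line:
            continue  # skip blank keep-alive lines
        event = json.loads(line)
        if event.get("response"):
            yield event["response"]
        if event.get("done"):
            break  # final event of the stream


def stream_generate(prompt, model="deepseek-r1:14b"):
    """Stream a completion from a running Ollama server, fragment by fragment."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": True}).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        # The HTTP response object iterates line by line, one JSON event per line.
        yield from iter_response_text(resp)
```

With the server running, `for piece in stream_generate("What is 2+2?"): print(piece, end="")` should print the model's output as it is produced.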

Usage

Run the script:

python deepseek-think-tts.py

The script detects DeepSeek's thinking tags and converts the text inside them to speech in real-time, letting you hear the AI's reasoning process out loud.
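
One way to keep playback close to real-time is to send the thinking text to TTS a sentence at a time, so speech can start before the model has finished reasoning. The following is a sketch of that idea under the assumption that sentence-level chunking is acceptable; the `SentenceBuffer` class is hypothetical and the README does not state how the actual script chunks its input.

```python
import re

# Split at whitespace that follows sentence-ending punctuation.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


class SentenceBuffer:
    """Accumulates streamed text and releases it one complete sentence at a time."""

    def __init__(self):
        self._pending = ""

    def push(self, text):
        """Add a streamed fragment; return a list of sentences completed so far."""
        self._pending += text
        parts = _SENTENCE_END.split(self._pending)
        self._pending = parts.pop()  # the last piece may still be incomplete
        return [p.strip() for p in parts if p.strip()]

    def flush(self):
        """Return whatever is left once the stream has ended."""
        rest, self._pending = self._pending.strip(), ""
        return [rest] if rest else []
```

Each sentence returned by `push` can be handed to the TTS server immediately, and `flush` covers a final fragment that never received closing punctuation.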

How it works

  1. The script connects to Ollama running the DeepSeek model
  2. It monitors the output stream for thinking tags (<think>...</think>)
  3. When thinking content is detected, it's sent to Kokoro TTS
  4. The generated speech is played in real-time through your speakers

Credits
