Transcribe-YT is a Node.js application that downloads YouTube videos, extracts their audio, and transcribes them using OpenAI's Whisper API or AssemblyAI.
- Node.js (v18.19.1 or later recommended)
- Yarn v4
- FFmpeg installed on your system
- An OpenAI API key
- An AssemblyAI API key
- Clone this repository:
    git clone https://github.com/yourusername/transcribe-yt.git
    cd transcribe-yt
- Install dependencies using Yarn v4:
    yarn install
- Copy the example configuration file and edit it with your YouTube video URLs:
    cp config.example.yaml config.yaml
  Then edit config.yaml to include the YouTube video URLs you want to transcribe.
- Create a .env file in the root directory and add your API keys:
    OPENAI_API_KEY=your_openai_api_key_here
    ASSEMBLYAI_API_KEY=your_assemblyai_api_key_here
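A missing key usually shows up only when the first API request fails, so it can help to check the environment before any download starts. The following is a minimal sketch, not the project's actual code: the function name `missingApiKeys` and the service labels `"openai"` and `"assemblyai"` are assumptions, and only the two variable names come from the .env example.

```typescript
// Assumed service labels; the project's config may spell these differently.
type Service = "openai" | "assemblyai";

// Environment variable names taken from the .env example.
const REQUIRED_KEY: Record<Service, string> = {
  openai: "OPENAI_API_KEY",
  assemblyai: "ASSEMBLYAI_API_KEY",
};

// Hypothetical helper: return the names of required keys that are
// unset or blank for the given services.
function missingApiKeys(
  env: Record<string, string | undefined>,
  services: Service[],
): string[] {
  return services
    .map((service) => REQUIRED_KEY[service])
    .filter((name) => (env[name] ?? "").trim() === "");
}
```

Calling `missingApiKeys(process.env, ["openai"])` at startup and exiting when the result is non-empty would give a clearer error than a rejected request.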
- Ensure your config.yaml file contains the YouTube video URLs you want to transcribe and specifies the transcription service to use.
- Run the application:
    yarn start
The script will:
- Download the audio from each YouTube video
- Convert the audio to MP3 format
- Transcribe the audio using either OpenAI's Whisper API or AssemblyAI (as specified in config.yaml)
- Save the transcriptions in the transcripts directory
- Clean up the temporary audio files
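The steps above can be sketched as a small pipeline. This is an illustration under stated assumptions rather than the contents of processVideos.ts: the `PipelineSteps` interface and every function name in it are hypothetical stand-ins for the real download, conversion, and transcription calls. Processing videos one at a time and continuing after a failure are choices made for this sketch.

```typescript
// Hypothetical step functions. In the real project these would be backed by
// ytdl-core, FFmpeg, and the chosen transcription API.
interface PipelineSteps {
  downloadAudio(url: string): Promise<string>; // resolves to the raw audio path
  convertToMp3(inputPath: string): Promise<string>; // resolves to the MP3 path
  transcribe(mp3Path: string): Promise<string>; // resolves to the transcript text
  saveTranscript(url: string, text: string): Promise<void>;
  removeFile(path: string): Promise<void>;
}

async function processVideo(url: string, steps: PipelineSteps): Promise<void> {
  const tempFiles: string[] = [];
  try {
    const rawPath = await steps.downloadAudio(url);
    tempFiles.push(rawPath);
    const mp3Path = await steps.convertToMp3(rawPath);
    tempFiles.push(mp3Path);
    const text = await steps.transcribe(mp3Path);
    await steps.saveTranscript(url, text);
  } finally {
    // Remove temporary audio even when one of the steps above throws.
    for (const file of tempFiles) {
      await steps.removeFile(file);
    }
  }
}

// Process URLs sequentially and return the ones that failed.
async function processAll(urls: string[], steps: PipelineSteps): Promise<string[]> {
  const failed: string[] = [];
  for (const url of urls) {
    try {
      await processVideo(url, steps);
    } catch {
      failed.push(url);
    }
  }
  return failed;
}
```

Putting the cleanup in `finally` keeps the audio directory empty after a failed transcription, at the cost of having to download again on retry.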
- config.yaml: List the YouTube video URLs you want to transcribe and specify the transcription service to use.
- .env: Store your OpenAI and AssemblyAI API keys.
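Since config.yaml carries both the URL list and the service choice, validating it right after parsing catches typos early. The sketch below assumes a shape the README does not confirm: the key names `videos` and `service` and the values `"openai"` and `"assemblyai"` are guesses, so check config.example.yaml for the real ones.

```typescript
// Hypothetical shape of the parsed config.yaml; key names are assumptions.
interface TranscribeConfig {
  videos: string[];
  service: "openai" | "assemblyai";
}

// Check an already-parsed YAML value and return it as a typed config.
function validateConfig(raw: unknown): TranscribeConfig {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("config must be a YAML mapping");
  }
  const { videos, service } = raw as Record<string, unknown>;
  if (!Array.isArray(videos) || videos.length === 0) {
    throw new Error("config.videos must be a non-empty list of URLs");
  }
  for (const url of videos) {
    if (typeof url !== "string" || !/^https?:\/\//.test(url)) {
      throw new Error(`not a valid video URL: ${String(url)}`);
    }
  }
  if (service !== "openai" && service !== "assemblyai") {
    throw new Error('config.service must be "openai" or "assemblyai"');
  }
  return {
    videos: videos as string[],
    service: service as TranscribeConfig["service"],
  };
}
```

The `raw` argument would typically be the result of js-yaml's `load` applied to the file contents.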
- processVideos.ts: Main script that handles video processing and transcription.
- package.json: Defines project dependencies and scripts.
- config.yaml: Contains the list of YouTube video URLs to process.
- transcripts/: Directory where transcriptions are saved.
- audio/: Temporary directory for audio files (cleaned up after processing).
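One way to map each video onto the audio/ and transcripts/ directories is to name files after the video ID. The README does not say how files are actually named, so the scheme below is purely an assumption, as are the function names `videoIdFromUrl` and `outputPaths`. The pattern relies on YouTube video IDs currently being 11 characters long.

```typescript
// Extract the video ID from the common URL forms watch?v=<id> and
// youtu.be/<id>. Returns null when no ID is found.
function videoIdFromUrl(url: string): string | null {
  const match = url.match(/(?:[?&]v=|youtu\.be\/)([A-Za-z0-9_-]{11})/);
  return match?.[1] ?? null;
}

// Hypothetical naming scheme: one MP3 and one text file per video ID.
function outputPaths(url: string): { audio: string; transcript: string } {
  const id = videoIdFromUrl(url);
  if (id === null) {
    throw new Error(`cannot find a video ID in: ${url}`);
  }
  return { audio: `audio/${id}.mp3`, transcript: `transcripts/${id}.txt` };
}
```

Naming by ID keeps reruns idempotent, since the same URL always maps to the same transcript file.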
Key dependencies include:
- @distube/ytdl-core: For downloading YouTube videos
- fluent-ffmpeg: For audio processing
- openai: For interacting with the OpenAI API
- assemblyai: For interacting with the AssemblyAI API
- js-yaml: For parsing the YAML configuration file
- tsx: For running TypeScript files directly
For a full list of dependencies, refer to package.json.
This project is UNLICENSED.
Contributions are welcome! Please feel free to submit a Pull Request.
This tool is for educational and personal use only. Ensure you have the right to download and transcribe the YouTube content you're processing.