This project extracts transcriptions from YouTube videos and processes them using the Groq AI model. It utilizes the youtube_transcript_api
to fetch video subtitles and sends the transcribed text to an AI model for further analysis.
- Extracts transcriptions from YouTube videos using
youtube_transcript_api
. - Processes the extracted text using Groq's AI model (
llama-3.3-70b-versatile
). - Allows customization of AI behavior via a system prompt file.
- Supports configurable parameters such as
temperature
,max_tokens
, andtop_p
for AI-generated responses.
- Python 3.8+
- pip
- A valid Groq API key
- Clone the repository:
git clone [https://github.com/saadtariq10/youtube-transcriber.git] cd youtube-transcriber
- Install dependencies:
pip install -r requirements.txt
- Create a
.env
file and add your Groq API key:GROQ_API_KEY=your_api_key_here
- Ensure you have a
system_prompt.txt
file in the project directory containing your desired system message for the AI.
- Run the main script with a YouTube video URL:
python main.py
- The script will:
- Extract the video ID from the provided URL.
- Retrieve and process the transcription.
- Send the transcription to the AI model for further analysis.
- Display the processed output.
project_root/
βββ main.py # Main execution script
βββ transcribe_module.py # Handles YouTube transcription
βββ ai_model.py # AI processing module
βββ system_prompt.txt # Customizable AI prompt
βββ .env # API key storage (not included in repo)
βββ requirements.txt # List of dependencies
groq
python-dotenv
youtube_transcript_api
re
Install dependencies using:
pip install -r requirements.txt
This project is licensed under the MIT License. See LICENSE
for more details.
Pull requests are welcome! For major changes, please open an issue first to discuss your ideas.
Saad Tariq