AudioTranscriber is a Python-based tool designed to automate the process of converting audio from video files into text. It efficiently extracts audio from various video formats, splits large audio files into manageable segments, and leverages OpenAI's Whisper API to transcribe the audio. This tool is ideal for content creators, researchers, and developers seeking an automated and reliable transcription solution.
- Audio Extraction: Extracts audio from video formats such as MP4, MOV, AVI, MKV, FLV, and WMV.
- Audio Splitting: Automatically splits very large audio files into smaller parts to comply with size limitations.
- Transcription: Utilizes OpenAI's Whisper API for efficient and accurate audio transcription.
- Summarization: Creates a summary of the transcribed content using a configurable GPT model.
- Error Handling: Handles authentication, rate limits, connection issues, and other types of failures.
- Environment Configuration: Easy setup using a .env file to securely manage the API key.
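The splitting step can be pictured with a short sketch. This is not the project's actual code: the function name plan_segments is hypothetical, the 25 MB cap is an assumed upload limit for the transcription API, and the audio is assumed to have a roughly constant bitrate so that equal durations give roughly equal file sizes.

```python
import math

# Assumed upload limit per request (not stated in this README).
MAX_BYTES = 25 * 1024 * 1024


def plan_segments(duration_s: float, size_bytes: int,
                  max_bytes: int = MAX_BYTES) -> list[tuple[float, float]]:
    """Return (start, end) times in seconds for equal-length segments.

    The number of segments is the smallest count that keeps each part
    at or below max_bytes, assuming size grows linearly with duration.
    """
    if duration_s < 0 or size_bytes < 0 or max_bytes <= 0:
        raise ValueError("duration and sizes must be non-negative, limit positive")
    count = max(1, math.ceil(size_bytes / max_bytes))
    segment_s = duration_s / count
    return [
        (i * segment_s, min((i + 1) * segment_s, duration_s))
        for i in range(count)
    ]


if __name__ == "__main__":
    # A 10-minute, 60 MB file needs three parts under a 25 MB cap.
    for start, end in plan_segments(600.0, 60 * 1024 * 1024):
        print(f"{start:.0f}s to {end:.0f}s")
```

Each (start, end) pair could then be handed to whatever tool cuts the audio; files already under the limit come back as a single segment.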
git clone https://github.com/proteusbr1/AudioTranscriber
cd AudioTranscriber
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
Ensure you have pip installed. Then run:
pip install -r requirements.txt
Create a .env file in the root directory of the project and add your OpenAI API key:
OPENAI_API_KEY=your_openai_api_key_here
Note: Replace your_openai_api_key_here with your actual OpenAI API key. You can obtain an API key from the OpenAI Platform.
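For readers curious how a key in .env typically reaches the program, here is a minimal standard-library sketch. It is an illustration only: the project may well use a library such as python-dotenv instead, and the helper names load_env_file and get_api_key are hypothetical.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path: str = ".env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("OPENAI_API_KEY") or load_env_file(path).get("OPENAI_API_KEY", "")
    if not key or key == "your_openai_api_key_here":
        raise RuntimeError("OPENAI_API_KEY is missing or still the placeholder value")
    return key
```

Failing early with a clear message when the placeholder is still in place tends to be easier to debug than an authentication error later in the run.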
Run the main script main.py by providing the path to an audio or video file:
python main.py --input path/to/audio_or_video.mp4
By default, the audio, transcription, and summary languages are all English (en). Use the options below to change them:
- --audio_language or -al: Defines the original language of the audio. (Default: en)
- --transcript_language or -tl: Defines the language for the final transcription. (Note: Whisper does not automatically translate)
- --summary_language or -sl: Defines the language for the summary. (Default: en)
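The options above map naturally onto argparse. The sketch below is an assumption about how such a command line could be declared, not a copy of main.py: the help strings, the function name build_parser, and the default of en for --transcript_language are guesses based on the descriptions in this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags described in this README (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Transcribe and summarize an audio or video file.",
    )
    parser.add_argument("--input", required=True,
                        help="Path to the audio or video file.")
    parser.add_argument("--output", default=None,
                        help="Path for the transcription text file.")
    parser.add_argument("--audio_language", "-al", default="en",
                        help="Original language of the audio.")
    parser.add_argument("--transcript_language", "-tl", default="en",
                        help="Language for the final transcription.")
    parser.add_argument("--summary_language", "-sl", default="en",
                        help="Language for the summary.")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.input, args.audio_language,
          args.transcript_language, args.summary_language)
```

With this layout, passing -sl pt changes only the summary language and leaves the other two at their defaults.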
Example:
python main.py --input path/to/video.mp4 --output path/to/output_transcription.txt --audio_language en --transcript_language en --summary_language pt
In this example, the audio will be transcribed in English, and a summary will subsequently be generated in Portuguese.
To see all available options:
python main.py -h
Contributions are welcome! To enhance the project, follow these steps:
- Fork the Repository
- Create a New Branch
  git checkout -b feature/YourFeature
- Commit Your Changes
  git commit -m "Add Your Feature"
- Push to the Branch
  git push origin feature/YourFeature
- Open a Pull Request
  Provide a clear description of your changes and the reasons behind them.
This project is licensed under the MIT License.