Skip to content

proteusbr1/AudioTranscriber

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AudioTranscriber

AudioTranscriber is a Python-based tool designed to automate the process of converting audio from video files into text. It efficiently extracts audio from various video formats, splits large audio files into manageable segments, and leverages OpenAI's Whisper API to transcribe the audio. This tool is ideal for content creators, researchers, and developers seeking an automated and reliable transcription solution.

🛠️ Features

  • Audio Extraction: Extracts audio from video formats such as MP4, MOV, AVI, MKV, FLV, and WMV.
  • Audio Splitting: Automatically splits very large audio files into smaller parts to comply with size limitations.
  • Transcription: Utilizes OpenAI's Whisper API for efficient and accurate audio transcription.
  • Summarization: Creates a summary of the transcribed content using a configurable GPT model.
  • Error Handling: Handles authentication, rate limits, connection issues, and other types of failures.
  • Environment Configuration: Easy setup using an .env file to securely manage the API key.

📦 Installation

1. Clone the Repository

git clone https://github.com/proteusbr1/AudioTranscriber
cd AudioTranscriber

2. Create a Virtual Environment (Optional but Recommended)

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

3. Install Dependencies

Ensure you have pip installed. Then run:

pip install -r requirements.txt

4. Set Up Environment Variables

Create a .env file in the root directory of the project and add your OpenAI API key:

OPENAI_API_KEY=your_openai_api_key_here

Note: Replace your_openai_api_key_here with your actual OpenAI API key. You can obtain an API key from the OpenAI Platform.

⚙️ Usage

Basic Usage

Run the main script main.py by providing the path to an audio or video file:

python main.py --input path/to/audio_or_video.mp4

By default, the audio language and the transcription/summarization languages will be English (en). You can specify the audio language, transcription language, and summary language using the options below:

  • --audio_language or -al: Defines the original language of the audio. (Default: en)
  • --transcript_language or -tl: Defines the language for the final transcription. (Note: Whisper does not automatically translate)
  • --summary_language or -sl: Defines the language for the summary. (Default: en)

Example:

python main.py --input path/to/video.mp4 --output path/to/output_transcription.txt --audio_language en --transcript_language en --summary_language pt

In this example, the audio will be transcribed in English, and a summary will subsequently be generated in Portuguese.

Help

To see all available options:

python main.py -h

🤝 Contributing

Contributions are welcome! To enhance the project, follow these steps:

  1. Fork the Repository

  2. Create a New Branch

    git checkout -b feature/YourFeature
  3. Commit Your Changes

    git commit -m "Add Your Feature"
  4. Push to the Branch

    git push origin feature/YourFeature
  5. Open a Pull Request

Provide a clear description of your changes and the reasons behind them.

📜 License

This project is licensed under the MIT License.

📚 Acknowledgements

  • OpenAI for the Whisper API.
  • Pydub for audio manipulation.
  • dotenv for environment variable management.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages