This project allows you to transcribe audio files from a given URL, detect the language of the transcription, and save the transcription to a text file.
- Download audio from a provided URL.
- Transcribe the audio to text using Whisper.
- Detect the language of the transcribed text.
- Save the transcribed text to a file.
- Python 3.7+
- Django 3.2+
- FFmpeg
- pip (Python package installer)
git clone https://github.com/ErickGBR/transcript-python-audio.git
cd transcript-python-audio
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
Create a `requirements.txt` file with the following content:
Django>=3.2
requests>=2.25.1
openai-whisper
langdetect
Then install the dependencies:
pip install -r requirements.txt
- Apply database migrations:
python manage.py migrate
- Run the Django development server:
python manage.py runserver
- Open your web browser and navigate to http://localhost:8000/transcribe/.
If you prefer to use Docker, follow these instructions:
Create a `Dockerfile` with the following content:
# Use an official Python runtime as a parent image
FROM python:3.9-slim
# Set the working directory in the container
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y \
ffmpeg \
libsndfile1
# Copy the requirements file into the container
COPY requirements.txt .
# Install the dependencies
RUN pip install --no-cache-dir -r requirements.txt
# Copy the current directory contents into the container at /app
COPY . .
# Expose port 8000
EXPOSE 8000
# Run the command to start the server
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]
- Build the Docker image:
docker build -t transcribe_audio .
- Run the Docker container:
docker run -p 8000:8000 transcribe_audio
- Open your web browser and navigate to http://localhost:8000/transcribe/.
- Navigate to http://localhost:8000/transcribe/.
- Enter the URL of an audio file (in MP3 format).
- Click "Transcribe".
- The transcribed text and detected language will be displayed, and the transcription will be saved to a text file.
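
Because the form expects the URL of an MP3 file, it can help to check the input before attempting a download. The following is a minimal sketch of such a check using only the standard library. The function name `looks_like_mp3_url` is hypothetical, and the project itself may not validate URLs this way.

```python
from urllib.parse import urlparse


def looks_like_mp3_url(url):
    """Return True if `url` is an http(s) URL whose path ends in .mp3.

    Hypothetical helper for illustration; not part of the project's code.
    """
    parsed = urlparse(url.strip())
    # Require a web scheme and a host, so "clip.mp3" alone is rejected.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # Compare the path only, so query strings such as ?token=abc are ignored.
    return parsed.path.lower().endswith(".mp3")
```

This only inspects the URL text. A server can still return something other than MP3 audio, so the download step should handle errors as well.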
- `app/helpers.py`: Contains the `AudioTranscriber` class with methods for downloading audio, transcribing it, detecting the language, and saving the transcription.
- `app/views.py`: Contains the view function `transcribe_audio_view` to handle the transcription process via the web interface.
- `app/templates/transcribe.html`: The HTML template for the web interface.
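
To illustrate how the four responsibilities of `AudioTranscriber` could fit together, here is a rough sketch. Only the class name comes from this project. The method names, signatures, and the default `"base"` Whisper model are assumptions, and the actual code in `app/helpers.py` may differ. Third-party imports are placed inside the methods so the class can be defined even before the dependencies are installed.

```python
import os
import tempfile


class AudioTranscriber:
    """Illustrative sketch; method names and signatures are assumptions."""

    def __init__(self, model_name="base"):
        self.model_name = model_name  # assumed default Whisper model size
        self._model = None  # the Whisper model is loaded lazily on first use

    def download_audio(self, url, timeout=30):
        """Download `url` into a temporary .mp3 file and return its path."""
        import requests  # third-party, listed in requirements.txt

        response = requests.get(url, stream=True, timeout=timeout)
        response.raise_for_status()  # fail early on 4xx/5xx responses
        fd, path = tempfile.mkstemp(suffix=".mp3")
        with os.fdopen(fd, "wb") as handle:
            for chunk in response.iter_content(chunk_size=8192):
                handle.write(chunk)
        return path

    def transcribe(self, audio_path):
        """Return the transcribed text for the audio file at `audio_path`."""
        import whisper  # provided by the openai-whisper package; needs FFmpeg

        if self._model is None:
            self._model = whisper.load_model(self.model_name)
        result = self._model.transcribe(audio_path)
        return result["text"].strip()

    def detect_language(self, text):
        """Return a language code such as 'en' for the given text."""
        from langdetect import detect  # third-party, listed in requirements.txt

        return detect(text)

    def save_transcription(self, text, output_path):
        """Write `text` to `output_path` as UTF-8 and return the path."""
        with open(output_path, "w", encoding="utf-8") as handle:
            handle.write(text)
        return output_path
```

Under these assumptions, a view such as `transcribe_audio_view` would call the methods in order: download, transcribe, detect the language, then save.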
If you encounter a `permission denied` error when running Docker commands, you may need to add your user to the `docker` group:
sudo usermod -aG docker $USER
Then log out and log back in so the new group membership takes effect.
Ensure all required dependencies are installed. On Debian or Ubuntu, you can install FFmpeg with the following command:
sudo apt-get install ffmpeg
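
To see quickly which dependencies are missing, a short standard-library script like the one below can help. It is a sketch rather than part of the project: the function name `find_missing_dependencies` is hypothetical, and the module list assumes the import names of the packages in `requirements.txt` (`openai-whisper` is imported as `whisper`).

```python
import importlib.util
import shutil

# Import names assumed for the packages listed in requirements.txt.
REQUIRED_MODULES = ("django", "requests", "whisper", "langdetect")
REQUIRED_EXECUTABLES = ("ffmpeg",)


def find_missing_dependencies(modules=REQUIRED_MODULES,
                              executables=REQUIRED_EXECUTABLES):
    """Return a list describing each module or executable that is not found."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            missing.append(f"python module: {name}")
    for exe in executables:
        # shutil.which returns None when the program is not on PATH.
        if shutil.which(exe) is None:
            missing.append(f"executable: {exe}")
    return missing


if __name__ == "__main__":
    problems = find_missing_dependencies()
    if problems:
        print("Missing dependencies:")
        for item in problems:
            print(f"  - {item}")
    else:
        print("All required dependencies were found.")
```

Run it inside the activated virtual environment so that it checks the same Python packages the server will use.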
Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.
This project is licensed under the MIT License. See the `LICENSE` file for details.