av2txtsum

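This repository explores automatic speech recognition (ASR): transcribing audio and video with several open source tools, then summarizing the resulting transcripts with large language models.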

Repository Structure

The repository includes the following .ipynb files:

01_main.ipynb

This notebook outlines the primary goals and objectives of the analysis. It includes instructions on how to download a video, extract audio, convert a text transcript to an SRT transcript, and describes the main tools and libraries used for transcription.
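The transcript-to-SRT step can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the `(start, end, text)` segment layout and the function names `srt_timestamp` and `segments_to_srt` are assumptions made for the example.

```python
from typing import Iterable, Tuple

# Hypothetical segment layout: (start_seconds, end_seconds, text).
Segment = Tuple[float, float, str]


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Segment]) -> str:
    """Build SRT text: a 1-based index, a time range, and the cue text per block."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{time_range}\n{text.strip()}")
    # SRT cues are separated by a blank line.
    return "\n\n".join(blocks) + "\n"
```

Writing the returned string to a `.srt` file with UTF-8 encoding should give a file that most video players accept.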

02_whisper.ipynb

Open source project: whisper.cpp (based on OpenAI Whisper)

This notebook describes the results of using whisper.cpp, which is based on OpenAI Whisper.
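whisper.cpp expects 16 kHz, 16-bit mono WAV input, so audio usually has to be converted first. The sketch below only builds an ffmpeg command for that conversion; the function name `ffmpeg_to_wav_command` and the example path are hypothetical, and the notebook may prepare its audio differently.

```python
from pathlib import Path
from typing import List


def ffmpeg_to_wav_command(source: str, sample_rate: int = 16000) -> List[str]:
    """Return an ffmpeg argument list that converts `source` to mono 16-bit PCM WAV."""
    # The output file sits next to the input, with the extension swapped to .wav.
    target = Path(source).with_suffix(".wav").as_posix()
    return [
        "ffmpeg", "-y",           # overwrite the output file if it exists
        "-i", source,             # input audio or video file
        "-vn",                    # drop any video stream
        "-ar", str(sample_rate),  # resample to the requested rate
        "-ac", "1",               # downmix to mono
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM
        target,
    ]
```

With ffmpeg installed, the list can be passed to `subprocess.run(command, check=True)` to perform the conversion.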

03_seamlessm4t.ipynb

Open source project: SeamlessM4T

This notebook describes the results of using SeamlessM4T.

04_faster_whisper.ipynb

Open source project: faster-whisper (based on OpenAI Whisper)

This notebook describes the results of using faster-whisper, which is based on OpenAI Whisper.

05_llama_on_groq.ipynb

This notebook describes how to use Llama on Groq to summarize a transcript.

06_gpt_summaries.ipynb

This notebook describes how to generate summaries with different GPT models through the API and compares the results.

Additionally, there are a few folders:

The data folder contains the transcripts and summaries. MP3, WAV, and MP4 files are excluded because of their size, but they can be recreated as described in the .ipynb files.
The utils folder contains several Python files that are kept out of the .ipynb files to avoid overloading the notebooks with code. The .ipynb files link to these Python files.

lexust1/av2txtsum
