A small wrapper package around whisper-timestamped. Create force-aligned transcription TextGrids from raw audio!
Updated Sep 10, 2024 - Python
Solution for Zalo AI Challenge 3 - Lyrics Alignment
Perform forced alignment on Mandarin data using the aidatatang pretrained model at https://kaldi-asr.org/models/m10
Given forced alignment results, we obtain words with their respective durations from concatenated phones.
Takes audio (mp3) and text input (string) and force-aligns the text to the audio. Uses stable-ts and whisperx.
An automated workflow that generates timestamped subtitles from a video file with custom control using regex, Java and multiple online tools.
Public version of my Computer-Aided Pronunciation Training (CAPT) system (server)
Extract phone-level alignment and phonemic transcript from kaldi ali.*.gz files
An AI music site that automatically generates a karaoke player
Use a Kaldi pretrained nnet3 model to align individual sentences and get phone-level transcripts
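
Several of the projects above derive word timings from phone-level alignments. As a rough illustration of that idea (not the code of any listed repository), the sketch below groups consecutive phones into words and sums their durations. The function name `words_from_phones`, the `(label, start, duration)` tuple layout, and the per-word phone counts are all assumptions made for this example; a real pipeline would take them from the aligner's output and a pronunciation lexicon.

```python
from typing import List, Tuple

# Hypothetical layouts, assumed for this sketch only.
Phone = Tuple[str, float, float]       # (label, start_sec, duration_sec)
WordSpan = Tuple[str, float, float]    # (word, start_sec, duration_sec)


def words_from_phones(phones: List[Phone],
                      words: List[Tuple[str, int]]) -> List[WordSpan]:
    """Concatenate consecutive phones into words.

    `words` lists each word with the number of phones in its
    pronunciation, in the same order as the aligned phone sequence.
    """
    spans: List[WordSpan] = []
    i = 0
    for word, n_phones in words:
        group = phones[i:i + n_phones]
        if n_phones <= 0 or len(group) != n_phones:
            raise ValueError(f"not enough phones left for word {word!r}")
        start = group[0][1]
        # A word ends where its last phone ends.
        end = group[-1][1] + group[-1][2]
        spans.append((word, start, end - start))
        i += n_phones
    if i != len(phones):
        raise ValueError("unconsumed phones remain after the last word")
    return spans


# Toy input: five phones covering the two words "hi there".
example_phones = [
    ("HH", 0.0, 0.25), ("AY", 0.25, 0.25),
    ("DH", 0.5, 0.125), ("EH", 0.625, 0.125), ("R", 0.75, 0.25),
]
example_words = [("hi", 2), ("there", 3)]
example_spans = words_from_phones(example_phones, example_words)
```

Silence or noise phones between words would need to be filtered out or given their own entries before calling a function like this.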