English | 中文
- Video-to-video translation
- Voice cloning
- Lip synchronization
- Subtitle generation
- Supports Chinese and English

You can see a demo at res.mp4.
- Speech Recognition: faster-whisper
- Text Translation:
  - facebook/m2m
  - translators [default]
- Voice Clone: XTTS (OpenVoice optional)
- Lip Synchronization: video-retalking
- Face Restore: GFPGAN
- Video Combine: pyvideotrans
- Install ffmpeg and add it to your PATH environment variable, or simply put the executables in the `ffmpeg` directory. The directory structure should be as follows:

```
ffmpeg
|- ffmpeg.exe
|- ffprobe.exe
...
```
- Install dependencies

```shell
git clone git@github.com:halfzm/v2vt.git
cd v2vt
conda create -n v2vt_clone python=3.11.0
conda activate v2vt_clone
pip install -r requirements.txt
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```
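After installing, you can confirm that the CUDA 11.8 build of torch is active; the `try`/`except` keeps the check harmless if torch is not installed yet:

```python
try:
    import torch
    # Expect a +cu118 version string and, on a machine with an NVIDIA GPU,
    # CUDA reported as available.
    info = (torch.__version__, torch.cuda.is_available())
except ImportError:
    info = None  # torch not installed yet
print(info)
```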
- [Optional] Put the pretrained models needed by openvoice in the `openvoice_checkpoints` directory, and the directory structure should be as follows:

```
openvoice_checkpoints
|- base_speakers
   |- ...
|- converter
   |- ...
...
```
- Put the pretrained models needed by tts in the `tts_models` directory, and the directory structure should be as follows:

```
tts_models
|- tts_models--multilingual--multi-dataset--xtts_v2
   |- config.json
   |- model.pth
   |- speakers_xtts.pth
   |- vocab.json
   |- ...
...
```
- Put the pretrained models needed by video_retalking in the `video-retalking/checkpoints` directory, and the directory structure should be as follows:

```
video-retalking
|- checkpoints
   |- ...
...
```
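The three checkpoint layouts above can be verified with a small sketch (directory names are taken from this README; the helper itself is hypothetical):

```python
from pathlib import Path

# Directories this README expects; openvoice_checkpoints is optional.
EXPECTED_DIRS = [
    "openvoice_checkpoints/base_speakers",
    "openvoice_checkpoints/converter",
    "tts_models/tts_models--multilingual--multi-dataset--xtts_v2",
    "video-retalking/checkpoints",
]

def missing_dirs(root: str = ".") -> list[str]:
    """Return the expected checkpoint directories that are absent under root."""
    return [d for d in EXPECTED_DIRS if not (Path(root) / d).is_dir()]
```

Run `missing_dirs()` from the repository root; an empty list means every expected directory is in place.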
- Quick start

```shell
python app.py
```
- Input file

  The input video should not be too short, otherwise an error will be reported during voice cloning (preferably no shorter than 5 seconds).

- Output file

  By default, you can find the output file in the current working directory, named `output.mp4`; it can also be downloaded directly from the webui.
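To enforce the minimum input length before running the pipeline, one could probe the clip with ffprobe (a hypothetical helper, assuming ffprobe is on PATH; it is not part of this project):

```python
import subprocess

MIN_SECONDS = 5.0  # the README's suggested minimum for reliable voice cloning

def video_duration(path: str) -> float:
    """Return a video's duration in seconds using ffprobe."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True, check=True,
    )
    return float(out.stdout.strip())

def long_enough(path: str) -> bool:
    return video_duration(path) >= MIN_SECONDS
```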
About the license and code of conduct, we follow the video-retalking project. You can see the details in LICENSE and CODE_OF_CONDUCT.