High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.

natlamir/MeloTTS-Windows

Install Steps on Windows

  1. Clone the repository
git clone https://github.com/natlamir/MeloTTS-Windows.git
cd MeloTTS-Windows
  2. Create the conda environment and install dependencies
conda env create -f environment.yml
conda activate melotts-win
pip install -e .
python -m unidic download

If you have trouble with the python -m unidic download step, you can try this:

  3. Install PyTorch
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
  4. Run using:
melo-ui
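
Before launching the UI, it can help to confirm that the install steps actually took effect in the active environment. The following is a minimal sketch, not part of the repository: the helper names check_install and cuda_status are hypothetical, and the import names (torch, torchaudio, melo, unidic) are assumed from the commands above.

```python
import importlib.util


def check_install(modules=("torch", "torchaudio", "melo", "unidic")):
    """Report which of the expected packages can be found in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


def cuda_status():
    """Return True/False for CUDA availability, or None if torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the script still runs without torch

    return torch.cuda.is_available()


if __name__ == "__main__":
    for name, found in check_install().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
    print("CUDA available:", cuda_status())
```

Run it from the conda prompt with the melotts-win environment activated. If CUDA is reported as unavailable, the cu124 wheel may not match the installed GPU driver.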

Local Training on Windows

  1. In the melo/data/example folder, delete the example metadata.list file.
  2. If you need to convert mp3 to wav, create a folder called mp3s in the example folder and copy all your mp3 files into the mp3s folder
  3. In a conda prompt with the environment activated and the melo folder as the working directory, run ConvertMp3toWav.bat. This creates a data/example/wavs folder containing the converted wav files.
  4. Create a transcript file by running python transcript.py, which creates a data/example/metadata.list file.
  5. Run python preprocess_text.py --metadata data/example/metadata.list to create train.list, config.json, and other files in the data/example folder.
  6. Modify config.json to change the batch size, epochs, learning rate, etc.
  7. From the conda prompt run train.bat to start the training.
  8. Checkpoints and other logging information will be written to the data/example/config folder.
  9. To test a checkpoint, run: python infer.py --text "this is a test" -m "C:\ai\MeloTTS-Windows\melo\data\example\config\G_0.pth" -o output, adjusting the path to match where you cloned the repository and replacing G_0 with the checkpoint you want to test (G_1000, G_2000, etc.).
  10. To use a checkpoint from the UI, create a melo/custom folder, copy the .pth file and config.json from data/example/config into it, rename the .pth file to a user-friendly name, and launch the UI. The checkpoint will appear in the custom voice dropdown.
  11. To view TensorBoard, install TensorFlow: pip install tensorflow
  12. Run tensorboard --logdir=data\example\config
  13. This prints the local URL where you can view TensorBoard.
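
Steps 6, 9, and 10 above are manual file operations that can be scripted. The following is a rough sketch under stated assumptions rather than anything shipped with the repository: the helper names are hypothetical, the G_<step>.pth naming is taken from step 9, and the "train" section with batch_size, epochs, and learning_rate keys is assumed from the VITS-style config.json, so check your generated file before relying on it.

```python
import json
import re
import shutil
from pathlib import Path

CKPT_PATTERN = re.compile(r"^G_(\d+)\.pth$")


def update_train_config(config_path, **overrides):
    """Set keys under the assumed "train" section of config.json and save it."""
    config_path = Path(config_path)
    config = json.loads(config_path.read_text(encoding="utf-8"))
    config.setdefault("train", {}).update(overrides)
    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return config["train"]


def list_checkpoints(run_dir):
    """Return (step, path) pairs for G_<step>.pth files, sorted by step."""
    found = []
    for path in Path(run_dir).glob("G_*.pth"):
        match = CKPT_PATTERN.match(path.name)
        if match:
            found.append((int(match.group(1)), path))
    return sorted(found)


def export_checkpoint(run_dir, custom_dir, step, voice_name):
    """Copy G_<step>.pth and config.json into custom_dir under a friendly name."""
    run_dir, custom_dir = Path(run_dir), Path(custom_dir)
    checkpoint = run_dir / f"G_{step}.pth"
    config = run_dir / "config.json"
    for required in (checkpoint, config):
        if not required.is_file():
            raise FileNotFoundError(required)
    custom_dir.mkdir(parents=True, exist_ok=True)
    target = custom_dir / f"{voice_name}.pth"
    shutil.copy2(checkpoint, target)
    shutil.copy2(config, custom_dir / "config.json")
    return target


if __name__ == "__main__":
    # Paths are relative to the melo folder, as in the steps above.
    run = Path("data/example/config")
    checkpoints = list_checkpoints(run)
    if checkpoints:
        latest_step, _ = checkpoints[-1]
        print(export_checkpoint(run, "custom", latest_step, "my-voice"))
    else:
        print("no checkpoints found in", run)
```

For example, update_train_config("data/example/config.json", batch_size=4) would lower the batch size before training. The export helper places a single config.json next to the renamed checkpoint; how the UI pairs configs with several custom voices is not covered here.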

Original Readme:

 

Introduction

MeloTTS is a high-quality multi-lingual text-to-speech library by MIT and MyShell.ai. Supported languages include:

Language Example
English (American) Link
English (British) Link
English (Indian) Link
English (Australian) Link
English (Default) Link
Spanish Link
French Link
Chinese (mix EN) Link
Japanese Link
Korean Link

Some other features include:

  • The Chinese speaker supports mixed Chinese and English.
  • Fast enough for CPU real-time inference.
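
"Real-time" here is usually judged by the real-time factor (RTF): synthesis time divided by the duration of the audio produced, where a value below 1 means speech is generated faster than it plays back. The sketch below shows one way to measure it. The real_time_factor helper is hypothetical and times any callable, so the stand-in workload would need to be replaced with a function that runs the model on CPU.

```python
import time


def real_time_factor(synthesize, audio_seconds):
    """Time one call to synthesize() and divide by the audio length produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    start = time.perf_counter()
    synthesize()
    elapsed = time.perf_counter() - start
    return elapsed / audio_seconds


if __name__ == "__main__":
    # Stand-in workload; replace with a function that runs the TTS model on CPU.
    rtf = real_time_factor(lambda: time.sleep(0.05), audio_seconds=1.0)
    label = "real-time" if rtf < 1 else "slower than real-time"
    print(f"RTF = {rtf:.2f} ({label})")
```

A single timing is noisy, and the first call often includes model loading, so averaging several runs after a warm-up call gives a fairer number.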

Usage

The Python API and model cards can be found in this repo or on HuggingFace.

Join the Community

Discord

Join our Discord community and select the Developer role upon joining to gain exclusive access to our developer-only channel! Don't miss out on valuable discussions and collaboration opportunities.

Contributing

If you find this work useful, please consider contributing to this repo.

  • Many thanks to @fakerybakery for adding the Web UI and CLI part.

Authors

Citation

@software{zhao2024melo,
  author = {Zhao, Wenliang and Yu, Xumin and Qin, Zengyi},
  title = {MeloTTS: High-quality Multi-lingual Multi-accent Text-to-Speech},
  url = {https://github.com/myshell-ai/MeloTTS},
  year = {2023}
}

License

This library is released under the MIT License, which means it is free for both commercial and non-commercial use.

Acknowledgements

This implementation is based on TTS, VITS, VITS2 and Bert-VITS2. We appreciate their awesome work.

