commit 2481417 (1 parent: 80a0e3b)
Showing 6 changed files with 85 additions and 2 deletions.
@@ -0,0 +1,24 @@
# Longform Narration

SimpleTTS supports longform narration as an **experimental feature**, so it may not always work as expected and quality may vary. Longform narration is powered by [`txtsplit`](https://github.com/fakerybakery/txtsplit).

## Example

Here is an example of how to use the `longform` method with Kokoro:

```python
from simpletts.models.kokoro import Kokoro
import soundfile as sf

# Initialize Kokoro model
tts = Kokoro(device="auto")

# Synthesize speech
text = """
Enter your longform text here...
"""
audio, sr = tts.longform(text, ref="af")

# Save output audio
sf.write("output.wav", audio, sr)
```
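
The documentation above says only that `longform` is powered by `txtsplit`; this commit does not show the method body. As a rough sketch of the general split, synthesize, and concatenate idea (not SimpleTTS's actual implementation), the following uses a simple regex-based sentence packer in place of `txtsplit` and a hypothetical `fake_synthesize` stub in place of a real model. All function names and the `max_chars` parameter are illustrative assumptions.

```python
import re
from typing import Callable, List, Tuple

Samples = List[float]


def split_into_chunks(text: str, max_chars: int = 200) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    Simplified stand-in for txtsplit, for illustration only. A single
    sentence longer than max_chars is kept whole rather than cut.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def narrate_longform(
    text: str,
    synthesize: Callable[[str], Tuple[Samples, int]],
    max_chars: int = 200,
) -> Tuple[Samples, int]:
    """Synthesize each chunk separately and concatenate the samples."""
    audio: Samples = []
    sample_rate = 0  # stays 0 if the text contains no chunks
    for chunk in split_into_chunks(text, max_chars):
        samples, sample_rate = synthesize(chunk)
        audio.extend(samples)
    return audio, sample_rate


def fake_synthesize(chunk: str) -> Tuple[Samples, int]:
    """Hypothetical model stub: one silent sample per character.

    The sample rate of 24000 is an arbitrary placeholder value.
    """
    return [0.0] * len(chunk), 24000
```

In practice, `synthesize` would wrap a real model call, and a production splitter would also need to handle sentences longer than the limit and abbreviations that contain periods.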
Binary file not shown.
@@ -0,0 +1,24 @@
from simpletts.models.kokoro import Kokoro
import soundfile as sf

# Initialize Kokoro model
tts = Kokoro(device="auto")

# Synthesize speech
text = """
Text-to-speech technology has come a long way in recent years, with many powerful models now available to developers. However, the fragmented ecosystem of TTS libraries poses significant challenges. Each model typically comes with its own unique API, dependencies, and setup requirements, making it difficult for developers to experiment with different models or switch between them as needed.
This is where a unified TTS library becomes invaluable. By providing a consistent interface across multiple models, it dramatically simplifies the development process. Developers can focus on their applications rather than wrestling with different APIs and dependencies for each model they want to try.
A unified library also promotes better code maintainability and portability. When your application's TTS functionality is abstracted behind a common interface, switching models becomes as simple as changing a single line of code. This flexibility is especially important as the field of TTS continues to evolve rapidly, with new and improved models being released regularly.
Additionally, a unified library can handle common tasks like text preprocessing, audio post-processing, and long-form text synthesis consistently across all models. This reduces duplication of effort and helps ensure consistent behavior regardless of the underlying model being used.
For organizations, having a unified TTS library means reduced training time for developers, simplified maintenance, and the ability to easily benchmark different models against each other. It also makes it easier to swap out models based on specific needs - whether that's quality, speed, licensing requirements, or language support.
In conclusion, as TTS technology becomes increasingly important in modern applications, having a unified library isn't just convenient - it's becoming essential for efficient development and maintenance of TTS-enabled applications.
"""
audio, sr = tts.longform(text, ref="af")

# Save output audio
sf.write("output.wav", audio, sr)
@@ -31,6 +31,7 @@
"tqdm",
"openphonemizer",
"click",
"txtsplit",
],
extras_require={
"xtts": [