Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Several improvements to wav2vec2aligner system: early errors, testing, py 3.8 compatibility, etc #14

Merged
merged 8 commits into from
Nov 22, 2024

Conversation

joanise
Copy link
Member

@joanise joanise commented Nov 19, 2024

PR Goal?

This started just fixing EveryVoiceTTS/EveryVoice#326 but then I saw there was no unit testing, so it went on to include several related things needed to test my fix for this issue:

  • unit testing for this module's code
  • enabling that unit testing in CI
  • fix the python 3.8 incompatibilities I noticed in the process
  • add missing dependencies
  • remove the broken main guard block and replace it by a valid __main__.py
  • fix deprecation warnings (spliced into refactor: change deprecated autocompletion to shell_complete #15)

Fixes?

Fixes EveryVoiceTTS/EveryVoice#326

Feedback sought?

Regular code review.

Please look at the individual commits if you want the reason for each separate change.

Priority?

beta

Tests added?

Yes!

How to test?

For 326:

  • ctc-segmenter align /dev/null /dev/null (or everyvoice segment align /dev/null /dev/null) will now give a friendly error message instead of an exception and stack dump.

For the tests:

  • python -m unittest discover aligner.tests will run the new unit tests (you need to do pip install -e . in this module first).

Confidence?

high

Version change?

no

Related PRs?

EveryVoiceTTS/EveryVoice#590

Copy link

semanticdiff-com bot commented Nov 19, 2024

Review changes with  SemanticDiff

Changed Files
File Status
  aligner/heavy.py  37% smaller
  aligner/cli.py  15% smaller
  aligner/utils.py  6% smaller
  .coveragerc Unsupported file format
  .github/workflows/test.yml Unsupported file format
  .gitignore Unsupported file format
  aligner/__main__.py  0% smaller
  aligner/tests/__init__.py  0% smaller
  aligner/tests/test_cli.py  0% smaller
  requirements.txt Unsupported file format

@joanise joanise marked this pull request as draft November 19, 2024 22:08
Copy link

github-actions bot commented Nov 19, 2024

CLI load time: 0:00.16
Pull Request HEAD: 0f41adcf3180a7f7b5f5ebf076e22d1e5ab697df
Imports that take more than 0.1 s:
import time: self [us] | cumulative | imported package
import time:      1145 |     107454 |     typer.main
import time:       303 |     128664 |   typer
import time:       405 |     132636 | aligner.cli

python aligner/cli.py didn't work anyway, because of relative imports.
Now "python -m aligner" invokes the app, the same as "ctc-segmenter"
@joanise joanise force-pushed the dev.ej/326 branch 5 times, most recently from a3dfdba to e09317c Compare November 20, 2024 20:51
@joanise joanise changed the base branch from main to dev.ej/shell-c November 20, 2024 20:53
While I'm here, sort requirements.txt

The soundfile requirement is indirect: when you run align in CI, if soundfile
is not installed CI fails, but when you run interactively on Linux, this
problem does not occur.
Copy link

codecov bot commented Nov 21, 2024

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

Thanks for integrating Codecov - We've got you covered ☂️

Copy link
Collaborator

@SamuelLarkin SamuelLarkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved but I have a question about 7.

@joanise joanise merged commit e82989f into main Nov 22, 2024
6 checks passed
@joanise joanise deleted the dev.ej/326 branch November 22, 2024 21:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

everyvoice segment does not cleanly handle an empty text file
2 participants