Hi there 👋,
Awesome work to everyone involved in producing this model! It's always great to have more open-source speech/audio quality predictors to experiment with!
I saw the issue about onboarding this model to HF and spent some time taking some initial steps. I would like your feedback at this stage before I go any further. I have added some tooling to make things easier on my end, but I am more than happy to roll any of that back if needed.
Additions
- inference.py module with the AudioBoxAesthetics class, which supports saving/loading from Hugging Face + predicting from both WAV and JSONL formats.
- src/audiobox_aesthetics/export_model_to_hf.py.
- src/audiobox_aesthetics/demo.py (currently hosted here).
- uv for dependency lock.

Also, check out the example scripts in examples.

Todo:
Thanks again for your contributions to open-source audio models!
I am more than happy to discuss any changes you would like to see in this PR.
All the best,
Tom 🚀