I plan to improve several Emilia pipeline components and reconstruct the Emilia dataset from the raw audio files. The documentation says to "download the raw audio files from the provided URL list":
(...)To reconstruct the Emilia dataset, you can download the raw audio files from the provided URL list and use our open-source Emilia-Pipe preprocessing pipeline to process the raw data and rebuild the dataset. Additionally, users can employ Emilia-Pipe to preprocess their own raw speech data to meet specific needs. (...)
However, downloading or scraping 101,000 hours of audio from sources similar to YouTube is a daunting challenge. Could you please clarify how to obtain direct access to the raw audio data?