Failure to create dataset after editing "path" in exported dataset.json file #48

Open
namitha-sh opened this issue Jan 24, 2025 · 9 comments
Assignees
Labels
bug Something isn't working

Comments

@namitha-sh

Hello!

I was trying to create an example dataset along with the dataset.json file and the annotation_project.json file, as shown in the example data. I was able to upload the audio directory into Whombat and then download the JSON file with all the metadata to create a dataset.json file. However, the "path" for each audio file in the JSON file is the full path, including the path to the audio directory:

Image

So that a second person could import this dataset, I manually edited the paths in the dataset.json file to include just the names of the audio files, because when they import the dataset, the path to the audio_dir on their local machine is different:

Image

However, the dataset could not be created from the edited file, even after specifying the correct path to the audio directory when uploading to Whombat.
Would you be able to provide some insight on this issue?
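
For readers following along, the manual edit described above can be sketched in Python. This is only an illustration, not Whombat's own export code: the function name `relativize_paths`, the assumption that each recording entry is a dict with a "path" key, and the example directories are all hypothetical.

```python
from pathlib import PurePosixPath


def relativize_paths(recordings, audio_dir):
    """Return copies of the entries with "path" made relative to audio_dir.

    Assumes (hypothetically) that each entry is a dict with a "path" key.
    Entries whose path is not located under audio_dir are left unchanged.
    """
    base = PurePosixPath(audio_dir)
    result = []
    for rec in recordings:
        path = PurePosixPath(rec["path"])
        try:
            rel = path.relative_to(base)
        except ValueError:
            # The path is not inside audio_dir, so keep it as it was.
            rel = path
        result.append({**rec, "path": str(rel)})
    return result


# Hypothetical entries, similar in spirit to the exported full paths.
recordings = [
    {"path": "/home/user/example_data/audio/rec1.wav"},
    {"path": "/home/user/example_data/audio/site_a/rec2.wav"},
]
fixed = relativize_paths(recordings, "/home/user/example_data/audio")
```

Unlike keeping only the file names, this keeps any subfolder structure below the audio directory, which matters if two files in different subfolders share a name.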

@mbsantiago
Owner

Hi @namitha-sh!

Many thanks for reporting the issue. Ideally, as you mention, full paths should not be exported, so that imports on other machines are easier. I'll have a look into why this is happening and report back soon.

@mbsantiago mbsantiago added the bug Something isn't working label Feb 1, 2025
@mbsantiago mbsantiago self-assigned this Feb 1, 2025
mbsantiago added a commit that referenced this issue Feb 2, 2025
@namitha-sh
Author

Hi @mbsantiago

Thank you so much for looking into the issue. I tried out the version after the merge was made, and it seems the path is still exported relative to the home directory, not the audio directory itself. I guess this works if all the audio files sit directly in the home directory, but not if they are within another audio directory, as in example_data.

Image
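
The difference between the two kinds of relative path can be illustrated with a small sketch. All directory names here are made up for the example, and this is not Whombat's code; it only shows why a path relative to the home directory does not resolve on a machine where the audio directory lives elsewhere.

```python
from pathlib import PurePosixPath

# Hypothetical layout on the exporting machine.
home = PurePosixPath("/home/user")
audio_dir = home / "example_data" / "audio"
recording = audio_dir / "rec1.wav"

# The same file, expressed relative to two different bases.
relative_to_home = recording.relative_to(home)
relative_to_audio = recording.relative_to(audio_dir)

# Hypothetical audio directory on the importing machine.
other_audio_dir = PurePosixPath("/data/shared/audio")

# Joining the audio-relative path gives the expected location.
resolved_ok = other_audio_dir / relative_to_audio
# Joining the home-relative path repeats the intermediate folders.
resolved_wrong = other_audio_dir / relative_to_home
```

Here `resolved_ok` is `/data/shared/audio/rec1.wav`, while `resolved_wrong` is `/data/shared/audio/example_data/audio/rec1.wav`, which would normally not exist on the second machine.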

@mbsantiago
Owner

Hey @namitha-sh! Quick question: how are you running Whombat? Did you download it from PyPI (i.e. via pip), did you use one of the bundled versions in the releases section, or did you run it directly from the source code? I haven't yet created a new release after the merge, so the PyPI and bundled versions are not updated.

@namitha-sh
Author

Hi @mbsantiago
I ran it directly from the source code, not from the bundle or PyPI. The problem also persists in the annotation project JSON file.

@mbsantiago
Owner

OK, thanks for the clarification. Will look into it.

mbsantiago added a commit that referenced this issue Feb 5, 2025
@mbsantiago
Owner

Hi @namitha-sh, I think I found the issue. I've implemented a proposed solution. Let me know if it works for you.

@namitha-sh
Author

Hi @mbsantiago

Thank you so much for making the changes. The paths in dataset.json are working now and only show the path relative to the audio_dir, which is great. But the problem persists in the other downloaded JSON files, such as annotation_project.json and validation.json.

@mbsantiago
Owner

Thanks @namitha-sh. If the dataset export is working the others should be an easy fix. Will implement those soon.

@namitha-sh
Author

Hi @mbsantiago ! Just wondering if you had any update on this. Also, while the paths are being exported correctly to the dataset.json file now, it still has problems when a new machine imports that file to create the dataset from the associated audio folder. It just says "Failed to create dataset". So perhaps Whombat is still not recognising the path?
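
One way to narrow down a failure like this on the importing machine is to check, outside Whombat, whether every relative path in the file resolves to an existing file under the chosen audio directory. The sketch below is only a diagnostic aid under that assumption; `find_missing_recordings` is a hypothetical helper, not part of Whombat, and it does not reproduce Whombat's import logic.

```python
from pathlib import Path


def find_missing_recordings(relative_paths, audio_dir):
    """Return the paths that do not resolve to a file under audio_dir.

    Note: if an entry is an absolute path, joining it with audio_dir
    simply yields that absolute path, so it is checked as-is.
    """
    base = Path(audio_dir)
    return [p for p in relative_paths if not (base / p).is_file()]
```

An empty result suggests the paths themselves line up with the audio folder, so the cause of "Failed to create dataset" would have to be looked for elsewhere; a non-empty result lists the entries that cannot be found.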
