
Update .gitignore to exclude virtual environment directories and enhance documentation on adding datasets with h5py #2032

Merged: 7 commits from docs-adding-dataset merged into dev on Feb 18, 2025

Conversation

@bendichter (Contributor) commented Feb 6, 2025

Motivation

Added a new section to the editing tutorial that demonstrates how to add custom datasets using h5py. This enhances the documentation by showing users how to add new datasets to existing groups in NWB files when the standard PyNWB API doesn't provide direct methods for doing so. The example specifically shows adding a genotype dataset to the Subject object, which is a common use case for neuroscience data management. This came up recently in the following slack message: https://nwb-users.slack.com/archives/C5XKC14L9/p1738791649800719

I don't think there is a way to do this directly with pynwb. Is that right, @rly?

Adrian Duszkiewicz
Hi all 🙂 I’m wondering about the possible solutions to the problem of blinding to some of the ‘subject’ information in the NWB file.
We are moving our ephys processing pipeline to the NWB format and in our case, the experimenter is blind to the genotype of the animal during pre-processing and initial data analysis. In an ideal situation this info would not be accessible to them in the NWB files they are working with and would only be added at a later stage. However, as I understand, the subject info can be only added when creating the NWB file and cannot be edited later. Is there anything obvious I’m missing or is recreating the whole NWB file from scratch after unblinding the only solution to this issue?
Thank you in advance for all the tips! (edited)
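The h5py approach this PR documents can be sketched as follows. This is a minimal, self-contained sketch: a bare HDF5 file built from scratch stands in for a real NWB file's /general/subject group, and the file name and genotype string are hypothetical values.

```python
import h5py

# Stand-in for an existing NWB file: create a bare HDF5 file containing the
# /general/subject group where Subject metadata lives in the NWB layout.
with h5py.File("sub.nwb", "w") as f:
    f.create_group("/general/subject")

# The technique itself: open the file for read/write and add a new dataset
# to the existing group directly with h5py.
with h5py.File("sub.nwb", "r+") as f:
    f["/general/subject"].create_dataset("genotype", data="Sst-IRES-Cre")  # hypothetical value

# Read it back to confirm the dataset was added.
with h5py.File("sub.nwb", "r") as f:
    genotype = f["/general/subject/genotype"][()].decode()
print(genotype)  # prints "Sst-IRES-Cre"
```

Note that edits made this way bypass PyNWB's validation, so anything added must still match what the NWB schema expects for that group.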


codecov bot commented Feb 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.78%. Comparing base (5a72510) to head (f2037de).
Report is 1 commit behind head on dev.

Additional details and impacted files
@@           Coverage Diff           @@
##              dev    #2032   +/-   ##
=======================================
  Coverage   91.78%   91.78%           
=======================================
  Files          27       27           
  Lines        2738     2738           
  Branches      709      709           
=======================================
  Hits         2513     2513           
  Misses        149      149           
  Partials       76       76           
Flag         Coverage Δ
integration  73.15% <ø> (ø)
unit         82.39% <ø> (ø)

Flags with carried forward coverage won't be shown.

@rly (Contributor) commented Feb 7, 2025

I don't think there is a way to do this directly with pynwb. Is that right, @rly?

If the dataset has not been created yet, it can be appended using pynwb, but subject.set_modified() has to be called. (If it has already been written and the replacement has the same shape, it can be replaced, but only using h5py: because it is a scalar dataset, pynwb does not provide direct access to the underlying h5py.Dataset object.)

import pynwb
from pynwb.testing.mock.file import mock_NWBFile

# Write a file whose Subject has no genotype yet
nwb = mock_NWBFile()
nwb.subject = pynwb.file.Subject(subject_id="test")
io = pynwb.NWBHDF5IO("test.nwb", "w")
io.write(nwb)
io.close()

# Reopen in append mode, set the optional field, and mark the container modified
io = pynwb.NWBHDF5IO("test.nwb", "a")
nwb = io.read()
nwb.subject.genotype = "test"
nwb.subject.set_modified()
io.write(nwb)
io.close()

# Verify the new dataset round-trips
io = pynwb.NWBHDF5IO("test.nwb", "r")
nwb = io.read()
print(nwb.subject.genotype)  # returns "test"

set_modified should really be called any time a field setter is successfully executed, but there might be some strange edge cases. I'll look into that in hdmf-dev/hdmf#1244
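The parenthetical case above, replacing an already-written scalar dataset with h5py, can be sketched like so. The file is built from scratch here to keep the sketch self-contained, and the file name and genotype values are hypothetical stand-ins.

```python
import h5py

# Stand-in for a file that already has a written scalar genotype dataset.
with h5py.File("edit.nwb", "w") as f:
    f.create_group("/general/subject").create_dataset("genotype", data="WT")

# A scalar string dataset can be overwritten in place with h5py as long as
# the shape is unchanged; pynwb does not expose the h5py.Dataset here.
with h5py.File("edit.nwb", "r+") as f:
    f["/general/subject/genotype"][()] = "Sst-IRES-Cre"  # same-shape overwrite

# Read it back to confirm the replacement.
with h5py.File("edit.nwb", "r") as f:
    new_genotype = f["/general/subject/genotype"][()].decode()
print(new_genotype)  # prints "Sst-IRES-Cre"
```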

@bendichter (Contributor, Author)

OK, we should definitely add this to the editing tutorial. Can optional Attributes and optional non-scalar Datasets also be added in this way?

@bendichter (Contributor, Author)

@rly this is fixed now

@bendichter bendichter requested a review from rly February 18, 2025 03:10
@rly (Contributor) commented Feb 18, 2025

Thank you @bendichter

Can optional Attributes and optional non-scalar Datasets also be added in this way?
Attributes - no
Datasets - yes

@bendichter (Contributor, Author)

@rly can you please approve?

@rly rly enabled auto-merge (squash) February 18, 2025 20:32
@rly rly merged commit f3b3306 into dev Feb 18, 2025
25 checks passed
@rly rly deleted the docs-adding-dataset branch February 18, 2025 20:36