Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restrict symbols in names #43

Merged
merged 7 commits into from
Feb 5, 2025
Merged

Restrict symbols in names #43

merged 7 commits into from
Feb 5, 2025

Conversation

rly
Copy link
Contributor

@rly rly commented Feb 2, 2025

Namespace name

When HDMF caches schema, groups are created based on the namespace name. Therefore the namespace name must not contain "/" for both HDF5 and Zarr backends. They should also not contain ":" for Zarr backends on Windows because ":" is not allowed in a file name. (There are actually many characters that are not allowed in filenames on Windows, but ":" is probably the most likely to be used.) We should consider restricting the other ones. Ref: hdmf-dev/hdmf-zarr#219

The schema already states that namespace names cannot contain whitespace (I don't know why).

Proposal: Additionally disallow "/" and ":" (in addition to whitespace) from namespace names

Version string

Groups are also created based on the version string. The version string should follow the NWB versioning guidelines: https://www.nwb.org/versioning-guidelines/
I do not think we need to be strict about this, but at least, the version string must not contain "/" and should not contain ":".

Proposal: Disallow "/" and ":" whitespace from version strings

Fixed group, dataset, attribute, and link names

For the same reason as above, names of groups and datasets, which may be set in schema or defined by users, must not contain "/" and should not contain ":". This restriction is already set in HDMF. hdmf-dev/hdmf#1202

Fixed names of groups, datasets, attributes, and links (things that could be properties of a class) and default names of groups and datasets set in a schema must follow the pattern "^[A-Za-z_][A-Za-z0-9_]*$" for compatibility with the MatNWB and PyNWB APIs: NeurodataWithoutBorders/matnwb#35

The above link only discusses whitespace for Matlab, but Matlab allows only the above pattern for field names, property names, and variable names. https://www.mathworks.com/help/matlab/ref/struct.html

Python attribute names have the same restriction. (Matlab additionally says the maximum length of a name = 63 characters currently.)

In HDMF, we have already blocked creating fixed names of groups and datasets with "/": hdmf-dev/hdmf#1219

Proposal: Require fixed names for groups, datasets, attributes, and links set in schema to follow the pattern "^[A-Za-z_][A-Za-z0-9_]*$"

Data type names

When generating classes, PyNWB and MatNWB use data_type_def as the class name.
Python class names must follow the pattern "^[A-Za-z_][A-Za-z0-9_]*$" (plus some allowance of non-ASCII characters): https://docs.python.org/3/reference/lexical_analysis.html#identifiers .
Matlab class names follow the same rules: https://www.mathworks.com/help/matlab/ref/classdef.html

Therefore,data_type_def values should follow the same pattern "^[A-Za-z_][A-Za-z0-9_]*$".

Proposal: Require data type names to follow the pattern "^[A-Za-z_][A-Za-z0-9_]*$"

The last two proposals were previously proposed in NeurodataWithoutBorders#1 . I am just reviving that issue here and adding to it.

@rly rly requested a review from bendichter February 3, 2025 19:25
@rly
Copy link
Contributor Author

rly commented Feb 3, 2025

Sorry @oruebel , I forgot to make a few changes in description.rst before creating the PR, so the changes were not consistent with the text above. I made those changes now. Please re-review.

@bendichter
Copy link
Collaborator

Looks great! This will help a lot with language and backend compatibility. Hopefully it won't cause headaches for too many people.

@rly rly merged commit 6f541d8 into main Feb 5, 2025
1 check passed
@rly rly deleted the rly-patch-1 branch February 5, 2025 17:15
rly added a commit that referenced this pull request Feb 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants