Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add multivalued string example to changesheets documentation #848

Merged
merged 3 commits into from
Jan 6, 2025

Conversation

aclum
Copy link
Contributor

@aclum aclum commented Dec 17, 2024

In this branch, I add an example for multivalued string to the changesheets documenation

Details

...

Related issue(s)

...

Related subsystem(s)

  • Runtime API (except the Minter)
  • Minter
  • Dagster
  • Project documentation (in the docs directory)
  • Translators (metadata ingest pipelines)
  • MongoDB migrations
  • Other

Testing

  • I tested these changes (explain below)
  • I did not test these changes

I tested these changes by using changesheets:validate

Documentation

  • I have not checked for relevant documentation yet (e.g. in the docs directory)
  • I have updated all relevant documentation so it will remain accurate
  • Other (explain below)

Maintainability

  • Every Python function I defined includes a docstring (test functions are exempt from this)
  • Every Python function parameter I introduced includes a type hint (e.g. study_id: str)
  • All "to do" or "fix me" Python comments I added begin with either # TODO or # FIXME
  • I used black to format all the Python files I created/modified
  • The PR title is in the imperative mood (e.g. "Do X") and not the declarative mood (e.g. "Does X" or "Did X")

@eecavanna
Copy link
Collaborator

The tests/test_api/test_metadata.py::test_load_changesheet test is failing on this branch. Here's part of the error message associated with the failure:

                if collection_name is None:
>                   raise Exception("Cannot find ID", group_id, "in any collection")
E                   Exception: ('Cannot find ID', 'nmdc:sty-11-fkbnah04', 'in any collection')

The diff on this branch shows a new line in the changesheet. The line targets a document having an ID of nmdc:sty-11-fkbnah04. The error message says no document having that ID exists in any collections (schema-managed collections, I assume).

As a reminder, the test is run against a test database, not our production or development database.

@eecavanna
Copy link
Collaborator

I'd recommend updating the changesheet line to target the nmdc:sty-11-pzmd0x14 study instead, since the preceding lines of the changesheet imply to me that that study does exists in the test database at the time this test is run.

bmeluch
bmeluch previously approved these changes Dec 18, 2024
Copy link

@bmeluch bmeluch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't speak to the error Eric mentioned but the added example line looks good to me, thank you!

@eecavanna
Copy link
Collaborator

From the "Authoring Changesheets" document:

Changesheets is a mechanism to update records (JSON documents) that have already been ingested into MongoDB.

I think the test was failing because the ID that was first added to the changesheet did not belong to an existing Mongo document (in the test database). In commit 9907a5b, I updated that ID in the changesheet so that it belonged to a document that does exist in the test database—which resolved the issue.

Thanks, @bmeluch, for reviewing this (pre-ID-edit)! @aclum, I approve of this being merged in as is (I'll submit an approval now).

@aclum aclum merged commit 41527d7 into main Jan 6, 2025
1 check passed
@aclum aclum deleted the multivalued-changesheets-example branch January 6, 2025 21:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants