Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DCJ-32] Load tags are locked and unlocked at the dataset level #1785

Merged
merged 7 commits into from
Aug 23, 2024

Conversation

okotsopoulos
Copy link
Contributor

@okotsopoulos okotsopoulos commented Aug 20, 2024

Jira ticket: https://broadworkbench.atlassian.net/browse/DCJ-32

Addresses

When files are being ingested to a dataset (via standalone file ingest or as part of a combined metadata-and-file ingest), TDR attempts to take out a lock on a load tag, either provided by the user or generated with the form load-at-<timestamp>. (This is in addition to the dataset-level lock taken out during the duration of the ingest.)

A user may elect to provide and reuse their own load tag to logically link file loads together, but TDR will not allow a load tag to be used by multiple running file ingests: it confuses the algorithm for re-running a load with a load tag and skipping already-loaded files.

Summary of changes

Let flight1 have an active lock on loadTag while ingesting files to a dataset.

flight2 attempts to lock loadTag Previous Behavior New Behavior
…while ingesting files to the same dataset
…while ingesting files to a different dataset

Note that this PR is necessary for us to expand our unlockDataset endpoint to also clear any stuck locks on load tags. We recently had to modify the database directly to clear such stuck locks: https://broadworkbench.atlassian.net/browse/DCJ-625

Testing Strategy

Expanded unit tests (with drive-bys for JUnit 5 conversion, downgrading a Spring Boot test with embedded database to a pure unit test with mocking).

Let flight1 have an active lock on loadTagA while ingesting files to datasetX.

Previously, if flight2 attempted to lock loadTagA while ingesting files to datasetY (a different dataset), the attempt would fail.  Now, such an attempt would succeed.

Unchanged: if flight2 attempted to lock loadTagA while ingesting files to datasetX (the same dataset), the attempt will fail.
@okotsopoulos okotsopoulos marked this pull request as ready for review August 21, 2024 14:47
@okotsopoulos okotsopoulos requested review from a team as code owners August 21, 2024 14:47
@okotsopoulos okotsopoulos requested review from rushtong and fboulnois and removed request for a team August 21, 2024 14:47
Copy link
Member

@pshapiro4broad pshapiro4broad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many comments, but mostly minor things

  • I would like to see the DAO test cleaned up a bit since I think teardown and some prep are not needed with the embedded database
  • If Load.locked is unused, it would be nice to remove it

Copy link
Contributor

@fboulnois fboulnois left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good 👍

Copy link
Contributor

@rushtong rushtong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thank you 👍🏽

Notably, a shift in my understanding of "who's locking what":
Flights lock a "load lock key" comprised of a load tag and a dataset ID.
In other words, concurrent load operations (e.g. file ingests) to a dataset are not allowed to use the same load tag.

In an effort to further reduce confusion, the `load` table is now named `load_lock` and has had its `locked` column remove.  The presence of a row in the table denotes that a lock is active.
Copy link

@okotsopoulos okotsopoulos merged commit 5935975 into develop Aug 23, 2024
14 checks passed
@okotsopoulos okotsopoulos deleted the okotsopo-DCJ-32-load-tag-scoped-to-dataset branch August 23, 2024 20:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants