-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DC-1120]Update Snapshot Builder Settings to include Dataset Table Name and Relationships #1732
Conversation
public final class Observation { | ||
private Observation() {} | ||
|
||
public static final String TABLE_NAME = "observation"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that it really matters because it is test data but I noticed there is no observation occurrence table, just observation table in OMOP so I renamed this class
.table(new SnapshotBuilderTable().datasetTableName(Observation.TABLE_NAME)), | ||
new SnapshotBuilderDatasetConceptSet() | ||
.name("Measurement") | ||
.table(new SnapshotBuilderTable().datasetTableName(Measurement.TABLE_NAME)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comment: I could make a constant like I did for Measurement and use the table name like shown here, or just use the string table names like shown below (because they are only used once), or I could just define constants in this class
@@ -688,7 +689,7 @@ public SnapshotAccessRequestResponse getSnapshotAccessRequestById(UUID accessReq | |||
} | |||
|
|||
public AssetSpecification buildAssetFromSnapshotAccessRequest( | |||
Dataset dataset, SnapshotAccessRequestResponse snapshotRequestModel) { | |||
Dataset dataset, SnapshotAccessRequestResponse snapshotAccessRequest) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be too much to rename the type as well, in the openapi spec? My thinking here is that the response from the API happens to be a SnapshotAccessRequest object, but not all SnapshotAccessRequest objects are actually a response from the API. Since we use the "response" object in non-response situations, it doesn't make sense for response to be part of the type name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, maybe there is a better name. We have a SnapshotAccessRequest object that represents the request submitted from the user, not something retrieved from the database, rather the input to the database. This is a subset of the data in the SnapshotAccessRequestResponse object which is always the object retrieved from the database, so in a way it kind of is a response. If we want to keep these types distinct, we could leave it as is or some up with a more precise name. If we don't want these types to be distinct, we could make the fields that don't exist on the SnapshotAccessRequest optional on the SnapshotAccessRequestResponse object and use that to represent both. But, I think I might prefer leaving them distinct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree to keep the types separate. I do think we may want to separate the database type from the response type at some point; there may be some data that isn't in the database that we'd want to return, or vice versa. It also makes things confusing in that "request" sometimes means an API request and sometimes means an access request (which itself is part of an API request).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could swap it to SnapshotAccessRequestRequest
and SnapshotAccessRequest
? Because technically it is making a request for a SnapshotAccessRequest. But I also think it is okay to leave it as is, at least for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks OK to me.
@@ -688,7 +689,7 @@ public SnapshotAccessRequestResponse getSnapshotAccessRequestById(UUID accessReq | |||
} | |||
|
|||
public AssetSpecification buildAssetFromSnapshotAccessRequest( | |||
Dataset dataset, SnapshotAccessRequestResponse snapshotRequestModel) { | |||
Dataset dataset, SnapshotAccessRequestResponse snapshotAccessRequest) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree to keep the types separate. I do think we may want to separate the database type from the response type at some point; there may be some data that isn't in the database that we'd want to return, or vice versa. It also makes things confusing in that "request" sometimes means an API request and sometimes means an access request (which itself is part of an API request).
"primaryTableRelationship": "fpk_person_drug", | ||
"secondaryTableRelationships": [ | ||
"fpk_drug_type_concept", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should be leveraging TDR's relationship objects here instead of storing fpk names. (So instead of storing the primary table fpk, we'd store the field of the table that contains the value which has a fpk relationship to the primary table.) But that change can be done later, since this PR is about changing where this data is stored, not how it's represented.
# Conflicts: # src/test/java/bio/terra/service/snapshot/SnapshotServiceTest.java
tables.forEach( | ||
table -> { | ||
assetModel.addTablesItem( | ||
new AssetTableModel().name(table.getDatasetTableName()).columns(table.getColumns())); | ||
// First add all person <-> occurrence relationships | ||
assetModel.addFollowItem(table.getPrimaryTableRelationship()); | ||
if (table.getPrimaryTableRelationship() != null) { | ||
assetModel.addFollowItem(table.getPrimaryTableRelationship()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the person table, table.getPrimaryTableRelationship() is null. This was causing an issue when demographics was selected because there was a null item in the follow list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we're now adding the root table separately, do we still need this check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm yeah I think I could make the primary table relationship required on SnapshotBuilderTable now. But that means I can't use the allOf inheritance in the yaml file so it's a little less pretty but worth it because it will more accurately capture the types we want. Unless there's some way to say allOf ..except? Or I could create a base type and have them both inherit all of that, but since that base type wouldn't be used anywhere we probably don't want it in the yaml. So I'll just redefine SnapshotBuilderTable and SnapshotBuilderRootTable separately
table -> table.getSecondaryTableRelationships().forEach(assetModel::addFollowItem)); | ||
table -> { | ||
if (table.getSecondaryTableRelationships() != null) { | ||
table.getSecondaryTableRelationships().forEach(assetModel::addFollowItem); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the null check here too because when we do include the sample table, it won't have secondary relationships which will cause the same issue I saw above for the person table.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved with a few comments.
@@ -688,7 +689,7 @@ public SnapshotAccessRequestResponse getSnapshotAccessRequestById(UUID accessReq | |||
} | |||
|
|||
public AssetSpecification buildAssetFromSnapshotAccessRequest( | |||
Dataset dataset, SnapshotAccessRequestResponse snapshotRequestModel) { | |||
Dataset dataset, SnapshotAccessRequestResponse snapshotAccessRequest) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could swap it to SnapshotAccessRequestRequest
and SnapshotAccessRequest
? Because technically it is making a request for a SnapshotAccessRequest. But I also think it is okay to leave it as is, at least for now.
tables.forEach( | ||
table -> { | ||
assetModel.addTablesItem( | ||
new AssetTableModel().name(table.getDatasetTableName()).columns(table.getColumns())); | ||
// First add all person <-> occurrence relationships | ||
assetModel.addFollowItem(table.getPrimaryTableRelationship()); | ||
if (table.getPrimaryTableRelationship() != null) { | ||
assetModel.addFollowItem(table.getPrimaryTableRelationship()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we're now adding the root table separately, do we still need this check?
Set<String> missing = new HashSet<>(valueSetNames); | ||
missing.removeAll(tableMap.keySet()); | ||
Set<String> missing = new HashSet<>(includedTableNames); | ||
allTables.stream().map(SnapshotBuilderDatasetConceptSet::getName).forEach(missing::remove); | ||
if (!missing.isEmpty()) { | ||
throw new IllegalArgumentException("Unknown value set names: " + missing); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: What do you think about changing this error message now that we don't have value sets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, maybe unknown concept set name?
.name("Procedure") | ||
.table( | ||
new SnapshotBuilderTable() | ||
.datasetTableName(ProcedureOccurrence.TABLE_NAME)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to allow SnapshotBuilderTable
objects to have no primaryTableRelationships
or secondaryTableRelationships
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I didn't populate it here because it isn't used for tests, but based on your other comment, I think we could make primary table relationship required, so then I'd need to populate that here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason for other non-omop datasets that we wouldn't want to make this required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline - its tricky to represent the ideal state so we're just leaving them optional for now.
…me and Relationships (#1732) Co-authored-by: rjohanek <rjohanek@broadinstitute.org>
Jira ticket: https://broadworkbench.atlassian.net/browse/1120
Summary of Changes
There will be a follow up UI change to handle changes to settings/access request - Not yet complete
Testing Strategy