-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DC-1120]Update Snapshot Builder Settings to include Dataset Table Name and Relationships #1732
Changes from all commits
8b37723
955ae07
3699d67
3026a9f
caa14d3
c25db68
365ab9f
2fb7a90
889cfe2
bd61314
9be742e
ef8bd5e
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -35,7 +35,9 @@ | |
import bio.terra.model.SamPolicyModel; | ||
import bio.terra.model.SnapshotAccessRequestResponse; | ||
import bio.terra.model.SnapshotAccessRequestStatus; | ||
import bio.terra.model.SnapshotBuilderFeatureValueGroup; | ||
import bio.terra.model.SnapshotBuilderDatasetConceptSet; | ||
import bio.terra.model.SnapshotBuilderOutputTable; | ||
import bio.terra.model.SnapshotBuilderRootTable; | ||
import bio.terra.model.SnapshotBuilderSettings; | ||
import bio.terra.model.SnapshotBuilderTable; | ||
import bio.terra.model.SnapshotIdsAndRolesModel; | ||
|
@@ -694,30 +696,43 @@ public SnapshotAccessRequestResponse getSnapshotAccessRequestById(UUID accessReq | |
} | ||
|
||
public AssetSpecification buildAssetFromSnapshotAccessRequest( | ||
Dataset dataset, SnapshotAccessRequestResponse snapshotRequestModel) { | ||
Dataset dataset, SnapshotAccessRequestResponse snapshotAccessRequest) { | ||
SnapshotBuilderSettings settings = | ||
snapshotBuilderSettingsDao.getBySnapshotId(snapshotAccessRequest.getSourceSnapshotId()); | ||
// build asset model from snapshot request | ||
SnapshotBuilderRootTable rootTable = settings.getRootTable(); | ||
AssetModel assetModel = | ||
new AssetModel() | ||
.name("snapshot-by-request-asset") | ||
.rootTable("person") | ||
.rootColumn("person_id"); | ||
// Manually add dictionary tables, leave columns empty to return all columns | ||
assetModel.addTablesItem(new AssetTableModel().name("person")); | ||
.rootTable(rootTable.getDatasetTableName()) | ||
.rootColumn(rootTable.getRootColumn()); | ||
|
||
// Add root table, leave columns empty to return all columns | ||
assetModel.addTablesItem(new AssetTableModel().name(rootTable.getDatasetTableName())); | ||
|
||
// Build asset tables and columns based on the concept sets included in the snapshot request | ||
List<SnapshotBuilderTable> tables = pullTables(snapshotRequestModel); | ||
List<SnapshotBuilderTable> tables = pullTables(snapshotAccessRequest, settings); | ||
tables.forEach( | ||
table -> { | ||
assetModel.addTablesItem( | ||
new AssetTableModel().name(table.getDatasetTableName()).columns(table.getColumns())); | ||
// First add all person <-> occurrence relationships | ||
assetModel.addFollowItem(table.getPrimaryTableRelationship()); | ||
if (table.getPrimaryTableRelationship() != null) { | ||
assetModel.addFollowItem(table.getPrimaryTableRelationship()); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. for the person table, table.getPrimaryTableRelationship() is null. This was causing an issue when demographics was selected because there was a null item in the follow list There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Given that we're now adding the root table separately, do we still need this check? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm yeah I think I could make the primary table relationship required on SnapshotBuilderTable now. But that means I can't use the allOf inheritance in the yaml file so it's a little less pretty but worth it because it will more accurately capture the types we want. Unless there's some way to say allOf ..except? Or I could create a base type and have them both inherit all of that, but since that base type wouldn't be used anywhere we probably don't want it in the yaml. So I'll just redefine SnapshotBuilderTable and SnapshotBuilderRootTable separately |
||
} | ||
}); | ||
|
||
// Second, add all occurrence <-> concept relationships | ||
tables.forEach( | ||
table -> table.getSecondaryTableRelationships().forEach(assetModel::addFollowItem)); | ||
table -> { | ||
if (table.getSecondaryTableRelationships() != null) { | ||
table.getSecondaryTableRelationships().forEach(assetModel::addFollowItem); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added the null check here too because when we do include the sample table, it won't have secondary relationships which will cause the same issue I saw above for the person table. |
||
} | ||
}); | ||
|
||
assetModel.addTablesItem(new AssetTableModel().name("concept")); | ||
// Add dictionary table, leave columns empty to return all columns | ||
assetModel.addTablesItem( | ||
new AssetTableModel().name(settings.getDictionaryTable().getDatasetTableName())); | ||
|
||
// Make sure we just built a valid asset model | ||
dataset.validateDatasetAssetSpecification(assetModel); | ||
|
@@ -727,116 +742,25 @@ public AssetSpecification buildAssetFromSnapshotAccessRequest( | |
} | ||
|
||
@VisibleForTesting | ||
List<SnapshotBuilderTable> pullTables(SnapshotAccessRequestResponse snapshotRequestModel) { | ||
var valueSets = snapshotRequestModel.getSnapshotSpecification().getValueSets(); | ||
var valueSetNames = valueSets.stream().map(SnapshotBuilderFeatureValueGroup::getName).toList(); | ||
List<SnapshotBuilderTable> pullTables( | ||
SnapshotAccessRequestResponse snapshotAccessRequest, SnapshotBuilderSettings settings) { | ||
List<String> includedTableNames = | ||
snapshotAccessRequest.getSnapshotSpecification().getOutputTables().stream() | ||
.map(SnapshotBuilderOutputTable::getName) | ||
.toList(); | ||
|
||
Map<String, SnapshotBuilderTable> tableMap = populateManualTableMap(); | ||
List<SnapshotBuilderDatasetConceptSet> allTables = settings.getDatasetConceptSets(); | ||
|
||
Set<String> missing = new HashSet<>(valueSetNames); | ||
missing.removeAll(tableMap.keySet()); | ||
Set<String> missing = new HashSet<>(includedTableNames); | ||
allTables.stream().map(SnapshotBuilderDatasetConceptSet::getName).forEach(missing::remove); | ||
if (!missing.isEmpty()) { | ||
throw new IllegalArgumentException("Unknown value set names: " + missing); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: What do you think about changing this error message now that we don't have value sets? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah, maybe unknown concept set name? |
||
} | ||
|
||
return valueSetNames.stream().map(tableMap::get).toList(); | ||
} | ||
|
||
private Map<String, SnapshotBuilderTable> populateManualTableMap() { | ||
// manual definition of domain names -> dataset table | ||
Map<String, SnapshotBuilderTable> tableMap = new HashMap<>(); | ||
tableMap.put( | ||
"Demographics", | ||
new SnapshotBuilderTable() | ||
.datasetTableName("person") | ||
.secondaryTableRelationships( | ||
List.of( | ||
"fpk_person_gender_concept", | ||
"fpk_person_race_concept", | ||
"fpk_person_ethnicity_concept", | ||
"fpk_person_gender_concept_s", | ||
"fpk_person_race_concept_s", | ||
"fpk_person_ethnicity_concept_s"))); | ||
tableMap.put( | ||
"Drug", | ||
new SnapshotBuilderTable() | ||
.datasetTableName("drug_exposure") | ||
.primaryTableRelationship("fpk_person_drug") | ||
.secondaryTableRelationships( | ||
List.of( | ||
"fpk_drug_concept", | ||
"fpk_drug_type_concept", | ||
"fpk_drug_route_concept", | ||
"fpk_drug_concept_s"))); | ||
tableMap.put( | ||
"Measurement", | ||
new SnapshotBuilderTable() | ||
.datasetTableName("measurement") | ||
.primaryTableRelationship("fpk_person_measurement") | ||
.secondaryTableRelationships( | ||
List.of( | ||
"fpk_measurement_concept", | ||
"fpk_measurement_unit", | ||
"fpk_measurement_concept_s", | ||
"fpk_measurement_value", | ||
"fpk_measurement_type_concept", | ||
"fpk_measurement_operator"))); | ||
tableMap.put( | ||
"Visit", | ||
new SnapshotBuilderTable() | ||
.datasetTableName("visit_occurrence") | ||
.primaryTableRelationship("fpk_person_visit") | ||
.secondaryTableRelationships( | ||
List.of( | ||
"fpk_visit_preceding", | ||
"fpk_visit_concept_s", | ||
"fpk_visit_type_concept", | ||
"fpk_visit_concept", | ||
"fpk_visit_discharge"))); | ||
tableMap.put( | ||
"Device", | ||
new SnapshotBuilderTable() | ||
.datasetTableName("device_exposure") | ||
.primaryTableRelationship("fpk_person_device") | ||
.secondaryTableRelationships( | ||
List.of("fpk_device_concept", "fpk_device_concept_s", "fpk_device_type_concept"))); | ||
tableMap.put( | ||
"Condition", | ||
new SnapshotBuilderTable() | ||
.datasetTableName("condition_occurrence") | ||
.primaryTableRelationship("fpk_person_condition") | ||
.secondaryTableRelationships( | ||
List.of( | ||
"fpk_condition_concept", | ||
"fpk_condition_type_concept", | ||
"fpk_condition_status_concept", | ||
"fpk_condition_concept_s"))); | ||
tableMap.put( | ||
"Procedure", | ||
new SnapshotBuilderTable() | ||
.datasetTableName("procedure_occurrence") | ||
.primaryTableRelationship("fpk_person_procedure") | ||
.secondaryTableRelationships( | ||
List.of( | ||
"fpk_procedure_concept", | ||
"fpk_procedure_concept_s", | ||
"fpk_procedure_type_concept", | ||
"fpk_procedure_modifier"))); | ||
tableMap.put( | ||
"Observation", | ||
new SnapshotBuilderTable() | ||
.datasetTableName("observation") | ||
.primaryTableRelationship("fpk_person_observation") | ||
.secondaryTableRelationships( | ||
List.of( | ||
"fpk_observation_concept", | ||
"fpk_observation_concept_s", | ||
"fpk_observation_unit", | ||
"fpk_observation_qualifier", | ||
"fpk_observation_type_concept", | ||
"fpk_observation_value"))); | ||
|
||
return tableMap; | ||
return allTables.stream() | ||
.filter((table) -> includedTableNames.contains(table.getName())) | ||
.map(SnapshotBuilderDatasetConceptSet::getTable) | ||
.toList(); | ||
} | ||
|
||
public static AssetSpecification getAssetByNameFromDataset(Dataset dataset, String assetName) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be too much to rename the type as well, in the openapi spec? My thinking here is that the response from the API happens to be a SnapshotAccessRequest object, but not all SnapshotAccessRequest objects are actually a response from the API. Since we use the "response" object in non-response situations, it doesn't make sense for response to be part of the type name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, maybe there is a better name. We have a SnapshotAccessRequest object that represents the request submitted from the user, not something retrieved from the database, rather the input to the database. This is a subset of the data in the SnapshotAccessRequestResponse object which is always the object retrieved from the database, so in a way it kind of is a response. If we want to keep these types distinct, we could leave it as is or some up with a more precise name. If we don't want these types to be distinct, we could make the fields that don't exist on the SnapshotAccessRequest optional on the SnapshotAccessRequestResponse object and use that to represent both. But, I think I might prefer leaving them distinct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree to keep the types separate. I do think we may want to separate the database type from the response type at some point; there may be some data that isn't in the database that we'd want to return, or vice versa. It also makes things confusing in that "request" sometimes means an API request and sometimes means an access request (which itself is part of an API request).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could swap it to
SnapshotAccessRequestRequest
andSnapshotAccessRequest
? Because technically it is making a request for a SnapshotAccessRequest. But I also think it is okay to leave it as is, at least for now.