Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support not-null constraints as part of primary key definition #373

Closed
katieclaiborne opened this issue Aug 23, 2023 · 7 comments · Fixed by #452
Closed

Support not-null constraints as part of primary key definition #373

katieclaiborne opened this issue Aug 23, 2023 · 7 comments · Fixed by #452
Labels
enhancement New feature or request Stale

Comments

@katieclaiborne
Copy link

katieclaiborne commented Aug 23, 2023

Describe the feature

I'd like for a model with a unique test and a not-null constraint applied to the same column to satisfy the primary key rule.

Describe alternatives you've considered

In our project, we've considered overriding the primary_key_test_macros variables to define a uniqueness test alone as valid. This would allow developers to replace not-null tests with constraints, but it wouldn't validate that they have.

    primary_key_test_macros: [
      ["data_platform.test_unique"],
      ["data_platform.test_unique_combination_of_columns"]
    ]

Additional context

We use BigQuery, so the not-null constraint is the only one available to us.

Who will this benefit?

Teams looking to leverage constraints without duplicating existing tests or sacrificing test coverage.

Are you interested in contributing this feature?

I'd love to contribute, but would need a hand getting started.

I've poked around a bit to understand how the primary_key_test_macros variable is currently used, and I've inspected how constraints appear in the project manifest. The tricky bit seems to be that column-level constraints appear as part of the relevant model node, while tests appear as their own nodes, with dependencies on (test) macro and model nodes.

@katieclaiborne katieclaiborne added the enhancement New feature or request label Aug 23, 2023
@dave-connors-3
Copy link
Collaborator

dave-connors-3 commented Aug 23, 2023

Hey @katieclaiborne! This is a really great idea!

I think we've intentionally stayed away from too much column-level madness, but I think this could be a good excuse to investigate how to bring those concepts into the package!

To start, we could pull in all column info as a JSON string from the graph object in stg_nodes, to at least pull in the relevant data. Then, we could write another CTE in the int_model_test_summary that flattens and joins that column data back, and checks whether the set of tests and constraints on the column match the full set of tests and constraints you'd like to consider as acceptable.

I can see a couple places where it could get tough!

  1. Not all warehouses are build the same when it comes to dealing with JSON -- would likely require some tricky macro dispatching to make this cross-warehouse compatible!
  2. The logic could get a little weird if we have multiple definitions of "primary key tested" -- in your example, I assume you'd still like non-contracted (and therefore constrained) models to still pass the litmus test if they had a not_null and unique test applied, even without the constraint. We may need to come up with a nice mechanism to define the acceptance criteria across tests and constraints that isn't too hairy

I'm happy to do a little early research and report back!

@katieclaiborne
Copy link
Author

That'd be great! Thanks, Dave.

Copy link
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Feb 21, 2024
Copy link
Contributor

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Feb 29, 2024
@dave-connors-3
Copy link
Collaborator

@katieclaiborne i am definitely willing to keep this open if you want! i think with the new column-level model, we might have some ability to add this in. Any interest in a contribution?

@katieclaiborne
Copy link
Author

Yes! Thank you, Dave.

@katieclaiborne7
Copy link
Contributor

@dave-connors-3, hello from the other side!

Could you offer a bit of guidance on how you would approach this? I have some time on my hands, and would be game to give it a try.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Stale
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants