Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support data validation using value and tolerance #370

Closed

Conversation

danielhuppmann
Copy link
Member

@danielhuppmann danielhuppmann commented Aug 26, 2024

This PR implements support for configuring validation criteria using value and atol/rtol (optional). It also adds some safeguards that only either bounds or value (and tolerance) can be configured in a DataValidationCriteria instance. And adds a few tests.

@danielhuppmann danielhuppmann self-assigned this Aug 26, 2024
Copy link
Contributor

@phackstock phackstock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As just discussed bilaterally, I think it would be a good idea to subclass DataValidationCriteria.
You'd make two classes UpperLowerCriteria(DataValidationCriteria) and ValuePlusToleranceCriteria(DataValidationCriteria) (or some better name that what I can come up with now).
UpperLowerCriteria would be mostly what DataValidationCritera is now with attributes upper_bound and lower_bound.
ValuePlusToleranceCriteria(DataValidationCriteria) would have value, rtol, and atol. It would also implement two properties upper_bound and lower_bound.
This way both classes can be used in the same way when validating data.

@@ -21,6 +23,29 @@ class DataValidationCriteria(IamcDataFilter):

upper_bound: float = None
lower_bound: float = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
lower_bound: float = None
lower_bound: float | None = None

rtol: float = None
atol: float = None

@model_validator(mode="before")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the option mode="before" is not needed here:

Suggested change
@model_validator(mode="before")
@model_validator()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, seems to be required.

@danielhuppmann
Copy link
Member Author

Thanks @phackstock, implemented the optional-field and don't allow extra fields anymore.

For separating this out into separate classes for bounds and value/tolerance, the issue is that this yields a very user-unfriendly error message if a validation-item has both value and bounds. I implemented this on the branch https://github.com/danielhuppmann/nomenclature/tree/feature/validate-range-subclasses, maybe you have an idea on how to generate user-intelligible error messages?

@phackstock
Copy link
Contributor

@danielhuppmann, just took a look at your extra branch, implementation wise that's exactly what I meant.
I'll see if I can figure out a neat way to make the error messages better for the user.

@danielhuppmann
Copy link
Member Author

closing in favor of #371

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants