Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1072 qc automate state properties in qualitycontrol and evaluation based on metric states #1074

Conversation

dbirman
Copy link
Member

@dbirman dbirman commented Sep 23, 2024

This PR refactors the overall_status and evaluation_status fields to built from the status of the QCMetric objects in each evaluation. The complication to this PR is that it would require people to build QualityControl objects to use these features, if you dump to JSON and you don't re-validate as a model you can only access the private status lists indirectly. You can still use the schema without building the classes you just don't get access to the convenience functions that are defined using the @property tags.

Obviously the hope is that nobody has to deal with this stuff since the QC portal handles it for them.

I'll walk through the tests to explain how this works:

        test_eval = QCEvaluation(
            evaluation_name="Drift map",
            evaluation_modality=Modality.ECEPHYS,
            evaluation_stage=Stage.PROCESSING,
            qc_metrics=[
                QCMetric(
                    name="Multiple values example",
                    value={"stuff": "in_a_dict"},
                    metric_status_history=[
                        QCStatus(evaluator="Bob", timestamp=datetime.fromisoformat("2020-10-10"), status=Status.PASS)
                    ],
                ),
                QCMetric(
                    name="Drift map pass/fail",
                    value=False,
                    description="Manual evaluation of whether the drift map looks good",
                    reference="s3://some-data-somewhere",
                    metric_status_history=[
                        QCStatus(evaluator="Bob", timestamp=datetime.fromisoformat("2020-10-10"), status=Status.PASS)
                    ],
                ),
            ],
        )


        q = QualityControl(
            evaluations=[test_eval, test_eval],
        )

This object has status set for the metrics, but not for the evaluation or the full quality control object. If you now call:

        q.evaluate_status()

You'll find that self.assertEqual(q.overall_status.status, Status.PASS)

If you append a metric to one of the evaluations that is set to Status.PENDING or FAIL the evaluation status and the overall status will subsequently resolve to that state:

        # Add a pending metric to the first evaluation
        q.evaluations[0].qc_metrics.append(
            QCMetric(
                name="Drift map pass/fail",
                value=False,
                description="Manual evaluation of whether the drift map looks good",
                reference="s3://some-data-somewhere",
                metric_status_history=[
                    QCStatus(
                        evaluator="Automated", timestamp=datetime.fromisoformat("2020-10-10"), status=Status.PENDING
                    )
                ],
            )
        )

        q.evaluate_status()
        self.assertEqual(q.overall_status.status, Status.PENDING)

Timestamps automatically get set to datetime.now() unless the user specifies a timestamp, to allow for generating the qc.json separately from the actual processing that happened.

@dbirman dbirman requested a review from saskiad September 23, 2024 16:28
You can't do what I wanted (hide the _evaluation_status list) because pydantic doesn't serialize the field. So we have to keep the history exposed but we'll have to encourage people not to access it directly
Copy link
Collaborator

@saskiad saskiad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's use the same datetime that we use throughout the schema.

I'm still a little confused by the structure overall - I might have you explain it for me again (or look with fresh eyes in the morning)

@dbirman dbirman linked an issue Oct 1, 2024 that may be closed by this pull request
@dbirman
Copy link
Member Author

dbirman commented Oct 1, 2024

This PR now also closes two other issues that involved adding asset_id fields to QCEvaluation and adding a validator to QCMetric to make sure that a QCStatus object is included when pushed (and not just an empty list in metric_status_history)

Copy link
Collaborator

@saskiad saskiad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to deal with the asset id separately. Otherwise I think it looks good.

@saskiad saskiad self-requested a review October 3, 2024 23:10
@saskiad saskiad added this pull request to the merge queue Oct 3, 2024
Merged via the queue into dev with commit e880ccd Oct 3, 2024
5 checks passed
@saskiad saskiad deleted the 1072-qc-automate-state-properties-in-qualitycontrol-and-evaluation-based-on-metric-states branch October 3, 2024 23:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants