-
Notifications
You must be signed in to change notification settings - Fork 83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor bwc test suite to re-use existing resources between tests #1171
refactor bwc test suite to re-use existing resources between tests #1171
Conversation
dbb1174
to
69edfc7
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1171 +/- ##
============================================
- Coverage 81.72% 81.55% -0.17%
+ Complexity 2494 1244 -1250
============================================
Files 186 93 -93
Lines 8426 4213 -4213
Branches 1428 714 -714
============================================
- Hits 6886 3436 -3450
+ Misses 1000 506 -494
+ Partials 540 271 -269 ☔ View full report in Codecov by Sentry. |
Why do we need this change? Each test should be run independently in a dedicated environment with fresh, uncontaminated data, preventing interference from other tests and ensuring the test accurately evaluates the functionality of the component being tested in isolation. |
We have faced earlier issues with this where during the test run due to shared names the pipeline either used to get deleted or wrong information has been shared amongst test. |
@@ -28,9 +28,9 @@ public void testSemanticSearch_E2EFlow() throws Exception { | |||
waitForClusterHealthGreen(NODES_BWC_CLUSTER, 90); | |||
switch (getClusterType()) { | |||
case OLD: | |||
modelId = uploadTextEmbeddingModel(); | |||
modelId = getOrUploadTextEmbeddingModel(getIngestionPipeline(PIPELINE_NAME), TEXT_EMBEDDING_PROCESSOR); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of OLD it will always upload. Then why it is changed to getOrUpload?
@@ -373,11 +373,11 @@ private static int getHitCount(final Map<String, Object> searchResponseAsMap) { | |||
} | |||
|
|||
public static String getModelId(Map<String, Object> pipeline, String processor) { | |||
assertNotNull(pipeline); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition validates that pipeline created earlier has the model Id. At line 377 we are fetching information from a pipeline. So we need to ensure that it is not null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it is a validation condition
@@ -373,11 +373,11 @@ private static int getHitCount(final Map<String, Object> searchResponseAsMap) { | |||
} | |||
|
|||
public static String getModelId(Map<String, Object> pipeline, String processor) { | |||
assertNotNull(pipeline); | |||
if (pipeline == null) return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In anycase, it will not be null.
I suggest reusing only the model. Pipeline is cheap to create. We likely need just one model per type: @vibrantvarun The reason for reusing the model is that during BWC tests, multiple tests deploy the same model simultaneously, leading to model deployment failures during node upgrades. This, in turn, causes flaky test failures. |
@heemin32 I think test resources created for any test should be just meant to run it. The root cause of the model deployment failure is not the redeployment of the model. It should be something else. |
Our goal isn’t to test model deployment—that should be covered by Model deployment is resource-intensive, and if each test deploys its own model, the test cluster might not be large enough to handle them all at once. Sharing models across tests shouldn’t affect test coverage or validity in any way. The primary objective here is to avoid flaky test failures caused by model deployment issues. We also want to ensure that problems in |
@heemin32 in bwc we do test the model deployed in older version of node still exists in new version of node. Because of that test we are always able to find critical issues with ml-commons just before the release. |
FYI: ml-commons till data does not have BWC framework in place. We as neural-search catch issues in BWC tests. |
@heemin32 I am aligned with you on reducing test flakiness but we need to find the actual root cause. Here we are trying to reduce overhead on us by doing less model deployment at the cost of removing model deployment test in bwc. Model deployment is part of each feature we release. |
bb2012d
to
14cf7f9
Compare
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
This reverts commit 4cbf419. Signed-off-by: will-hwang <sang7239@gmail.com>
This reverts commit 89bbcb5. Signed-off-by: will-hwang <sang7239@gmail.com>
This reverts commit 2423e13. Signed-off-by: will-hwang <sang7239@gmail.com>
This reverts commit 69edfc7. Signed-off-by: will-hwang <sang7239@gmail.com>
…tests" This reverts commit 4fb6deb. Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
This reverts commit e371faf. Signed-off-by: will-hwang <sang7239@gmail.com>
This reverts commit 94997a1. Signed-off-by: will-hwang <sang7239@gmail.com>
This reverts commit 7513c4b. Signed-off-by: will-hwang <sang7239@gmail.com>
This reverts commit d4c0d42. Signed-off-by: will-hwang <sang7239@gmail.com>
This reverts commit 65c8add. Signed-off-by: will-hwang <sang7239@gmail.com>
This reverts commit 0ec949c. Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
Signed-off-by: will-hwang <sang7239@gmail.com>
92cb543
to
2d0f6a1
Compare
The main objective of BWC tests is to ensure backward compatibility, not to conduct load testing. Currently, we are deploying the same text embedding model multiple times—once for each test. This approach doesn’t align with real customer scenarios, as they typically wouldn’t deploy the same model repeatedly. Reusing the model across tests will not compromise the purpose of the BWC tests but will help stabilize them, allowing us to focus on what truly matters. |
@will-hwang are you still working on this PR after neural got migrated to 3.0-alpha? |
Description
Currently, models and pipelines are re-created and re-deployed per test, leading to redundant model loads and pipeline creations. This change avoids this redundancy by reusing the created resource between test cases.
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
--signoff
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.