Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Discover] Cancel S3 Queries Using SQL Plugin #9355

Merged
merged 4 commits into from
Feb 19, 2025

Conversation

sejli
Copy link
Member

@sejli sejli commented Feb 7, 2025

Description

When ongoing S3 queries are aborted in Discover, the backend job should also be canceled/deleted. This PR uses the SQL plugin API to delete ongoing jobs to cancel them.

Issues Resolved

Screenshot

Testing the changes

Changelog

  • feat: Deletes S3 Jobs in Backend when Original Query is Canceled

Check List

  • All tests pass
    • yarn test:jest
    • yarn test:jest_integration
  • New functionality includes testing.
  • New functionality has been documented.
  • Update CHANGELOG.md
  • Commits are signed per the DCO using --signoff

Signed-off-by: Sean Li <lnse@amazon.com>
@sejli sejli added discover for discover reinvent backport 2.x v2.20.0 labels Feb 7, 2025
Copy link

codecov bot commented Feb 7, 2025

Codecov Report

Attention: Patch coverage is 0% with 12 lines in your changes missing coverage. Please review.

Project coverage is 61.70%. Comparing base (0994dbe) to head (6b8e830).
Report is 24 commits behind head on main.

Files with missing lines Patch % Lines
...nts/server/routes/data_source_connection/routes.ts 0.00% 7 Missing ⚠️
src/plugins/query_enhancements/common/utils.ts 0.00% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9355      +/-   ##
==========================================
- Coverage   61.71%   61.70%   -0.01%     
==========================================
  Files        3816     3817       +1     
  Lines       91829    91874      +45     
  Branches    14543    14555      +12     
==========================================
+ Hits        56668    56695      +27     
- Misses      31506    31522      +16     
- Partials     3655     3657       +2     
Flag Coverage Δ
Linux_1 29.00% <ø> (+<0.01%) ⬆️
Linux_2 56.46% <ø> (ø)
Linux_3 39.19% <ø> (+<0.01%) ⬆️
Linux_4 28.89% <0.00%> (-0.02%) ⬇️
Windows_1 29.01% <ø> (+<0.01%) ⬆️
Windows_2 56.41% <ø> (ø)
Windows_3 39.20% <ø> (+0.01%) ⬆️
Windows_4 28.89% <0.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Sean Li <lnse@amazon.com>
@kavilla
Copy link
Member

kavilla commented Feb 12, 2025

i think looks good to me. can we confirm this is wired up ? in the places we expect. or did we have fast follows to add that. as in abort query on navigation away from discover. abort query if closing the dataset selector. abort query if new query is fired (pretty sure already done)

kavilla
kavilla previously approved these changes Feb 13, 2025
@sejli
Copy link
Member Author

sejli commented Feb 14, 2025

So far, it's wired up to the point where the delete request will be sent when the query enhancement's fetch() function is aborted. So, entirely dependent on how this fetch function picks up the AbortSignal.

abort query if new query is fired

This is the main thing that this PR acheives.

abort query if closing the dataset selector.

Do we want this? If we start loading something like databases and close the dataset selector, wouldn't we want that to load in the background? I think it's intended behavior.

as in abort query on navigation away from discover

Think this will probably need a fast followup or some improvements on how I'm doing this.

Copy link
Member

@joshuali925 joshuali925 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we also want to add a cancel button on UI?

signal,
})
.catch(async (error) => {
if (error.name === 'AbortError' && context.body?.pollQueryResultsParams?.queryId) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can't remember but does the current UI abort requests when user clicks Run query before the previous one finishes?

For example,

  1. user starts async query
  2. query id returned
  3. browser sends requests to check job status
  4. server returns job status
  5. browser waits a few seconds if job is pending
  6. step 3-5 repeats until query finishes

If the request in step 3 is aborted, then this would work. But if user runs another query during step 5, would this abort and delete job for the previous query?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this scenario, yes, it will abort if another query is run in between the short polling requests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you explain a bit why would it abort? i thought during waiting time, the last fetch promise will be resolved and catch won't trigger?

});
} catch (cancelError) {
// eslint-disable-next-line no-console
console.error('Failed to cancel query:', cancelError);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it throw the cancel error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think it's necessary, since we don't want this thrown error to stop any other process.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but should we let user know if it failed? For example if I accidentally ran a select * on all the data, it might be very expensive. If I started another query or clicked cancel and I don't see any warning on UI, i would assume i won't be fully charged for the previous query but in fact it might still be running

try {
await http.fetch({
method: 'DELETE',
path: API.DATA_SOURCE.ASYNC_JOBS,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this allow each dataset type config to define the action to delete query job? there might be other type configs using pollQueryResultsParams but does not use SQL plugin async jobs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of our followup items to discuss is how we will add the cancel button, which is why I haven't included it in this PR. However, adding a dataset type config would be a good way to do that implementation, thanks for that tip.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, it might be better if S3 type config provides this catch logic as an error handler

});
return response.noContent();
} catch (error) {
const statusCode = error.statusCode === 500 ? 503 : error.statusCode || 503;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you describe more or may be add comment what exactly the action being taken with these error codes,

also can we write it like to make it more readable?
const statusCode = error.statusCode && error.statusCode !== 500 ? error.statusCode : 503;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was added as a way to handle specific errors from SQL plugin. Was mimicking the other routes, which were added in this PR.

Signed-off-by: Sean Li <lnse@amazon.com>
cy.get(`[title="default"]`).click();
cy.get(`[title="http_logs"]`).click();
cy.getElementByTestId('datasetSelectorNext').click();
cy.get(`[class="euiModalHeader__title"]`).should('contain', 'Step 2: Configure data');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: given that this test just verify if it aborts verifyhing if selecting the title has this increases potential points of failure unrelated to this test.

in this test it would be better to just navigate to the page with a s3 query run it and then run it again focusing on the test name.

but this should be addressed later not now.

cy.getElementByTestId('advancedSelectorConfirmButton').click();

// Need to wait a bit for initial query to start
cy.wait(3000);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how come? not blocker but something we should address is avoiding abritraty wait times like a env variable that we can control globally. step 2 would be using other features that help us wait without waiting and assuming everything will happen within 3 secs

like i said tho we shouldn't address this in this pr

cy.getElementByTestId(`querySubmitButton`).click();

cy.wait('@cancelRequest').then((interception) => {
console.log('interception.request.url:', interception.request.url);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did we want to console log?

cy.waitForSearch();
cy.get(`[data-test-subj="queryResultCompleteMsg"]`).should('be.visible');
cy.getElementByTestId('docTable').should('be.visible');
cy.getElementByTestId('docTable').find('tr').should('have.length', 11);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we potentially luck out on this being 11? for no matter what table.

would it be better if we modify the query to be something we know like SELECT * FROM foo LIMIT 10 then have it be like LIMIT 20 and then we make sure it is still 10?

Copy link
Member

@joshuali925 joshuali925 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my main concerns are #9355 (comment) and extensibility of this implementation, but i don't think they are blocking this PR

@sejli sejli merged commit 3c90fa3 into opensearch-project:main Feb 19, 2025
72 of 73 checks passed
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch-Dashboards/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch-Dashboards/backport-2.x
# Create a new branch
git switch --create backport/backport-9355-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 3c90fa38c1bf53db1858ac1f710c3413f4e5fe7a
# Push it to GitHub
git push --set-upstream origin backport/backport-9355-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch-Dashboards/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-9355-to-2.x.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants