Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for groovy static analysis for groovy scripts #14844

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

abhishekbafna
Copy link

@abhishekbafna abhishekbafna commented Jan 20, 2025

Github Issue: #14995

Google Doc: https://docs.google.com/document/d/10-j1hevpwOWzaU8q0ndqv3toRBWTiPfOWy6xHojoiL4

It adds the support for static analysis for the groovy functions. Users can configure the allowed receivers, allowed imports, allowed static imports, disallowed method names, toggle method definition allowed in the groovy scripts.

A groovy listener to update the configuration on the server nodes.

  • REST APIs for getting the default/sample groovy configuration, update/create groovy configuration and get the current groovy configurations.
  • The AST analysis covers for query, ingestion and table config updates via controller.

Testing

  • Unit tests
  • Local testing using quickstart setup

@codecov-commenter
Copy link

codecov-commenter commented Jan 20, 2025

Codecov Report

Attention: Patch coverage is 63.15789% with 56 lines in your changes missing coverage. Please review.

Project coverage is 63.69%. Comparing base (59551e4) to head (5f19543).
Report is 1692 commits behind head on master.

Files with missing lines Patch % Lines
.../controller/api/resources/PinotClusterConfigs.java 0.00% 24 Missing ⚠️
...egment/local/function/GroovyFunctionEvaluator.java 70.00% 17 Missing and 1 partial ⚠️
...ent/local/function/GroovyStaticAnalyzerConfig.java 87.17% 5 Missing ⚠️
.../pinot/server/starter/helix/BaseServerStarter.java 0.00% 5 Missing ⚠️
...ava/org/apache/pinot/minion/BaseMinionStarter.java 0.00% 2 Missing ⚠️
...ent/local/function/GroovyConfigChangeListener.java 85.71% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #14844      +/-   ##
============================================
+ Coverage     61.75%   63.69%   +1.94%     
- Complexity      207     1482    +1275     
============================================
  Files          2436     2729     +293     
  Lines        133233   152590   +19357     
  Branches      20636    23565    +2929     
============================================
+ Hits          82274    97189   +14915     
- Misses        44911    48089    +3178     
- Partials       6048     7312    +1264     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 63.66% <63.15%> (+1.95%) ⬆️
java-21 63.56% <63.15%> (+1.94%) ⬆️
skip-bytebuffers-false 63.68% <63.15%> (+1.93%) ⬆️
skip-bytebuffers-true 63.54% <63.15%> (+35.81%) ⬆️
temurin 63.69% <63.15%> (+1.94%) ⬆️
unittests 63.68% <63.15%> (+1.94%) ⬆️
unittests1 56.25% <61.86%> (+9.36%) ⬆️
unittests2 34.00% <31.57%> (+6.27%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@soumitra-st
Copy link
Contributor

I think the checks are disabled by default. Can you create a doc on various configurations supported by Pinot?

@abhishekbafna
Copy link
Author

Github issue link: #14995

@abhishekbafna
Copy link
Author

I think the checks are disabled by default. Can you create a doc on various configurations supported by Pinot?

@soumitra-st I have created a github issue and linked the document. Please review it. Thank you.

Copy link
Contributor

@soumitra-st soumitra-st left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@npawar
Copy link
Contributor

npawar commented Feb 7, 2025

Github Issue: #14995

Google Doc: https://docs.google.com/document/d/10-j1hevpwOWzaU8q0ndqv3toRBWTiPfOWy6xHojoiL4

It adds the support for static analysis for the groovy functions. Users can configure the allowed receivers, allowed imports, allowed static imports, disallowed method names, toggle method definition allowed in the groovy scripts.

  • REST APIs for getting the default/sample groovy configuration, update/create groovy configuration and get the current groovy configurations.
  • The AST analysis covers for query, ingestion and table config updates via controller.

Testing

  • Unit tests
  • Local testing using quickstart setup

Please add the new config in the PR description, and call out a) where does it need to be set (server/minion/controller conf or cluster config) b) is restart required when it's changed

@npawar npawar added release-notes Referenced by PRs that need attention when compiling the next release notes security labels Feb 7, 2025
@Jackie-Jiang
Copy link
Contributor

Nice! Is this another attempt of #14197?

@Jackie-Jiang Jackie-Jiang added documentation Configuration Config changes (addition/deletion/change in behavior) labels Feb 7, 2025
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is static analysis, we should apply the analysis in 2 places:

  1. When table config gets created/updated, we want to validate the groovy transforms configured
  2. When broker gets a query, we want to validate the groovy transform within it

Take a look at where we block groovy functions right now. We can integrate this logic at the same place.

@abhishekbafna
Copy link
Author

Since this is static analysis, we should apply the analysis in 2 places:

  1. When table config gets created/updated, we want to validate the groovy transforms configured
  2. When broker gets a query, we want to validate the groovy transform within it

Take a look at where we block groovy functions right now. We can integrate this logic at the same place.

The analysis applied at both the suggested places.

  1. The groovy expression is validated and blocked (if found containing in-secure code) at the table creation and update stage. This happens in the controller.
  2. For the queries, the analysis happens in the server and failure is reported.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Configuration Config changes (addition/deletion/change in behavior) documentation release-notes Referenced by PRs that need attention when compiling the next release notes security
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants