Partially-exposed CustomQuery #3053

Closed
jc-harrison opened this issue Sep 22, 2020 · 2 comments
Labels
enhancement New feature or request FlowAPI Issues related to the FlowKit API FlowMachine Issues related to FlowMachine

Comments

@jc-harrison
Member

There may be times (e.g. when developing or prototyping a new method) when API access is insufficiently flexible even though direct access to individual-level data is not ultimately required. In some such situations, FlowAPI could be suitable if it enabled analysts to run their own SQL in a CustomQuery.

However, directly exposing CustomQuery through the API would allow analysts to run any SQL, and access individual-level results, so that just moves the problem rather than solving it.

If we instead exposed CustomQuery as a sub-query but not as a top-level query, then a user could get the aggregated output of a custom query without being able to access the custom query's output directly. For example, a user could run a joined_spatial_aggregate query with custom_location and/or custom_metric sub-queries. The sub-queries would take user-provided SQL, with a requirement that the provided SQL returns the appropriate columns.
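As a rough sketch of what the column requirement could look like (this is not FlowKit code: the `REQUIRED_COLUMNS` mapping, the column names, the helper functions and the `visits` table are all hypothetical, and SQLite stands in for the real database), the user-provided SQL could be wrapped in a zero-row select so that its output columns can be inspected before it is used as a sub-query:

```python
import sqlite3

# Hypothetical required output columns for each custom sub-query kind.
REQUIRED_COLUMNS = {
    "custom_location": ("subscriber", "location"),
    "custom_metric": ("subscriber", "value"),
}


def returned_columns(conn, sql):
    """Return the column names the user's SQL would produce, without fetching rows."""
    cursor = conn.execute(f"SELECT * FROM ({sql}) AS user_query LIMIT 0")
    return tuple(col[0] for col in cursor.description)


def check_custom_subquery(conn, kind, sql):
    """Raise ValueError unless the SQL returns exactly the columns required for `kind`."""
    expected = REQUIRED_COLUMNS[kind]
    actual = returned_columns(conn, sql)
    if actual != expected:
        raise ValueError(f"{kind} must return columns {expected}, got {actual}")


# Toy data, assumed purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (subscriber TEXT, cell TEXT)")

# Accepted: returns (subscriber, location).
check_custom_subquery(
    conn, "custom_location", "SELECT subscriber, cell AS location FROM visits"
)

# Rejected: returns (subscriber, cell).
try:
    check_custom_subquery(conn, "custom_location", "SELECT subscriber, cell FROM visits")
    rejected = False
except ValueError:
    rejected = True
```

Note that a check like this only constrains the names of the returned columns. It says nothing about what the SQL reads or what values it puts in those columns.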

I think we'd want to advise against enabling these query kinds for "standard" users, and only give permission when a user needs to do some prototyping for something that could later be incorporated as a well-defined new query kind. We'd also need to carefully consider the implications (e.g. could a malicious user access individual-level results by writing a query that returns subscriber IDs in a location column, or something like that?).
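To make the concern about subscriber IDs in a location column concrete, here is a minimal sketch using SQLite and made-up data (the `visits` and `metric` tables, the column names and the `aggregate_over` helper are assumptions for illustration, not FlowKit's schema or API). The aggregation step is the same in both calls; only the user-provided location SQL differs:

```python
import sqlite3


def aggregate_over(conn, location_sql):
    """Average a per-subscriber metric by location, with locations from user-provided SQL."""
    query = f"""
        SELECT loc.location, AVG(m.value) AS value, COUNT(*) AS n
        FROM ({location_sql}) AS loc
        JOIN metric AS m USING (subscriber)
        GROUP BY loc.location
        ORDER BY loc.location
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE visits (subscriber TEXT, cell TEXT);
    CREATE TABLE metric (subscriber TEXT, value REAL);
    INSERT INTO visits VALUES
        ('sub_a', 'cell_1'), ('sub_b', 'cell_1'),
        ('sub_c', 'cell_2'), ('sub_d', 'cell_2');
    INSERT INTO metric VALUES
        ('sub_a', 10.0), ('sub_b', 30.0), ('sub_c', 5.0), ('sub_d', 15.0);
    """
)

# Intended use: locations are cells, so each output row covers several subscribers.
honest = aggregate_over(conn, "SELECT subscriber, cell AS location FROM visits")

# Abuse: the subscriber ID is returned as the "location", so every group has size one
# and the "aggregate" is just each subscriber's own value.
leaky = aggregate_over(conn, "SELECT subscriber, subscriber AS location FROM visits")

print(honest)  # [('cell_1', 20.0, 2), ('cell_2', 10.0, 2)]
print(leaky)   # [('sub_a', 10.0, 1), ('sub_b', 30.0, 1), ('sub_c', 5.0, 1), ('sub_d', 15.0, 1)]
```

Both location queries return the required `subscriber` and `location` columns, so a check on column names alone would not distinguish them.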

An alternative would be for analysts to propose new query kinds to be added. At the moment, this would require the analyst to talk to somebody who's able to re-deploy an updated FlowKit version; in the future, this could be simplified using a plugin architecture. This would add an extra approval step between the analyst writing a new query and being able to run it, so there's less concern about enabling arbitrary code execution. On the other hand, the overhead of the extra step may prevent the rapid turnaround required for effective prototyping in some situations.

@jc-harrison jc-harrison added enhancement New feature or request FlowMachine Issues related to FlowMachine FlowAPI Issues related to the FlowKit API labels Sep 22, 2020
@greenape
Member

I reckon there's basically no way you can protect the system and data if you allow custom queries. It is SQL injection as a feature, after all.

I do have a prototype of plugins in #2366 but haven't had time to take it further.

@jc-harrison
Member Author

Yes, true. Probably best not to add this feature.
