-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use query parameters wherever possible in Neo4jStore #330
Conversation
This commit changes `_query()` function in Neo4jStore and adds test that shows how the exploit could be abused in previous versions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LilDojd Thanks so much for this PR. Please see my review comments---I have a couple of observations/suggestions.
alchemiscale/storage/statestore.py
Outdated
@@ -2229,7 +2238,9 @@ def set_task_priority( | |||
RETURN scoped_key, t | |||
""" | |||
res = tx.run( | |||
q, scoped_keys=[str(t) for t in tasks], priority=priority | |||
q, | |||
scoped_keys=[str(t) for t in tasks if t is not None], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tasks
will not have (🤞) None
elements based on the type annotation. I'd say it's safe to keep this as is, but I also don't think it will hurt. Maybe @dotsdl has a differing opinion here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LilDojd did you notice a case where inputs to this method may include None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really.
But since you have used cypher_list_from_scoped_keys(tasks)
, which filters out None
s internally, in action_tasks
, and it, too, has List[ScopedKey]
in the signature for tasks, I thought it was better to take a precaution
alchemiscale/alchemiscale/storage/statestore.py
Lines 1393 to 1399 in 4926573
def action_tasks( | |
self, | |
tasks: List[ScopedKey], | |
taskhub: ScopedKey, | |
) -> List[Union[ScopedKey, None]]: | |
"""Add Tasks to the TaskHub for a given AlchemicalNetwork. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, there would be no None
s on the server, since
asc.set_tasks_priority(tasks + [None], 1)
Would just throw
AttributeError: 'NoneType' object has no attribute 'dict'
client-side
alchemiscale/alchemiscale/interface/client.py
Lines 1274 to 1277 in 4926573
async def _set_task_priority( | |
self, tasks: List[ScopedKey], priority: int | |
) -> List[Optional[ScopedKey]]: | |
data = dict(tasks=[t.dict() for t in tasks], priority=priority) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah cool, thank you for giving your rationale @LilDojd! I see no harm in leaving this as you have done it here, as we have in action_tasks
. Up to you if you'd like to leave it in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have decided to remove None
checks for cleanliness. Will save a few clock cycles :)
alchemiscale/storage/statestore.py
Outdated
@@ -2266,7 +2277,7 @@ def get_task_priority(self, tasks: List[ScopedKey]) -> List[Optional[int]]: | |||
WHERE t._scoped_key = scoped_key | |||
RETURN t.priority as priority | |||
""" | |||
res = tx.run(q, scoped_keys=[str(t) for t in tasks]) | |||
res = tx.run(q, scoped_keys=[str(t) for t in tasks if t is not None]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above: annotation + potential @dotsdl opinion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, I don't mind keeping the if t is not None
, but agree it isn't necessary here. Up to @LilDojd what he would like to do.
assert len(n4js.query_transformations(scope=multiple_scopes[0])) == 0 | ||
|
||
mark = n4js._query(qualname="InjectionMark") | ||
assert len(mark) == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to the *query*
methods from here down, we should also check directly with cypher via the n4js.execute_query
method, even though it appears to be duplicated work. I think this issue is important enough to justify explicitly checking that the convenience query methods are consistent.
MATCH (n:InjectionMark)
RETURN n
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #330 +/- ##
==========================================
+ Coverage 80.10% 80.16% +0.06%
==========================================
Files 27 27
Lines 3478 3484 +6
==========================================
+ Hits 2786 2793 +7
+ Misses 692 691 -1 ☔ View full report in Codecov by Sentry. |
Thank you for your comments, @ianmkenney |
Co-authored-by: Ian Kenney <ianmichaelkenney@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic work @LilDojd! Seriously thank you so much for this!
Just a few comments from me, but nothing blocking in my view once @ianmkenney's comments are addressed!
RETURN task, ar | ||
""" | ||
results.append(tx.run(q).to_eager_result()) | ||
tasks_list = [{"task": str(t), "weight": w} for t, w in tasks.items()] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Brilliant!
OPTIONAL MATCH (th:TaskHub {_scoped_key: $taskhub})-[ar:ACTIONS]->(task:Task {_scoped_key: task_scoped_key}) | ||
RETURN task_scoped_key, ar.weight AS weight |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent, this should preserve the behavior that if there is no matching Task
actioned on the TaskHub
then weight
should come back as None
.
alchemiscale/storage/statestore.py
Outdated
@@ -2229,7 +2238,9 @@ def set_task_priority( | |||
RETURN scoped_key, t | |||
""" | |||
res = tx.run( | |||
q, scoped_keys=[str(t) for t in tasks], priority=priority | |||
q, | |||
scoped_keys=[str(t) for t in tasks if t is not None], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah cool, thank you for giving your rationale @LilDojd! I see no harm in leaving this as you have done it here, as we have in action_tasks
. Up to you if you'd like to leave it in.
alchemiscale/storage/statestore.py
Outdated
@@ -2266,7 +2277,7 @@ def get_task_priority(self, tasks: List[ScopedKey]) -> List[Optional[int]]: | |||
WHERE t._scoped_key = scoped_key | |||
RETURN t.priority as priority | |||
""" | |||
res = tx.run(q, scoped_keys=[str(t) for t in tasks]) | |||
res = tx.run(q, scoped_keys=[str(t) for t in tasks if t is not None]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, I don't mind keeping the if t is not None
, but agree it isn't necessary here. Up to @LilDojd what he would like to do.
Should be taken care of now, please do check. Thanks! |
@LilDojd Thanks for bringing this to our attention and providing this great solution to the issue! |
This PR closes #259 by replacing f-string substitutions with parameterized queries in the
Neo4jStore
class. The changes should slightly improve query performance but the main goal of the PR is to make Cypher injections impossibleChanges Made
ScopedKey
,GufeKey
is validated for common exploits, including whitespace checks, unicode encoding stuff and general adherence to GufeKey format 77d3cde dd65007 99034dc_generate_claim_query
is replaced with a const strCLAIM_QUERY
. Parameters are now supplied on-site when performing a transactionTaskStatusEnum
, f-strings are now consistently used for ease of work with LSPPlease let me know if I have missed anything and thanks!