-
Notifications
You must be signed in to change notification settings - Fork 156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
formatter will stack overflow when formatting a long chain of or
#4712
Comments
I just read https://matklad.github.io/2020/04/13/simple-but-powerful-pratt-parsing.html which describes the Pratt parsing algorithm. Pratt combines recursive and looping flows to help avoid some of the weaknesses of recursive descent style parsers. I'm not entirely sure the weaknesses cited are "too much recursion" as is our problem case today, but certainly switching to loops where possible should help. I know the formatter is not actually parsing anything (it only consumes an already parsed AST), but it seems like a similar approach could be applied. I have not yet read https://journal.stuffwithstuff.com/2011/03/19/pratt-parsers-expression-parsing-made-easy/ but it's next on my list. |
Discussed as a quicker-fix in the standup today: we could add a depth counter to the recursion which would allow us to produce a user-facing error that is phrased deliberately for this case. It might be we also want to survey for other recursive paths at the same time. N.b. the stack limits are different (lower) for wasm so we should probably use its limits as our ceiling. |
Looks like #4721 ran with the idea of cutting those operator allocations! |
NOTE: If we add a limit we should also support pushdown of |
TL;DR: long
or
chains can cause the formatter to stack overflow. To mitigate, either we need to reduce allocations or make the flow less recursive.At time of writing, given a flux program like:
will cause
fluxcore::formatter::format()
to stack overflow. By my count, today's threshold is around 193or
being the breaking point.Reducing Allocations
There are a few places where we could reduce allocations, for example in this specific case types like
ast::LogicalOperator
are cast to strings at each level of recursion.We could add an impl for
AsRef<str>
which returns the &'static str instead. Better might be to use only the enum variants themselves rather than dropping down to String/&str, but there might be a polymorphic concern -- are two or more enums we feed into this function?These strings are going to be a small savings, but not really enough to make an impact. A POC to move towards using
&'static str
for this got us a single additional level of recursion only.Refactoring to reduce recursion
This is more likely the way we want to go if we really want to mitigate. Ideally, flux (the formatter, and the parser) should be able to happily consume programs that perhaps are rejected by limits downstream (such as in storage). We would prefer to not be the limiting factor.
In order to avoid the stack overflow in this situation, we'd need to refactor the flow to be more flat with plain iteration rather than recursion.
Probably need to discuss design options for refactors if we want to tackle this. DOD will depend on the discussion/proposal.
The text was updated successfully, but these errors were encountered: