pre-evaluate let bindings and cache evaluations of free vars #1801
Conversation
👍 Looks good to me! Reviewed everything up to ca4cf33 in 1 minute and 30 seconds
More details

- Looked at 94 lines of code in 2 files
- Skipped 0 files when reviewing
- Skipped posting 6 drafted comments based on config settings
1. engine/baml-runtime/src/eval_expr.rs:258
- Draft comment: Caching free var evaluations using RefCell is not thread-safe and may cause issues if beta_reduce is awaited concurrently. Consider using a Mutex or other async-safe cell.
- Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
This is a tricky case. The code is async but that doesn't necessarily mean it runs in parallel threads. We need to consider:
- Is beta_reduce actually called concurrently?
- Is RefCell actually unsafe here?
- Do we have enough context to be sure either way?
Looking at the full file, I don't see clear evidence that beta_reduce is called concurrently from multiple threads. The async seems to be for handling LLM API calls sequentially.
I may be missing evidence of concurrent usage elsewhere in the codebase. The async nature means it's possible for concurrent calls even if not immediately visible here.
While concurrent usage is theoretically possible, without clear evidence that this code needs to be thread-safe, suggesting a thread-safety fix could be premature optimization. RefCell is a perfectly valid choice for single-threaded async code.
The comment raises a theoretical concern but lacks strong evidence that thread-safety is actually needed here. We should not keep speculative comments without clear evidence of the issue.
2. engine/baml-runtime/src/eval_expr.rs:136
- Draft comment: Let binding handling now evaluates and substitutes the bound expression. Ensure that the use of `body.close` and `subst2` correctly handles variable capture and edge cases.
- Reason this comment was not posted: Confidence changes required: 50% <= threshold 50%
None
3. engine/baml-runtime/src/eval_expr.rs:136
- Draft comment: Pre-evaluation of let bindings: you now evaluate the bound expression and substitute it into the closed body. Verify that using a fixed target (VarIndex { de_bruijn: 0, tuple: 0 }) works correctly in all nested scopes and preserves semantics (especially if side-effects or lazy behavior were expected).
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%
The comment is asking the PR author to verify that a specific change works correctly in all nested scopes and preserves semantics. This falls under the rule of not asking the PR author to ensure behavior is intended or to double-check things. Therefore, this comment should be removed.
4. engine/baml-runtime/src/eval_expr.rs:258
- Draft comment: Caching free variable evaluations: the free var branch now checks and stores results in evaluated_cache. Ensure that potential cycles or recursive dependencies are handled and that using RefCell for caching is safe in your intended async/threading context.
- Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
The comment raises valid concerns about thread safety and recursive dependencies, but it's more of a speculative "what if" comment asking for verification rather than pointing out a clear issue. The comment starts with "Ensure that..." which is a red flag. The code appears to be single-threaded async code where RefCell is commonly used, and the caching implementation looks standard.
The comment identifies real potential issues that could cause problems if not handled correctly. Cycles in expression evaluation could cause infinite loops.
While the concerns are valid, the comment is asking for verification rather than pointing out a clear issue. It's speculative and not actionable without specific evidence of problems.
Delete this comment as it violates the rule against asking authors to verify or ensure things. If there are specific issues with RefCell or cycles, they should be pointed out directly with evidence.
5. engine/baml-runtime/src/lib.rs:560
- Draft comment: Initialization of EvalEnv: the evaluated_cache field is initialized with a new HashMap. Confirm that each EvalEnv instance gets its own cache to prevent leakage across evaluations.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. engine/baml-runtime/src/lib.rs:711
- Draft comment: Typographical error: In the comment discussing the request body, 'requeset' is misspelled. It should be 'request'.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_Zcie2kXQf9Hif94K
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
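The first review round debates whether `RefCell` is acceptable for `evaluated_cache` in single-threaded async code. As a minimal sketch of the pattern under discussion (the type names `Name`, `Expr`, `EvalEnv`, and the method `eval_free_var` are simplified stand-ins, not the runtime's real API), a `RefCell<HashMap>` cache is sound as long as the environment never crosses threads and no borrow is held across an `.await`:

```rust
use std::cell::RefCell;
use std::collections::HashMap;

// Hypothetical stand-ins for the runtime's Name and Expr types.
type Name = String;
type Expr = String;

// Single-threaded cache variant: RefCell gives interior mutability,
// with borrow rules checked at runtime instead of compile time.
struct EvalEnv {
    evaluated_cache: RefCell<HashMap<Name, Expr>>,
}

impl EvalEnv {
    fn new() -> Self {
        Self { evaluated_cache: RefCell::new(HashMap::new()) }
    }

    // Return the cached evaluation of a free variable, computing and
    // storing it on a miss. Each borrow ends before the next begins,
    // so the RefCell never panics on a double borrow.
    fn eval_free_var(&self, name: &str, eval: impl FnOnce() -> Expr) -> Expr {
        if let Some(hit) = self.evaluated_cache.borrow().get(name) {
            return hit.clone();
        }
        let value = eval();
        self.evaluated_cache
            .borrow_mut()
            .insert(name.to_string(), value.clone());
        value
    }
}

fn main() {
    let env = EvalEnv::new();
    let v1 = env.eval_free_var("x", || "42".to_string());
    // Second lookup hits the cache; the closure is never called.
    let v2 = env.eval_free_var("x", || unreachable!("cache hit expected"));
    println!("{v1} {v2}");
}
```

This matches the reviewer's conclusion: `RefCell` is a valid choice for single-threaded async code, and becomes a problem only if `EvalEnv` is shared across threads.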
Force-push: ca4cf33 → 834561f
❌ Changes requested. Incremental review on 834561f in 1 minute and 50 seconds
More details

- Looked at 95 lines of code in 2 files
- Skipped 0 files when reviewing
- Skipped posting 6 drafted comments based on config settings
1. engine/baml-runtime/src/eval_expr.rs:259
- Draft comment: The caching logic for free variables looks correct, but be cautious about potential recursive bindings. Ensure that cyclic dependencies are handled appropriately so that caching doesn't freeze evaluation.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 30% <= threshold 50%
The comment is suggesting caution about potential recursive bindings in caching logic, which could lead to issues if not handled properly. However, it is phrased in a way that asks the author to ensure something, which violates the rule against asking the author to ensure behavior is intended or tested. The comment could be rephrased to suggest a specific action or ask a specific question about the implementation.
2. engine/baml-runtime/src/eval_expr.rs:137
- Draft comment: The new let-binding handling via direct evaluation and substitution is clearer. Confirm that the `close` and `subst2` steps correctly handle variable capture and scoping.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%
This comment is asking the PR author to confirm their intention and ensure the behavior is intended, which violates the rules. It does not provide a specific code suggestion or ask for a specific test to be written.
3. engine/baml-runtime/src/eval_expr.rs:137
- Draft comment: Let binding now pre-evaluates the bound expression and substitutes it in the body. Confirm that this evaluation order and substitution behavior match the intended semantics.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%
The comment is asking the PR author to confirm the evaluation order and substitution behavior, which violates the rule against asking for confirmation of intention or behavior. It does not provide a specific code suggestion or ask for a specific test to be written.
4. engine/baml-runtime/src/eval_expr.rs:259
- Draft comment: Caching free variables using a RefCell works in a single-threaded context. If EvalEnv may be used concurrently, consider switching to a Mutex for thread safety.
- Reason this comment was not posted: Marked as duplicate.
5. engine/baml-runtime/src/lib.rs:564
- Draft comment: The evaluated_cache is instantiated with RefCell here. Ensure that if multiple expressions run in parallel, you replace RefCell with a thread-safe primitive such as Mutex.
- Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
The comment makes a speculative suggestion about potential thread safety issues. However, there's no clear evidence that this code runs in parallel - it appears to be part of a single expression evaluation context. RefCell is appropriate for single-threaded runtime borrowing. Without seeing how EvalEnv is actually used across threads (if at all), we can't definitively say Mutex is needed.
I could be wrong about the threading model - there could be parallel expression evaluation happening that I'm not seeing in this code. The comment author may have more context about the runtime architecture.
Even with that possibility, the comment is speculative ("if multiple expressions run in parallel") rather than pointing to a concrete issue. We should only suggest thread-safety changes when we have clear evidence of parallel execution.
The comment should be deleted as it makes a speculative suggestion about thread safety without concrete evidence that parallel execution is actually happening or planned.
6. engine/baml-runtime/src/lib.rs:711
- Draft comment: Typographical error: "requeset" should be corrected to "request".
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_t6pCOE00MAUoaol5
Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
```rust
// e.g. (let x = y in f) => (\x y => f)
// TODO: Should rewriting be done here in beta_reduce? Or elsewhere?
// First evaluate the bound expression
let evaluated_value = Box::pin(beta_reduce(env, value)).await?;
```
Review the use of `Box::pin` on `beta_reduce` calls. If `beta_reduce` already returns an async future, you might simplify by awaiting it directly.
```diff
-let evaluated_value = Box::pin(beta_reduce(env, value)).await?;
+let evaluated_value = beta_reduce(env, value).await?;
```
Force-push: 834561f → 113bb91
❌ Changes requested. Incremental review on 113bb91 in 1 minute and 22 seconds
More details

- Looked at 88 lines of code in 2 files
- Skipped 0 files when reviewing
- Skipped posting 7 drafted comments based on config settings
1. engine/baml-runtime/src/eval_expr.rs:258
- Draft comment: Using std::sync::Mutex in async code may block executor threads. Consider using an async-aware mutex (e.g. tokio::sync::Mutex) for evaluated_cache.
- Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50%
This is a legitimate concern - using std::sync::Mutex in async code can block threads if the lock is held during an await. However, in this case the lock is only held very briefly to access a cache, with no awaits while holding the lock. The performance impact would be minimal. The suggestion would make the code more idiomatically correct but isn't fixing a serious issue.
I could be underestimating the performance impact. Even brief blocking could add up with many concurrent requests. The code could be modified in the future to hold the lock longer.
The current usage is safe and performant enough for the simple caching use case. The suggestion is more about code style than fixing a real problem.
While technically correct, this comment suggests a change that would have minimal real-world impact given the current usage pattern. The comment should be removed as it doesn't highlight a significant issue requiring action.
2. engine/baml-runtime/src/lib.rs:560
- Draft comment: The evaluated_cache is initialized as a new Mutex in each EvalEnv instance. Verify that this cache scope is intended.
- Reason this comment was not posted: Confidence changes required: 50% <= threshold 50%
None
3. engine/baml-runtime/src/eval_expr.rs:21
- Draft comment: New evaluated_cache field added. Consider mentioning concurrency trade-offs (using Mutex) in the docs if heavy concurrent evaluations are expected.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
None
4. engine/baml-runtime/src/eval_expr.rs:135
- Draft comment: Let binding evaluation now pre-evaluates the bound expression and substitutes it. Verify that the closing and substitution (using `body.close` and `subst2`) correctly preserve variable scoping and avoid capture issues.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%
The comment is asking the PR author to verify that the changes correctly preserve variable scoping and avoid capture issues. This falls under the category of asking the author to double-check things, which is against the rules.
5. engine/baml-runtime/src/eval_expr.rs:257
- Draft comment: Caching free variable evaluations by using a Mutex lock with unwrap(). Consider handling lock poisoning gracefully or documenting that unwrap() is acceptable in this context.
- Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
This is a code quality comment about error handling. While unwrap() usage can be dangerous, in this context:
- Lock poisoning is extremely unlikely since the Mutex only guards a simple cache
- If lock poisoning did occur, it would just mean cache misses, not critical failures
- The code is already handling many other error cases properly with ? operator
- Adding lock poison handling would add complexity for little benefit
I could be underestimating the importance of proper Mutex error handling. Lock poisoning, while rare, could indicate other serious issues in the program.
While lock poisoning can indicate issues, in this case it's just a cache - the worst case is slightly degraded performance. The code can continue functioning without the cache.
The comment should be deleted. The unwrap() usage is acceptable for a cache implementation where lock poisoning would only impact performance, not correctness.
6. engine/baml-runtime/src/lib.rs:559
- Draft comment: Initializes evaluated_cache per call in EvalEnv. Confirm if caching should be instance-scoped per evaluation or shared across calls.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
None
None
7. engine/baml-runtime/src/lib.rs:710
- Draft comment: Typo detected: In the comment at line 710, 'requeset' should be corrected to 'request' for clarity.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_KJwKoFt9jeYtZpru
Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
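Draft comments 1 and 5 in this round both turn on the same point: a `std::sync::Mutex` around the cache is fine as long as the guard is dropped before any `.await`, and `unwrap()` on the lock is acceptable because poisoning would at worst degrade the cache. A minimal sketch of that pattern (simplified `Name`/`Expr` stand-ins and hypothetical method names, mirroring the `Arc<Mutex<HashMap<...>>>` shape described in the PR summary):

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical stand-ins for the runtime's Name and Expr types.
type Name = String;
type Expr = String;

// Thread-safe cache variant: cloning the env clones the Arc, so all
// clones share one underlying cache.
#[derive(Clone, Default)]
struct EvalEnv {
    evaluated_cache: Arc<Mutex<HashMap<Name, Expr>>>,
}

impl EvalEnv {
    // The MutexGuard is a temporary dropped at the end of each
    // statement, so the lock is never held across an .await point
    // and cannot block executor threads for long.
    fn lookup(&self, name: &str) -> Option<Expr> {
        self.evaluated_cache.lock().unwrap().get(name).cloned()
    }

    fn store(&self, name: Name, value: Expr) {
        self.evaluated_cache.lock().unwrap().insert(name, value);
    }
}

fn main() {
    let env = EvalEnv::default();
    assert_eq!(env.lookup("x"), None);
    env.store("x".to_string(), "42".to_string());
    assert_eq!(env.lookup("x"), Some("42".to_string()));
    println!("cache ok");
}
```

If a guard ever did need to live across an `.await`, an async-aware mutex such as `tokio::sync::Mutex` would be the right tool, which is the trade-off the first draft comment raises.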
```rust
// e.g. (let x = y in f) => (\x y => f)
// TODO: Should rewriting be done here in beta_reduce? Or elsewhere?
// First evaluate the bound expression
let evaluated_value = Box::pin(beta_reduce(env, value)).await?;
```
`Box::pin(...).await` is unnecessary here; `beta_reduce` is already async, so you can simply await its result.
```diff
-let evaluated_value = Box::pin(beta_reduce(env, value)).await?;
+let evaluated_value = beta_reduce(env, value).await?;
```
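One caveat worth noting on this suggestion: if `beta_reduce` awaits itself recursively (which the `Box::pin` wrapping suggests), the boxing cannot simply be dropped. Without it, the compiler rejects the code with E0733 because the async fn's future type would contain itself and have infinite size. A minimal self-contained sketch (the `sum_to`, `block_on`, and `NoopWaker` names are illustrative; a real runtime would drive the future instead of this hand-rolled executor; requires Rust ≥ 1.77 for recursion through `Box::pin`):

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Trivial executor so the example runs with std alone.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker: Waker = Arc::new(NoopWaker).into();
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

// A recursive async fn: boxing the recursive call gives it a fixed,
// pointer-sized slot in the enclosing future. Removing Box::pin here
// fails to compile with E0733.
async fn sum_to(n: u64) -> u64 {
    if n == 0 {
        0
    } else {
        n + Box::pin(sum_to(n - 1)).await
    }
}

fn main() {
    assert_eq!(block_on(sum_to(10)), 55);
    println!("55");
}
```

So the proposed simplification only applies to non-recursive call sites; inside `beta_reduce` itself, the `Box::pin` is likely load-bearing.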
Important

Pre-evaluate `let` bindings and cache evaluations of free variables in `eval_expr.rs` to improve performance.

- Cache evaluations of free variables in `EvalEnv` using `evaluated_cache`.
- Update `beta_reduce()` in `eval_expr.rs` to pre-evaluate `let` bindings and substitute evaluated values into the body before further evaluation.
- Use `Arc<Mutex<HashMap<Name, Expr<ExprMetadata>>>>` for `evaluated_cache` in `EvalEnv` to handle concurrent access.
- Initialize `evaluated_cache` in `BamlRuntime` in `lib.rs`.

This description was created by Ellipsis for 113bb91. It will automatically update as commits are pushed.
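The "pre-evaluate `let` bindings" step described above can be sketched on a toy expression language. This is an illustrative model only: it uses string names and a hypothetical `subst` helper, whereas the real `eval_expr.rs` works on de Bruijn-indexed terms via `body.close` and `subst2`.

```rust
// Toy expression language: evaluate the bound expression first,
// substitute the result into the body, then evaluate the body.
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Num(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Let(String, Box<Expr>, Box<Expr>),
}

// Substitute `value` for free occurrences of `name` in `expr`.
fn subst(expr: &Expr, name: &str, value: &Expr) -> Expr {
    match expr {
        Expr::Num(n) => Expr::Num(*n),
        Expr::Var(v) if v == name => value.clone(),
        Expr::Var(v) => Expr::Var(v.clone()),
        Expr::Add(a, b) => Expr::Add(
            Box::new(subst(a, name, value)),
            Box::new(subst(b, name, value)),
        ),
        // An inner binding of the same name shadows the outer one,
        // so we substitute into its body only when names differ.
        Expr::Let(v, bound, body) => {
            let bound = Box::new(subst(bound, name, value));
            if v == name {
                Expr::Let(v.clone(), bound, body.clone())
            } else {
                Expr::Let(v.clone(), bound, Box::new(subst(body, name, value)))
            }
        }
    }
}

fn eval(expr: &Expr) -> i64 {
    match expr {
        Expr::Num(n) => *n,
        Expr::Var(v) => panic!("unbound variable {v}"),
        Expr::Add(a, b) => eval(a) + eval(b),
        // Pre-evaluate the bound expression once, then substitute the
        // evaluated value, so the body never re-evaluates the binding.
        Expr::Let(name, bound, body) => {
            let value = Expr::Num(eval(bound));
            eval(&subst(body, name, &value))
        }
    }
}

fn main() {
    // let x = 2 + 3 in x + x
    let e = Expr::Let(
        "x".into(),
        Box::new(Expr::Add(Box::new(Expr::Num(2)), Box::new(Expr::Num(3)))),
        Box::new(Expr::Add(
            Box::new(Expr::Var("x".into())),
            Box::new(Expr::Var("x".into())),
        )),
    );
    println!("{}", eval(&e));
}
```

The payoff mirrors the PR's motivation: because `x` is bound to the already-evaluated `5`, the body's two uses of `x` cost nothing extra, and the same idea extends to caching free-variable evaluations in `evaluated_cache`.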