Skip to content

Superficial streaming for workflows #1832

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 22, 2025
Merged

Superficial streaming for workflows #1832

merged 1 commit into from
Apr 22, 2025

Conversation

imalsogreg
Copy link
Contributor

@imalsogreg imalsogreg commented Apr 22, 2025

When a workflow ends with an LLM call, the workflow will stream that LLM call.

In other words, when you call a function whose final expression is a single LLM call, the result can be streamed. But if the final expression is something other than an LLM call, like a list or a class constructor, the result will be delivered in one shot.


Important

This PR enables streaming for workflows ending with an LLM call by evaluating if the final expression is a single LLM call, with updates to type inference, expression evaluation, and runtime execution.

  • Behavior:
    • Workflows ending with an LLM call now support streaming if the final expression is a single LLM call.
    • Non-LLM final expressions (e.g., lists, class constructors) are delivered in one shot.
  • Type Inference:
    • Updates to infer_types_in_context() in expr_typecheck.rs to handle new expression types.
    • Introduces initial_typing_context() in repr.rs for better type inference.
  • Expression Evaluation:
    • Adds eval_to_value_or_llm_call() in eval_expr.rs to evaluate expressions to either a value or an LLM call.
    • Modifies beta_reduce() in eval_expr.rs to support LLM function evaluation.
  • Runtime Execution:
    • Updates BamlRuntime in lib.rs to handle expression functions and determine if they result in an LLM call.
    • Introduces expr_eval_result() in lib.rs to evaluate expression functions and determine streaming capability.

This description was created by Ellipsis for 002d33f. You can customize this summary. It will automatically update as commits are pushed.

Copy link

vercel bot commented Apr 22, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
baml ✅ Ready (Inspect) Visit Preview 💬 Add feedback Apr 22, 2025 4:31am

@imalsogreg imalsogreg force-pushed the greg/expr-streaming branch from 5fc064c to 002d33f Compare April 22, 2025 04:24
@imalsogreg imalsogreg marked this pull request as ready for review April 22, 2025 04:25
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Changes requested ❌

Reviewed everything up to 002d33f in 4 minutes and 23 seconds. Click for details.
  • Reviewed 1195 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 12 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. engine/baml-runtime/src/lib.rs:618
  • Draft comment:
    Avoid swallowing errors in mapping parameters to expressions (see use of .unwrap_or(vec![]) on line 1188). Propagate or log errors instead so they don’t silently hide bugs.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. engine/baml-runtime/src/lib.rs:1167
  • Draft comment:
    Repeated use of Span::fake() is concerning – consider introducing a helper function or a proper span mechanism for generated expressions. This appears to be a temporary hack that might affect debugging.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50% The comment raises a valid code quality concern about repeated use of a potentially temporary solution. However, this appears to be an intentional pattern for handling synthetic/generated expressions that don't have real source locations. The comment suggests a refactor but doesn't identify a concrete issue or bug. The current approach, while not ideal, seems functional and contained within expression evaluation logic. I may be underestimating the debugging impact of using fake spans. Perhaps there are real maintenance or debugging challenges this creates that I'm not seeing. While better span handling could be nice, the current usage appears intentional for generated code. Without evidence of actual problems caused by the fake spans, this feels more like a "nice to have" refactor than a critical issue. Delete the comment. While it raises a valid point about code quality, it's more of a "nice to have" suggestion without clear evidence of problems. The current approach seems intentional for handling generated expressions.
3. engine/baml-runtime/src/lib.rs:216
  • Draft comment:
    Consider handling errors in unbounded_send calls (e.g. on expr_tx) instead of calling unwrap(), to avoid possible panics in production.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. engine/baml-runtime/src/lib.rs:1187
  • Draft comment:
    The code mixing expression evaluation and streaming logic (in expr_eval_result) is quite complex. Please add more inline documentation explaining the flow—especially the discussion on when an expression function is expected to yield an LLM call versus a final value.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
5. engine/baml-runtime/src/lib.rs:1450
  • Draft comment:
    Consider catching and logging errors from the tracer during span finish calls rather than simply logging a debug message. This might help later when troubleshooting issues related to span closure consistency.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. engine/baml-runtime/src/lib.rs:1520
  • Draft comment:
    The beta reduction in eval_expr uses deep recursion and many clone operations. Consider checking that this approach scales well with larger expressions, or document any design decisions related to performance and memory usage.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. engine/baml-runtime/src/eval_expr.rs:168
  • Draft comment:
    Avoid using unwrap() on the result of substitution. Instead of using .as_ref().unwrap().clone(), consider propagating errors with ? to handle failures gracefully.
  • Reason this comment was not posted:
    Comment was on unchanged code.
8. engine/baml-runtime/src/lib.rs:595
  • Draft comment:
    The branch handling expression functions in call_function_with_expr_events is marked as 'ugly'. Consider refactoring the API so that expression functions and LLM functions have distinct, clearer handling paths.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
9. engine/baml-runtime/src/lib.rs:320
  • Draft comment:
    Repetitive code in handling different expression types (List, Map, ClassConstructor) in eval_to_value_or_llm_call could be abstracted into helper functions to improve maintainability.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
10. engine/baml-runtime/src/lib.rs:1141
  • Draft comment:
    Frequent use of Span::fake() for creating synthetic spans may reduce diagnostic quality. Consider propagating real span information where possible.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% While the comment identifies a real pattern of using Span::fake(), this appears to be intentional in this context. The spans are being used for synthetic/generated expressions and metadata where real source spans don't exist or aren't relevant. The code is handling runtime evaluation of expressions, not parsing source code where source locations would be meaningful. The comment suggests an improvement that may not be feasible or necessary in this case. I could be wrong about whether real spans could be propagated here - perhaps there is source location information available that could be preserved through the expression evaluation. However, given this is runtime expression evaluation code that's being added, and these spans are for synthetic/generated expressions, fake spans seem appropriate here. Real spans would only be useful if we needed to map errors back to source locations. The comment should be deleted. While it identifies a pattern, the use of fake spans appears intentional and appropriate for this runtime evaluation context. The suggested improvement may not be feasible or beneficial.
11. engine/baml-lib/baml-core/src/ir/repr.rs:1998
  • Draft comment:
    Typographical error: In the comment above the Prompt enum, 'stirng' should be spelled 'string'.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
12. engine/baml-runtime/src/eval_expr.rs:295
  • Draft comment:
    In the log message printed in the loop inside eval_to_value_or_llm_call (line ~295-296), "loop eval_to_value_or_lm_call:" is used, which appears to be a typo. It should consistently use "llm" (i.e., "loop eval_to_value_or_llm_call:"). Please fix for consistency.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50% While the comment is technically correct about the inconsistency, it's pointing out a very minor issue in a debug log message. Debug messages are not part of the public API or functionality, and this typo doesn't affect the code's behavior. The rules state not to make comments that are obvious or unimportant. The typo could theoretically make logs harder to grep/search for if someone is looking for all llm-related log messages. Also, consistency in naming is generally good practice. However, this is just a debug print statement that only appears during development/debugging. The benefit of fixing this typo is extremely minimal compared to the overhead of making a code change. This comment should be deleted as it points out an unimportant issue in debug logging that doesn't affect functionality.

Workflow ID: wflow_EBMHdadBFPzTDV4V

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Ok(Expr::Atom(baml_value_with_meta))
})
.collect::<Result<_>>()
.unwrap_or(vec![]); //TODO: Is it acceptable to swallow errors here?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Swallowing errors in parameter conversion can hide issues. Instead of using .collect::<Result<_>>().unwrap_or(vec![]), consider propagating the error or logging it for debugging.

/// If it's an LLM function, just return the function name and params.
///
/// If it's an expression function, determine whether it evaluates to an LLM function,
/// and if so, return that function and its params. Not all expr functios evaluate to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: In the comment for expr_eval_result, 'functios' should be corrected to 'functions'.

Suggested change
/// and if so, return that function and its params. Not all expr functios evaluate to
/// and if so, return that function and its params. Not all expr functions evaluate to

@imalsogreg imalsogreg added this pull request to the merge queue Apr 22, 2025
Merged via the queue into canary with commit ff2e76e Apr 22, 2025
12 checks passed
@imalsogreg imalsogreg deleted the greg/expr-streaming branch April 22, 2025 04:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant