feat: Add system tool calls to the chat endpoint #1179
base: dev
Conversation
Walkthrough: This PR introduces a new utility function, eval_tool_calls, to evaluate tool calls within chat sessions.
CI Feedback 🧐 (feedback updated until commit 90a3995): A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
459aa61 → ea7b342 (Compare)
PR Reviewer Guide 🔍: Here are some key observations to aid the review process:
PR Code Suggestions ✨: Explore these optional code suggestions:
❌ Changes requested. Reviewed everything up to 8e299d8 in 2 minutes and 51 seconds.
More details:
- Looked at 830 lines of code in 3 files
- Skipped 0 files when reviewing
- Skipped posting 6 drafted comments based on config settings
1. agents-api/agents_api/routers/sessions/chat.py:206
Draft comment: Using tool_calls_evaluator with tool_types={"system"} here seems inconsistent with the tests, which use 'function'. Please confirm whether 'system' is the intended type and update the tests or documentation if necessary. (A sketch of this kind of tool-type filter appears after this list.)
Reason this comment was not posted: Comment did not seem useful. Confidence it is useful = 40% <= threshold 50%.
The comment asks the PR author to confirm their intention and update tests or documentation if necessary, which violates the rule against asking for confirmation or ensuring behavior is intended. However, it does point out an inconsistency that could be useful for the author to address.
2. agents-api/tests/test_tool_calls_evaluator.py:305
Draft comment: When setting function_mock.name and function_mock.arguments, ensure these properties are plain strings to avoid nested MagicMock issues. This is correctly handled in the tests, but consider adding a comment in the production or test code for clarity.
Reason this comment was not posted: Confidence changes required = 33% <= threshold 50%.
3. agents-api/agents_api/routers/sessions/chat.py:206
Draft comment: Production code sets tool_types to {'system'} while tests expect {'function'}. Ensure consistent tool type usage.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
While there does appear to be a mismatch between production and test code, I don't have enough context to know which is correct. The test could be wrong, or the production code could be wrong. Without understanding the tool_calls_evaluator function and its requirements, I can't be confident that changing to "function" is the right fix. The comment is making assumptions without strong evidence.
I might be missing important context about the tool_calls_evaluator function's requirements. There could be a good reason for the difference between test and production code.
Even with more context, suggesting a change without understanding why the current value is "system" is risky. We should err on the side of assuming the production code is correct unless we have strong evidence otherwise.
Delete the comment. While there is a discrepancy, we don't have strong evidence that the production code is wrong and that "function" is the correct value.
4. agents-api/agents_api/routers/utils/tools.py:244
Draft comment: The while loop in tool_calls_evaluator uses a 'done' flag that may prematurely skip matching tool calls if a non-matching call is encountered. Consider refactoring to process all matching calls.
Reason this comment was not posted: Comment looked like it was already resolved.
5. agents-api/agents_api/routers/utils/tools.py:172
Draft comment: Awaiting a BackgroundTasks instance is unusual. Verify that 'await bg_runner()' (in chat operations) behaves as intended.
Reason this comment was not posted: Comment did not seem useful. Confidence it is useful = 0% <= threshold 50%.
The comment asks the author to verify the behavior of 'await bg_runner()'. This falls under asking the author to ensure the behavior is intended, which is against the rules. The comment does not provide a specific suggestion or ask for a specific test to be written.
6. agents-api/tests/test_tool_calls_evaluator.py:300
Draft comment: Ensure that test cases reflect the intended production tool type conventions to avoid mismatches.
Reason this comment was not posted: Comment did not seem useful. Confidence it is useful = 0% <= threshold 50%.
This comment asks the PR author to ensure that test cases reflect certain conventions, which is similar to asking them to double-check or verify something. This violates the rule against asking the author to ensure or verify things.
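Drafts 1 and 3 both turn on what the tool_types filter admits. As a rough, hypothetical illustration (not the PR's actual implementation), filtering tool calls by type before evaluation might look like this:

```python
# Hedged sketch: keep only tool calls whose type is allowed for server-side evaluation.
# The set below mirrors the values discussed above; all names are illustrative.
ALLOWED_TOOL_TYPES = {"system"}  # production reportedly uses {"system"}; tests exercise "function"


def select_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Return the subset of tool calls whose type is in ALLOWED_TOOL_TYPES."""
    return [tc for tc in tool_calls if tc.get("type") in ALLOWED_TOOL_TYPES]


print(select_tool_calls([
    {"type": "system", "function": {"name": "user.docs.list", "arguments": "{}"}},
    {"type": "function", "function": {"name": "get_weather", "arguments": "{}"}},
]))  # only the "system" call survives
```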
Workflow ID: wflow_QgQOiJHOXjoVF10q
Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
❌ Changes requested. Incremental review on ebd3769 in 2 minutes and 37 seconds.
More details:
- Looked at 2409 lines of code in 4 files
- Skipped 0 files when reviewing
- Skipped posting 7 drafted comments based on config settings
1. agents-api/agents_api/routers/utils/tools.py:275
Draft comment: TODO already noted; consider implementing streaming response handling for tool call results. This would improve responsiveness for long-running tool calls.
Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. agents-api/agents_api/routers/utils/tools.py:277
Draft comment: Revisit how non-matching tool type calls are handled. Setting 'done = True' might not proceed with additional valid tool calls in later choices; consider clarifying the intended behavior.
Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. agents-api/tests/fixtures.py:91
Draft comment: The helper 'make_acompletion_multiple_outputs' is useful, but consider adding inline comments to document its intended use for tool call test scenarios.
Reason this comment was not posted: Confidence changes required = 50% <= threshold 50%.
4. agents-api/tests/utils.py:169
Draft comment: The usage of side_effect for acompletion in patch_embed_acompletion_multiple_outputs is clever. Ensure that this side_effect produces async-compatible responses if used in an async context; document any assumptions here.
Reason this comment was not posted: Comment did not seem useful. Confidence it is useful = 0% <= threshold 50%.
The comment asks the author to ensure that a function produces async-compatible responses and to document assumptions. This is similar to asking the author to ensure behavior is intended or tested, which violates the rules.
5. agents-api/agents_api/routers/utils/tools.py:283
Draft comment: Make sure tool.function.arguments is always a JSON string. If it might already be a dict, consider adding a type check before calling json.loads to avoid potential TypeErrors. (A small guard of this kind is sketched after this list.)
Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. agents-api/tests/test_tool_calls_evaluator.py:778
Draft comment: Test 'chat: evaluate agent.doc.search tool call' reuses the 'agent_doc_list' fixture. This may be a mistake; consider using a dedicated 'agent_doc_search' fixture to accurately test the search tool call.
Reason this comment was not posted: Marked as duplicate.
7. agents-api/tests/utils.py:191
Draft comment: The construction of the PG DSN using substring [22:] from the test_psql_url seems brittle. Consider using a more robust URL parsing method rather than hard-coding the offset. (A urllib.parse-based alternative is sketched after this list.)
Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_ZeyI4JUQu3R7gpFG
Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
fdde051 → b0244fd (Compare)
208caa2 → 20b65d9 (Compare)
```python
    choices=[choice.model_dump() for choice in model_response.choices],
)

total_tokens_per_user.labels(str(developer.id)).inc(
    amount=chat_response.usage.total_tokens if chat_response.usage is not None else 0,
)
# For non-streaming responses, update metrics and return the response
if not chat_input.stream:
    total_tokens_per_user.labels(str(developer.id)).inc(
        amount=chat_response.usage.total_tokens if chat_response.usage is not None else 0,
    )
    return chat_response

return chat_response
# Note: For streaming responses, we've already returned the StreamingResponse above
# This code is unreachable for streaming responses
return None
```
The function returns None at line 321, which contradicts the return type annotation ChatResponse | StreamingResponse and could cause runtime errors.
📝 Committable Code Suggestion
‼️ Ensure you review the code suggestion before committing it to the branch. Make sure it replaces the highlighted code, contains no missing lines, and has no issues with indentation.

```python
) -> ChatResponse | StreamingResponse:  # FIXME: Update type to include StreamingResponse
    """
    Initiates a chat session.
    ...
    # This code is unreachable for streaming responses
    raise RuntimeError("Unreachable code - streaming response should have been returned earlier")
```
a9674a2 → bb14350 (Compare)
```python
                and len(chunk.choices) > 0
            )

            # Update metrics when we detect the final chunk
            if final_usage and has_choices and chunk.choices[0].finish_reason:
                # This is the last chunk with the finish reason
                total_tokens = final_usage.get("total_tokens", 0)
                total_tokens_per_user.labels(str(developer.id)).inc(
                    amount=total_tokens
                )

            # Collect content for the full response
            if has_choices and hasattr(chunk.choices[0], "delta"):
                delta = chunk.choices[0].delta
                if hasattr(delta, "content") and delta.content:
                    content_so_far += delta.content
                    has_content = True

            # Prepare the response chunk
            choices_to_send = []
            if has_choices:
                chunk_data = chunk.choices[0].model_dump()

                # Ensure delta always contains a role field
                if "delta" in chunk_data and "role" not in chunk_data["delta"]:
                    chunk_data["delta"]["role"] = "assistant"

                choices_to_send = [chunk_data]

            # Create and send the chunk response
            chunk_response = ChunkChatResponse(
                id=response_id,
                created_at=created_at,
                jobs=jobs,
                docs=doc_references,
                usage=final_usage,
                choices=choices_to_send,
            )
            yield chunk_response.model_dump_json() + "\n"

        except Exception as e:
            # Log error details for debugging but send a generic message to client
            import logging

            logging.error(f"Error processing chunk: {e!s}")

            error_response = {
                "id": str(response_id),
                "created_at": created_at.isoformat(),
                "error": "An error occurred while processing the response chunk.",
            }
            yield json.dumps(error_response) + "\n"
            # Continue processing remaining chunks
            continue

    # Save complete response to history if needed
    if chat_input.save and has_content:
        try:
            # Create entry for the complete response
            complete_entry = CreateEntryRequest.from_model_input(
                model=settings["model"],
                role="assistant",
                content=content_so_far,
                source="api_response",
            )
            # Create a task to save the entry without blocking the stream
            ref = asyncio.create_task(
                create_entries(
                    developer_id=developer.id,
                    session_id=session_id,
                    data=[complete_entry],
                )
            )
            stream_tasks.append(ref)
        except Exception as e:
            # Log the full error for debugging purposes
            import logging

            logging.error(f"Failed to save streamed response: {e!s}")

            # Send a minimal error message to the client
            error_response = {
                "id": str(response_id),
                "created_at": created_at.isoformat(),
                "error": "Failed to save response history.",
            }
            yield json.dumps(error_response) + "\n"
except Exception as e:
    # Log the detailed error for system debugging
    import logging

    logging.error(f"Streaming error: {e!s}")

    # Send a user-friendly error message to the client
    error_response = {
        "id": str(response_id),
        "created_at": created_at.isoformat(),
        "error": "An error occurred during the streaming response.",
    }
    yield json.dumps(error_response) + "\n"
```
Error handling in streaming mode doesn't properly clean up resources: a finally block is missing to ensure model resources are released even if streaming fails.
📝 Committable Code Suggestion
‼️ Ensure you review the code suggestion before committing it to the branch. Make sure it replaces the highlighted code, contains no missing lines, and has no issues with indentation.
```python
try:
    # Stream chunks from the model_response (CustomStreamWrapper from litellm)
    async for chunk in model_response:
        # Process a single chunk of the streaming response
        try:
            # Extract usage metrics if available
            if hasattr(chunk, "usage") and chunk.usage:
                final_usage = chunk.usage.model_dump()
            # Check if chunk has valid choices
            has_choices = (
                hasattr(chunk, "choices")
                and chunk.choices
                and len(chunk.choices) > 0
            )
            # Update metrics when we detect the final chunk
            if final_usage and has_choices and chunk.choices[0].finish_reason:
                # This is the last chunk with the finish reason
                total_tokens = final_usage.get("total_tokens", 0)
                total_tokens_per_user.labels(str(developer.id)).inc(
                    amount=total_tokens
                )
            # Collect content for the full response
            if has_choices and hasattr(chunk.choices[0], "delta"):
                delta = chunk.choices[0].delta
                if hasattr(delta, "content") and delta.content:
                    content_so_far += delta.content
                    has_content = True
            # Prepare the response chunk
            choices_to_send = []
            if has_choices:
                chunk_data = chunk.choices[0].model_dump()
                # Ensure delta always contains a role field
                if "delta" in chunk_data and "role" not in chunk_data["delta"]:
                    chunk_data["delta"]["role"] = "assistant"
                choices_to_send = [chunk_data]
            # Create and send the chunk response
            chunk_response = ChunkChatResponse(
                id=response_id,
                created_at=created_at,
                jobs=jobs,
                docs=doc_references,
                usage=final_usage,
                choices=choices_to_send,
            )
            yield chunk_response.model_dump_json() + "\n"
        except Exception as e:
            # Log error details for debugging but send a generic message to client
            import logging
            logging.error(f"Error processing chunk: {e!s}")
            error_response = {
                "id": str(response_id),
                "created_at": created_at.isoformat(),
                "error": "An error occurred while processing the response chunk.",
            }
            yield json.dumps(error_response) + "\n"
            # Continue processing remaining chunks
            continue
    # Save complete response to history if needed
    if chat_input.save and has_content:
        try:
            # Create entry for the complete response
            complete_entry = CreateEntryRequest.from_model_input(
                model=settings["model"],
                role="assistant",
                content=content_so_far,
                source="api_response",
            )
            # Create a task to save the entry without blocking the stream
            ref = asyncio.create_task(
                create_entries(
                    developer_id=developer.id,
                    session_id=session_id,
                    data=[complete_entry],
                )
            )
            stream_tasks.append(ref)
        except Exception as e:
            # Log the full error for debugging purposes
            import logging
            logging.error(f"Failed to save streamed response: {e!s}")
            # Send a minimal error message to the client
            error_response = {
                "id": str(response_id),
                "created_at": created_at.isoformat(),
                "error": "Failed to save response history.",
            }
            yield json.dumps(error_response) + "\n"
except Exception as e:
    # Log the detailed error for system debugging
    import logging
    logging.error(f"Streaming error: {e!s}")
    # Send a user-friendly error message to the client
    error_response = {
        "id": str(response_id),
        "created_at": created_at.isoformat(),
        "error": "An error occurred during the streaming response.",
    }
    yield json.dumps(error_response) + "\n"
finally:
    # Ensure model resources are properly released
    if hasattr(model_response, "close") and callable(model_response.close):
        await model_response.close()
    # Wait for any pending save tasks to complete
    if stream_tasks:
        await asyncio.gather(*stream_tasks, return_exceptions=True)
```
```python
tool_calls = []

if not stream:
    response: ModelResponse = await self._completion_func(**kwargs)
    if not response.choices:
        return response

    tool_calls = response.choices[0].message.tool_calls

    if not tool_calls:
        return response
else:
    first_chunk, response = await self.get_first_chunk(response)
    if first_chunk and not first_chunk.choices:
        return response

    delta = first_chunk.choices[0].delta
    if delta.content:
        return response

    async for chunk in cast(CustomStreamWrapper, response):
        delta = chunk.choices[0].delta
        if delta.tool_calls:
            for tool_call in delta.tool_calls:
                if len(tool_calls) <= tool_call.index:
                    tool_calls.append({
                        "id": "",
                        "type": "function",
                        "function": {"name": "", "arguments": ""},
                    })

                tc = tool_calls[tool_call.index]
                if tool_call.id:
                    tc["id"] = tool_call.id
                if tool_call.function.name:
                    tc["function"]["name"] += tool_call.function.name
                if tool_call.function.arguments:
                    tc["function"]["arguments"] += tool_call.function.arguments

for tool in tool_calls:
```
The tool_calls list is populated with tool calls from the streaming response but is never checked for valid tool types, allowing unauthorized tool execution.
📝 Committable Code Suggestion
‼️ Ensure you review the code suggestion before committing it to the branch. Make sure it replaces the highlighted code, contains no missing lines, and has no issues with indentation.
```python
tool_calls = []
for chunk in response:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if not delta:
        continue
    if delta.tool_calls:
        for tool_call in delta.tool_calls:
            # Validate tool type before adding
            if tool_call.type not in ['function', 'code']:
                continue
            # Initialize tool call if new
            if tool_call.index >= len(tool_calls):
                tool_calls.append({
                    'id': tool_call.id,
                    'type': tool_call.type,
                    'function': {
                        'name': '',
                        'arguments': ''
                    }
                })
            # Update function name if present
            if tool_call.function and tool_call.function.name:
                tool_calls[tool_call.index]['function']['name'] = \
                    tool_call.function.name
            # Append function arguments if present
            if tool_call.function and tool_call.function.arguments:
                tool_calls[tool_call.index]['function']['arguments'] += \
                    tool_call.function.arguments

# Execute validated tool calls
for tool_call in tool_calls:
    if tool_call['type'] == 'function':
        function_name = tool_call['function']['name']
        function_args = tool_call['function']['arguments']
        # Execute function call
        try:
            function_args = json.loads(function_args)
            result = self._call_function(function_name, function_args)
            yield {'type': 'function', 'output': result}
        except Exception as e:
            yield {'type': 'error', 'error': str(e)}
```
```python
async def peek_first_chunk(
    response: CustomStreamWrapper,
) -> tuple[ModelResponseStream, CustomStreamWrapper]:
    try:
        first_chunk = await response.__anext__()
    except StopAsyncIteration:
```
Missing error handling for first_chunk being None in peek_first_chunk(). The function should handle the case where first_chunk is None to avoid potential runtime errors.
📝 Committable Code Suggestion
‼️ Ensure you review the code suggestion before committing it to the branch. Make sure it replaces the highlighted code, contains no missing lines, and has no issues with indentation.
```python
def peek_first_chunk(self):
    if self.first_chunk is None:
        return None
    return self.first_chunk.chunk
```
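For comparison, an async-shaped guard matching the peek_first_chunk signature from the hunk above is sketched below; the re-chaining generator and the _safe name are assumptions, not the PR's implementation.

```python
from typing import Any, AsyncIterator


async def peek_first_chunk_safe(response: AsyncIterator[Any]):
    """Peek the first chunk; return (None, response) instead of raising on an empty stream."""
    try:
        first_chunk = await response.__anext__()
    except StopAsyncIteration:
        # Stream was empty; signal "no chunk" to the caller instead of failing later.
        return None, response

    async def rechained():
        # Re-yield the peeked chunk, then pass through the rest of the stream.
        yield first_chunk
        async for chunk in response:
            yield chunk

    return first_chunk, rechained()
```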
"id": "call_user_doc_list", | ||
"type": "system", | ||
"function": { | ||
"name": "user.doc.list", |
I think this (and related) should be user.docs.list (docs plural) etc. to keep things consistent with the API and SDK.
```python
])

agent_doc_create = make_acompletion_multiple_outputs(
    lambda agent, doc, user, task, user_doc, session: [
```
could be simplified as `lambda agent, **_: ...`
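For illustration only: assuming the factory passes fixtures as keyword arguments (an assumption about make_acompletion_multiple_outputs), the simplified lambda would keep just the fixture it uses and absorb the rest:

```python
# Hypothetical stand-in for the fixture lambda; only 'agent' is used, **_ swallows the rest.
make_output = lambda agent, **_: [f"completion for {agent}"]

print(make_output(agent="test-agent", doc=None, user=None, task=None, user_doc=None, session=None))
```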
3ff3a18 → 65497b5 (Compare)
User description
Closes #1285

EntelligenceAI PR Summary
The recent update introduces a new utility function, eval_tool_calls, in tools.py to enhance the evaluation of tool calls within chat sessions. The chat function in chat.py has been updated to utilize this utility, replacing direct calls to litellm.acompletion. This change aims to improve code modularity and streamline the integration of tool responses in chat sessions.

PR Type
Enhancement, Tests

Description
- Introduced tool_calls_evaluator to streamline tool call handling in chat sessions (a hedged sketch of the decorator pattern follows this list).
- Added a new utility function, call_tool, for executing system tool calls.
- Implemented extensive unit tests for tool_calls_evaluator and call_tool.
- Enhanced modularity and maintainability of the chat endpoint.
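As a rough sketch of the decorator pattern described above (not the PR's actual code; the signature of tool_calls_evaluator and the wrapper body are assumptions), wrapping a completion function so that matching tool calls are intercepted might look like this:

```python
# Hedged sketch of a tool-call-evaluating decorator; the real tool_calls_evaluator in
# agents-api/agents_api/routers/utils/tools.py may differ in signature and behavior.
from functools import wraps


def tool_calls_evaluator(*, tool_types: set[str]):
    def decorator(completion_func):
        @wraps(completion_func)
        async def wrapper(**kwargs):
            response = await completion_func(**kwargs)
            for choice in getattr(response, "choices", None) or []:
                for tool_call in getattr(choice.message, "tool_calls", None) or []:
                    if tool_call.type in tool_types:
                        # A real implementation would execute the tool and feed the result
                        # back into the conversation before re-calling the model.
                        print(f"would evaluate {tool_call.type} call: {tool_call.function.name}")
            return response

        return wrapper

    return decorator


# Usage in the spirit of the PR (names assumed):
# acompletion = tool_calls_evaluator(tool_types={"system"})(litellm.acompletion)
```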
Changes walkthrough 📝

chat.py: Refactor chat endpoint to use tool evaluator (agents-api/agents_api/routers/sessions/chat.py)
- Integrated tool_calls_evaluator into the chat function.
- Replaced direct litellm.acompletion calls with evaluated tool calls.

tools.py: Introduce tool call evaluation utilities (agents-api/agents_api/routers/utils/tools.py)
- Added tool_calls_evaluator decorator for tool call evaluation.
- Added call_tool function to handle system tool calls.
- Added _system_tool_handlers mapping for tool-specific logic.

test_tool_calls_evaluator.py: Add unit tests for tool evaluation utilities (agents-api/tests/test_tool_calls_evaluator.py)
- Unit tests for call_tool covering various tool operations.
- Tests of tool_calls_evaluator for correct tool call handling.

Important
Refactor chat endpoint to use tool_calls_evaluator for tool call handling, enhancing modularity and adding extensive tests.
- chat() in chat.py to use tool_calls_evaluator for handling tool calls, replacing direct litellm.acompletion calls.
- call_tool in tools.py for executing system tool calls.
- tool_calls_evaluator decorator in tools.py for tool call evaluation.
- _system_tool_handlers mapping in tools.py for tool-specific logic.
- test_tool_calls_evaluator.py with unit tests for tool_calls_evaluator and call_tool.
- tools.py

This description was created automatically for ebd3769 and will update as commits are pushed.
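To make the _system_tool_handlers / call_tool idea from the walkthrough concrete, here is a minimal, hypothetical sketch; the handler names, signatures, and the plural user.docs.list key (following the reviewer's naming comment above) are illustrative, not the PR's actual code.

```python
import json
from typing import Any, Awaitable, Callable


# Hypothetical handler; the real handlers live in agents-api/agents_api/routers/utils/tools.py.
async def _list_user_docs(arguments: dict[str, Any]) -> list[dict[str, Any]]:
    return [{"id": "doc-1", "title": "example"}]  # placeholder result


_system_tool_handlers: dict[str, Callable[[dict[str, Any]], Awaitable[Any]]] = {
    "user.docs.list": _list_user_docs,
}


async def call_tool(name: str, arguments: str | dict[str, Any]) -> Any:
    """Dispatch a system tool call to its handler; arguments may arrive as a JSON string."""
    if isinstance(arguments, str):
        arguments = json.loads(arguments)
    handler = _system_tool_handlers.get(name)
    if handler is None:
        raise ValueError(f"Unknown system tool: {name}")
    return await handler(arguments)
```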