-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
V0.23 #39
base: old/main
Are you sure you want to change the base?
V0.23 #39
Conversation
Reviewer's Guide by SourceryThis pull request, titled 'V0.23', introduces several significant changes across multiple files in the PR-Assistant project. The changes include updates to various components such as Git providers (Bitbucket, GitHub, GitLab), server implementations, configuration files, and documentation. Key modifications involve improving token handling, extending patch processing, updating model configurations, and enhancing the functionality of different tools within the PR-Assistant. File-Level Changes
Tips
|
PR Reviewer Guide 🔍
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @gitworkflows - I've reviewed your changes - here's some feedback:
Overall Comments:
- This PR includes significant improvements across multiple components of PR-Assistant. However, we noticed the removal of some CI/CD workflows. Could you please clarify your testing strategy and how you plan to maintain code quality without these workflows?
Here's what I looked at during the review
- 🟡 General issues: 1 issue found
- 🟢 Security: all looks good
- 🟡 Testing: 1 issue found
- 🟡 Complexity: 1 issue found
- 🟡 Documentation: 1 issue found
Help me be more useful! Please click 👍 or 👎 on each comment to tell me if it was helpful.
@@ -26,6 +26,7 @@ def __init__(self, pr_url: Optional[str] = None): | |||
self.installation_id = context.get("installation_id", None) | |||
except Exception: | |||
self.installation_id = None | |||
self.max_comment_chars = 65000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Consider making the max_comment_chars value configurable
Instead of hardcoding the max_comment_chars value, consider making it configurable through settings or defining it as a constant at the module level. This would make the code more flexible and easier to adjust for different GitHub API limits in the future.
MAX_COMMENT_CHARS = 65000
class GitHubProvider:
def __init__(self, config):
self.max_comment_chars = config.get('max_comment_chars', MAX_COMMENT_CHARS)
@@ -45,7 +45,7 @@ def test_load_invalid_yaml2(self): | |||
with pytest.raises(ScannerError): | |||
yaml.safe_load(yaml_str) | |||
|
|||
expected_output = [{'relevant file': 'src/app.py:', 'suggestion content': 'The print statement is outside inside the if __name__ ==: '}] | |||
expected_output = [{'relevant file': 'src/app.py:', 'suggestion content': 'The print statement is outside inside the if __name__ ==:'}] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Add test for edge case with empty YAML string
Consider adding a test case for an empty YAML string to ensure the load_yaml
function handles it gracefully.
expected_output = [{'relevant file': 'src/app.py:', 'suggestion content': 'The print statement is outside inside the if __name__ ==:'}] | |
expected_output = [{'relevant file': 'src/app.py:', 'suggestion content': 'The print statement is outside inside the if __name__ ==:'}] | |
assert load_yaml(yaml_str) == expected_output | |
# Test empty YAML string | |
empty_yaml_str = "" | |
assert load_yaml(empty_yaml_str) == [] |
@@ -8,7 +66,7 @@ git_provider="azure" | |||
|
|||
Azure DevOps provider supports [PAT token](https://learn.microsoft.com/en-us/azure/devops/organizations/accounts/use-personal-access-tokens-to-authenticate?view=azure-devops&tabs=Windows) or [DefaultAzureCredential](https://learn.microsoft.com/en-us/azure/developer/python/sdk/authentication-overview#authentication-in-server-environments) authentication. | |||
PAT is faster to create, but has build in expiration date, and will use the user identity for API calls. | |||
Using DefaultAzureCredential you can use managed identity or Service principle, which are more secure and will create separate ADO user identity (via AAD) to the agent. | |||
Using DefaultAzureCredential you can use managed identity or Service principle, which are more secure and will create separate ADO user identity (via AAD) to the assistant. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (documentation): Typo: 'Service principle' should be 'Service principal'.
The term 'Service principle' is incorrect. It should be 'Service principal'.
@@ -47,3 +47,17 @@ def apply_repo_settings(pr_url): | |||
os.remove(repo_settings_file) | |||
except Exception as e: | |||
get_logger().error(f"Failed to remove temporary settings file {repo_settings_file}", e) | |||
|
|||
# enable switching models with a short definition | |||
if get_settings().config.model.lower()=='claude-3-5-sonnet': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (complexity): Consider refactoring the model-specific logic into a separate function.
The new code introduces additional complexity by mixing model-specific logic into the apply_repo_settings
function. This affects separation of concerns, readability, and maintainability. A more streamlined approach would be to use a dictionary for model configurations and a separate function to handle model-specific settings. Here’s a suggested refactor:
def apply_repo_settings():
try:
get_settings().set(section, section_dict, merge=False)
get_logger().info(f"Applying repo settings:\n{new_settings.as_dict()}")
except Exception as e:
get_logger().exception("Failed to apply repo settings", e)
finally:
if repo_settings_file:
try:
os.remove(repo_settings_file)
except Exception as e:
get_logger().error(f"Failed to remove temporary settings file {repo_settings_file}", e)
# Enable switching models with a short definition
model_name = get_settings().config.model.lower()
if model_name in MODEL_CONFIGURATIONS:
set_model_configuration(model_name)
MODEL_CONFIGURATIONS = {
'claude-3-5-sonnet': {
'model': "bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
'model_turbo': "bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
'fallback_models': ["bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0"]
}
}
def set_model_configuration(model_name):
config = MODEL_CONFIGURATIONS[model_name]
get_settings().set('config.model', config['model'])
get_settings().set('config.model_turbo', config['model_turbo'])
get_settings().set('config.fallback_models', config['fallback_models'])
This keeps the apply_repo_settings
function focused and makes the code more maintainable.
else: | ||
# for llms that compatible with openai, should use custom api base | ||
openai_api_base = get_settings().get("OPENAI.API_BASE", None) | ||
if openai_api_base is None or len(openai_api_base) == 0: | ||
return ChatOpenAI(openai_api_key=get_settings().openai.key) | ||
else: | ||
return ChatOpenAI(openai_api_key=get_settings().openai.key, openai_api_base=openai_api_base) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): We've found these issues:
- Remove unnecessary else after guard condition (
remove-unnecessary-else
) - Replace if statement with if expression (
assign-if-exp
)
else: | |
# for llms that compatible with openai, should use custom api base | |
openai_api_base = get_settings().get("OPENAI.API_BASE", None) | |
if openai_api_base is None or len(openai_api_base) == 0: | |
return ChatOpenAI(openai_api_key=get_settings().openai.key) | |
else: | |
return ChatOpenAI(openai_api_key=get_settings().openai.key, openai_api_base=openai_api_base) | |
# for llms that compatible with openai, should use custom api base | |
openai_api_base = get_settings().get("OPENAI.API_BASE", None) | |
return ( | |
ChatOpenAI(openai_api_key=get_settings().openai.key) | |
if openai_api_base is None or len(openai_api_base) == 0 | |
else ChatOpenAI( | |
openai_api_key=get_settings().openai.key, | |
openai_api_base=openai_api_base, | |
) | |
) |
@@ -557,7 +557,7 @@ def _fix_key_value(key: str, value: str): | |||
|
|||
|
|||
def load_yaml(response_text: str, keys_fix_yaml: List[str] = [], first_key="", last_key="") -> dict: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): Replace mutable default arguments with None (default-mutable-arg
)
def load_yaml(response_text: str, keys_fix_yaml: List[str] = [], first_key="", last_key="") -> dict: | |
def load_yaml(response_text: str, keys_fix_yaml: List[str] = None, first_key="", last_key="") -> dict: | |
if keys_fix_yaml is None: | |
keys_fix_yaml = [] |
if capability in ['get_issue_comments', 'publish_inline_comments', 'get_labels', 'gfm_markdown', | ||
'publish_file_comments']: | ||
return False | ||
return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): We've found these issues:
- Lift code into else after jump in control flow (
reintroduce-else
) - Replace if statement with if expression (
assign-if-exp
) - Use set when checking membership of a collection of literals (
collection-into-set
) - Simplify boolean if expression (
boolean-if-exp-identity
) - Remove unnecessary casts to int, str, float or bool (
remove-unnecessary-cast
)
if capability in ['get_issue_comments', 'publish_inline_comments', 'get_labels', 'gfm_markdown', | |
'publish_file_comments']: | |
return False | |
return True | |
return capability not in { | |
'get_issue_comments', | |
'publish_inline_comments', | |
'get_labels', | |
'gfm_markdown', | |
'publish_file_comments', | |
} |
if response.status_code == 404: # not found | ||
return "" | ||
contents = response.text | ||
return contents |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): We've found these issues:
- Inline variable that is immediately returned (
inline-immediately-returned-variable
) - Lift code into else after jump in control flow (
reintroduce-else
) - Replace if statement with if expression (
assign-if-exp
)
if response.status_code == 404: # not found | |
return "" | |
contents = response.text | |
return contents | |
return "" if response.status_code == 404 else response.text |
@@ -256,6 +256,9 @@ def get_num_of_files(self): | |||
except Exception as e: | |||
return -1 | |||
|
|||
def limit_output_characters(self, output: str, max_chars: int): | |||
return output[:max_chars] + '...' if len(output) > max_chars else output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): Use f-string instead of string concatenation (use-fstring-for-concatenation
)
return output[:max_chars] + '...' if len(output) > max_chars else output | |
return f'{output[:max_chars]}...' if len(output) > max_chars else output |
@@ -167,10 +167,13 @@ async def run(self): | |||
|
|||
async def _prepare_prediction(self, model: str) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (code-quality): Low code quality found in PRDescription._prepare_prediction - 8% (low-code-quality
)
Explanation
The quality score for this function is below the quality threshold of 25%.This score is a combination of the method length, cognitive complexity and working memory.
How can you solve this?
It might be worth refactoring this function to make it shorter and more readable.
- Reduce the function length by extracting pieces of functionality out into
their own functions. This is the most important thing you can do - ideally a
function should be less than 10 lines. - Reduce nesting, perhaps by introducing guard clauses to return early.
- Ensure that variables are tightly scoped, so that code using related concepts
sits together within the function rather than being scattered.
user_prompt_tokens = len(encoder.encode(user_prompt)) | ||
return system_prompt_tokens + user_prompt_tokens | ||
try: | ||
environment = Environment(undefined=StrictUndefined) |
Check warning
Code scanning / CodeQL
Jinja2 templating with autoescape=False Medium
PR Code Suggestions ✨
|
User description
Description
This PR fixes #
Notes for Reviewers
Signed commits
PR Type
enhancement, bug fix, documentation
Description
pr_assistant/algo/__init__.py
._create_chat
method inpr_assistant/algo/ai_handlers/langchain_ai_handler.py
for better error handling and simplified chat creation logic.pr_assistant/algo/ai_handlers/litellm_ai_handler.py
.pr_assistant/algo/git_patch_processing.py
._get_system_user_tokens
method inpr_assistant/algo/token_handler.py
.pr_assistant/algo/utils.py
.PRAssistant
class inpr_assistant/assistant/pr_assistant.py
to handle various PR-related commands.PRAssistant
in multiple server files.pr_assistant/git_providers/bitbucket_provider.py
.pr_assistant/git_providers/git_provider.py
.pr_assistant/tools/pr_code_suggestions.py
andpr_assistant/tools/pr_description.py
..pr_assistant.toml
.Changes walkthrough 📝
22 files
__init__.py
Add new models to supported models list.
pr_assistant/algo/init.py
gpt-4o-2024-08-06
,claude-3-5-sonnet
, and severalwatsonx
models.
langchain_ai_handler.py
Refactor chat creation logic and error handling.
pr_assistant/algo/ai_handlers/langchain_ai_handler.py
_create_chat
method for better error handling.litellm_ai_handler.py
Adjust error logging levels.
pr_assistant/algo/ai_handlers/litellm_ai_handler.py
error
towarning
for unknown errors.token_handler.py
Add error handling for token counting.
pr_assistant/algo/token_handler.py
_get_system_user_tokens
method.pr_assistant.py
Add PRAssistant class for handling PR commands.
pr_assistant/assistant/pr_assistant.py
PRAssistant
class to handle various PR-related commands.cli.py
Update import path for PRAssistant.
pr_assistant/cli.py
PRAssistant
.bitbucket_provider.py
Add support for file comments and improve error handling.
pr_assistant/git_providers/bitbucket_provider.py
git_provider.py
Add method to limit output characters.
pr_assistant/git_providers/git_provider.py
github_provider.py
Add character limit for comments and improve error handling.
pr_assistant/git_providers/github_provider.py
gitlab_provider.py
Add character limit for comments and improve error handling.
pr_assistant/git_providers/gitlab_provider.py
utils.py
Add function to set Claude model.
pr_assistant/git_providers/utils.py
azuredevops_server_webhook.py
Update import path for PRAssistant.
pr_assistant/servers/azuredevops_server_webhook.py
PRAssistant
.bitbucket_app.py
Update import path for PRAssistant.
pr_assistant/servers/bitbucket_app.py
PRAssistant
.bitbucket_server_webhook.py
Update import path for PRAssistant.
pr_assistant/servers/bitbucket_server_webhook.py
PRAssistant
.gerrit_server.py
Update import path for PRAssistant.
pr_assistant/servers/gerrit_server.py
PRAssistant
.github_action_runner.py
Update import path for PRAssistant.
pr_assistant/servers/github_action_runner.py
PRAssistant
.github_app.py
Update import path for PRAssistant.
pr_assistant/servers/github_app.py
PRAssistant
.github_polling.py
Update import path for PRAssistant.
pr_assistant/servers/github_polling.py
PRAssistant
.gitlab_webhook.py
Update import path for PRAssistant.
pr_assistant/servers/gitlab_webhook.py
PRAssistant
.pr_code_suggestions.py
Adjust logging levels and improve large PR handling.
pr_assistant/tools/pr_code_suggestions.py
pr_description.py
Improve large PR handling and add logging.
pr_assistant/tools/pr_description.py
pr_reviewer.py
Add checks for review labels configuration.
pr_assistant/tools/pr_reviewer.py
4 files
git_patch_processing.py
Fix variable name typo and add null check.
pr_assistant/algo/git_patch_processing.py
len(original_lines)
tolen_original_lines
.original_file_str
before processing patches.utils.py
Fix YAML code block marker removal.
pr_assistant/algo/utils.py
load_yaml
function to correctly remove YAML code block markers.bitbucket_server_provider.py
Fix HTTPError import and adjust code suggestion formatting.
pr_assistant/git_providers/bitbucket_server_provider.py
test_load_yaml.py
Fix typo in test string.
tests/unittest/test_load_yaml.py
8 files
config_loader.py
Update TOML key to pr-agent.
pr_assistant/config_loader.py
pr-assistant
topr-agent
.dependabot.yml
Remove dependabot configuration.
.github/dependabot.yml
ci_code.yml
Remove CI workflow configuration.
.github/workflows/ci_code.yml
code_coverage.yaml
Update code coverage workflow.
.github/workflows/code_coverage.yaml
publish_pipeline.yml
Remove publish pipeline configuration.
.github/workflows/publish_pipeline.yml
.pr_assistant.toml
Add configuration for Claude model.
.pr_assistant.toml
codecov.yml
Add codecov configuration.
codecov.yml
configuration.toml
Update patch extra lines configuration and app name.
pr_assistant/settings/configuration.toml
pr-agent
.2 files
test_convert_to_markdown.py
Fix missing newline at end of file.
tests/unittest/test_convert_to_markdown.py
Dockerfile
Standardize Dockerfile formatting.
docker/Dockerfile
4 files
format.sh
Remove unused script.
scripts/format.sh
lint.sh
Remove unused script.
scripts/lint.sh
test-cov-html.sh
Remove unused script.
scripts/test-cov-html.sh
test.sh
Remove unused script.
scripts/test.sh
10 files
CITATION.cff
Remove citation file.
CITATION.cff
azure.md
Add Azure DevOps pipeline instructions.
docs/docs/installation/azure.md
gitlab.md
Add GitLab pipeline instructions.
docs/docs/installation/gitlab.md
PR_assistant_pro_models.md
Add instructions for using Claude models.
docs/docs/usage-guide/PR_assistant_pro_models.md
additional_configurations.md
Update patch extra lines configuration.
docs/docs/usage-guide/additional_configurations.md
automations_and_usage.md
Update Azure DevOps instructions.
docs/docs/usage-guide/automations_and_usage.md
footer.html
Fix broken link in footer.
docs/overrides/partials/footer.html
pr_code_suggestions_prompts.toml
Update instructions for quoting variables.
pr_assistant/settings/pr_code_suggestions_prompts.toml
pr_code_suggestions_reflect_prompts.toml
Update instructions for quoting variables.
pr_assistant/settings/pr_code_suggestions_reflect_prompts.toml
pyproject.toml
Update author and maintainer information.
pyproject.toml
Summary by Sourcery
Introduce new methods for handling file comments and PR owner ID in the Bitbucket provider. Add support for running PR-Assistant as Azure DevOps and GitLab pipelines. Enhance error handling and logging across the codebase. Update documentation and tests to reflect the new changes. Rename the application from 'pr-assistant' to 'pr-agent'.
New Features:
publish_file_comments
in the Bitbucket provider.Bug Fixes:
_get_system_user_tokens
method of the token handler.Enhancements:
CI:
Documentation:
Tests:
Chores: