Skip to content

Revert "Revert "Adding Schema for Tool Outputs"" (#894) #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 17, 2025

Conversation

Zeeeepa
Copy link
Owner

@Zeeeepa Zeeeepa commented Apr 17, 2025

Reverts codegen-sh#892


Motivation

Content

Testing

Please check the following before marking your PR as ready for review

  • I have added tests for my changes
  • I have updated the documentation or added new documentation as needed

Summary by Sourcery

Revert a previous reversion and add schema definitions for tool outputs, introducing structured artifacts for various tool operations

New Features:

  • Added new tool_output_types.py with TypedDict definitions for structured tool artifacts
  • Introduced artifact generation for search, view file, list directory, edit file, and semantic edit operations

Enhancements:

  • Updated tool rendering methods to support ToolMessage with artifacts
  • Modified observation classes to generate structured output artifacts
  • Improved error handling and metadata generation for tool operations

Chores:

  • Refactored tool rendering methods across multiple tool extension files
  • Updated type hints and imports to support new artifact generation

Reverts codegen-sh#892

---------

Co-authored-by: Rushil Patel <[email protected]>
Co-authored-by: rushilpatel0 <[email protected]>
Copy link

sourcery-ai bot commented Apr 17, 2025

Reviewer's Guide by Sourcery

This pull request reverts the changes introduced in PR codegen-sh#892, which added schema for tool outputs. The changes modify the tool classes to return a ToolMessage object, including tool results and metadata. The Observation class is updated to return a ToolMessage object or string based on whether a tool_call_id is provided. New TypedDicts are introduced for tool output artifacts.

Sequence diagram for ViewFileTool execution

sequenceDiagram
    participant User
    participant ViewFileTool
    participant view_file
    participant ViewFileObservation
    User->>ViewFileTool: _run(filepath, start_line, end_line, max_lines, line_numbers, tool_call_id)
    ViewFileTool->>view_file: view_file(codebase, filepath, start_line, end_line, max_lines)
    activate view_file
    view_file->>ViewFileObservation: Create ViewFileObservation
    activate ViewFileObservation
    ViewFileObservation-->>view_file: ViewFileObservation
    deactivate ViewFileObservation
    view_file-->>ViewFileTool: ViewFileObservation
    deactivate view_file
    ViewFileTool->>ViewFileObservation: render(tool_call_id)
    activate ViewFileObservation
    ViewFileObservation-->>ViewFileTool: ToolMessage
    deactivate ViewFileObservation
    ViewFileTool-->>User: ToolMessage
Loading

Sequence diagram for ListDirectoryTool execution

sequenceDiagram
    participant User
    participant ListDirectoryTool
    participant list_directory
    participant ListDirectoryObservation
    User->>ListDirectoryTool: _run(tool_call_id, dirpath, depth)
    ListDirectoryTool->>list_directory: list_directory(codebase, dirpath, depth)
    activate list_directory
    list_directory->>ListDirectoryObservation: Create ListDirectoryObservation
    activate ListDirectoryObservation
    ListDirectoryObservation-->>list_directory: ListDirectoryObservation
    deactivate ListDirectoryObservation
    list_directory-->>ListDirectoryTool: ListDirectoryObservation
    deactivate list_directory
    ListDirectoryTool->>ListDirectoryObservation: render(tool_call_id)
    activate ListDirectoryObservation
    ListDirectoryObservation-->>ListDirectoryTool: ToolMessage
    deactivate ListDirectoryObservation
    ListDirectoryTool-->>User: ToolMessage
Loading

Sequence diagram for SearchTool execution

sequenceDiagram
    participant User
    participant SearchTool
    participant search
    participant SearchObservation
    User->>SearchTool: _run(tool_call_id, query, file_extensions, page, files_per_page, use_regex)
    SearchTool->>search: search(codebase, query, file_extensions, page, files_per_page, use_regex)
    activate search
    search->>SearchObservation: Create SearchObservation
    activate SearchObservation
    SearchObservation-->>search: SearchObservation
    deactivate SearchObservation
    search-->>SearchTool: SearchObservation
    deactivate search
    SearchTool->>SearchObservation: render(tool_call_id)
    activate SearchObservation
    SearchObservation-->>SearchTool: ToolMessage
    deactivate SearchObservation
    SearchTool-->>User: ToolMessage
Loading

Sequence diagram for EditFileTool execution

sequenceDiagram
    participant User
    participant EditFileTool
    participant edit_file
    participant EditFileObservation
    User->>EditFileTool: _run(filepath, content, tool_call_id)
    EditFileTool->>edit_file: edit_file(codebase, filepath, content)
    activate edit_file
    edit_file->>EditFileObservation: Create EditFileObservation
    activate EditFileObservation
    EditFileObservation-->>edit_file: EditFileObservation
    deactivate EditFileObservation
    edit_file-->>EditFileTool: EditFileObservation
    deactivate edit_file
    EditFileTool->>EditFileObservation: render(tool_call_id)
    activate EditFileObservation
    EditFileObservation-->>EditFileTool: ToolMessage
    deactivate EditFileObservation
    EditFileTool-->>User: ToolMessage
Loading

Sequence diagram for SemanticEditTool execution

sequenceDiagram
    participant User
    participant SemanticEditTool
    participant semantic_edit
    participant SemanticEditObservation
    User->>SemanticEditTool: _run(filepath, tool_call_id, edit_content, start, end)
    SemanticEditTool->>semantic_edit: semantic_edit(codebase, filepath, edit_content, start, end)
    activate semantic_edit
    semantic_edit->>SemanticEditObservation: Create SemanticEditObservation
    activate SemanticEditObservation
    SemanticEditObservation-->>semantic_edit: SemanticEditObservation
    deactivate SemanticEditObservation
    semantic_edit-->>SemanticEditTool: SemanticEditObservation
    deactivate semantic_edit
    SemanticEditTool->>SemanticEditObservation: render(tool_call_id)
    activate SemanticEditObservation
    SemanticEditObservation-->>SemanticEditTool: ToolMessage
    deactivate SemanticEditObservation
    SemanticEditTool-->>User: ToolMessage
Loading

File-Level Changes

Change Details Files
Updates the SearchObservation class to return a ToolMessage object, including search results and metadata.
  • Modifies the render method to accept a tool_call_id.
  • Adds logic to handle error cases and return a ToolMessage with an error message.
  • Adds logic to build matches and file paths for success cases.
  • Returns a ToolMessage object containing the search results, status, tool name, tool call ID, and artifacts.
src/codegen/extensions/tools/search.py
Updates the ListDirectoryObservation class to return a ToolMessage object, including directory listing and metadata.
  • Adds a to_artifacts method to the DirectoryInfo class to convert directory info to artifacts for UI.
  • Modifies the render method to accept a tool_call_id.
  • Adds logic to handle error cases and return a ToolMessage with an error message.
  • Returns a ToolMessage object containing the directory listing, status, tool name, tool call ID, and artifacts.
src/codegen/extensions/tools/list_directory.py
Updates the ViewFileObservation class to return a ToolMessage object, including file content and metadata.
  • Adds a raw_content field to the ViewFileObservation class.
  • Modifies the render method to accept a tool_call_id.
  • Adds logic to handle error cases and return a ToolMessage with an error message.
  • Returns a ToolMessage object containing the file content, status, tool name, tool call ID, and artifacts.
src/codegen/extensions/tools/view_file.py
Updates the tool classes to accept a tool_call_id and return a ToolMessage object.
  • Adds tool_call_id to the input models for ViewFileTool, ListDirectoryTool, SearchTool, EditFileTool, CreateFileTool, SemanticEditTool, and RelaceEditTool.
  • Modifies the _run methods to accept a tool_call_id.
  • Updates the _run methods to return a ToolMessage object.
src/codegen/extensions/langchain/tools.py
Updates the SemanticEditObservation class to return a ToolMessage object, including edit results and metadata.
  • Modifies the render method to accept a tool_call_id.
  • Adds logic to handle error cases and return a ToolMessage with an error message.
  • Returns a ToolMessage object containing the edit results, status, tool name, tool call ID, and artifacts.
src/codegen/extensions/tools/semantic_edit.py
Updates the Observation class to return a ToolMessage object or string based on whether a tool_call_id is provided.
  • Adds a render method that accepts an optional tool_call_id.
  • Modifies the render method to return a ToolMessage if a tool_call_id is provided, otherwise returns a string.
  • Adds a render_as_string method to return a string representation of the observation.
src/codegen/extensions/tools/observation.py
Updates the EditFileObservation class to return a ToolMessage object, including edit results and metadata.
  • Modifies the render method to accept a tool_call_id.
  • Adds logic to handle error cases and return a ToolMessage with an error message.
  • Returns a ToolMessage object containing the edit results, status, tool name, tool call ID, and artifacts.
src/codegen/extensions/tools/edit_file.py
Updates the RelaceEditObservation class to return a ToolMessage object, including edit results and metadata.
  • Modifies the render method to accept a tool_call_id.
  • Adds logic to handle error cases and return a ToolMessage with an error message.
  • Returns a ToolMessage object containing the edit results, status, tool name, tool call ID, and artifacts.
src/codegen/extensions/tools/relace_edit.py
Updates the ToolMessageData class to include a status field and modifies how tool responses are handled in the agent tracer.
  • Adds a status field to the ToolMessageData class.
  • Modifies the extract_structured_data method to extract the status and artifact from the ToolMessage.
src/codegen/agents/tracer.py
src/codegen/agents/data.py
Introduces new TypedDicts for tool output artifacts.
  • Adds new TypedDicts for EditFileArtifacts, ViewFileArtifacts, ListDirectoryArtifacts, SearchMatch, SearchArtifacts, SemanticEditArtifacts, and RelaceEditArtifacts.
src/codegen/extensions/tools/tool_output_types.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!
  • Generate a plan of action for an issue: Comment @sourcery-ai plan on
    an issue to generate a plan of action for it.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@Zeeeepa Zeeeepa merged commit 47ff0ea into develop Apr 17, 2025
1 check was pending
Copy link

codiumai-pr-agent-free bot commented Apr 17, 2025

PR Reviewer Guide 🔍

(Review updated until commit 046b238)

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Potential Bug

The condition if isinstance(result, AIMessage) and not result.tool_calls: may cause unexpected behavior if result is not an AIMessage. The original code checked only the type, but the new code assumes tool_calls attribute exists when the type check passes.

if isinstance(result, AIMessage) and not result.tool_calls:
    updated_messages = [*messages, result]
Error Handling

The error message for file not found has been significantly changed and now spans multiple lines. Verify this doesn't break any error handling logic that might be parsing these messages.

            error=f"""File not found: {filepath}. Please use full filepath relative to workspace root.
Ensure that this is indeed the correct filepath, else keep searching to find the correct fullpath.""",
Variable Reference

The error handling code at line 458 uses exception!s but there's no variable named exception in scope. The old code used error_msg which was likely defined elsewhere in the function.

return f"Error executing tool: {exception!s}\n\nPlease check your tool usage and try again with the correct parameters."

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Zeeeepa - I've reviewed your changes - here's some feedback:

Overall Comments:

  • It looks like the render methods are being replaced with render_as_string and a new render method that returns a ToolMessage - can we rename the old render methods to to_string to be more clear about their purpose?
  • It looks like all the tools are now taking a tool_call_id as input - can we use a dependency injection framework to avoid having to pass this in manually?
Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +73 to +79
if self.status == "error":
return ToolMessage(
content=content,
status=self.status,
tool_call_id=tool_call_id,
)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): We've found these issues:

Suggested change
if self.status == "error":
return ToolMessage(
content=content,
status=self.status,
tool_call_id=tool_call_id,
)

Comment on lines 70 to +71
for match in self.matches:
lines.append(match.render())
lines.append(match.render_as_string())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): Replace a for append loop with list extend (for-append-to-extend)

Suggested change
for match in self.matches:
lines.append(match.render())
lines.append(match.render_as_string())
lines.extend(match.render_as_string() for match in self.matches)

Comment on lines +155 to +157
lines.append(result.render_as_string())
lines.append("") # Add blank line between files

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): Merge consecutive list appends into a single extend (merge-list-appends-into-extend)

Suggested change
lines.append(result.render_as_string())
lines.append("") # Add blank line between files
lines.extend((result.render_as_string(), ""))

Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix undefined variable reference

The variable exception is used but error_msg was defined earlier in the code.
This will cause a NameError when an error occurs. The variable name should be
consistent.

src/codegen/extensions/langchain/graph.py [457-458]

 # For other types of errors
-return f"Error executing tool: {exception!s}\n\nPlease check your tool usage and try again with the correct parameters."
+return f"Error executing tool: {error_msg}\n\nPlease check your tool usage and try again with the correct parameters."
  • Apply this suggestion
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies a critical bug where the variable exception is used but doesn't exist in the scope. This would cause a runtime error when an exception occurs. The PR changed the variable name from error_msg to exception in one place but not the other.

High
Fix parameter order mismatch

The parameter order in the function definition doesn't match the order used in
the SemanticEditInput class. The tool_call_id parameter is defined as the second
parameter, but in the input model it's the third parameter after filepath and
edit_content.

src/codegen/extensions/tools/semantic_edit.py [364-367]

-def _run(self, filepath: str, tool_call_id: str, edit_content: str, start: int = 1, end: int = -1) -> ToolMessage:
+def _run(self, filepath: str, edit_content: str, tool_call_id: str, start: int = 1, end: int = -1) -> ToolMessage:
     # Create the the draft editor mini llm
     result = semantic_edit(self.codebase, filepath, edit_content, start=start, end=end)
     return result.render(tool_call_id)

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies a parameter order mismatch between the method definition and the input model class. This inconsistency would cause runtime errors when the tool is called, as parameters would be passed in the wrong order.

Medium
Fix return type annotation

The _run method signature doesn't match the parameter order in the function
call. The tool_call_id parameter should be passed as the third argument to
result.render(), but it's being used as the second argument in the function
definition.

src/codegen/extensions/tools/edit_file.py [191-193]

-def _run(self, filepath: str, content: str, tool_call_id: str) -> str:
+def _run(self, filepath: str, content: str, tool_call_id: str) -> ToolMessage:
     result = edit_file(self.codebase, filepath, content)
     return result.render(tool_call_id)

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies that the return type annotation should be ToolMessage instead of str to match the actual return type from the render method. This improves type safety and code consistency.

Medium
  • More

Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix parameter order mismatch

The parameter order in the _run method doesn't match the SemanticEditInput class
definition. In the input class, tool_call_id is the last parameter, but in the
_run method it's the second parameter. This will cause parameter mismatch errors
when the tool is called.

src/codegen/extensions/tools/semantic_edit.py [364-367]

-def _run(self, filepath: str, tool_call_id: str, edit_content: str, start: int = 1, end: int = -1) -> ToolMessage:
+def _run(self, filepath: str, edit_content: str, start: int = 1, end: int = -1, tool_call_id: str = None) -> ToolMessage:
     # Create the the draft editor mini llm
     result = semantic_edit(self.codebase, filepath, edit_content, start=start, end=end)
     return result.render(tool_call_id)

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies a critical parameter order mismatch between the SemanticEditInput class and the _run method. This would cause errors when the tool is called, as the parameters would be passed in the wrong order.

High
Handle None tool_calls attribute

The condition is checking if result.tool_calls is empty, but doesn't handle the
case where tool_calls might be None. This could cause an AttributeError if
tool_calls doesn't exist on the AIMessage object.

src/codegen/extensions/langchain/graph.py [103-105]

-if isinstance(result, AIMessage) and not result.tool_calls:
+if isinstance(result, AIMessage) and not getattr(result, 'tool_calls', None):
     updated_messages = [*messages, result]
     return {"messages": updated_messages, "final_answer": result.content}
  • Apply this suggestion
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies a potential AttributeError if the AIMessage object doesn't have a tool_calls attribute. Using getattr with a default value is a robust way to handle this edge case.

Medium
Fix parameter order issue

The function signature has incorrect parameter order. The tool_call_id parameter
is at the end, but in the implementation of render() it's expected as the first
parameter. This mismatch will cause errors when the tool is called.

src/codegen/extensions/tools/edit_file.py [191-193]

-def _run(self, filepath: str, content: str, tool_call_id: str) -> str:
+def _run(self, filepath: str, content: str, tool_call_id: str) -> ToolMessage:
     result = edit_file(self.codebase, filepath, content)
     return result.render(tool_call_id)

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 3

__

Why: The suggestion correctly identifies the return type should be ToolMessage instead of str, but the parameter order issue is incorrect. The parameters are in the correct order in the code.

Low
  • More

Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Status
Readability Typo in Replace Artifacts Class Name ▹ view
Performance Inefficient Path String Construction ▹ view
Files scanned
File Path Reviewed
src/codegen/agents/data.py
src/codegen/extensions/tools/edit_file.py
src/codegen/extensions/tools/observation.py
src/codegen/agents/tracer.py
src/codegen/extensions/tools/relace_edit.py
src/codegen/extensions/tools/tool_output_types.py
src/codegen/extensions/tools/view_file.py
src/codegen/extensions/tools/semantic_edit.py
src/codegen/extensions/tools/list_directory.py
src/codegen/extensions/tools/search.py
src/codegen/extensions/langchain/graph.py
src/codegen/extensions/langchain/tools.py

Explore our documentation to understand the languages and file types we support and the files we ignore.

Check out our docs on how you can make Korbit work best for you and your team.

Loving Korbit!? Share us on LinkedIn Reddit and X

error: Optional[str] # Error message (only present on error)


class RelaceEditArtifacts(TypedDict, total=False):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in Replace Artifacts Class Name category Readability

Tell me more
What is the issue?

There is a typo in the class name 'RelaceEditArtifacts' which should be 'ReplaceEditArtifacts'.

Why this matters

This typo could cause confusion in the codebase and make it difficult for other developers to find and use the correct class name. It could also lead to bugs if code tries to reference the correct spelling.

Suggested change ∙ Feature Preview

Rename the class to fix the typo:

class ReplaceEditArtifacts(TypedDict, total=False):
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.


if self.files is not None:
artifacts["files"] = self.files
artifacts["file_paths"] = [f"{self.path}/{f}" for f in self.files]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inefficient Path String Construction category Performance

Tell me more
What is the issue?

String concatenation in a list comprehension for file paths could be inefficient for large directories.

Why this matters

For directories with many files, creating new strings through concatenation in a loop can lead to excessive memory allocations.

Suggested change ∙ Feature Preview

Use os.path.join or pathlib for more efficient path construction:

from pathlib import Path
artifacts["file_paths"] = [str(Path(self.path) / f) for f in self.files]
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants