Self-healing tool-calls on invalid format #1772

Kitenite · 2025-04-12T00:09:29Z

Description

Related Issues

Type of Change

Testing

Screenshots (if applicable)

Additional Notes

Important

Adds self-healing for invalid tool call parameters in LlmManager using experimental_repairToolCall.

Behavior:
- Adds experimental_repairToolCall method in LlmManager to handle invalid tool call parameters by attempting to repair them using generateObject.
- Logs an error and returns null for invalid tool names using NoSuchToolError.
Error Handling:
- Enhances error handling in stream() method to manage invalid tool call parameters.
- Logs warnings for invalid parameters and attempts to fix them.
Misc:
- Imports NoSuchToolError in index.ts.

^{This description was created by}^{for 5d5e8ca. It will automatically update as commits are pushed.}

ellipsis-dev

👍 Looks good to me! Reviewed everything up to a332d91 in 2 minutes and 11 seconds

More details

Looked at 49 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 8 drafted comments based on config settings.

1. apps/studio/electron/main/chat/index.ts:114

Draft comment:
Check if 'tool' is defined before accessing tool.parameters to avoid potential runtime errors.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
The code is using TypeScript's keyof operator which provides type safety. The tools parameter comes from chatToolSet which is imported and used in the streamText call. Since this is a TypeScript error handler specifically for tool calls, tools should always contain valid tools. The NoSuchToolError case is already handled above.
I could be wrong about the type safety - there could be runtime cases where the toolName doesn't match despite the TypeScript types. The error handler is meant to handle invalid cases after all.
The code is specifically in an error handler for parameter validation, not tool existence. Invalid tool names are explicitly handled by the NoSuchToolError check above. The TypeScript types provide sufficient safety here.
The comment should be deleted. The code already handles invalid tool cases appropriately through TypeScript types and the NoSuchToolError check.

2. apps/studio/electron/main/chat/index.ts:116

Draft comment:
Replace console.log with a proper logging framework for production environments.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
While using a proper logging framework is generally good practice, this seems like debug logging in an experimental feature. The file already uses console.log elsewhere (L176, L215, L286) so this would be inconsistent to change just one instance. A full logging framework change would need to be a larger refactor across the codebase.
The comment identifies a real code quality issue. Having proper logging is important for production applications.
However, changing just one console.log instance would be inconsistent, and a proper logging framework should be implemented systematically across the codebase, not just in this one spot.
Delete the comment. While the suggestion has merit, it should be addressed as a larger systematic change rather than a one-off fix in an experimental feature.

3. apps/studio/electron/main/chat/index.ts:133

Draft comment:
Verify that returning the repaired args as a JSON string matches the expected format for toolCall.args.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 50%
This comment is asking the PR author to verify something, which is against the rules. It doesn't provide a specific suggestion or point out a clear issue with the code. Therefore, it should be removed.

4. apps/studio/electron/main/chat/index.ts:101

Draft comment:
PR description vs. changes mismatch: The issue mentions onboarding docs, but the diff only adds self-healing tool-call logic.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

5. apps/studio/electron/main/chat/index.ts:114

Draft comment:
Consider adding a guard to ensure 'tool' is defined before using it.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
The code is already type-safe through TypeScript's type system. The keyof typeof tools type assertion ensures toolCall.toolName is a valid key. NoSuchToolError handling covers the case of invalid tool names. An additional runtime check would be redundant given these safeguards.
I could be wrong about the TypeScript type safety - there might be edge cases where tools could be modified at runtime that TypeScript can't catch.
Even if there are runtime edge cases, the NoSuchToolError handling already provides the necessary safety net for invalid tool access.
The comment should be deleted as the code already has sufficient safeguards through TypeScript types and error handling.

6. apps/studio/electron/main/chat/index.ts:116

Draft comment:
Consider using a more robust logging mechanism instead of console.log for production diagnostics.
Reason this comment was not posted:
Confidence changes required: 50% <= threshold 50%
None

7. apps/studio/electron/main/chat/index.ts:120

Draft comment:
Wrap the generateObject async call in a try/catch block to prevent unhandled errors from crashing the stream.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
The generateObject() call is already within comprehensive error handling. The outer try/catch block handles various error types including RetryError, DOMException, and AbortError. Any error from generateObject() would propagate to this outer error handler. Adding another try/catch would be redundant and add unnecessary nesting.
I could be wrong about how errors propagate through the async function chain. There might be a specific reason why errors from generateObject need to be handled separately.
The existing error handling is comprehensive and covers all error cases. The experimental_repairToolCall function is meant to attempt a repair and if it fails, the outer error handling will catch it appropriately.
The comment should be deleted as the code already has sufficient error handling through the outer try/catch block.

8. apps/studio/electron/main/chat/index.ts:133

Draft comment:
Verify that stringifying 'repairedArgs' matches downstream format expectations (avoid potential double-stringification issues).
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 50%
The comment is asking the PR author to verify the behavior of stringifying repairedArgs, which violates the rule against asking the author to confirm or verify intentions. It doesn't provide a specific suggestion or point out a clear issue.

Workflow ID: wflow_lsyeZIhykNrMuidM

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on 5d5e8ca in 1 minute and 35 seconds

More details

Looked at 18 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 6 drafted comments based on config settings.

1. apps/studio/electron/main/chat/index.ts:112

Draft comment:
Using console.error for logging an invalid tool name might be too severe if this case isn’t fatal. Consider if a warning level is more appropriate.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
This is a case where an invalid tool name is encountered, which is a genuine error condition. The code explicitly returns null and gives up on fixing it, indicating this is treated as a hard error. Using console.error seems appropriate here since it represents a real failure case. The next line of code uses console.warn for a different case where the code attempts recovery.
The comment author may have a point that this isn't a fatal error since the code handles it gracefully by returning null. The distinction between error and warning levels could be subjective.
While not fatal, this represents a real error condition where recovery is explicitly not attempted, making console.error appropriate. The code's use of console.warn vs error appears intentionally chosen to distinguish between recoverable and non-recoverable cases.
The current error level appears intentional and appropriate for this non-recoverable error case. The comment should be deleted.

2. apps/studio/electron/main/chat/index.ts:117

Draft comment:
Changed console.log to console.warn for invalid parameters. Confirm that the warning level is consistent with your logging strategy.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50%
None

3. apps/studio/electron/main/chat/index.ts:112

Draft comment:
Check if the PR title and description match the code changes. The description references documentation improvements, but the diff addresses tool-call self-healing.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 50%
This comment is asking the PR author to check if the PR title and description match the code changes, which is not allowed according to the rules. The rules specifically state not to ask the PR author to update the PR description or assume that the PR description should be filled out. Therefore, this comment should be removed.

4. apps/studio/electron/main/chat/index.ts:111

Draft comment:
When a tool name is invalid, consider logging the full error details (or more context) along with toolCall.toolName to aid debugging.
Reason this comment was not posted:
Confidence changes required: 50% <= threshold 50%
None

5. apps/studio/electron/main/chat/index.ts:117

Draft comment:
Changing console.log to console.warn for invalid parameter logging is a good improvement to signal a warning; ensure this level suits the importance of the repair action.
Reason this comment was not posted:
Confidence changes required: 30% <= threshold 50%
None

6. apps/studio/electron/main/chat/index.ts:105

Draft comment:
The PR description and linked issue mention onboarding docs, but this diff only addresses self-healing tool calls. Please ensure the PR details match the code changes.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_npLPLv966q44s75v

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Kitenite added 2 commits April 11, 2025 17:03

Self-healing toolcall

b1847c7

Clean up

a332d91

ellipsis-dev bot reviewed Apr 12, 2025

View reviewed changes

Better logging

5d5e8ca

Kitenite merged commit 093b26f into main Apr 12, 2025

Kitenite deleted the feat/better-toolcall branch April 12, 2025 00:21

ellipsis-dev bot reviewed Apr 12, 2025

View reviewed changes

ml-delaurier pushed a commit to ml-delaurier/nolook that referenced this pull request Apr 23, 2025

Self-healing tool-calls on invalid format (onlook-dev#1772)

b5144f9

t1c1 pushed a commit to t1c1/onlookbotcodes that referenced this pull request Jun 5, 2025

Self-healing tool-calls on invalid format (onlook-dev#1772)

0fb2b3e

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Self-healing tool-calls on invalid format #1772

Self-healing tool-calls on invalid format #1772

Uh oh!

Kitenite commented Apr 12, 2025 •

edited by ellipsis-dev bot

Loading

Uh oh!

ellipsis-dev bot left a comment

Uh oh!

ellipsis-dev bot left a comment

Uh oh!

Uh oh!

Self-healing tool-calls on invalid format #1772

Self-healing tool-calls on invalid format #1772

Uh oh!

Conversation

Kitenite commented Apr 12, 2025 • edited by ellipsis-dev bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related Issues

Type of Change

Testing

Screenshots (if applicable)

Additional Notes

Uh oh!

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

Uh oh!

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Kitenite commented Apr 12, 2025 •

edited by ellipsis-dev bot

Loading