Skip to content

Remove false positive error message in the executor_runner #7170

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 17, 2025

Conversation

ykhrustalev
Copy link
Contributor

@ykhrustalev ykhrustalev commented Dec 4, 2024

Summary

There are false positive error messages from the type check that attempts to verify several allowed types at once, but uses one function call per type leading to some of them not passing the check and emitting an error message.

> ./executor_runner --model-path ~/model.pte
...
I 00:00:00.519929 executorch:executor_runner.cpp:91] Using method forward
I 00:00:00.519931 executorch:executor_runner.cpp:138] Setting up planned buffer 0, size 10753549312.
I 00:00:01.212589 executorch:executor_runner.cpp:161] Method loaded.
I 00:00:01.213544 executorch:executor_runner.cpp:171] Inputs prepared.
E 00:00:02.320414 executorch:tensor_util.h:481] Expected to find Float type, but tensor has type Half
E 00:00:02.320521 executorch:tensor_util.h:487] Check failed (t.scalar_type() == dtype): Expected to find Float type, but tensor has type Half
...
I 00:00:24.911098 executorch:executor_runner.cpp:180] Model executed successfully.
...

Note the repeated lines

Check failed (t.scalar_type() == dtype): Expected to find Float type, but tensor has type Half

Test plan

I can only provide the log message from my local run for that

...
I 00:00:00.745945 executorch:executor_runner.cpp:138] Setting up planned buffer 0, size 13026185216.
I 00:00:02.139255 executorch:executor_runner.cpp:161] Method loaded.
I 00:00:02.139295 executorch:executor_runner.cpp:171] Inputs prepared.
I 00:00:28.297681 executorch:executor_runner.cpp:180] Model executed successfully.
...

Copy link

pytorch-bot bot commented Dec 4, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7170

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit b838755 with merge base c5fea7e (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Dec 4, 2024
@ykhrustalev ykhrustalev force-pushed the ykhrustalev/false-error branch from 12a1b0a to 0225f2f Compare December 4, 2024 05:35
dbort
dbort previously requested changes Dec 4, 2024
Copy link
Contributor

@dbort dbort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would definite be good to avoid printing log statements in the success path like this: ideally, ET will never print logs during successful execution.

@manuelcandales what do you think about this interface?

@@ -484,6 +486,30 @@ inline bool tensor_is_type(
return true;
}

inline bool tensor_is_type(
executorch::aten::Tensor t,
const std::vector<executorch::aten::ScalarType>& dtypes) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runtime must not use streams, vectors or any other type that allocates memory dynamically: see https://pytorch.org/executorch/main/portable-cpp-programming.html for some commentary on this.

In this case you may be able to use a Span<ScalarType> since the container size is known at construction time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noted, I will use the Span container

return true;
}

std::stringstream dtype_ss;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not safe to assemble a dynamic string here. If you wanted to print all of the types, printing individual log lines would be the safest and easiest. If you wanted to assemble a string, you could create a fixed-size buffer on the stack and strncat strings onto it, but that can be error-prone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I was thinking about using the c-ctyle buffer for that, I will update

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to go a simple way that provides an explicit override without any loops.

The use of strncat will lead to a warning https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83404 which I think is not desired.

@dbort , hope you find that version suitable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dbort surficing the thread on top.

@dbort dbort requested a review from manuelcandales December 4, 2024 21:55
@ykhrustalev ykhrustalev requested a review from dbort December 5, 2024 05:55
@manuelcandales manuelcandales added the release notes: ops & kernels Changes to the opset and any new / changed kernel implementations label Jan 29, 2025
@dbort dbort dismissed their stale review January 29, 2025 23:25

obsolete

@shoumikhin
Copy link
Contributor

@manuelcandales can you take a look please how shall we proceed?

@byjlw byjlw merged commit 37265a1 into pytorch:main Mar 17, 2025
5 of 6 checks passed
@ykhrustalev ykhrustalev deleted the ykhrustalev/false-error branch March 17, 2025 17:27
DannyYuyang-quic pushed a commit to CodeLinaro/executorch that referenced this pull request Apr 2, 2025
)

### Summary

There are false positive error messages from the type check that
attempts to verify several allowed types at once, but uses one function
call per type leading to some of them not passing the check and emitting
an error message.

```
> ./executor_runner --model-path ~/model.pte
...
I 00:00:00.519929 executorch:executor_runner.cpp:91] Using method forward
I 00:00:00.519931 executorch:executor_runner.cpp:138] Setting up planned buffer 0, size 10753549312.
I 00:00:01.212589 executorch:executor_runner.cpp:161] Method loaded.
I 00:00:01.213544 executorch:executor_runner.cpp:171] Inputs prepared.
E 00:00:02.320414 executorch:tensor_util.h:481] Expected to find Float type, but tensor has type Half
E 00:00:02.320521 executorch:tensor_util.h:487] Check failed (t.scalar_type() == dtype): Expected to find Float type, but tensor has type Half
...
I 00:00:24.911098 executorch:executor_runner.cpp:180] Model executed successfully.
...
```

Note the repeated lines
> Check failed (t.scalar_type() == dtype): Expected to find Float type,
but tensor has type Half

### Test plan

I can only provide the log message from my local run for that
```
...
I 00:00:00.745945 executorch:executor_runner.cpp:138] Setting up planned buffer 0, size 13026185216.
I 00:00:02.139255 executorch:executor_runner.cpp:161] Method loaded.
I 00:00:02.139295 executorch:executor_runner.cpp:171] Inputs prepared.
I 00:00:28.297681 executorch:executor_runner.cpp:180] Model executed successfully.
...
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. release notes: ops & kernels Changes to the opset and any new / changed kernel implementations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants