-
Notifications
You must be signed in to change notification settings - Fork 607
Fix libflatccrt race #10918
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix libflatccrt race #10918
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10918
Note: Links to docs will display an error until the docs builds have been completed. ❌ 2 New Failures, 1 Pending, 1 Unrelated FailureAs of commit 6a12fa8 with merge base d0464f8 ( NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following job failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
26e9d75
to
fb6ae9e
Compare
8899a8e
to
4b3b4ab
Compare
ac72cc1
to
4ccdcc6
Compare
bash backends/qualcomm/scripts/build.sh --skip_aarch64 --job_number 2 --release || \ | ||
bash backends/qualcomm/scripts/build.sh --skip_aarch64 --job_number 2 --release --no_clean |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change related?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see the comment right above — this was the way the race was mitigated before in #7570
if(CMAKE_BUILD_TYPE MATCHES "Debug") | ||
set(FLATCC_LIB flatccrt_d) | ||
else() | ||
set(FLATCC_LIB flatccrt) | ||
endif() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you confirmed that debug build will also build flatccrt
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it seems to have both libs
### Summary Seems like there is a race in building `libflatccrt.a`. This issue has existed for a while: pytorch#7300. It was temporarily mitigated in pytorch#7570 by just reducing the parallelism. In this diff I attempt to fix it. This is just my assumption of what is wrong. Given flatccrt builds a debug version with a `_d` suffix, if the target isn't depended on (i.e. some target don't use the conditional target name) then the order of how the lib is built causes a race. So for now, always use the non-debug version. Given it's a race, I was never able to repro the issue locally — I can't guarantee this is the problem. However, it seems my recent changes in pytorch#10855 has increased the frequency of the problem in CI. ### Test plan CI cc @larryliu0820
@jathu, looks like this causing issue to build qnn backend in debug mode. Looks like the script to build qnn sdk only covers release mode though |
Summary
Seems like there is a race in building
libflatccrt.a
. This issue has existed for a while: #7300. It was temporarily mitigated in #7570 by just reducing the parallelism.In this diff I attempt to fix it. This is just my assumption of what is wrong. Given flatccrt builds a debug version with a
_d
suffix, if the target isn't depended on (i.e. some target don't use the conditional target name) then the order of how the lib is built causes a race. So for now, always use the non-debug version.Given it's a race, I was never able to repro the issue locally — I can't guarantee this is the problem. However, it seems my recent changes in #10855 has increased the frequency of the problem in CI.
Test plan
CI
cc @larryliu0820