Work around flatccrt issue in CI script #7570
When running build-qnn-sdk.sh, libflatccrt.a cannot be found in certain circumstances. To work around this, we can build the QNN backend twice, the second time with the option `--no_clean`, to make sure the library is found.

Resolves: pytorch#7300
Change-Id: I47e14f1fa318538587b848ee02240f7867c88f50
Signed-off-by: Benjamin Klimczak <[email protected]>
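The build-twice workaround described above could look roughly like the sketch below. `build_qnn_backend` is a hypothetical stand-in for whatever command the CI script actually runs to build the QNN backend; only the `--no_clean` option comes from this PR. Tolerating a failed first pass is also an assumption, not something the description confirms.

```shell
# Sketch only: build_qnn_backend is a hypothetical placeholder for the real
# QNN backend build command. Only the --no_clean flag is taken from the PR.
build_qnn_backend() {
  # Replace this body with the actual build invocation; here we only log the call.
  echo "build_qnn_backend $*"
}

build_twice() {
  # First pass: regular build. Assumption: a failure is tolerated here,
  # because libflatccrt.a may not be found on this pass.
  build_qnn_backend "$@" || echo "first pass failed, retrying with --no_clean" >&2
  # Second pass: keep the existing build tree so libraries produced by the
  # first pass are still present when they are looked up.
  build_qnn_backend "$@" --no_clean
}
```

Calling `build_twice` with the usual build arguments runs the build once as normal and once more with `--no_clean` appended.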
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7570
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit cdfbad3 with merge base b16271c.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Hi @dbort and @Olivia-liu. Could one of you please have a look to unblock #5027?
@Olivia-liu can we dig into this issue and figure out the root cause?
### Summary
There seems to be a race in building `libflatccrt.a`. This issue has existed for a while: #7300. It was temporarily mitigated in #7570 by just reducing the parallelism. In this diff I attempt to fix it.

This is just my assumption of what is wrong. flatccrt builds a debug version with a `_d` suffix, so if the target isn't depended on (i.e. some targets don't use the conditional target name), the order in which the lib is built causes a race. So for now, always use the non-debug version.

Since it's a race, I was never able to repro the issue locally, so I can't guarantee this is the problem. However, it seems my recent changes in #10855 have increased the frequency of the problem in CI.

### Test plan
CI

cc @larryliu0820
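Since the race is hard to reproduce, it can help to check which flatccrt variant a given build tree actually contains. The sketch below is only a diagnostic aid. `find_flatccrt` is a hypothetical helper that is not part of the repository, and the `libflatccrt_d.a` file name is inferred from the `_d` suffix mentioned above.

```shell
# Sketch only: find_flatccrt is a hypothetical helper, not part of the repo.
# It lists any flatccrt static library under the given build directory,
# matching both the release name and the assumed debug "_d" variant.
find_flatccrt() {
  find "$1" -type f \( -name 'libflatccrt.a' -o -name 'libflatccrt_d.a' \) 2>/dev/null
}
```

Running `find_flatccrt <build-dir>` after a failed CI build shows whether the library is missing entirely or was only built under the other name.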