You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[BOLT] Set call to continuation count in pre-aggregated profile
#109683 identified an issue with pre-aggregated profile where a call to
continuation fallthrough edge count is missing (profile discontinuity).
This issue only affects pre-aggregated profile but not perf data since
LBR stack has the necessary information to determine if the trace (fall-
through) starts at call continuation, whereas pre-aggregated fallthrough
lacks this information.
The solution is to look at branch records in pre-aggregated profiles
that correspond to returns and assign counts to call to continuation
fallthrough:
- BranchFrom is in another function or DSO,
- BranchTo may be a call continuation site:
- not an entry point/landing pad.
Note that we can't directly check if BranchFrom corresponds to a return
instruction if it's in external DSO.
Keep call continuation handling for perf data (`getFallthroughsInTrace`)
[1] as-is due to marginally better performance. The difference is that
return-converted call to continuation fallthrough is slightly more
frequent than other fallthroughs since the former only requires one LBR
address while the latter need two that belong to the profiled binary.
Hence return-converted fallthroughs have larger "weight" which affects
code layout.
[1] `DataAggregator::getFallthroughsInTrace`
https://github.com/llvm/llvm-project/blob/fea18afeed39fe4435d67eee1834f0f34b23013d/bolt/lib/Profile/DataAggregator.cpp#L906-L915
Test Plan: added callcont-fallthru.s
Reviewers: maksfb, ayermolo, ShatianWang, dcci
Reviewed By: maksfb, ShatianWang
Pull Request: #109486
0 commit comments