[MC][profgen][NFC] Expand auto for MCDecodedPseudoProbe #102788
Conversation
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4
@llvm/pr-subscribers-mc @llvm/pr-subscribers-pgo

Author: Amir Ayupov (aaupov)

Changes: Expand autos in select places to trigger type casts.

Full diff: https://github.com/llvm/llvm-project/pull/102788.diff

2 Files Affected:
diff --git a/llvm/lib/MC/MCPseudoProbe.cpp b/llvm/lib/MC/MCPseudoProbe.cpp
index bb9538d55b7aa..e9e391c8eadd7 100644
--- a/llvm/lib/MC/MCPseudoProbe.cpp
+++ b/llvm/lib/MC/MCPseudoProbe.cpp
@@ -630,7 +630,7 @@ void MCPseudoProbeDecoder::printProbeForAddress(raw_ostream &OS,
uint64_t Address) {
auto It = Address2ProbesMap.find(Address);
if (It != Address2ProbesMap.end()) {
- for (auto &Probe : It->second) {
+ for (const MCDecodedPseudoProbe &Probe : It->second) {
OS << " [Probe]:\t";
Probe.print(OS, GUID2FuncDescMap, true);
}
@@ -657,7 +657,7 @@ MCPseudoProbeDecoder::getCallProbeForAddr(uint64_t Address) const {
const auto &Probes = It->second;
const MCDecodedPseudoProbe *CallProbe = nullptr;
- for (const auto &Probe : Probes) {
+ for (const MCDecodedPseudoProbe &Probe : Probes) {
if (Probe.isCall()) {
// Disabling the assert and returning first call probe seen so far.
// Subsequent call probes, if any, are ignored. Due to the the way
diff --git a/llvm/tools/llvm-profgen/ProfileGenerator.cpp b/llvm/tools/llvm-profgen/ProfileGenerator.cpp
index 175556c2220e6..c840e7789edfe 100644
--- a/llvm/tools/llvm-profgen/ProfileGenerator.cpp
+++ b/llvm/tools/llvm-profgen/ProfileGenerator.cpp
@@ -1194,7 +1194,7 @@ void ProfileGeneratorBase::extractProbesFromRange(
Binary->getAddress2ProbesMap();
auto It = Address2ProbesMap.find(IP.Address);
if (It != Address2ProbesMap.end()) {
- for (const auto &Probe : It->second) {
+ for (const MCDecodedPseudoProbe &Probe : It->second) {
ProbeCounter[&Probe] += Count;
}
}
Review thread on `llvm/lib/MC/MCPseudoProbe.cpp`, in `MCPseudoProbeDecoder::getCallProbeForAddr`, at the line `for (const auto &Probe : Probes) {`:
> Expand autos in select places to trigger type casts.

Can you explain? `auto` is syntactic sugar. `Probes` is of type `std::list<MCDecodedPseudoProbe>`, so the deduced type for `auto` is `MCDecodedPseudoProbe`. What extra type casting do you expect by changing from `auto` to the explicit type?
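For reference, a minimal sketch of the deduction described above, using a hypothetical stand-in `Probe` struct rather than the real `MCDecodedPseudoProbe`: with a `std::list` of values, `auto` and the explicit element type bind identically.

```cpp
#include <iostream>
#include <list>

// Hypothetical stand-in for MCDecodedPseudoProbe, just to show the deduction.
struct Probe {
  bool Call = false;
  bool isCall() const { return Call; }
};

int main() {
  std::list<Probe> Probes{{true}, {false}};

  // With std::list<Probe>, `auto` deduces to Probe, so both loops are equivalent.
  for (const auto &P : Probes)
    std::cout << P.isCall() << '\n';
  for (const Probe &P : Probes)
    std::cout << P.isCall() << '\n';
}
```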
In #102789, `Probes` (when accessed via `Address2ProbesMap`) becomes `std::vector<std::reference_wrapper<MCDecodedPseudoProbe>>`. `auto` would expand to a `reference_wrapper`, so accessing the underlying pseudo probe would require an explicit `get`. To avoid that, we can rely on `reference_wrapper::operator T&`, which is functionally identical to `get`. This is achieved by giving the range-based loop variable the type `MCDecodedPseudoProbe &`.

https://en.cppreference.com/w/cpp/utility/functional/reference_wrapper/get
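A minimal sketch of that behavior, again with a hypothetical stand-in `Probe` struct in place of `MCDecodedPseudoProbe` (the container setup here is illustrative, not the actual #102789 code):

```cpp
#include <functional>
#include <iostream>
#include <vector>

// Hypothetical stand-in for MCDecodedPseudoProbe.
struct Probe {
  int Index = 0;
  bool isCall() const { return Index % 2 == 0; }
};

int main() {
  std::vector<Probe> Storage{{0}, {1}, {2}};

  // After #102789, map values are reference_wrappers into shared storage.
  std::vector<std::reference_wrapper<Probe>> Probes(Storage.begin(),
                                                    Storage.end());

  // With auto, the loop variable is a reference_wrapper, so members need get():
  for (const auto &P : Probes)
    std::cout << P.get().isCall() << '\n';

  // With the explicit element type, reference_wrapper::operator T& is used
  // implicitly, and the loop body reads the same as before the change:
  for (const Probe &P : Probes)
    std::cout << P.isCall() << '\n';
}
```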
OK, then this change should be part of #102789 and should not be separated. Otherwise the motivation and effect of this change are lost, because the base side of this change still uses `std::list`, and the change description saying "trigger type casts" can be confusing.

I suggest you merge it back (or at least fix the change description).
> OK, then this change should be part of #102789 and should not be separated. Otherwise the motivation and effect of this change are lost, because the base side of this change still uses `std::list`, and the change description saying "trigger type casts" can be confusing. I suggest you merge it back (or at least fix the change description).

I intended to minimize the changes in #102789 and make all preparation changes small, isolated, and verifiably NFC. I've changed the description; I hope it makes sense now.
Looks good, thanks.
> I intended to minimize the changes in #102789 and make all preparation changes small, isolated, and verifiably NFC.
You might have gone a bit too far. :) And this isn't the first instance.
Not a big deal, but I still thought it was worth clarifying. Generally, the purpose of splitting a patch isn't to split things into individual patches as much as possible, but to help reviewers: 1) to make sure a patch is manageable in size for review; 2) to group changes with a logical connection/separation (components/steps). The other change isn't that big anyway, and evidently splitting here actually made it more difficult for reviewers instead of helping them.
Thank you for clarifying. I'm used to operating under the assumption of extremely limited reviewer time, hence splitting into minimal/standalone changes. If you're fine with slightly larger and more loosely connected changes, that's also good with me.
Your assumption is right, and you do what's needed to help the reviewer. More often than not splitting helps, but not always. It's not about personal preference, but about what helps the reviewer.
…unction records

Use #102774 to allocate storage for decoded probes (`PseudoProbeVec`) and function records (`InlineTreeVec`). Leverage that to also shrink sizes of `MCDecodedPseudoProbe`:
- Drop Guid since it's accessible via `InlineTree`.

`MCDecodedPseudoProbeInlineTree`:
- Keep track of probes and inlinees using `ArrayRef`s now that probes and function records belonging to the same function are allocated contiguously.

This reduces peak RSS from 13.7 GiB to 9.7 GiB and pseudo probe parsing time (as part of perf2bolt) from 15.3 s to 9.6 s for a large binary with a 400 MiB .pseudo_probe section containing 43M probes and 25M function records.

Depends on: #102774 #102787 #102788

Reviewers: maksfb, rafaelauler, dcci, ayermolo, wlei-llvm

Reviewed By: wlei-llvm

Pull Request: #102789
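As a rough illustration of the storage layout that commit message describes: each inline-tree node keeps lightweight views into contiguously allocated probe and node storage instead of owning per-node containers. The stand-in types and field names below are assumptions, and `std::span` (C++20) stands in for `llvm::ArrayRef`; this is not the actual LLVM code.

```cpp
#include <cstdint>
#include <iostream>
#include <span>
#include <vector>

// Hypothetical stand-ins for the decoded probe and inline-tree node; the real
// classes live in llvm/MC/MCPseudoProbe.h and carry more state.
struct DecodedProbe {
  uint64_t Address;
  uint32_t Index;
};

struct InlineTreeNode {
  // Instead of per-node owning containers, each node holds views into shared,
  // contiguously allocated storage (ArrayRef in the real code).
  std::span<const DecodedProbe> Probes;
  std::span<const InlineTreeNode> Inlinees;
};

int main() {
  // All decoded probes in one flat buffer; probes belonging to the same
  // function are adjacent, so a node's probes form a contiguous slice.
  std::vector<DecodedProbe> ProbeStorage{
      {0x1000, 1}, {0x1010, 2}, // top-level function
      {0x2000, 1}};             // an inlined callee

  std::vector<InlineTreeNode> NodeStorage(2);
  NodeStorage[1] = {std::span(ProbeStorage).subspan(2, 1), {}};
  NodeStorage[0] = {std::span(ProbeStorage).subspan(0, 2),
                    std::span(NodeStorage).subspan(1, 1)};

  for (const DecodedProbe &P : NodeStorage[0].Probes)
    std::cout << std::hex << P.Address << '\n';
}
```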
Expand autos in select places in preparation for #102789.