[llvm-exegesis][X86] Groups ports 2,3, and 11 for Golden Cove #115645

Merged

Conversation

boomanaiden154
Contributor

This patch updates the PFM counter mappings for Sapphire Rapids and Alder Lake (P-cores) to group ports 2, 3, and 11, despite the naming of the performance counters. This grouping is what the scheduling models within LLVM assume, and the counter name seems to be a mistake in the Intel perfmon documentation.

Fixes #113941.

@llvmbot
Member

llvmbot commented Nov 10, 2024

@llvm/pr-subscribers-backend-x86

Author: Aiden Grossman (boomanaiden154)

Changes

This patch updates the PFM counter mappings for Sapphire Rapids and Alder Lake (P-cores) to group ports 2, 3, and 11, despite the naming of the performance counters. This grouping is what the scheduling models within LLVM assume, and the counter name seems to be a mistake in the Intel perfmon documentation.

Fixes #113941.


Full diff: https://github.com/llvm/llvm-project/pull/115645.diff

3 Files Affected:

  • (modified) llvm/lib/Target/X86/X86PfmCounters.td (+8-2)
  • (modified) llvm/lib/Target/X86/X86SchedAlderlakeP.td (-1)
  • (modified) llvm/lib/Target/X86/X86SchedSapphireRapids.td (-1)
diff --git a/llvm/lib/Target/X86/X86PfmCounters.td b/llvm/lib/Target/X86/X86PfmCounters.td
index 38d8d19091e0fd..190dec2f4e391e 100644
--- a/llvm/lib/Target/X86/X86PfmCounters.td
+++ b/llvm/lib/Target/X86/X86PfmCounters.td
@@ -210,7 +210,10 @@ def AlderLakePfmCounters : ProcPfmCounters {
   let IssueCounters = [
     PfmIssueCounter<"ADLPPort00", "uops_dispatched_port:port_0">,
     PfmIssueCounter<"ADLPPort01", "uops_dispatched_port:port_1">,
-    PfmIssueCounter<"ADLPPort02_03_10", "uops_dispatched_port:port_2_3_10">,
+    // The perfmon documentation and thus libpfm seems to incorrectly label
+    // this performance counter, as ports 2,3, and 11 are actually grouped
+    // according to most documentation. See #113941 for additional details.
+    PfmIssueCounter<"ADLPPort02_03_11", "uops_dispatched_port:port_2_3_10">,
     PfmIssueCounter<"ADLPPort04_09", "uops_dispatched_port:port_4_9">,
     PfmIssueCounter<"ADLPPort05_11", "uops_dispatched_port:port_5_11">,
     PfmIssueCounter<"ADLPPort06", "uops_dispatched_port:port_6">,
@@ -226,7 +229,10 @@ def SapphireRapidsPfmCounters : ProcPfmCounters {
   let IssueCounters = [
     PfmIssueCounter<"SPRPort00", "uops_dispatched_port:port_0">,
     PfmIssueCounter<"SPRPort01", "uops_dispatched_port:port_1">,
-    PfmIssueCounter<"SPRPort02_03_10", "uops_dispatched_port:port_2_3_10">,
+    // The perfmon documentation and thus libpfm seems to incorrectly label
+    // this performance counter, as ports 2,3, and 11 are actually grouped
+    // according to most documentation. See #113941 for additional details.
+    PfmIssueCounter<"SPRPort02_03_11", "uops_dispatched_port:port_2_3_10">,
     PfmIssueCounter<"SPRPort04_09", "uops_dispatched_port:port_4_9">,
     PfmIssueCounter<"SPRPort05_11", "uops_dispatched_port:port_5_11">,
     PfmIssueCounter<"SPRPort06", "uops_dispatched_port:port_6">,
diff --git a/llvm/lib/Target/X86/X86SchedAlderlakeP.td b/llvm/lib/Target/X86/X86SchedAlderlakeP.td
index aec6906310d96b..f8c6b32a853be9 100644
--- a/llvm/lib/Target/X86/X86SchedAlderlakeP.td
+++ b/llvm/lib/Target/X86/X86SchedAlderlakeP.td
@@ -60,7 +60,6 @@ def ADLPPort01_05_10       : ProcResGroup<[ADLPPort01, ADLPPort05, ADLPPort10]>;
 def ADLPPort02_03          : ProcResGroup<[ADLPPort02, ADLPPort03]>;
 def ADLPPort02_03_07       : ProcResGroup<[ADLPPort02, ADLPPort03, ADLPPort07]>;
 def ADLPPort02_03_11       : ProcResGroup<[ADLPPort02, ADLPPort03, ADLPPort11]>;
-def ADLPPort02_03_10       : ProcResGroup<[ADLPPort02, ADLPPort03, ADLPPort10]>;
 def ADLPPort05_11          : ProcResGroup<[ADLPPort05, ADLPPort11]>;
 def ADLPPort07_08          : ProcResGroup<[ADLPPort07, ADLPPort08]>;
 
diff --git a/llvm/lib/Target/X86/X86SchedSapphireRapids.td b/llvm/lib/Target/X86/X86SchedSapphireRapids.td
index b0ebe70c31fd44..0545f9b7f4c00e 100644
--- a/llvm/lib/Target/X86/X86SchedSapphireRapids.td
+++ b/llvm/lib/Target/X86/X86SchedSapphireRapids.td
@@ -59,7 +59,6 @@ def SPRPort01_05          : ProcResGroup<[SPRPort01, SPRPort05]>;
 def SPRPort01_05_10       : ProcResGroup<[SPRPort01, SPRPort05, SPRPort10]>;
 def SPRPort02_03          : ProcResGroup<[SPRPort02, SPRPort03]>;
 def SPRPort02_03_11       : ProcResGroup<[SPRPort02, SPRPort03, SPRPort11]>;
-def SPRPort02_03_10       : ProcResGroup<[SPRPort02, SPRPort03, SPRPort10]>;
 def SPRPort05_11          : ProcResGroup<[SPRPort05, SPRPort11]>;
 def SPRPort07_08          : ProcResGroup<[SPRPort07, SPRPort08]>;
 

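To make the mismatch addressed by this patch concrete, here is a minimal Python sketch. It is not part of the patch or of LLVM: the dictionary and the helper names (`ports_in_resource`, `ports_in_event`, `label_mismatches`) are hypothetical, and the table contains only the Alder Lake entries visible in the diff hunk above. It compares the port numbers encoded in each scheduling resource name with those encoded in the libpfm event name.

```python
# Hypothetical, simplified mirror of the post-patch Alder Lake P-core entries
# shown in the diff hunk. Keys are LLVM scheduling resource names, values are
# libpfm event strings. The real IssueCounters list may contain more entries.
ADLP_ISSUE_COUNTERS = {
    "ADLPPort00": "uops_dispatched_port:port_0",
    "ADLPPort01": "uops_dispatched_port:port_1",
    "ADLPPort02_03_11": "uops_dispatched_port:port_2_3_10",
    "ADLPPort04_09": "uops_dispatched_port:port_4_9",
    "ADLPPort05_11": "uops_dispatched_port:port_5_11",
    "ADLPPort06": "uops_dispatched_port:port_6",
}


def ports_in_resource(name):
    """Port numbers encoded in a resource name, e.g. 'ADLPPort02_03_11'."""
    return {int(p) for p in name.split("Port", 1)[1].split("_")}


def ports_in_event(event):
    """Port numbers encoded in an event name, e.g. '...:port_2_3_10'."""
    return {int(p) for p in event.split(":port_", 1)[1].split("_")}


def label_mismatches(counters):
    """Resource names whose ports differ from the ports in the event label."""
    return [
        resource
        for resource, event in counters.items()
        if ports_in_resource(resource) != ports_in_event(event)
    ]


print(label_mismatches(ADLP_ISSUE_COUNTERS))
```

Among these entries, only the `ADLPPort02_03_11` mapping is reported, which is the intentional mismatch the new TableGen comment documents. The sketch compares names only; it says nothing about which ports the hardware counter actually measures.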
@HaohaiWen
Contributor

Intel Architecture Day also uses ports 2, 3, and 10.
I believe I had the same confusion and contacted an Intel employee internally, who told me I should follow the Optimization Manual and the Architecture Day diagram.
https://download.intel.com/newsroom/2021/client-computing/intel-architecture-day-2021-presentation.pdf

@boomanaiden154
Contributor Author

> Intel Architecture Day also uses ports 2, 3, and 10.

There don't seem to be any explicit groupings in the Architecture Day slides. Those slides also list port 10 as ALU/LEA and port 11 as LOAD/AGU, which matches the association in the Optimization Manual.

This also matches the behavior of the current AlderLake/SapphireRapids scheduling models.

@HaohaiWen
Contributor

> Intel Architecture Day also uses ports 2, 3, and 10. I believe I had the same confusion and contacted an Intel employee internally, who told me I should follow the Optimization Manual and the Architecture Day diagram. https://download.intel.com/newsroom/2021/client-computing/intel-architecture-day-2021-presentation.pdf

Oh, sorry. I confused 10 and 11.
So this PR reverts the 2_3_10 group introduced in 37e27a4 and leaves 2_3_11 as the group, which is consistent with the Intel Optimization Manual.

@boomanaiden154
Contributor Author

> So this PR reverts the 2_3_10 group introduced in 37e27a4 and leaves 2_3_11 as the group, which is consistent with the Intel Optimization Manual.

Yes.

I'll look into submitting a pull request on the Intel perfmon repository so that we can fix what seems to be the source of the issue (the counter being named incorrectly).

@boomanaiden154 boomanaiden154 merged commit 01a5596 into llvm:main Nov 11, 2024
8 of 10 checks passed
@boomanaiden154 boomanaiden154 deleted the exegesis-golden-cove-ports-2-3-11 branch November 11, 2024 07:38
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 15, 2024
…15645)

Development

Successfully merging this pull request may close these issues.

[llvm-exegesis] Check PFM counter mappings for AlderLake/SapphireRapids
3 participants