[RISCV][llvm-exegesis] Add default Pfm cycle counter. #121866

Merged: 2 commits, Jan 7, 2025

Conversation

topperc (Collaborator) commented Jan 7, 2025

Also tested with Ubuntu on SiFive's HiFive Premier P550 board. Curiously, latency is reported as ~1.5 cycles for basic scalar arithmetic, ~3.5 for scalar mul, and ~36.5 for div. This is 0.5 cycles higher than I expect.

This is stacked on #121862

llvmbot (Member) commented Jan 7, 2025

@llvm/pr-subscribers-tools-llvm-exegesis

@llvm/pr-subscribers-backend-risc-v

Author: Craig Topper (topperc)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/121866.diff

9 Files Affected:

  • (modified) llvm/lib/Target/RISCV/CMakeLists.txt (+1)
  • (modified) llvm/lib/Target/RISCV/RISCV.td (+6)
  • (added) llvm/lib/Target/RISCV/RISCVPfmCounters.td (+18)
  • (modified) llvm/tools/llvm-exegesis/lib/RISCV/Target.cpp (+3-2)
  • (modified) llvm/unittests/tools/llvm-exegesis/CMakeLists.txt (+3)
  • (added) llvm/unittests/tools/llvm-exegesis/RISCV/CMakeLists.txt (+21)
  • (added) llvm/unittests/tools/llvm-exegesis/RISCV/SnippetGeneratorTest.cpp (+124)
  • (added) llvm/unittests/tools/llvm-exegesis/RISCV/TargetTest.cpp (+68)
  • (added) llvm/unittests/tools/llvm-exegesis/RISCV/TestBase.h (+44)
diff --git a/llvm/lib/Target/RISCV/CMakeLists.txt b/llvm/lib/Target/RISCV/CMakeLists.txt
index 44661647a86310..98d3615ebab58d 100644
--- a/llvm/lib/Target/RISCV/CMakeLists.txt
+++ b/llvm/lib/Target/RISCV/CMakeLists.txt
@@ -15,6 +15,7 @@ tablegen(LLVM RISCVGenRegisterBank.inc -gen-register-bank)
 tablegen(LLVM RISCVGenRegisterInfo.inc -gen-register-info)
 tablegen(LLVM RISCVGenSearchableTables.inc -gen-searchable-tables)
 tablegen(LLVM RISCVGenSubtargetInfo.inc -gen-subtarget)
+tablegen(LLVM RISCVGenExegesis.inc -gen-exegesis)
 
 set(LLVM_TARGET_DEFINITIONS RISCVGISel.td)
 tablegen(LLVM RISCVGenGlobalISel.inc -gen-global-isel)
diff --git a/llvm/lib/Target/RISCV/RISCV.td b/llvm/lib/Target/RISCV/RISCV.td
index 963124140cd035..4e0c64a5ca2c6f 100644
--- a/llvm/lib/Target/RISCV/RISCV.td
+++ b/llvm/lib/Target/RISCV/RISCV.td
@@ -63,6 +63,12 @@ include "RISCVSchedXiangShanNanHu.td"
 
 include "RISCVProcessors.td"
 
+//===----------------------------------------------------------------------===//
+// Pfm Counters
+//===----------------------------------------------------------------------===//
+
+include "RISCVPfmCounters.td"
+
 //===----------------------------------------------------------------------===//
 // Define the RISC-V target.
 //===----------------------------------------------------------------------===//
diff --git a/llvm/lib/Target/RISCV/RISCVPfmCounters.td b/llvm/lib/Target/RISCV/RISCVPfmCounters.td
new file mode 100644
index 00000000000000..013e789a9e9217
--- /dev/null
+++ b/llvm/lib/Target/RISCV/RISCVPfmCounters.td
@@ -0,0 +1,18 @@
+//===---- RISCVPfmCounters.td - RISC-V Hardware Counters ---*- tablegen -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This describes the available hardware counters for RISC-V.
+//
+//===----------------------------------------------------------------------===//
+
+def CpuCyclesPfmCounter : PfmCounter<"CYCLES">;
+
+def DefaultPfmCounters : ProcPfmCounters {
+  let CycleCounter = CpuCyclesPfmCounter;
+}
+def : PfmCountersDefaultBinding<DefaultPfmCounters>;
diff --git a/llvm/tools/llvm-exegesis/lib/RISCV/Target.cpp b/llvm/tools/llvm-exegesis/lib/RISCV/Target.cpp
index 41d361532908ca..5636782bdf7f6f 100644
--- a/llvm/tools/llvm-exegesis/lib/RISCV/Target.cpp
+++ b/llvm/tools/llvm-exegesis/lib/RISCV/Target.cpp
@@ -24,6 +24,8 @@
 namespace llvm {
 namespace exegesis {
 
+#include "RISCVGenExegesis.inc"
+
 namespace {
 
 // Stores constant value to a general-purpose (integer) register.
@@ -132,8 +134,7 @@ class ExegesisRISCVTarget : public ExegesisTarget {
 };
 
 ExegesisRISCVTarget::ExegesisRISCVTarget()
-    : ExegesisTarget(ArrayRef<CpuAndPfmCounters>{},
-                     RISCV_MC::isOpcodeAvailable) {}
+    : ExegesisTarget(RISCVCpuPfmCounters, RISCV_MC::isOpcodeAvailable) {}
 
 bool ExegesisRISCVTarget::matchesArch(Triple::ArchType Arch) const {
   return Arch == Triple::riscv32 || Arch == Triple::riscv64;
diff --git a/llvm/unittests/tools/llvm-exegesis/CMakeLists.txt b/llvm/unittests/tools/llvm-exegesis/CMakeLists.txt
index 3ee3a0dc6b5d04..735f17ab03e612 100644
--- a/llvm/unittests/tools/llvm-exegesis/CMakeLists.txt
+++ b/llvm/unittests/tools/llvm-exegesis/CMakeLists.txt
@@ -53,6 +53,9 @@ endif()
 if(LLVM_TARGETS_TO_BUILD MATCHES "Mips")
   include(Mips/CMakeLists.txt)
 endif()
+if(LLVM_TARGETS_TO_BUILD MATCHES "RISCV")
+  include(RISCV/CMakeLists.txt)
+endif()
 
 include_directories(${exegesis_includes})
 
diff --git a/llvm/unittests/tools/llvm-exegesis/RISCV/CMakeLists.txt b/llvm/unittests/tools/llvm-exegesis/RISCV/CMakeLists.txt
new file mode 100644
index 00000000000000..1984819be7738b
--- /dev/null
+++ b/llvm/unittests/tools/llvm-exegesis/RISCV/CMakeLists.txt
@@ -0,0 +1,21 @@
+add_llvm_exegesis_unittest_includes(
+  ${LLVM_MAIN_SRC_DIR}/lib/Target/RISCV
+  ${LLVM_BINARY_DIR}/lib/Target/RISCV
+  ${LLVM_MAIN_SRC_DIR}/tools/llvm-exegesis/lib
+  )
+
+add_llvm_exegesis_unittest_link_components(
+  MC
+  MCParser
+  Object
+  Support
+  Symbolize
+  RISCV
+  )
+
+add_llvm_exegesis_unittest_sources(
+  SnippetGeneratorTest.cpp
+  TargetTest.cpp
+  )
+add_llvm_exegesis_unittest_link_libraries(
+  LLVMExegesisRISCV)
diff --git a/llvm/unittests/tools/llvm-exegesis/RISCV/SnippetGeneratorTest.cpp b/llvm/unittests/tools/llvm-exegesis/RISCV/SnippetGeneratorTest.cpp
new file mode 100644
index 00000000000000..22c25d92dfd3eb
--- /dev/null
+++ b/llvm/unittests/tools/llvm-exegesis/RISCV/SnippetGeneratorTest.cpp
@@ -0,0 +1,124 @@
+//===-- SnippetGeneratorTest.cpp --------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "../Common/AssemblerUtils.h"
+#include "LlvmState.h"
+#include "MCInstrDescView.h"
+#include "ParallelSnippetGenerator.h"
+#include "RISCVInstrInfo.h"
+#include "RegisterAliasing.h"
+#include "SerialSnippetGenerator.h"
+#include "TestBase.h"
+
+namespace llvm {
+namespace exegesis {
+namespace {
+
+using testing::AnyOf;
+using testing::ElementsAre;
+using testing::HasSubstr;
+using testing::SizeIs;
+
+MATCHER(IsInvalid, "") { return !arg.isValid(); }
+MATCHER(IsReg, "") { return arg.isReg(); }
+
+template <typename SnippetGeneratorT>
+class RISCVSnippetGeneratorTest : public RISCVTestBase {
+protected:
+  RISCVSnippetGeneratorTest() : Generator(State, SnippetGenerator::Options()) {}
+
+  std::vector<CodeTemplate> checkAndGetCodeTemplates(unsigned Opcode) {
+    randomGenerator().seed(0); // Initialize seed.
+    const Instruction &Instr = State.getIC().getInstr(Opcode);
+    auto CodeTemplateOrError = Generator.generateCodeTemplates(
+        &Instr, State.getRATC().emptyRegisters());
+    EXPECT_FALSE(CodeTemplateOrError.takeError()); // Valid configuration.
+    return std::move(CodeTemplateOrError.get());
+  }
+
+  SnippetGeneratorT Generator;
+};
+
+using RISCVSerialSnippetGeneratorTest =
+    RISCVSnippetGeneratorTest<SerialSnippetGenerator>;
+
+using RISCVParallelSnippetGeneratorTest =
+    RISCVSnippetGeneratorTest<ParallelSnippetGenerator>;
+
+TEST_F(RISCVSerialSnippetGeneratorTest,
+       ImplicitSelfDependencyThroughExplicitRegs) {
+  // - ADD
+  // - Op0 Explicit Def RegClass(GPR)
+  // - Op1 Explicit Use RegClass(GPR)
+  // - Op2 Explicit Use RegClass(GPR)
+  // - Var0 [Op0]
+  // - Var1 [Op1]
+  // - Var2 [Op2]
+  // - hasAliasingRegisters
+  const unsigned Opcode = RISCV::ADD;
+  const auto CodeTemplates = checkAndGetCodeTemplates(Opcode);
+  ASSERT_THAT(CodeTemplates, SizeIs(1));
+  const auto &CT = CodeTemplates[0];
+  EXPECT_THAT(CT.Execution, ExecutionMode::SERIAL_VIA_EXPLICIT_REGS);
+  ASSERT_THAT(CT.Instructions, SizeIs(1));
+  const InstructionTemplate &IT = CT.Instructions[0];
+  EXPECT_THAT(IT.getOpcode(), Opcode);
+  ASSERT_THAT(IT.getVariableValues(), SizeIs(3));
+  EXPECT_THAT(IT.getVariableValues(),
+              AnyOf(ElementsAre(IsReg(), IsInvalid(), IsReg()),
+                    ElementsAre(IsReg(), IsReg(), IsInvalid())))
+      << "Op0 is either set to Op1 or to Op2";
+}
+
+TEST_F(RISCVSerialSnippetGeneratorTest,
+       ImplicitSelfDependencyThroughExplicitRegsForbidAll) {
+  // - XOR
+  // - Op0 Explicit Def RegClass(GPR)
+  // - Op1 Explicit Use RegClass(GPR)
+  // - Op2 Explicit Use RegClass(GPR)
+  // - Var0 [Op0]
+  // - Var1 [Op1]
+  // - Var2 [Op2]
+  // - hasAliasingRegisters
+  randomGenerator().seed(0); // Initialize seed.
+  const Instruction &Instr = State.getIC().getInstr(RISCV::XOR);
+  auto AllRegisters = State.getRATC().emptyRegisters();
+  AllRegisters.flip();
+  auto Error =
+      Generator.generateCodeTemplates(&Instr, AllRegisters).takeError();
+  EXPECT_TRUE((bool)Error);
+  consumeError(std::move(Error));
+}
+
+TEST_F(RISCVParallelSnippetGeneratorTest, MemoryUse) {
+  // LB reads from memory.
+  // - LB
+  // - Op0 Explicit Def RegClass(GPR)
+  // - Op1 Explicit Use Memory RegClass(GPR)
+  // - Op2 Explicit Use Memory
+  // - Var0 [Op0]
+  // - Var1 [Op1]
+  // - Var2 [Op2]
+  // - hasMemoryOperands
+  const unsigned Opcode = RISCV::LB;
+  const auto CodeTemplates = checkAndGetCodeTemplates(Opcode);
+  ASSERT_THAT(CodeTemplates, SizeIs(1));
+  const auto &CT = CodeTemplates[0];
+  EXPECT_THAT(CT.Info, HasSubstr("instruction has no tied variables"));
+  EXPECT_THAT(CT.Execution, ExecutionMode::UNKNOWN);
+  ASSERT_THAT(CT.Instructions,
+              SizeIs(ParallelSnippetGenerator::kMinNumDifferentAddresses));
+  const InstructionTemplate &IT = CT.Instructions[0];
+  EXPECT_THAT(IT.getOpcode(), Opcode);
+  ASSERT_THAT(IT.getVariableValues(), SizeIs(3));
+  EXPECT_EQ(IT.getVariableValues()[1].getReg(), RISCV::X10);
+}
+
+} // namespace
+} // namespace exegesis
+} // namespace llvm
diff --git a/llvm/unittests/tools/llvm-exegesis/RISCV/TargetTest.cpp b/llvm/unittests/tools/llvm-exegesis/RISCV/TargetTest.cpp
new file mode 100644
index 00000000000000..53aed883b9fa67
--- /dev/null
+++ b/llvm/unittests/tools/llvm-exegesis/RISCV/TargetTest.cpp
@@ -0,0 +1,68 @@
+//===-- TargetTest.cpp ---------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "Target.h"
+
+#include <cassert>
+#include <memory>
+
+#include "MCTargetDesc/RISCVMCTargetDesc.h"
+#include "llvm/MC/TargetRegistry.h"
+#include "llvm/Support/TargetSelect.h"
+#include "gmock/gmock.h"
+#include "gtest/gtest.h"
+
+namespace llvm {
+namespace exegesis {
+
+void InitializeRISCVExegesisTarget();
+
+namespace {
+
+using testing::IsEmpty;
+using testing::Not;
+using testing::NotNull;
+
+constexpr const char kTriple[] = "riscv64-unknown-linux";
+
+class RISCVTargetTest : public ::testing::Test {
+protected:
+  RISCVTargetTest() : ExegesisTarget_(ExegesisTarget::lookup(Triple(kTriple))) {
+    EXPECT_THAT(ExegesisTarget_, NotNull());
+    std::string error;
+    Target_ = TargetRegistry::lookupTarget(kTriple, error);
+    EXPECT_THAT(Target_, NotNull());
+  }
+  static void SetUpTestCase() {
+    LLVMInitializeRISCVTargetInfo();
+    LLVMInitializeRISCVTarget();
+    LLVMInitializeRISCVTargetMC();
+    InitializeRISCVExegesisTarget();
+  }
+
+  const Target *Target_;
+  const ExegesisTarget *const ExegesisTarget_;
+};
+
+TEST_F(RISCVTargetTest, SetRegToConstant) {
+  const std::unique_ptr<MCSubtargetInfo> STI(
+      Target_->createMCSubtargetInfo(kTriple, "generic", ""));
+  const auto Insts = ExegesisTarget_->setRegTo(*STI, RISCV::X10, APInt());
+  EXPECT_THAT(Insts, Not(IsEmpty()));
+}
+
+TEST_F(RISCVTargetTest, DefaultPfmCounters) {
+  const std::string Expected = "CYCLES";
+  EXPECT_EQ(ExegesisTarget_->getPfmCounters("").CycleCounter, Expected);
+  EXPECT_EQ(ExegesisTarget_->getPfmCounters("unknown_cpu").CycleCounter,
+            Expected);
+}
+
+} // namespace
+} // namespace exegesis
+} // namespace llvm
diff --git a/llvm/unittests/tools/llvm-exegesis/RISCV/TestBase.h b/llvm/unittests/tools/llvm-exegesis/RISCV/TestBase.h
new file mode 100644
index 00000000000000..66748fb9a2ce1b
--- /dev/null
+++ b/llvm/unittests/tools/llvm-exegesis/RISCV/TestBase.h
@@ -0,0 +1,44 @@
+//===-- TestBase.h ----------------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+// Test fixture common to all RISC-V tests.
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_UNITTESTS_TOOLS_LLVMEXEGESIS_RISCV_TESTBASE_H
+#define LLVM_UNITTESTS_TOOLS_LLVMEXEGESIS_RISCV_TESTBASE_H
+
+#include "LlvmState.h"
+#include "llvm/MC/TargetRegistry.h"
+#include "llvm/Support/TargetSelect.h"
+#include "gmock/gmock.h"
+#include "gtest/gtest.h"
+
+namespace llvm {
+namespace exegesis {
+
+void InitializeRISCVExegesisTarget();
+
+class RISCVTestBase : public ::testing::Test {
+protected:
+  RISCVTestBase()
+      : State(cantFail(
+            LLVMState::Create("riscv64-unknown-linux", "generic-rv64"))) {}
+
+  static void SetUpTestCase() {
+    LLVMInitializeRISCVTargetInfo();
+    LLVMInitializeRISCVTargetMC();
+    LLVMInitializeRISCVTarget();
+    InitializeRISCVExegesisTarget();
+  }
+
+  const LLVMState State;
+};
+
+} // namespace exegesis
+} // namespace llvm
+
+#endif

boomanaiden154 (Contributor) left a comment

LGTM for Pfm Counter changes.

Is it possible for you to use spr/graphite for stacked PRs (or even just manually stack them on GitHub by setting the branches appropriately on the origin repo)? For this one it doesn't matter too much, but having extra changes stacked in can be a bit confusing.

Everything being 0.5 cycles higher than expected is a bit weird. What are your benchmark settings like? It might be worth trying repetition-mode=middle-half-duplicate|middle-half-loop (the latter with a decent --loop-body-size for single-instruction benchmarking) to see if it's just a constant overhead (although that seems unlikely given it's 0.5 cycles for instructions with a huge range of latencies).

Not sure if the counter used by the perf subsystem for CYCLES is good either. I know on X86 counters can be particularly tricky. Fixing that would probably require bringing up libpfm for RISCV though which would not be a super trivial effort.

topperc (Collaborator, Author) commented Jan 7, 2025

Using repetition-mode=middle-half-duplicate|middle-half-loop both gave a value closer to 1.0 for simple scalar, and 3.0 for mul. I was using the default before. This is my first time running llvm-exegesis.
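
The drop from ~1.5 to ~1.0 is what a fixed per-run overhead would produce. As I understand the llvm-exegesis documentation, the middle-half modes run two benchmarks, one twice as long as the other, and subtract them so that setup cost cancels. The sketch below models that under loud assumptions: `CostModel`, the 5000-cycle overhead, and the repetition count used in the usage note are hypothetical values picked only so the model reproduces a 0.5-cycle offset. None of them were measured on the P550.

```cpp
#include <cassert>

// Hypothetical linear cost model (an assumption, not llvm-exegesis code):
// one benchmark run costs a fixed overhead plus a per-repetition latency.
struct CostModel {
  double OverheadCycles; // assumed constant per run (counter start/stop, etc.)
  double LatencyCycles;  // true per-instruction latency
};

// Cycles the counter would report for `Reps` back-to-back repetitions.
double measuredCycles(const CostModel &M, unsigned Reps) {
  return M.OverheadCycles + M.LatencyCycles * Reps;
}

// Single-run estimate: the overhead is amortized over Reps but never removed.
double naivePerInstr(const CostModel &M, unsigned Reps) {
  assert(Reps > 0 && "need at least one repetition");
  return measuredCycles(M, Reps) / Reps;
}

// Two-run estimate: measure Reps and 2 * Reps, subtract, and divide by the
// extra Reps. The constant overhead cancels in the subtraction.
double middleHalfPerInstr(const CostModel &M, unsigned Reps) {
  assert(Reps > 0 && "need at least one repetition");
  return (measuredCycles(M, 2 * Reps) - measuredCycles(M, Reps)) / Reps;
}
```

With `CostModel{5000.0, 1.0}` and 10000 repetitions, `naivePerInstr` gives 1.5 while `middleHalfPerInstr` gives 1.0. In this model the offset is overhead divided by repetitions, so it is the same 0.5 cycles for a 3-cycle mul or a 36-cycle div.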

boomanaiden154 (Contributor) commented

How much closer? On x86 they're pretty on the dot and can even reasonably consistently give integer throughputs. If that gives a value pretty close to one it's interesting how high the ioctl syscall overhead is.

I'm not sure it's a super common option. I implemented it I think about a year ago and assume I'm probably the only user currently.

topperc (Collaborator, Author) commented Jan 7, 2025

With middle-half-loop, for ADD I got between 0.9761 and 1.0148. With middle-half-duplicate I got between 0.9839 and 1.0468.
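
To put those ranges on a common scale, here is a small sketch that computes the worst-case relative deviation from a nominal latency. The `Range` type and `maxRelativeDeviation` helper are hypothetical names, and the 1.0-cycle nominal ADD latency is an assumption taken from the expectation stated earlier in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Observed minimum and maximum of a repeated measurement.
struct Range {
  double Lo;
  double Hi;
};

// Largest distance of either endpoint from the nominal value, expressed as a
// fraction of the nominal value.
double maxRelativeDeviation(const Range &R, double Nominal) {
  assert(Nominal != 0.0 && "nominal value must be non-zero");
  double Worst = std::max(std::fabs(R.Lo - Nominal), std::fabs(R.Hi - Nominal));
  return Worst / std::fabs(Nominal);
}
```

Applied to the numbers above with a nominal of 1.0, `maxRelativeDeviation({0.9761, 1.0148}, 1.0)` is about 0.024 for middle-half-loop and `maxRelativeDeviation({0.9839, 1.0468}, 1.0)` is about 0.047 for middle-half-duplicate.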

boomanaiden154 (Contributor) commented

That should be reasonable enough for the purposes of validating the scheduling models. It's interesting that the noise seems so much higher than on x86 though.

mshockwave (Member) left a comment

LGTM

topperc merged commit afa8aee into llvm:main on Jan 7, 2025 (5 of 7 checks passed).
topperc deleted the pr/riscv-pfm branch on January 7, 2025 17:51.