Commit d8b61d7

[llvm-exegesis] Add middle half repetition mode (#77020)
This patch adds two new repetition modes to llvm-exegesis: the loop and duplicate variants of what I am calling the middle half repetition mode. The middle half repetition mode runs each measurement twice, once with twice as many iterations as the other, and aggregates the two measurements by taking their difference. This subtracts away any setup or overhead unrelated to the code in the snippet, giving more accurate results. On a couple of toy examples, this mode produces exact (integer) throughput values for all of them, whereas the default duplicate and loop repetition modes show a little noise in the snippet value.
1 parent 6a21e00 commit d8b61d7
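
The differencing idea in the commit message can be illustrated with a minimal sketch. It assumes a simplified cost model (a fixed overhead plus a constant cost per repetition); the function names are hypothetical and are not part of llvm-exegesis.

```cpp
// Hypothetical cost model: a fixed setup overhead plus a constant cost per
// repetition. Real counter readings are noisier than this.
double simulatedMeasurement(double Overhead, double CostPerRepetition,
                            unsigned Repetitions) {
  return Overhead + CostPerRepetition * Repetitions;
}

// Naive estimate: divide a single measurement by its repetition count. The
// overhead is spread over the repetitions and inflates the result.
double naiveEstimate(double Measured, unsigned Repetitions) {
  return Measured / Repetitions;
}

// Middle-half estimate: the run with 2N repetitions minus the run with N
// repetitions leaves the cost of N repetitions, with the overhead cancelled.
double middleHalfEstimate(double MeasuredN, double Measured2N,
                          unsigned Repetitions) {
  return (Measured2N - MeasuredN) / Repetitions;
}
```

Under this model, with an overhead of 50, a cost of 1 per repetition, and N = 100, the naive estimate is 1.5 while the middle-half estimate is 1.0.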

10 files changed, with 271 additions and 45 deletions.

llvm/docs/CommandGuide/llvm-exegesis.rst

Lines changed: 6 additions & 2 deletions
@@ -301,7 +301,7 @@ OPTIONS
  enabled can help determine the effects of the frontend and can be used to
  improve latency and throughput estimates.

-.. option:: --repetition-mode=[duplicate|loop|min]
+.. option:: --repetition-mode=[duplicate|loop|min|middle-half-duplicate|middle-half-loop]

  Specify the repetition mode. `duplicate` will create a large, straight line
  basic block with `num-repetitions` instructions (repeating the snippet
@@ -314,7 +314,11 @@ OPTIONS
  that cache decoded instructions, but consumes a register for counting
  iterations. If performing an analysis over many opcodes, it may be best to
  instead use the `min` mode, which will run each other mode,
- and produce the minimal measured result.
+ and produce the minimal measured result. The middle half repetition modes
+ will either duplicate or run the snippet in a loop depending upon the specific
+ mode. The middle half repetition modes will run two benchmarks, one twice the
+ length of the first one, and then subtract the difference between them to get
+ values without overhead.

 .. option:: --num-repetitions=<Number of repetitions>

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+# REQUIRES: exegesis-can-measure-latency, x86_64-linux
+
+# Check that we can use the middle-half repetition mode without crashing
+
+# RUN: llvm-exegesis -mtriple=x86_64-unknown-unknown -mode=latency -opcode-name=ADD64rr -repetition-mode=middle-half-duplicate | FileCheck %s
+# RUN: llvm-exegesis -mtriple=x86_64-unknown-unknown -mode=latency -opcode-name=ADD64rr -repetition-mode=middle-half-loop | FileCheck %s
+
+# CHECK: - { key: latency, value: {{[0-9.]*}}, per_snippet_value: {{[0-9.]*}}

llvm/tools/llvm-exegesis/lib/BenchmarkResult.h

Lines changed: 10 additions & 2 deletions
@@ -90,7 +90,7 @@ struct BenchmarkMeasure {
   static BenchmarkMeasure
   Create(std::string Key, double Value,
          std::map<ValidationEvent, int64_t> ValCounters) {
-    return {Key, Value, Value, ValCounters};
+    return {Key, Value, Value, Value, ValCounters};
   }
   std::string Key;
   // This is the per-instruction value, i.e. measured quantity scaled per
@@ -99,6 +99,8 @@ struct BenchmarkMeasure {
   // This is the per-snippet value, i.e. measured quantity for one repetition of
   // the whole snippet.
   double PerSnippetValue;
+  // This is the raw value collected from the full execution.
+  double RawValue;
   // These are the validation counter values.
   std::map<ValidationEvent, int64_t> ValidationCounters;
 };
@@ -115,7 +117,13 @@ struct Benchmark {
   // The number of instructions inside the repeated snippet. For example, if a
   // snippet of 3 instructions is repeated 4 times, this is 12.
   unsigned NumRepetitions = 0;
-  enum RepetitionModeE { Duplicate, Loop, AggregateMin };
+  enum RepetitionModeE {
+    Duplicate,
+    Loop,
+    AggregateMin,
+    MiddleHalfDuplicate,
+    MiddleHalfLoop
+  };
   // Note that measurements are per instruction.
   std::vector<BenchmarkMeasure> Measurements;
   std::string Error;

llvm/tools/llvm-exegesis/lib/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -64,6 +64,7 @@ add_llvm_library(LLVMExegesis
   PerfHelper.cpp
   RegisterAliasing.cpp
   RegisterValue.cpp
+  ResultAggregator.cpp
   SchedClassResolution.cpp
   SerialSnippetGenerator.cpp
   SnippetFile.cpp
Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
+//===-- ResultAggregator.cpp ------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "ResultAggregator.h"
+
+namespace llvm {
+namespace exegesis {
+
+class DefaultResultAggregator : public ResultAggregator {
+  void AggregateResults(Benchmark &Result,
+                        ArrayRef<Benchmark> OtherResults) const override{};
+  void AggregateMeasurement(BenchmarkMeasure &Measurement,
+                            const BenchmarkMeasure &NewMeasurement,
+                            const Benchmark &Result) const override{};
+};
+
+class MinimumResultAggregator : public ResultAggregator {
+  void AggregateMeasurement(BenchmarkMeasure &Measurement,
+                            const BenchmarkMeasure &NewMeasurement,
+                            const Benchmark &Result) const override;
+};
+
+void MinimumResultAggregator::AggregateMeasurement(
+    BenchmarkMeasure &Measurement, const BenchmarkMeasure &NewMeasurement,
+    const Benchmark &Result) const {
+  Measurement.PerInstructionValue = std::min(
+      Measurement.PerInstructionValue, NewMeasurement.PerInstructionValue);
+  Measurement.PerSnippetValue =
+      std::min(Measurement.PerSnippetValue, NewMeasurement.PerSnippetValue);
+  Measurement.RawValue =
+      std::min(Measurement.RawValue, NewMeasurement.RawValue);
+}
+
+class MiddleHalfResultAggregator : public ResultAggregator {
+  void AggregateMeasurement(BenchmarkMeasure &Measurement,
+                            const BenchmarkMeasure &NewMeasurement,
+                            const Benchmark &Result) const override;
+};
+
+void MiddleHalfResultAggregator::AggregateMeasurement(
+    BenchmarkMeasure &Measurement, const BenchmarkMeasure &NewMeasurement,
+    const Benchmark &Result) const {
+  Measurement.RawValue = NewMeasurement.RawValue - Measurement.RawValue;
+  Measurement.PerInstructionValue = Measurement.RawValue;
+  Measurement.PerInstructionValue /= Result.NumRepetitions;
+  Measurement.PerSnippetValue = Measurement.RawValue;
+  Measurement.PerSnippetValue /=
+      std::ceil(Result.NumRepetitions /
+                static_cast<double>(Result.Key.Instructions.size()));
+}
+
+void ResultAggregator::AggregateResults(
+    Benchmark &Result, ArrayRef<Benchmark> OtherResults) const {
+  for (const Benchmark &OtherResult : OtherResults) {
+    append_range(Result.AssembledSnippet, OtherResult.AssembledSnippet);
+
+    if (OtherResult.Measurements.empty())
+      continue;
+
+    assert(OtherResult.Measurements.size() == Result.Measurements.size() &&
+           "Expected to have an identical number of measurements");
+
+    for (auto I : zip(Result.Measurements, OtherResult.Measurements)) {
+      BenchmarkMeasure &Measurement = std::get<0>(I);
+      const BenchmarkMeasure &NewMeasurement = std::get<1>(I);
+
+      assert(Measurement.Key == NewMeasurement.Key &&
+             "Expected measurements to be symmetric");
+
+      AggregateMeasurement(Measurement, NewMeasurement, Result);
+    }
+  }
+}
+
+std::unique_ptr<ResultAggregator>
+ResultAggregator::CreateAggregator(Benchmark::RepetitionModeE RepetitionMode) {
+  switch (RepetitionMode) {
+  case Benchmark::RepetitionModeE::Duplicate:
+  case Benchmark::RepetitionModeE::Loop:
+    return std::make_unique<DefaultResultAggregator>();
+  case Benchmark::RepetitionModeE::AggregateMin:
+    return std::make_unique<MinimumResultAggregator>();
+  case Benchmark::RepetitionModeE::MiddleHalfDuplicate:
+  case Benchmark::RepetitionModeE::MiddleHalfLoop:
+    return std::make_unique<MiddleHalfResultAggregator>();
+  }
+}
+
+} // namespace exegesis
+} // namespace llvm
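
The arithmetic in MiddleHalfResultAggregator::AggregateMeasurement above can be tried in isolation with a small stand-in. This is only a sketch: the `Measure` struct and the `aggregateMiddleHalf` function are hypothetical simplifications, and `SnippetSize` stands in for `Result.Key.Instructions.size()`.

```cpp
#include <cmath>
#include <cstddef>

// Simplified stand-in for BenchmarkMeasure; only the three values that the
// middle-half aggregation touches are modelled.
struct Measure {
  double PerInstructionValue;
  double PerSnippetValue;
  double RawValue;
};

// Mirrors the arithmetic shown in the diff. First comes from the run with
// NumRepetitions instructions, Second from the run with twice as many.
void aggregateMiddleHalf(Measure &First, const Measure &Second,
                         unsigned NumRepetitions, std::size_t SnippetSize) {
  // The difference of the raw values covers NumRepetitions instructions.
  First.RawValue = Second.RawValue - First.RawValue;
  First.PerInstructionValue = First.RawValue / NumRepetitions;
  // Number of snippet copies needed to reach NumRepetitions instructions,
  // rounded up as in the diff.
  double SnippetCopies =
      std::ceil(NumRepetitions / static_cast<double>(SnippetSize));
  First.PerSnippetValue = First.RawValue / SnippetCopies;
}
```

For example, raw values of 1300 and 2500 with 1200 repetitions of a 3-instruction snippet give a raw difference of 1200, a per-instruction value of 1.0, and a per-snippet value of 3.0.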
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+//===-- ResultAggregator.h --------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// Defines result aggregators that are used to aggregate the results from
+/// multiple full benchmark runs.
+///
+//===----------------------------------------------------------------------===//
+
+#include "BenchmarkResult.h"
+
+namespace llvm {
+namespace exegesis {
+
+class ResultAggregator {
+public:
+  static std::unique_ptr<ResultAggregator>
+  CreateAggregator(Benchmark::RepetitionModeE RepetitionMode);
+
+  virtual void AggregateResults(Benchmark &Result,
+                                ArrayRef<Benchmark> OtherResults) const;
+  virtual void AggregateMeasurement(BenchmarkMeasure &Measurement,
+                                    const BenchmarkMeasure &NewMeasurement,
+                                    const Benchmark &Result) const = 0;
+
+  virtual ~ResultAggregator() = default;
+};
+
+} // namespace exegesis
+} // namespace llvm

llvm/tools/llvm-exegesis/lib/SnippetRepetitor.cpp

Lines changed: 2 additions & 0 deletions
@@ -141,8 +141,10 @@ SnippetRepetitor::Create(Benchmark::RepetitionModeE Mode,
                          const LLVMState &State) {
   switch (Mode) {
   case Benchmark::Duplicate:
+  case Benchmark::MiddleHalfDuplicate:
     return std::make_unique<DuplicateSnippetRepetitor>(State);
   case Benchmark::Loop:
+  case Benchmark::MiddleHalfLoop:
     return std::make_unique<LoopSnippetRepetitor>(State);
   case Benchmark::AggregateMin:
     break;

llvm/tools/llvm-exegesis/llvm-exegesis.cpp

Lines changed: 35 additions & 41 deletions
@@ -20,6 +20,7 @@
 #include "lib/LlvmState.h"
 #include "lib/PerfHelper.h"
 #include "lib/ProgressMeter.h"
+#include "lib/ResultAggregator.h"
 #include "lib/SnippetFile.h"
 #include "lib/SnippetRepetitor.h"
 #include "lib/Target.h"
@@ -106,10 +107,13 @@ static cl::opt<exegesis::Benchmark::RepetitionModeE> RepetitionMode(
     cl::values(
         clEnumValN(exegesis::Benchmark::Duplicate, "duplicate",
                    "Duplicate the snippet"),
-        clEnumValN(exegesis::Benchmark::Loop, "loop",
-                   "Loop over the snippet"),
+        clEnumValN(exegesis::Benchmark::Loop, "loop", "Loop over the snippet"),
         clEnumValN(exegesis::Benchmark::AggregateMin, "min",
-                   "All of the above and take the minimum of measurements")),
+                   "All of the above and take the minimum of measurements"),
+        clEnumValN(exegesis::Benchmark::MiddleHalfDuplicate,
+                   "middle-half-duplicate", "Middle half duplicate mode"),
+        clEnumValN(exegesis::Benchmark::MiddleHalfLoop, "middle-half-loop",
+                   "Middle half loop mode")),
     cl::init(exegesis::Benchmark::Duplicate));

 static cl::opt<bool> BenchmarkMeasurementsPrintProgress(
@@ -421,30 +425,39 @@ static void runBenchmarkConfigurations(
   std::optional<ProgressMeter<>> Meter;
   if (BenchmarkMeasurementsPrintProgress)
     Meter.emplace(Configurations.size());
+
+  SmallVector<unsigned, 2> MinInstructions = {NumRepetitions};
+  if (RepetitionMode == Benchmark::MiddleHalfDuplicate ||
+      RepetitionMode == Benchmark::MiddleHalfLoop)
+    MinInstructions.push_back(NumRepetitions * 2);
+
   for (const BenchmarkCode &Conf : Configurations) {
     ProgressMeter<>::ProgressMeterStep MeterStep(Meter ? &*Meter : nullptr);
     SmallVector<Benchmark, 2> AllResults;

     for (const std::unique_ptr<const SnippetRepetitor> &Repetitor :
          Repetitors) {
-      auto RC = ExitOnErr(Runner.getRunnableConfiguration(
-          Conf, NumRepetitions, LoopBodySize, *Repetitor));
-      std::optional<StringRef> DumpFile;
-      if (DumpObjectToDisk.getNumOccurrences())
-        DumpFile = DumpObjectToDisk;
-      auto [Err, BenchmarkResult] =
-          Runner.runConfiguration(std::move(RC), DumpFile);
-      if (Err) {
-        // Errors from executing the snippets are fine.
-        // All other errors are a framework issue and should fail.
-        if (!Err.isA<SnippetExecutionFailure>()) {
-          errs() << "llvm-exegesis error: " << toString(std::move(Err));
-          exit(1);
+      for (unsigned IterationRepetitions : MinInstructions) {
+        auto RC = ExitOnErr(Runner.getRunnableConfiguration(
+            Conf, IterationRepetitions, LoopBodySize, *Repetitor));
+        std::optional<StringRef> DumpFile;
+        if (DumpObjectToDisk.getNumOccurrences())
+          DumpFile = DumpObjectToDisk;
+        auto [Err, BenchmarkResult] =
+            Runner.runConfiguration(std::move(RC), DumpFile);
+        if (Err) {
+          // Errors from executing the snippets are fine.
+          // All other errors are a framework issue and should fail.
+          if (!Err.isA<SnippetExecutionFailure>()) {
+            llvm::errs() << "llvm-exegesis error: " << toString(std::move(Err));
+            exit(1);
+          }
+          BenchmarkResult.Error = toString(std::move(Err));
         }
-        BenchmarkResult.Error = toString(std::move(Err));
+        AllResults.push_back(std::move(BenchmarkResult));
       }
-      AllResults.push_back(std::move(BenchmarkResult));
     }
+
     Benchmark &Result = AllResults.front();

     // If any of our measurements failed, pretend they all have failed.
@@ -454,29 +467,10 @@ static void runBenchmarkConfigurations(
         }))
       Result.Measurements.clear();

-    if (RepetitionMode == Benchmark::RepetitionModeE::AggregateMin) {
-      for (const Benchmark &OtherResult :
-           ArrayRef<Benchmark>(AllResults).drop_front()) {
-        append_range(Result.AssembledSnippet, OtherResult.AssembledSnippet);
-        // Aggregate measurements, but only if all measurements succeeded.
-        if (Result.Measurements.empty())
-          continue;
-        assert(OtherResult.Measurements.size() == Result.Measurements.size() &&
-               "Expected to have identical number of measurements.");
-        for (auto I : zip(Result.Measurements, OtherResult.Measurements)) {
-          BenchmarkMeasure &Measurement = std::get<0>(I);
-          const BenchmarkMeasure &NewMeasurement = std::get<1>(I);
-          assert(Measurement.Key == NewMeasurement.Key &&
-                 "Expected measurements to be symmetric");
-
-          Measurement.PerInstructionValue =
-              std::min(Measurement.PerInstructionValue,
-                       NewMeasurement.PerInstructionValue);
-          Measurement.PerSnippetValue = std::min(
-              Measurement.PerSnippetValue, NewMeasurement.PerSnippetValue);
-        }
-      }
-    }
+    std::unique_ptr<ResultAggregator> ResultAgg =
+        ResultAggregator::CreateAggregator(RepetitionMode);
+    ResultAgg->AggregateResults(Result,
+                                ArrayRef<Benchmark>(AllResults).drop_front());

     // With dummy counters, measurements are rather meaningless,
     // so drop them altogether.
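
The way the driver above decides how many runs to perform can be sketched without any LLVM dependencies. In this sketch the `RepetitionMode` enum is a stand-in for `Benchmark::RepetitionModeE`, `std::vector` replaces `SmallVector`, and the `repetitionCounts` helper is hypothetical.

```cpp
#include <vector>

// Stand-in for Benchmark::RepetitionModeE from the diff.
enum class RepetitionMode {
  Duplicate,
  Loop,
  AggregateMin,
  MiddleHalfDuplicate,
  MiddleHalfLoop
};

// Returns the repetition counts to benchmark for one configuration. The
// middle-half modes add a second run with twice the repetitions, so that the
// two results can later be subtracted from each other.
std::vector<unsigned> repetitionCounts(RepetitionMode Mode,
                                       unsigned NumRepetitions) {
  std::vector<unsigned> Counts = {NumRepetitions};
  if (Mode == RepetitionMode::MiddleHalfDuplicate ||
      Mode == RepetitionMode::MiddleHalfLoop)
    Counts.push_back(NumRepetitions * 2);
  return Counts;
}
```

Because the shorter run is always pushed first, it ends up at the front of the result list, which matches the diff's use of `AllResults.front()` as the result that the later runs are aggregated into.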

llvm/unittests/tools/llvm-exegesis/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@ set(exegesis_sources
   ClusteringTest.cpp
   ProgressMeterTest.cpp
   RegisterValueTest.cpp
+  ResultAggregatorTest.cpp
   )

 set(exegesis_link_libraries LLVMExegesis)
