
Commit 313b1a8

[mlgo] Support composite AOT-ed models (#96276)

This applies to the AOT case where we embed models in the compiler. The change adds support for multiple models for the same agent, and allows the user to select one via a command line flag. "Agent" refers to e.g. the inline advisor or the register allocator eviction advisor.

To avoid build setup complexity, the support is delegated to the saved model. Since saved models define computational graphs, we can generate a composite model (this happens prior to building and embedding it in LLVM, and is not shown in this change) that exposes an extra feature with a predefined name: `_model_selector`. The model then delegates internally to the contained models based on that feature's value. Model selection is expected to happen at model instantiation; there is no current scenario for switching models afterwards.

If the model doesn't expose such a feature but the user passes a selector, we report an error. If the model exposes such a feature but the user doesn't pass one, we also report an error. Invalid model selector values are expected to be handled by the saved model itself.

Internally, the model uses a pair of uint64 values: the high and low words of the MD5 hash of the selector name. A tool composing models would then need to:

- expose the extra feature `_model_selector`, with shape (2,) and uint64 data type
- test its value (`tf.cond` or `tf.case` in TensorFlow) against the MD5 hash, in [high, low] order, of each contained model's user-specified name (which the user will then pass as the flag value to the compiler)

Agents just need to add a flag to capture the name of a model and pass it to `ReleaseModeModelRunner` at construction. It can be passed in all cases without checking: when the model is not composite and we pass an empty name, everything works as before.

This change also factors out the string flags we pass to the `ReleaseModeModelRunner` into an options struct, for better maintainability (otherwise we risk confusing parameters that are all strings).
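The [high, low] pair described above can be reproduced outside the compiler when composing models. A minimal sketch, assuming (matching `llvm::MD5Result::high()`/`low()` at the time of this change) that the low word is the first 8 digest bytes and the high word the next 8, both read little-endian; verify against your LLVM version:

```python
import hashlib
import struct

def model_selector_words(name: str) -> tuple[int, int]:
    """Return the (high, low) uint64 pair for a sub-model name.

    Assumption: low = first 8 MD5 digest bytes, high = next 8,
    both little-endian, mirroring llvm::MD5Result::low()/high().
    """
    digest = hashlib.md5(name.encode("utf-8")).digest()
    low, high = struct.unpack("<QQ", digest)  # 16 bytes -> two uint64 words
    return high, low

# A composing tool would compare the _model_selector input, in
# [high, low] order, against these words for each contained model.
high, low = model_selector_words("the model that adds")
assert 0 <= high < 2**64 and 0 <= low < 2**64
```

The same function gives the value the compiler-side runner writes into the `_model_selector` tensor for a given flag value, so both sides agree by construction.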
1 parent b097018 commit 313b1a8

File tree

3 files changed: +212 -27 lines changed


llvm/include/llvm/Analysis/ReleaseModeModelRunner.h

Lines changed: 82 additions & 14 deletions

@@ -14,40 +14,94 @@
 #ifndef LLVM_ANALYSIS_RELEASEMODEMODELRUNNER_H
 #define LLVM_ANALYSIS_RELEASEMODEMODELRUNNER_H
 
+#include "llvm/ADT/StringExtras.h"
 #include "llvm/Analysis/MLModelRunner.h"
 #include "llvm/Analysis/TensorSpec.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/MD5.h"
 
 #include <memory>
-#include <vector>
 
 namespace llvm {
 
 /// ReleaseModeModelRunner - production mode implementation of the
 /// MLModelRunner. It uses an AOT-compiled SavedModel for efficient execution.
+struct EmbeddedModelRunnerOptions {
+  /// Feed and Fetch feature prefixes - i.e. a feature named "foo" will be
+  /// looked up as {FeedPrefix}_foo; and the output named "bar" will be looked
+  /// up as {FetchPrefix}_bar
+  StringRef FeedPrefix = "feed_";
+  StringRef FetchPrefix = "fetch_";
+
+  /// ModelSelector is the name (recognized by the AOT-ed model) of a sub-model
+  /// to use. "" is allowed if the model doesn't support sub-models.
+  StringRef ModelSelector = "";
+
+  EmbeddedModelRunnerOptions &setFeedPrefix(StringRef Value) {
+    FeedPrefix = Value;
+    return *this;
+  }
+  EmbeddedModelRunnerOptions &setFetchPrefix(StringRef Value) {
+    FetchPrefix = Value;
+    return *this;
+  }
+  EmbeddedModelRunnerOptions &setModelSelector(StringRef Value) {
+    ModelSelector = Value;
+    return *this;
+  }
+};
+
 template <class TGen>
 class ReleaseModeModelRunner final : public MLModelRunner {
 public:
   /// FeatureNames' type should be an indexed collection of std::string, like
   /// std::array or std::vector, that has a size() method.
   template <class FType>
   ReleaseModeModelRunner(LLVMContext &Ctx, const FType &InputSpec,
-                         StringRef DecisionName, StringRef FeedPrefix = "feed_",
-                         StringRef FetchPrefix = "fetch_")
-      : MLModelRunner(Ctx, MLModelRunner::Kind::Release, InputSpec.size()),
+                         StringRef DecisionName,
+                         const EmbeddedModelRunnerOptions &Options = {})
+      : MLModelRunner(Ctx, MLModelRunner::Kind::Release, InputSpec.size() + 1),
        CompiledModel(std::make_unique<TGen>()) {
     assert(CompiledModel && "The CompiledModel should be valid");
-
-    for (size_t I = 0; I < InputSpec.size(); ++I) {
-      const int Index =
-          CompiledModel->LookupArgIndex(FeedPrefix.str() + InputSpec[I].name());
-      void *Buffer = nullptr;
-      if (Index >= 0)
-        Buffer = CompiledModel->arg_data(Index);
-      setUpBufferForTensor(I, InputSpec[I], Buffer);
+    // Set up the model_selector past all the InputSpecs in all cases.
+    //   - if the model doesn't have such a feature, but the user requested it,
+    //     we report an error. Same if the model supports it but the user
+    //     didn't specify it
+    //   - finally, we compute the MD5 hash of the user input and set the value
+    //     of the model selector to {high, low}
+    bool InputIsPresent = true;
+    populateTensor(InputSpec.size(),
+                   TensorSpec::createSpec<uint64_t>("_model_selector", {2}),
+                   Options.FeedPrefix, InputIsPresent);
+
+    // If we hit the "report an error" cases outlined above, continue with the
+    // set up in case there's some custom diagnostics handler installed and it
+    // doesn't promptly exit.
+    if (Options.ModelSelector.empty() && InputIsPresent)
+      Ctx.emitError(
+          "A model selector was not specified but the underlying model "
+          "requires selecting one because it exposes a _model_selector input");
+    uint64_t High = 0;
+    uint64_t Low = 0;
+    if (!Options.ModelSelector.empty()) {
+      if (!InputIsPresent)
+        Ctx.emitError("A model selector was specified but the underlying model "
+                      "does not expose a _model_selector input");
+      const auto Hash = MD5::hash(arrayRefFromStringRef(Options.ModelSelector));
+      High = Hash.high();
+      Low = Hash.low();
     }
-
-    ResultIndex = CompiledModel->LookupResultIndex(FetchPrefix.str() +
+    getTensor<uint64_t>(InputSpec.size())[0] = High;
+    getTensor<uint64_t>(InputSpec.size())[1] = Low;
+    // At this point, the model selector is set up. If the user didn't provide
+    // one, but the model has a _model_selector, it'll be set to (0, 0) which
+    // the composite model should treat as error as part of its implementation
+    // (but that should only matter if there is a custom handler that doesn't
+    // exit on error)
+    for (size_t I = 0; I < InputSpec.size(); ++I)
+      populateTensor(I, InputSpec[I], Options.FeedPrefix, InputIsPresent);
+
+    ResultIndex = CompiledModel->LookupResultIndex(Options.FetchPrefix.str() +
                                                    DecisionName.str());
     assert(ResultIndex >= 0 && "Cannot find DecisionName in inlining model");
   }
@@ -59,6 +113,20 @@ class ReleaseModeModelRunner final : public MLModelRunner {
   }
 
 private:
+  // Fetch the model-provided buffer for the given Spec, or let MLModelRunner
+  // create a scratch buffer. Indicate back to the caller whether the model
+  // had that input in the first place.
+  void populateTensor(size_t Pos, const TensorSpec &Spec, StringRef Prefix,
+                      bool &InputIsPresent) {
+    const int Index =
+        CompiledModel->LookupArgIndex((Prefix + Spec.name()).str());
+    void *Buffer = nullptr;
+    InputIsPresent = Index >= 0;
+    if (InputIsPresent)
+      Buffer = CompiledModel->arg_data(Index);
+    setUpBufferForTensor(Pos, Spec, Buffer);
+  }
+
   void *evaluateUntyped() override {
     CompiledModel->Run();
     return CompiledModel->result_data(ResultIndex);
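The dispatch the header relies on (composite model picks a sub-model by comparing `_model_selector` to each contained model's hash words) can be illustrated with a toy mock. This is an illustrative sketch, not LLVM code; `ComposedModel` and `hash_words` are invented names, and the word order assumes `llvm::MD5Result`'s little-endian low-then-high digest layout:

```python
import hashlib
import struct

def hash_words(name: str) -> tuple[int, int]:
    # (high, low) little-endian words of the MD5 digest, as the runner computes.
    low, high = struct.unpack("<QQ", hashlib.md5(name.encode()).digest())
    return high, low

class ComposedModel:
    """Toy stand-in for a composite saved model dispatching on _model_selector."""

    def __init__(self):
        # Map each contained model's selector hash words to its evaluation.
        self.models = {
            hash_words("the model that adds"): lambda a, b: a + b,
            hash_words("the model that subtracts"): lambda a, b: a - b,
        }
        # (0, 0) is what the runner writes when no selector was given; the
        # composite model is expected to treat it as an error.
        self.selector = (0, 0)

    def run(self, a, b):
        model = self.models.get(self.selector)
        if model is None:
            raise ValueError("invalid or missing _model_selector")
        return model(a, b)

m = ComposedModel()
m.selector = hash_words("the model that subtracts")
print(m.run(1, 2))  # -> -1
```

The real composite SavedModel would express the same branching with `tf.cond`/`tf.case` over the `_model_selector` tensor, fixed at instantiation rather than per call.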

llvm/lib/Analysis/MLInlineAdvisor.cpp

Lines changed: 5 additions & 1 deletion

@@ -56,6 +56,9 @@ static cl::opt<SkipMLPolicyCriteria> SkipPolicy(
     clEnumValN(SkipMLPolicyCriteria::IfCallerIsNotCold,
                "if-caller-not-cold", "if the caller is not cold")));
 
+static cl::opt<std::string> ModelSelector("ml-inliner-model-selector",
+                                          cl::Hidden, cl::init(""));
+
 #if defined(LLVM_HAVE_TF_AOT_INLINERSIZEMODEL)
 // codegen-ed file
 #include "InlinerSizeModel.h" // NOLINT
@@ -73,7 +76,8 @@ llvm::getReleaseModeAdvisor(Module &M, ModuleAnalysisManager &MAM,
   std::unique_ptr<MLModelRunner> AOTRunner;
   if (InteractiveChannelBaseName.empty())
     AOTRunner = std::make_unique<ReleaseModeModelRunner<CompiledModelType>>(
-        M.getContext(), FeatureMap, DecisionName);
+        M.getContext(), FeatureMap, DecisionName,
+        EmbeddedModelRunnerOptions().setModelSelector(ModelSelector));
   else {
     auto Features = FeatureMap;
     if (InteractiveIncludeDefault)

llvm/unittests/Analysis/MLModelRunnerTest.cpp

Lines changed: 125 additions & 12 deletions

@@ -7,10 +7,12 @@
 //===----------------------------------------------------------------------===//
 
 #include "llvm/Analysis/MLModelRunner.h"
+#include "llvm/ADT/StringExtras.h"
 #include "llvm/Analysis/InteractiveModelRunner.h"
 #include "llvm/Analysis/NoInferenceModelRunner.h"
 #include "llvm/Analysis/ReleaseModeModelRunner.h"
 #include "llvm/Support/BinaryByteStream.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/FileUtilities.h"
 #include "llvm/Support/JSON.h"
@@ -28,28 +30,31 @@ namespace llvm {
 // This is a mock of the kind of AOT-generated model evaluator. It has 2 tensors
 // of shape {1}, and 'evaluation' adds them.
 // The interface is the one expected by ReleaseModelRunner.
-class MockAOTModel final {
+class MockAOTModelBase {
+protected:
   int64_t A = 0;
   int64_t B = 0;
   int64_t R = 0;
 
 public:
-  MockAOTModel() = default;
-  int LookupArgIndex(const std::string &Name) {
+  MockAOTModelBase() = default;
+  virtual ~MockAOTModelBase() = default;
+
+  virtual int LookupArgIndex(const std::string &Name) {
     if (Name == "prefix_a")
       return 0;
     if (Name == "prefix_b")
       return 1;
     return -1;
   }
   int LookupResultIndex(const std::string &) { return 0; }
-  void Run() { R = A + B; }
-  void *result_data(int RIndex) {
+  virtual void Run() = 0;
+  virtual void *result_data(int RIndex) {
     if (RIndex == 0)
       return &R;
     return nullptr;
   }
-  void *arg_data(int Index) {
+  virtual void *arg_data(int Index) {
     switch (Index) {
     case 0:
       return &A;
@@ -60,6 +65,64 @@ class MockAOTModel final {
     }
   }
 };
+
+class AdditionAOTModel final : public MockAOTModelBase {
+public:
+  AdditionAOTModel() = default;
+  void Run() override { R = A + B; }
+};
+
+class DiffAOTModel final : public MockAOTModelBase {
+public:
+  DiffAOTModel() = default;
+  void Run() override { R = A - B; }
+};
+
+static const char *M1Selector = "the model that subtracts";
+static const char *M2Selector = "the model that adds";
+
+static MD5::MD5Result Hash1 = MD5::hash(arrayRefFromStringRef(M1Selector));
+static MD5::MD5Result Hash2 = MD5::hash(arrayRefFromStringRef(M2Selector));
+
+class ComposedAOTModel final {
+  DiffAOTModel M1;
+  AdditionAOTModel M2;
+  uint64_t Selector[2] = {0};
+
+  bool isHashSameAsSelector(const std::pair<uint64_t, uint64_t> &Words) const {
+    return Selector[0] == Words.first && Selector[1] == Words.second;
+  }
+  MockAOTModelBase *getModel() {
+    if (isHashSameAsSelector(Hash1.words()))
+      return &M1;
+    if (isHashSameAsSelector(Hash2.words()))
+      return &M2;
+    llvm_unreachable("Should be one of the two");
+  }
+
+public:
+  ComposedAOTModel() = default;
+  int LookupArgIndex(const std::string &Name) {
+    if (Name == "prefix__model_selector")
+      return 2;
+    return getModel()->LookupArgIndex(Name);
+  }
+  int LookupResultIndex(const std::string &Name) {
+    return getModel()->LookupResultIndex(Name);
+  }
+  void *arg_data(int Index) {
+    if (Index == 2)
+      return Selector;
+    return getModel()->arg_data(Index);
+  }
+  void *result_data(int RIndex) { return getModel()->result_data(RIndex); }
+  void Run() { getModel()->Run(); }
+};
+
+static EmbeddedModelRunnerOptions makeOptions() {
+  EmbeddedModelRunnerOptions Opts;
+  Opts.setFeedPrefix("prefix_");
+  return Opts;
+}
 } // namespace llvm
 
 TEST(NoInferenceModelRunner, AccessTensors) {
@@ -86,8 +149,8 @@ TEST(ReleaseModeRunner, NormalUse) {
   LLVMContext Ctx;
   std::vector<TensorSpec> Inputs{TensorSpec::createSpec<int64_t>("a", {1}),
                                  TensorSpec::createSpec<int64_t>("b", {1})};
-  auto Evaluator = std::make_unique<ReleaseModeModelRunner<MockAOTModel>>(
-      Ctx, Inputs, "", "prefix_");
+  auto Evaluator = std::make_unique<ReleaseModeModelRunner<AdditionAOTModel>>(
+      Ctx, Inputs, "", makeOptions());
   *Evaluator->getTensor<int64_t>(0) = 1;
   *Evaluator->getTensor<int64_t>(1) = 2;
   EXPECT_EQ(Evaluator->evaluate<int64_t>(), 3);
@@ -100,8 +163,8 @@ TEST(ReleaseModeRunner, ExtraFeatures) {
   std::vector<TensorSpec> Inputs{TensorSpec::createSpec<int64_t>("a", {1}),
                                  TensorSpec::createSpec<int64_t>("b", {1}),
                                  TensorSpec::createSpec<int64_t>("c", {1})};
-  auto Evaluator = std::make_unique<ReleaseModeModelRunner<MockAOTModel>>(
-      Ctx, Inputs, "", "prefix_");
+  auto Evaluator = std::make_unique<ReleaseModeModelRunner<AdditionAOTModel>>(
+      Ctx, Inputs, "", makeOptions());
   *Evaluator->getTensor<int64_t>(0) = 1;
   *Evaluator->getTensor<int64_t>(1) = 2;
   *Evaluator->getTensor<int64_t>(2) = -3;
@@ -118,8 +181,8 @@ TEST(ReleaseModeRunner, ExtraFeaturesOutOfOrder) {
                                  TensorSpec::createSpec<int64_t>("c", {1}),
                                  TensorSpec::createSpec<int64_t>("b", {1}),
   };
-  auto Evaluator = std::make_unique<ReleaseModeModelRunner<MockAOTModel>>(
-      Ctx, Inputs, "", "prefix_");
+  auto Evaluator = std::make_unique<ReleaseModeModelRunner<AdditionAOTModel>>(
+      Ctx, Inputs, "", makeOptions());
   *Evaluator->getTensor<int64_t>(0) = 1;  // a
   *Evaluator->getTensor<int64_t>(1) = 2;  // c
   *Evaluator->getTensor<int64_t>(2) = -3; // b
@@ -129,6 +192,56 @@ TEST(ReleaseModeRunner, ExtraFeaturesOutOfOrder) {
   EXPECT_EQ(*Evaluator->getTensor<int64_t>(2), -3);
 }
 
+// We expect an error to be reported early if the user tried to specify a model
+// selector, but the model in fact doesn't support that.
+TEST(ReleaseModelRunner, ModelSelectorNoInputFeaturePresent) {
+  LLVMContext Ctx;
+  std::vector<TensorSpec> Inputs{TensorSpec::createSpec<int64_t>("a", {1}),
+                                 TensorSpec::createSpec<int64_t>("b", {1})};
+  EXPECT_DEATH(std::make_unique<ReleaseModeModelRunner<AdditionAOTModel>>(
+                   Ctx, Inputs, "", makeOptions().setModelSelector(M2Selector)),
+               "A model selector was specified but the underlying model does "
+               "not expose a _model_selector input");
+}
+
+TEST(ReleaseModelRunner, ModelSelectorNoSelectorGiven) {
+  LLVMContext Ctx;
+  std::vector<TensorSpec> Inputs{TensorSpec::createSpec<int64_t>("a", {1}),
+                                 TensorSpec::createSpec<int64_t>("b", {1})};
+  EXPECT_DEATH(
+      std::make_unique<ReleaseModeModelRunner<ComposedAOTModel>>(
+          Ctx, Inputs, "", makeOptions()),
+      "A model selector was not specified but the underlying model requires "
+      "selecting one because it exposes a _model_selector input");
+}
+
+// Test that we correctly set up the _model_selector tensor value. We are only
+// responsible for what happens if the user doesn't specify a value (but the
+// model supports the feature), or if the user specifies one, and we correctly
+// populate the tensor, and do so upfront (in case the model implementation
+// needs that for subsequent tensor buffer lookups).
+TEST(ReleaseModelRunner, ModelSelector) {
+  LLVMContext Ctx;
+  std::vector<TensorSpec> Inputs{TensorSpec::createSpec<int64_t>("a", {1}),
+                                 TensorSpec::createSpec<int64_t>("b", {1})};
+  // This explicitly asks for M1
+  auto Evaluator = std::make_unique<ReleaseModeModelRunner<ComposedAOTModel>>(
+      Ctx, Inputs, "", makeOptions().setModelSelector(M1Selector));
+  *Evaluator->getTensor<int64_t>(0) = 1;
+  *Evaluator->getTensor<int64_t>(1) = 2;
+  EXPECT_EQ(Evaluator->evaluate<int64_t>(), -1);
+
+  // Ask for M2
+  Evaluator = std::make_unique<ReleaseModeModelRunner<ComposedAOTModel>>(
+      Ctx, Inputs, "", makeOptions().setModelSelector(M2Selector));
+  *Evaluator->getTensor<int64_t>(0) = 1;
+  *Evaluator->getTensor<int64_t>(1) = 2;
+  EXPECT_EQ(Evaluator->evaluate<int64_t>(), 3);
+
+  // Asking for a model that's not supported isn't handled by our infra and we
+  // expect the model implementation to fail at a point.
+}
+
 #if defined(LLVM_ON_UNIX)
 TEST(InteractiveModelRunner, Evaluation) {
   LLVMContext Ctx;
