Commit 496156a

[X86][AMX] Multiple configure for AMX register.
The previous solution depends on variable names to record the shape information. That is not reliable, because in release builds the compiler does not set variable names. The names can be preserved with the additional option `-fno-discard-value-names`, but requiring that option is not acceptable for users.

This patch preconfigures the tile registers with machine instructions. It follows the same approach as single configure. In the future we can fall back to multiple configure when single configure fails due to the shape dependency issue.

The algorithm to configure the tile registers is simple in this patch; we may improve it in the future. It configures tile registers per basic block. The compiler spills a tile register if it is live out of the basic block. After the configure, there should be no spill across a tile configure during register allocation. Just like fast register allocation, the algorithm walks the instructions in reverse order. When the shape dependency is not met, it inserts ldtilecfg after the last instruction that defines the shape.

In post configuration, the compiler also walks the basic block to collect the physical tile register numbers and generates instructions to fill the stack slot with the corresponding shape information.

TODO: There is some follow-up work in D125602. The risk is that modifying the fast RA may cause regressions, as fast RA is used for different targets. We may create an independent RA for tile registers.

Differential Revision: https://reviews.llvm.org/D125075
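The reverse walk described in the commit message can be illustrated with a toy model. The sketch below is not the code from X86FastPreTileConfig.cpp: `Inst`, `findConfigPoints`, and `exampleBlock` are hypothetical names invented for illustration, registers are plain integers, and placing the configure at the block start when the shapes are defined outside the block is an assumption rather than something the message states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction; this is not an LLVM type.
struct Inst {
  std::string name;           // mnemonic, for readability only
  int def;                    // virtual register defined, or -1 for none
  std::vector<int> shapeUses; // row/column registers a tile instruction needs
};

// Returns the positions where an ldtilecfg would be placed in this model.
// Position p means "immediately before block[p]", i.e. right after the
// instruction that defines the last shape register needed below it.
std::vector<std::size_t> findConfigPoints(const std::vector<Inst> &block) {
  std::vector<std::size_t> points;
  std::set<int> pending; // shapes needed by tile instructions seen so far

  // Walk the block in reverse order, as fast register allocation does.
  for (std::size_t i = block.size(); i-- > 0;) {
    const Inst &inst = block[i];
    // This instruction defines a shape that a later tile instruction needs:
    // the configure cannot move above it, so place it right after it.
    if (inst.def >= 0 && pending.count(inst.def) != 0) {
      points.push_back(i + 1);
      pending.clear();
    }
    pending.insert(inst.shapeUses.begin(), inst.shapeUses.end());
  }

  // Assumption: shapes defined outside the block are configured at its start.
  if (!pending.empty())
    points.push_back(0);

  std::reverse(points.begin(), points.end());
  assert(std::is_sorted(points.begin(), points.end()));
  return points;
}

// Hypothetical block: two tile loads whose shapes are defined at different points.
std::vector<Inst> exampleBlock() {
  return {
      {"mov", 1, {}},             // 0: row for the first tile
      {"mov", 2, {}},             // 1: column shared by both tiles
      {"tileloadd", 10, {1, 2}},  // 2: needs shapes 1 and 2
      {"mov", 3, {}},             // 3: row for the second tile
      {"tileloadd", 11, {3, 2}},  // 4: needs shapes 3 and 2
      {"tilestored", -1, {3, 2}}, // 5: stores the second tile
  };
}
```

For `exampleBlock()`, the model returns positions 2 and 4: one configure after the second `mov`, and another after the `mov` at index 3, because the second tile's row shape is not yet defined where the first configure sits.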
1 parent 62a9b36 commit 496156a

17 files changed, with 2079 additions and 407 deletions

llvm/lib/Target/X86/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ set(sources
   X86PreAMXConfig.cpp
   X86LowerAMXIntrinsics.cpp
   X86TileConfig.cpp
+  X86FastPreTileConfig.cpp
   X86FastTileConfig.cpp
   X86PreTileConfig.cpp
   X86ExpandPseudo.cpp

llvm/lib/Target/X86/X86.h

Lines changed: 4 additions & 0 deletions
@@ -79,6 +79,9 @@ FunctionPass *createX86DynAllocaExpander();
 /// Return a pass that config the tile registers.
 FunctionPass *createX86TileConfigPass();
 
+/// Return a pass that preconfig the tile registers before fast reg allocation.
+FunctionPass *createX86FastPreTileConfigPass();
+
 /// Return a pass that config the tile registers after fast reg allocation.
 FunctionPass *createX86FastTileConfigPass();
 
@@ -175,6 +178,7 @@ void initializeX86PartialReductionPass(PassRegistry &);
 void initializeX86SpeculativeLoadHardeningPassPass(PassRegistry &);
 void initializeX86SpeculativeExecutionSideEffectSuppressionPass(PassRegistry &);
 void initializeX86PreTileConfigPass(PassRegistry &);
+void initializeX86FastPreTileConfigPass(PassRegistry &);
 void initializeX86FastTileConfigPass(PassRegistry &);
 void initializeX86TileConfigPass(PassRegistry &);
 void initializeX86LowerAMXTypeLegacyPassPass(PassRegistry &);
