[GlobalISel] Change MatchTable entries to 1 byte each #74429

Pierre-vh · 2023-12-05T07:54:06Z

See https://discourse.llvm.org/t/rfc-make-globalisel-match-table-entries-1-byte-instead-of-8/75411

This helps reduce llc's binary size, at the cost of some added complexity to the MatchTable machinery.

llvmbot · 2023-12-05T07:54:35Z

@llvm/pr-subscribers-llvm-globalisel

Author: Pierre van Houtryve (Pierre-vh)

Changes

See https://discourse.llvm.org/t/rfc-make-globalisel-match-table-entries-1-byte-instead-of-8/75411

This helps reduce llc's binary size, at the cost of some added complexity to the MatchTable machinery.

NOTE: This is a work-in-progress. There are 40 tests to update manually and I'm doing them bit by bit.

Patch is 282.95 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/74429.diff

23 Files Affected:

(modified) llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h (+215-188)
(modified) llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h (+225-194)
(modified) llvm/test/TableGen/ContextlessPredicates.td (+49-51)
(modified) llvm/test/TableGen/DefaultOpsGlobalISel.td (+194-95)
(modified) llvm/test/TableGen/GlobalISelCombinerEmitter/builtins/match-table-eraseroot.td (+6-6)
(modified) llvm/test/TableGen/GlobalISelCombinerEmitter/builtins/match-table-replacerreg.td (+19-19)
(modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-imms.td (+30-30)
(modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-miflags.td (+13-13)
(modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-operand-types.td (+12-12)
(modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-patfrag-root.td (+30-30)
(modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-permutations.td (+130-130)
(modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-typeof.td (+19-19)
(modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-variadics.td (+25-25)
(modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table.td (+61-61)
(modified) llvm/test/TableGen/GlobalISelEmitterSubreg.td (+84-84)
(modified) llvm/test/TableGen/GlobalISelEmitterVariadic.td (+11-11)
(modified) llvm/test/TableGen/HasNoUse.td (+4-4)
(modified) llvm/test/TableGen/immarg-predicated.td (+4-4)
(modified) llvm/test/TableGen/immarg.td (+3-3)
(modified) llvm/test/TableGen/predicate-patfags.td (+10-10)
(modified) llvm/utils/TableGen/GlobalISelMatchTable.cpp (+319-196)
(modified) llvm/utils/TableGen/GlobalISelMatchTable.h (+19-13)
(modified) llvm/utils/TableGen/GlobalISelMatchTableExecutorEmitter.cpp (+6-2)

diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
index f5d9f5f40881c..2fb149144ab76 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
@@ -54,10 +54,30 @@ enum {
   GICXXCustomAction_Invalid = 0,
 };
 
+/// The MatchTable is encoded as an array of bytes.
+/// Thus, opcodes are expected to be <255.
+///
+/// Operands can be variable-sized, their size is always after their name
+/// in the docs, e.g. "Foo(4)" means that "Foo" takes 4 entries in the table,
+/// so 4 bytes. "Foo()"
+///
+/// As a general rule of thumb:
+///   - Instruction & Operand IDs are ULEB128
+///   - LLT IDs are 1 byte
+///   - Predicates and target opcodes, register and register class IDs are 2
+///     bytes.
+///   - Indexes into the table are 4 bytes.
+///   - Inline constants are 8 bytes
+///
+/// Design notes:
+///   - Inst/Op IDs have to be LEB128 because some targets generate
+///     extremely long patterns which need more than 255 temporaries.
+///     We could just use 2 bytes everytime, but then some targets like
+///     X86/AMDGPU that have no need for it will pay the price all the time.
 enum {
   /// Begin a try-block to attempt a match and jump to OnFail if it is
   /// unsuccessful.
-  /// - OnFail - The MatchTable entry at which to resume if the match fails.
+  /// - OnFail(4) - The MatchTable entry at which to resume if the match fails.
   ///
   /// FIXME: This ought to take an argument indicating the number of try-blocks
   ///        to exit on failure. It's usually one but the last match attempt of
@@ -68,100 +88,103 @@ enum {
   GIM_Try,
 
   /// Switch over the opcode on the specified instruction
-  /// - InsnID - Instruction ID
-  /// - LowerBound - numerically minimum opcode supported
-  /// - UpperBound - numerically maximum + 1 opcode supported
-  /// - Default - failure jump target
-  /// - JumpTable... - (UpperBound - LowerBound) (at least 2) jump targets
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - LowerBound(2) - numerically minimum opcode supported
+  /// - UpperBound(2) - numerically maximum + 1 opcode supported
+  /// - Default(4) - failure jump target
+  /// - JumpTable(4)... - (UpperBound - LowerBound) (at least 2) jump targets
   GIM_SwitchOpcode,
 
   /// Switch over the LLT on the specified instruction operand
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
-  /// - LowerBound - numerically minimum Type ID supported
-  /// - UpperBound - numerically maximum + 1 Type ID supported
-  /// - Default - failure jump target
-  /// - JumpTable... - (UpperBound - LowerBound) (at least 2) jump targets
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
+  /// - LowerBound(2) - numerically minimum Type ID supported
+  /// - UpperBound(2) - numerically maximum + 1 Type ID supported
+  /// - Default(4) - failure jump target
+  /// - JumpTable(4)... - (UpperBound - LowerBound) (at least 2) jump targets
   GIM_SwitchType,
 
   /// Record the specified instruction.
   /// The IgnoreCopies variant ignores COPY instructions.
-  /// - NewInsnID - Instruction ID to define
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
+  /// - NewInsnID(ULEB128) - Instruction ID to define
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
   GIM_RecordInsn,
   GIM_RecordInsnIgnoreCopies,
 
   /// Check the feature bits
-  /// - Expected features
+  ///   Feature(2) - Expected features
   GIM_CheckFeatures,
 
   /// Check the opcode on the specified instruction
-  /// - InsnID - Instruction ID
-  /// - Expected opcode
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - Opc(2) - Expected opcode
   GIM_CheckOpcode,
 
   /// Check the opcode on the specified instruction, checking 2 acceptable
   /// alternatives.
-  /// - InsnID - Instruction ID
-  /// - Expected opcode
-  /// - Alternative expected opcode
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - Opc(2) - Expected opcode
+  /// - Opc(2) - Alternative expected opcode
   GIM_CheckOpcodeIsEither,
 
   /// Check the instruction has the right number of operands
-  /// - InsnID - Instruction ID
-  /// - Expected number of operands
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - Ops(ULEB128) - Expected number of operands
   GIM_CheckNumOperands,
+
   /// Check an immediate predicate on the specified instruction
-  /// - InsnID - Instruction ID
-  /// - The predicate to test
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - Pred(2) - The predicate to test
   GIM_CheckI64ImmPredicate,
   /// Check an immediate predicate on the specified instruction via an APInt.
-  /// - InsnID - Instruction ID
-  /// - The predicate to test
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - Pred(2) - The predicate to test
   GIM_CheckAPIntImmPredicate,
   /// Check a floating point immediate predicate on the specified instruction.
-  /// - InsnID - Instruction ID
-  /// - The predicate to test
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - Pred(2) - The predicate to test
   GIM_CheckAPFloatImmPredicate,
   /// Check an immediate predicate on the specified instruction
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
-  /// - The predicate to test
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
+  /// - Pred(2) - The predicate to test
   GIM_CheckImmOperandPredicate,
+
   /// Check a memory operation has the specified atomic ordering.
-  /// - InsnID - Instruction ID
-  /// - Ordering - The AtomicOrdering value
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - Ordering(ULEB128) - The AtomicOrdering value
   GIM_CheckAtomicOrdering,
   GIM_CheckAtomicOrderingOrStrongerThan,
   GIM_CheckAtomicOrderingWeakerThan,
+
   /// Check the size of the memory access for the given machine memory operand.
-  /// - InsnID - Instruction ID
-  /// - MMOIdx - MMO index
-  /// - Size - The size in bytes of the memory access
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - MMOIdx(ULEB128) - MMO index
+  /// - Size(4) - The size in bytes of the memory access
   GIM_CheckMemorySizeEqualTo,
 
   /// Check the address space of the memory access for the given machine memory
   /// operand.
-  /// - InsnID - Instruction ID
-  /// - MMOIdx - MMO index
-  /// - NumAddrSpace - Number of valid address spaces
-  /// - AddrSpaceN - An allowed space of the memory access
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - MMOIdx(ULEB128) - MMO index
+  /// - NumAddrSpace(ULEB128) - Number of valid address spaces
+  /// - AddrSpaceN(ULEB128) - An allowed space of the memory access
   /// - AddrSpaceN+1 ...
   GIM_CheckMemoryAddressSpace,
 
   /// Check the minimum alignment of the memory access for the given machine
   /// memory operand.
-  /// - InsnID - Instruction ID
-  /// - MMOIdx - MMO index
-  /// - MinAlign - Minimum acceptable alignment
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - MMOIdx(ULEB128) - MMO index
+  /// - MinAlign(ULEB128) - Minimum acceptable alignment
   GIM_CheckMemoryAlignment,
 
   /// Check the size of the memory access for the given machine memory operand
   /// against the size of an operand.
-  /// - InsnID - Instruction ID
-  /// - MMOIdx - MMO index
-  /// - OpIdx - The operand index to compare the MMO against
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - MMOIdx(ULEB128) - MMO index
+  /// - OpIdx(ULEB128) - The operand index to compare the MMO against
   GIM_CheckMemorySizeEqualToLLT,
   GIM_CheckMemorySizeLessThanLLT,
   GIM_CheckMemorySizeGreaterThanLLT,
@@ -170,106 +193,110 @@ enum {
   /// constant. This is valid for both G_BUILD_VECTOR as well as
   /// G_BUILD_VECTOR_TRUNC. For AllOnes refers to individual bits, so a -1
   /// element.
-  /// - InsnID - Instruction ID
+  /// - InsnID(ULEB128) - Instruction ID
   GIM_CheckIsBuildVectorAllOnes,
   GIM_CheckIsBuildVectorAllZeros,
 
   /// Check a trivial predicate which takes no arguments.
   /// This can be used by executors to implement custom flags that don't fit in
   /// target features.
+  /// - Pred(2) - Predicate ID to check.
   GIM_CheckSimplePredicate,
 
   /// Check a generic C++ instruction predicate
-  /// - InsnID - Instruction ID
-  /// - PredicateID - The ID of the predicate function to call
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - PredicateID(2) - The ID of the predicate function to call
   GIM_CheckCxxInsnPredicate,
 
   /// Check if there's no use of the first result.
-  /// - InsnID - Instruction ID
+  /// - InsnID(ULEB128) - Instruction ID
   GIM_CheckHasNoUse,
 
   /// Check the type for the specified operand
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
-  /// - Expected type
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
+  /// - Ty(1) - Expected type
   GIM_CheckType,
+
   /// Check the type of a pointer to any address space.
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
-  /// - SizeInBits - The size of the pointer value in bits.
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
+  /// - SizeInBits(ULEB128) - The size of the pointer value in bits.
   GIM_CheckPointerToAny,
+
   /// Check the register bank for the specified operand
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
-  /// - Expected register bank (specified as a register class)
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
+  /// - RC(2) - Expected register bank (specified as a register class)
   GIM_CheckRegBankForClass,
 
   /// Check the operand matches a complex predicate
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
-  /// - RendererID - The renderer to hold the result
-  /// - Complex predicate ID
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
+  /// - RendererID(2) - The renderer to hold the result
+  /// - Pred(2) - Complex predicate ID
   GIM_CheckComplexPattern,
 
   /// Check the operand is a specific integer
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
-  /// - Expected integer
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
+  /// - Val(8) Expected integer
   GIM_CheckConstantInt,
   /// Check the operand is a specific literal integer (i.e. MO.isImm() or
   /// MO.isCImm() is true).
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
-  /// - Expected integer
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
+  /// - Val(8) - Expected integer
   GIM_CheckLiteralInt,
+
   /// Check the operand is a specific intrinsic ID
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
-  /// - Expected Intrinsic ID
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
+  /// - IID(2) - Expected Intrinsic ID
   GIM_CheckIntrinsicID,
 
   /// Check the operand is a specific predicate
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
-  /// - Expected predicate
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
+  /// - Pred(2) - Expected predicate
   GIM_CheckCmpPredicate,
 
   /// Check the specified operand is an MBB
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
   GIM_CheckIsMBB,
 
   /// Check the specified operand is an Imm
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
   GIM_CheckIsImm,
 
   /// Check if the specified operand is safe to fold into the current
   /// instruction.
-  /// - InsnID - Instruction ID
+  /// - InsnID(ULEB128) - Instruction ID
   GIM_CheckIsSafeToFold,
 
   /// Check the specified operands are identical.
   /// The IgnoreCopies variant looks through COPY instructions before
   /// comparing the operands.
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
-  /// - OtherInsnID - Other instruction ID
-  /// - OtherOpIdx - Other operand index
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
+  /// - OtherInsnID(ULEB128) - Other instruction ID
+  /// - OtherOpIdx(ULEB128) - Other operand index
   GIM_CheckIsSameOperand,
   GIM_CheckIsSameOperandIgnoreCopies,
 
   /// Check we can replace all uses of a register with another.
-  /// - OldInsnID
-  /// - OldOpIdx
-  /// - NewInsnID
-  /// - NewOpIdx
+  /// - OldInsnID(ULEB128)
+  /// - OldOpIdx(ULEB128)
+  /// - NewInsnID(ULEB128)
+  /// - NewOpIdx(ULEB128)
   GIM_CheckCanReplaceReg,
 
   /// Check that a matched instruction has, or doesn't have a MIFlag.
   ///
-  /// - InsnID  - Instruction to check.
-  /// - Flag(s) - (can be one or more flags OR'd together)
+  /// - InsnID(ULEB128) - Instruction to check.
+  /// - Flags(4) - (can be one or more flags OR'd together)
   GIM_MIFlags,
   GIM_MIFlagsNot,
 
@@ -277,15 +304,15 @@ enum {
   /// named operands that will be recorded in RecordedOperands. Names of these
   /// operands are referenced in predicate argument list. Emitter determines
   /// StoreIdx(corresponds to the order in which names appear in argument list).
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
-  /// - StoreIdx - Store location in RecordedOperands.
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
+  /// - StoreIdx(ULEB128) - Store location in RecordedOperands.
   GIM_RecordNamedOperand,
 
   /// Records an operand's register type into the set of temporary types.
-  /// - InsnID - Instruction ID
-  /// - OpIdx - Operand index
-  /// - TempTypeIdx - Temp Type Index, always negative.
+  /// - InsnID(ULEB128) - Instruction ID
+  /// - OpIdx(ULEB128) - Operand index
+  /// - TempTypeIdx(1) - Temp Type Index, always negative.
   GIM_RecordRegType,
 
   /// Fail the current try-block, or completely fail to match if there is no
@@ -295,121 +322,122 @@ enum {
   //=== Renderers ===
 
   /// Mutate an instruction
-  /// - NewInsnID - Instruction ID to define
-  /// - OldInsnID - Instruction ID to mutate
-  /// - NewOpcode - The new opcode to use
+  /// - NewInsnID(ULEB128) - Instruction ID to define
+  /// - OldInsnID(ULEB128) - Instruction ID to mutate
+  /// - NewOpcode(2) - The new opcode to use
   GIR_MutateOpcode,
 
   /// Build a new instruction
-  /// - InsnID - Instruction ID to define
-  /// - Opcode - The new opcode to use
+  /// - InsnID(ULEB128) - Instruction ID to define
+  /// - Opcode(2) - The new opcode to use
   GIR_BuildMI,
 
   /// Builds a constant and stores its result in a TempReg.
-  /// - TempRegID - Temp Register to define.
-  /// - Imm - The immediate to add
+  /// - TempRegID(ULEB128) - Temp Register to define.
+  /// - Imm(8) - The immediate to add
   GIR_BuildConstant,
 
   /// Copy an operand to the specified instruction
-  /// - NewInsnID - Instruction ID to modify
-  /// - OldInsnID - Instruction ID to copy from
-  /// - OpIdx - The operand to copy
+  /// - NewInsnID(ULEB128) - Instruction ID to modify
+  /// - OldInsnID(ULEB128) - Instruction ID to copy from
+  /// - OpIdx(ULEB128) - The operand to copy
   GIR_Copy,
 
   /// Copy an operand to the specified instruction or add a zero register if the
   /// operand is a zero immediate.
-  /// - NewInsnID - Instruction ID to modify
-  /// - OldInsnID - Instruction ID to copy from
-  /// - OpIdx - The operand to copy
-  /// - ZeroReg - The zero register to use
+  /// - NewInsnID(ULEB128) - Instruction ID to modify
+  /// - OldInsnID(ULEB128) - Instruction ID to copy from
+  /// - OpIdx(ULEB128) - The operand to copy
+  /// - ZeroReg(2) - The zero register to use
   GIR_CopyOrAddZeroReg,
   /// Copy an operand to the specified instruction
-  /// - NewInsnID - Instruction ID to modify
-  /// - OldInsnID - Instruction ID to copy from
-  /// - OpIdx - The operand to copy
-  /// - SubRegIdx - The subregister to copy
+  /// - NewInsnID(ULEB128) - Instruction ID to modify
+  /// - OldInsnID(ULEB128) - Instruction ID to copy from
+  /// - OpIdx(ULEB128) - The operand to copy
+  /// - SubRegIdx(2) - The subregister to copy
   GIR_CopySubReg,
 
   /// Add an implicit register def to the specified instruction
-  /// - InsnID - Instruction ID to modify
-  /// - RegNum - The register to add
-  /// - Flags - Register Flags
+  /// - InsnID(ULEB128) - Instruction ID to modify
+  /// - RegNum(2) - The register to add
+  /// - Flags(2) - Register Flags
   GIR_AddImplicitDef,
   /// Add an implicit register use to the specified instruction
-  /// - InsnID - Instruction ID to modify
-  /// - RegNum - The register to add
+  /// - InsnID(ULEB128) - Instruction ID to modify
+  /// - RegNum(2) - The register to add
   GIR_AddImplicitUse,
   /// Add an register to the specified instruction
-  /// - InsnID - Instruction ID to modify
-  /// - RegNum - The register to add
+  /// - InsnID(ULEB128) - Instruction ID to modify
+  /// - RegNum(2) - The register to add
+  /// - Flags(2) - Register Flags
   GIR_AddRegister,
 
   /// Marks the implicit def of a register as dead.
-  /// - InsnID - Instruction ID to modify
-  /// - OpIdx - The implicit def operand index
+  /// - InsnID(ULEB128) - Instruction ID to modify
+  /// - OpIdx(ULEB128) - The implicit def operand index
   ///
   /// OpIdx starts at 0 for the first implicit def.
   GIR_SetImplicitDefDead,
 
   /// Set or unset a MIFlag on an instruction.
   ///
-  /// - InsnID  - Instruction to modify.
-  /// - Flag(s) - (can be one or more flags OR'd together)
+  /// - InsnID(ULEB128)  - Instruction to modify.
+  /// - Flags(4) - (can be one or more flags OR'd together)
   GIR_SetMIFlags,
   GIR_UnsetMIFlags,
 
   /// Copy the MIFlags of a matched instruction into an
   /// output instruction. The flags are OR'd together.
   ///
-  /// - InsnID     - Instruction to modify.
-  /// - OldInsnID  - Matched instruction to copy flags from.
+  /// - InsnID(ULEB128)     - Instruction to modify.
+  /// - OldInsnID(ULEB128)  - Matched instruction to copy flags from.
   GIR_CopyMIFlags,
 
   /// Add a temporary register to the specified instruction
-  /// - InsnID - Instruction ID to modify
-  /// - TempRegID - The temporary register ID to add
-  /// - TempRegFlags - The register flags to set
+  /// - InsnID(ULEB128) - Instruction ID to modify
+  /// - TempRegID(ULEB128) - The temporary register ID to add
+  /// - TempRegFlags(2) - The register flags to set
   GIR_AddTempRegister,
 
   /// Add a temporary register to the specified instruction
-  /// - InsnID - Instruction ID to modify
-  /// - TempRegID - The temporary register ID to add
-  /// - TempRegFlags - The register flags to set
-  /// - SubRegIndex - The subregister index to set
+  /// - InsnID(ULEB128) - Instruction ID to modify
+  /// - TempRegID(ULEB128) - The temporary register ID to add
+  /// - TempRegFlags(2) - The register flags to set
+  /// - SubRegIndex(2) - The subregister index to set
   GIR_AddTempSubRegister,
 
   /// Add an immediate to the specified instruction
-  /// - InsnID - Instruction ID to modify
-  /// - Imm - The immediate to add
+  /// - InsnID(ULEB128) - Instruction ID to modify
+  /// - Imm(8) - The immediate to add
   GIR_AddImm,
 
   /// Add an CImm to the specified instruction
-  /// - InsnID - Instruction ID to modify
-  /// - Ty - Type of the constant immediate.
-  /// - Imm - The immediate to add
+  /// - InsnID(ULEB128) - Instruction ID to modify
+  /// - Ty(1) - Type of the constant immediate.
+  /// - Imm(8) - The immediate to add
   GIR_AddCImm,
 
   /// Render complex operands to the specified instruction
-  /// - InsnID - Instruction ID to modify
-  /// - RendererID - The renderer to call
+  /// - InsnID(ULEB128) - Instruction ID to modify
+  /// - RendererID(2) - The renderer to call
   GIR_ComplexRenderer,
   /// Render sub-operands of complex operands to the specified instruction
-  /// - InsnID - Instruction ID to modify
-  /// - RendererID - The renderer to call
-  /// - RenderOpID - The suboperand to render.
+  /// - InsnID(ULEB128) - Instruction ID to modify
+  /// - RendererID(2) - The renderer to call
+  /// - RenderOpID(ULEB128) - The suboperand to render.
   GIR_ComplexSubOperandRenderer,
   /// Render subregisters of suboperands of complex operands to the
   /// specified instruction
-  /// - InsnID - Instruction ID to modify
-  /// - Render...
[truncated]

wangpc-pp

The code is LGTM in general, but the changes are large and I need more eyes.

llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h

llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h

llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h

llvm/utils/TableGen/GlobalISelMatchTable.cpp

github-actions · 2023-12-07T08:25:54Z

:white_check_mark: With the latest revision this PR passed the C/C++ code formatter.

wangpc-pp

LGTM!
I'd like to see this great work landed but please wait for one more approval from other reviewers to see if they have some comments.

llvm/utils/TableGen/GlobalISelMatchTable.cpp

See https://discourse.llvm.org/t/rfc-make-globalisel-match-table-entries-1-byte-instead-of-8/75411 This helps reduce llc's binary size, at the cost of some added complexity to the MatchTable machinery.

Pierre-vh · 2023-12-12T06:55:00Z

@arsenm / @aemerson is this good to land?

qcolombet

Given we already talked about compile time in a previous thread and things were looking good, this LGTM

aemerson · 2023-12-12T10:43:06Z

Please hold off for now, I'm still trying to collect data.

aemerson · 2023-12-12T17:06:15Z

So I built clang with release mode and no assertions, with thinlto. I used fc791b6 as the baseline and:

commit 44f6d94c58b8073432f52b368258414f9afd0ae4 (compact-gisel-match-table)
Author: pvanhout <[email protected]>
Date:   Fri Dec 8 08:56:30 2023 +0100

    Simplify getEncodedEmitStr

as the test commit.

Building CTMark in the test suite with -j1 and 10 runs, here's the data for -Os, with compare.py using --merge-average.

Program                                       compile_time
                                              before        after  diff
tramp3d-v4/tramp3d-v4                           5.37         5.42  0.8%
mafft/pairlocalalign                            2.51         2.53  0.5%
kimwitu++/kc                                    6.45         6.47  0.4%
ClamAV/clamscan                                 4.83         4.84  0.2%
sqlite3/sqlite3                                 2.35         2.35  0.1%
SPASS/SPASS                                     4.26         4.26  0.0%
lencod/lencod                                   4.39         4.39 -0.0%
Bullet/bullet                                  10.28        10.27 -0.1%
7zip/7zip-benchmark                            13.91        13.90 -0.1%
consumer-typeset/consumer-typeset               3.39         3.38 -0.3%
                           Geomean difference                      0.2%

So it seems there's a 0.2% geomean regression but I'm not fully convinced this isn't noise.

I decided to dig further by trying to run llc on a sqlite3 optimized bitcode file, to test just codegen performance and try to reduce the effect of noise from the rest of compilation/file IO. Doing this and measuring CPU cycles I saw a 0.5% mean regression.

I then looked at your PR here: #74823 and measuring this resulted in a no statistically significant change in CTMark overall compilation times vs baseline (so no regression anymore), however I did see a 0.8% reduction in average cycles for the llc test, compared to the baseline. So it seems that PR brings a needed tiny perf improvement.

Overall, given that this change is a predecessor to that PR and that it does give a huge size improvement, I think this is ok to go.

wangpc-pp · 2023-12-13T03:54:07Z

I then looked at your PR here: #74823 and measuring this resulted in a no statistically significant change in CTMark overall compilation times vs baseline (so no regression anymore), however I did see a 0.8% reduction in average cycles for the llc test, compared to the baseline. So it seems that PR brings a needed tiny perf improvement.

I think the measurement result of #74823 may be noise too, because that PR reduces less code size than this PR.
In a word, the measurement result shows no regression here. So, I think we are OK to go further?

aemerson · 2023-12-13T04:23:09Z

I then looked at your PR here: #74823 and measuring this resulted in a no statistically significant change in CTMark overall compilation times vs baseline (so no regression anymore), however I did see a 0.8% reduction in average cycles for the llc test, compared to the baseline. So it seems that PR brings a needed tiny perf improvement.

I think the measurement result of #74823 may be noise too, because that PR reduces less code size than this PR.

In a word, the measurement result shows no regression here. So, I think we are OK to go further?

I think that's possible perhaps but unlikely. These measurements were from 40 runs each, directly reading PMU counters. The root cause for the improvement I don't know yet, but yes agreed that either way it's worth the binary size reduction.

Pierre-vh · 2023-12-13T07:48:46Z

Thanks a lot for the extensive testing @aemerson :)

vitalybuka · 2023-12-13T21:50:21Z

Breaks https://lab.llvm.org/buildbot/#/builders/5/builds/39231/steps/9/logs/stdio
Could you please fix or revert

vitalybuka · 2023-12-13T22:05:51Z

Breaks https://lab.llvm.org/buildbot/#/builders/5/builds/39231/steps/9/logs/stdio Could you please fix or revert

Bot should be fixed with eabf7ec

BTW. Does it need to take into account llvm::endian?

Pierre-vh · 2023-12-14T07:11:39Z

Breaks https://lab.llvm.org/buildbot/#/builders/5/builds/39231/steps/9/logs/stdio Could you please fix or revert

Bot should be fixed with eabf7ec

BTW. Does it need to take into account llvm::endian?

I don't think it needs to, the host machine should read the value as big/little endian naturally. The encoding part in TableGen is endian-aware and encodes it depending on big/little endian though

Pierre-vh requested review from aemerson, jayfoad, arsenm, dsandersllvm and wangpc-pp December 5, 2023 07:54

llvmbot added the llvm:globalisel label Dec 5, 2023

wangpc-pp reviewed Dec 6, 2023

View reviewed changes

Pierre-vh force-pushed the compact-gisel-match-table branch from 3726968 to ca4ba8b Compare December 7, 2023 08:23

Pierre-vh force-pushed the compact-gisel-match-table branch from cf70675 to 82d90dc Compare December 7, 2023 08:38

Pierre-vh mentioned this pull request Dec 7, 2023

[TableGen][GlobalISel] Emit Comment with MatchTable Size #74701

Merged

Pierre-vh force-pushed the compact-gisel-match-table branch from 82d90dc to 39fd7e1 Compare December 7, 2023 08:42

wangpc-pp approved these changes Dec 7, 2023

View reviewed changes

llvm/utils/TableGen/GlobalISelMatchTable.cpp Show resolved Hide resolved

llvm/utils/TableGen/GlobalISelMatchTable.cpp Outdated Show resolved Hide resolved

llvm/utils/TableGen/GlobalISelMatchTable.cpp Show resolved Hide resolved

Pierre-vh mentioned this pull request Dec 8, 2023

[TableGen][GlobalISel] Add specialized opcodes #74823

Merged

Pierre-vh force-pushed the compact-gisel-match-table branch from 44f6d94 to 6fadf20 Compare December 11, 2023 10:24

Pierre-vh requested a review from qcolombet December 11, 2023 10:24

[GlobalISel] Change MatchTable entries to 1 byte each

cfb7372

See https://discourse.llvm.org/t/rfc-make-globalisel-match-table-entries-1-byte-instead-of-8/75411 This helps reduce llc's binary size, at the cost of some added complexity to the MatchTable machinery.

Pierre-vh force-pushed the compact-gisel-match-table branch from 6fadf20 to cfb7372 Compare December 12, 2023 06:54

qcolombet approved these changes Dec 12, 2023

View reviewed changes

Pierre-vh merged commit a110e99 into llvm:main Dec 13, 2023

Pierre-vh deleted the compact-gisel-match-table branch December 13, 2023 07:49

vitalybuka added a commit that referenced this pull request Dec 13, 2023

[GlobalISel] Fix misaligned read after #74429

eabf7ec

[GlobalISel] Change MatchTable entries to 1 byte each #74429

[GlobalISel] Change MatchTable entries to 1 byte each #74429

Uh oh!

Conversation

Pierre-vh commented Dec 5, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Dec 5, 2023

Uh oh!

wangpc-pp left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented Dec 7, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wangpc-pp left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Pierre-vh commented Dec 12, 2023

Uh oh!

qcolombet left a comment

Choose a reason for hiding this comment

Uh oh!

aemerson commented Dec 12, 2023

Uh oh!

aemerson commented Dec 12, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wangpc-pp commented Dec 13, 2023

Uh oh!

aemerson commented Dec 13, 2023

Uh oh!

Pierre-vh commented Dec 13, 2023

Uh oh!

vitalybuka commented Dec 13, 2023

Uh oh!

vitalybuka commented Dec 13, 2023

Uh oh!

Pierre-vh commented Dec 14, 2023

Uh oh!

Uh oh!

Pierre-vh commented Dec 5, 2023 •

edited

Loading

github-actions bot commented Dec 7, 2023 •

edited

Loading

aemerson commented Dec 12, 2023 •

edited

Loading