[LLVM][TableGen] Parameterize NumToSkip in DecoderEmitter #136187

jurahul · 2025-04-17T19:39:05Z

Add a command line option, num-to-skip-size, to parameterize the size in bytes of the NumToSkip field in the decoder table.
The default value is 2; targets that need a larger size can use 3.
All existing targets except AArch64 keep using size 2. AArch64 changes to size 3 because it runs into the "disassembler decoding table too large" error with size 2.
The following is a rough reduction in size for the decoder tables from switching to size 2.

jurahul · 2025-04-17T19:42:40Z

Note, now the decoder table generated by the decoder emitter looks like:

```cpp
struct DecoderTable2Bytes { // Decoder Table with 2 bytes NumToSkip.
   ArrayRef<uint8_t> Data;
};
struct DecoderTable3Bytes { // Decoder Table with 3 bytes NumToSkip.
   ArrayRef<uint8_t> Data;
};
static constexpr uint8_t DecoderTable32RawData[] = ...
static constexpr DecoderTable3Bytes DecoderTable32{DecoderTable32RawData};
...
template <typename InsnType>
static DecodeStatus decodeInstruction(DecoderTable3Bytes DecodeTable,
                                      MCInst &MI, InsnType insn,
                                      uint64_t Address,
                                      const MCDisassembler *DisAsm,
                                      const MCSubtargetInfo &STI)
```

So we will now have 2 versions of decodeInstruction with different signatures, which keeps them from trampling on each other.
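
To illustrate the overload idea outside of LLVM, here is a minimal standalone sketch. It is not the generated code: the wrapper structs hold a raw pointer and size instead of `ArrayRef<uint8_t>`, `peekNumToSkip` and `readNumToSkip` are hypothetical helper names, and the little-endian byte order of the NumToSkip field is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-ins for the generated wrapper types (assumption: the real
// ones wrap an ArrayRef<uint8_t>, as shown in the comment above).
struct DecoderTable2Bytes { const uint8_t *Data; size_t Size; };
struct DecoderTable3Bytes { const uint8_t *Data; size_t Size; };

// Hypothetical helper: reads a NumToSkip field of Width bytes.
// Little-endian byte order is assumed for illustration.
template <unsigned Width>
inline uint32_t readNumToSkip(const uint8_t *Ptr, size_t Size) {
  assert(Size >= Width && "table too short for a NumToSkip field");
  uint32_t Value = 0;
  for (unsigned I = 0; I < Width; ++I)
    Value |= static_cast<uint32_t>(Ptr[I]) << (8 * I);
  return Value;
}

// Two overloads with distinct signatures. The wrapper type of the argument
// selects the field width at compile time, with no runtime check.
inline uint32_t peekNumToSkip(DecoderTable2Bytes Table) {
  return readNumToSkip<2>(Table.Data, Table.Size);
}
inline uint32_t peekNumToSkip(DecoderTable3Bytes Table) {
  return readNumToSkip<3>(Table.Data, Table.Size);
}
```

Passing the same raw bytes through the two wrapper types reads a different number of bytes, and a table wrapped as `DecoderTable3Bytes` can never reach the 2-byte overload by accident.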

jurahul · 2025-04-17T20:25:39Z

@topperc can you PTAL at the fix I am proposing for the expensive checks failure I saw with this change yesterday? For context, here's my assessment of why this broke it: #136019 (comment). Not asking for a full review, just some high-level eyeballing.

The other option I considered is encoding NumToSkipSizeInBytes in the first byte of each DecodeTable (as a table header): 0x80 | NumToSkipSizeInBytes, and then keeping a single decodeInstruction function that first reads this byte and then steers to a templated impl function based on size 2 or 3. But that means every call to decode pays the cost of that extra branch once. The approach in this PR avoids that and steers the code directly to the right version at compile time.
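
For comparison, a rough sketch of that header-byte alternative might look like the following. Everything here is illustrative: `makeHeader`, `readSkipField`, and `decodeWithHeader` are hypothetical names, the little-endian field layout is an assumption, and the function only returns the first NumToSkip value rather than decoding an instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Header byte as described above: 0x80 | NumToSkipSizeInBytes.
constexpr uint8_t makeHeader(unsigned NumToSkipSizeInBytes) {
  return static_cast<uint8_t>(0x80u | NumToSkipSizeInBytes);
}

// Hypothetical templated impl: reads a Width-byte field (little-endian
// byte order assumed for illustration).
template <unsigned Width>
inline uint32_t readSkipField(const uint8_t *Body) {
  uint32_t Value = 0;
  for (unsigned I = 0; I < Width; ++I)
    Value |= static_cast<uint32_t>(Body[I]) << (8 * I);
  return Value;
}

// Single entry point: inspects the header at runtime and steers to the
// matching templated impl. This branch is paid on every call.
inline uint32_t decodeWithHeader(const uint8_t *Table, size_t Size) {
  assert(Size >= 1 && (Table[0] & 0x80) && "missing table header");
  unsigned Width = Table[0] & 0x7F;
  const uint8_t *Body = Table + 1;
  if (Width == 2) {
    assert(Size >= 3 && "table too short");
    return readSkipField<2>(Body);
  }
  assert(Width == 3 && Size >= 4 && "unsupported width or short table");
  return readSkipField<3>(Body);
}
```

The runtime check on `Width` is the per-call cost mentioned above; the overload-based approach resolves the same choice during compilation instead.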

If this looks ok overall, I'll likely split out the AMDGPU refactor, which moves calls to decodeInstruction out of the header file, commit that first, and then commit this. I had to do this because the new DecoderTable2Bytes type cannot be referenced in the header, as the header does not (and should not) include the generated .inc file.

jurahul · 2025-04-17T20:27:24Z

FYI, this PR has 2 commits: I've kept the additional changes on top of yesterday's changes in a separate commit.

Add wrapper structs to discriminate between decode tables that use 2 vs 3 bytes for NumToSkip, and overload the `decodeInstruction` function based on this type. This ensures that a mix of 2 and 3 byte code works correctly. Without this, 2 different compilation units may generate 2 different versions of `decodeInstruction`, which can cause problems.

Reapply "Reapply "[LLVM][TableGen] Parameterize NumToSkip in DecoderE…

9de5ff5

…mitter" (llvm#136017)" (llvm#136068) This reverts commit 6d8bf3c.

jurahul force-pushed the decoder_num_to_skip_expensive_checks branch from 20fcc35 to b572c68 Compare April 17, 2025 20:13

jurahul changed the title ~~[LLVM][TableGen] Paramaterize NumToSkip in DecoderEmitter~~ [LLVM][TableGen] Parameterize NumToSkip in DecoderEmitter Apr 17, 2025

jurahul force-pushed the decoder_num_to_skip_expensive_checks branch from b572c68 to 83e5b4f Compare April 17, 2025 20:29

jurahul closed this Apr 17, 2025
