Skip to content

Let M68kMCCodeEmitter set Scratch size. #69898

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 26, 2023
Merged

Let M68kMCCodeEmitter set Scratch size. #69898

merged 2 commits into from
Oct 26, 2023

Conversation

erikfjon
Copy link
Contributor

The Scratch buffer passed to getBinaryCodeForInst needs to be able to
hold any value returned by getMachineOpValue or other custom encoders.
It's better to let the caller of getBinaryCodeForInst set the size of
Scratch as it's impossible for VarLenCodeEmitterGen to know what the
smallest needed size is.

VarLenCodeEmitterGen now calculates its smallest needed Scratch bit
width based on the slice operations and zero extends Scratch if it's too
small. This only guarantees that Scratch has enough bits for the
generated code not for getMachineOpValue or custom encoders.

The smallest internal APInt representation uses one uint64_t word so
there is no point in using a smaller size.

@llvmbot
Copy link
Member

llvmbot commented Oct 23, 2023

@llvm/pr-subscribers-backend-m68k

Author: Erik Jonsson (erikfjon)

Changes

The Scratch buffer passed to getBinaryCodeForInst needs to be able to
hold any value returned by getMachineOpValue or other custom encoders.
It's better to let the caller of getBinaryCodeForInst set the size of
Scratch as it's impossible for VarLenCodeEmitterGen to know what the
smallest needed size is.

VarLenCodeEmitterGen now calculates its smallest needed Scratch bit
width based on the slice operations and zero extends Scratch if it's too
small. This only guarantees that Scratch has enough bits for the
generated code not for getMachineOpValue or custom encoders.

The smallest internal APInt representation uses one uint64_t word so
there is no point in using a smaller size.


Full diff: https://github.com/llvm/llvm-project/pull/69898.diff

3 Files Affected:

  • (modified) llvm/lib/Target/M68k/MCTargetDesc/M68kMCCodeEmitter.cpp (+1-1)
  • (modified) llvm/test/TableGen/VarLenEncoder.td (+2-2)
  • (modified) llvm/utils/TableGen/VarLenCodeEmitterGen.cpp (+15-5)
diff --git a/llvm/lib/Target/M68k/MCTargetDesc/M68kMCCodeEmitter.cpp b/llvm/lib/Target/M68k/MCTargetDesc/M68kMCCodeEmitter.cpp
index 97f5d7a3dc0776f..a9ff059bc990b8e 100644
--- a/llvm/lib/Target/M68k/MCTargetDesc/M68kMCCodeEmitter.cpp
+++ b/llvm/lib/Target/M68k/MCTargetDesc/M68kMCCodeEmitter.cpp
@@ -209,7 +209,7 @@ void M68kMCCodeEmitter::encodeInstruction(const MCInst &MI,
 
   // Try using the new method first.
   APInt EncodedInst(16, 0U);
-  APInt Scratch(16, 0U);
+  APInt Scratch(64, 0U); // One APInt word is enough.
   getBinaryCodeForInstr(MI, Fixups, EncodedInst, Scratch, STI);
 
   ArrayRef<uint64_t> Data(EncodedInst.getRawData(), EncodedInst.getNumWords());
diff --git a/llvm/test/TableGen/VarLenEncoder.td b/llvm/test/TableGen/VarLenEncoder.td
index 3dd100a50fc58ad..0fabf4150b79d5d 100644
--- a/llvm/test/TableGen/VarLenEncoder.td
+++ b/llvm/test/TableGen/VarLenEncoder.td
@@ -65,7 +65,7 @@ def FOO32 : MyVarInst<MemOp32<"src">>;
 // CHECK: UINT64_C(46848), // FOO32
 
 // CHECK-LABEL: case ::FOO16: {
-// CHECK: Scratch = Scratch.zext(41);
+// CHECK: Scratch.getBitWidth() < 16
 // src.reg
 // CHECK: getMachineOpValue(MI, MI.getOperand(1), /*Pos=*/0, Scratch, Fixups, STI);
 // CHECK: Inst.insertBits(Scratch.extractBits(8, 0), 0);
@@ -83,7 +83,7 @@ def FOO32 : MyVarInst<MemOp32<"src">>;
 // CHECK: Inst.insertBits(Scratch.extractBits(2, 0), 39);
 
 // CHECK-LABEL: case ::FOO32: {
-// CHECK: Scratch = Scratch.zext(57);
+// CHECK: Scratch.getBitWidth() < 32
 // src.reg
 // CHECK: getMachineOpValue(MI, MI.getOperand(1), /*Pos=*/0, Scratch, Fixups, STI);
 // CHECK: Inst.insertBits(Scratch.extractBits(8, 0), 0);
diff --git a/llvm/utils/TableGen/VarLenCodeEmitterGen.cpp b/llvm/utils/TableGen/VarLenCodeEmitterGen.cpp
index 24f116bbeaced5f..90ce23479e50147 100644
--- a/llvm/utils/TableGen/VarLenCodeEmitterGen.cpp
+++ b/llvm/utils/TableGen/VarLenCodeEmitterGen.cpp
@@ -60,6 +60,8 @@
 #include "llvm/TableGen/Error.h"
 #include "llvm/TableGen/Record.h"
 
+#include <algorithm>
+
 using namespace llvm;
 
 namespace {
@@ -445,20 +447,17 @@ std::string VarLenCodeEmitterGen::getInstructionCases(Record *R,
 std::string VarLenCodeEmitterGen::getInstructionCaseForEncoding(
     Record *R, AltEncodingTy Mode, const VarLenInst &VLI, CodeGenTarget &Target,
     int I) {
-  size_t BitWidth = VLI.size();
 
   CodeGenInstruction &CGI = Target.getInstruction(R);
 
   std::string Case;
   raw_string_ostream SS(Case);
-  // Resize the scratch buffer.
-  if (BitWidth && !VLI.isFixedValueOnly())
-    SS.indent(I) << "Scratch = Scratch.zext(" << BitWidth << ");\n";
   // Populate based value.
   SS.indent(I) << "Inst = getInstBits" << Modes[Mode] << "(opcode);\n";
 
   // Process each segment in VLI.
   size_t Offset = 0U;
+  unsigned HighScratchAccess = 0U;
   for (const auto &ES : VLI) {
     unsigned NumBits = ES.BitWidth;
     const Init *Val = ES.Value;
@@ -497,6 +496,8 @@ std::string VarLenCodeEmitterGen::getInstructionCaseForEncoding(
                    << "Scratch.extractBits(" << utostr(NumBits) << ", "
                    << utostr(LoBit) << ")"
                    << ", " << Offset << ");\n";
+
+      HighScratchAccess = std::max(HighScratchAccess, NumBits + LoBit);
     }
     Offset += NumBits;
   }
@@ -505,7 +506,16 @@ std::string VarLenCodeEmitterGen::getInstructionCaseForEncoding(
   if (!PostEmitter.empty())
     SS.indent(I) << "Inst = " << PostEmitter << "(MI, Inst, STI);\n";
 
-  return Case;
+  // Resize the scratch buffer if it's to small.
+  std::string ScratchResizeStr = "";
+  if (VLI.size() && !VLI.isFixedValueOnly()) {
+    raw_string_ostream RS(ScratchResizeStr);
+    std::string High = utostr(HighScratchAccess);
+    RS.indent(I) << "if (Scratch.getBitWidth() < " + High + ") "
+                 << "{ Scratch = Scratch.zext(" + High + "); }\n";
+  }
+
+  return ScratchResizeStr + Case;
 }
 
 namespace llvm {

@erikfjon
Copy link
Contributor Author

I have locally verified that check-llvm-mc-m68k and check-llvm-codegen-m68k passes.

The Scratch buffer passed to getBinaryCodeForInst needs to be able to
hold any value returned by getMachineOpValue or other custom encoders.
It's better to let the caller of getBinaryCodeForInst set the size of
Scratch as it's impossible for VarLenCodeEmitterGen to know what the
smallest needed size is.

VarLenCodeEmitterGen now calculates its smallest needed Scratch bit
width based on the slice operations and zero extends Scratch if it's too
small. This only guarantees that Scratch has enough bits for the
generated code not for getMachineOpValue or custom encoders.

The smallest internal APInt representation uses one uint64_t word so
there is no point in using a smaller size.
@erikfjon
Copy link
Contributor Author

Updated commit message with missing tags [Tablegen] and [M68k].

Copy link
Member

@mshockwave mshockwave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pretty neat optimization as the original version might allocated more bits than we needed for Scratch. Thank you!
I only have a few comments.

Copy link
Member

@mshockwave mshockwave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mikaelholmen mikaelholmen merged commit fabcadf into llvm:main Oct 26, 2023
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Oct 26, 2023
The Scratch buffer passed to getBinaryCodeForInst needs to be able to
hold any value returned by getMachineOpValue or other custom encoders.
It's better to let the caller of getBinaryCodeForInst set the size of
Scratch as it's impossible for VarLenCodeEmitterGen to know what the
smallest needed size is.

VarLenCodeEmitterGen now calculates its smallest needed Scratch bit
width based on the slice operations and zero extends Scratch if it's too
small. This only guarantees that Scratch has enough bits for the
generated code not for getMachineOpValue or custom encoders.

The smallest internal APInt representation uses one uint64_t word so
there is no point in using a smaller size.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants