[llvm-mca] Add command line option -call-latency #92958

Merged: mshockwave merged 1 commit into llvm:main on May 22, 2024

Conversation

chinmaydd
Contributor

Currently, llvm-mca assumes a constant latency of 100 cycles for call instructions. This commit lets the user specify a custom value through the new -call-latency command-line option. The default latency remains 100 cycles.

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Member

llvmbot commented May 21, 2024

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-tools-llvm-mca

Author: Chinmay Deshpande (chinmaydd)

Changes

Currently, llvm-mca assumes a constant latency of 100 cycles for call instructions. This commit lets the user specify a custom value through the new -call-latency command-line option. The default latency remains 100 cycles.


Full diff: https://github.com/llvm/llvm-project/pull/92958.diff

3 Files Affected:

  • (modified) llvm/include/llvm/MCA/InstrBuilder.h (+3)
  • (modified) llvm/lib/MCA/InstrBuilder.cpp (+9-8)
  • (modified) llvm/tools/llvm-mca/llvm-mca.cpp (+6)
diff --git a/llvm/include/llvm/MCA/InstrBuilder.h b/llvm/include/llvm/MCA/InstrBuilder.h
index 3594372489148..49ba8d93ec76f 100644
--- a/llvm/include/llvm/MCA/InstrBuilder.h
+++ b/llvm/include/llvm/MCA/InstrBuilder.h
@@ -78,6 +78,7 @@ class InstrBuilder {
 
   bool FirstCallInst;
   bool FirstReturnInst;
+  unsigned CallLatency;
 
   using InstRecycleCallback = std::function<Instruction *(const InstrDesc &)>;
   InstRecycleCallback InstRecycleCB;
@@ -111,6 +112,8 @@ class InstrBuilder {
   /// or null if there isn't any.
   void setInstRecycleCallback(InstRecycleCallback CB) { InstRecycleCB = CB; }
 
+  void setCallLatency(unsigned CL) { CallLatency = CL; }
+
   Expected<std::unique_ptr<Instruction>>
   createInstruction(const MCInst &MCI, const SmallVector<Instrument *> &IVec);
 };
diff --git a/llvm/lib/MCA/InstrBuilder.cpp b/llvm/lib/MCA/InstrBuilder.cpp
index bcf065c566918..e608fa520fc19 100644
--- a/llvm/lib/MCA/InstrBuilder.cpp
+++ b/llvm/lib/MCA/InstrBuilder.cpp
@@ -33,7 +33,7 @@ InstrBuilder::InstrBuilder(const llvm::MCSubtargetInfo &sti,
                            const llvm::MCInstrAnalysis *mcia,
                            const mca::InstrumentManager &im)
     : STI(sti), MCII(mcii), MRI(mri), MCIA(mcia), IM(im), FirstCallInst(true),
-      FirstReturnInst(true) {
+      FirstReturnInst(true), CallLatency(100U) {
   const MCSchedModel &SM = STI.getSchedModel();
   ProcResourceMasks.resize(SM.getNumProcResourceKinds());
   computeProcResourceMasks(STI.getSchedModel(), ProcResourceMasks);
@@ -220,17 +220,18 @@ static void initializeUsedResources(InstrDesc &ID,
 
 static void computeMaxLatency(InstrDesc &ID, const MCInstrDesc &MCDesc,
                               const MCSchedClassDesc &SCDesc,
-                              const MCSubtargetInfo &STI) {
+                              const MCSubtargetInfo &STI,
+                              unsigned CallLatency) {
   if (MCDesc.isCall()) {
     // We cannot estimate how long this call will take.
-    // Artificially set an arbitrarily high latency (100cy).
-    ID.MaxLatency = 100U;
+    // Artificially set an arbitrarily high latency (default: 100cy).
+    ID.MaxLatency = CallLatency;
     return;
   }
 
   int Latency = MCSchedModel::computeInstrLatency(STI, SCDesc);
-  // If latency is unknown, then conservatively assume a MaxLatency of 100cy.
-  ID.MaxLatency = Latency < 0 ? 100U : static_cast<unsigned>(Latency);
+  // If latency is unknown, then conservatively assume a MaxLatency set for calls (default: 100cy).
+  ID.MaxLatency = Latency < 0 ? CallLatency : static_cast<unsigned>(Latency);
 }
 
 static Error verifyOperands(const MCInstrDesc &MCDesc, const MCInst &MCI) {
@@ -568,7 +569,7 @@ InstrBuilder::createInstrDescImpl(const MCInst &MCI,
     // We don't correctly model calls.
     WithColor::warning() << "found a call in the input assembly sequence.\n";
     WithColor::note() << "call instructions are not correctly modeled. "
-                      << "Assume a latency of 100cy.\n";
+                      << "Assume a latency of " << CallLatency << "cy.\n";
     FirstCallInst = false;
   }
 
@@ -580,7 +581,7 @@ InstrBuilder::createInstrDescImpl(const MCInst &MCI,
   }
 
   initializeUsedResources(*ID, SCDesc, STI, ProcResourceMasks);
-  computeMaxLatency(*ID, MCDesc, SCDesc, STI);
+  computeMaxLatency(*ID, MCDesc, SCDesc, STI, CallLatency);
 
   if (Error Err = verifyOperands(MCDesc, MCI))
     return std::move(Err);
diff --git a/llvm/tools/llvm-mca/llvm-mca.cpp b/llvm/tools/llvm-mca/llvm-mca.cpp
index 03d7d7944b9cd..73b704283b8a8 100644
--- a/llvm/tools/llvm-mca/llvm-mca.cpp
+++ b/llvm/tools/llvm-mca/llvm-mca.cpp
@@ -135,6 +135,11 @@ static cl::opt<unsigned>
                                "(instructions per cycle)"),
                       cl::cat(ToolOptions), cl::init(0));
 
+static cl::opt<unsigned>
+    CallLatency("call-latency", cl::Hidden,
+                cl::desc("Number of cycles to assume for a call instruction"),
+                cl::cat(ToolOptions), cl::init(100U));
+
 enum class SkipType { NONE, LACK_SCHED, PARSE_FAILURE, ANY_FAILURE };
 
 static cl::opt<enum SkipType> SkipUnsupportedInstructions(
@@ -569,6 +574,7 @@ int main(int argc, char **argv) {
 
   // Create an instruction builder.
   mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM);
+  IB.setCallLatency(CallLatency);
 
   // Create a context to control ownership of the pipeline hardware.
   mca::Context MCA(*MRI, *STI);
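
The latency selection in the diff above can be illustrated in isolation. The following is a minimal standalone sketch, not LLVM code: `SketchInstr` and `pickMaxLatency` are hypothetical names, and a negative `SchedLatency` stands in for the "unknown latency" result of `MCSchedModel::computeInstrLatency`.

```cpp
// Hypothetical stand-in for the information computeMaxLatency consults.
struct SketchInstr {
  bool IsCall;      // plays the role of MCDesc.isCall()
  int SchedLatency; // negative means the scheduling model has no latency
};

// Mirrors the patched logic: calls, and instructions whose latency is
// unknown, both fall back to the user-supplied CallLatency.
constexpr unsigned pickMaxLatency(const SketchInstr &I, unsigned CallLatency) {
  if (I.IsCall)
    return CallLatency;
  return I.SchedLatency < 0 ? CallLatency
                            : static_cast<unsigned>(I.SchedLatency);
}

// A call uses the configured value regardless of the scheduling model.
static_assert(pickMaxLatency(SketchInstr{true, 3}, 50U) == 50U, "call");
// An unknown latency also uses the configured value.
static_assert(pickMaxLatency(SketchInstr{false, -1}, 50U) == 50U, "unknown");
// A known latency is passed through unchanged.
static_assert(pickMaxLatency(SketchInstr{false, 4}, 50U) == 4U, "known");
```

Note that, as in the patch, the option affects non-call instructions with unknown latency as well as calls.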

Member

@mshockwave mshockwave left a comment

Thanks for sending the patch. Could you add a simple LIT test?

@@ -33,7 +33,7 @@ InstrBuilder::InstrBuilder(const llvm::MCSubtargetInfo &sti,
                            const llvm::MCInstrAnalysis *mcia,
                            const mca::InstrumentManager &im)
     : STI(sti), MCII(mcii), MRI(mri), MCIA(mcia), IM(im), FirstCallInst(true),
-      FirstReturnInst(true) {
+      FirstReturnInst(true), CallLatency(100U) {
Contributor

@michaelmaitland michaelmaitland May 21, 2024

Should we pass CallLatency when we construct the InstrBuilder and remove the call to setCallLatency? This would reduce the number of places where we have to specify the default value of 100.

Contributor Author

Thanks for pointing that out. I tried to address it.
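
One way to read the suggestion above is to take the latency as a constructor argument, so the class never hard-codes its own default and no setter call is needed. A minimal sketch, using a hypothetical `SketchBuilder` class rather than the real `mca::InstrBuilder`:

```cpp
// Hypothetical stand-in for mca::InstrBuilder; only the latency plumbing
// is shown.
class SketchBuilder {
  unsigned CallLatency;

public:
  // The caller supplies the value at construction time, so the only place
  // that names a default is the caller (in the patch, the tool's
  // command-line option).
  constexpr explicit SketchBuilder(unsigned CL) : CallLatency(CL) {}

  constexpr unsigned getCallLatency() const { return CallLatency; }
};

// The value passed in is the value the builder reports.
static_assert(SketchBuilder(100U).getCallLatency() == 100U, "");
static_assert(SketchBuilder(42U).getCallLatency() == 42U, "");
```

The trade-off is that every existing constructor call site has to be updated to pass the new argument.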

     return;
   }
 
   int Latency = MCSchedModel::computeInstrLatency(STI, SCDesc);
-  // If latency is unknown, then conservatively assume a MaxLatency of 100cy.
-  ID.MaxLatency = Latency < 0 ? 100U : static_cast<unsigned>(Latency);
+  // If latency is unknown, then conservatively assume a MaxLatency set for calls (default: 100cy).
Contributor

nit: I think this line may be too long according to the LLVM Coding Standards. Can you either format it accordingly or drop the default part, since it no longer defaults to 100 inside InstrBuilder? I am content with either approach. If you end up going with the latter, please also remove the (default: 100cy) above on line 228.

Contributor Author

Thanks for that. I should have checked the formatting guidelines before pushing.

Contributor

@michaelmaitland michaelmaitland left a comment

LGTM. Thanks for the patch!

@michaelmaitland
Contributor

Please let me know if you would like my help in committing.

@chinmaydd
Contributor Author

Sure, that would be much appreciated. This is my first contribution to LLVM.

Member

@mshockwave mshockwave left a comment

The CI reported build failures. Please update the InstrBuilder usage in https://github.com/llvm/llvm-project/blob/main/llvm/unittests/tools/llvm-mca/MCATestBase.cpp as well.

@mshockwave
Member

The CI reported build failures. Please update the InstrBuilder usage in https://github.com/llvm/llvm-project/blob/main/llvm/unittests/tools/llvm-mca/MCATestBase.cpp as well.

Or provide a default value to the new InstrBuilder ctor argument you added.
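
The alternative mentioned above, a default value on the new constructor argument, can be sketched as follows. `DefaultedBuilder` is a hypothetical class, and the single `int` parameter merely stands in for the arguments the real constructor already took.

```cpp
// Hypothetical stand-in for mca::InstrBuilder. The unnamed int parameter
// represents the arguments the constructor accepted before the patch.
class DefaultedBuilder {
  unsigned CallLatency;

public:
  // Because the trailing parameter has a default, call sites written before
  // the parameter existed still compile unchanged.
  constexpr explicit DefaultedBuilder(int /*ExistingArg*/, unsigned CL = 100U)
      : CallLatency(CL) {}

  constexpr unsigned getCallLatency() const { return CallLatency; }
};

// An old-style call site picks up the default.
static_assert(DefaultedBuilder(0).getCallLatency() == 100U, "");
// A new-style call site overrides it.
static_assert(DefaultedBuilder(0, 25U).getCallLatency() == 25U, "");
```

This keeps unit tests and other users building without edits, at the cost of stating the default of 100 in the class again.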

@mshockwave
Member

I think the CI is failing on the new call-latency.s test you added. There seems to be a mismatch in the number of uOps, so you might want to rerun update_mca_check.py and/or rebase in case the uOps in the scheduling model have changed.

Note that passing (pre-commit) CI is not a requirement for merging a PR in LLVM, but it is easier to address any issues early; otherwise 10+ (post-commit) buildbots might flood you with build failure emails after the merge.

@chinmaydd
Contributor Author

Ah, thanks for pointing it out Min. I'll check again.


⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff dfdf1c5fe45a82b9c578306f3d7627fd251d63f8 3d13902620ea866688b6381983df835d9a12d4de -- llvm/include/llvm/MCA/InstrBuilder.h llvm/lib/MCA/InstrBuilder.cpp llvm/tools/llvm-mca/llvm-mca.cpp llvm/unittests/tools/llvm-mca/MCATestBase.cpp llvm/unittests/tools/llvm-mca/X86/TestIncrementalMCA.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/MCA/InstrBuilder.cpp b/llvm/lib/MCA/InstrBuilder.cpp
index 4d3b9d952d..d5cbdc5de0 100644
--- a/llvm/lib/MCA/InstrBuilder.cpp
+++ b/llvm/lib/MCA/InstrBuilder.cpp
@@ -31,8 +31,7 @@ InstrBuilder::InstrBuilder(const llvm::MCSubtargetInfo &sti,
                            const llvm::MCInstrInfo &mcii,
                            const llvm::MCRegisterInfo &mri,
                            const llvm::MCInstrAnalysis *mcia,
-                           const mca::InstrumentManager &im,
-                           unsigned cl)
+                           const mca::InstrumentManager &im, unsigned cl)
     : STI(sti), MCII(mcii), MRI(mri), MCIA(mcia), IM(im), FirstCallInst(true),
       FirstReturnInst(true), CallLatency(cl) {
   const MCSchedModel &SM = STI.getSchedModel();

@@ -31,9 +31,10 @@ InstrBuilder::InstrBuilder(const llvm::MCSubtargetInfo &sti,
                            const llvm::MCInstrInfo &mcii,
                            const llvm::MCRegisterInfo &mri,
                            const llvm::MCInstrAnalysis *mcia,
-                           const mca::InstrumentManager &im)
+                           const mca::InstrumentManager &im,
+                           unsigned cl)
Collaborator

clang-format this

Currently we assume a constant latency of 100 cycles for a call
instruction. This commit allows the user to specify a custom value for
the same as a command line argument. We assume a value of 100 if none is
provided.
Member

@mshockwave mshockwave left a comment

LGTM. Thank you and congrats on your first PR!

@mshockwave mshockwave merged commit 848bef5 into llvm:main May 22, 2024
2 of 3 checks passed

@chinmaydd Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested
by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as
the builds can include changes from many authors. It is not uncommon for your
change to be included in a build that fails due to someone else's changes, or
infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself.
This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

@chinmaydd chinmaydd deleted the mca-call branch June 6, 2024 04:00
5 participants