-
Notifications
You must be signed in to change notification settings - Fork 13.9k
Make AMDGPUCombinerHelper methods const #121740
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make AMDGPUCombinerHelper methods const #121740
Conversation
@llvm/pr-subscribers-backend-amdgpu Author: Paul Bowen-Huggett (paulhuggett) ChangesThis is a follow-up to a previous PR (#119529) which eliminated several "TODO: make CombinerHelper methods const" remarks. As promised in that earlier change, this change completes the set by also making the methods of AMDGPUCombinerHelper const so that the Helper member of AMDGPUPreLegalizerCombinerImpl can be const rather than explicitly mutable. Full diff: https://github.com/llvm/llvm-project/pull/121740.diff 3 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCombinerHelper.cpp b/llvm/lib/Target/AMDGPU/AMDGPUCombinerHelper.cpp
index e5a376ab7357c1..6fa81170048999 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCombinerHelper.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCombinerHelper.cpp
@@ -190,7 +190,7 @@ static unsigned inverseMinMax(unsigned Opc) {
}
bool AMDGPUCombinerHelper::matchFoldableFneg(MachineInstr &MI,
- MachineInstr *&MatchInfo) {
+ MachineInstr *&MatchInfo) const {
Register Src = MI.getOperand(1).getReg();
MatchInfo = MRI.getVRegDef(Src);
@@ -259,7 +259,7 @@ bool AMDGPUCombinerHelper::matchFoldableFneg(MachineInstr &MI,
}
void AMDGPUCombinerHelper::applyFoldableFneg(MachineInstr &MI,
- MachineInstr *&MatchInfo) {
+ MachineInstr *&MatchInfo) const {
// Transform:
// %A = inst %Op1, ...
// %B = fneg %A
@@ -418,7 +418,7 @@ static bool isFPExtFromF16OrConst(const MachineRegisterInfo &MRI,
bool AMDGPUCombinerHelper::matchExpandPromotedF16FMed3(MachineInstr &MI,
Register Src0,
Register Src1,
- Register Src2) {
+ Register Src2) const {
assert(MI.getOpcode() == TargetOpcode::G_FPTRUNC);
Register SrcReg = MI.getOperand(1).getReg();
if (!MRI.hasOneNonDBGUse(SrcReg) || MRI.getType(SrcReg) != LLT::scalar(32))
@@ -431,7 +431,7 @@ bool AMDGPUCombinerHelper::matchExpandPromotedF16FMed3(MachineInstr &MI,
void AMDGPUCombinerHelper::applyExpandPromotedF16FMed3(MachineInstr &MI,
Register Src0,
Register Src1,
- Register Src2) {
+ Register Src2) const {
// We expect fptrunc (fpext x) to fold out, and to constant fold any constant
// sources.
Src0 = Builder.buildFPTrunc(LLT::scalar(16), Src0).getReg(0);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCombinerHelper.h b/llvm/lib/Target/AMDGPU/AMDGPUCombinerHelper.h
index 6510abe9d23218..30601126e833bf 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCombinerHelper.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCombinerHelper.h
@@ -23,13 +23,13 @@ class AMDGPUCombinerHelper : public CombinerHelper {
public:
using CombinerHelper::CombinerHelper;
- bool matchFoldableFneg(MachineInstr &MI, MachineInstr *&MatchInfo);
- void applyFoldableFneg(MachineInstr &MI, MachineInstr *&MatchInfo);
+ bool matchFoldableFneg(MachineInstr &MI, MachineInstr *&MatchInfo) const;
+ void applyFoldableFneg(MachineInstr &MI, MachineInstr *&MatchInfo) const;
bool matchExpandPromotedF16FMed3(MachineInstr &MI, Register Src0,
- Register Src1, Register Src2);
+ Register Src1, Register Src2) const;
void applyExpandPromotedF16FMed3(MachineInstr &MI, Register Src0,
- Register Src1, Register Src2);
+ Register Src1, Register Src2) const;
};
} // namespace llvm
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp b/llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp
index ff8189ce31f7f7..ac431ccc30903e 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp
@@ -45,8 +45,7 @@ class AMDGPUPreLegalizerCombinerImpl : public Combiner {
protected:
const AMDGPUPreLegalizerCombinerImplRuleConfig &RuleConfig;
const GCNSubtarget &STI;
- // TODO: Make CombinerHelper methods const.
- mutable AMDGPUCombinerHelper Helper;
+ const AMDGPUCombinerHelper Helper;
public:
AMDGPUPreLegalizerCombinerImpl(
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty obvious to me, thanks!
@paulhuggett could you update the commit message to include "[NFC]" ? Thanks ! |
This is a follow-up to a previous commit (ee7ca0d) which eliminated several "TODO: make CombinerHelper methods const" remarks. As promised in that ealier commit, this change completes the set by also making the methods of AMDGPUCombinerHelper const so that the Helper member of AMDGPUPreLegalizerCombinerImpl can be const rather than explicitly mutable.
fccef14
to
d98c994
Compare
This PR removes tests with `br i1 undef` under `llvm/tests/CodeGen/X86`. There will be more PRs in the future for this directory. Replacing `undef` with a new function argument doesn't work in some of the tests, instead, I've replaced them with `poison`.
This adds a Maintainers.md files to libclc. Recently I needed to find a libclc maintainer and I had no idea there was one listed in llvm/ instead of in libclc/.
) https://sourceware.org/gdb/current/onlinedocs/gdb.html/Skipping-Over-Functions-and-Files.html We can't emulate all the features of that command but we can skip a function by name with some extra steps. As far as I know this only matches function name unlike GDB that can filter on file and line and so on: ``` target.process.thread.step-avoid-regexp -- A regular expression defining functions step-in won't stop in. ``` It's likely it's got some corner cases that don't work, maybe inlining, but it doesn't seem worth going into it here. I don't think we can chain lldb interpreter commands, so I have shown the steps separately. I have also mentioned `thread step-in` and its alias `sif`. Which were new to me too.
Currently hfinkel is listed as the AliasAnalysis maintainer, but I believe he hasn't been actively working on LLVM in the last couple of years, so I'd like to update this information. I'd like to nominate fhahn and myself as the new maintainers for AA. While here, I'd also like to nominate alinas as the maintainer for MemorySSA.
If directive is put inside `#if __cplusplus`, it should reflect the condition, instead of being generic `expected`.
…-rt (llvm#121625) This compile time test uses inline asm with `.arch` directives to set the target feature. It is however broken and always fails, since each `asm()` construct in LLVM sets up a new AsmParser, and therefore the `.arch` directive has no effect on later `asm()` contents. To fix this we need to use a single inline `asm()` call with the entire code chunk to emit contained inside.
…ion (llvm#121559) As part llvm#112171, support for FEAT_PAuthLR's CFI instructions was added. However, the CFI instructions are emitted in the incorrect location. This leads to incorrect CodeGen being generated and possible issues when running a program. According to the ABI, the CFI instructions should be emitted before the signing instruction. This is now done properly. ABI information can be found here: https://github.com/ARM-software/abi-aa/blob/bf0e2c8047c70987165f3e05e571d7836370ade9/aadwarf64/aadwarf64.rst#44call-frame-instructions
…vm#121681) The new line types help to annotate */&/&& in simple requirements as binary operators. Fixes llvm#121675.
Print operations are often used for debugging, immediately before the compiler aborts. In such cases, it is sometimes possible that the output isn't fully produced yet. Make sure it is by explicitly flushing the output.
…rt for single reductions in ComplexDeinterleavingPass (llvm#112875)" (llvm#120441) This reverts commit 76714be, fixing the build failure that caused the revert. The failure stemmed from the complex deinterleaving pass identifying a series of add operations as a "complex to single reduction", so when it tried to transform this erroneously identified pattern, it faulted. The fix applied is to ensure that complex numbers (or patterns that match them) are used throughout, by checking if there is a deinterleave node amidst the graph.
This PR is in reference to porting LLDB on AIX. Link to discussions on llvm discourse and github: 1. https://discourse.llvm.org/t/port-lldb-to-ibm-aix/80640 2. llvm#101657 The complete changes for porting are present in this draft PR: llvm#102601 Added a HostInfoAIX file for the AIX platform. Most of the common functionalities are handled by the parent HostInfoPosix now, So we just have some basic functions implemented here.
Move the debug output that prints out the selected VF from selectVectorizationFactor -> computeBestVF. This means that the output will still be written even after removing the assert for the legacy and vplan cost models matching.
… binary operation on input (llvm#120207) Add codegen for when the input type has 4 times as many elements as the output type and the input to the partial reduction does not have a binary operation performed on it.
…1550) An instantiated templated function definition may not have a body due to parsing errors inside the templated function. When serializing, an assert is triggered inside `ASTRecordWriter::AddFunctionDefinition`. The instantiation may happen on an intermediate module. The test case was reduced from `mp-units`.
This commit add an NVIDIA-specific lowering of `cf.assert` to to `__assertfail`. Note: `getUniqueFormatGlobalName`, `getOrCreateFormatStringConstant` and `getOrDefineFunction` are moved to `GPUOpsLowering.h`, so that they can be reused.
) This is trivially additional support for the existing ALLOCATE directive, which allows an ALIGN clause. The ALLOCATE directive is currently not implemented, so this is just addding the necessary parser parts to allow the compiler to not say "Huh? I don't get this" [or "Expected OpenMP construct"] when it encounters the ALIGN clause. Some parser testing is updated and a new todo test, just in case the feature of align clause is not supported by the initial support for ALLOCATE.
This registers `sincos[f|l]` as a clang builtin and updates GCBuiltin to emit the `llvm.sincos.*` intrinsic when `-fno-math-errno` is set. Note: `llvm.sincos.*` is only emitted by `__builtin_sincos[f|l]` functions in this initial patch.
…ffle kinds to legal types (llvm#120599) (llvm#121760) Now that processShuffleMasks can correctly handle 2 src shuffles, we can completely remove the shuffle kind limits and correctly recognize the number of active subvectors per legalized shuffle - improveShuffleKindFromMask will determine the shuffle kind for each split subvector.
…m#121753) In the simplifySelectWithEquivalence fold, simplify both operands before comparing them, instead of comparing one simplified operand with a non-simplified operand. This is slightly more powerful.
…inae-friendly error (llvm#121333) LWG3929 suggests that passing incomplete types to __is_base_of and other builtins supporting [meta.unary] should result in a non-sfinaeable error. This is consistent with GCC's behavior and avoid inconsistency when using a builtin instead of a standard trait in a concept-definition. Fixes llvm#121278
…vm#120095) Summary: The documentation at https://llvm.org/docs/AMDGPUUsage.html#memory-scopes states that these 'one-as' modifiers are more specific versions of the scopes that only apply to a specific address space. This doesn't make sense for fences which have no associated address space to use, and it's a more restrictive version the normal scope. This should not tbe the default behavior, but it is currently emitted in all cases except for sequentially consistent.
IRLinker emits warning when linking two modules of different target triples. The warning is disabled if the source module is libdevice. When using libdevice embedded in LLVM library via MLIR_NVVM_EMBED_LIBDEVICE, IRLinker can no longer tell whether the source module is libdevice via module identifier. Since `nvptx64-nvidia-gpulibs` is a magic triple that identifies the libdevice module already, the libdevice filename check is redundant. This patch fixes the triple mismatch warning by just removing the filename check.
Summary: This test uses too much stack and crashes, make the buffer `static` to push it to `.bss`. This shouldn't change behavior because the tests are all run single threaded.
This is a follow-up to a previous commit (ee7ca0d) which eliminated several "TODO: make CombinerHelper methods const" remarks. As promised in that ealier commit, this change completes the set by also making the methods of AMDGPUCombinerHelper const so that the Helper member of AMDGPUPreLegalizerCombinerImpl can be const rather than explicitly mutable.
Sorry, I screwed up resolving conflicts. I'm going to abandon this PR and try again. |
(This replaces #121740. Sorry for wasting your time.) This is a follow-up to a previous commit (ee7ca0d) which eliminated several "TODO: make CombinerHelper methods const" remarks. As promised in that ealier commit, this change completes the set by also making the methods of AMDGPUCombinerHelper const so that the Helper member of AMDGPUPreLegalizerCombinerImpl can be const rather than explicitly mutable.
(This replaces llvm#121740. Sorry for wasting your time.) This is a follow-up to a previous commit (ee7ca0d) which eliminated several "TODO: make CombinerHelper methods const" remarks. As promised in that ealier commit, this change completes the set by also making the methods of AMDGPUCombinerHelper const so that the Helper member of AMDGPUPreLegalizerCombinerImpl can be const rather than explicitly mutable.
This is a follow-up to a previous PR (#119529) which eliminated several "TODO: make CombinerHelper methods const" remarks. As promised in that earlier change, this change completes the set by also making the methods of AMDGPUCombinerHelper const so that the Helper member of AMDGPUPreLegalizerCombinerImpl can be const rather than explicitly mutable.