Commit e7621f4

Reland "[NVVM] Upgrade nvvm.ptr.* intrinics to addrspace cast" (#110262)
Remove the following intrinsics, which can be trivially replaced with an `addrspacecast`:

* llvm.nvvm.ptr.gen.to.global
* llvm.nvvm.ptr.gen.to.shared
* llvm.nvvm.ptr.gen.to.constant
* llvm.nvvm.ptr.gen.to.local
* llvm.nvvm.ptr.global.to.gen
* llvm.nvvm.ptr.shared.to.gen
* llvm.nvvm.ptr.constant.to.gen
* llvm.nvvm.ptr.local.to.gen

Also, clean up the NVPTX lowering of `addrspacecast`, making it more concise.

This was reverted to avoid conflicts while reverting #107655. Re-landing unchanged.
1 parent e9c0c66 commit e7621f4
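
In IR terms, the upgrade turns a call such as `call ptr @llvm.nvvm.ptr.global.to.gen.p0.p1(ptr addrspace(1) %p)` into `addrspacecast ptr addrspace(1) %p to ptr`. The name check that selects which old intrinsics get this treatment can be sketched in standalone C++ as below. This is only an illustration, not the LLVM code: `consumeFront`, `startsWith`, and `isRemovedPtrConversion` are hypothetical helpers that imitate `StringRef::consume_front` and `StringRef::starts_with` on `std::string_view`, and the input is assumed to be the intrinsic name with the leading `llvm.nvvm.` already stripped.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical stand-in for llvm::StringRef::consume_front: if S begins with
// Prefix, drop the prefix from S and return true.
bool consumeFront(std::string_view &S, std::string_view Prefix) {
  if (S.substr(0, Prefix.size()) != Prefix)
    return false;
  S.remove_prefix(Prefix.size());
  return true;
}

// Hypothetical stand-in for llvm::StringRef::starts_with.
bool startsWith(std::string_view S, std::string_view Prefix) {
  return S.substr(0, Prefix.size()) == Prefix;
}

// Returns true if Name (assumed to have "llvm.nvvm." already removed) is one
// of the eight conversion intrinsics that this commit replaces with an
// addrspacecast. Name is taken by value, so consuming prefixes is local.
bool isRemovedPtrConversion(std::string_view Name) {
  // nvvm.ptr.gen.to.{local,shared,global,constant}
  if (consumeFront(Name, "ptr.gen.to."))
    return startsWith(Name, "local") || startsWith(Name, "shared") ||
           startsWith(Name, "global") || startsWith(Name, "constant");
  // nvvm.ptr.{local,shared,global,constant}.to.gen
  if (consumeFront(Name, "ptr."))
    return (consumeFront(Name, "local") || consumeFront(Name, "shared") ||
            consumeFront(Name, "global") || consumeFront(Name, "constant")) &&
           startsWith(Name, ".to.gen");
  return false;
}
```

Prefix matching (rather than exact comparison) matters because the intrinsics are overloaded, so real names carry type suffixes such as `.p0.p1`. The `param` variants do not match either branch, which is consistent with `int_nvvm_ptr_gen_to_param` and `int_nvvm_ptr_param_to_gen` remaining in the diff.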

9 files changed, +134 -228 lines changed

llvm/docs/NVPTXUsage.rst

Lines changed: 0 additions & 63 deletions
@@ -127,69 +127,6 @@ Example: 64-bit PTX for CUDA Driver API: ``nvptx64-nvidia-cuda``
 NVPTX Intrinsics
 ================
 
-Address Space Conversion
-------------------------
-
-'``llvm.nvvm.ptr.*.to.gen``' Intrinsics
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-Syntax:
-"""""""
-
-These are overloaded intrinsics. You can use these on any pointer types.
-
-.. code-block:: llvm
-
-    declare ptr @llvm.nvvm.ptr.global.to.gen.p0.p1(ptr addrspace(1))
-    declare ptr @llvm.nvvm.ptr.shared.to.gen.p0.p3(ptr addrspace(3))
-    declare ptr @llvm.nvvm.ptr.constant.to.gen.p0.p4(ptr addrspace(4))
-    declare ptr @llvm.nvvm.ptr.local.to.gen.p0.p5(ptr addrspace(5))
-
-Overview:
-"""""""""
-
-The '``llvm.nvvm.ptr.*.to.gen``' intrinsics convert a pointer in a non-generic
-address space to a generic address space pointer.
-
-Semantics:
-""""""""""
-
-These intrinsics modify the pointer value to be a valid generic address space
-pointer.
-
-
-'``llvm.nvvm.ptr.gen.to.*``' Intrinsics
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-Syntax:
-"""""""
-
-These are overloaded intrinsics. You can use these on any pointer types.
-
-.. code-block:: llvm
-
-    declare ptr addrspace(1) @llvm.nvvm.ptr.gen.to.global.p1.p0(ptr)
-    declare ptr addrspace(3) @llvm.nvvm.ptr.gen.to.shared.p3.p0(ptr)
-    declare ptr addrspace(4) @llvm.nvvm.ptr.gen.to.constant.p4.p0(ptr)
-    declare ptr addrspace(5) @llvm.nvvm.ptr.gen.to.local.p5.p0(ptr)
-
-Overview:
-"""""""""
-
-The '``llvm.nvvm.ptr.gen.to.*``' intrinsics convert a pointer in the generic
-address space to a pointer in the target address space. Note that these
-intrinsics are only useful if the address space of the target address space of
-the pointer is known. It is not legal to use address space conversion
-intrinsics to convert a pointer from one non-generic address space to another
-non-generic address space.
-
-Semantics:
-""""""""""
-
-These intrinsics modify the pointer value to be a valid pointer in the target
-non-generic address space.
-
-
 Reading PTX Special Registers
 -----------------------------

llvm/include/llvm/IR/IntrinsicsNVVM.td

Lines changed: 12 additions & 38 deletions
@@ -30,10 +30,18 @@
 // * llvm.nvvm.max.ui --> select(x ule y, x, y)
 // * llvm.nvvm.max.ull --> ibid.
 // * llvm.nvvm.h2f --> llvm.convert.to.fp16.f32
-// * llvm.nvvm.bitcast.f2i --> bitcast
-// * llvm.nvvm.bitcast.i2f --> ibid.
-// * llvm.nvvm.bitcast.d2ll --> ibid.
-// * llvm.nvvm.bitcast.ll2d --> ibid.
+// * llvm.nvvm.bitcast.f2i         --> bitcast
+// * llvm.nvvm.bitcast.i2f         --> ibid.
+// * llvm.nvvm.bitcast.d2ll        --> ibid.
+// * llvm.nvvm.bitcast.ll2d        --> ibid.
+// * llvm.nvvm.ptr.gen.to.global   --> addrspacecast
+// * llvm.nvvm.ptr.gen.to.shared   --> ibid.
+// * llvm.nvvm.ptr.gen.to.constant --> ibid.
+// * llvm.nvvm.ptr.gen.to.local    --> ibid.
+// * llvm.nvvm.ptr.global.to.gen   --> ibid.
+// * llvm.nvvm.ptr.shared.to.gen   --> ibid.
+// * llvm.nvvm.ptr.constant.to.gen --> ibid.
+// * llvm.nvvm.ptr.local.to.gen    --> ibid.
 
 def llvm_global_ptr_ty : LLVMQualPointerType<1>; // (global)ptr
 def llvm_shared_ptr_ty : LLVMQualPointerType<3>; // (shared)ptr
@@ -1602,40 +1610,6 @@ def int_nvvm_ldg_global_p : Intrinsic<[llvm_anyptr_ty],
                   [IntrReadMem, IntrArgMemOnly, IntrNoCallback, IntrWillReturn, NoCapture<ArgIndex<0>>],
                   "llvm.nvvm.ldg.global.p">;
 
-// Use for generic pointers
-// - These intrinsics are used to convert address spaces.
-// - The input pointer and output pointer must have the same type, except for
-//   the address-space. (This restriction is not enforced here as there is
-//   currently no way to describe it).
-// - This complements the llvm bitcast, which can be used to cast one type
-//   of pointer to another type of pointer, while the address space remains
-//   the same.
-def int_nvvm_ptr_local_to_gen: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-                 [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-                 "llvm.nvvm.ptr.local.to.gen">;
-def int_nvvm_ptr_shared_to_gen: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-                 [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-                 "llvm.nvvm.ptr.shared.to.gen">;
-def int_nvvm_ptr_global_to_gen: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-                 [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-                 "llvm.nvvm.ptr.global.to.gen">;
-def int_nvvm_ptr_constant_to_gen: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-                 [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-                 "llvm.nvvm.ptr.constant.to.gen">;
-
-def int_nvvm_ptr_gen_to_global: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-                 [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-                 "llvm.nvvm.ptr.gen.to.global">;
-def int_nvvm_ptr_gen_to_shared: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-                 [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-                 "llvm.nvvm.ptr.gen.to.shared">;
-def int_nvvm_ptr_gen_to_local: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-                 [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-                 "llvm.nvvm.ptr.gen.to.local">;
-def int_nvvm_ptr_gen_to_constant: DefaultAttrsIntrinsic<[llvm_anyptr_ty],
-                 [llvm_anyptr_ty], [IntrNoMem, IntrSpeculatable],
-                 "llvm.nvvm.ptr.gen.to.constant">;
-
 // Used in nvvm internally to help address space opt and ptx code generation
 // This is for params that are passed to kernel functions by pointer by-val.
 def int_nvvm_ptr_gen_to_param: Intrinsic<[llvm_anyptr_ty],

llvm/lib/IR/AutoUpgrade.cpp

Lines changed: 19 additions & 0 deletions
@@ -1275,6 +1275,16 @@ static bool upgradeIntrinsicFunction1(Function *F, Function *&NewFn,
       else if (Name.consume_front("rotate."))
         // nvvm.rotate.{b32,b64,right.b64}
         Expand = Name == "b32" || Name == "b64" || Name == "right.b64";
+      else if (Name.consume_front("ptr.gen.to."))
+        // nvvm.ptr.gen.to.{local,shared,global,constant}
+        Expand = Name.starts_with("local") || Name.starts_with("shared") ||
+                 Name.starts_with("global") || Name.starts_with("constant");
+      else if (Name.consume_front("ptr."))
+        // nvvm.ptr.{local,shared,global,constant}.to.gen
+        Expand =
+            (Name.consume_front("local") || Name.consume_front("shared") ||
+             Name.consume_front("global") || Name.consume_front("constant")) &&
+            Name.starts_with(".to.gen");
       else
         Expand = false;

@@ -2338,6 +2348,15 @@ static Value *upgradeNVVMIntrinsicCall(StringRef Name, CallBase *CI,
     Value *ZExtShiftAmt = Builder.CreateZExt(CI->getOperand(1), Int64Ty);
     Rep = Builder.CreateIntrinsic(Int64Ty, Intrinsic::fshr,
                                   {Arg, Arg, ZExtShiftAmt});
+  } else if ((Name.consume_front("ptr.gen.to.") &&
+              (Name.starts_with("local") || Name.starts_with("shared") ||
+               Name.starts_with("global") || Name.starts_with("constant"))) ||
+             (Name.consume_front("ptr.") &&
+              (Name.consume_front("local") || Name.consume_front("shared") ||
+               Name.consume_front("global") ||
+               Name.consume_front("constant")) &&
+              Name.starts_with(".to.gen"))) {
+    Rep = Builder.CreateAddrSpaceCast(CI->getArgOperand(0), CI->getType());
   } else {
     Intrinsic::ID IID = shouldUpgradeNVPTXBF16Intrinsic(Name);
     if (IID != Intrinsic::not_intrinsic &&

llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp

Lines changed: 28 additions & 30 deletions
@@ -1109,38 +1109,38 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) {
   AddrSpaceCastSDNode *CastN = cast<AddrSpaceCastSDNode>(N);
   unsigned SrcAddrSpace = CastN->getSrcAddressSpace();
   unsigned DstAddrSpace = CastN->getDestAddressSpace();
+  SDLoc DL(N);
   assert(SrcAddrSpace != DstAddrSpace &&
          "addrspacecast must be between different address spaces");
 
   if (DstAddrSpace == ADDRESS_SPACE_GENERIC) {
     // Specific to generic
+
+    if (TM.is64Bit() && TM.getPointerSizeInBits(SrcAddrSpace) == 32) {
+      SDValue CvtNone =
+          CurDAG->getTargetConstant(NVPTX::PTXCvtMode::NONE, DL, MVT::i32);
+      SDNode *Cvt = CurDAG->getMachineNode(NVPTX::CVT_u64_u32, DL, MVT::i64,
+                                           Src, CvtNone);
+      Src = SDValue(Cvt, 0);
+    }
+
     unsigned Opc;
     switch (SrcAddrSpace) {
     default: report_fatal_error("Bad address space in addrspacecast");
     case ADDRESS_SPACE_GLOBAL:
       Opc = TM.is64Bit() ? NVPTX::cvta_global_64 : NVPTX::cvta_global;
       break;
     case ADDRESS_SPACE_SHARED:
-      Opc = TM.is64Bit() ? (TM.getPointerSizeInBits(SrcAddrSpace) == 32
-                                ? NVPTX::cvta_shared_6432
-                                : NVPTX::cvta_shared_64)
-                         : NVPTX::cvta_shared;
+      Opc = TM.is64Bit() ? NVPTX::cvta_shared_64 : NVPTX::cvta_shared;
       break;
     case ADDRESS_SPACE_CONST:
-      Opc = TM.is64Bit() ? (TM.getPointerSizeInBits(SrcAddrSpace) == 32
-                                ? NVPTX::cvta_const_6432
-                                : NVPTX::cvta_const_64)
-                         : NVPTX::cvta_const;
+      Opc = TM.is64Bit() ? NVPTX::cvta_const_64 : NVPTX::cvta_const;
       break;
     case ADDRESS_SPACE_LOCAL:
-      Opc = TM.is64Bit() ? (TM.getPointerSizeInBits(SrcAddrSpace) == 32
-                                ? NVPTX::cvta_local_6432
-                                : NVPTX::cvta_local_64)
-                         : NVPTX::cvta_local;
+      Opc = TM.is64Bit() ? NVPTX::cvta_local_64 : NVPTX::cvta_local;
       break;
     }
-    ReplaceNode(N, CurDAG->getMachineNode(Opc, SDLoc(N), N->getValueType(0),
-                                          Src));
+    ReplaceNode(N, CurDAG->getMachineNode(Opc, DL, N->getValueType(0), Src));
     return;
   } else {
     // Generic to specific
@@ -1153,30 +1153,28 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) {
       Opc = TM.is64Bit() ? NVPTX::cvta_to_global_64 : NVPTX::cvta_to_global;
       break;
     case ADDRESS_SPACE_SHARED:
-      Opc = TM.is64Bit() ? (TM.getPointerSizeInBits(DstAddrSpace) == 32
-                                ? NVPTX::cvta_to_shared_3264
-                                : NVPTX::cvta_to_shared_64)
-                         : NVPTX::cvta_to_shared;
+      Opc = TM.is64Bit() ? NVPTX::cvta_to_shared_64 : NVPTX::cvta_to_shared;
       break;
     case ADDRESS_SPACE_CONST:
-      Opc = TM.is64Bit() ? (TM.getPointerSizeInBits(DstAddrSpace) == 32
-                                ? NVPTX::cvta_to_const_3264
-                                : NVPTX::cvta_to_const_64)
-                         : NVPTX::cvta_to_const;
+      Opc = TM.is64Bit() ? NVPTX::cvta_to_const_64 : NVPTX::cvta_to_const;
       break;
     case ADDRESS_SPACE_LOCAL:
-      Opc = TM.is64Bit() ? (TM.getPointerSizeInBits(DstAddrSpace) == 32
-                                ? NVPTX::cvta_to_local_3264
-                                : NVPTX::cvta_to_local_64)
-                         : NVPTX::cvta_to_local;
+      Opc = TM.is64Bit() ? NVPTX::cvta_to_local_64 : NVPTX::cvta_to_local;
       break;
     case ADDRESS_SPACE_PARAM:
-      Opc = TM.is64Bit() ? NVPTX::nvvm_ptr_gen_to_param_64
-                         : NVPTX::nvvm_ptr_gen_to_param;
+      Opc = TM.is64Bit() ? NVPTX::IMOV64rr : NVPTX::IMOV32rr;
       break;
     }
-    ReplaceNode(N, CurDAG->getMachineNode(Opc, SDLoc(N), N->getValueType(0),
-                                          Src));
+
+    SDNode *CVTA = CurDAG->getMachineNode(Opc, DL, N->getValueType(0), Src);
+    if (TM.is64Bit() && TM.getPointerSizeInBits(DstAddrSpace) == 32) {
+      SDValue CvtNone =
+          CurDAG->getTargetConstant(NVPTX::PTXCvtMode::NONE, DL, MVT::i32);
+      CVTA = CurDAG->getMachineNode(NVPTX::CVT_u32_u64, DL, MVT::i32,
+                                    SDValue(CVTA, 0), CvtNone);
+    }
+
+    ReplaceNode(N, CVTA);
     return;
   }
 }
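
The effect of this change on short (32-bit) pointers can be modeled in a few lines of standalone C++. The sketch below is an assumption-laden illustration, not the selector itself: `castToGeneric` and `castFromGeneric` are hypothetical helpers that return the PTX mnemonics that would be emitted, in order, for a cast into or out of the generic address space. The `cvta` strings follow the asm templates in NVPTXIntrinsics.td; the `cvt.u64.u32` and `cvt.u32.u64` spellings are assumed from the removed `_6432` and `_3264` templates. The `param` case, which becomes a plain move, is left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the "specific to generic" path.
// Space is one of "global", "shared", "const", "local".
// SrcPtrBits is the pointer width of the source address space.
std::vector<std::string> castToGeneric(const std::string &Space, bool Is64Bit,
                                       unsigned SrcPtrBits) {
  std::vector<std::string> Seq;
  // A short source pointer on a 64-bit target is widened first, so the
  // cvta itself always sees a full-width operand.
  if (Is64Bit && SrcPtrBits == 32)
    Seq.push_back("cvt.u64.u32");
  Seq.push_back("cvta." + Space + (Is64Bit ? ".u64" : ".u32"));
  return Seq;
}

// Hypothetical model of the "generic to specific" path.
// DstPtrBits is the pointer width of the destination address space.
std::vector<std::string> castFromGeneric(const std::string &Space, bool Is64Bit,
                                         unsigned DstPtrBits) {
  std::vector<std::string> Seq;
  Seq.push_back("cvta.to." + Space + (Is64Bit ? ".u64" : ".u32"));
  // A short destination pointer on a 64-bit target is narrowed afterwards.
  if (Is64Bit && DstPtrBits == 32)
    Seq.push_back("cvt.u32.u64");
  return Seq;
}
```

Splitting the width conversion out as its own node is what lets the diff drop the combined `_6432` and `_3264` instruction variants and the `useShortPtr*` predicates: each `cvta` opcode now only needs a 32-bit and a 64-bit form.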

llvm/lib/Target/NVPTX/NVPTXInstrInfo.td

Lines changed: 0 additions & 4 deletions
@@ -174,10 +174,6 @@ def hasSM90a : Predicate<"Subtarget->getFullSmVersion() == 901">;
 def hasSHFL : Predicate<"!(Subtarget->getSmVersion() >= 70"
                         "&& Subtarget->getPTXVersion() >= 64)">;
 
-def useShortPtrLocal : Predicate<"TM.is64Bit() && TM.getPointerSizeInBits(ADDRESS_SPACE_LOCAL) == 32">;
-def useShortPtrShared : Predicate<"TM.is64Bit() && TM.getPointerSizeInBits(ADDRESS_SPACE_SHARED) == 32">;
-def useShortPtrConst : Predicate<"TM.is64Bit() && TM.getPointerSizeInBits(ADDRESS_SPACE_CONST) == 32">;
-
 def useFP16Math: Predicate<"Subtarget->allowFP16Math()">;
 def hasBF16Math: Predicate<"Subtarget->hasBF16Math()">;

llvm/lib/Target/NVPTX/NVPTXIntrinsics.td

Lines changed: 30 additions & 62 deletions
@@ -2543,59 +2543,45 @@ defm INT_PTX_LDG_G_v4f32_ELE
   : VLDG_G_ELE_V4<"v4.f32 \t{{$dst1, $dst2, $dst3, $dst4}}, [$src];", Float32Regs>;
 
 
-multiclass NG_TO_G<string Str, Intrinsic Intrin, Predicate ShortPtr> {
+multiclass NG_TO_G<string Str> {
   def "" : NVPTXInst<(outs Int32Regs:$result), (ins Int32Regs:$src),
-            !strconcat("cvta.", Str, ".u32 \t$result, $src;"),
-            [(set Int32Regs:$result, (Intrin Int32Regs:$src))]>;
+            "cvta." # Str # ".u32 \t$result, $src;", []>;
   def _64 : NVPTXInst<(outs Int64Regs:$result), (ins Int64Regs:$src),
-            !strconcat("cvta.", Str, ".u64 \t$result, $src;"),
-            [(set Int64Regs:$result, (Intrin Int64Regs:$src))]>;
-  def _6432 : NVPTXInst<(outs Int64Regs:$result), (ins Int32Regs:$src),
-            "{{ .reg .b64 %tmp;\n\t"
-            #" cvt.u64.u32 \t%tmp, $src;\n\t"
-            #" cvta." # Str # ".u64 \t$result, %tmp; }}",
-            [(set Int64Regs:$result, (Intrin Int32Regs:$src))]>,
-      Requires<[ShortPtr]>;
+            "cvta." # Str # ".u64 \t$result, $src;", []>;
 }
 
-multiclass G_TO_NG<string Str, Intrinsic Intrin, Predicate ShortPtr> {
+multiclass G_TO_NG<string Str> {
   def "" : NVPTXInst<(outs Int32Regs:$result), (ins Int32Regs:$src),
-            !strconcat("cvta.to.", Str, ".u32 \t$result, $src;"),
-            [(set Int32Regs:$result, (Intrin Int32Regs:$src))]>;
+            "cvta.to." # Str # ".u32 \t$result, $src;", []>;
   def _64 : NVPTXInst<(outs Int64Regs:$result), (ins Int64Regs:$src),
-            !strconcat("cvta.to.", Str, ".u64 \t$result, $src;"),
-            [(set Int64Regs:$result, (Intrin Int64Regs:$src))]>;
-  def _3264 : NVPTXInst<(outs Int32Regs:$result), (ins Int64Regs:$src),
-            "{{ .reg .b64 %tmp;\n\t"
-            #" cvta.to." # Str # ".u64 \t%tmp, $src;\n\t"
-            #" cvt.u32.u64 \t$result, %tmp; }}",
-            [(set Int32Regs:$result, (Intrin Int64Regs:$src))]>,
-      Requires<[ShortPtr]>;
-}
-
-defm cvta_local : NG_TO_G<"local", int_nvvm_ptr_local_to_gen, useShortPtrLocal>;
-defm cvta_shared : NG_TO_G<"shared", int_nvvm_ptr_shared_to_gen, useShortPtrShared>;
-defm cvta_global : NG_TO_G<"global", int_nvvm_ptr_global_to_gen, False>;
-defm cvta_const : NG_TO_G<"const", int_nvvm_ptr_constant_to_gen, useShortPtrConst>;
-defm cvta_param : NG_TO_G<"param", int_nvvm_ptr_param_to_gen, False>;
-
-defm cvta_to_local : G_TO_NG<"local", int_nvvm_ptr_gen_to_local, useShortPtrLocal>;
-defm cvta_to_shared : G_TO_NG<"shared", int_nvvm_ptr_gen_to_shared, useShortPtrShared>;
-defm cvta_to_global : G_TO_NG<"global", int_nvvm_ptr_gen_to_global, False>;
-defm cvta_to_const : G_TO_NG<"const", int_nvvm_ptr_gen_to_constant, useShortPtrConst>;
+            "cvta.to." # Str # ".u64 \t$result, $src;", []>;
+}
+
+defm cvta_local : NG_TO_G<"local">;
+defm cvta_shared : NG_TO_G<"shared">;
+defm cvta_global : NG_TO_G<"global">;
+defm cvta_const : NG_TO_G<"const">;
+
+defm cvta_to_local : G_TO_NG<"local">;
+defm cvta_to_shared : G_TO_NG<"shared">;
+defm cvta_to_global : G_TO_NG<"global">;
+defm cvta_to_const : G_TO_NG<"const">;
+
+// nvvm.ptr.param.to.gen
+defm cvta_param : NG_TO_G<"param">;
+
+def : Pat<(int_nvvm_ptr_param_to_gen Int32Regs:$src),
+          (cvta_param Int32Regs:$src)>;
+
+def : Pat<(int_nvvm_ptr_param_to_gen Int64Regs:$src),
+          (cvta_param_64 Int64Regs:$src)>;
 
 // nvvm.ptr.gen.to.param
-def nvvm_ptr_gen_to_param : NVPTXInst<(outs Int32Regs:$result),
-                                      (ins Int32Regs:$src),
-                                      "mov.u32 \t$result, $src;",
-                                      [(set Int32Regs:$result,
-                                        (int_nvvm_ptr_gen_to_param Int32Regs:$src))]>;
-def nvvm_ptr_gen_to_param_64 : NVPTXInst<(outs Int64Regs:$result),
-                                         (ins Int64Regs:$src),
-                                         "mov.u64 \t$result, $src;",
-                                         [(set Int64Regs:$result,
-                                           (int_nvvm_ptr_gen_to_param Int64Regs:$src))]>;
+def : Pat<(int_nvvm_ptr_gen_to_param Int32Regs:$src),
+          (IMOV32rr Int32Regs:$src)>;
 
+def : Pat<(int_nvvm_ptr_gen_to_param Int64Regs:$src),
+          (IMOV64rr Int64Regs:$src)>;
 
 // nvvm.move intrinsicc
 def nvvm_move_i16 : NVPTXInst<(outs Int16Regs:$r), (ins Int16Regs:$s),
@@ -2638,24 +2624,6 @@ def nvvm_move_sym64 : NVPTXInst<(outs Int64Regs:$r), (ins imem:$s),
                              [(set Int64Regs:$r,
                                (int_nvvm_move_ptr texternalsym:$s))]>;*/
 
-
-// MoveParam %r1, param
-// ptr_local_to_gen %r2, %r1
-// ptr_gen_to_local %r3, %r2
-// ->
-// mov %r1, param
-
-// @TODO: Revisit this. There is a type
-// contradiction between iPTRAny and iPTR for the addr defs, so the move_sym
-// instructions are not currently defined. However, we can use the ptr
-// variants and the asm printer will do the right thing.
-def : Pat<(i64 (int_nvvm_ptr_gen_to_local (int_nvvm_ptr_local_to_gen
-                (MoveParam texternalsym:$src)))),
-          (nvvm_move_ptr64 texternalsym:$src)>;
-def : Pat<(i32 (int_nvvm_ptr_gen_to_local (int_nvvm_ptr_local_to_gen
-                (MoveParam texternalsym:$src)))),
-          (nvvm_move_ptr32 texternalsym:$src)>;
-
 def texsurf_handles
   : NVPTXInst<(outs Int64Regs:$result), (ins imem:$src),
               "mov.u64 \t$result, $src;", []>;
