Commit 983eeb0

[NVPTX] Set boolean contents to zero-or-one
Note that this field is a property of SelectionDAG rather than one of the ISA. The previous value wasn't incorrect, per se, but since NVPTX uses 1-bit predicate registers, either choice is possible.

The problem with using zero-or-negative-one manifested in issues during instruction selection where LLVM would fold a sign-extension of a setcc into the setcc itself, which isn't a legal operation. We could alternatively let this happen and then custom expand it back into a separate setcc + sext later. Or we could provide explicit patterns for the setp/selp sequence. However, preventing it from forming in the first place feels more appropriate to the ISA.

It does not appear to have a significant knock-on effect on codegen tests, aside from one minor change which looks harmless:

    selp.s64 %rd6, -1, 0, %p1;
    add.s64 %rd7, %rd5, %rd6;

to

    selp.u64 %rd6, 1, 0, %p1;
    sub.s64 %rd7, %rd5, %rd6;
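As a sanity check on why the changed sequence is harmless, here is a minimal C++ sketch of the two PTX forms, assuming 64-bit wrapping arithmetic. The function names are hypothetical and exist only for illustration; they are not part of LLVM or the NVPTX backend.

```cpp
#include <cassert>
#include <cstdint>

// Old form: selp.s64 mask, -1, 0, p;  add.s64 r, x, mask;
// The predicate selects an all-ones mask (-1) or 0, which is then added.
constexpr std::uint64_t borrow_via_add(std::uint64_t x, bool p) {
  const std::uint64_t mask = p ? ~std::uint64_t{0} : std::uint64_t{0};
  return x + mask;  // unsigned addition wraps modulo 2^64
}

// New form: selp.u64 bit, 1, 0, p;  sub.s64 r, x, bit;
// The predicate selects 1 or 0, which is then subtracted.
constexpr std::uint64_t borrow_via_sub(std::uint64_t x, bool p) {
  const std::uint64_t bit = p ? std::uint64_t{1} : std::uint64_t{0};
  return x - bit;  // unsigned subtraction wraps modulo 2^64
}

// Hypothetical helper: compares both forms on a few boundary values.
inline void check_borrow_forms_agree() {
  const std::uint64_t samples[] = {0, 1, 2, 0x7FFFFFFFFFFFFFFFull,
                                   0x8000000000000000ull, ~std::uint64_t{0}};
  const bool preds[] = {false, true};
  for (std::uint64_t x : samples)
    for (bool p : preds)
      assert(borrow_via_add(x, p) == borrow_via_sub(x, p));
}
```

Adding -1 and subtracting 1 are the same operation modulo 2^64, so the borrow propagated into the high half of the 128-bit subtraction is unchanged.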
1 parent 1e23a61 commit 983eeb0

3 files changed: 33 additions and 4 deletions

3 files changed

+33
-4
lines changed

llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp

Lines changed: 2 additions & 2 deletions
@@ -416,8 +416,8 @@ NVPTXTargetLowering::NVPTXTargetLowering(const NVPTXTargetMachine &TM,
   MaxStoresPerMemcpy = MaxStoresPerMemcpyOptSize = (unsigned) 0xFFFFFFFF;
   MaxStoresPerMemmove = MaxStoresPerMemmoveOptSize = (unsigned) 0xFFFFFFFF;
 
-  setBooleanContents(ZeroOrNegativeOneBooleanContent);
-  setBooleanVectorContents(ZeroOrNegativeOneBooleanContent);
+  setBooleanContents(ZeroOrOneBooleanContent);
+  setBooleanVectorContents(ZeroOrOneBooleanContent);
 
   // Jump is Expensive. Don't create extra control flow for 'and', 'or'
   // condition branches.

llvm/test/CodeGen/NVPTX/add-sub-128bit.ll

Lines changed: 2 additions & 2 deletions
@@ -23,8 +23,8 @@ define i128 @test_add(i128 %a, i128 %b) {
 define i128 @test_sub(i128 %a, i128 %b) {
 ; NOCARRY: sub.s64
 ; NOCARRY-NEXT: setp.lt.u64
-; NOCARRY-NEXT: selp.s64
-; NOCARRY-NEXT: add.s64
+; NOCARRY-NEXT: selp.u64
+; NOCARRY-NEXT: sub.s64
 ; NOCARRY-NEXT: sub.s64
 
 ; CARRY: sub.cc.s64

llvm/test/CodeGen/NVPTX/sext-setcc.ll

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_80 | FileCheck %s
+; RUN: %if ptxas-11.0 %{ llc < %s -march=nvptx64 -mcpu=sm_80 -mattr=+ptx70 | %ptxas-verify -arch=sm_80 %}
+
+define <2 x i16> @sext_setcc(ptr %p) {
+; CHECK-LABEL: sext_setcc(
+; CHECK: {
+; CHECK-NEXT: .reg .pred %p<3>;
+; CHECK-NEXT: .reg .b16 %rs<5>;
+; CHECK-NEXT: .reg .b32 %r<3>;
+; CHECK-NEXT: .reg .b64 %rd<2>;
+; CHECK-EMPTY:
+; CHECK-NEXT: // %bb.0: // %entry
+; CHECK-NEXT: ld.param.u64 %rd1, [sext_setcc_param_0];
+; CHECK-NEXT: ld.u32 %r1, [%rd1];
+; CHECK-NEXT: mov.b32 {%rs1, %rs2}, %r1;
+; CHECK-NEXT: setp.eq.s16 %p1, %rs1, 0;
+; CHECK-NEXT: setp.eq.s16 %p2, %rs2, 0;
+; CHECK-NEXT: selp.s16 %rs3, -1, 0, %p2;
+; CHECK-NEXT: selp.s16 %rs4, -1, 0, %p1;
+; CHECK-NEXT: mov.b32 %r2, {%rs4, %rs3};
+; CHECK-NEXT: st.param.b32 [func_retval0+0], %r2;
+; CHECK-NEXT: ret;
+entry:
+  %v = load <2 x i16>, ptr %p, align 4
+  %cmp = icmp eq <2 x i16> %v, zeroinitializer
+  %sext = sext <2 x i1> %cmp to <2 x i16>
+  ret <2 x i16> %sext
+}
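For readers less familiar with the IR, the following C++ sketch models what `sext_setcc` computes per lane: compare against zero, then sign-extend the 1-bit result to 16 bits. This matches the `setp.eq.s16` / `selp.s16 ..., -1, 0, ...` pairs in the expected PTX. The helper names are hypothetical and this is only a reference model, not backend code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Models `icmp eq i16 %lane, 0` followed by `sext i1 ... to i16`:
// a true comparison sign-extends to -1 (all bits set), false to 0.
inline std::int16_t sext_of_eq_zero(std::int16_t lane) {
  return lane == 0 ? std::int16_t{-1} : std::int16_t{0};
}

// Models the whole <2 x i16> function, one lane at a time.
inline std::array<std::int16_t, 2> sext_setcc_model(
    std::array<std::int16_t, 2> v) {
  std::array<std::int16_t, 2> out{};
  out[0] = sext_of_eq_zero(v[0]);
  out[1] = sext_of_eq_zero(v[1]);
  return out;
}

// Hypothetical helper: spot-checks the model on a mixed input.
inline void check_sext_setcc_model() {
  const std::array<std::int16_t, 2> r = sext_setcc_model({0, 7});
  assert(r[0] == -1);  // lane equal to zero -> all ones
  assert(r[1] == 0);   // nonzero lane -> zero
}
```

The sign-extended value is still -1 after this change; what differs is that the extension is emitted as an explicit `selp` rather than being folded into the comparison.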
