Skip to content

[clang][CodeGen] Fix in codegen for __builtin_popcountg/ctzg/clzg #90845

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 2, 2024

Conversation

bjope
Copy link
Collaborator

@bjope bjope commented May 2, 2024

Make sure that the result from the popcnt/ctlz/cttz intrinsics is unsigned casted to int, rather than casted as a signed value, when expanding the __builtin_popcountg/__builtin_ctzg/__builtin_clzg builtins.

An example would be
unsigned _BitInt(1) x = ...;
int y = __builtin_popcountg(x);
which previously was incorrectly expanded to
%1 = call i1 @llvm.ctpop.i1(i1 %0)
%cast = sext i1 %1 to i32

Since the input type is generic for those "g" versions of the builtins the intrinsic call may return a value for which the sign bit is set (that could typically for BitInt of size 1 and 2). So we need to emit a zext rather than a sext to avoid negative results.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels May 2, 2024
@llvmbot
Copy link
Member

llvmbot commented May 2, 2024

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Björn Pettersson (bjope)

Changes

Make sure that the result from the popcnt/ctlz/cttz intrinsics is unsigned casted to int, rather than casted as a signed value, when expanding the __builtin_popcountg/__builtin_ctzg/__builtin_clzg builtins.

An example would be
unsigned _BitInt(1) x = ...;
int y = __builtin_popcountg(x);
which previously was incorrectly expanded to
%1 = call i1 @llvm.ctpop.i1(i1 %0)
%cast = sext i1 %1 to i32

Since the input type is generic for those "g" versions of the builtins the intrinsic call may return a value for which the sign bit is set (that could typically for BitInt of size 1 and 2). So we need to emit a zext rather than a sext to avoid negative results.


Full diff: https://github.com/llvm/llvm-project/pull/90845.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+3-3)
  • (modified) clang/test/CodeGen/builtins.c (+10-10)
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index a370734e00d3e1..fc1dbb87ed1bf2 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3239,7 +3239,7 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
         Builder.getInt1(HasFallback || getTarget().isCLZForZeroUndef());
     Value *Result = Builder.CreateCall(F, {ArgValue, ZeroUndef});
     if (Result->getType() != ResultType)
-      Result = Builder.CreateIntCast(Result, ResultType, /*isSigned*/true,
+      Result = Builder.CreateIntCast(Result, ResultType, /*isSigned*/false,
                                      "cast");
     if (!HasFallback)
       return RValue::get(Result);
@@ -3271,7 +3271,7 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
         Builder.getInt1(HasFallback || getTarget().isCLZForZeroUndef());
     Value *Result = Builder.CreateCall(F, {ArgValue, ZeroUndef});
     if (Result->getType() != ResultType)
-      Result = Builder.CreateIntCast(Result, ResultType, /*isSigned*/true,
+      Result = Builder.CreateIntCast(Result, ResultType, /*isSigned*/false,
                                      "cast");
     if (!HasFallback)
       return RValue::get(Result);
@@ -3351,7 +3351,7 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     llvm::Type *ResultType = ConvertType(E->getType());
     Value *Result = Builder.CreateCall(F, ArgValue);
     if (Result->getType() != ResultType)
-      Result = Builder.CreateIntCast(Result, ResultType, /*isSigned*/true,
+      Result = Builder.CreateIntCast(Result, ResultType, /*isSigned*/false,
                                      "cast");
     return RValue::get(Result);
   }
diff --git a/clang/test/CodeGen/builtins.c b/clang/test/CodeGen/builtins.c
index 407e0857d22311..b41efb59e61dbc 100644
--- a/clang/test/CodeGen/builtins.c
+++ b/clang/test/CodeGen/builtins.c
@@ -949,12 +949,12 @@ void test_builtin_popcountg(unsigned char uc, unsigned short us,
   pop = __builtin_popcountg(uc);
   // CHECK: %1 = load i8, ptr %uc.addr, align 1
   // CHECK-NEXT: %2 = call i8 @llvm.ctpop.i8(i8 %1)
-  // CHECK-NEXT: %cast = sext i8 %2 to i32
+  // CHECK-NEXT: %cast = zext i8 %2 to i32
   // CHECK-NEXT: store volatile i32 %cast, ptr %pop, align 4
   pop = __builtin_popcountg(us);
   // CHECK-NEXT: %3 = load i16, ptr %us.addr, align 2
   // CHECK-NEXT: %4 = call i16 @llvm.ctpop.i16(i16 %3)
-  // CHECK-NEXT: %cast1 = sext i16 %4 to i32
+  // CHECK-NEXT: %cast1 = zext i16 %4 to i32
   // CHECK-NEXT: store volatile i32 %cast1, ptr %pop, align 4
   pop = __builtin_popcountg(ui);
   // CHECK-NEXT: %5 = load i32, ptr %ui.addr, align 4
@@ -992,12 +992,12 @@ void test_builtin_clzg(unsigned char uc, unsigned short us, unsigned int ui,
   lz = __builtin_clzg(uc);
   // CHECK: %1 = load i8, ptr %uc.addr, align 1
   // CHECK-NEXT: %2 = call i8 @llvm.ctlz.i8(i8 %1, i1 true)
-  // CHECK-NEXT: %cast = sext i8 %2 to i32
+  // CHECK-NEXT: %cast = zext i8 %2 to i32
   // CHECK-NEXT: store volatile i32 %cast, ptr %lz, align 4
   lz = __builtin_clzg(us);
   // CHECK-NEXT: %3 = load i16, ptr %us.addr, align 2
   // CHECK-NEXT: %4 = call i16 @llvm.ctlz.i16(i16 %3, i1 true)
-  // CHECK-NEXT: %cast1 = sext i16 %4 to i32
+  // CHECK-NEXT: %cast1 = zext i16 %4 to i32
   // CHECK-NEXT: store volatile i32 %cast1, ptr %lz, align 4
   lz = __builtin_clzg(ui);
   // CHECK-NEXT: %5 = load i32, ptr %ui.addr, align 4
@@ -1026,7 +1026,7 @@ void test_builtin_clzg(unsigned char uc, unsigned short us, unsigned int ui,
   lz = __builtin_clzg(uc, sc);
   // CHECK-NEXT: %15 = load i8, ptr %uc.addr, align 1
   // CHECK-NEXT: %16 = call i8 @llvm.ctlz.i8(i8 %15, i1 true)
-  // CHECK-NEXT: %cast6 = sext i8 %16 to i32
+  // CHECK-NEXT: %cast6 = zext i8 %16 to i32
   // CHECK-NEXT: %iszero = icmp eq i8 %15, 0
   // CHECK-NEXT: %17 = load i8, ptr %sc.addr, align 1
   // CHECK-NEXT: %conv = sext i8 %17 to i32
@@ -1035,7 +1035,7 @@ void test_builtin_clzg(unsigned char uc, unsigned short us, unsigned int ui,
   lz = __builtin_clzg(us, uc);
   // CHECK-NEXT: %18 = load i16, ptr %us.addr, align 2
   // CHECK-NEXT: %19 = call i16 @llvm.ctlz.i16(i16 %18, i1 true)
-  // CHECK-NEXT: %cast7 = sext i16 %19 to i32
+  // CHECK-NEXT: %cast7 = zext i16 %19 to i32
   // CHECK-NEXT: %iszero8 = icmp eq i16 %18, 0
   // CHECK-NEXT: %20 = load i8, ptr %uc.addr, align 1
   // CHECK-NEXT: %conv9 = zext i8 %20 to i32
@@ -1094,12 +1094,12 @@ void test_builtin_ctzg(unsigned char uc, unsigned short us, unsigned int ui,
   tz = __builtin_ctzg(uc);
   // CHECK: %1 = load i8, ptr %uc.addr, align 1
   // CHECK-NEXT: %2 = call i8 @llvm.cttz.i8(i8 %1, i1 true)
-  // CHECK-NEXT: %cast = sext i8 %2 to i32
+  // CHECK-NEXT: %cast = zext i8 %2 to i32
   // CHECK-NEXT: store volatile i32 %cast, ptr %tz, align 4
   tz = __builtin_ctzg(us);
   // CHECK-NEXT: %3 = load i16, ptr %us.addr, align 2
   // CHECK-NEXT: %4 = call i16 @llvm.cttz.i16(i16 %3, i1 true)
-  // CHECK-NEXT: %cast1 = sext i16 %4 to i32
+  // CHECK-NEXT: %cast1 = zext i16 %4 to i32
   // CHECK-NEXT: store volatile i32 %cast1, ptr %tz, align 4
   tz = __builtin_ctzg(ui);
   // CHECK-NEXT: %5 = load i32, ptr %ui.addr, align 4
@@ -1128,7 +1128,7 @@ void test_builtin_ctzg(unsigned char uc, unsigned short us, unsigned int ui,
   tz = __builtin_ctzg(uc, sc);
   // CHECK-NEXT: %15 = load i8, ptr %uc.addr, align 1
   // CHECK-NEXT: %16 = call i8 @llvm.cttz.i8(i8 %15, i1 true)
-  // CHECK-NEXT: %cast6 = sext i8 %16 to i32
+  // CHECK-NEXT: %cast6 = zext i8 %16 to i32
   // CHECK-NEXT: %iszero = icmp eq i8 %15, 0
   // CHECK-NEXT: %17 = load i8, ptr %sc.addr, align 1
   // CHECK-NEXT: %conv = sext i8 %17 to i32
@@ -1137,7 +1137,7 @@ void test_builtin_ctzg(unsigned char uc, unsigned short us, unsigned int ui,
   tz = __builtin_ctzg(us, uc);
   // CHECK-NEXT: %18 = load i16, ptr %us.addr, align 2
   // CHECK-NEXT: %19 = call i16 @llvm.cttz.i16(i16 %18, i1 true)
-  // CHECK-NEXT: %cast7 = sext i16 %19 to i32
+  // CHECK-NEXT: %cast7 = zext i16 %19 to i32
   // CHECK-NEXT: %iszero8 = icmp eq i16 %18, 0
   // CHECK-NEXT: %20 = load i8, ptr %uc.addr, align 1
   // CHECK-NEXT: %conv9 = zext i8 %20 to i32

Copy link

github-actions bot commented May 2, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@efriedma-quic
Copy link
Collaborator

Please add a testcase for i1/i2 (which are the only cases where this actually matters).

@bjope
Copy link
Collaborator Author

bjope commented May 2, 2024

Please add a testcase for i1/i2 (which are the only cases where this actually matters).

I added tests in a new file (builtins-bitint.c) to show that we get expected results for a target that isn't treating a zero input to ctzg/clzg as poison.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix clang-format warnings. Otherwise LGTM

Make sure that the result from the popcnt/ctlz/cttz intrinsics is
unsigned casted to int, rather than casted as a signed value, when
expanding the __builtin_popcountg/__builtin_ctzg/__builtin_clzg
builtins.

An example would be
  unsigned _BitInt(1) x = ...;
  int y = __builtin_popcountg(x);
which previously was incorrectly expanded to
  %1 = call i1 @llvm.ctpop.i1(i1 %0)
  %cast = sext i1 %1 to i32

Since the input type is generic for those "g" versions of the builtins
the intrinsic call may return a value for which the sign bit is set
(that could typically for BitInt of size 1 and 2). So we need to
emit a zext rather than a sext to avoid negative results.
@bjope bjope merged commit 7298ae3 into llvm:main May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c23 clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category miscompilation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants