[clang] Implement __builtin_{clzg,ctzg} #83431

Merged: 3 commits into llvm:main, Mar 21, 2024

Conversation

overmighty (Member) commented Feb 29, 2024:

Fixes #83075, fixes #83076.

@llvmbot added the clang, clang:frontend, and clang:codegen labels on Feb 29, 2024
llvmbot (Member) commented Feb 29, 2024:

@llvm/pr-subscribers-clang-codegen

Author: OverMighty (overmighty)

Changes

Fixes #83075, #83076.


Patch is 26.95 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/83431.diff

9 Files Affected:

  • (modified) clang/docs/LanguageExtensions.rst (+41)
  • (modified) clang/include/clang/Basic/Builtins.td (+12)
  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+8-7)
  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+34-6)
  • (modified) clang/lib/Sema/SemaChecking.cpp (+53)
  • (modified) clang/test/CodeGen/builtins.c (+204)
  • (modified) clang/test/CodeGen/ubsan-builtin-checks.c (+6)
  • (removed) clang/test/Sema/builtin-popcountg.c (-23)
  • (added) clang/test/Sema/count-builtins.c (+87)
diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index bcd69198eafdbe..3d73d772f698ba 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -3504,6 +3504,47 @@ argument can be of any unsigned integer type.
 ``__builtin_popcount{,l,ll}`` builtins, with support for other integer types,
 such as ``unsigned __int128`` and C23 ``unsigned _BitInt(N)``.
 
+``__builtin_clzg`` and ``__builtin_ctzg``
+-----------------------------------------
+
+``__builtin_clzg`` (respectively ``__builtin_ctzg``) returns the number of
+leading (respectively trailing) 0 bits in the first argument. The first argument
+can be of any unsigned integer type.
+
+If the first argument is 0 and an optional second argument of ``int`` type is
+provided, then the second argument is returned. If the first argument is 0, but
+only one argument is provided, then the returned value is undefined.
+
+**Syntax**:
+
+.. code-block:: c++
+
+  int __builtin_clzg(type x[, int fallback])
+  int __builtin_ctzg(type x[, int fallback])
+
+**Examples**:
+
+.. code-block:: c++
+
+  unsigned int x = 1;
+  int x_lz = __builtin_clzg(x);
+  int x_tz = __builtin_ctzg(x);
+
+  unsigned long y = 2;
+  int y_lz = __builtin_clzg(y);
+  int y_tz = __builtin_ctzg(y);
+
+  unsigned _BitInt(128) z = 4;
+  int z_lz = __builtin_clzg(z);
+  int z_tz = __builtin_ctzg(z);
+
+**Description**:
+
+``__builtin_clzg`` (respectively ``__builtin_ctzg``) is meant to be a
+type-generic alternative to the ``__builtin_clz{,l,ll}`` (respectively
+``__builtin_ctz{,l,ll}``) builtins, with support for other integer types, such
+as ``unsigned __int128`` and C23 ``unsigned _BitInt(N)``.
+
 Multiprecision Arithmetic Builtins
 ----------------------------------
 
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 2fbc56d49a59a1..ec5a8819ed4057 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -662,6 +662,12 @@ def Clz : Builtin, BitShort_Int_Long_LongLongTemplate {
 
 // FIXME: Add int clzimax(uintmax_t)
 
+def Clzg : Builtin {
+  let Spellings = ["__builtin_clzg"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking];
+  let Prototype = "int(...)";
+}
+
 def Ctz : Builtin, BitShort_Int_Long_LongLongTemplate {
   let Spellings = ["__builtin_ctz"];
   let Attributes = [NoThrow, Const, Constexpr];
@@ -670,6 +676,12 @@ def Ctz : Builtin, BitShort_Int_Long_LongLongTemplate {
 
 // FIXME: Add int ctzimax(uintmax_t)
 
+def Ctzg : Builtin {
+  let Spellings = ["__builtin_ctzg"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking];
+  let Prototype = "int(...)";
+}
+
 def FFS : Builtin, BitInt_Long_LongLongTemplate {
   let Spellings = ["__builtin_ffs"];
   let Attributes = [FunctionWithBuiltinPrefix, NoThrow, Const, Constexpr];
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index f726805dc02bd9..d2a18b7bb9cfdd 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -11978,13 +11978,14 @@ def err_builtin_launder_invalid_arg : Error<
   "'__builtin_launder' is not allowed">;
 
 def err_builtin_invalid_arg_type: Error <
-  "%ordinal0 argument must be a "
-  "%select{vector, integer or floating point type|matrix|"
-  "pointer to a valid matrix element type|"
-  "signed integer or floating point type|vector type|"
-  "floating point type|"
-  "vector of integers|"
-  "type of unsigned integer}1 (was %2)">;
+  "%ordinal0 argument must be "
+  "%select{a vector, integer or floating point type|a matrix|"
+  "a pointer to a valid matrix element type|"
+  "a signed integer or floating point type|a vector type|"
+  "a floating point type|"
+  "a vector of integers|"
+  "an unsigned integer|"
+  "an 'int'}1 (was %2)">;
 
 def err_builtin_matrix_disabled: Error<
   "matrix types extension is disabled. Pass -fenable-matrix to enable it">;
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 2d16e7cdc06053..bd7cf9ae79f690 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3128,8 +3128,14 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
   case Builtin::BI__builtin_ctzs:
   case Builtin::BI__builtin_ctz:
   case Builtin::BI__builtin_ctzl:
-  case Builtin::BI__builtin_ctzll: {
-    Value *ArgValue = EmitCheckedArgForBuiltin(E->getArg(0), BCK_CTZPassedZero);
+  case Builtin::BI__builtin_ctzll:
+  case Builtin::BI__builtin_ctzg: {
+    bool HasFallback = BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_ctzg &&
+                       E->getNumArgs() > 1;
+
+    Value *ArgValue =
+        HasFallback ? EmitScalarExpr(E->getArg(0))
+                    : EmitCheckedArgForBuiltin(E->getArg(0), BCK_CTZPassedZero);
 
     llvm::Type *ArgType = ArgValue->getType();
     Function *F = CGM.getIntrinsic(Intrinsic::cttz, ArgType);
@@ -3140,13 +3146,27 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     if (Result->getType() != ResultType)
       Result = Builder.CreateIntCast(Result, ResultType, /*isSigned*/true,
                                      "cast");
-    return RValue::get(Result);
+    if (!HasFallback)
+      return RValue::get(Result);
+
+    Value *Zero = Constant::getNullValue(ArgType);
+    Value *IsZero = Builder.CreateICmpEQ(ArgValue, Zero, "iszero");
+    Value *FallbackValue = EmitScalarExpr(E->getArg(1));
+    Value *ResultOrFallback =
+        Builder.CreateSelect(IsZero, FallbackValue, Result, "ctzg");
+    return RValue::get(ResultOrFallback);
   }
   case Builtin::BI__builtin_clzs:
   case Builtin::BI__builtin_clz:
   case Builtin::BI__builtin_clzl:
-  case Builtin::BI__builtin_clzll: {
-    Value *ArgValue = EmitCheckedArgForBuiltin(E->getArg(0), BCK_CLZPassedZero);
+  case Builtin::BI__builtin_clzll:
+  case Builtin::BI__builtin_clzg: {
+    bool HasFallback = BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_clzg &&
+                       E->getNumArgs() > 1;
+
+    Value *ArgValue =
+        HasFallback ? EmitScalarExpr(E->getArg(0))
+                    : EmitCheckedArgForBuiltin(E->getArg(0), BCK_CLZPassedZero);
 
     llvm::Type *ArgType = ArgValue->getType();
     Function *F = CGM.getIntrinsic(Intrinsic::ctlz, ArgType);
@@ -3157,7 +3177,15 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     if (Result->getType() != ResultType)
       Result = Builder.CreateIntCast(Result, ResultType, /*isSigned*/true,
                                      "cast");
-    return RValue::get(Result);
+    if (!HasFallback)
+      return RValue::get(Result);
+
+    Value *Zero = Constant::getNullValue(ArgType);
+    Value *IsZero = Builder.CreateICmpEQ(ArgValue, Zero, "iszero");
+    Value *FallbackValue = EmitScalarExpr(E->getArg(1));
+    Value *ResultOrFallback =
+        Builder.CreateSelect(IsZero, FallbackValue, Result, "clzg");
+    return RValue::get(ResultOrFallback);
   }
   case Builtin::BI__builtin_ffs:
   case Builtin::BI__builtin_ffsl:
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index 979b63884359fc..d0352d5d007a7a 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -2212,6 +2212,54 @@ static bool SemaBuiltinPopcountg(Sema &S, CallExpr *TheCall) {
   return false;
 }
 
+/// Checks that __builtin_{clzg,ctzg} was called with a first argument, which is
+/// an unsigned integer, and an optional second argument, which is promoted to
+/// an 'int'.
+static bool SemaBuiltinCountZeroBitsGeneric(Sema &S, CallExpr *TheCall) {
+  if (checkArgCountRange(S, TheCall, 1, 2))
+    return true;
+
+  ExprResult Arg0Res = S.DefaultLvalueConversion(TheCall->getArg(0));
+  if (Arg0Res.isInvalid())
+    return true;
+
+  Expr *Arg0 = Arg0Res.get();
+  TheCall->setArg(0, Arg0);
+
+  QualType Arg0Ty = Arg0->getType();
+
+  if (!Arg0Ty->isUnsignedIntegerType()) {
+    S.Diag(Arg0->getBeginLoc(), diag::err_builtin_invalid_arg_type)
+        << 1 << /*unsigned integer ty*/ 7 << Arg0Ty;
+    return true;
+  }
+
+  if (TheCall->getNumArgs() > 1) {
+    ExprResult Arg1Res = S.DefaultLvalueConversion(TheCall->getArg(1));
+    if (Arg1Res.isInvalid())
+      return true;
+
+    Expr *Arg1 = Arg1Res.get();
+    TheCall->setArg(1, Arg1);
+
+    QualType Arg1Ty = Arg1->getType();
+
+    if (S.Context.isPromotableIntegerType(Arg1Ty)) {
+      Arg1Ty = S.Context.getPromotedIntegerType(Arg1Ty);
+      Arg1 = S.ImpCastExprToType(Arg1, Arg1Ty, CK_IntegralCast).get();
+      TheCall->setArg(1, Arg1);
+    }
+
+    if (!Arg1Ty->isSpecificBuiltinType(BuiltinType::Int)) {
+      S.Diag(Arg1->getBeginLoc(), diag::err_builtin_invalid_arg_type)
+          << 2 << /*'int' ty*/ 8 << Arg1Ty;
+      return true;
+    }
+  }
+
+  return false;
+}
+
 ExprResult
 Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
                                CallExpr *TheCall) {
@@ -2988,6 +3036,11 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
     if (SemaBuiltinPopcountg(*this, TheCall))
       return ExprError();
     break;
+  case Builtin::BI__builtin_clzg:
+  case Builtin::BI__builtin_ctzg:
+    if (SemaBuiltinCountZeroBitsGeneric(*this, TheCall))
+      return ExprError();
+    break;
   }
 
   if (getLangOpts().HLSL && CheckHLSLBuiltinFunctionCall(BuiltinID, TheCall))
diff --git a/clang/test/CodeGen/builtins.c b/clang/test/CodeGen/builtins.c
index 4f9641d357b7ba..407e0857d22311 100644
--- a/clang/test/CodeGen/builtins.c
+++ b/clang/test/CodeGen/builtins.c
@@ -983,4 +983,208 @@ void test_builtin_popcountg(unsigned char uc, unsigned short us,
   // CHECK-NEXT: ret void
 }
 
+// CHECK-LABEL: define{{.*}} void @test_builtin_clzg
+void test_builtin_clzg(unsigned char uc, unsigned short us, unsigned int ui,
+                       unsigned long ul, unsigned long long ull,
+                       unsigned __int128 ui128, unsigned _BitInt(128) ubi128,
+                       signed char sc, short s, int i) {
+  volatile int lz;
+  lz = __builtin_clzg(uc);
+  // CHECK: %1 = load i8, ptr %uc.addr, align 1
+  // CHECK-NEXT: %2 = call i8 @llvm.ctlz.i8(i8 %1, i1 true)
+  // CHECK-NEXT: %cast = sext i8 %2 to i32
+  // CHECK-NEXT: store volatile i32 %cast, ptr %lz, align 4
+  lz = __builtin_clzg(us);
+  // CHECK-NEXT: %3 = load i16, ptr %us.addr, align 2
+  // CHECK-NEXT: %4 = call i16 @llvm.ctlz.i16(i16 %3, i1 true)
+  // CHECK-NEXT: %cast1 = sext i16 %4 to i32
+  // CHECK-NEXT: store volatile i32 %cast1, ptr %lz, align 4
+  lz = __builtin_clzg(ui);
+  // CHECK-NEXT: %5 = load i32, ptr %ui.addr, align 4
+  // CHECK-NEXT: %6 = call i32 @llvm.ctlz.i32(i32 %5, i1 true)
+  // CHECK-NEXT: store volatile i32 %6, ptr %lz, align 4
+  lz = __builtin_clzg(ul);
+  // CHECK-NEXT: %7 = load i64, ptr %ul.addr, align 8
+  // CHECK-NEXT: %8 = call i64 @llvm.ctlz.i64(i64 %7, i1 true)
+  // CHECK-NEXT: %cast2 = trunc i64 %8 to i32
+  // CHECK-NEXT: store volatile i32 %cast2, ptr %lz, align 4
+  lz = __builtin_clzg(ull);
+  // CHECK-NEXT: %9 = load i64, ptr %ull.addr, align 8
+  // CHECK-NEXT: %10 = call i64 @llvm.ctlz.i64(i64 %9, i1 true)
+  // CHECK-NEXT: %cast3 = trunc i64 %10 to i32
+  // CHECK-NEXT: store volatile i32 %cast3, ptr %lz, align 4
+  lz = __builtin_clzg(ui128);
+  // CHECK-NEXT: %11 = load i128, ptr %ui128.addr, align 16
+  // CHECK-NEXT: %12 = call i128 @llvm.ctlz.i128(i128 %11, i1 true)
+  // CHECK-NEXT: %cast4 = trunc i128 %12 to i32
+  // CHECK-NEXT: store volatile i32 %cast4, ptr %lz, align 4
+  lz = __builtin_clzg(ubi128);
+  // CHECK-NEXT: %13 = load i128, ptr %ubi128.addr, align 8
+  // CHECK-NEXT: %14 = call i128 @llvm.ctlz.i128(i128 %13, i1 true)
+  // CHECK-NEXT: %cast5 = trunc i128 %14 to i32
+  // CHECK-NEXT: store volatile i32 %cast5, ptr %lz, align 4
+  lz = __builtin_clzg(uc, sc);
+  // CHECK-NEXT: %15 = load i8, ptr %uc.addr, align 1
+  // CHECK-NEXT: %16 = call i8 @llvm.ctlz.i8(i8 %15, i1 true)
+  // CHECK-NEXT: %cast6 = sext i8 %16 to i32
+  // CHECK-NEXT: %iszero = icmp eq i8 %15, 0
+  // CHECK-NEXT: %17 = load i8, ptr %sc.addr, align 1
+  // CHECK-NEXT: %conv = sext i8 %17 to i32
+  // CHECK-NEXT: %clzg = select i1 %iszero, i32 %conv, i32 %cast6
+  // CHECK-NEXT: store volatile i32 %clzg, ptr %lz, align 4
+  lz = __builtin_clzg(us, uc);
+  // CHECK-NEXT: %18 = load i16, ptr %us.addr, align 2
+  // CHECK-NEXT: %19 = call i16 @llvm.ctlz.i16(i16 %18, i1 true)
+  // CHECK-NEXT: %cast7 = sext i16 %19 to i32
+  // CHECK-NEXT: %iszero8 = icmp eq i16 %18, 0
+  // CHECK-NEXT: %20 = load i8, ptr %uc.addr, align 1
+  // CHECK-NEXT: %conv9 = zext i8 %20 to i32
+  // CHECK-NEXT: %clzg10 = select i1 %iszero8, i32 %conv9, i32 %cast7
+  // CHECK-NEXT: store volatile i32 %clzg10, ptr %lz, align 4
+  lz = __builtin_clzg(ui, s);
+  // CHECK-NEXT: %21 = load i32, ptr %ui.addr, align 4
+  // CHECK-NEXT: %22 = call i32 @llvm.ctlz.i32(i32 %21, i1 true)
+  // CHECK-NEXT: %iszero11 = icmp eq i32 %21, 0
+  // CHECK-NEXT: %23 = load i16, ptr %s.addr, align 2
+  // CHECK-NEXT: %conv12 = sext i16 %23 to i32
+  // CHECK-NEXT: %clzg13 = select i1 %iszero11, i32 %conv12, i32 %22
+  // CHECK-NEXT: store volatile i32 %clzg13, ptr %lz, align 4
+  lz = __builtin_clzg(ul, us);
+  // CHECK-NEXT: %24 = load i64, ptr %ul.addr, align 8
+  // CHECK-NEXT: %25 = call i64 @llvm.ctlz.i64(i64 %24, i1 true)
+  // CHECK-NEXT: %cast14 = trunc i64 %25 to i32
+  // CHECK-NEXT: %iszero15 = icmp eq i64 %24, 0
+  // CHECK-NEXT: %26 = load i16, ptr %us.addr, align 2
+  // CHECK-NEXT: %conv16 = zext i16 %26 to i32
+  // CHECK-NEXT: %clzg17 = select i1 %iszero15, i32 %conv16, i32 %cast14
+  // CHECK-NEXT: store volatile i32 %clzg17, ptr %lz, align 4
+  lz = __builtin_clzg(ull, i);
+  // CHECK-NEXT: %27 = load i64, ptr %ull.addr, align 8
+  // CHECK-NEXT: %28 = call i64 @llvm.ctlz.i64(i64 %27, i1 true)
+  // CHECK-NEXT: %cast18 = trunc i64 %28 to i32
+  // CHECK-NEXT: %iszero19 = icmp eq i64 %27, 0
+  // CHECK-NEXT: %29 = load i32, ptr %i.addr, align 4
+  // CHECK-NEXT: %clzg20 = select i1 %iszero19, i32 %29, i32 %cast18
+  // CHECK-NEXT: store volatile i32 %clzg20, ptr %lz, align 4
+  lz = __builtin_clzg(ui128, i);
+  // CHECK-NEXT: %30 = load i128, ptr %ui128.addr, align 16
+  // CHECK-NEXT: %31 = call i128 @llvm.ctlz.i128(i128 %30, i1 true)
+  // CHECK-NEXT: %cast21 = trunc i128 %31 to i32
+  // CHECK-NEXT: %iszero22 = icmp eq i128 %30, 0
+  // CHECK-NEXT: %32 = load i32, ptr %i.addr, align 4
+  // CHECK-NEXT: %clzg23 = select i1 %iszero22, i32 %32, i32 %cast21
+  // CHECK-NEXT: store volatile i32 %clzg23, ptr %lz, align 4
+  lz = __builtin_clzg(ubi128, i);
+   // CHECK-NEXT: %33 = load i128, ptr %ubi128.addr, align 8
+  // CHECK-NEXT: %34 = call i128 @llvm.ctlz.i128(i128 %33, i1 true)
+  // CHECK-NEXT: %cast24 = trunc i128 %34 to i32
+  // CHECK-NEXT: %iszero25 = icmp eq i128 %33, 0
+  // CHECK-NEXT: %35 = load i32, ptr %i.addr, align 4
+  // CHECK-NEXT: %clzg26 = select i1 %iszero25, i32 %35, i32 %cast24
+  // CHECK-NEXT: store volatile i32 %clzg26, ptr %lz, align 4
+  // CHECK-NEXT: ret void
+}
+
+// CHECK-LABEL: define{{.*}} void @test_builtin_ctzg
+void test_builtin_ctzg(unsigned char uc, unsigned short us, unsigned int ui,
+                       unsigned long ul, unsigned long long ull,
+                       unsigned __int128 ui128, unsigned _BitInt(128) ubi128,
+                       signed char sc, short s, int i) {
+  volatile int tz;
+  tz = __builtin_ctzg(uc);
+  // CHECK: %1 = load i8, ptr %uc.addr, align 1
+  // CHECK-NEXT: %2 = call i8 @llvm.cttz.i8(i8 %1, i1 true)
+  // CHECK-NEXT: %cast = sext i8 %2 to i32
+  // CHECK-NEXT: store volatile i32 %cast, ptr %tz, align 4
+  tz = __builtin_ctzg(us);
+  // CHECK-NEXT: %3 = load i16, ptr %us.addr, align 2
+  // CHECK-NEXT: %4 = call i16 @llvm.cttz.i16(i16 %3, i1 true)
+  // CHECK-NEXT: %cast1 = sext i16 %4 to i32
+  // CHECK-NEXT: store volatile i32 %cast1, ptr %tz, align 4
+  tz = __builtin_ctzg(ui);
+  // CHECK-NEXT: %5 = load i32, ptr %ui.addr, align 4
+  // CHECK-NEXT: %6 = call i32 @llvm.cttz.i32(i32 %5, i1 true)
+  // CHECK-NEXT: store volatile i32 %6, ptr %tz, align 4
+  tz = __builtin_ctzg(ul);
+  // CHECK-NEXT: %7 = load i64, ptr %ul.addr, align 8
+  // CHECK-NEXT: %8 = call i64 @llvm.cttz.i64(i64 %7, i1 true)
+  // CHECK-NEXT: %cast2 = trunc i64 %8 to i32
+  // CHECK-NEXT: store volatile i32 %cast2, ptr %tz, align 4
+  tz = __builtin_ctzg(ull);
+  // CHECK-NEXT: %9 = load i64, ptr %ull.addr, align 8
+  // CHECK-NEXT: %10 = call i64 @llvm.cttz.i64(i64 %9, i1 true)
+  // CHECK-NEXT: %cast3 = trunc i64 %10 to i32
+  // CHECK-NEXT: store volatile i32 %cast3, ptr %tz, align 4
+  tz = __builtin_ctzg(ui128);
+  // CHECK-NEXT: %11 = load i128, ptr %ui128.addr, align 16
+  // CHECK-NEXT: %12 = call i128 @llvm.cttz.i128(i128 %11, i1 true)
+  // CHECK-NEXT: %cast4 = trunc i128 %12 to i32
+  // CHECK-NEXT: store volatile i32 %cast4, ptr %tz, align 4
+  tz = __builtin_ctzg(ubi128);
+  // CHECK-NEXT: %13 = load i128, ptr %ubi128.addr, align 8
+  // CHECK-NEXT: %14 = call i128 @llvm.cttz.i128(i128 %13, i1 true)
+  // CHECK-NEXT: %cast5 = trunc i128 %14 to i32
+  // CHECK-NEXT: store volatile i32 %cast5, ptr %tz, align 4
+  tz = __builtin_ctzg(uc, sc);
+  // CHECK-NEXT: %15 = load i8, ptr %uc.addr, align 1
+  // CHECK-NEXT: %16 = call i8 @llvm.cttz.i8(i8 %15, i1 true)
+  // CHECK-NEXT: %cast6 = sext i8 %16 to i32
+  // CHECK-NEXT: %iszero = icmp eq i8 %15, 0
+  // CHECK-NEXT: %17 = load i8, ptr %sc.addr, align 1
+  // CHECK-NEXT: %conv = sext i8 %17 to i32
+  // CHECK-NEXT: %ctzg = select i1 %iszero, i32 %conv, i32 %cast6
+  // CHECK-NEXT: store volatile i32 %ctzg, ptr %tz, align 4
+  tz = __builtin_ctzg(us, uc);
+  // CHECK-NEXT: %18 = load i16, ptr %us.addr, align 2
+  // CHECK-NEXT: %19 = call i16 @llvm.cttz.i16(i16 %18, i1 true)
+  // CHECK-NEXT: %cast7 = sext i16 %19 to i32
+  // CHECK-NEXT: %iszero8 = icmp eq i16 %18, 0
+  // CHECK-NEXT: %20 = load i8, ptr %uc.addr, align 1
+  // CHECK-NEXT: %conv9 = zext i8 %20 to i32
+  // CHECK-NEXT: %ctzg10 = select i1 %iszero8, i32 %conv9, i32 %cast7
+  // CHECK-NEXT: store volatile i32 %ctzg10, ptr %tz, align 4
+  tz = __builtin_ctzg(ui, s);
+  // CHECK-NEXT: %21 = load i32, ptr %ui.addr, align 4
+  // CHECK-NEXT: %22 = call i32 @llvm.cttz.i32(i32 %21, i1 true)
+  // CHECK-NEXT: %iszero11 = icmp eq i32 %21, 0
+  // CHECK-NEXT: %23 = load i16, ptr %s.addr, align 2
+  // CHECK-NEXT: %conv12 = sext i16 %23 to i32
+  // CHECK-NEXT: %ctzg13 = select i1 %iszero11, i32 %conv12, i32 %22
+  // CHECK-NEXT: store volatile i32 %ctzg13, ptr %tz, align 4
+  tz = __builtin_ctzg(ul, us);
+  // CHECK-NEXT: %24 = load i64, ptr %ul.addr, align 8
+  // CHECK-NEXT: %25 = call i64 @llvm.cttz.i64(i64 %24, i1 true)
+  // CHECK-NEXT: %cast14 = trunc i64 %25 to i32
+  // CHECK-NEXT: %iszero15 = icmp eq i64 %24, 0
+  // CHECK-NEXT: %26 = load i16, ptr %us.addr, align 2
+  // CHECK-NEXT: %conv16 = zext i16 %26 to i32
+  // CHECK-NEXT: %ctzg17 = select i1 %iszero15, i32 %conv16, i32 %cast14
+  // CHECK-NEXT: store volatile i32 %ctzg17, ptr %tz, align 4
+  tz = __builtin_ctzg(ull, i);
+  // CHECK-NEXT: %27 = load i64, ptr %ull.addr, align 8
+  // CHECK-NEXT: %28 = call i64 @llvm.cttz.i64(i64 %27, i1 true)
+  // CHECK-NEXT: %cast18 = trunc i64 %28 to i32
+  // CHECK-NEXT: %iszero19 = icmp eq i64 %27, 0
+  // CHECK-NEXT: %29 = load i32, ptr %i.addr, align 4
+  // CHECK-NEXT: %ctzg20 = select i1 %iszero19, i32 %29, i32 %cast18
+  // CHECK-NEXT: store volatile i32 %ctzg20, ptr %tz, align 4
+  tz = __builtin_ctzg(ui128, i);
+  // CHECK-NEXT: %30 = load i128, ptr %ui128.addr, align 16
+  // CHECK-NEXT: %31 = call i128 @llvm.cttz.i128(i128 %30, i1 true)
+  // CHECK-NEXT: %cast21 = trunc i128 %31 to i32
+  // CHECK-NEXT: %iszero22 = icmp eq i128 %30, 0
+  // CHECK-NEXT: %32 = load i32, ptr %i.addr, align 4
+  // CHECK-NEXT: %ctzg23 = select i1 %iszero22, i32 %32, i32 %cast21
+  // CHECK-NEXT: store volatile i32 %ctzg23, ptr %tz, align 4
+  tz = __builtin_ctzg(ubi128, i);
+  // CHECK-NEXT: %33 = load i128, ptr %ubi128.addr, align 8
+  // CHECK-NEXT: %34 = call i128 @llvm.cttz.i128(i128 %33, i1 true)
+  // CHECK-NEXT: %cast24 = trunc i128 %34 to i32
+  // CHECK-NEXT: %iszero25 = icmp eq i128 %33, 0
+  // CHECK-NEXT: %35 = load i32, ptr %i.addr, align 4
+  // CHECK-NEXT: %ctzg26 = select i1 %iszero25, i32 %35, i32 %cast24
+  // CHECK-NEXT: store vola...
[truncated]
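
For readers skimming the patch, here is a minimal usage sketch of the two new builtins, following the documentation added above. It assumes a Clang build that includes this change and a target where `unsigned int` is 32 bits wide; the expected results are noted in the comments.

```c++
#include <cstdio>

int main() {
  unsigned int x = 0x10u;   // bit 4 set
  unsigned long long y = 0; // all bits clear
  unsigned __int128 z = 1;  // wide operand, also supported

  std::printf("%d\n", __builtin_clzg(x));     // 27: leading zeros in a 32-bit value
  std::printf("%d\n", __builtin_ctzg(x));     // 4: trailing zeros
  std::printf("%d\n", __builtin_ctzg(y, -1)); // -1: argument is 0, so the fallback is returned
  std::printf("%d\n", __builtin_clzg(z));     // 127: z is nonzero, so no fallback is needed
  return 0;
}
```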

llvmbot (Member) commented Feb 29, 2024:

@llvm/pr-subscribers-clang

Author: OverMighty (overmighty)

Changes

Fixes #83075, #83076.


(The rest of this comment is the same patch summary and diff as in the previous llvmbot comment; omitted.)

overmighty force-pushed the clang-builtin-clzg-ctzg branch from 3dbfd1f to 580ee1d on February 29, 2024 at 14:37
@@ -662,6 +662,12 @@ def Clz : Builtin, BitShort_Int_Long_LongLongTemplate {

// FIXME: Add int clzimax(uintmax_t)
overmighty (Member Author):

Should this FIXME and the ctzimax(uintmax_t) one below be removed?

Collaborator:

Probably; it's unlikely we're going to add them at this point.


QualType Arg1Ty = Arg1->getType();

if (S.Context.isPromotableIntegerType(Arg1Ty)) {
Collaborator:

This isn't quite the same as the way promotion normally works for arguments; it forbids some conversions we would normally do. Maybe that's fine? Maybe worth stating a bit more explicitly in the documentation.

overmighty (Member Author):

I had forgotten about Sema::UsualUnaryConversions, I think that's what I want.
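
To make the promotion question concrete, the following sketch shows what the check in this revision accepts and rejects for the fallback argument. This reflects the code as posted here; if a later revision switches to the usual unary conversions mentioned above, the accepted set may change.

```c++
void fallback_types(unsigned x, short s, unsigned char uc, long l) {
  (void)__builtin_clzg(x, 42); // OK: the fallback is already an int
  (void)__builtin_clzg(x, s);  // OK: short promotes to int
  (void)__builtin_clzg(x, uc); // OK: unsigned char promotes to int
  (void)__builtin_clzg(x, l);  // error: long does not promote to int, so the
                               // "2nd argument must be an 'int'" diagnostic fires
}
```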

@@ -3157,7 +3177,15 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
if (Result->getType() != ResultType)
Collaborator:

Not sure if the way this is handling isCLZForZeroUndef() is what you want?

overmighty (Member Author):

Since in the additions below I check if the argument itself is zero instead of checking the result of ctlz, I don't see why the ctlz result for zero being undefined would be a problem.

Collaborator:

I was more thinking of the opposite: we don't need to make the result of the clz defined if we're not using it.

overmighty (Member Author):

Right. Is it worth making tests for this with a target where isCLZForZeroUndef() is false?
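
For context, here is a rough C-level picture of the lowering discussed in this thread (a sketch, not the actual emitted IR): with a fallback argument, the patch calls the zero-is-undefined count intrinsic unconditionally and then selects on whether the argument itself was zero, so the possibly-undefined intrinsic result is never observed for a zero input.

```c++
// Roughly what the two-argument form lowers to. The one-argument form keeps
// the existing behavior: the intrinsic result is used directly, with the
// argument checked to be nonzero under -fsanitize=builtin.
inline int clzg_with_fallback(unsigned x, int fallback) {
  return x == 0 ? fallback : __builtin_clz(x); // select keyed on the argument,
                                               // not on the intrinsic result
}
```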

overmighty force-pushed the clang-builtin-clzg-ctzg branch from 580ee1d to 94a640c on March 4, 2024 at 13:28
overmighty (Member Author):

cc @nickdesaulniers

nickdesaulniers self-requested a review on March 5, 2024 at 23:05
nickdesaulniers (Member):

@efriedma-quic parting thoughts here?

efriedma-quic (Collaborator) left a comment:

LGTM with one minor comment


If the first argument is 0 and an optional second argument of ``int`` type is
provided, then the second argument is returned. If the first argument is 0, but
only one argument is provided, then the returned value is undefined.
Collaborator:

Probably should say "the behavior is undefined", to match the C standard wording for this sort of thing. (The IR intrinsic returns poison, but "poison" isn't a C language concept.)

nickdesaulniers (Member):

@overmighty do you need one of us to commit this for you? Thanks for working on it!

overmighty (Member Author):

I do need one of you to commit this for me. You're welcome. :)

nickdesaulniers merged commit c1c2551 into llvm:main on Mar 21, 2024
chencha3 pushed a commit to chencha3/llvm-project that referenced this pull request Mar 23, 2024
// FIXME: Add int clzimax(uintmax_t)
def Clzg : Builtin {
let Spellings = ["__builtin_clzg"];
let Attributes = [NoThrow, Const, CustomTypeChecking];
Member:

Should Constexpr be in this list of attributes?

https://godbolt.org/z/Efd3EW5xE

overmighty (Member Author):

I will implement constexpr support. It requires adding a few lines in clang/lib/AST/ExprConstant.cpp and clang/lib/AST/Interp/InterpBuiltin.cpp.

Member:

Filed #86549
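
Once that follow-up lands, a hypothetical constant-evaluated use could look like the snippet below. This does not compile against this PR as merged, since the builtin is not yet marked Constexpr, and it assumes a 32-bit `unsigned int`.

```c++
// Hypothetical: compile-time evaluation is tracked by the follow-up issue above.
constexpr int lz = __builtin_clzg(1u); // expected to evaluate to 31
constexpr int tz = __builtin_ctzg(8u); // expected to evaluate to 3
static_assert(lz == 31 && tz == 3);
```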

Successfully merging this pull request may close these issues:

  • [clang] implement __builtin_ctzg
  • [clang] implement __builtin_clzg

4 participants