Skip to content

[ClangFE] Improve handling of casting of atomic memory operations. #86691

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 2, 2024

Conversation

JonPsson1
Copy link
Contributor

  • Factor out a shouldCastToInt() method.
  • Also pass through pointer type values to not be casted to integer.

The follow up improvement patch per recent review. The non-ieee FP types left out as it seems easier if someone working with that target does this part including test updates, which should be simple enough by now. Next step might be to allow FP CmpXchg, but leaving that for someone interested in that. :)

CC @uweigand

@JonPsson1 JonPsson1 requested a review from arsenm March 26, 2024 16:17
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels Mar 26, 2024
@llvmbot
Copy link
Member

llvmbot commented Mar 26, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Jonas Paulsson (JonPsson1)

Changes
  • Factor out a shouldCastToInt() method.
  • Also pass through pointer type values to not be casted to integer.

The follow up improvement patch per recent review. The non-ieee FP types left out as it seems easier if someone working with that target does this part including test updates, which should be simple enough by now. Next step might be to allow FP CmpXchg, but leaving that for someone interested in that. :)

CC @uweigand


Full diff: https://github.com/llvm/llvm-project/pull/86691.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CGAtomic.cpp (+41-39)
  • (modified) clang/test/CodeGen/atomic.c (+3-6)
diff --git a/clang/lib/CodeGen/CGAtomic.cpp b/clang/lib/CodeGen/CGAtomic.cpp
index fb03d013e8afc7..f543529efe5b69 100644
--- a/clang/lib/CodeGen/CGAtomic.cpp
+++ b/clang/lib/CodeGen/CGAtomic.cpp
@@ -197,11 +197,11 @@ namespace {
     llvm::Value *getScalarRValValueOrNull(RValue RVal) const;
 
     /// Converts an rvalue to integer value if needed.
-    llvm::Value *convertRValueToInt(RValue RVal, bool CastFP = true) const;
+    llvm::Value *convertRValueToInt(RValue RVal, bool CmpXchg = false) const;
 
     RValue ConvertToValueOrAtomic(llvm::Value *IntVal, AggValueSlot ResultSlot,
                                   SourceLocation Loc, bool AsValue,
-                                  bool CastFP = true) const;
+                                  bool CmpXchg = false) const;
 
     /// Copy an atomic r-value into atomic-layout memory.
     void emitCopyIntoMemory(RValue rvalue) const;
@@ -264,7 +264,7 @@ namespace {
                                llvm::AtomicOrdering AO, bool IsVolatile);
     /// Emits atomic load as LLVM instruction.
     llvm::Value *EmitAtomicLoadOp(llvm::AtomicOrdering AO, bool IsVolatile,
-                                  bool CastFP = true);
+                                  bool CmpXchg = false);
     /// Emits atomic compare-and-exchange op as a libcall.
     llvm::Value *EmitAtomicCompareExchangeLibcall(
         llvm::Value *ExpectedAddr, llvm::Value *DesiredAddr,
@@ -272,7 +272,9 @@ namespace {
             llvm::AtomicOrdering::SequentiallyConsistent,
         llvm::AtomicOrdering Failure =
             llvm::AtomicOrdering::SequentiallyConsistent);
-    /// Emits atomic compare-and-exchange op as LLVM instruction.
+    /// Emits atomic compare-and-exchange op as LLVM instruction. Operands
+    /// must be of integer or pointer type, so float must be casted.
+    /// TODO: this could change - see comment in AtomicExpandPass.cpp.
     std::pair<llvm::Value *, llvm::Value *> EmitAtomicCompareExchangeOp(
         llvm::Value *ExpectedVal, llvm::Value *DesiredVal,
         llvm::AtomicOrdering Success =
@@ -1399,13 +1401,22 @@ RValue AtomicInfo::convertAtomicTempToRValue(Address addr,
       LVal.getBaseInfo(), TBAAAccessInfo()));
 }
 
+static bool shouldCastToInt(llvm::Type *ValTy, bool CmpXchg) {
+  // TODO: Also pass through non-ieee FP types.
+  bool KeepType = (ValTy->isIntegerTy() || ValTy->isPointerTy() ||
+                   (ValTy->isIEEELikeFPTy() && !CmpXchg));
+  return !KeepType;
+}
+
 RValue AtomicInfo::ConvertToValueOrAtomic(llvm::Value *Val,
                                           AggValueSlot ResultSlot,
                                           SourceLocation Loc, bool AsValue,
-                                          bool CastFP) const {
+                                          bool CmpXchg) const {
   // Try not to in some easy cases.
-  assert((Val->getType()->isIntegerTy() || Val->getType()->isIEEELikeFPTy()) &&
-         "Expected integer or floating point value");
+  assert((Val->getType()->isIntegerTy() || Val->getType()->isPointerTy() ||
+          Val->getType()->isIEEELikeFPTy()) &&
+         "Expected integer, pointer or floating point value when converting "
+         "result.");
   if (getEvaluationKind() == TEK_Scalar &&
       (((!LVal.isBitField() ||
          LVal.getBitFieldInfo().Size == ValueSizeInBits) &&
@@ -1414,13 +1425,11 @@ RValue AtomicInfo::ConvertToValueOrAtomic(llvm::Value *Val,
     auto *ValTy = AsValue
                       ? CGF.ConvertTypeForMem(ValueTy)
                       : getAtomicAddress().getElementType();
-    if (ValTy->isIntegerTy() || (!CastFP && ValTy->isIEEELikeFPTy())) {
+    if (!shouldCastToInt(ValTy, CmpXchg)) {
       assert((!ValTy->isIntegerTy() || Val->getType() == ValTy) &&
              "Different integer types.");
       return RValue::get(CGF.EmitFromMemory(Val, ValueTy));
-    } else if (ValTy->isPointerTy())
-      return RValue::get(CGF.Builder.CreateIntToPtr(Val, ValTy));
-    else if (llvm::CastInst::isBitCastable(Val->getType(), ValTy))
+    } else if (llvm::CastInst::isBitCastable(Val->getType(), ValTy))
       return RValue::get(CGF.Builder.CreateBitCast(Val, ValTy));
   }
 
@@ -1457,10 +1466,10 @@ void AtomicInfo::EmitAtomicLoadLibcall(llvm::Value *AddForLoaded,
 }
 
 llvm::Value *AtomicInfo::EmitAtomicLoadOp(llvm::AtomicOrdering AO,
-                                          bool IsVolatile, bool CastFP) {
+                                          bool IsVolatile, bool CmpXchg) {
   // Okay, we're doing this natively.
   Address Addr = getAtomicAddress();
-  if (!(Addr.getElementType()->isIEEELikeFPTy() && !CastFP))
+  if (shouldCastToInt(Addr.getElementType(), CmpXchg))
     Addr = castToAtomicIntPointer(Addr);
   llvm::LoadInst *Load = CGF.Builder.CreateLoad(Addr, "atomic-load");
   Load->setAtomic(AO);
@@ -1521,7 +1530,7 @@ RValue AtomicInfo::EmitAtomicLoad(AggValueSlot ResultSlot, SourceLocation Loc,
   }
 
   // Okay, we're doing this natively.
-  auto *Load = EmitAtomicLoadOp(AO, IsVolatile, /*CastFP=*/false);
+  auto *Load = EmitAtomicLoadOp(AO, IsVolatile);
 
   // If we're ignoring an aggregate return, don't do anything.
   if (getEvaluationKind() == TEK_Aggregate && ResultSlot.isIgnored())
@@ -1529,8 +1538,7 @@ RValue AtomicInfo::EmitAtomicLoad(AggValueSlot ResultSlot, SourceLocation Loc,
 
   // Okay, turn that back into the original value or atomic (for non-simple
   // lvalues) type.
-  return ConvertToValueOrAtomic(Load, ResultSlot, Loc, AsValue,
-                                /*CastFP=*/false);
+  return ConvertToValueOrAtomic(Load, ResultSlot, Loc, AsValue);
 }
 
 /// Emit a load from an l-value of atomic type.  Note that the r-value
@@ -1599,20 +1607,17 @@ llvm::Value *AtomicInfo::getScalarRValValueOrNull(RValue RVal) const {
   return nullptr;
 }
 
-llvm::Value *AtomicInfo::convertRValueToInt(RValue RVal, bool CastFP) const {
+llvm::Value *AtomicInfo::convertRValueToInt(RValue RVal, bool CmpXchg) const {
   // If we've got a scalar value of the right size, try to avoid going
   // through memory. Floats get casted if needed by AtomicExpandPass.
   if (llvm::Value *Value = getScalarRValValueOrNull(RVal)) {
-    if (isa<llvm::IntegerType>(Value->getType()) ||
-        (!CastFP && Value->getType()->isIEEELikeFPTy()))
+    if (!shouldCastToInt(Value->getType(), CmpXchg))
       return CGF.EmitToMemory(Value, ValueTy);
     else {
       llvm::IntegerType *InputIntTy = llvm::IntegerType::get(
           CGF.getLLVMContext(),
           LVal.isSimple() ? getValueSizeInBits() : getAtomicSizeInBits());
-      if (isa<llvm::PointerType>(Value->getType()))
-        return CGF.Builder.CreatePtrToInt(Value, InputIntTy);
-      else if (llvm::BitCastInst::isBitCastable(Value->getType(), InputIntTy))
+      if (llvm::BitCastInst::isBitCastable(Value->getType(), InputIntTy))
         return CGF.Builder.CreateBitCast(Value, InputIntTy);
     }
   }
@@ -1685,13 +1690,14 @@ std::pair<RValue, llvm::Value *> AtomicInfo::EmitAtomicCompareExchange(
 
   // If we've got a scalar value of the right size, try to avoid going
   // through memory.
-  auto *ExpectedVal = convertRValueToInt(Expected);
-  auto *DesiredVal = convertRValueToInt(Desired);
+  auto *ExpectedVal = convertRValueToInt(Expected, /*CmpXchg=*/true);
+  auto *DesiredVal = convertRValueToInt(Desired, /*CmpXchg=*/true);
   auto Res = EmitAtomicCompareExchangeOp(ExpectedVal, DesiredVal, Success,
                                          Failure, IsWeak);
   return std::make_pair(
       ConvertToValueOrAtomic(Res.first, AggValueSlot::ignored(),
-                             SourceLocation(), /*AsValue=*/false),
+                             SourceLocation(), /*AsValue=*/false,
+                             /*CmpXchg=*/true),
       Res.second);
 }
 
@@ -1785,7 +1791,7 @@ void AtomicInfo::EmitAtomicUpdateOp(
   auto Failure = llvm::AtomicCmpXchgInst::getStrongestFailureOrdering(AO);
 
   // Do the atomic load.
-  auto *OldVal = EmitAtomicLoadOp(Failure, IsVolatile);
+  auto *OldVal = EmitAtomicLoadOp(Failure, IsVolatile, /*CmpXchg=*/true);
   // For non-simple lvalues perform compare-and-swap procedure.
   auto *ContBB = CGF.createBasicBlock("atomic_cont");
   auto *ExitBB = CGF.createBasicBlock("atomic_exit");
@@ -1801,7 +1807,8 @@ void AtomicInfo::EmitAtomicUpdateOp(
     CGF.Builder.CreateStore(PHI, NewAtomicIntAddr);
   }
   auto OldRVal = ConvertToValueOrAtomic(PHI, AggValueSlot::ignored(),
-                                        SourceLocation(), /*AsValue=*/false);
+                                        SourceLocation(), /*AsValue=*/false,
+                                        /*CmpXchg=*/true);
   EmitAtomicUpdateValue(CGF, *this, OldRVal, UpdateOp, NewAtomicAddr);
   auto *DesiredVal = CGF.Builder.CreateLoad(NewAtomicIntAddr);
   // Try to write new value using cmpxchg operation.
@@ -1867,7 +1874,7 @@ void AtomicInfo::EmitAtomicUpdateOp(llvm::AtomicOrdering AO, RValue UpdateRVal,
   auto Failure = llvm::AtomicCmpXchgInst::getStrongestFailureOrdering(AO);
 
   // Do the atomic load.
-  auto *OldVal = EmitAtomicLoadOp(Failure, IsVolatile);
+  auto *OldVal = EmitAtomicLoadOp(Failure, IsVolatile, /*CmpXchg=*/true);
   // For non-simple lvalues perform compare-and-swap procedure.
   auto *ContBB = CGF.createBasicBlock("atomic_cont");
   auto *ExitBB = CGF.createBasicBlock("atomic_exit");
@@ -1966,21 +1973,16 @@ void CodeGenFunction::EmitAtomicStore(RValue rvalue, LValue dest,
     }
 
     // Okay, we're doing this natively.
-    llvm::Value *ValToStore =
-        atomics.convertRValueToInt(rvalue, /*CastFP=*/false);
+    llvm::Value *ValToStore = atomics.convertRValueToInt(rvalue);
 
     // Do the atomic store.
     Address Addr = atomics.getAtomicAddress();
-    bool ShouldCastToInt = true;
     if (llvm::Value *Value = atomics.getScalarRValValueOrNull(rvalue))
-      if (isa<llvm::IntegerType>(Value->getType()) ||
-          Value->getType()->isIEEELikeFPTy())
-        ShouldCastToInt = false;
-    if (ShouldCastToInt) {
-      Addr = atomics.castToAtomicIntPointer(Addr);
-      ValToStore = Builder.CreateIntCast(ValToStore, Addr.getElementType(),
-                                         /*isSigned=*/false);
-    }
+      if (shouldCastToInt(Value->getType(), /*CmpXchg=*/false)) {
+        Addr = atomics.castToAtomicIntPointer(Addr);
+        ValToStore = Builder.CreateIntCast(ValToStore, Addr.getElementType(),
+                                           /*isSigned=*/false);
+      }
     llvm::StoreInst *store = Builder.CreateStore(ValToStore, Addr);
 
     if (AO == llvm::AtomicOrdering::Acquire)
diff --git a/clang/test/CodeGen/atomic.c b/clang/test/CodeGen/atomic.c
index af5c056bbfe6e8..e919f2ad374447 100644
--- a/clang/test/CodeGen/atomic.c
+++ b/clang/test/CodeGen/atomic.c
@@ -134,14 +134,11 @@ static _Atomic float glob_flt = 0.0f;
 
 void force_global_uses(void) {
   (void)glob_pointer;
-  // CHECK: %[[LOCAL_INT:.+]] = load atomic i32, ptr @[[GLOB_POINTER]] seq_cst
-  // CHECK-NEXT: inttoptr i32 %[[LOCAL_INT]] to ptr
+  // CHECK: load atomic ptr, ptr @[[GLOB_POINTER]] seq_cst
   (void)glob_pointer_from_int;
-  // CHECK: %[[LOCAL_INT_2:.+]] = load atomic i32, ptr @[[GLOB_POINTER_FROM_INT]] seq_cst
-  // CHECK-NEXT: inttoptr i32 %[[LOCAL_INT_2]] to ptr
+  // CHECK-NEXT: load atomic ptr, ptr @[[GLOB_POINTER_FROM_INT]] seq_cst
   (void)nonstatic_glob_pointer_from_int;
-  // CHECK: %[[LOCAL_INT_3:.+]] = load atomic i32, ptr @[[NONSTATIC_GLOB_POINTER_FROM_INT]] seq_cst
-  // CHECK-NEXT: inttoptr i32 %[[LOCAL_INT_3]] to ptr
+  // CHECK-NEXT: load atomic ptr, ptr @[[NONSTATIC_GLOB_POINTER_FROM_INT]] seq_cst
   (void)glob_int;
   // CHECK: load atomic i32, ptr @[[GLOB_INT]] seq_cst
   (void)glob_flt;

@@ -1399,13 +1401,22 @@ RValue AtomicInfo::convertAtomicTempToRValue(Address addr,
LVal.getBaseInfo(), TBAAAccessInfo()));
}

static bool shouldCastToInt(llvm::Type *ValTy, bool CmpXchg) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The answer should just be false. I see no reason to maintain any of this logic in clang

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit messy...

  • For one thing, there can't currently be an atomic CmpXchg of float type - per the spec - so that must be casted to int. That would be a later patch on its own, including as well changes in the spec document and AtomicExpand.

  • An atomic struct needs to be casted to integer in EmitAtomicLoadOp().

  • In ConvertToValueOrAtomic() and convertRValueToInt(), scalars are treated specially. But still FP80 and float/CmpXchg scalars currrently need casting.

  • FP80 also seems like a separate patch if there's a need. I guess maybe in that case it does make sense to cast it to i128 here, as there is only one target using it, and it's a special non-power-of-2 case..?

So it still looks to me that there are these two changes that would first need handling, and then this could probably be improved further.

@arsenm
Copy link
Contributor

arsenm commented Mar 27, 2024

The non-ieee FP types left out as it seems easier if someone working with that target does this part including test updates, which should be simple enough by now.

Just add the tests

@JonPsson1
Copy link
Contributor Author

The non-ieee FP types left out as it seems easier if someone working with that target does this part including test updates, which should be simple enough by now.

Just add the tests

I think this case isn't that simple as it is an 80 bit value. Currently that is loaded atomically first with i128, then stored as a temporary and then loaded as an fp80. If I remove that casting, the verifier complains "atomic memory access' operand must have a power-of-two size".

@arsenm
Copy link
Contributor

arsenm commented Mar 27, 2024

I think this case isn't that simple as it is an 80 bit value. Currently that is loaded atomically first with i128, then stored as a temporary and then loaded as an fp80. If I remove that casting, the verifier complains "atomic memory access' operand must have a power-of-two size".

I think this should be more targeted based on the bitesize. PPCf128 should be simple

@JonPsson1
Copy link
Contributor Author

I think this case isn't that simple as it is an 80 bit value. Currently that is loaded atomically first with i128, then stored as a temporary and then loaded as an fp80. If I remove that casting, the verifier complains "atomic memory access' operand must have a power-of-two size".

I think this should be more targeted based on the bitesize. PPCf128 should be simple

Patch updated to handle all FP types except FP80.

(void)nonstatic_glob_pointer_from_int;
// CHECK: %[[LOCAL_INT_3:.+]] = load atomic i32, ptr @[[NONSTATIC_GLOB_POINTER_FROM_INT]] seq_cst
// CHECK-NEXT: inttoptr i32 %[[LOCAL_INT_3]] to ptr
// CHECK-NEXT: load atomic ptr, ptr @[[NONSTATIC_GLOB_POINTER_FROM_INT]] seq_cst
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing tests for the all the FP cases

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added tests for double and long double, and also a run for systemz as the long double type is fp80 on x86, evidently.

@@ -1399,13 +1401,22 @@ RValue AtomicInfo::convertAtomicTempToRValue(Address addr,
LVal.getBaseInfo(), TBAAAccessInfo()));
}

static bool shouldCastToInt(llvm::Type *ValTy, bool CmpXchg) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the CmpXchg parameter. Also needs a docstring

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was a bit more clear than 'CastFP', as the reason for it now is the compare-and-exchange instruction only.
Also added comment.

@JonPsson1
Copy link
Contributor Author

Updated per review.

@JonPsson1 JonPsson1 merged commit 4d5e834 into llvm:main Apr 2, 2024
@JonPsson1 JonPsson1 deleted the CFEAtomicCasts branch April 2, 2024 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants