Skip to content

[CIR] Lowering to LLVM for global pointers #125619

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 5, 2025

Conversation

andykaylor
Copy link
Contributor

Add support for lowering global variables of any pointer type to LLVM IR.

Add support for lowering global variables of any pointer type to LLVM IR.
@llvmbot llvmbot added clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project labels Feb 4, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 4, 2025

@llvm/pr-subscribers-clangir

@llvm/pr-subscribers-clang

Author: Andy Kaylor (andykaylor)

Changes

Add support for lowering global variables of any pointer type to LLVM IR.


Full diff: https://github.com/llvm/llvm-project/pull/125619.diff

4 Files Affected:

  • (added) clang/include/clang/CIR/Dialect/IR/CIRAttrVisitor.h (+51)
  • (modified) clang/include/clang/CIR/MissingFeatures.h (+3)
  • (modified) clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp (+84)
  • (modified) clang/test/CIR/Lowering/global-var-simple.cpp (+21)
diff --git a/clang/include/clang/CIR/Dialect/IR/CIRAttrVisitor.h b/clang/include/clang/CIR/Dialect/IR/CIRAttrVisitor.h
new file mode 100644
index 000000000000000..4babccc48038e61
--- /dev/null
+++ b/clang/include/clang/CIR/Dialect/IR/CIRAttrVisitor.h
@@ -0,0 +1,51 @@
+#ifndef LLVM_CLANG_CIR_DIALECT_IR_CIRATTRVISITOR_H
+#define LLVM_CLANG_CIR_DIALECT_IR_CIRATTRVISITOR_H
+
+#include "clang/CIR/Dialect/IR/CIRAttrs.h"
+
+namespace cir {
+
+template <typename ImplClass, typename RetTy> class CirAttrVisitor {
+public:
+  // FIXME: Create a TableGen list to automatically handle new attributes
+  template <typename... Args>
+  RetTy visit(mlir::Attribute attr, Args &&...args) {
+    if (const auto intAttr = mlir::dyn_cast<cir::IntAttr>(attr))
+      return static_cast<ImplClass *>(this)->visitCirIntAttr(
+          intAttr, std::forward<Args>(args)...);
+    if (const auto fltAttr = mlir::dyn_cast<cir::FPAttr>(attr))
+      return static_cast<ImplClass *>(this)->visitCirFPAttr(
+          fltAttr, std::forward<Args>(args)...);
+    if (const auto ptrAttr = mlir::dyn_cast<cir::ConstPtrAttr>(attr))
+      return static_cast<ImplClass *>(this)->visitCirConstPtrAttr(
+          ptrAttr, std::forward<Args>(args)...);
+    llvm_unreachable("unhandled attribute type");
+  }
+
+  // If the implementation chooses not to implement a certain visit
+  // method, fall back to the parent.
+  template <typename... Args>
+  RetTy visitCirIntAttr(cir::IntAttr attr, Args &&...args) {
+    return static_cast<ImplClass *>(this)->visitCirAttr(
+        attr, std::forward<Args>(args)...);
+  }
+  template <typename... Args>
+  RetTy visitCirFPAttr(cir::FPAttr attr, Args &&...args) {
+    return static_cast<ImplClass *>(this)->visitCirAttr(
+        attr, std::forward<Args>(args)...);
+  }
+  template <typename... Args>
+  RetTy visitCirConstPtrAttr(cir::ConstPtrAttr attr, Args &&...args) {
+    return static_cast<ImplClass *>(this)->visitCirAttr(
+        attr, std::forward<Args>(args)...);
+  }
+
+  template <typename... Args>
+  RetTy visitCirAttr(mlir::Attribute attr, Args &&...args) {
+    return RetTy();
+  }
+};
+
+} // namespace cir
+
+#endif // LLVM_CLANG_CIR_DIALECT_IR_CIRATTRVISITOR_H
diff --git a/clang/include/clang/CIR/MissingFeatures.h b/clang/include/clang/CIR/MissingFeatures.h
index 3c018aeea650142..d4fcd52e7e6e3bc 100644
--- a/clang/include/clang/CIR/MissingFeatures.h
+++ b/clang/include/clang/CIR/MissingFeatures.h
@@ -27,6 +27,9 @@ struct MissingFeatures {
   // Address space related
   static bool addressSpace() { return false; }
 
+  // This isn't needed until we add support for bools.
+  static bool convertTypeForMemory() { return false; }
+
   // Unhandled global/linkage information.
   static bool opGlobalDSOLocal() { return false; }
   static bool opGlobalThreadLocal() { return false; }
diff --git a/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp b/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
index af8ca7d0b89e681..66f6ee328e55e5d 100644
--- a/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+++ b/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
@@ -24,6 +24,7 @@
 #include "mlir/Target/LLVMIR/Dialect/LLVMIR/LLVMToLLVMIRTranslation.h"
 #include "mlir/Target/LLVMIR/Export.h"
 #include "mlir/Transforms/DialectConversion.h"
+#include "clang/CIR/Dialect/IR/CIRAttrVisitor.h"
 #include "clang/CIR/Dialect/IR/CIRDialect.h"
 #include "clang/CIR/MissingFeatures.h"
 #include "llvm/IR/Module.h"
@@ -35,6 +36,54 @@ using namespace llvm;
 namespace cir {
 namespace direct {
 
+class CIRAttrToValue : public CirAttrVisitor<CIRAttrToValue, mlir::Value> {
+public:
+  mlir::Value lowerCirAttrAsValue(mlir::Operation *parentOp,
+                                  mlir::Attribute attr,
+                                  mlir::ConversionPatternRewriter &rewriter,
+                                  const mlir::TypeConverter *converter,
+                                  mlir::DataLayout const &dataLayout) {
+    return visit(attr, parentOp, rewriter, converter, dataLayout);
+  }
+
+  mlir::Value visitCirIntAttr(cir::IntAttr intAttr, mlir::Operation *parentOp,
+                              mlir::ConversionPatternRewriter &rewriter,
+                              const mlir::TypeConverter *converter,
+                              mlir::DataLayout const &dataLayout) {
+    auto loc = parentOp->getLoc();
+    return rewriter.create<mlir::LLVM::ConstantOp>(
+        loc, converter->convertType(intAttr.getType()), intAttr.getValue());
+  }
+
+  mlir::Value visitCirFPAttr(cir::FPAttr fltAttr, mlir::Operation *parentOp,
+                             mlir::ConversionPatternRewriter &rewriter,
+                             const mlir::TypeConverter *converter,
+                             mlir::DataLayout const &dataLayout) {
+    auto loc = parentOp->getLoc();
+    return rewriter.create<mlir::LLVM::ConstantOp>(
+        loc, converter->convertType(fltAttr.getType()), fltAttr.getValue());
+  }
+
+  mlir::Value visitCirConstPtrAttr(cir::ConstPtrAttr ptrAttr,
+                                   mlir::Operation *parentOp,
+                                   mlir::ConversionPatternRewriter &rewriter,
+                                   const mlir::TypeConverter *converter,
+                                   mlir::DataLayout const &dataLayout) {
+    auto loc = parentOp->getLoc();
+    if (ptrAttr.isNullValue()) {
+      return rewriter.create<mlir::LLVM::ZeroOp>(
+          loc, converter->convertType(ptrAttr.getType()));
+    }
+    mlir::DataLayout layout(parentOp->getParentOfType<mlir::ModuleOp>());
+    mlir::Value ptrVal = rewriter.create<mlir::LLVM::ConstantOp>(
+        loc,
+        rewriter.getIntegerType(layout.getTypeSizeInBits(ptrAttr.getType())),
+        ptrAttr.getValue().getInt());
+    return rewriter.create<mlir::LLVM::IntToPtrOp>(
+        loc, converter->convertType(ptrAttr.getType()), ptrVal);
+  }
+};
+
 // This pass requires the CIR to be in a "flat" state. All blocks in each
 // function must belong to the parent region. Once scopes and control flow
 // are implemented in CIR, a pass will be run before this one to flatten
@@ -55,14 +104,19 @@ struct ConvertCIRToLLVMPass
   StringRef getArgument() const override { return "cir-flat-to-llvm"; }
 };
 
+/// Replace CIR global with a region initialized LLVM global and update
+/// insertion point to the end of the initializer block.
+
 mlir::LogicalResult CIRToLLVMGlobalOpLowering::matchAndRewrite(
     cir::GlobalOp op, OpAdaptor adaptor,
     mlir::ConversionPatternRewriter &rewriter) const {
 
   // Fetch required values to create LLVM op.
   const mlir::Type cirSymType = op.getSymType();
+  const auto loc = op.getLoc();
 
   // This is the LLVM dialect type.
+  assert(!cir::MissingFeatures::convertTypeForMemory());
   const mlir::Type llvmType = getTypeConverter()->convertType(cirSymType);
   // FIXME: These default values are placeholders until the the equivalent
   //        attributes are available on cir.global ops.
@@ -84,6 +138,19 @@ mlir::LogicalResult CIRToLLVMGlobalOpLowering::matchAndRewrite(
   SmallVector<mlir::NamedAttribute> attributes;
 
   if (init.has_value()) {
+    auto setupRegionInitializedLLVMGlobalOp = [&]() {
+      assert(!cir::MissingFeatures::convertTypeForMemory());
+      const mlir::Type llvmType =
+          getTypeConverter()->convertType(op.getSymType());
+      SmallVector<mlir::NamedAttribute> attributes;
+      auto newGlobalOp = rewriter.replaceOpWithNewOp<mlir::LLVM::GlobalOp>(
+          op, llvmType, isConst, linkage, op.getSymName(), nullptr, alignment,
+          addrSpace, isDsoLocal, isThreadLocal,
+          /*comdat=*/mlir::SymbolRefAttr(), attributes);
+      newGlobalOp.getRegion().push_back(new mlir::Block());
+      rewriter.setInsertionPointToEnd(newGlobalOp.getInitializerBlock());
+    };
+
     if (const auto fltAttr = mlir::dyn_cast<cir::FPAttr>(init.value())) {
       // Initializer is a constant floating-point number: convert to MLIR
       // builtin constant.
@@ -92,6 +159,16 @@ mlir::LogicalResult CIRToLLVMGlobalOpLowering::matchAndRewrite(
                    mlir::dyn_cast<cir::IntAttr>(init.value())) {
       // Initializer is a constant array: convert it to a compatible llvm init.
       init = rewriter.getIntegerAttr(llvmType, intAttr.getValue());
+    } else if (isa<cir::ConstPtrAttr>(init.value())) {
+      // TODO(cir): once LLVM's dialect has proper equivalent attributes this
+      // should be updated. For now, we use a custom op to initialize globals
+      // to the appropriate value.
+      setupRegionInitializedLLVMGlobalOp();
+      CIRAttrToValue attrVisitor;
+      mlir::Value value = attrVisitor.lowerCirAttrAsValue(
+          op, init.value(), rewriter, typeConverter, dataLayout);
+      rewriter.create<mlir::LLVM::ReturnOp>(loc, value);
+      return mlir::success();
     } else {
       op.emitError() << "unsupported initializer '" << init.value() << "'";
       return mlir::failure();
@@ -109,6 +186,13 @@ mlir::LogicalResult CIRToLLVMGlobalOpLowering::matchAndRewrite(
 
 static void prepareTypeConverter(mlir::LLVMTypeConverter &converter,
                                  mlir::DataLayout &dataLayout) {
+  converter.addConversion([&](cir::PointerType type) -> mlir::Type {
+    // Drop pointee type since LLVM dialect only allows opaque pointers.
+    assert(!cir::MissingFeatures::addressSpace());
+    unsigned targetAS = 0;
+
+    return mlir::LLVM::LLVMPointerType::get(type.getContext(), targetAS);
+  });
   converter.addConversion([&](cir::IntType type) -> mlir::Type {
     // LLVM doesn't work with signed types, so we drop the CIR signs here.
     return mlir::IntegerType::get(type.getContext(), type.getWidth());
diff --git a/clang/test/CIR/Lowering/global-var-simple.cpp b/clang/test/CIR/Lowering/global-var-simple.cpp
index 06050e409d54401..a5adb4011931afe 100644
--- a/clang/test/CIR/Lowering/global-var-simple.cpp
+++ b/clang/test/CIR/Lowering/global-var-simple.cpp
@@ -79,3 +79,24 @@ long double ld;
 
 __float128 f128;
 // CHECK: @f128 = external dso_local global fp128
+
+void *vp;
+// CHECK: @vp = external dso_local global ptr{{$}}
+
+int *ip = 0;
+// CHECK: @ip = dso_local global ptr null
+
+double *dp;
+// CHECK: @dp = external dso_local global ptr{{$}}
+
+char **cpp;
+// CHECK: @cpp = external dso_local global ptr{{$}}
+
+void (*fp)();
+// CHECK: @fp = external dso_local global ptr{{$}}
+
+int (*fpii)(int) = 0;
+// CHECK: @fpii = dso_local global ptr null
+
+void (*fpvar)(int, ...);
+// CHECK: @fpvar = external dso_local global ptr{{$}}

@andykaylor
Copy link
Contributor Author

This patch includes an experimental attempt at a CIR attribute visitor along the lines that @erichkeane suggested in a previous PR review. Unfortunately, I wasn't able to use the new visitor in the place that he suggested it without an ugly shoehorn because different handlers there would need different return behavior. I thought it would be worth posting it for discussion of the one place I was able to use it to see if we think it's worth pursuing to implement with proper TableGen support and some way of handling the init value case.

@@ -84,6 +138,19 @@ mlir::LogicalResult CIRToLLVMGlobalOpLowering::matchAndRewrite(
SmallVector<mlir::NamedAttribute> attributes;

if (init.has_value()) {
auto setupRegionInitializedLLVMGlobalOp = [&]() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer this get extracted to a function in an anonymous namespace (or a static function). It does quite a lot of work for a lambda.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a separate function in the incubator. The reason I moved it to a lambda is that there was a lot of state information needed that would either need to be passed in as arguments or recreated. The state is per-global so it can't just be stored in the CIRToLLVMGlobalOpLowering object. It's the set of arguments that are passed to replaceOpWithNewOp()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! I see, I missed how much it was capturing. Can you link me to the incubator version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not quite as bad there because everything is coming from the op object. It was worse here because of all the placeholders for things the op doesn't have yet here. Still the incubator implementation retrieves a lot of this in multiple places and only calls the function in this scope.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, hrmph. That is unfortunately a much nicer version as a separate function. Can we have a FIXME to move this once dataLayout, typeConverter, and init.value get moved? (or whatever else needs to be moved?). It would be great if we could do that sooner, but I see what you mean.

mlir::Value value = attrVisitor.lowerCirAttrAsValue(
op, init.value(), rewriter, typeConverter, dataLayout);
rewriter.create<mlir::LLVM::ReturnOp>(loc, value);
return mlir::success();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... this already changing to make these not 'equal' in some way is causing me consternation. I realize this is the reason for an inability to skip the visitor, which I see the reasoning for here, but this function is going to QUICKLY get unreadable without some sort of better organization.

We should noodle on a way to better organize this into something that isn't going to get unwieldy. And by 'we' I believe that is a group-homework-assignment :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Looking at the incubator implementation (which is a huge series of if-else-if statements), I see that there are basically two categories -- (1) types for which the init value is updated directly before falling through to the replaceOpWithNewOp() call, and (2) types for which a region-initialized global is required. I think I can restructure the code around this decision and make it much clearer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<3 Feel free to do so as a followup and not here if you'd prefer. This is more of a "we should fix this before it gets bad", not that it is bad yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea Andy

mlir::LogicalResult CIRToLLVMGlobalOpLowering::matchAndRewrite(
cir::GlobalOp op, OpAdaptor adaptor,
mlir::ConversionPatternRewriter &rewriter) const {

// Fetch required values to create LLVM op.
const mlir::Type cirSymType = op.getSymType();
const auto loc = op.getLoc();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this passes our use of auto policy here. Unless the type is named Loc :)

EDIT: Looked, it isnt:

Suggested change
const auto loc = op.getLoc();
const Location loc = op.getLoc();

}

mlir::Value visitCirIntAttr(cir::IntAttr intAttr, mlir::Operation *parentOp,
mlir::ConversionPatternRewriter &rewriter,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of all the arguments, typically with the visitors, we have them become members of the visitor itself. So this would be:

mlir::Value visitCirIntAttr(cir::IntAttr intAttr) {
  Location Loc parentOp->getLoc();
return rewriter.create<mlir::LLVM::ConstantOp>(
        loc, converter->convertType(intAttr.getType()), intAttr.getValue());
}

still, but they would be members instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my first thought as well. As I mentioned in another response, the compiler didn't want to copy my reference to the rewriter, something about a deleted copy constructor. I'll take another look at that to see if I can figure out what was going on.

mlir::ConversionPatternRewriter &rewriter,
const mlir::TypeConverter *converter,
mlir::DataLayout const &dataLayout) {
auto loc = parentOp->getLoc();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto loc = parentOp->getLoc();
Location loc = parentOp->getLoc();

template <typename... Args>
RetTy visit(mlir::Attribute attr, Args &&...args) {
if (const auto intAttr = mlir::dyn_cast<cir::IntAttr>(attr))
return static_cast<ImplClass *>(this)->visitCirIntAttr(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest a getDerived instead of the static_cast everywhere.

public:
// FIXME: Create a TableGen list to automatically handle new attributes
template <typename... Args>
RetTy visit(mlir::Attribute attr, Args &&...args) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variadic is an interesting idea! I fear that gets unwieldy in practice though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I picked this idea up from the rewriter's replaceOpWithNewOp function. That actually did give me fits when I was trying to figure out the expected argument types because the compiler kept telling me I was calling build() with the wrong argument types and it took me a while to figure out where the build() call was getting its arguments.

The reason I went with it here is that I needed a way to get the rewriter (which is passed to my calling function as a reference) to the visit functions. The compiler didn't like me trying to save the rewriter reference as a member variable, and the lifetime management of that felt sketchy anyway. The variadic here lets me pass it through as a reference to where it needed to be, which is what happens in the incubator's non-visitor implementation of this.

I'm sure I can figure out a different way to handle this if we all agree that the variadic makes things too obscure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because the compiler kept telling me I was calling build() with the wrong argument types

Yea, this isn't great experience (being myself there often) - I'd say we should optimize for our ability to be efficient writing code, these things can be a time drain

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 comment based issues, else lgtm

@@ -0,0 +1,40 @@
#ifndef LLVM_CLANG_CIR_DIALECT_IR_CIRATTRVISITOR_H
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file probably needs our common comment header.

@@ -55,14 +121,81 @@ struct ConvertCIRToLLVMPass
StringRef getArgument() const override { return "cir-flat-to-llvm"; }
};

bool CIRToLLVMGlobalOpLowering::attrRequiresRegionInitialization(
mlir::Attribute attr) const {
// There will be more case added later.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// There will be more case added later.
// There will be more cases added later.

@andykaylor
Copy link
Contributor Author

@lanza @bcardosolopes Do you want to look this over before I commit it? This change introduces some restructuring that I'll want to incorporate in the incubator, where it will have a somewhat bigger impact, so I'd like to get your buy-in or objections now before I start working on that incubator change. Thanks.

@bcardosolopes
Copy link
Member

@lanza @bcardosolopes Do you want to look this over before I commit it? This change introduces some restructuring that I'll want to incorporate in the incubator, where it will have a somewhat bigger impact, so I'd like to get your buy-in or objections now before I start working on that incubator change. Thanks.

Looking now!

@@ -0,0 +1,52 @@
//===- TypeConverter.h - Convert builtin to LLVM dialect types --*- C++ -*-===//
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TypeConverter.h -> CIRAttrVisitor.h

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@andykaylor andykaylor merged commit 2b5cc89 into llvm:main Feb 5, 2025
6 of 7 checks passed
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
Add support for lowering global variables of any pointer type to LLVM
IR.
@andykaylor andykaylor deleted the cir-lower-global-ptr branch March 4, 2025 22:53
bcardosolopes pushed a commit to bcardosolopes/llvm-project that referenced this pull request Mar 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants