
[RFC][flang] Replace special symbols in uniqued global names. #104859


Merged
merged 2 commits into from
Aug 21, 2024
6 changes: 5 additions & 1 deletion flang/include/flang/Optimizer/CodeGen/CGPasses.td
Expand Up @@ -36,7 +36,11 @@ def FIRToLLVMLowering : Pass<"fir-to-llvm-ir", "mlir::ModuleOp"> {
Option<"forcedTargetFeatures", "target-features", "std::string",
/*default=*/"", "Override module's target features.">,
Option<"applyTBAA", "apply-tbaa", "bool", /*default=*/"false",
- "Attach TBAA tags to memory accessing operations.">
+ "Attach TBAA tags to memory accessing operations.">,
+ Option<"typeDescriptorsRenamedForAssembly",
+ "type-descriptors-renamed-for-assembly", "bool", /*default=*/"false",
+ "Global variables created to describe derived types "
+ "have been renamed to avoid special symbols in their names.">
];
}

10 changes: 10 additions & 0 deletions flang/include/flang/Optimizer/CodeGen/CodeGen.h
Expand Up @@ -44,6 +44,16 @@ struct FIRToLLVMPassOptions {

// Force the usage of a unified tbaa tree in TBAABuilder.
bool forceUnifiedTBAATree = false;

// If set to true, then the global variables created
// for the derived types have been renamed to avoid usage
// of special symbols that may not be supported by all targets.
// The renaming is done by the CompilerGeneratedNamesConversion pass.
// If it is true, FIR-to-LLVM pass has to use
// fir::NameUniquer::getTypeDescriptorAssemblyName() to take
// the name of the global variable corresponding to a derived
// type's descriptor.
bool typeDescriptorsRenamedForAssembly = false;
};
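
The comment on typeDescriptorsRenamedForAssembly describes a contract: once the globals have been renamed, the FIR-to-LLVM pass must look them up under the renamed spelling. A minimal sketch of that selection follows, assuming a toy options struct and a descriptor name that has already been built. ToyPassOptions, typeDescriptorLookupName and the sample name are hypothetical and not part of the patch.

```cpp
#include <algorithm>
#include <string>

// Hypothetical stand-in for FIRToLLVMPassOptions, reduced to the one flag.
struct ToyPassOptions {
  bool typeDescriptorsRenamedForAssembly = false;
};

// Given the dotted descriptor name, return the name the lookup should use.
// When the renaming pass has run, every '.' has become 'X', so the lookup
// has to apply the same substitution to find the global.
std::string typeDescriptorLookupName(std::string dottedName,
                                     const ToyPassOptions &options) {
  if (options.typeDescriptorsRenamedForAssembly)
    std::replace(dottedName.begin(), dottedName.end(), '.', 'X');
  return dottedName;
}
```

If the flag and the pipeline disagree, the lookup would miss the global, which is presumably why the option is derived from the same switch that disables the renaming pass.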

/// Convert FIR to the LLVM IR dialect with default options.
25 changes: 20 additions & 5 deletions flang/include/flang/Optimizer/Support/InternalNames.h
Expand Up @@ -14,13 +14,23 @@
#include <cstdint>
#include <optional>

- static constexpr llvm::StringRef typeDescriptorSeparator = ".dt.";
- static constexpr llvm::StringRef componentInitSeparator = ".di.";
- static constexpr llvm::StringRef bindingTableSeparator = ".v.";
- static constexpr llvm::StringRef boxprocSuffix = "UnboxProc";

namespace fir {

+ static constexpr llvm::StringRef kNameSeparator = ".";
+ static constexpr llvm::StringRef kBoundsSeparator = ".b.";
+ static constexpr llvm::StringRef kComponentSeparator = ".c.";
+ static constexpr llvm::StringRef kComponentInitSeparator = ".di.";
+ static constexpr llvm::StringRef kDataPtrInitSeparator = ".dp.";
+ static constexpr llvm::StringRef kTypeDescriptorSeparator = ".dt.";
+ static constexpr llvm::StringRef kKindParameterSeparator = ".kp.";
+ static constexpr llvm::StringRef kLenKindSeparator = ".lpk.";
+ static constexpr llvm::StringRef kLenParameterSeparator = ".lv.";
+ static constexpr llvm::StringRef kNameStringSeparator = ".n.";
+ static constexpr llvm::StringRef kProcPtrSeparator = ".p.";
+ static constexpr llvm::StringRef kSpecialBindingSeparator = ".s.";
+ static constexpr llvm::StringRef kBindingTableSeparator = ".v.";
+ static constexpr llvm::StringRef boxprocSuffix = "UnboxProc";

/// Internal name mangling of identifiers
///
/// In order to generate symbolically referencable artifacts in a ModuleOp,
Expand Down Expand Up @@ -150,6 +160,9 @@ struct NameUniquer {
/// not a valid mangled derived type name.
static std::string getTypeDescriptorName(llvm::StringRef mangledTypeName);

static std::string
getTypeDescriptorAssemblyName(llvm::StringRef mangledTypeName);

/// Given a mangled derived type name, get the name of the related binding
/// table object. Returns an empty string if \p mangledTypeName is not a valid
/// mangled derived type name.
Expand All @@ -169,6 +182,8 @@ struct NameUniquer {
static llvm::StringRef
dropTypeConversionMarkers(llvm::StringRef mangledTypeName);

static std::string replaceSpecialSymbols(const std::string &name);

private:
static std::string intAsString(std::int64_t i);
static std::string doKind(std::int64_t kind);
1 change: 1 addition & 0 deletions flang/include/flang/Optimizer/Transforms/Passes.h
Expand Up @@ -59,6 +59,7 @@ namespace fir {
#define GEN_PASS_DECL_VSCALEATTR
#define GEN_PASS_DECL_FUNCTIONATTR
#define GEN_PASS_DECL_CONSTANTARGUMENTGLOBALISATIONOPT
#define GEN_PASS_DECL_COMPILERGENERATEDNAMESCONVERSION

#include "flang/Optimizer/Transforms/Passes.h.inc"

18 changes: 18 additions & 0 deletions flang/include/flang/Optimizer/Transforms/Passes.td
Expand Up @@ -170,6 +170,24 @@ def ExternalNameConversion : Pass<"external-name-interop", "mlir::ModuleOp"> {
];
}

def CompilerGeneratedNamesConversion : Pass<"compiler-generated-names",
"mlir::ModuleOp"> {
let summary = "Convert names of compiler generated globals";
let description = [{
Transforms names of compiler generated globals to avoid
characters that might be unsupported by some target toolchains.
All special symbols are replaced with a predefined 'X' character.
This is only done for uniqued names that are not externally facing.
The uniqued names always use '_Q' prefix, and the user entity names
are always lower cased, so using 'X' instead of the special symbols
will guarantee that the converted name will not conflict with the user
space. This pass does not affect the externally facing names,
because the expectation is that the compiler will not generate
externally facing names on its own, and these names cannot use
special symbols.
}];
}

def MemRefDataFlowOpt : Pass<"fir-memref-dataflow-opt", "::mlir::func::FuncOp"> {
let summary =
"Perform store/load forwarding and potentially removing dead stores.";
10 changes: 10 additions & 0 deletions flang/include/flang/Tools/CLOptions.inc
Expand Up @@ -93,6 +93,8 @@ DisableOption(ExternalNameConversion, "external-name-interop",
"convert names with external convention");
EnableOption(ConstantArgumentGlobalisation, "constant-argument-globalisation",
"the local constant argument to global constant conversion");
DisableOption(CompilerGeneratedNamesConversion, "compiler-generated-names",
"replace special symbols in compiler generated names");

using PassConstructor = std::unique_ptr<mlir::Pass>();

Expand Down Expand Up @@ -222,6 +224,8 @@ inline void addFIRToLLVMPass(
options.ignoreMissingTypeDescriptors = ignoreMissingTypeDescriptors;
options.applyTBAA = config.AliasAnalysis;
options.forceUnifiedTBAATree = useOldAliasTags;
options.typeDescriptorsRenamedForAssembly =
!disableCompilerGeneratedNamesConversion;
addPassConditionally(pm, disableFirToLlvmIr,
[&]() { return fir::createFIRToLLVMPass(options); });
// The dialect conversion framework may leave dead unrealized_conversion_cast
Expand All @@ -248,6 +252,11 @@ inline void addExternalNameConversionPass(
[&]() { return fir::createExternalNameConversion({appendUnderscore}); });
}

inline void addCompilerGeneratedNamesConversionPass(mlir::PassManager &pm) {
addPassConditionally(pm, disableCompilerGeneratedNamesConversion,
[&]() { return fir::createCompilerGeneratedNamesConversion(); });
}
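
The hunks above tie two things to one switch: whether the renaming pass is added, and what the FIR-to-LLVM options say about type descriptor names. A rough sketch of that coupling follows, using a toy pipeline that only records pass names. ToyPipeline, ToyCodeGenOptions and buildPipeline are hypothetical, and this addPassConditionally only imitates the shape of the real helper.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical pipeline that only records the names of the passes added.
struct ToyPipeline {
  std::vector<std::string> passes;
};

// Same shape as the helper used in the patch: add the pass unless disabled.
void addPassConditionally(ToyPipeline &pm, bool disabled,
                          const std::function<std::string()> &ctor) {
  if (!disabled)
    pm.passes.push_back(ctor());
}

// Hypothetical stand-in for the FIR-to-LLVM pass options.
struct ToyCodeGenOptions {
  bool typeDescriptorsRenamedForAssembly = false;
};

// The renaming pass and the code generation flag are driven by one switch,
// so code generation never expects renamed globals that were not renamed.
ToyCodeGenOptions buildPipeline(ToyPipeline &pm, bool disableNamesConversion) {
  addPassConditionally(pm, disableNamesConversion,
                       [] { return std::string("compiler-generated-names"); });
  ToyCodeGenOptions options;
  options.typeDescriptorsRenamedForAssembly = !disableNamesConversion;
  pm.passes.push_back("fir-to-llvm-ir");
  return options;
}
```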

// Use inliner extension point callback to register the default inliner pass.
inline void registerDefaultInlinerPass(MLIRToLLVMPassPipelineConfig &config) {
config.registerFIRInlinerCallback(
Expand Down Expand Up @@ -379,6 +388,7 @@ inline void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm,
fir::addCodeGenRewritePass(
pm, (config.DebugInfo != llvm::codegenoptions::NoDebugInfo));
fir::addTargetRewritePass(pm);
fir::addCompilerGeneratedNamesConversionPass(pm);
fir::addExternalNameConversionPass(pm, config.Underscoring);
fir::createDebugPasses(pm, config.DebugInfo, config.OptLevel, inputFilename);

13 changes: 11 additions & 2 deletions flang/lib/Optimizer/CodeGen/CodeGen.cpp
Expand Up @@ -1201,7 +1201,9 @@ struct EmboxCommonConversion : public fir::FIROpConversion<OP> {
mlir::Location loc,
fir::RecordType recType) const {
std::string name =
- fir::NameUniquer::getTypeDescriptorName(recType.getName());
+ this->options.typeDescriptorsRenamedForAssembly
+ ? fir::NameUniquer::getTypeDescriptorAssemblyName(recType.getName())
+ : fir::NameUniquer::getTypeDescriptorName(recType.getName());
mlir::Type llvmPtrTy = ::getLlvmPtrType(mod.getContext());
if (auto global = mod.template lookupSymbol<fir::GlobalOp>(name)) {
return rewriter.create<mlir::LLVM::AddressOfOp>(loc, llvmPtrTy,
Expand Down Expand Up @@ -2704,7 +2706,10 @@ struct TypeDescOpConversion : public fir::FIROpConversion<fir::TypeDescOp> {
auto recordType = mlir::dyn_cast<fir::RecordType>(inTy);
auto module = typeDescOp.getOperation()->getParentOfType<mlir::ModuleOp>();
std::string typeDescName =
- fir::NameUniquer::getTypeDescriptorName(recordType.getName());
+ this->options.typeDescriptorsRenamedForAssembly
+ ? fir::NameUniquer::getTypeDescriptorAssemblyName(
+ recordType.getName())
+ : fir::NameUniquer::getTypeDescriptorName(recordType.getName());
auto llvmPtrTy = ::getLlvmPtrType(typeDescOp.getContext());
if (auto global = module.lookupSymbol<mlir::LLVM::GlobalOp>(typeDescName)) {
rewriter.replaceOpWithNewOp<mlir::LLVM::AddressOfOp>(
Expand Down Expand Up @@ -3653,6 +3658,10 @@ class FIRToLLVMLowering
if (!forcedTargetFeatures.empty())
fir::setTargetFeatures(mod, forcedTargetFeatures);

if (typeDescriptorsRenamedForAssembly)
options.typeDescriptorsRenamedForAssembly =
typeDescriptorsRenamedForAssembly;

// Run dynamic pass pipeline for converting Math dialect
// operations into other dialects (llvm, func, etc.).
// Some conversions of Math operations cannot be done
31 changes: 23 additions & 8 deletions flang/lib/Optimizer/Support/InternalNames.cpp
Expand Up @@ -16,6 +16,7 @@
#include "mlir/IR/Diagnostics.h"
#include "llvm/Support/CommandLine.h"
#include <optional>
#include <regex>

static llvm::cl::opt<std::string> mainEntryName(
"main-entry-name",
Expand Down Expand Up @@ -59,7 +60,11 @@ convertToStringRef(const std::optional<std::string> &from) {

static std::string readName(llvm::StringRef uniq, std::size_t &i,
std::size_t init, std::size_t end) {
- for (i = init; i < end && (uniq[i] < 'A' || uniq[i] > 'Z'); ++i) {
+ // Allow 'X' to be part of the mangled name, which
+ // can happen after the special symbols are replaced
+ // in the mangled names by CompilerGeneratedNamesConversionPass.
+ for (i = init; i < end && (uniq[i] < 'A' || uniq[i] > 'Z' || uniq[i] == 'X');
+ ++i) {
// do nothing
}
return uniq.substr(init, i - init).str();
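
The updated loop can be tried in isolation. The sketch below reproduces the new scanning condition with std::string_view in place of llvm::StringRef (an assumption made only to keep the example free of LLVM headers): scanning stops at the first upper-case letter other than 'X', so an 'X' left behind by the renaming pass stays inside the name being read.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Read characters of uniq from init up to, but not including, the first
// upper-case letter other than 'X', or up to end. On return, i holds the
// position where scanning stopped.
std::string readName(std::string_view uniq, std::size_t &i, std::size_t init,
                     std::size_t end) {
  for (i = init;
       i < end && (uniq[i] < 'A' || uniq[i] > 'Z' || uniq[i] == 'X'); ++i) {
    // do nothing
  }
  return std::string(uniq.substr(init, i - init));
}
```

With the previous condition, a scan starting on an 'X' would stop immediately and return an empty name.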
Expand Down Expand Up @@ -348,7 +353,7 @@ mangleTypeDescriptorKinds(llvm::ArrayRef<std::int64_t> kinds) {
return "";
std::string result;
for (std::int64_t kind : kinds)
- result += "." + std::to_string(kind);
+ result += (fir::kNameSeparator + std::to_string(kind)).str();
return result;
}

Expand All @@ -373,26 +378,36 @@ static std::string getDerivedTypeObjectName(llvm::StringRef mangledTypeName,

std::string
fir::NameUniquer::getTypeDescriptorName(llvm::StringRef mangledTypeName) {
- return getDerivedTypeObjectName(mangledTypeName, typeDescriptorSeparator);
+ return getDerivedTypeObjectName(mangledTypeName,
+ fir::kTypeDescriptorSeparator);
}

std::string fir::NameUniquer::getTypeDescriptorAssemblyName(
llvm::StringRef mangledTypeName) {
return replaceSpecialSymbols(getTypeDescriptorName(mangledTypeName));
}

std::string fir::NameUniquer::getTypeDescriptorBindingTableName(
llvm::StringRef mangledTypeName) {
- return getDerivedTypeObjectName(mangledTypeName, bindingTableSeparator);
+ return getDerivedTypeObjectName(mangledTypeName, fir::kBindingTableSeparator);
}

std::string
fir::NameUniquer::getComponentInitName(llvm::StringRef mangledTypeName,
llvm::StringRef componentName) {

std::string prefix =
- getDerivedTypeObjectName(mangledTypeName, componentInitSeparator);
- return prefix + "." + componentName.str();
+ getDerivedTypeObjectName(mangledTypeName, fir::kComponentInitSeparator);
+ return (prefix + fir::kNameSeparator + componentName).str();
}

llvm::StringRef
fir::NameUniquer::dropTypeConversionMarkers(llvm::StringRef mangledTypeName) {
- if (mangledTypeName.ends_with(boxprocSuffix))
- return mangledTypeName.drop_back(boxprocSuffix.size());
+ if (mangledTypeName.ends_with(fir::boxprocSuffix))
+ return mangledTypeName.drop_back(fir::boxprocSuffix.size());
return mangledTypeName;
}

std::string fir::NameUniquer::replaceSpecialSymbols(const std::string &name) {
return std::regex_replace(name, std::regex{"\\."}, "X");
}
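
For illustration, here is the substitution performed by replaceSpecialSymbols as a standalone sketch. The first function uses the same regex call as the patch; the second is a hypothetical alternative using std::replace that should give the same result for a single-character substitution. The function names and the sample name used in the usage note are illustrative only.

```cpp
#include <algorithm>
#include <regex>
#include <string>

// Same approach as the patch: replace every literal '.' with 'X'.
std::string replaceSpecialSymbolsRegex(const std::string &name) {
  return std::regex_replace(name, std::regex{"\\."}, "X");
}

// Hypothetical alternative: one pass over the string, no regex needed.
std::string replaceSpecialSymbolsSimple(std::string name) {
  std::replace(name.begin(), name.end(), '.', 'X');
  return name;
}
```

Under either version a name such as _QMmE.dt.t becomes _QMmEXdtXt, and a name without dots is returned unchanged.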
1 change: 1 addition & 0 deletions flang/lib/Optimizer/Transforms/CMakeLists.txt
Expand Up @@ -6,6 +6,7 @@ add_flang_library(FIRTransforms
AnnotateConstant.cpp
AssumedRankOpConversion.cpp
CharacterConversion.cpp
CompilerGeneratedNames.cpp
ConstantArgumentGlobalisation.cpp
ControlFlowConverter.cpp
CufOpConversion.cpp
80 changes: 80 additions & 0 deletions flang/lib/Optimizer/Transforms/CompilerGeneratedNames.cpp
@@ -0,0 +1,80 @@
//=== CompilerGeneratedNames.cpp - convert special symbols in global names ===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "flang/Optimizer/Dialect/FIRDialect.h"
#include "flang/Optimizer/Dialect/FIROps.h"
#include "flang/Optimizer/Dialect/FIROpsSupport.h"
#include "flang/Optimizer/Support/InternalNames.h"
#include "flang/Optimizer/Transforms/Passes.h"
#include "mlir/IR/Attributes.h"
#include "mlir/IR/SymbolTable.h"
#include "mlir/Pass/Pass.h"

namespace fir {
#define GEN_PASS_DEF_COMPILERGENERATEDNAMESCONVERSION
#include "flang/Optimizer/Transforms/Passes.h.inc"
} // namespace fir

using namespace mlir;

namespace {

class CompilerGeneratedNamesConversionPass
: public fir::impl::CompilerGeneratedNamesConversionBase<
CompilerGeneratedNamesConversionPass> {
public:
using CompilerGeneratedNamesConversionBase<
CompilerGeneratedNamesConversionPass>::
CompilerGeneratedNamesConversionBase;

mlir::ModuleOp getModule() { return getOperation(); }
void runOnOperation() override;
};
} // namespace

void CompilerGeneratedNamesConversionPass::runOnOperation() {
auto op = getOperation();
auto *context = &getContext();

llvm::DenseMap<mlir::StringAttr, mlir::FlatSymbolRefAttr> remappings;
for (auto &funcOrGlobal : op->getRegion(0).front()) {
if (llvm::isa<mlir::func::FuncOp>(funcOrGlobal) ||
llvm::isa<fir::GlobalOp>(funcOrGlobal)) {
auto symName = funcOrGlobal.getAttrOfType<mlir::StringAttr>(
mlir::SymbolTable::getSymbolAttrName());
auto deconstructedName = fir::NameUniquer::deconstruct(symName);
if (deconstructedName.first != fir::NameUniquer::NameKind::NOT_UNIQUED &&
Contributor

IO lowering produces some global symbols containing dots for type-bound IO, and I think they may not be considered uniqued here.

See:

! CHECK: fir.global linkonce @default.nonTbpDefinedIoTable constant : tuple<i64, !fir.ref<!fir.array<0xtuple<!fir.ref<none>, !fir.ref<none>, i32, i1>>>, i1>

? "default" + suffix

Contributor

In addition to the .nonTbpDefinedIoTable suffix, IO.cpp also has two instances of a .list suffix.

Contributor Author

Thank you for the pointers, Jean and Val! I would like to address these cases separately.

These names do not appear to come from a reserved name space, so replacing . with X in them could conflict with user-space names (e.g. when linking with an object file created from a C module). I think I can move them into "Flang's name space", e.g. @_QEdefault.nonTbpDefinedIoTable or @_QECdefault.nonTbpDefinedIoTable; the new pass would then handle them, and we would be safe from name conflicts (given that _Q is reserved).

Contributor

The solution you suggest makes sense to me; I am OK with doing it in a separate patch.

!fir::NameUniquer::isExternalFacingUniquedName(deconstructedName)) {
std::string newName =
fir::NameUniquer::replaceSpecialSymbols(symName.getValue().str());
if (newName != symName) {
auto newAttr = mlir::StringAttr::get(context, newName);
mlir::SymbolTable::setSymbolName(&funcOrGlobal, newAttr);
auto newSymRef = mlir::FlatSymbolRefAttr::get(newAttr);
remappings.try_emplace(symName, newSymRef);
}
}
}
}

if (remappings.empty())
return;

// Update all uses of the functions and globals that have been renamed.
op.walk([&remappings](mlir::Operation *nestedOp) {
llvm::SmallVector<std::pair<mlir::StringAttr, mlir::SymbolRefAttr>> updates;
for (const mlir::NamedAttribute &attr : nestedOp->getAttrDictionary())
if (auto symRef = llvm::dyn_cast<mlir::SymbolRefAttr>(attr.getValue()))
if (auto remap = remappings.find(symRef.getRootReference());
remap != remappings.end())
updates.emplace_back(std::pair<mlir::StringAttr, mlir::SymbolRefAttr>{
attr.getName(), mlir::SymbolRefAttr(remap->second)});
for (auto update : updates)
nestedOp->setAttr(update.first, update.second);
});
}
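
The pass works in two phases: it first renames eligible symbol definitions while recording an old-to-new map, then walks all operations and rewrites references found in that map. The sketch below shows the same idea on a toy module with no MLIR dependency. ToyOp, renameForAssembly and renameSymbols are hypothetical, and the "_Q" prefix test is a simplification of the real check, which uses fir::NameUniquer::deconstruct and isExternalFacingUniquedName.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an MLIR operation.
struct ToyOp {
  std::string symName;              // empty if the op defines no symbol
  std::vector<std::string> symRefs; // symbols this op refers to
};

// Simplified eligibility check plus substitution: only names that start
// with "_Q" are converted. The real pass deconstructs the name instead.
static std::string renameForAssembly(const std::string &name) {
  if (name.rfind("_Q", 0) != 0)
    return name;
  std::string result = name;
  std::replace(result.begin(), result.end(), '.', 'X');
  return result;
}

// Returns the number of definitions that were renamed.
std::size_t renameSymbols(std::vector<ToyOp> &ops) {
  std::unordered_map<std::string, std::string> remappings;
  // Phase 1: rename definitions and remember old name -> new name.
  for (ToyOp &op : ops) {
    if (op.symName.empty())
      continue;
    std::string newName = renameForAssembly(op.symName);
    if (newName != op.symName) {
      remappings.try_emplace(op.symName, newName);
      op.symName = newName;
    }
  }
  if (remappings.empty())
    return 0;
  // Phase 2: update every reference to a renamed definition.
  for (ToyOp &op : ops)
    for (std::string &ref : op.symRefs)
      if (auto it = remappings.find(ref); it != remappings.end())
        ref = it->second;
  return remappings.size();
}
```

In this sketch a name such as default.nonTbpDefinedIoTable is left untouched because it lacks the "_Q" prefix, which mirrors the concern raised in the review thread above.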