-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[ObjCDirect] Move nil check to a thunk function #126639
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
clang/lib/CodeGen/CGObjCMac.cpp
Outdated
Builder.CreateStore(result.getScalarVal(), selfAddr); | ||
|
||
// Nullable `Class` expressions cannot be messaged with a direct method | ||
// so the only reason why the receive can be null would be because |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should ReceiverCanBeNull default to false then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually no, this code path handles both instance method and class method. For all instance methods, receiver can be null.
d861143
to
abd8cb2
Compare
clang/lib/AST/Mangle.cpp
Outdated
OS << '['; | ||
OS << ClassName; | ||
if (CategoryName) | ||
OS << "(" << CategoryName.value() << ")"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you are already checking CategoryName
one line above, using *CategoryName
should be slightly less work.
clang/lib/CodeGen/CGCall.cpp
Outdated
if (InnerFn) { | ||
// Don't emit any arg except for `self` if we are in a thunk function. | ||
// We still need self for nil check, other arguments aren't used in this | ||
// function and thus is not needed. Avoid emitting them also prevents |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: is not needed
→ are not needed
clang/lib/CodeGen/CGObjCMac.cpp
Outdated
// return (ReturnType){ }; | ||
// } | ||
|
||
if (OMD->isClassMethod()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the summary you say that class methods do not have nil checks, and canHaveNilCheckThunk
checks for the method being an instance method. Is this piece of code here for future improvements or is it dead code and can it be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the semantics here is a bit vague as @AdamCmiel had the same question. This code path is shared between nil check thunk and class method. I should probably separate them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name of the function kinda implies this is for "direct". From the calls to this function, one might be not related to direct (the one that only uses InnerFn
to decide if to invoke, but I thought InnerFn
was a direct detail).
int vprintf(const char * restrict format, va_list ap); | ||
#define NULL ((void *)0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/vprintf-vprintf-l-vwprintf-vwprintf-l?view=msvc-170 you need to import stdio.h
, stdarg.h
and varargs.h
for UNIX V compatibility (if you want to try… I am not sure va_list
is going to be visible without varargs.h
, but the transitive headers are always a problem).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to import
stdio.h
Windows was not happy about #include <stdio.h>
saying it couldn't find it. That's why I only included necessary declaration.
@llvm/pr-subscribers-clang-codegen Author: Peter Rong (DataCorrupted) ChangesTL;DR“In Objective-C, it is valid to send a message to nil—it simply has no effect at runtime.”[1] Such requirements can be easily implemented in the runtime. For However, not all direct method calls need to do this check if we can be sure at static time that Detailed designOverall
Class methods are excludedThe receiver for the class method (self) is the class object. That said, there is only one (global) instance of this receiver and it is lazily initialized. That means most of the time nil check doesn’t happen, and the code is not generated. Variadic argument functions are excludedWe need to deal with different calling conventions on different platforms to make variadic argument functions work. Since there is only a handful amount of these functions, we feel reasonable to exclude them. ResultsIn our testing pre-upstreaming, we've discovered that the added symbols introduced <0.1% size regression in some of our iOS apps. With this patch we open the window for more optimizations including:
As a result, we believe this trade off is acceptable. Future plansThis PR is the first patch towards ObjCDirect ABI visibility change. In the future, we plan to do the following:
Patch is 60.59 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/126639.diff 24 Files Affected:
diff --git a/clang/include/clang/AST/DeclObjC.h b/clang/include/clang/AST/DeclObjC.h
index 4663603f79754..69c36b77da6c2 100644
--- a/clang/include/clang/AST/DeclObjC.h
+++ b/clang/include/clang/AST/DeclObjC.h
@@ -482,6 +482,12 @@ class ObjCMethodDecl : public NamedDecl, public DeclContext {
/// True if the method is tagged as objc_direct
bool isDirectMethod() const;
+ // Only direct instance method that have a fixed number of arguments can have
+ // nil check thunk functions.
+ bool canHaveNilCheckThunk() const {
+ return isDirectMethod() && isInstanceMethod() && !isVariadic();
+ }
+
/// True if the method has a parameter that's destroyed in the callee.
bool hasParamDestroyedInCallee() const;
diff --git a/clang/include/clang/AST/Mangle.h b/clang/include/clang/AST/Mangle.h
index 6134a70c04bd3..7d6f2181b8326 100644
--- a/clang/include/clang/AST/Mangle.h
+++ b/clang/include/clang/AST/Mangle.h
@@ -40,6 +40,13 @@ struct ThisAdjustment;
struct ThunkInfo;
class VarDecl;
+/// Extract mangling function name from MangleContext such that swift can call
+/// it to prepare for ObjCDirect in swift.
+void mangleObjCMethodName(raw_ostream &OS, bool includePrefixByte,
+ bool isInstanceMethod, StringRef ClassName,
+ std::optional<StringRef> CategoryName,
+ StringRef MethodName, bool isThunk);
+
/// MangleContext - Context for tracking state which persists across multiple
/// calls to the C++ name mangler.
class MangleContext {
@@ -153,7 +160,8 @@ class MangleContext {
void mangleObjCMethodName(const ObjCMethodDecl *MD, raw_ostream &OS,
bool includePrefixByte = true,
- bool includeCategoryNamespace = true);
+ bool includeCategoryNamespace = true,
+ bool isThunk = true);
void mangleObjCMethodNameAsSourceName(const ObjCMethodDecl *MD,
raw_ostream &);
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index a7f5f1abbb825..837b6e5ae4e9d 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -359,6 +359,7 @@ CODEGENOPT(EmitLLVMUseLists, 1, 0) ///< Control whether to serialize use-lists.
CODEGENOPT(WholeProgramVTables, 1, 0) ///< Whether to apply whole-program
/// vtable optimization.
+CODEGENOPT(ObjCEmitNilCheckThunk , 1, 0) ///< Whether objc_direct methods should emit a nil check thunk.
CODEGENOPT(VirtualFunctionElimination, 1, 0) ///< Whether to apply the dead
/// virtual function elimination
diff --git a/clang/include/clang/Basic/LangOptions.def b/clang/include/clang/Basic/LangOptions.def
index 383440ddbc0ea..d41693667621c 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -372,6 +372,7 @@ LANGOPT(IncludeDefaultHeader, 1, 0, "Include default header file for OpenCL")
LANGOPT(DeclareOpenCLBuiltins, 1, 0, "Declare OpenCL builtin functions")
BENIGN_LANGOPT(DelayedTemplateParsing , 1, 0, "delayed template parsing")
LANGOPT(BlocksRuntimeOptional , 1, 0, "optional blocks runtime")
+LANGOPT(ObjCEmitNilCheckThunk, 1, 0, "Emit a thunk to do nil check for objc direct methods")
LANGOPT(
CompleteMemberPointers, 1, 0,
"Require member pointer base types to be complete at the point where the "
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 5ad187926e710..eb06410f57172 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3001,6 +3001,8 @@ def fthin_link_bitcode_EQ : Joined<["-"], "fthin-link-bitcode=">,
Visibility<[ClangOption, CLOption, CC1Option]>, Group<f_Group>,
HelpText<"Write minimized bitcode to <file> for the ThinLTO thin link only">,
MarshallingInfoString<CodeGenOpts<"ThinLinkBitcodeFile">>;
+def fobjc_emit_nil_check_thunk:
+ Flag<["-"], "fobjc-emit-nil-check-thunk">, Visibility<[ClangOption, CC1Option]>, Group<f_Group>;
defm fat_lto_objects : BoolFOption<"fat-lto-objects",
CodeGenOpts<"FatLTO">, DefaultFalse,
PosFlag<SetTrue, [], [ClangOption, CC1Option], "Enable">,
diff --git a/clang/lib/AST/Mangle.cpp b/clang/lib/AST/Mangle.cpp
index 15be9c62bf888..fa37e80a76c88 100644
--- a/clang/lib/AST/Mangle.cpp
+++ b/clang/lib/AST/Mangle.cpp
@@ -29,6 +29,24 @@
using namespace clang;
+void clang::mangleObjCMethodName(raw_ostream &OS, bool includePrefixByte,
+ bool isInstanceMethod, StringRef ClassName,
+ std::optional<StringRef> CategoryName,
+ StringRef MethodName, bool isThunk) {
+ // \01+[ContainerName(CategoryName) SelectorName]
+ if (includePrefixByte)
+ OS << "\01";
+ OS << (isInstanceMethod ? '-' : '+');
+ OS << '[';
+ OS << ClassName;
+ if (CategoryName)
+ OS << "(" << *CategoryName << ")";
+ OS << " ";
+ OS << MethodName;
+ OS << ']';
+ if (!isThunk)
+ OS << "_inner";
+}
// FIXME: For blocks we currently mimic GCC's mangling scheme, which leaves
// much to be desired. Come up with a better mangling scheme.
@@ -327,7 +345,8 @@ void MangleContext::mangleBlock(const DeclContext *DC, const BlockDecl *BD,
void MangleContext::mangleObjCMethodName(const ObjCMethodDecl *MD,
raw_ostream &OS,
bool includePrefixByte,
- bool includeCategoryNamespace) {
+ bool includeCategoryNamespace,
+ bool isThunk) {
if (getASTContext().getLangOpts().ObjCRuntime.isGNUFamily()) {
// This is the mangling we've always used on the GNU runtimes, but it
// has obvious collisions in the face of underscores within class
@@ -361,24 +380,24 @@ void MangleContext::mangleObjCMethodName(const ObjCMethodDecl *MD,
}
// \01+[ContainerName(CategoryName) SelectorName]
- if (includePrefixByte) {
- OS << '\01';
- }
- OS << (MD->isInstanceMethod() ? '-' : '+') << '[';
+ auto CategoryName = std::optional<StringRef>();
+ StringRef ClassName;
if (const auto *CID = MD->getCategory()) {
- OS << CID->getClassInterface()->getName();
- if (includeCategoryNamespace) {
- OS << '(' << *CID << ')';
- }
+ if (includeCategoryNamespace)
+ CategoryName = CID->getName();
+ ClassName = CID->getClassInterface()->getName();
+
} else if (const auto *CD =
dyn_cast<ObjCContainerDecl>(MD->getDeclContext())) {
- OS << CD->getName();
+ ClassName = CD->getName();
} else {
llvm_unreachable("Unexpected ObjC method decl context");
}
- OS << ' ';
- MD->getSelector().print(OS);
- OS << ']';
+ std::string MethodName;
+ llvm::raw_string_ostream MethodNameOS(MethodName);
+ MD->getSelector().print(MethodNameOS);
+ clang::mangleObjCMethodName(OS, includePrefixByte, MD->isInstanceMethod(),
+ ClassName, CategoryName, MethodName, isThunk);
}
void MangleContext::mangleObjCMethodNameAsSourceName(const ObjCMethodDecl *MD,
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 47bfd470dbafb..c522b794fbfe3 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -2702,6 +2702,17 @@ void CodeGenModule::ConstructAttributeList(StringRef Name,
ArgAttrs[IRArgs.first] = llvm::AttributeSet::get(getLLVMContext(), Attrs);
}
+ // Direct method prologue should not contain nil check anymore.
+ // As a result, we can set `self` to be NonNull to prepare for further
+ // optimizations.
+ if (TargetDecl) {
+ auto OMD = dyn_cast<ObjCMethodDecl>(TargetDecl);
+ if (shouldHaveNilCheckThunk(OMD) && !IsThunk) {
+ auto IRArgs = IRFunctionArgs.getIRArgs(0);
+ ArgAttrs[IRArgs.first] = ArgAttrs[IRArgs.first].addAttribute(
+ getLLVMContext(), llvm::Attribute::NonNull);
+ }
+ }
unsigned ArgNo = 0;
for (CGFunctionInfo::const_arg_iterator I = FI.arg_begin(),
@@ -3386,7 +3397,13 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo &FI,
}
}
- if (getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee()) {
+ if (InnerFn) {
+ // Don't emit any arg except for `self` if we are in a thunk function.
+ // We still need self for nil check, other arguments aren't used in this
+ // function and thus are not needed. Avoid emitting them also prevents
+ // accidental release/retain.
+ EmitParmDecl(*Args[0], ArgVals[0], 1);
+ } else if (getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee()) {
for (int I = Args.size() - 1; I >= 0; --I)
EmitParmDecl(*Args[I], ArgVals[I], I + 1);
} else {
diff --git a/clang/lib/CodeGen/CGObjC.cpp b/clang/lib/CodeGen/CGObjC.cpp
index 27c7c2fa9cba1..1fd4c1d2c8a0f 100644
--- a/clang/lib/CodeGen/CGObjC.cpp
+++ b/clang/lib/CodeGen/CGObjC.cpp
@@ -756,12 +756,13 @@ void CodeGenFunction::StartObjCMethod(const ObjCMethodDecl *OMD,
if (OMD->hasAttr<NoDebugAttr>())
DebugInfo = nullptr; // disable debug info indefinitely for this function
- llvm::Function *Fn = CGM.getObjCRuntime().GenerateMethod(OMD, CD);
+ bool isInner = CGM.shouldHaveNilCheckThunk(OMD) && !InnerFn;
+ llvm::Function *Fn = CGM.getObjCRuntime().GenerateMethod(OMD, CD, !isInner);
const CGFunctionInfo &FI = CGM.getTypes().arrangeObjCMethodDeclaration(OMD);
if (OMD->isDirectMethod()) {
Fn->setVisibility(llvm::Function::HiddenVisibility);
- CGM.SetLLVMFunctionAttributes(OMD, FI, Fn, /*IsThunk=*/false);
+ CGM.SetLLVMFunctionAttributes(OMD, FI, Fn, /*IsThunk=*/InnerFn);
CGM.SetLLVMFunctionAttributesForDefinition(OMD, Fn);
} else {
CGM.SetInternalFunctionAttributes(OMD, Fn, FI);
@@ -780,10 +781,14 @@ void CodeGenFunction::StartObjCMethod(const ObjCMethodDecl *OMD,
OMD->getLocation(), StartLoc);
if (OMD->isDirectMethod()) {
- // This function is a direct call, it has to implement a nil check
- // on entry.
- //
- // TODO: possibly have several entry points to elide the check
+ // Having `InnerFn` indicates that we are generating a nil check thunk.
+ // In that case our job is done here.
+ if (InnerFn)
+ return;
+ // Only NeXTFamily can have nil check thunk.
+ if (CGM.shouldHaveNilCheckThunk(OMD))
+ // Go generate a nil check thunk around `Fn`
+ CodeGenFunction(CGM, /*InnerFn=*/Fn).GenerateObjCDirectThunk(OMD, CD);
CGM.getObjCRuntime().GenerateDirectMethodPrologue(*this, Fn, OMD, CD);
}
@@ -1637,6 +1642,44 @@ void CodeGenFunction::GenerateObjCSetter(ObjCImplementationDecl *IMP,
FinishFunction(OMD->getEndLoc());
}
+void CodeGenFunction::GenerateObjCDirectThunk(const ObjCMethodDecl *OMD,
+ const ObjCContainerDecl *CD) {
+ assert(InnerFn && CGM.shouldHaveNilCheckThunk(OMD) &&
+ "Should only generate wrapper when the flag is set.");
+ StartObjCMethod(OMD, CD);
+
+ // Manually pop all the clean up that doesn't need to happen in the outer
+ // function. InnerFn will do this for us.
+ while (EHStack.stable_begin() != PrologueCleanupDepth)
+ EHStack.popCleanup();
+
+ // Generate a nil check.
+ CGM.getObjCRuntime().GenerateDirectMethodPrologue(*this, CurFn, OMD, CD);
+ // Call the InnerFn and pass the return value
+ SmallVector<llvm::Value *> Args(CurFn->arg_size());
+ std::transform(CurFn->arg_begin(), CurFn->arg_end(), Args.begin(),
+ [](llvm::Argument &arg) { return &arg; });
+
+ // This will be optimized into a tail call.
+ auto *CallInst = EmitCallOrInvoke(InnerFn, Args);
+ // Preserve the inner function's attributes to the call instruction.
+ CallInst->setAttributes(InnerFn->getAttributes());
+ llvm::Value *RetVal = CallInst;
+
+ // If `AutoreleaseResult` is set, the return value is not void.
+ if (AutoreleaseResult)
+ RetVal = EmitARCRetainAutoreleasedReturnValue(RetVal);
+
+ // This excessive store is totally unnecessary.
+ // But `FinishFunction` really wants us to store the result so it can
+ // clean up the function properly.
+ // The unnecessary store-load of the ret value will be optimized out anyway.
+ if (!CurFn->getReturnType()->isVoidTy())
+ Builder.CreateStore(RetVal, ReturnValue);
+
+ // Nil check's end location is the function's start location.
+ FinishFunction(OMD->getBeginLoc());
+}
namespace {
struct DestroyIvar final : EHScopeStack::Cleanup {
private:
diff --git a/clang/lib/CodeGen/CGObjCGNU.cpp b/clang/lib/CodeGen/CGObjCGNU.cpp
index d1876f47c0eea..802168befd628 100644
--- a/clang/lib/CodeGen/CGObjCGNU.cpp
+++ b/clang/lib/CodeGen/CGObjCGNU.cpp
@@ -595,7 +595,13 @@ class CGObjCGNU : public CGObjCRuntime {
llvm::Constant *GetEHType(QualType T) override;
llvm::Function *GenerateMethod(const ObjCMethodDecl *OMD,
- const ObjCContainerDecl *CD) override;
+ const ObjCContainerDecl *CD,
+ bool isThunk) override {
+ // isThunk is irrelevent for GNU.
+ return GenerateMethod(OMD, CD);
+ };
+ llvm::Function *GenerateMethod(const ObjCMethodDecl *OMD,
+ const ObjCContainerDecl *CD);
// Map to unify direct method definitions.
llvm::DenseMap<const ObjCMethodDecl *, llvm::Function *>
diff --git a/clang/lib/CodeGen/CGObjCMac.cpp b/clang/lib/CodeGen/CGObjCMac.cpp
index 01552b6e53d00..4f0d7567b4b97 100644
--- a/clang/lib/CodeGen/CGObjCMac.cpp
+++ b/clang/lib/CodeGen/CGObjCMac.cpp
@@ -1068,12 +1068,13 @@ class CGObjCCommonMac : public CodeGen::CGObjCRuntime {
ConstantAddress GenerateConstantString(const StringLiteral *SL) override;
ConstantAddress GenerateConstantNSString(const StringLiteral *SL);
- llvm::Function *
- GenerateMethod(const ObjCMethodDecl *OMD,
- const ObjCContainerDecl *CD = nullptr) override;
+ llvm::Function *GenerateMethod(const ObjCMethodDecl *OMD,
+ const ObjCContainerDecl *CD = nullptr,
+ bool isThunk = true) override;
llvm::Function *GenerateDirectMethod(const ObjCMethodDecl *OMD,
- const ObjCContainerDecl *CD);
+ const ObjCContainerDecl *CD,
+ bool isThunk);
void GenerateDirectMethodPrologue(CodeGenFunction &CGF, llvm::Function *Fn,
const ObjCMethodDecl *OMD,
@@ -2094,7 +2095,8 @@ CodeGen::RValue CGObjCCommonMac::EmitMessageSend(
llvm::FunctionCallee Fn = nullptr;
if (Method && Method->isDirectMethod()) {
assert(!IsSuper);
- Fn = GenerateDirectMethod(Method, Method->getClassInterface());
+ Fn = GenerateDirectMethod(Method, Method->getClassInterface(),
+ /*isThunk=*/true);
// Direct methods will synthesize the proper `_cmd` internally,
// so just don't bother with setting the `_cmd` argument.
RequiresSelValue = false;
@@ -2118,10 +2120,6 @@ CodeGen::RValue CGObjCCommonMac::EmitMessageSend(
: ObjCTypes.getSendFn(IsSuper);
}
- // Cast function to proper signature
- llvm::Constant *BitcastFn = cast<llvm::Constant>(
- CGF.Builder.CreateBitCast(Fn.getCallee(), MSI.MessengerType));
-
// We don't need to emit a null check to zero out an indirect result if the
// result is ignored.
if (Return.isUnused())
@@ -2131,6 +2129,16 @@ CodeGen::RValue CGObjCCommonMac::EmitMessageSend(
if (!RequiresNullCheck && Method && Method->hasParamDestroyedInCallee())
RequiresNullCheck = true;
+ // `RequiresNullCheck` will do the null check at the caller site.
+ // In that case, a direct instance method can skip the null check and call the
+ // inner function instead.
+ if (CGM.shouldHaveNilCheckThunk(Method))
+ if (RequiresNullCheck || !ReceiverCanBeNull)
+ Fn = GenerateDirectMethod(Method, Method->getClassInterface(),
+ /*isThunk=*/false);
+ // Cast function to proper signature
+ llvm::Constant *BitcastFn = cast<llvm::Constant>(
+ CGF.Builder.CreateBitCast(Fn.getCallee(), MSI.MessengerType));
NullReturnState nullReturn;
if (RequiresNullCheck) {
nullReturn.init(CGF, Arg0);
@@ -3857,11 +3865,12 @@ CGObjCMac::emitMethodList(Twine name, MethodListType MLT,
}
llvm::Function *CGObjCCommonMac::GenerateMethod(const ObjCMethodDecl *OMD,
- const ObjCContainerDecl *CD) {
+ const ObjCContainerDecl *CD,
+ bool isThunk) {
llvm::Function *Method;
if (OMD->isDirectMethod()) {
- Method = GenerateDirectMethod(OMD, CD);
+ Method = GenerateDirectMethod(OMD, CD, isThunk);
} else {
auto Name = getSymbolNameForMethod(OMD);
@@ -3877,14 +3886,13 @@ llvm::Function *CGObjCCommonMac::GenerateMethod(const ObjCMethodDecl *OMD,
return Method;
}
-llvm::Function *
-CGObjCCommonMac::GenerateDirectMethod(const ObjCMethodDecl *OMD,
- const ObjCContainerDecl *CD) {
+llvm::Function *CGObjCCommonMac::GenerateDirectMethod(
+ const ObjCMethodDecl *OMD, const ObjCContainerDecl *CD, bool isThunk) {
auto *COMD = OMD->getCanonicalDecl();
auto I = DirectMethodDefinitions.find(COMD);
llvm::Function *OldFn = nullptr, *Fn = nullptr;
- if (I != DirectMethodDefinitions.end()) {
+ if (isThunk && I != DirectMethodDefinitions.end()) {
// Objective-C allows for the declaration and implementation types
// to differ slightly.
//
@@ -3913,11 +3921,16 @@ CGObjCCommonMac::GenerateDirectMethod(const ObjCMethodDecl *OMD,
// Replace the cached function in the map.
I->second = Fn;
} else {
- auto Name = getSymbolNameForMethod(OMD, /*include category*/ false);
-
- Fn = llvm::Function::Create(MethodTy, llvm::GlobalValue::ExternalLinkage,
- Name, &CGM.getModule());
- DirectMethodDefinitions.insert(std::make_pair(COMD, Fn));
+ auto Name =
+ getSymbolNameForMethod(OMD, /*include category*/ false, isThunk);
+ // Non-thunk functions are not cached and may be repeatedly created.
+ // Therefore, we try to find it before we create one.
+ Fn = CGM.getModule().getFunction(Name);
+ if (!Fn)
+ Fn = llvm::Function::Create(MethodTy, llvm::GlobalValue::ExternalLinkage,
+ Name, &CGM.getModule());
+ if (isThunk)
+ DirectMethodDefinitions.insert(std::make_pair(COMD, Fn));
}
return Fn;
@@ -3930,6 +3943,8 @@ void CGObjCCommonMac::GenerateDirectMethodPrologue(
bool ReceiverCanBeNull = true;
auto selfAddr = CGF.GetAddrOfLocalVar(OMD->getSelfDecl());
auto selfValue = Builder.CreateLoad(selfAddr);
+ bool shouldHaveNilCheckThunk = CGF.CGM.shouldHaveNilCheckThunk(OMD);
+ bool isNilCheckThunk = shouldHaveNilCheckThunk && CGF.InnerFn;
// Generate:
//
@@ -3971,7 +3986,9 @@ void CGObjCCommonMac::GenerateDirectMethodPrologue(
ReceiverCanBeNull = isWeakLinkedClass(OID);
}
- if (ReceiverCanBeNull) {
+ // Only emit nil check if this is a nil check thunk or the method
+ // decides that its receiver can be null
+ if (isNilCheckThunk || (!shouldHaveNilCheckThunk && ReceiverCanBeNull)) {
llvm::BasicBlock *SelfIsNilBlock =
CGF.createBasicBlock("objc_direct_method.self_is_nil");
llvm::BasicBlock *ContBlock =
@@ -4001,14 +4018,19 @@ void CGObjCCommonMac::GenerateDirectMethodPrologue(
Builder.SetInsertPoint(ContBlock);
}
- // only synthesize _cmd if it's referenced
- if (OMD->getCmdDecl()->isUsed()) {
+ // Only synthesize _cmd if it's referenced
+ // However, a nil check thunk doesn't need _cmd even if it's referenced
+ if (!isNilCheckThunk && OMD->getCmdDecl()->isUsed()) {
// `_cmd` is not a parameter to direct methods, so storage must be
// explicitly declared for it.
CGF.EmitVarDecl(*OMD->getCmdDecl());
Builder.CreateStore(GetSelector(CGF, OMD),
CGF.GetAddrOfLocalVar(OMD->getCmdDecl()));
}
+
+ // It's possible that selfValue is never used.
+ if (se...
[truncated]
|
@llvm/pr-subscribers-clang-driver Author: Peter Rong (DataCorrupted) ChangesTL;DR“In Objective-C, it is valid to send a message to nil—it simply has no effect at runtime.”[1] Such requirements can be easily implemented in the runtime. For However, not all direct method calls need to do this check if we can be sure at static time that Detailed designOverall
Class methods are excludedThe receiver for the class method (self) is the class object. That said, there is only one (global) instance of this receiver and it is lazily initialized. That means most of the time nil check doesn’t happen, and the code is not generated. Variadic argument functions are excludedWe need to deal with different calling conventions on different platforms to make variadic argument functions work. Since there is only a handful amount of these functions, we feel reasonable to exclude them. ResultsIn our testing pre-upstreaming, we've discovered that the added symbols introduced <0.1% size regression in some of our iOS apps. With this patch we open the window for more optimizations including:
As a result, we believe this trade off is acceptable. Future plansThis PR is the first patch towards ObjCDirect ABI visibility change. In the future, we plan to do the following:
Patch is 60.59 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/126639.diff 24 Files Affected:
diff --git a/clang/include/clang/AST/DeclObjC.h b/clang/include/clang/AST/DeclObjC.h
index 4663603f79754..69c36b77da6c2 100644
--- a/clang/include/clang/AST/DeclObjC.h
+++ b/clang/include/clang/AST/DeclObjC.h
@@ -482,6 +482,12 @@ class ObjCMethodDecl : public NamedDecl, public DeclContext {
/// True if the method is tagged as objc_direct
bool isDirectMethod() const;
+ // Only direct instance method that have a fixed number of arguments can have
+ // nil check thunk functions.
+ bool canHaveNilCheckThunk() const {
+ return isDirectMethod() && isInstanceMethod() && !isVariadic();
+ }
+
/// True if the method has a parameter that's destroyed in the callee.
bool hasParamDestroyedInCallee() const;
diff --git a/clang/include/clang/AST/Mangle.h b/clang/include/clang/AST/Mangle.h
index 6134a70c04bd3..7d6f2181b8326 100644
--- a/clang/include/clang/AST/Mangle.h
+++ b/clang/include/clang/AST/Mangle.h
@@ -40,6 +40,13 @@ struct ThisAdjustment;
struct ThunkInfo;
class VarDecl;
+/// Extract mangling function name from MangleContext such that swift can call
+/// it to prepare for ObjCDirect in swift.
+void mangleObjCMethodName(raw_ostream &OS, bool includePrefixByte,
+ bool isInstanceMethod, StringRef ClassName,
+ std::optional<StringRef> CategoryName,
+ StringRef MethodName, bool isThunk);
+
/// MangleContext - Context for tracking state which persists across multiple
/// calls to the C++ name mangler.
class MangleContext {
@@ -153,7 +160,8 @@ class MangleContext {
void mangleObjCMethodName(const ObjCMethodDecl *MD, raw_ostream &OS,
bool includePrefixByte = true,
- bool includeCategoryNamespace = true);
+ bool includeCategoryNamespace = true,
+ bool isThunk = true);
void mangleObjCMethodNameAsSourceName(const ObjCMethodDecl *MD,
raw_ostream &);
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index a7f5f1abbb825..837b6e5ae4e9d 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -359,6 +359,7 @@ CODEGENOPT(EmitLLVMUseLists, 1, 0) ///< Control whether to serialize use-lists.
CODEGENOPT(WholeProgramVTables, 1, 0) ///< Whether to apply whole-program
/// vtable optimization.
+CODEGENOPT(ObjCEmitNilCheckThunk , 1, 0) ///< Whether objc_direct methods should emit a nil check thunk.
CODEGENOPT(VirtualFunctionElimination, 1, 0) ///< Whether to apply the dead
/// virtual function elimination
diff --git a/clang/include/clang/Basic/LangOptions.def b/clang/include/clang/Basic/LangOptions.def
index 383440ddbc0ea..d41693667621c 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -372,6 +372,7 @@ LANGOPT(IncludeDefaultHeader, 1, 0, "Include default header file for OpenCL")
LANGOPT(DeclareOpenCLBuiltins, 1, 0, "Declare OpenCL builtin functions")
BENIGN_LANGOPT(DelayedTemplateParsing , 1, 0, "delayed template parsing")
LANGOPT(BlocksRuntimeOptional , 1, 0, "optional blocks runtime")
+LANGOPT(ObjCEmitNilCheckThunk, 1, 0, "Emit a thunk to do nil check for objc direct methods")
LANGOPT(
CompleteMemberPointers, 1, 0,
"Require member pointer base types to be complete at the point where the "
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 5ad187926e710..eb06410f57172 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3001,6 +3001,8 @@ def fthin_link_bitcode_EQ : Joined<["-"], "fthin-link-bitcode=">,
Visibility<[ClangOption, CLOption, CC1Option]>, Group<f_Group>,
HelpText<"Write minimized bitcode to <file> for the ThinLTO thin link only">,
MarshallingInfoString<CodeGenOpts<"ThinLinkBitcodeFile">>;
+def fobjc_emit_nil_check_thunk:
+ Flag<["-"], "fobjc-emit-nil-check-thunk">, Visibility<[ClangOption, CC1Option]>, Group<f_Group>;
defm fat_lto_objects : BoolFOption<"fat-lto-objects",
CodeGenOpts<"FatLTO">, DefaultFalse,
PosFlag<SetTrue, [], [ClangOption, CC1Option], "Enable">,
diff --git a/clang/lib/AST/Mangle.cpp b/clang/lib/AST/Mangle.cpp
index 15be9c62bf888..fa37e80a76c88 100644
--- a/clang/lib/AST/Mangle.cpp
+++ b/clang/lib/AST/Mangle.cpp
@@ -29,6 +29,24 @@
using namespace clang;
+void clang::mangleObjCMethodName(raw_ostream &OS, bool includePrefixByte,
+ bool isInstanceMethod, StringRef ClassName,
+ std::optional<StringRef> CategoryName,
+ StringRef MethodName, bool isThunk) {
+ // \01+[ContainerName(CategoryName) SelectorName]
+ if (includePrefixByte)
+ OS << "\01";
+ OS << (isInstanceMethod ? '-' : '+');
+ OS << '[';
+ OS << ClassName;
+ if (CategoryName)
+ OS << "(" << *CategoryName << ")";
+ OS << " ";
+ OS << MethodName;
+ OS << ']';
+ if (!isThunk)
+ OS << "_inner";
+}
// FIXME: For blocks we currently mimic GCC's mangling scheme, which leaves
// much to be desired. Come up with a better mangling scheme.
@@ -327,7 +345,8 @@ void MangleContext::mangleBlock(const DeclContext *DC, const BlockDecl *BD,
void MangleContext::mangleObjCMethodName(const ObjCMethodDecl *MD,
raw_ostream &OS,
bool includePrefixByte,
- bool includeCategoryNamespace) {
+ bool includeCategoryNamespace,
+ bool isThunk) {
if (getASTContext().getLangOpts().ObjCRuntime.isGNUFamily()) {
// This is the mangling we've always used on the GNU runtimes, but it
// has obvious collisions in the face of underscores within class
@@ -361,24 +380,24 @@ void MangleContext::mangleObjCMethodName(const ObjCMethodDecl *MD,
}
// \01+[ContainerName(CategoryName) SelectorName]
- if (includePrefixByte) {
- OS << '\01';
- }
- OS << (MD->isInstanceMethod() ? '-' : '+') << '[';
+ auto CategoryName = std::optional<StringRef>();
+ StringRef ClassName;
if (const auto *CID = MD->getCategory()) {
- OS << CID->getClassInterface()->getName();
- if (includeCategoryNamespace) {
- OS << '(' << *CID << ')';
- }
+ if (includeCategoryNamespace)
+ CategoryName = CID->getName();
+ ClassName = CID->getClassInterface()->getName();
+
} else if (const auto *CD =
dyn_cast<ObjCContainerDecl>(MD->getDeclContext())) {
- OS << CD->getName();
+ ClassName = CD->getName();
} else {
llvm_unreachable("Unexpected ObjC method decl context");
}
- OS << ' ';
- MD->getSelector().print(OS);
- OS << ']';
+ std::string MethodName;
+ llvm::raw_string_ostream MethodNameOS(MethodName);
+ MD->getSelector().print(MethodNameOS);
+ clang::mangleObjCMethodName(OS, includePrefixByte, MD->isInstanceMethod(),
+ ClassName, CategoryName, MethodName, isThunk);
}
void MangleContext::mangleObjCMethodNameAsSourceName(const ObjCMethodDecl *MD,
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 47bfd470dbafb..c522b794fbfe3 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -2702,6 +2702,17 @@ void CodeGenModule::ConstructAttributeList(StringRef Name,
ArgAttrs[IRArgs.first] = llvm::AttributeSet::get(getLLVMContext(), Attrs);
}
+ // Direct method prologue should not contain nil check anymore.
+ // As a result, we can set `self` to be NonNull to prepare for further
+ // optimizations.
+ if (TargetDecl) {
+ auto OMD = dyn_cast<ObjCMethodDecl>(TargetDecl);
+ if (shouldHaveNilCheckThunk(OMD) && !IsThunk) {
+ auto IRArgs = IRFunctionArgs.getIRArgs(0);
+ ArgAttrs[IRArgs.first] = ArgAttrs[IRArgs.first].addAttribute(
+ getLLVMContext(), llvm::Attribute::NonNull);
+ }
+ }
unsigned ArgNo = 0;
for (CGFunctionInfo::const_arg_iterator I = FI.arg_begin(),
@@ -3386,7 +3397,13 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo &FI,
}
}
- if (getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee()) {
+ if (InnerFn) {
+ // Don't emit any arg except for `self` if we are in a thunk function.
+ // We still need self for nil check, other arguments aren't used in this
+ // function and thus are not needed. Avoid emitting them also prevents
+ // accidental release/retain.
+ EmitParmDecl(*Args[0], ArgVals[0], 1);
+ } else if (getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee()) {
for (int I = Args.size() - 1; I >= 0; --I)
EmitParmDecl(*Args[I], ArgVals[I], I + 1);
} else {
diff --git a/clang/lib/CodeGen/CGObjC.cpp b/clang/lib/CodeGen/CGObjC.cpp
index 27c7c2fa9cba1..1fd4c1d2c8a0f 100644
--- a/clang/lib/CodeGen/CGObjC.cpp
+++ b/clang/lib/CodeGen/CGObjC.cpp
@@ -756,12 +756,13 @@ void CodeGenFunction::StartObjCMethod(const ObjCMethodDecl *OMD,
if (OMD->hasAttr<NoDebugAttr>())
DebugInfo = nullptr; // disable debug info indefinitely for this function
- llvm::Function *Fn = CGM.getObjCRuntime().GenerateMethod(OMD, CD);
+ bool isInner = CGM.shouldHaveNilCheckThunk(OMD) && !InnerFn;
+ llvm::Function *Fn = CGM.getObjCRuntime().GenerateMethod(OMD, CD, !isInner);
const CGFunctionInfo &FI = CGM.getTypes().arrangeObjCMethodDeclaration(OMD);
if (OMD->isDirectMethod()) {
Fn->setVisibility(llvm::Function::HiddenVisibility);
- CGM.SetLLVMFunctionAttributes(OMD, FI, Fn, /*IsThunk=*/false);
+ CGM.SetLLVMFunctionAttributes(OMD, FI, Fn, /*IsThunk=*/InnerFn);
CGM.SetLLVMFunctionAttributesForDefinition(OMD, Fn);
} else {
CGM.SetInternalFunctionAttributes(OMD, Fn, FI);
@@ -780,10 +781,14 @@ void CodeGenFunction::StartObjCMethod(const ObjCMethodDecl *OMD,
OMD->getLocation(), StartLoc);
if (OMD->isDirectMethod()) {
- // This function is a direct call, it has to implement a nil check
- // on entry.
- //
- // TODO: possibly have several entry points to elide the check
+ // Having `InnerFn` indicates that we are generating a nil check thunk.
+ // In that case our job is done here.
+ if (InnerFn)
+ return;
+ // Only NeXTFamily can have nil check thunk.
+ if (CGM.shouldHaveNilCheckThunk(OMD))
+ // Go generate a nil check thunk around `Fn`
+ CodeGenFunction(CGM, /*InnerFn=*/Fn).GenerateObjCDirectThunk(OMD, CD);
CGM.getObjCRuntime().GenerateDirectMethodPrologue(*this, Fn, OMD, CD);
}
@@ -1637,6 +1642,44 @@ void CodeGenFunction::GenerateObjCSetter(ObjCImplementationDecl *IMP,
FinishFunction(OMD->getEndLoc());
}
+void CodeGenFunction::GenerateObjCDirectThunk(const ObjCMethodDecl *OMD,
+ const ObjCContainerDecl *CD) {
+ assert(InnerFn && CGM.shouldHaveNilCheckThunk(OMD) &&
+ "Should only generate wrapper when the flag is set.");
+ StartObjCMethod(OMD, CD);
+
+ // Manually pop all the clean up that doesn't need to happen in the outer
+ // function. InnerFn will do this for us.
+ while (EHStack.stable_begin() != PrologueCleanupDepth)
+ EHStack.popCleanup();
+
+ // Generate a nil check.
+ CGM.getObjCRuntime().GenerateDirectMethodPrologue(*this, CurFn, OMD, CD);
+ // Call the InnerFn and pass the return value
+ SmallVector<llvm::Value *> Args(CurFn->arg_size());
+ std::transform(CurFn->arg_begin(), CurFn->arg_end(), Args.begin(),
+ [](llvm::Argument &arg) { return &arg; });
+
+ // This will be optimized into a tail call.
+ auto *CallInst = EmitCallOrInvoke(InnerFn, Args);
+ // Preserve the inner function's attributes to the call instruction.
+ CallInst->setAttributes(InnerFn->getAttributes());
+ llvm::Value *RetVal = CallInst;
+
+ // If `AutoreleaseResult` is set, the return value is not void.
+ if (AutoreleaseResult)
+ RetVal = EmitARCRetainAutoreleasedReturnValue(RetVal);
+
+ // This excessive store is totally unnecessary.
+ // But `FinishFunction` really wants us to store the result so it can
+ // clean up the function properly.
+ // The unnecessary store-load of the ret value will be optimized out anyway.
+ if (!CurFn->getReturnType()->isVoidTy())
+ Builder.CreateStore(RetVal, ReturnValue);
+
+ // Nil check's end location is the function's start location.
+ FinishFunction(OMD->getBeginLoc());
+}
namespace {
struct DestroyIvar final : EHScopeStack::Cleanup {
private:
diff --git a/clang/lib/CodeGen/CGObjCGNU.cpp b/clang/lib/CodeGen/CGObjCGNU.cpp
index d1876f47c0eea..802168befd628 100644
--- a/clang/lib/CodeGen/CGObjCGNU.cpp
+++ b/clang/lib/CodeGen/CGObjCGNU.cpp
@@ -595,7 +595,13 @@ class CGObjCGNU : public CGObjCRuntime {
llvm::Constant *GetEHType(QualType T) override;
llvm::Function *GenerateMethod(const ObjCMethodDecl *OMD,
- const ObjCContainerDecl *CD) override;
+ const ObjCContainerDecl *CD,
+ bool isThunk) override {
+ // isThunk is irrelevent for GNU.
+ return GenerateMethod(OMD, CD);
+ };
+ llvm::Function *GenerateMethod(const ObjCMethodDecl *OMD,
+ const ObjCContainerDecl *CD);
// Map to unify direct method definitions.
llvm::DenseMap<const ObjCMethodDecl *, llvm::Function *>
diff --git a/clang/lib/CodeGen/CGObjCMac.cpp b/clang/lib/CodeGen/CGObjCMac.cpp
index 01552b6e53d00..4f0d7567b4b97 100644
--- a/clang/lib/CodeGen/CGObjCMac.cpp
+++ b/clang/lib/CodeGen/CGObjCMac.cpp
@@ -1068,12 +1068,13 @@ class CGObjCCommonMac : public CodeGen::CGObjCRuntime {
ConstantAddress GenerateConstantString(const StringLiteral *SL) override;
ConstantAddress GenerateConstantNSString(const StringLiteral *SL);
- llvm::Function *
- GenerateMethod(const ObjCMethodDecl *OMD,
- const ObjCContainerDecl *CD = nullptr) override;
+ llvm::Function *GenerateMethod(const ObjCMethodDecl *OMD,
+ const ObjCContainerDecl *CD = nullptr,
+ bool isThunk = true) override;
llvm::Function *GenerateDirectMethod(const ObjCMethodDecl *OMD,
- const ObjCContainerDecl *CD);
+ const ObjCContainerDecl *CD,
+ bool isThunk);
void GenerateDirectMethodPrologue(CodeGenFunction &CGF, llvm::Function *Fn,
const ObjCMethodDecl *OMD,
@@ -2094,7 +2095,8 @@ CodeGen::RValue CGObjCCommonMac::EmitMessageSend(
llvm::FunctionCallee Fn = nullptr;
if (Method && Method->isDirectMethod()) {
assert(!IsSuper);
- Fn = GenerateDirectMethod(Method, Method->getClassInterface());
+ Fn = GenerateDirectMethod(Method, Method->getClassInterface(),
+ /*isThunk=*/true);
// Direct methods will synthesize the proper `_cmd` internally,
// so just don't bother with setting the `_cmd` argument.
RequiresSelValue = false;
@@ -2118,10 +2120,6 @@ CodeGen::RValue CGObjCCommonMac::EmitMessageSend(
: ObjCTypes.getSendFn(IsSuper);
}
- // Cast function to proper signature
- llvm::Constant *BitcastFn = cast<llvm::Constant>(
- CGF.Builder.CreateBitCast(Fn.getCallee(), MSI.MessengerType));
-
// We don't need to emit a null check to zero out an indirect result if the
// result is ignored.
if (Return.isUnused())
@@ -2131,6 +2129,16 @@ CodeGen::RValue CGObjCCommonMac::EmitMessageSend(
if (!RequiresNullCheck && Method && Method->hasParamDestroyedInCallee())
RequiresNullCheck = true;
+ // `RequiresNullCheck` will do the null check at the caller site.
+ // In that case, a direct instance method can skip the null check and call the
+ // inner function instead.
+ if (CGM.shouldHaveNilCheckThunk(Method))
+ if (RequiresNullCheck || !ReceiverCanBeNull)
+ Fn = GenerateDirectMethod(Method, Method->getClassInterface(),
+ /*isThunk=*/false);
+ // Cast function to proper signature
+ llvm::Constant *BitcastFn = cast<llvm::Constant>(
+ CGF.Builder.CreateBitCast(Fn.getCallee(), MSI.MessengerType));
NullReturnState nullReturn;
if (RequiresNullCheck) {
nullReturn.init(CGF, Arg0);
@@ -3857,11 +3865,12 @@ CGObjCMac::emitMethodList(Twine name, MethodListType MLT,
}
llvm::Function *CGObjCCommonMac::GenerateMethod(const ObjCMethodDecl *OMD,
- const ObjCContainerDecl *CD) {
+ const ObjCContainerDecl *CD,
+ bool isThunk) {
llvm::Function *Method;
if (OMD->isDirectMethod()) {
- Method = GenerateDirectMethod(OMD, CD);
+ Method = GenerateDirectMethod(OMD, CD, isThunk);
} else {
auto Name = getSymbolNameForMethod(OMD);
@@ -3877,14 +3886,13 @@ llvm::Function *CGObjCCommonMac::GenerateMethod(const ObjCMethodDecl *OMD,
return Method;
}
-llvm::Function *
-CGObjCCommonMac::GenerateDirectMethod(const ObjCMethodDecl *OMD,
- const ObjCContainerDecl *CD) {
+llvm::Function *CGObjCCommonMac::GenerateDirectMethod(
+ const ObjCMethodDecl *OMD, const ObjCContainerDecl *CD, bool isThunk) {
auto *COMD = OMD->getCanonicalDecl();
auto I = DirectMethodDefinitions.find(COMD);
llvm::Function *OldFn = nullptr, *Fn = nullptr;
- if (I != DirectMethodDefinitions.end()) {
+ if (isThunk && I != DirectMethodDefinitions.end()) {
// Objective-C allows for the declaration and implementation types
// to differ slightly.
//
@@ -3913,11 +3921,16 @@ CGObjCCommonMac::GenerateDirectMethod(const ObjCMethodDecl *OMD,
// Replace the cached function in the map.
I->second = Fn;
} else {
- auto Name = getSymbolNameForMethod(OMD, /*include category*/ false);
-
- Fn = llvm::Function::Create(MethodTy, llvm::GlobalValue::ExternalLinkage,
- Name, &CGM.getModule());
- DirectMethodDefinitions.insert(std::make_pair(COMD, Fn));
+ auto Name =
+ getSymbolNameForMethod(OMD, /*include category*/ false, isThunk);
+ // Non-thunk functions are not cached and may be repeatedly created.
+ // Therefore, we try to find it before we create one.
+ Fn = CGM.getModule().getFunction(Name);
+ if (!Fn)
+ Fn = llvm::Function::Create(MethodTy, llvm::GlobalValue::ExternalLinkage,
+ Name, &CGM.getModule());
+ if (isThunk)
+ DirectMethodDefinitions.insert(std::make_pair(COMD, Fn));
}
return Fn;
@@ -3930,6 +3943,8 @@ void CGObjCCommonMac::GenerateDirectMethodPrologue(
bool ReceiverCanBeNull = true;
auto selfAddr = CGF.GetAddrOfLocalVar(OMD->getSelfDecl());
auto selfValue = Builder.CreateLoad(selfAddr);
+ bool shouldHaveNilCheckThunk = CGF.CGM.shouldHaveNilCheckThunk(OMD);
+ bool isNilCheckThunk = shouldHaveNilCheckThunk && CGF.InnerFn;
// Generate:
//
@@ -3971,7 +3986,9 @@ void CGObjCCommonMac::GenerateDirectMethodPrologue(
ReceiverCanBeNull = isWeakLinkedClass(OID);
}
- if (ReceiverCanBeNull) {
+ // Only emit nil check if this is a nil check thunk or the method
+ // decides that its receiver can be null
+ if (isNilCheckThunk || (!shouldHaveNilCheckThunk && ReceiverCanBeNull)) {
llvm::BasicBlock *SelfIsNilBlock =
CGF.createBasicBlock("objc_direct_method.self_is_nil");
llvm::BasicBlock *ContBlock =
@@ -4001,14 +4018,19 @@ void CGObjCCommonMac::GenerateDirectMethodPrologue(
Builder.SetInsertPoint(ContBlock);
}
- // only synthesize _cmd if it's referenced
- if (OMD->getCmdDecl()->isUsed()) {
+ // Only synthesize _cmd if it's referenced
+ // However, a nil check thunk doesn't need _cmd even if it's referenced
+ if (!isNilCheckThunk && OMD->getCmdDecl()->isUsed()) {
// `_cmd` is not a parameter to direct methods, so storage must be
// explicitly declared for it.
CGF.EmitVarDecl(*OMD->getCmdDecl());
Builder.CreateStore(GetSelector(CGF, OMD),
CGF.GetAddrOfLocalVar(OMD->getCmdDecl()));
}
+
+ // It's possible that selfValue is never used.
+ if (se...
[truncated]
|
@llvm/pr-subscribers-clang Author: Peter Rong (DataCorrupted) ChangesTL;DR“In Objective-C, it is valid to send a message to nil—it simply has no effect at runtime.”[1] Such requirements can be easily implemented in the runtime. For However, not all direct method calls need to do this check if we can be sure at static time that Detailed designOverall
Class methods are excludedThe receiver for the class method (self) is the class object. That said, there is only one (global) instance of this receiver and it is lazily initialized. That means most of the time nil check doesn’t happen, and the code is not generated. Variadic argument functions are excludedWe need to deal with different calling conventions on different platforms to make variadic argument functions work. Since there is only a handful amount of these functions, we feel reasonable to exclude them. ResultsIn our testing pre-upstreaming, we've discovered that the added symbols introduced <0.1% size regression in some of our iOS apps. With this patch we open the window for more optimizations including:
As a result, we believe this trade off is acceptable. Future plansThis PR is the first patch towards ObjCDirect ABI visibility change. In the future, we plan to do the following:
Patch is 60.59 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/126639.diff 24 Files Affected:
diff --git a/clang/include/clang/AST/DeclObjC.h b/clang/include/clang/AST/DeclObjC.h
index 4663603f79754..69c36b77da6c2 100644
--- a/clang/include/clang/AST/DeclObjC.h
+++ b/clang/include/clang/AST/DeclObjC.h
@@ -482,6 +482,12 @@ class ObjCMethodDecl : public NamedDecl, public DeclContext {
/// True if the method is tagged as objc_direct
bool isDirectMethod() const;
+ // Only direct instance method that have a fixed number of arguments can have
+ // nil check thunk functions.
+ bool canHaveNilCheckThunk() const {
+ return isDirectMethod() && isInstanceMethod() && !isVariadic();
+ }
+
/// True if the method has a parameter that's destroyed in the callee.
bool hasParamDestroyedInCallee() const;
diff --git a/clang/include/clang/AST/Mangle.h b/clang/include/clang/AST/Mangle.h
index 6134a70c04bd3..7d6f2181b8326 100644
--- a/clang/include/clang/AST/Mangle.h
+++ b/clang/include/clang/AST/Mangle.h
@@ -40,6 +40,13 @@ struct ThisAdjustment;
struct ThunkInfo;
class VarDecl;
+/// Extract mangling function name from MangleContext such that swift can call
+/// it to prepare for ObjCDirect in swift.
+void mangleObjCMethodName(raw_ostream &OS, bool includePrefixByte,
+ bool isInstanceMethod, StringRef ClassName,
+ std::optional<StringRef> CategoryName,
+ StringRef MethodName, bool isThunk);
+
/// MangleContext - Context for tracking state which persists across multiple
/// calls to the C++ name mangler.
class MangleContext {
@@ -153,7 +160,8 @@ class MangleContext {
void mangleObjCMethodName(const ObjCMethodDecl *MD, raw_ostream &OS,
bool includePrefixByte = true,
- bool includeCategoryNamespace = true);
+ bool includeCategoryNamespace = true,
+ bool isThunk = true);
void mangleObjCMethodNameAsSourceName(const ObjCMethodDecl *MD,
raw_ostream &);
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index a7f5f1abbb825..837b6e5ae4e9d 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -359,6 +359,7 @@ CODEGENOPT(EmitLLVMUseLists, 1, 0) ///< Control whether to serialize use-lists.
CODEGENOPT(WholeProgramVTables, 1, 0) ///< Whether to apply whole-program
/// vtable optimization.
+CODEGENOPT(ObjCEmitNilCheckThunk , 1, 0) ///< Whether objc_direct methods should emit a nil check thunk.
CODEGENOPT(VirtualFunctionElimination, 1, 0) ///< Whether to apply the dead
/// virtual function elimination
diff --git a/clang/include/clang/Basic/LangOptions.def b/clang/include/clang/Basic/LangOptions.def
index 383440ddbc0ea..d41693667621c 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -372,6 +372,7 @@ LANGOPT(IncludeDefaultHeader, 1, 0, "Include default header file for OpenCL")
LANGOPT(DeclareOpenCLBuiltins, 1, 0, "Declare OpenCL builtin functions")
BENIGN_LANGOPT(DelayedTemplateParsing , 1, 0, "delayed template parsing")
LANGOPT(BlocksRuntimeOptional , 1, 0, "optional blocks runtime")
+LANGOPT(ObjCEmitNilCheckThunk, 1, 0, "Emit a thunk to do nil check for objc direct methods")
LANGOPT(
CompleteMemberPointers, 1, 0,
"Require member pointer base types to be complete at the point where the "
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 5ad187926e710..eb06410f57172 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3001,6 +3001,8 @@ def fthin_link_bitcode_EQ : Joined<["-"], "fthin-link-bitcode=">,
Visibility<[ClangOption, CLOption, CC1Option]>, Group<f_Group>,
HelpText<"Write minimized bitcode to <file> for the ThinLTO thin link only">,
MarshallingInfoString<CodeGenOpts<"ThinLinkBitcodeFile">>;
+def fobjc_emit_nil_check_thunk:
+ Flag<["-"], "fobjc-emit-nil-check-thunk">, Visibility<[ClangOption, CC1Option]>, Group<f_Group>;
defm fat_lto_objects : BoolFOption<"fat-lto-objects",
CodeGenOpts<"FatLTO">, DefaultFalse,
PosFlag<SetTrue, [], [ClangOption, CC1Option], "Enable">,
diff --git a/clang/lib/AST/Mangle.cpp b/clang/lib/AST/Mangle.cpp
index 15be9c62bf888..fa37e80a76c88 100644
--- a/clang/lib/AST/Mangle.cpp
+++ b/clang/lib/AST/Mangle.cpp
@@ -29,6 +29,24 @@
using namespace clang;
+void clang::mangleObjCMethodName(raw_ostream &OS, bool includePrefixByte,
+ bool isInstanceMethod, StringRef ClassName,
+ std::optional<StringRef> CategoryName,
+ StringRef MethodName, bool isThunk) {
+ // \01+[ContainerName(CategoryName) SelectorName]
+ if (includePrefixByte)
+ OS << "\01";
+ OS << (isInstanceMethod ? '-' : '+');
+ OS << '[';
+ OS << ClassName;
+ if (CategoryName)
+ OS << "(" << *CategoryName << ")";
+ OS << " ";
+ OS << MethodName;
+ OS << ']';
+ if (!isThunk)
+ OS << "_inner";
+}
// FIXME: For blocks we currently mimic GCC's mangling scheme, which leaves
// much to be desired. Come up with a better mangling scheme.
@@ -327,7 +345,8 @@ void MangleContext::mangleBlock(const DeclContext *DC, const BlockDecl *BD,
void MangleContext::mangleObjCMethodName(const ObjCMethodDecl *MD,
raw_ostream &OS,
bool includePrefixByte,
- bool includeCategoryNamespace) {
+ bool includeCategoryNamespace,
+ bool isThunk) {
if (getASTContext().getLangOpts().ObjCRuntime.isGNUFamily()) {
// This is the mangling we've always used on the GNU runtimes, but it
// has obvious collisions in the face of underscores within class
@@ -361,24 +380,24 @@ void MangleContext::mangleObjCMethodName(const ObjCMethodDecl *MD,
}
// \01+[ContainerName(CategoryName) SelectorName]
- if (includePrefixByte) {
- OS << '\01';
- }
- OS << (MD->isInstanceMethod() ? '-' : '+') << '[';
+ auto CategoryName = std::optional<StringRef>();
+ StringRef ClassName;
if (const auto *CID = MD->getCategory()) {
- OS << CID->getClassInterface()->getName();
- if (includeCategoryNamespace) {
- OS << '(' << *CID << ')';
- }
+ if (includeCategoryNamespace)
+ CategoryName = CID->getName();
+ ClassName = CID->getClassInterface()->getName();
+
} else if (const auto *CD =
dyn_cast<ObjCContainerDecl>(MD->getDeclContext())) {
- OS << CD->getName();
+ ClassName = CD->getName();
} else {
llvm_unreachable("Unexpected ObjC method decl context");
}
- OS << ' ';
- MD->getSelector().print(OS);
- OS << ']';
+ std::string MethodName;
+ llvm::raw_string_ostream MethodNameOS(MethodName);
+ MD->getSelector().print(MethodNameOS);
+ clang::mangleObjCMethodName(OS, includePrefixByte, MD->isInstanceMethod(),
+ ClassName, CategoryName, MethodName, isThunk);
}
void MangleContext::mangleObjCMethodNameAsSourceName(const ObjCMethodDecl *MD,
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 47bfd470dbafb..c522b794fbfe3 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -2702,6 +2702,17 @@ void CodeGenModule::ConstructAttributeList(StringRef Name,
ArgAttrs[IRArgs.first] = llvm::AttributeSet::get(getLLVMContext(), Attrs);
}
+ // Direct method prologue should not contain nil check anymore.
+ // As a result, we can set `self` to be NonNull to prepare for further
+ // optimizations.
+ if (TargetDecl) {
+ auto OMD = dyn_cast<ObjCMethodDecl>(TargetDecl);
+ if (shouldHaveNilCheckThunk(OMD) && !IsThunk) {
+ auto IRArgs = IRFunctionArgs.getIRArgs(0);
+ ArgAttrs[IRArgs.first] = ArgAttrs[IRArgs.first].addAttribute(
+ getLLVMContext(), llvm::Attribute::NonNull);
+ }
+ }
unsigned ArgNo = 0;
for (CGFunctionInfo::const_arg_iterator I = FI.arg_begin(),
@@ -3386,7 +3397,13 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo &FI,
}
}
- if (getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee()) {
+ if (InnerFn) {
+ // Don't emit any arg except for `self` if we are in a thunk function.
+ // We still need self for nil check, other arguments aren't used in this
+ // function and thus are not needed. Avoid emitting them also prevents
+ // accidental release/retain.
+ EmitParmDecl(*Args[0], ArgVals[0], 1);
+ } else if (getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee()) {
for (int I = Args.size() - 1; I >= 0; --I)
EmitParmDecl(*Args[I], ArgVals[I], I + 1);
} else {
diff --git a/clang/lib/CodeGen/CGObjC.cpp b/clang/lib/CodeGen/CGObjC.cpp
index 27c7c2fa9cba1..1fd4c1d2c8a0f 100644
--- a/clang/lib/CodeGen/CGObjC.cpp
+++ b/clang/lib/CodeGen/CGObjC.cpp
@@ -756,12 +756,13 @@ void CodeGenFunction::StartObjCMethod(const ObjCMethodDecl *OMD,
if (OMD->hasAttr<NoDebugAttr>())
DebugInfo = nullptr; // disable debug info indefinitely for this function
- llvm::Function *Fn = CGM.getObjCRuntime().GenerateMethod(OMD, CD);
+ bool isInner = CGM.shouldHaveNilCheckThunk(OMD) && !InnerFn;
+ llvm::Function *Fn = CGM.getObjCRuntime().GenerateMethod(OMD, CD, !isInner);
const CGFunctionInfo &FI = CGM.getTypes().arrangeObjCMethodDeclaration(OMD);
if (OMD->isDirectMethod()) {
Fn->setVisibility(llvm::Function::HiddenVisibility);
- CGM.SetLLVMFunctionAttributes(OMD, FI, Fn, /*IsThunk=*/false);
+ CGM.SetLLVMFunctionAttributes(OMD, FI, Fn, /*IsThunk=*/InnerFn);
CGM.SetLLVMFunctionAttributesForDefinition(OMD, Fn);
} else {
CGM.SetInternalFunctionAttributes(OMD, Fn, FI);
@@ -780,10 +781,14 @@ void CodeGenFunction::StartObjCMethod(const ObjCMethodDecl *OMD,
OMD->getLocation(), StartLoc);
if (OMD->isDirectMethod()) {
- // This function is a direct call, it has to implement a nil check
- // on entry.
- //
- // TODO: possibly have several entry points to elide the check
+ // Having `InnerFn` indicates that we are generating a nil check thunk.
+ // In that case our job is done here.
+ if (InnerFn)
+ return;
+ // Only NeXTFamily can have nil check thunk.
+ if (CGM.shouldHaveNilCheckThunk(OMD))
+ // Go generate a nil check thunk around `Fn`
+ CodeGenFunction(CGM, /*InnerFn=*/Fn).GenerateObjCDirectThunk(OMD, CD);
CGM.getObjCRuntime().GenerateDirectMethodPrologue(*this, Fn, OMD, CD);
}
@@ -1637,6 +1642,44 @@ void CodeGenFunction::GenerateObjCSetter(ObjCImplementationDecl *IMP,
FinishFunction(OMD->getEndLoc());
}
+void CodeGenFunction::GenerateObjCDirectThunk(const ObjCMethodDecl *OMD,
+ const ObjCContainerDecl *CD) {
+ assert(InnerFn && CGM.shouldHaveNilCheckThunk(OMD) &&
+ "Should only generate wrapper when the flag is set.");
+ StartObjCMethod(OMD, CD);
+
+ // Manually pop all the clean up that doesn't need to happen in the outer
+ // function. InnerFn will do this for us.
+ while (EHStack.stable_begin() != PrologueCleanupDepth)
+ EHStack.popCleanup();
+
+ // Generate a nil check.
+ CGM.getObjCRuntime().GenerateDirectMethodPrologue(*this, CurFn, OMD, CD);
+ // Call the InnerFn and pass the return value
+ SmallVector<llvm::Value *> Args(CurFn->arg_size());
+ std::transform(CurFn->arg_begin(), CurFn->arg_end(), Args.begin(),
+ [](llvm::Argument &arg) { return &arg; });
+
+ // This will be optimized into a tail call.
+ auto *CallInst = EmitCallOrInvoke(InnerFn, Args);
+ // Preserve the inner function's attributes to the call instruction.
+ CallInst->setAttributes(InnerFn->getAttributes());
+ llvm::Value *RetVal = CallInst;
+
+ // If `AutoreleaseResult` is set, the return value is not void.
+ if (AutoreleaseResult)
+ RetVal = EmitARCRetainAutoreleasedReturnValue(RetVal);
+
+ // This excessive store is totally unnecessary.
+ // But `FinishFunction` really wants us to store the result so it can
+ // clean up the function properly.
+ // The unnecessary store-load of the ret value will be optimized out anyway.
+ if (!CurFn->getReturnType()->isVoidTy())
+ Builder.CreateStore(RetVal, ReturnValue);
+
+ // Nil check's end location is the function's start location.
+ FinishFunction(OMD->getBeginLoc());
+}
namespace {
struct DestroyIvar final : EHScopeStack::Cleanup {
private:
diff --git a/clang/lib/CodeGen/CGObjCGNU.cpp b/clang/lib/CodeGen/CGObjCGNU.cpp
index d1876f47c0eea..802168befd628 100644
--- a/clang/lib/CodeGen/CGObjCGNU.cpp
+++ b/clang/lib/CodeGen/CGObjCGNU.cpp
@@ -595,7 +595,13 @@ class CGObjCGNU : public CGObjCRuntime {
llvm::Constant *GetEHType(QualType T) override;
llvm::Function *GenerateMethod(const ObjCMethodDecl *OMD,
- const ObjCContainerDecl *CD) override;
+ const ObjCContainerDecl *CD,
+ bool isThunk) override {
+ // isThunk is irrelevent for GNU.
+ return GenerateMethod(OMD, CD);
+ };
+ llvm::Function *GenerateMethod(const ObjCMethodDecl *OMD,
+ const ObjCContainerDecl *CD);
// Map to unify direct method definitions.
llvm::DenseMap<const ObjCMethodDecl *, llvm::Function *>
diff --git a/clang/lib/CodeGen/CGObjCMac.cpp b/clang/lib/CodeGen/CGObjCMac.cpp
index 01552b6e53d00..4f0d7567b4b97 100644
--- a/clang/lib/CodeGen/CGObjCMac.cpp
+++ b/clang/lib/CodeGen/CGObjCMac.cpp
@@ -1068,12 +1068,13 @@ class CGObjCCommonMac : public CodeGen::CGObjCRuntime {
ConstantAddress GenerateConstantString(const StringLiteral *SL) override;
ConstantAddress GenerateConstantNSString(const StringLiteral *SL);
- llvm::Function *
- GenerateMethod(const ObjCMethodDecl *OMD,
- const ObjCContainerDecl *CD = nullptr) override;
+ llvm::Function *GenerateMethod(const ObjCMethodDecl *OMD,
+ const ObjCContainerDecl *CD = nullptr,
+ bool isThunk = true) override;
llvm::Function *GenerateDirectMethod(const ObjCMethodDecl *OMD,
- const ObjCContainerDecl *CD);
+ const ObjCContainerDecl *CD,
+ bool isThunk);
void GenerateDirectMethodPrologue(CodeGenFunction &CGF, llvm::Function *Fn,
const ObjCMethodDecl *OMD,
@@ -2094,7 +2095,8 @@ CodeGen::RValue CGObjCCommonMac::EmitMessageSend(
llvm::FunctionCallee Fn = nullptr;
if (Method && Method->isDirectMethod()) {
assert(!IsSuper);
- Fn = GenerateDirectMethod(Method, Method->getClassInterface());
+ Fn = GenerateDirectMethod(Method, Method->getClassInterface(),
+ /*isThunk=*/true);
// Direct methods will synthesize the proper `_cmd` internally,
// so just don't bother with setting the `_cmd` argument.
RequiresSelValue = false;
@@ -2118,10 +2120,6 @@ CodeGen::RValue CGObjCCommonMac::EmitMessageSend(
: ObjCTypes.getSendFn(IsSuper);
}
- // Cast function to proper signature
- llvm::Constant *BitcastFn = cast<llvm::Constant>(
- CGF.Builder.CreateBitCast(Fn.getCallee(), MSI.MessengerType));
-
// We don't need to emit a null check to zero out an indirect result if the
// result is ignored.
if (Return.isUnused())
@@ -2131,6 +2129,16 @@ CodeGen::RValue CGObjCCommonMac::EmitMessageSend(
if (!RequiresNullCheck && Method && Method->hasParamDestroyedInCallee())
RequiresNullCheck = true;
+ // `RequiresNullCheck` will do the null check at the caller site.
+ // In that case, a direct instance method can skip the null check and call the
+ // inner function instead.
+ if (CGM.shouldHaveNilCheckThunk(Method))
+ if (RequiresNullCheck || !ReceiverCanBeNull)
+ Fn = GenerateDirectMethod(Method, Method->getClassInterface(),
+ /*isThunk=*/false);
+ // Cast function to proper signature
+ llvm::Constant *BitcastFn = cast<llvm::Constant>(
+ CGF.Builder.CreateBitCast(Fn.getCallee(), MSI.MessengerType));
NullReturnState nullReturn;
if (RequiresNullCheck) {
nullReturn.init(CGF, Arg0);
@@ -3857,11 +3865,12 @@ CGObjCMac::emitMethodList(Twine name, MethodListType MLT,
}
llvm::Function *CGObjCCommonMac::GenerateMethod(const ObjCMethodDecl *OMD,
- const ObjCContainerDecl *CD) {
+ const ObjCContainerDecl *CD,
+ bool isThunk) {
llvm::Function *Method;
if (OMD->isDirectMethod()) {
- Method = GenerateDirectMethod(OMD, CD);
+ Method = GenerateDirectMethod(OMD, CD, isThunk);
} else {
auto Name = getSymbolNameForMethod(OMD);
@@ -3877,14 +3886,13 @@ llvm::Function *CGObjCCommonMac::GenerateMethod(const ObjCMethodDecl *OMD,
return Method;
}
-llvm::Function *
-CGObjCCommonMac::GenerateDirectMethod(const ObjCMethodDecl *OMD,
- const ObjCContainerDecl *CD) {
+llvm::Function *CGObjCCommonMac::GenerateDirectMethod(
+ const ObjCMethodDecl *OMD, const ObjCContainerDecl *CD, bool isThunk) {
auto *COMD = OMD->getCanonicalDecl();
auto I = DirectMethodDefinitions.find(COMD);
llvm::Function *OldFn = nullptr, *Fn = nullptr;
- if (I != DirectMethodDefinitions.end()) {
+ if (isThunk && I != DirectMethodDefinitions.end()) {
// Objective-C allows for the declaration and implementation types
// to differ slightly.
//
@@ -3913,11 +3921,16 @@ CGObjCCommonMac::GenerateDirectMethod(const ObjCMethodDecl *OMD,
// Replace the cached function in the map.
I->second = Fn;
} else {
- auto Name = getSymbolNameForMethod(OMD, /*include category*/ false);
-
- Fn = llvm::Function::Create(MethodTy, llvm::GlobalValue::ExternalLinkage,
- Name, &CGM.getModule());
- DirectMethodDefinitions.insert(std::make_pair(COMD, Fn));
+ auto Name =
+ getSymbolNameForMethod(OMD, /*include category*/ false, isThunk);
+ // Non-thunk functions are not cached and may be repeatedly created.
+ // Therefore, we try to find it before we create one.
+ Fn = CGM.getModule().getFunction(Name);
+ if (!Fn)
+ Fn = llvm::Function::Create(MethodTy, llvm::GlobalValue::ExternalLinkage,
+ Name, &CGM.getModule());
+ if (isThunk)
+ DirectMethodDefinitions.insert(std::make_pair(COMD, Fn));
}
return Fn;
@@ -3930,6 +3943,8 @@ void CGObjCCommonMac::GenerateDirectMethodPrologue(
bool ReceiverCanBeNull = true;
auto selfAddr = CGF.GetAddrOfLocalVar(OMD->getSelfDecl());
auto selfValue = Builder.CreateLoad(selfAddr);
+ bool shouldHaveNilCheckThunk = CGF.CGM.shouldHaveNilCheckThunk(OMD);
+ bool isNilCheckThunk = shouldHaveNilCheckThunk && CGF.InnerFn;
// Generate:
//
@@ -3971,7 +3986,9 @@ void CGObjCCommonMac::GenerateDirectMethodPrologue(
ReceiverCanBeNull = isWeakLinkedClass(OID);
}
- if (ReceiverCanBeNull) {
+ // Only emit nil check if this is a nil check thunk or the method
+ // decides that its receiver can be null
+ if (isNilCheckThunk || (!shouldHaveNilCheckThunk && ReceiverCanBeNull)) {
llvm::BasicBlock *SelfIsNilBlock =
CGF.createBasicBlock("objc_direct_method.self_is_nil");
llvm::BasicBlock *ContBlock =
@@ -4001,14 +4018,19 @@ void CGObjCCommonMac::GenerateDirectMethodPrologue(
Builder.SetInsertPoint(ContBlock);
}
- // only synthesize _cmd if it's referenced
- if (OMD->getCmdDecl()->isUsed()) {
+ // Only synthesize _cmd if it's referenced
+ // However, a nil check thunk doesn't need _cmd even if it's referenced
+ if (!isNilCheckThunk && OMD->getCmdDecl()->isUsed()) {
// `_cmd` is not a parameter to direct methods, so storage must be
// explicitly declared for it.
CGF.EmitVarDecl(*OMD->getCmdDecl());
Builder.CreateStore(GetSelector(CGF, OMD),
CGF.GetAddrOfLocalVar(OMD->getCmdDecl()));
}
+
+ // It's possible that selfValue is never used.
+ if (se...
[truncated]
|
@@ -3971,7 +3986,9 @@ void CGObjCCommonMac::GenerateDirectMethodPrologue( | |||
ReceiverCanBeNull = isWeakLinkedClass(OID); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MadCoder I noticed your TODO you left here five years ago, and I wonder if you want me to do something about it. With this patch we can easily move the initializer into the thunk. But I am not sure how to test if self
is initialized or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so if I'm reading this right, a translation unit that defines a direct method promises to emit both of these symbols, and in principle the caller calls the right one depending on whether the object is known to be null? Would it make more sense to emit the thunk lazily on the caller side instead?
clang/include/clang/AST/Mangle.h
Outdated
@@ -153,7 +160,8 @@ class MangleContext { | |||
|
|||
void mangleObjCMethodName(const ObjCMethodDecl *MD, raw_ostream &OS, | |||
bool includePrefixByte = true, | |||
bool includeCategoryNamespace = true); | |||
bool includeCategoryNamespace = true, | |||
bool isThunk = true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compilers introduce thunks for a lot of different reasons, so I don't like having super-generic names like isThunk
floating around. Maybe an ObjCThunkKind
or an isNilCheckThunk
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Driveby comment, feel free to ignore: 3 sequential optional boolean parameters is past the point at which you should come up with a flags enum for readability.
clang/lib/AST/Mangle.cpp
Outdated
OS << MethodName; | ||
OS << ']'; | ||
if (!isThunk) | ||
OS << "_inner"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused about this. You're adding _inner
to everything that isn't a thunk? Now, obviously, from the lack of sweeping changes to the entire ObjC test suite, that's not actually what's happening, but I'd guess it's because you're only calling this for direct methods, which is not at all clear from the method name.
inner
is a really non-specific thing to add to the symbol name. This is important because it's what's going to show up in backtraces. Maybe _nonnull
would be better?
Although... the end state we want is that the non-thunk function gets the undecorated name and the thunk function gets the decorated one, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
end state we want is that the non-thunk function gets the undecorated name and the thunk function gets the decorated one, right?
I think it's the other way around, thunk functions gets undecorated ones so anyone can use the old function name when they don't know if the receiver will be nil.
Some caller is non-null but we can't know that until some optimization is done (e.g. code simplification). Thus my solution here is to conservatively emit the thunk for all, and try to DCE (in another patch later) unnecessary ones later. |
Hmm. So you have a compatibility goal that old compiled code can call new compiled code, but it's not important for new compiled code to be able to call old compiled code? Because new compiled code will presumably try to bypass the thunk if it can prove that |
I agree that compatibility is a concern, and the reason we put this patch before visibility change is make sure that we settle ABI first to avoid incompatible ABIs. However, I don't think it is a concern here: Symbols of direct methods are not exported anyways, meaning that no one outside the linking unit can call it, thus there isn't "old" and "new" code: If thunk bypass shall kick in (presumably post link), the compiler will see all relevant code. Current state is that the developers have to write a thunk to expose certain direct methods, so even if a direct method is manually exposed, there isn't a new-old code interaction issue. The decision that thunk has the old ABI is purely to maintain the consistency of symbol semantics so it doesn't cause confusion: |
e6c0251
to
b468535
Compare
Hmm. I'm a little uncomfortable about basing the design around an optimization that only works under LTO. Do we even need clang changes if we're relying on LTO? It might as well be a generic optimization that outlines early exits from a function body. At best it's an attribute that hints that that might be worthwhile. Seems to me that the optimal rule is that we guarantee our inner function that does no additional work. As long as that gets a different mangling from the symbol that uses the old rule, any compatibility problems will be immediately apparent at link time. Thunks can be emitted lazily by default, and for code that needs to maintain compatibility with the old rule, we just have a switch to force the emission of thunk with |
Apologize for the ambiguity, that optimization won't happen in / rely on LTO only.
I agree, and I would assume it could happen somewhere in the CodeSimplification pass, where it runs both pre link and post link.
I agree and that is the case right now. However, I don't fully understand the following, could you please explain more?
Unfortunately, callee can't know all of its callers during CodeGen (imagine
Since all symbols are hidden and resides in one linking unit, there should be no compatibility issues since all translation unit shall be compiled by one compiler. What am I missing here? |
Oh if by "old code" and "new code" do you mean:
(Maybe I'm too young to understand why people do this, but well its doable) For optimizations: Ideally we will just replace a call to |
@rjmccall I did a redesign based on your feedback, let me know if it sound good to you. If so, I'll implement it and send a new PR. clangThe CodeGen logic should be kept mostly intact. define [Class directMethod]() {
%Entry:
// Init Stack
br self==nil, %objc_direct.self_is_nil, %objc_direct.cont
%objc_direct.self_is_nil:
memset(&retval, 0, sizeof(retval));
br %return
%objc_direct.cont:
// Source Code
%return:
// Clean up stack
ret retval
} We make a few changes:
Eventually the code should look like: define [Class directMethod]() {
%Entry:
// Init Stack
br self==nil, %objc_direct.self_is_nil, %objc_direct.cont
%objc_direct.self_is_nil:
memset(&retval, 0, sizeof(retval));
br %return
%objc_direct.cont:
// Source Code
%return:
// Clean up stack
ret retval
%CallNonNull: ; No pred yet
retval = call [Class directMethod]_nonnull()
// ARC intrinsics
br %return
} llvm
define [Class directMethod]() {
%Entry:
// Init Stack
; br self==nil, %objc_direct.self_is_nil, %objc_direct.cont
br self==nil, %objc_direct.self_is_nil, %CallNonNull
%objc_direct.self_is_nil:
memset(&retval, 0, sizeof(retval));
br %return
%objc_direct.cont: ; dead code
// Source Code
%return:
// Clean up stack
ret retval
%CallNonNull:
retval = call [Class directMethod]_nonnull()
// ARC intrinsics
br %return
}
define [Class directMethod]_nonnull() {
%Entry:
// Init Stack
; br self==nil, %objc_direct.self_is_nil, %objc_direct.cont
br %objc_direct.cont:
%objc_direct.self_is_nil: ; dead code
memset(&ret, 0, sizeof(ret));
br %return
%objc_direct.cont:
// Source Code
%return:
// Clean up stack
ret retval
%CallNonNull: ; dead code
retval = call [Class directMethod]_nonnull()
br %return
} |
I dunno, that just seems really complicated and (ultimately) limiting for the optimization. Right now, |
IIRC both requirements are satisfied in this patch. |
Signed-off-by: Peter Rong <[email protected]>
Signed-off-by: Peter Rong <[email protected]>
Signed-off-by: Peter Rong <[email protected]>
Signed-off-by: Peter Rong <[email protected]>
daa7523
to
cb27bd7
Compare
…swift-frontend (#137884) When implementing `@objcDirect` in Swift, Swift needs to mangle a native `Decl` that is not a clang Node, which in turn don't have access to `clang::MangleContext`. Reimplementing mangling logic in Swift is redundant. This patch moves mangling logic from `clang::MangleContext` to `clang` using only basic types (`StringRef`, `std::optional`, etc.), such that Swift can we can just call Clang API: Swift depends on Clang already. We are separating this from #126639 so we can draft the proposal on the Swift side. #126639 will be worked to depend on this PR. Tests: No new tests, old ones should pass with no problem. --------- Signed-off-by: Peter Rong <[email protected]>
…swift-frontend (llvm#137884) When implementing `@objcDirect` in Swift, Swift needs to mangle a native `Decl` that is not a clang Node, which in turn don't have access to `clang::MangleContext`. Reimplementing mangling logic in Swift is redundant. This patch moves mangling logic from `clang::MangleContext` to `clang` using only basic types (`StringRef`, `std::optional`, etc.), such that Swift can we can just call Clang API: Swift depends on Clang already. We are separating this from llvm#126639 so we can draft the proposal on the Swift side. llvm#126639 will be worked to depend on this PR. Tests: No new tests, old ones should pass with no problem. --------- Signed-off-by: Peter Rong <[email protected]>
…swift-frontend (llvm#137884) When implementing `@objcDirect` in Swift, Swift needs to mangle a native `Decl` that is not a clang Node, which in turn don't have access to `clang::MangleContext`. Reimplementing mangling logic in Swift is redundant. This patch moves mangling logic from `clang::MangleContext` to `clang` using only basic types (`StringRef`, `std::optional`, etc.), such that Swift can we can just call Clang API: Swift depends on Clang already. We are separating this from llvm#126639 so we can draft the proposal on the Swift side. llvm#126639 will be worked to depend on this PR. Tests: No new tests, old ones should pass with no problem. --------- Signed-off-by: Peter Rong <[email protected]>
…swift-frontend (llvm#137884) When implementing `@objcDirect` in Swift, Swift needs to mangle a native `Decl` that is not a clang Node, which in turn don't have access to `clang::MangleContext`. Reimplementing mangling logic in Swift is redundant. This patch moves mangling logic from `clang::MangleContext` to `clang` using only basic types (`StringRef`, `std::optional`, etc.), such that Swift can we can just call Clang API: Swift depends on Clang already. We are separating this from llvm#126639 so we can draft the proposal on the Swift side. llvm#126639 will be worked to depend on this PR. Tests: No new tests, old ones should pass with no problem. --------- Signed-off-by: Peter Rong <[email protected]>
…swift-frontend (llvm#137884) When implementing `@objcDirect` in Swift, Swift needs to mangle a native `Decl` that is not a clang Node, which in turn don't have access to `clang::MangleContext`. Reimplementing mangling logic in Swift is redundant. This patch moves mangling logic from `clang::MangleContext` to `clang` using only basic types (`StringRef`, `std::optional`, etc.), such that Swift can we can just call Clang API: Swift depends on Clang already. We are separating this from llvm#126639 so we can draft the proposal on the Swift side. llvm#126639 will be worked to depend on this PR. Tests: No new tests, old ones should pass with no problem. --------- Signed-off-by: Peter Rong <[email protected]>
…swift-frontend (llvm#137884) When implementing `@objcDirect` in Swift, Swift needs to mangle a native `Decl` that is not a clang Node, which in turn don't have access to `clang::MangleContext`. Reimplementing mangling logic in Swift is redundant. This patch moves mangling logic from `clang::MangleContext` to `clang` using only basic types (`StringRef`, `std::optional`, etc.), such that Swift can we can just call Clang API: Swift depends on Clang already. We are separating this from llvm#126639 so we can draft the proposal on the Swift side. llvm#126639 will be worked to depend on this PR. Tests: No new tests, old ones should pass with no problem. --------- Signed-off-by: Peter Rong <[email protected]>
TL;DR
“In Objective-C, it is valid to send a message to nil—it simply has no effect at runtime.”[1]
Such requirements can be easily implemented in the runtime. For
objc_direct
methods, there won’t be a runtime to do the nil check for us. A decision was made when the patch first rolled out that the callee will do the nil check.However, not all direct method calls need to do this check if we can be sure at static time that
self
won't be null. In this PR, we propose a nil check thunk for all direct methods.Detailed design
Overall
-fobjc-emit-nil-check-thunk
to clang frontend. This flag is temporary and will be eliminated once we test it for sometime and determine that this is a stable change. Ideally, we want to make it default.Class methods are excluded
The receiver for the class method (self) is the class object. That said, there is only one (global) instance of this receiver and it is lazily initialized. That means most of the time nil check doesn’t happen, and the code is not generated.
Variadic argument functions are excluded
We need to deal with different calling conventions on different platforms to make variadic argument functions work. Since there is only a handful amount of these functions, we feel reasonable to exclude them.
Results
In our testing pre-upstreaming, we've discovered that the added symbols introduced <0.1% size regression in some of our iOS apps.
With this patch we open the window for more optimizations including:
As a result, we believe this trade off is acceptable.
Future plans
This PR is the first patch towards ObjCDirect ABI visibility change. In the future, we plan to do the following:
\01-[Class method]
and inner functions\01-[Class method]_inner
should get new ABI's, proposing-<Class method>
and-<Class method>_inner
for now.