[PAC] Implement function pointer re-signing #98847

Merged 7 commits on Jul 18, 2024
42 changes: 37 additions & 5 deletions clang/lib/CodeGen/Address.h
@@ -14,6 +14,7 @@
#ifndef LLVM_CLANG_LIB_CODEGEN_ADDRESS_H
#define LLVM_CLANG_LIB_CODEGEN_ADDRESS_H

#include "CGPointerAuthInfo.h"
#include "clang/AST/CharUnits.h"
#include "clang/AST/Type.h"
#include "llvm/ADT/PointerIntPair.h"
@@ -108,6 +109,22 @@ class RawAddress {

/// Like RawAddress, an abstract representation of an aligned address, but the
/// pointer contained in this class is possibly signed.
///
/// This is designed to be an IR-level abstraction, carrying just the
/// information necessary to perform IR operations on an address like loads and
/// stores. In particular, it doesn't carry C type information or allow the
/// representation of things like bit-fields; clients working at that level
/// should generally be using `LValue`.
///
/// An address may be either *raw*, meaning that it's an ordinary machine
/// pointer, or *signed*, meaning that the pointer carries an embedded
/// pointer-authentication signature. Representing signed pointers directly in
/// this abstraction allows the authentication to be delayed as long as possible
/// without forcing IRGen to use totally different code paths for signed and
/// unsigned values or to separately propagate signature information through
/// every API that manipulates addresses. Pointer arithmetic on signed addresses
/// (e.g. drilling down to a struct field) is accumulated into a separate offset
/// which is applied when the address is finally accessed.
class Address {
friend class CGBuilderTy;

@@ -121,7 +138,11 @@ class Address {

CharUnits Alignment;

/// Offset from the base pointer.
/// The ptrauth information needed to authenticate the base pointer.
CGPointerAuthInfo PtrAuthInfo;

/// Offset from the base pointer. This is non-null only when the base
/// pointer is signed.
llvm::Value *Offset = nullptr;

llvm::Value *emitRawPointerSlow(CodeGenFunction &CGF) const;
@@ -140,12 +161,14 @@ class Address {
}

Address(llvm::Value *BasePtr, llvm::Type *ElementType, CharUnits Alignment,
llvm::Value *Offset, KnownNonNull_t IsKnownNonNull = NotKnownNonNull)
CGPointerAuthInfo PtrAuthInfo, llvm::Value *Offset,
KnownNonNull_t IsKnownNonNull = NotKnownNonNull)
: Pointer(BasePtr, IsKnownNonNull), ElementType(ElementType),
Alignment(Alignment), Offset(Offset) {}
Alignment(Alignment), PtrAuthInfo(PtrAuthInfo), Offset(Offset) {}

Address(RawAddress RawAddr)
: Pointer(RawAddr.isValid() ? RawAddr.getPointer() : nullptr),
: Pointer(RawAddr.isValid() ? RawAddr.getPointer() : nullptr,
RawAddr.isValid() ? RawAddr.isKnownNonNull() : NotKnownNonNull),
ElementType(RawAddr.isValid() ? RawAddr.getElementType() : nullptr),
Alignment(RawAddr.isValid() ? RawAddr.getAlignment()
: CharUnits::Zero()) {}
@@ -192,13 +215,18 @@ class Address {
/// Return the IR name of the pointer value.
llvm::StringRef getName() const { return Pointer.getPointer()->getName(); }

const CGPointerAuthInfo &getPointerAuthInfo() const { return PtrAuthInfo; }
void setPointerAuthInfo(const CGPointerAuthInfo &Info) { PtrAuthInfo = Info; }

// This function is called only in CGBuilderBaseTy::CreateElementBitCast.
void setElementType(llvm::Type *Ty) {
assert(hasOffset() &&
"this function shouldn't be called when there is no offset");
ElementType = Ty;
}

bool isSigned() const { return PtrAuthInfo.isSigned(); }

/// Whether the pointer is known not to be null.
KnownNonNull_t isKnownNonNull() const {
assert(isValid());
@@ -215,6 +243,9 @@ class Address {

llvm::Value *getOffset() const { return Offset; }

Address getResignedAddress(const CGPointerAuthInfo &NewInfo,
CodeGenFunction &CGF) const;

/// Return the pointer contained in this class after authenticating it and
/// adding offset to it if necessary.
llvm::Value *emitRawPointer(CodeGenFunction &CGF) const {
@@ -240,7 +271,8 @@ class Address {
/// alignment.
Address withElementType(llvm::Type *ElemTy) const {
if (!hasOffset())
return Address(getBasePointer(), ElemTy, getAlignment(), nullptr,
return Address(getBasePointer(), ElemTy, getAlignment(),
getPointerAuthInfo(), /*Offset=*/nullptr,
isKnownNonNull());
Address A(*this);
A.ElementType = ElemTy;
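To make the raw/signed distinction from the `Address` doc comment concrete, here is a minimal standalone sketch. It uses no LLVM or Clang API: `ToyAuthInfo`, `ToyAddress`, `toyTag`, `toySign`, and the 48-bit pointer layout are all assumptions invented for illustration. It only models the idea that arithmetic on a signed address is accumulated in a separate offset and applied once the base has been authenticated.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for CGPointerAuthInfo: a flag, a key, a discriminator.
struct ToyAuthInfo {
  bool Signed;
  std::uint32_t Key;
  std::uint64_t Discriminator;
};

// Toy layout (an assumption): low 48 bits hold the pointer, high 16 bits a tag.
constexpr std::uint64_t PtrMask = (1ULL << 48) - 1;

// Toy tag, not a real PAC. The low bit is forced on so a tag is never zero.
inline std::uint64_t toyTag(std::uint64_t Ptr, const ToyAuthInfo &Info) {
  std::uint64_t Mix = ((Ptr & PtrMask) ^ Info.Discriminator ^ Info.Key) *
                      0x9E3779B97F4A7C15ULL;
  return (Mix >> 48) | 1;
}

inline std::uint64_t toySign(std::uint64_t Ptr, const ToyAuthInfo &Info) {
  return (Ptr & PtrMask) | (toyTag(Ptr, Info) << 48);
}

struct ToyAddress {
  std::uint64_t Base;
  ToyAuthInfo Info;
  std::int64_t Offset; // stays zero unless the base is signed

  // Drill down to a field: a signed base is left untouched and the byte
  // offset is accumulated; a raw base is simply advanced.
  ToyAddress withFieldOffset(std::int64_t Bytes) const {
    ToyAddress A = *this;
    if (Info.Signed)
      A.Offset += Bytes;
    else
      A.Base += static_cast<std::uint64_t>(Bytes);
    return A;
  }

  // Authenticate (if signed), then apply the accumulated offset.
  // Returns 0 when the toy tag does not match.
  std::uint64_t emitRawPointer() const {
    if (!Info.Signed)
      return Base;
    std::uint64_t Raw = Base & PtrMask;
    if ((Base >> 48) != toyTag(Raw, Info))
      return 0;
    return Raw + static_cast<std::uint64_t>(Offset);
  }
};
```

In this model, taking a field of a signed address never touches the signed bits, so the signature stays valid until `emitRawPointer` is finally called.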
4 changes: 2 additions & 2 deletions clang/lib/CodeGen/CGBuilder.h
@@ -190,8 +190,8 @@ class CGBuilderTy : public CGBuilderBaseTy {
const llvm::Twine &Name = "") {
if (!Addr.hasOffset())
return Address(CreateAddrSpaceCast(Addr.getBasePointer(), Ty, Name),
ElementTy, Addr.getAlignment(), nullptr,
Addr.isKnownNonNull());
ElementTy, Addr.getAlignment(), Addr.getPointerAuthInfo(),
/*Offset=*/nullptr, Addr.isKnownNonNull());
// Eagerly force a raw address if there is an offset.
return RawAddress(
CreateAddrSpaceCast(Addr.emitRawPointer(*getCGF()), Ty, Name),
3 changes: 2 additions & 1 deletion clang/lib/CodeGen/CGExpr.cpp
@@ -1311,7 +1311,8 @@ static Address EmitPointerWithAlignment(const Expr *E, LValueBaseInfo *BaseInfo,
if (CE->getCastKind() == CK_AddressSpaceConversion)
Addr = CGF.Builder.CreateAddrSpaceCast(
Addr, CGF.ConvertType(E->getType()), ElemTy);
return Addr;
return CGF.authPointerToPointerCast(Addr, CE->getSubExpr()->getType(),
CE->getType());
}
break;

7 changes: 6 additions & 1 deletion clang/lib/CodeGen/CGExprScalar.cpp
@@ -2373,7 +2373,9 @@ Value *ScalarExprEmitter::VisitCastExpr(CastExpr *CE) {
DestLV.setTBAAInfo(TBAAAccessInfo::getMayAliasInfo());
return EmitLoadOfLValue(DestLV, CE->getExprLoc());
}
return Builder.CreateBitCast(Src, DstTy);

llvm::Value *Result = Builder.CreateBitCast(Src, DstTy);
return CGF.authPointerToPointerCast(Result, E->getType(), DestTy);
}
case CK_AddressSpaceConversion: {
Expr::EvalResult Result;
@@ -2523,6 +2525,8 @@ Value *ScalarExprEmitter::VisitCastExpr(CastExpr *CE) {
if (DestTy.mayBeDynamicClass())
IntToPtr = Builder.CreateLaunderInvariantGroup(IntToPtr);
}

IntToPtr = CGF.authPointerToPointerCast(IntToPtr, E->getType(), DestTy);
return IntToPtr;
}
case CK_PointerToIntegral: {
@@ -2538,6 +2542,7 @@ Value *ScalarExprEmitter::VisitCastExpr(CastExpr *CE) {
PtrExpr = Builder.CreateStripInvariantGroup(PtrExpr);
}

PtrExpr = CGF.authPointerToPointerCast(PtrExpr, E->getType(), DestTy);
return Builder.CreatePtrToInt(PtrExpr, ConvertType(DestTy));
}
case CK_ToVoid: {
230 changes: 230 additions & 0 deletions clang/lib/CodeGen/CGPointerAuth.cpp
@@ -15,6 +15,7 @@
#include "CodeGenModule.h"
#include "clang/CodeGen/CodeGenABITypes.h"
#include "clang/CodeGen/ConstantInitBuilder.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/Support/SipHash.h"

using namespace clang;
@@ -165,6 +166,128 @@ CGPointerAuthInfo CodeGenModule::getPointerAuthInfoForType(QualType T) {
return ::getPointerAuthInfoForType(*this, T);
}

static bool isZeroConstant(const llvm::Value *Value) {
if (const auto *CI = dyn_cast<llvm::ConstantInt>(Value))
return CI->isZero();
return false;
}

static bool equalAuthPolicies(const CGPointerAuthInfo &Left,
const CGPointerAuthInfo &Right) {
assert((Left.isSigned() || Right.isSigned()) &&
"shouldn't be called if neither is signed");
if (Left.isSigned() != Right.isSigned())
return false;
return Left.getKey() == Right.getKey() &&
Left.getAuthenticationMode() == Right.getAuthenticationMode();
}

// Return the discriminator or return zero if the discriminator is null.
static llvm::Value *getDiscriminatorOrZero(const CGPointerAuthInfo &Info,
CGBuilderTy &Builder) {
llvm::Value *Discriminator = Info.getDiscriminator();
return Discriminator ? Discriminator : Builder.getSize(0);
}
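The helpers above treat a missing discriminator as interchangeable with a zero one. A minimal standalone sketch of that equivalence follows. `ToySchema`, `discriminatorOrZero`, and `signsTheSameWay` are hypothetical names, the sketch handles only constant discriminators, and it leaves out the authentication mode. In the patch itself discriminators are `llvm::Value *` and are compared by identity or against a zero constant.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified schema: constant discriminators only.
struct ToySchema {
  bool Signed;
  unsigned Key;
  bool HasDiscriminator;       // false models a null discriminator
  std::uint64_t Discriminator; // ignored when HasDiscriminator is false
};

// Mirrors the idea of getDiscriminatorOrZero: absent means zero.
inline std::uint64_t discriminatorOrZero(const ToySchema &S) {
  return S.HasDiscriminator ? S.Discriminator : 0;
}

// Two schemas sign the same way when both are unsigned, or when both are
// signed with the same key and the same effective discriminator.
inline bool signsTheSameWay(const ToySchema &L, const ToySchema &R) {
  if (L.Signed != R.Signed)
    return false;
  if (!L.Signed)
    return true;
  return L.Key == R.Key && discriminatorOrZero(L) == discriminatorOrZero(R);
}
```

When two schemas compare equal in this sense, a resign would produce the same bits it started with, which is why the real code can return the value unchanged.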

llvm::Value *
CodeGenFunction::emitPointerAuthResignCall(llvm::Value *Value,
const CGPointerAuthInfo &CurAuth,
const CGPointerAuthInfo &NewAuth) {
assert(CurAuth && NewAuth);

if (CurAuth.getAuthenticationMode() !=
PointerAuthenticationMode::SignAndAuth ||
NewAuth.getAuthenticationMode() !=
PointerAuthenticationMode::SignAndAuth) {
llvm::Value *AuthedValue = EmitPointerAuthAuth(CurAuth, Value);
return EmitPointerAuthSign(NewAuth, AuthedValue);
}
// Convert the pointer to intptr_t before signing it.
auto *OrigType = Value->getType();
Value = Builder.CreatePtrToInt(Value, IntPtrTy);

auto *CurKey = Builder.getInt32(CurAuth.getKey());
auto *NewKey = Builder.getInt32(NewAuth.getKey());

llvm::Value *CurDiscriminator = getDiscriminatorOrZero(CurAuth, Builder);
llvm::Value *NewDiscriminator = getDiscriminatorOrZero(NewAuth, Builder);

// call i64 @llvm.ptrauth.resign(i64 %pointer,
// i32 %curKey, i64 %curDiscriminator,
// i32 %newKey, i64 %newDiscriminator)
auto *Intrinsic = CGM.getIntrinsic(llvm::Intrinsic::ptrauth_resign);
Value = EmitRuntimeCall(
Intrinsic, {Value, CurKey, CurDiscriminator, NewKey, NewDiscriminator});

// Convert back to the original type.
Value = Builder.CreateIntToPtr(Value, OrigType);
return Value;
}

llvm::Value *CodeGenFunction::emitPointerAuthResign(
llvm::Value *Value, QualType Type, const CGPointerAuthInfo &CurAuthInfo,
const CGPointerAuthInfo &NewAuthInfo, bool IsKnownNonNull) {
// Fast path: if neither schema wants a signature, we're done.
if (!CurAuthInfo && !NewAuthInfo)
return Value;

llvm::Value *Null = nullptr;
// If the value is obviously null, we're done.
if (auto *PointerValue = dyn_cast<llvm::PointerType>(Value->getType())) {
Null = CGM.getNullPointer(PointerValue, Type);
} else {
assert(Value->getType()->isIntegerTy());
Null = llvm::ConstantInt::get(IntPtrTy, 0);
}
if (Value == Null)
return Value;

// If both schemas sign the same way, we're done.
if (equalAuthPolicies(CurAuthInfo, NewAuthInfo)) {
const llvm::Value *CurD = CurAuthInfo.getDiscriminator();
const llvm::Value *NewD = NewAuthInfo.getDiscriminator();
if (CurD == NewD)
return Value;

if ((CurD == nullptr && isZeroConstant(NewD)) ||
(NewD == nullptr && isZeroConstant(CurD)))
return Value;
}

llvm::BasicBlock *InitBB = Builder.GetInsertBlock();
llvm::BasicBlock *ResignBB = nullptr, *ContBB = nullptr;

// Null pointers have to be mapped to null, and the ptrauth_resign
// intrinsic doesn't do that.
if (!IsKnownNonNull && !llvm::isKnownNonZero(Value, CGM.getDataLayout())) {
ContBB = createBasicBlock("resign.cont");
ResignBB = createBasicBlock("resign.nonnull");

auto *IsNonNull = Builder.CreateICmpNE(Value, Null);
Builder.CreateCondBr(IsNonNull, ResignBB, ContBB);
EmitBlock(ResignBB);
}

// Perform the auth/sign/resign operation.
if (!NewAuthInfo)
Value = EmitPointerAuthAuth(CurAuthInfo, Value);
else if (!CurAuthInfo)
Value = EmitPointerAuthSign(NewAuthInfo, Value);
else
Value = emitPointerAuthResignCall(Value, CurAuthInfo, NewAuthInfo);

// Clean up with a phi if we branched before.
if (ContBB) {
EmitBlock(ContBB);
auto *Phi = Builder.CreatePHI(Value->getType(), 2);
Phi->addIncoming(Null, InitBB);
Phi->addIncoming(Value, ResignBB);
Value = Phi;
}

return Value;
}
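The null handling in `emitPointerAuthResign` can be illustrated with a small standalone model. Everything here is an assumption made for the sketch: `ToySchema`, `toyTag`, `toySign`, `toyStrip`, `toyResign`, and the 48-bit layout are invented, the "signature" is a toy hash rather than a real PAC, and the toy authentication step strips the tag without verifying it. The point is only that a null input must map to a null output, which a plain sign or resign would not guarantee.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical schema: whether the pointer is signed, and with what constant.
struct ToySchema {
  bool Signed;
  std::uint64_t Discriminator;
};

// Toy layout (an assumption): low 48 bits hold the pointer, high 16 bits a tag.
constexpr std::uint64_t PtrMask = (1ULL << 48) - 1;

// Toy tag with the low bit forced on, so signing never yields an all-zero tag.
inline std::uint64_t toyTag(std::uint64_t Ptr, const ToySchema &S) {
  std::uint64_t Mix =
      ((Ptr & PtrMask) ^ S.Discriminator) * 0x9E3779B97F4A7C15ULL;
  return (Mix >> 48) | 1;
}

inline std::uint64_t toySign(std::uint64_t Ptr, const ToySchema &S) {
  return (Ptr & PtrMask) | (toyTag(Ptr, S) << 48);
}

// Toy "auth": drops the tag without checking it.
inline std::uint64_t toyStrip(std::uint64_t Value) { return Value & PtrMask; }

inline std::uint64_t toyResign(std::uint64_t Value, const ToySchema &Cur,
                               const ToySchema &New) {
  // Fast path: neither schema wants a signature.
  if (!Cur.Signed && !New.Signed)
    return Value;
  // Null stays null; signing it would attach a non-zero tag.
  if (Value == 0)
    return 0;
  // Auth only, sign only, or auth followed by sign.
  std::uint64_t Raw = Cur.Signed ? toyStrip(Value) : Value;
  return New.Signed ? toySign(Raw, New) : Raw;
}
```

In this model `toySign(0, S)` is always non-zero, so without the explicit null check a null pointer would stop comparing equal to null after a cast.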

llvm::Constant *
CodeGenModule::getConstantSignedPointer(llvm::Constant *Pointer, unsigned Key,
llvm::Constant *StorageAddress,
@@ -351,3 +474,110 @@ CodeGenModule::getVTablePointerAuthInfo(CodeGenFunction *CGF,
/* IsIsaPointer */ false,
/* AuthenticatesNullValues */ false, Discriminator);
}

llvm::Value *CodeGenFunction::authPointerToPointerCast(llvm::Value *ResultPtr,
QualType SourceType,
QualType DestType) {
CGPointerAuthInfo CurAuthInfo, NewAuthInfo;
if (SourceType->isSignableType())
CurAuthInfo = getPointerAuthInfoForType(CGM, SourceType);

if (DestType->isSignableType())
NewAuthInfo = getPointerAuthInfoForType(CGM, DestType);

if (!CurAuthInfo && !NewAuthInfo)
return ResultPtr;

// If only one side of the cast is a function pointer, then we still need to
// resign to handle casts to/from opaque pointers.
if (!CurAuthInfo && DestType->isFunctionPointerType())
CurAuthInfo = CGM.getFunctionPointerAuthInfo(SourceType);

if (!NewAuthInfo && SourceType->isFunctionPointerType())
NewAuthInfo = CGM.getFunctionPointerAuthInfo(DestType);

return emitPointerAuthResign(ResultPtr, DestType, CurAuthInfo, NewAuthInfo,
/*IsKnownNonNull=*/false);
}
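The schema selection in the `llvm::Value *` overload of `authPointerToPointerCast` can be summarized with a small decision sketch. `ToyType`, `ToyPlan`, and `planCast` are hypothetical names, and `HasOwnSchema` is an assumption standing in for "the type-based schema lookup returned a non-empty result". The sketch tracks only whether each side ends up treated as signed, not which key or discriminator is used.

```cpp
#include <cassert>

// Hypothetical description of a pointer type for this sketch.
struct ToyType {
  bool IsFunctionPointer;
  bool HasOwnSchema; // models a non-empty type-based schema
};

// Which sides of the cast are treated as signed.
struct ToyPlan {
  bool CurSigned; // value is assumed signed before the cast
  bool NewSigned; // value must be signed after the cast
};

inline ToyPlan planCast(const ToyType &Src, const ToyType &Dst) {
  ToyPlan Plan = {Src.HasOwnSchema, Dst.HasOwnSchema};
  // Neither side has a schema: nothing to do.
  if (!Plan.CurSigned && !Plan.NewSigned)
    return Plan;
  // Only the destination is a function pointer: assume the source value
  // already carries a function-pointer signature.
  if (!Plan.CurSigned && Dst.IsFunctionPointer)
    Plan.CurSigned = true;
  // Only the source is a function pointer: keep the result signed.
  if (!Plan.NewSigned && Src.IsFunctionPointer)
    Plan.NewSigned = true;
  return Plan;
}
```

Under these assumptions, any cast that involves a function pointer on one side ends up with both sides signed, so it becomes a resign rather than a bare sign or bare auth.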

Address CodeGenFunction::authPointerToPointerCast(Address Ptr,
QualType SourceType,
QualType DestType) {
CGPointerAuthInfo CurAuthInfo, NewAuthInfo;
if (SourceType->isSignableType())
CurAuthInfo = getPointerAuthInfoForType(CGM, SourceType);

if (DestType->isSignableType())
NewAuthInfo = getPointerAuthInfoForType(CGM, DestType);

if (!CurAuthInfo && !NewAuthInfo)
return Ptr;

if (!CurAuthInfo && DestType->isFunctionPointerType()) {
// When casting a non-signed pointer to a function pointer, just set the
// auth info on Ptr to the assumed schema. The pointer will be resigned to
// the effective type when used.
Ptr.setPointerAuthInfo(CGM.getFunctionPointerAuthInfo(SourceType));
return Ptr;
}

if (!NewAuthInfo && SourceType->isFunctionPointerType()) {
NewAuthInfo = CGM.getFunctionPointerAuthInfo(DestType);
Ptr = Ptr.getResignedAddress(NewAuthInfo, *this);
Ptr.setPointerAuthInfo(CGPointerAuthInfo());
return Ptr;
}

return Ptr;
}

Address CodeGenFunction::getAsNaturalAddressOf(Address Addr,
QualType PointeeTy) {
CGPointerAuthInfo Info =
PointeeTy.isNull() ? CGPointerAuthInfo()
: CGM.getPointerAuthInfoForPointeeType(PointeeTy);
return Addr.getResignedAddress(Info, *this);
}

Address Address::getResignedAddress(const CGPointerAuthInfo &NewInfo,
CodeGenFunction &CGF) const {
assert(isValid() && "pointer isn't valid");
CGPointerAuthInfo CurInfo = getPointerAuthInfo();
llvm::Value *Val;

// Nothing to do if neither the current nor the new ptrauth info needs signing.
if (!CurInfo.isSigned() && !NewInfo.isSigned())
return Address(getBasePointer(), getElementType(), getAlignment(),
isKnownNonNull());

assert(ElementType && "Effective type has to be set");
assert(!Offset && "unexpected non-null offset");

// If the current and the new ptrauth infos are the same and the offset is
// null, just cast the base pointer to the effective type.
if (CurInfo == NewInfo && !hasOffset())
Val = getBasePointer();
else
Val = CGF.emitPointerAuthResign(getBasePointer(), QualType(), CurInfo,
NewInfo, isKnownNonNull());

Val = CGF.Builder.CreateBitCast(Val, getType());
return Address(Val, getElementType(), getAlignment(), NewInfo,
/*Offset=*/nullptr, isKnownNonNull());
}

llvm::Value *LValue::getPointer(CodeGenFunction &CGF) const {
assert(isSimple());
return emitResignedPointer(getType(), CGF);
Review comment (Contributor):

I suppose it might be worth having a "short path" for non-signed pointers. Most code is compiled without pauth, but we'll always have the overhead of calling these pauth-related functions (while having the same observable behavior as before in the non-pauth case).

The same applies to CodeGenFunction::AuthPointerToPointerCast: maybe we want a global switch indicating that pauth is disabled everywhere, so we do not have to obtain the signing schemas for the source and destination types every time and check them against the empty schema to confirm that no pauth is needed.

Anyway, I suggest leaving this out of scope of the patch and doing performance testing with https://llvm-compile-time-tracker.com/ after merging the PR.

Reply (Collaborator Author):

Yes, we can check whether there is a noticeable regression in compile time.

}

llvm::Value *LValue::emitResignedPointer(QualType PointeeTy,
CodeGenFunction &CGF) const {
assert(isSimple());
return CGF.getAsNaturalAddressOf(Addr, PointeeTy).getBasePointer();
}

llvm::Value *LValue::emitRawPointer(CodeGenFunction &CGF) const {
assert(isSimple());
return Addr.isValid() ? Addr.emitRawPointer(CGF) : nullptr;
}
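The review thread on `LValue::getPointer` suggests a global fast path for builds without pointer authentication. The following standalone sketch shows what such a switch could look like. It is not part of this patch, and `ToyModule`, `lookupSchema`, and `castNeedsResign` are hypothetical names. `SchemaLookups` exists only to make the saved work observable.

```cpp
#include <cassert>

// Hypothetical module state: one global switch plus a lookup counter.
struct ToyModule {
  bool PointerAuthEnabled;
  int SchemaLookups; // counts how often the (toy) schema query runs
};

// Toy schema query: only function pointers are signed, and only when
// pointer authentication is enabled.
inline bool lookupSchema(ToyModule &M, bool IsFunctionPointer) {
  ++M.SchemaLookups;
  return M.PointerAuthEnabled && IsFunctionPointer;
}

// Returns true when a cast would need an auth/sign/resign operation.
inline bool castNeedsResign(ToyModule &M, bool SrcIsFunctionPointer,
                            bool DstIsFunctionPointer) {
  // Fast path: with pointer authentication off, skip both schema queries.
  if (!M.PointerAuthEnabled)
    return false;
  bool Cur = lookupSchema(M, SrcIsFunctionPointer);
  bool New = lookupSchema(M, DstIsFunctionPointer);
  return Cur || New;
}
```

Whether such a check is worth adding is exactly what the compile-time measurement proposed in the thread would show.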