Commit 07fda1e

Merge branch 'users/meinersbur/flang_runtime_move-files' into users/meinersbur/flang_runtime
2 parents 66292f0 + 7eef009 commit 07fda1e

108 files changed: +2470, -1775 lines changed

clang/docs/SourceBasedCodeCoverage.rst

Lines changed: 5 additions & 0 deletions
@@ -94,6 +94,11 @@ directory structure will be created. Additionally, the following special
   not specified (i.e the pattern is "%m"), it's assumed that ``N = 1``. The
   merge pool specifier can only occur once per filename pattern.

+* "%b" expands out to the binary ID (build ID). It can be used with "%Nm" to
+  avoid binary signature collisions. To use it, the program should be compiled
+  with the build ID linker option (``--build-id`` for GNU ld or LLD,
+  ``/build-id`` for lld-link on Windows). Linux, Windows and AIX are supported.
+
 * "%c" expands out to nothing, but enables a mode in which profile counter
   updates are continuously synced to a file. This means that if the
   instrumented program crashes, or is killed by a signal, perfect coverage

clang/docs/UsersManual.rst

Lines changed: 10 additions & 3 deletions
@@ -2965,7 +2965,8 @@ instrumentation:
 environment variable to specify an alternate file. If non-default file name
 is specified by both the environment variable and the command line option,
 the environment variable takes precedence. The file name pattern specified
-can include different modifiers: ``%p``, ``%h``, ``%m``, ``%t``, and ``%c``.
+can include different modifiers: ``%p``, ``%h``, ``%m``, ``%b``, ``%t``, and
+``%c``.

 Any instance of ``%p`` in that file name will be replaced by the process
 ID, so that you can easily distinguish the profile output from multiple
@@ -2987,11 +2988,11 @@ instrumentation:
 ``%p`` is that the storage requirement for raw profile data files is greatly
 increased. To avoid issues like this, the ``%m`` specifier can used in the profile
 name. When this specifier is used, the profiler runtime will substitute ``%m``
-with a unique integer identifier associated with the instrumented binary. Additionally,
+with an integer identifier associated with the instrumented binary. Additionally,
 multiple raw profiles dumped from different processes that share a file system (can be
 on different hosts) will be automatically merged by the profiler runtime during the
 dumping. If the program links in multiple instrumented shared libraries, each library
-will dump the profile data into its own profile data file (with its unique integer
+will dump the profile data into its own profile data file (with its integer
 id embedded in the profile name). Note that the merging enabled by ``%m`` is for raw
 profile data generated by profiler runtime. The resulting merged "raw" profile data
 file still needs to be converted to a different format expected by the compiler (
@@ -3001,6 +3002,12 @@ instrumentation:

 $ LLVM_PROFILE_FILE="code-%m.profraw" ./code

+Although rare, binary signatures used by the ``%m`` specifier can have
+collisions. In this case, the ``%b`` specifier, which expands to the binary
+ID (build ID in ELF and COFF), can be added. To use it, the program should be
+compiled with the build ID linker option (``--build-id`` for GNU ld or LLD,
+``/build-id`` for lld-link on Windows). Linux, Windows and AIX are supported.
+
 See `this <SourceBasedCodeCoverage.html#running-the-instrumented-program>`_ section
 about the ``%t``, and ``%c`` modifiers.

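The documentation change above says that ``%b`` can be combined with ``%m`` to keep two binaries with colliding signatures from sharing a profile file. The following is only a toy sketch of that idea: `expandToyProfilePattern` and `toyProfilePatternDemo` are hypothetical names, the signature and build ID strings are made up, and the way the real profiler runtime formats each field is not shown in this diff and is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy expansion (hypothetical helper, not the profiler runtime): replaces
// "%m" with Signature and "%b" with BuildId, and copies everything else.
inline std::string expandToyProfilePattern(const std::string &Pattern,
                                           const std::string &Signature,
                                           const std::string &BuildId) {
  std::string Out;
  for (std::size_t I = 0; I < Pattern.size(); ++I) {
    if (Pattern[I] == '%' && I + 1 < Pattern.size()) {
      const char Spec = Pattern[I + 1];
      if (Spec == 'm' || Spec == 'b') {
        Out += (Spec == 'm') ? Signature : BuildId;
        ++I; // skip the specifier character
        continue;
      }
    }
    Out += Pattern[I];
  }
  return Out;
}

// Two made-up binaries that share a signature but have different build IDs.
inline void toyProfilePatternDemo() {
  const std::string OnlyM = "code-%m.profraw";
  const std::string WithB = "code-%m-%b.profraw";
  // With "%m" alone both binaries map to the same file name.
  assert(expandToyProfilePattern(OnlyM, "42", "aaaa") ==
         expandToyProfilePattern(OnlyM, "42", "bbbb"));
  // Adding "%b" separates them.
  assert(expandToyProfilePattern(WithB, "42", "aaaa") !=
         expandToyProfilePattern(WithB, "42", "bbbb"));
}
```

Calling `toyProfilePatternDemo()` from a `main` exercises both cases; the build ID is only available when the binary was linked with the build ID option named in the documentation text.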

clang/include/clang/AST/OperationKinds.def

Lines changed: 3 additions & 0 deletions
@@ -367,6 +367,9 @@ CAST_OPERATION(HLSLVectorTruncation)
 // Non-decaying array RValue cast (HLSL only).
 CAST_OPERATION(HLSLArrayRValue)

+// Aggregate by Value cast (HLSL only).
+CAST_OPERATION(HLSLElementwiseCast)
+
 //===- Binary Operations -------------------------------------------------===//
 // Operators listed in order of precedence.
 // Note that additions to this should also update the StmtVisitor class,
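The `CAST_OPERATION(...)` entries in this file follow the X-macro pattern: each includer defines the macro differently to generate an enum, a name table, and so on. That is why adding one entry here is followed by new `case CK_HLSLElementwiseCast:` labels in the switch statements later in this commit. The sketch below is a self-contained toy, assuming a macro list in place of the real `.def` include; every `Toy`/`TOY_`/`TCK_` name is hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a .def file: one X(...) entry per cast kind.
#define TOY_CAST_OPERATIONS(X)                                                 \
  X(HLSLVectorTruncation)                                                      \
  X(HLSLArrayRValue)                                                           \
  X(HLSLElementwiseCast)

// First expansion: turn every entry into an enumerator.
enum ToyCastKind {
#define TOY_ENUM(Name) TCK_##Name,
  TOY_CAST_OPERATIONS(TOY_ENUM)
#undef TOY_ENUM
};

// Second expansion: turn every entry into a case that returns its name.
inline const char *toyCastKindName(ToyCastKind K) {
  switch (K) {
#define TOY_CASE(Name)                                                         \
  case TCK_##Name:                                                             \
    return #Name;
    TOY_CAST_OPERATIONS(TOY_CASE)
#undef TOY_CASE
  }
  assert(false && "unknown toy cast kind");
  return "<unknown>";
}

// Small usage check: the new entry gets an enumerator and a name for free.
inline void toyCastKindDemo() {
  assert(std::strcmp(toyCastKindName(TCK_HLSLElementwiseCast),
                     "HLSLElementwiseCast") == 0);
}
```

Hand-written switches over the enum do not get the new kind automatically, so a fully covered switch needs a new label for it, which matches the one-line additions in the files below.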

clang/include/clang/Basic/BuiltinsSME.def

Lines changed: 0 additions & 21 deletions
This file was deleted.

clang/include/clang/Basic/BuiltinsSVE.def

Lines changed: 0 additions & 22 deletions
This file was deleted.

clang/include/clang/Driver/Driver.h

Lines changed: 5 additions & 13 deletions
@@ -797,22 +797,14 @@ class Driver {
   const ToolChain &getToolChain(const llvm::opt::ArgList &Args,
                                 const llvm::Triple &Target) const;

-  /// @}
-
-  /// Retrieves a ToolChain for a particular device \p Target triple
-  ///
-  /// \param[in] HostTC is the host ToolChain paired with the device
-  ///
-  /// \param[in] TargetDeviceOffloadKind (e.g. OFK_Cuda/OFK_OpenMP/OFK_SYCL) is
-  /// an Offloading action that is optionally passed to a ToolChain (used by
-  /// CUDA, to specify if it's used in conjunction with OpenMP)
+  /// Retrieves a ToolChain for a particular \p Target triple for offloading.
   ///
   /// Will cache ToolChains for the life of the driver object, and create them
   /// on-demand.
-  const ToolChain &getOffloadingDeviceToolChain(
-      const llvm::opt::ArgList &Args, const llvm::Triple &Target,
-      const ToolChain &HostTC,
-      const Action::OffloadKind &TargetDeviceOffloadKind) const;
+  const ToolChain &getOffloadToolChain(const llvm::opt::ArgList &Args,
+                                       const Action::OffloadKind Kind,
+                                       const llvm::Triple &Target,
+                                       const llvm::Triple &AuxTarget) const;

   /// Get bitmasks for which option flags to include and exclude based on
   /// the driver mode.

clang/include/clang/Sema/SemaHLSL.h

Lines changed: 3 additions & 0 deletions
@@ -141,6 +141,9 @@ class SemaHLSL : public SemaBase {
   // Diagnose whether the input ID is uint/unit2/uint3 type.
   bool diagnoseInputIDType(QualType T, const ParsedAttr &AL);

+  bool CanPerformScalarCast(QualType SrcTy, QualType DestTy);
+  bool ContainsBitField(QualType BaseTy);
+  bool CanPerformElementwiseCast(Expr *Src, QualType DestType);
   ExprResult ActOnOutParamExpr(ParmVarDecl *Param, Expr *Arg);

   QualType getInoutParameterType(QualType Ty);

clang/include/module.modulemap

Lines changed: 0 additions & 4 deletions
@@ -49,11 +49,7 @@ module Clang_Basic {
   textual header "clang/Basic/BuiltinsLoongArchLASX.def"
   textual header "clang/Basic/BuiltinsLoongArchLSX.def"
   textual header "clang/Basic/BuiltinsMips.def"
-  textual header "clang/Basic/BuiltinsNEON.def"
   textual header "clang/Basic/BuiltinsPPC.def"
-  textual header "clang/Basic/BuiltinsRISCVVector.def"
-  textual header "clang/Basic/BuiltinsSME.def"
-  textual header "clang/Basic/BuiltinsSVE.def"
   textual header "clang/Basic/BuiltinsSystemZ.def"
   textual header "clang/Basic/BuiltinsVE.def"
   textual header "clang/Basic/BuiltinsVEVL.gen.def"

clang/lib/AST/Expr.cpp

Lines changed: 1 addition & 0 deletions
@@ -1956,6 +1956,7 @@ bool CastExpr::CastConsistency() const {
   case CK_FixedPointToBoolean:
   case CK_HLSLArrayRValue:
   case CK_HLSLVectorTruncation:
+  case CK_HLSLElementwiseCast:
   CheckNoBasePath:
     assert(path_empty() && "Cast kind should not have a base path!");
     break;

clang/lib/AST/ExprConstant.cpp

Lines changed: 2 additions & 0 deletions
@@ -15047,6 +15047,7 @@ bool IntExprEvaluator::VisitCastExpr(const CastExpr *E) {
   case CK_NoOp:
   case CK_LValueToRValueBitCast:
   case CK_HLSLArrayRValue:
+  case CK_HLSLElementwiseCast:
     return ExprEvaluatorBaseTy::VisitCastExpr(E);

   case CK_MemberPointerToBoolean:
@@ -15905,6 +15906,7 @@ bool ComplexExprEvaluator::VisitCastExpr(const CastExpr *E) {
   case CK_IntegralToFixedPoint:
   case CK_MatrixCast:
   case CK_HLSLVectorTruncation:
+  case CK_HLSLElementwiseCast:
     llvm_unreachable("invalid cast kind for complex value");

   case CK_LValueToRValue:

clang/lib/CodeGen/CGExpr.cpp

Lines changed: 73 additions & 0 deletions
@@ -5338,6 +5338,7 @@ LValue CodeGenFunction::EmitCastLValue(const CastExpr *E) {
   case CK_MatrixCast:
   case CK_HLSLVectorTruncation:
   case CK_HLSLArrayRValue:
+  case CK_HLSLElementwiseCast:
     return EmitUnsupportedLValue(E, "unexpected cast lvalue");

   case CK_Dependent:
@@ -6376,3 +6377,75 @@ RValue CodeGenFunction::EmitPseudoObjectRValue(const PseudoObjectExpr *E,
 LValue CodeGenFunction::EmitPseudoObjectLValue(const PseudoObjectExpr *E) {
   return emitPseudoObjectExpr(*this, E, true, AggValueSlot::ignored()).LV;
 }
+
+void CodeGenFunction::FlattenAccessAndType(
+    Address Addr, QualType AddrType,
+    SmallVectorImpl<std::pair<Address, llvm::Value *>> &AccessList,
+    SmallVectorImpl<QualType> &FlatTypes) {
+  // WorkList is list of type we are processing + the Index List to access
+  // the field of that type in Addr for use in a GEP
+  llvm::SmallVector<std::pair<QualType, llvm::SmallVector<llvm::Value *, 4>>,
+                    16>
+      WorkList;
+  llvm::IntegerType *IdxTy = llvm::IntegerType::get(getLLVMContext(), 32);
+  // Addr should be a pointer so we need to 'dereference' it
+  WorkList.push_back({AddrType, {llvm::ConstantInt::get(IdxTy, 0)}});
+
+  while (!WorkList.empty()) {
+    auto [T, IdxList] = WorkList.pop_back_val();
+    T = T.getCanonicalType().getUnqualifiedType();
+    assert(!isa<MatrixType>(T) && "Matrix types not yet supported in HLSL");
+    if (const auto *CAT = dyn_cast<ConstantArrayType>(T)) {
+      uint64_t Size = CAT->getZExtSize();
+      for (int64_t I = Size - 1; I > -1; I--) {
+        llvm::SmallVector<llvm::Value *, 4> IdxListCopy = IdxList;
+        IdxListCopy.push_back(llvm::ConstantInt::get(IdxTy, I));
+        WorkList.emplace_back(CAT->getElementType(), IdxListCopy);
+      }
+    } else if (const auto *RT = dyn_cast<RecordType>(T)) {
+      const RecordDecl *Record = RT->getDecl();
+      assert(!Record->isUnion() && "Union types not supported in flat cast.");
+
+      const CXXRecordDecl *CXXD = dyn_cast<CXXRecordDecl>(Record);
+
+      llvm::SmallVector<QualType, 16> FieldTypes;
+      if (CXXD && CXXD->isStandardLayout())
+        Record = CXXD->getStandardLayoutBaseWithFields();
+
+      // deal with potential base classes
+      if (CXXD && !CXXD->isStandardLayout()) {
+        for (auto &Base : CXXD->bases())
+          FieldTypes.push_back(Base.getType());
+      }
+
+      for (auto *FD : Record->fields())
+        FieldTypes.push_back(FD->getType());
+
+      for (int64_t I = FieldTypes.size() - 1; I > -1; I--) {
+        llvm::SmallVector<llvm::Value *, 4> IdxListCopy = IdxList;
+        IdxListCopy.push_back(llvm::ConstantInt::get(IdxTy, I));
+        WorkList.insert(WorkList.end(), {FieldTypes[I], IdxListCopy});
+      }
+    } else if (const auto *VT = dyn_cast<VectorType>(T)) {
+      llvm::Type *LLVMT = ConvertTypeForMem(T);
+      CharUnits Align = getContext().getTypeAlignInChars(T);
+      Address GEP =
+          Builder.CreateInBoundsGEP(Addr, IdxList, LLVMT, Align, "vector.gep");
+      for (unsigned I = 0, E = VT->getNumElements(); I < E; I++) {
+        llvm::Value *Idx = llvm::ConstantInt::get(IdxTy, I);
+        // gep on vector fields is not recommended so combine gep with
+        // extract/insert
+        AccessList.emplace_back(GEP, Idx);
+        FlatTypes.push_back(VT->getElementType());
+      }
+    } else {
+      // a scalar/builtin type
+      llvm::Type *LLVMT = ConvertTypeForMem(T);
+      CharUnits Align = getContext().getTypeAlignInChars(T);
+      Address GEP =
+          Builder.CreateInBoundsGEP(Addr, IdxList, LLVMT, Align, "gep");
+      AccessList.emplace_back(GEP, nullptr);
+      FlatTypes.push_back(T);
+    }
+  }
+}
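`FlattenAccessAndType` above walks a type with an explicit work list, and it pushes array elements and record fields in reverse so that popping from the back visits them in declaration order. The sketch below shows that traversal on a toy type tree, with no Clang or LLVM types involved. `ToyType`, `ToyAccess`, `flattenToy` and `flattenToyDemo` are hypothetical names, and plain integers stand in for the GEP index constants.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy type: a leaf has no children; an aggregate (array or
// record) lists its element or field types in declaration order.
struct ToyType {
  std::string Name;
  std::vector<ToyType> Children;
};

// One flattened leaf: its name plus the index path that reaches it.
using ToyAccess = std::pair<std::string, std::vector<unsigned>>;

inline std::vector<ToyAccess> flattenToy(const ToyType &Root) {
  std::vector<ToyAccess> Out;
  // Each work item pairs a type with the index path leading to it. The path
  // starts with 0, mirroring the leading "dereference" index in the diff.
  std::vector<std::pair<const ToyType *, std::vector<unsigned>>> WorkList;
  WorkList.push_back({&Root, {0u}});

  while (!WorkList.empty()) {
    const ToyType *T = WorkList.back().first;
    std::vector<unsigned> Path = std::move(WorkList.back().second);
    WorkList.pop_back();

    if (T->Children.empty()) {
      Out.emplace_back(T->Name, Path); // scalar leaf: record the access
      continue;
    }
    // Push children last-to-first so the first child is popped next.
    for (std::size_t I = T->Children.size(); I-- > 0;) {
      std::vector<unsigned> ChildPath = Path;
      ChildPath.push_back(static_cast<unsigned>(I));
      WorkList.push_back({&T->Children[I], std::move(ChildPath)});
    }
  }
  return Out;
}

// Usage check on a made-up record: { a; arr = { x, y }; b }.
inline void flattenToyDemo() {
  ToyType S{"S",
            {ToyType{"a", {}},
             ToyType{"arr", {ToyType{"x", {}}, ToyType{"y", {}}}},
             ToyType{"b", {}}}};
  std::vector<ToyAccess> Flat = flattenToy(S);
  assert(Flat.size() == 4);
  assert(Flat[0].first == "a" && Flat[1].first == "x");
  assert(Flat[2].first == "y" && Flat[3].first == "b");
}
```

The toy leaves out what the real function does for vector types, where one GEP is shared and each element is recorded with an extra element index, and it does not model base classes.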

clang/lib/CodeGen/CGExprAgg.cpp

Lines changed: 93 additions & 1 deletion
@@ -491,6 +491,79 @@ static bool isTrivialFiller(Expr *E) {
   return false;
 }

+// emit a flat cast where the RHS is a scalar, including vector
+static void EmitHLSLScalarFlatCast(CodeGenFunction &CGF, Address DestVal,
+                                   QualType DestTy, llvm::Value *SrcVal,
+                                   QualType SrcTy, SourceLocation Loc) {
+  // Flatten our destination
+  SmallVector<QualType, 16> DestTypes; // Flattened type
+  SmallVector<std::pair<Address, llvm::Value *>, 16> StoreGEPList;
+  // ^^ Flattened accesses to DestVal we want to store into
+  CGF.FlattenAccessAndType(DestVal, DestTy, StoreGEPList, DestTypes);
+
+  assert(SrcTy->isVectorType() && "HLSL Flat cast doesn't handle splatting.");
+  const VectorType *VT = SrcTy->getAs<VectorType>();
+  SrcTy = VT->getElementType();
+  assert(StoreGEPList.size() <= VT->getNumElements() &&
+         "Cannot perform HLSL flat cast when vector source \
+         object has less elements than flattened destination \
+         object.");
+  for (unsigned I = 0, Size = StoreGEPList.size(); I < Size; I++) {
+    llvm::Value *Load = CGF.Builder.CreateExtractElement(SrcVal, I, "vec.load");
+    llvm::Value *Cast =
+        CGF.EmitScalarConversion(Load, SrcTy, DestTypes[I], Loc);
+
+    // store back
+    llvm::Value *Idx = StoreGEPList[I].second;
+    if (Idx) {
+      llvm::Value *V =
+          CGF.Builder.CreateLoad(StoreGEPList[I].first, "load.for.insert");
+      Cast = CGF.Builder.CreateInsertElement(V, Cast, Idx);
+    }
+    CGF.Builder.CreateStore(Cast, StoreGEPList[I].first);
+  }
+  return;
+}
+
+// emit a flat cast where the RHS is an aggregate
+static void EmitHLSLElementwiseCast(CodeGenFunction &CGF, Address DestVal,
+                                    QualType DestTy, Address SrcVal,
+                                    QualType SrcTy, SourceLocation Loc) {
+  // Flatten our destination
+  SmallVector<QualType, 16> DestTypes; // Flattened type
+  SmallVector<std::pair<Address, llvm::Value *>, 16> StoreGEPList;
+  // ^^ Flattened accesses to DestVal we want to store into
+  CGF.FlattenAccessAndType(DestVal, DestTy, StoreGEPList, DestTypes);
+  // Flatten our src
+  SmallVector<QualType, 16> SrcTypes; // Flattened type
+  SmallVector<std::pair<Address, llvm::Value *>, 16> LoadGEPList;
+  // ^^ Flattened accesses to SrcVal we want to load from
+  CGF.FlattenAccessAndType(SrcVal, SrcTy, LoadGEPList, SrcTypes);
+
+  assert(StoreGEPList.size() <= LoadGEPList.size() &&
+         "Cannot perform HLSL flat cast when flattened source object \
+          has less elements than flattened destination object.");
+  // apply casts to what we load from LoadGEPList
+  // and store result in Dest
+  for (unsigned I = 0, E = StoreGEPList.size(); I < E; I++) {
+    llvm::Value *Idx = LoadGEPList[I].second;
+    llvm::Value *Load = CGF.Builder.CreateLoad(LoadGEPList[I].first, "load");
+    Load =
+        Idx ? CGF.Builder.CreateExtractElement(Load, Idx, "vec.extract") : Load;
+    llvm::Value *Cast =
+        CGF.EmitScalarConversion(Load, SrcTypes[I], DestTypes[I], Loc);
+
+    // store back
+    Idx = StoreGEPList[I].second;
+    if (Idx) {
+      llvm::Value *V =
+          CGF.Builder.CreateLoad(StoreGEPList[I].first, "load.for.insert");
+      Cast = CGF.Builder.CreateInsertElement(V, Cast, Idx);
+    }
+    CGF.Builder.CreateStore(Cast, StoreGEPList[I].first);
+  }
+}
+
 /// Emit initialization of an array from an initializer list. ExprToVisit must
 /// be either an InitListEpxr a CXXParenInitListExpr.
 void AggExprEmitter::EmitArrayInit(Address DestPtr, llvm::ArrayType *AType,
@@ -890,7 +963,25 @@ void AggExprEmitter::VisitCastExpr(CastExpr *E) {
   case CK_HLSLArrayRValue:
     Visit(E->getSubExpr());
     break;
-
+  case CK_HLSLElementwiseCast: {
+    Expr *Src = E->getSubExpr();
+    QualType SrcTy = Src->getType();
+    RValue RV = CGF.EmitAnyExpr(Src);
+    QualType DestTy = E->getType();
+    Address DestVal = Dest.getAddress();
+    SourceLocation Loc = E->getExprLoc();
+
+    if (RV.isScalar()) {
+      llvm::Value *SrcVal = RV.getScalarVal();
+      EmitHLSLScalarFlatCast(CGF, DestVal, DestTy, SrcVal, SrcTy, Loc);
+    } else {
+      assert(RV.isAggregate() &&
+             "Can't perform HLSL Aggregate cast on a complex type.");
+      Address SrcVal = RV.getAggregateAddress();
+      EmitHLSLElementwiseCast(CGF, DestVal, DestTy, SrcVal, SrcTy, Loc);
+    }
+    break;
+  }
   case CK_NoOp:
   case CK_UserDefinedConversion:
   case CK_ConstructorConversion:
@@ -1461,6 +1552,7 @@ static bool castPreservesZero(const CastExpr *CE) {
   case CK_NonAtomicToAtomic:
   case CK_AtomicToNonAtomic:
   case CK_HLSLVectorTruncation:
+  case CK_HLSLElementwiseCast:
     return true;

   case CK_BaseToDerivedMemberPointer:
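Both helpers added in this file share one shape: flatten the destination, pair its I-th slot with the I-th source element, convert each element to the slot's type, and ignore any surplus source elements. The sketch below restates that loop on already flattened toy data, under loud assumptions: `ToyScalarKind`, `ToyScalar`, `convertToyScalar`, `toyElementwiseCast` and `toyElementwiseCastDemo` are hypothetical names, values are stored as `double` purely for brevity, and the conversions are simplified stand-ins for what `EmitScalarConversion` emits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class ToyScalarKind { Int, Float, Bool };

// A flattened element: its scalar kind plus a payload kept as double.
struct ToyScalar {
  ToyScalarKind Kind;
  double Value;
};

// Simplified scalar conversion: truncate toward zero for Int, compare
// against zero for Bool, pass the value through for Float.
inline ToyScalar convertToyScalar(const ToyScalar &Src, ToyScalarKind Dest) {
  switch (Dest) {
  case ToyScalarKind::Int:
    return {Dest, static_cast<double>(static_cast<long long>(Src.Value))};
  case ToyScalarKind::Bool:
    return {Dest, Src.Value != 0.0 ? 1.0 : 0.0};
  case ToyScalarKind::Float:
    return {Dest, Src.Value};
  }
  return {Dest, Src.Value};
}

// Dest arrives with each slot's Kind set; its values are overwritten. The
// source may be longer than the destination, never shorter.
inline void toyElementwiseCast(std::vector<ToyScalar> &Dest,
                               const std::vector<ToyScalar> &Src) {
  assert(Dest.size() <= Src.size() &&
         "source has fewer elements than flattened destination");
  for (std::size_t I = 0; I < Dest.size(); ++I)
    Dest[I] = convertToyScalar(Src[I], Dest[I].Kind);
}

// Usage check: four floats cast into a three-slot { int, bool, float }.
inline void toyElementwiseCastDemo() {
  const ToyScalarKind F = ToyScalarKind::Float;
  std::vector<ToyScalar> Src = {{F, 1.5}, {F, 0.0}, {F, 3.0}, {F, 9.0}};
  std::vector<ToyScalar> Dest = {{ToyScalarKind::Int, 0.0},
                                 {ToyScalarKind::Bool, 0.0},
                                 {ToyScalarKind::Float, 0.0}};
  toyElementwiseCast(Dest, Src);
  assert(Dest[0].Value == 1.0 && Dest[1].Value == 0.0 && Dest[2].Value == 3.0);
}
```

The real code has two extra concerns the toy skips: elements that live inside a vector are read with an extract and written back with a load, insert and store sequence, and the scalar-source variant reads directly from a vector value rather than from memory.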

clang/lib/CodeGen/CGExprComplex.cpp

Lines changed: 1 addition & 0 deletions
@@ -610,6 +610,7 @@ ComplexPairTy ComplexExprEmitter::EmitCast(CastKind CK, Expr *Op,
   case CK_MatrixCast:
   case CK_HLSLVectorTruncation:
   case CK_HLSLArrayRValue:
+  case CK_HLSLElementwiseCast:
     llvm_unreachable("invalid cast kind for complex value");

   case CK_FloatingRealToComplex:

clang/lib/CodeGen/CGExprConstant.cpp

Lines changed: 1 addition & 0 deletions
@@ -1335,6 +1335,7 @@ class ConstExprEmitter
     case CK_MatrixCast:
     case CK_HLSLVectorTruncation:
     case CK_HLSLArrayRValue:
+    case CK_HLSLElementwiseCast:
       return nullptr;
     }
     llvm_unreachable("Invalid CastKind");
