-
Notifications
You must be signed in to change notification settings - Fork 788
[SYCL] Introduce min_work_groups_per_cu and max_work_groups_per_mp #11192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3f48159
to
ecf8358
Compare
2742daa
to
ff95dde
Compare
ff95dde
to
e0a9fcf
Compare
This is now ready for review, as llvm/llvm-project#66496 went into upstream llvm. Not sure what is the practice with changes depending on the upstream (e0a9fcfeec8606db7f70a5ea19ba44b2b0d18fd9) we could cherry-pick or make this PR wait till the pull-down. |
Unless there is urgency, I would prefer the PR wait until after the pulldown. |
Sure, that's fine with me. |
e0a9fcf
to
fd13ad4
Compare
fd13ad4
to
ed14966
Compare
ed14966
to
1264d97
Compare
The attributes match to CUDA's launch bounds minBlocksPerMultiprocessor and maxBlocksPerCluster respectively.
1264d97
to
bbe994d
Compare
@elizabethandrews @smanna12 is there anything else that needs to be address for this PR? |
clang/lib/Sema/SemaDeclAttr.cpp
Outdated
@@ -22,6 +22,7 @@ | |||
#include "clang/AST/Mangle.h" | |||
#include "clang/AST/RecursiveASTVisitor.h" | |||
#include "clang/AST/Type.h" | |||
#include "clang/Basic/AttributeCommonInfo.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you actually need to add this? Unless I'm missing something obvious, I thought we've been using AttributeCommonInfo
in this file prior to this PR ? I just assumed it was included by one of the other headers or something. What necessitated the addition in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies, that seems to be IDE residue.
clang/lib/Sema/SemaDeclAttr.cpp
Outdated
return AL.getLocation(); | ||
} | ||
template <typename T, | ||
std::enable_if_t<std::is_same_v<AttributeCommonInfo, T>, bool> = true> | ||
static SourceLocation getAttrLoc(const T &AL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this being called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In checkUInt32Argument
. And it is being called from https://github.com/jchlanda/llvm/blob/jakub/launch_bounds/clang/lib/Sema/SemaDeclAttr.cpp#L4466 with the const AttributeCommonInfo &CI
as AttrInfo
template parameter.
true /* StrictlyUnsigned */)) | ||
return; | ||
|
||
// Validate that we have an integer constant expression and then store the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@smanna12 please confirm but I do not think you need to do both checkUInt32Argument
and VerifyIntegerConstantExpression
. I could be wrong but I feel we're repeating code by calling both these functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps as a side note, VerifyIntegerConstantExpression
has this nice property (I'm slightly out of my depths with clang, so forgive if I misname the things) that it rewrites the AST, so for each use of the attributes, it always follows the same pattern:
ConstantExpr{{.*}}'int'
value: Int N
IntegerLiteral {{.*}} 'int' N
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you just get the Value from VerifyIntegerConstantExpression
and replicate this diagnostic from checkUInt32Argument
-
if (!I->isIntN(32)) {
S.Diag(Expr->getExprLoc(), diag::err_ice_too_large)
<< toString(*I, 10, false) << 32 << /* Unsigned */ 1;
return false;
}
if (StrictlyUnsigned && I->isSigned() && I->isNegative()) {
S.Diag(getAttrLoc(AI), diag::err_attribute_requires_positive_integer)
<< &AI << /*non-negative*/ 1;
return false;
}
I'm guessing there are other attributes which do similar things. @smanna will know better since she works with these a lot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, that's exactly how it used to be, I had it in a helper function used by both attributes: 9431687#diff-2a5bdb2d9f07f8d77de51d5403d349c22978141b6de6bd87fc5e449f5ed95becL4450-L4469
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea sorry for some confusion over this. I thought you would be able to just call checkUInt32Argument
(without calling VerifyIntegerConstantExpression
, but I am not so sure about this anymore. I've asked a colleague about this and will get back to you once he replies). In any case though, the issue with the code you first had and the current one now is with calling getIntegerConstantExpr
or it's equivalent twice. VerifyIntegerConstantExpression
should give you the value you need right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// This attribute requires a strictly positive value.
if (ArgVal <= 0) {
Diag(E->getExprLoc(), diag::err_attribute_requires_positive_integer)
<< CI << /*positive*/ 0;
return std::nullopt;
}
}
I'm guessing there are other attributes which do similar things. @smanna will know better since she works with these a lot.
SYCLIntelBankBitsAttr is doing similar thing in void Sema::AddSYCLIntelBankBitsAttr()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jchlanda I spoke to Aaron who has implemented/worked with a lot of attribute code in community clang to get a better understanding of when to call checkUInt32Argument
vs VerifyIntegerConstantExpression
. Apparently checkUInt32Argument
was implemented when template instantiations of arguments were not really a thing. Since we're now doing more with template instantiations, it makes sense to catch the constant expression (checkUInt32Argument
discards it after evaluating it) and keep it in AST. So VerifyIntegerConstantExpression
+ required diagnostics as suggested above is the right way to handle this. I apologize for the confusion here. This area needs some cleanup in community clang.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, not a problem at all. I got rid of the extra overload to make checkUInt32Argument
work, and encapsulated those check in a static helper function that is called instead of checkUInt32Argument
(while keeping VerifyIntegerConstantExpression
), it required some test changes, all commited here: acba95f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// This attribute requires a strictly positive value. if (ArgVal <= 0) { Diag(E->getExprLoc(), diag::err_attribute_requires_positive_integer) << CI << /*positive*/ 0; return std::nullopt; } }
I'm guessing there are other attributes which do similar things. @smanna will know better since she works with these a lot.
SYCLIntelBankBitsAttr is doing similar thing in void Sema::AddSYCLIntelBankBitsAttr()
Please stop using @smanna.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// This attribute requires a strictly positive value. if (ArgVal <= 0) { Diag(E->getExprLoc(), diag::err_attribute_requires_positive_integer) << CI << /*positive*/ 0; return std::nullopt; } }
I'm guessing there are other attributes which do similar things. @smanna will know better since she works with these a lot.
SYCLIntelBankBitsAttr is doing similar thing in void Sema::AddSYCLIntelBankBitsAttr()
Please stop using @smanna.
true /* StrictlyUnsigned */)) | ||
return; | ||
|
||
// Validate that we have an integer constant expression and then store the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you just get the Value from VerifyIntegerConstantExpression
and replicate this diagnostic from checkUInt32Argument
-
if (!I->isIntN(32)) {
S.Diag(Expr->getExprLoc(), diag::err_ice_too_large)
<< toString(*I, 10, false) << 32 << /* Unsigned */ 1;
return false;
}
if (StrictlyUnsigned && I->isSigned() && I->isNegative()) {
S.Diag(getAttrLoc(AI), diag::err_attribute_requires_positive_integer)
<< &AI << /*non-negative*/ 1;
return false;
}
I'm guessing there are other attributes which do similar things. @smanna will know better since she works with these a lot.
clang/test/SemaSYCL/lb_sm_70.cpp
Outdated
@@ -0,0 +1,60 @@ | |||
// RUN: %clang_cc1 -internal-isystem %S/Inputs -triple nvptx-unknown-unknown -target-cpu sm_70 -fsycl-is-device -Wno-c++23-extensions -S -emit-llvm %s -o -fsyntax-only -verify %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove the -S -emit-llvm here
Please do not tag me ***@***.***). You should be tagging @smanna12. I‘m getting spammed.On 1 Nov 2023, at 15:18, smanna12 ***@***.***> wrote:
@smanna12 commented on this pull request.
In clang/lib/CodeGen/Targets/NVPTX.cpp:
@@ -245,6 +245,32 @@ void NVPTXTargetCodeGenInfo::setTargetAttributes(
// And kernel functions are not subject to inlining
F->addFnAttr(llvm::Attribute::NoInline);
}
+ if (const auto *MWGS = FD->getAttr<SYCLIntelMaxWorkGroupSizeAttr>()) {
+ auto MaxThreads = (*MWGS->getZDimVal()).getExtValue() *
+ (*MWGS->getYDimVal()).getExtValue() *
+ (*MWGS->getXDimVal()).getExtValue();
+ if (MaxThreads > 0)
+ addNVVMMetadata(F, "maxntidx", MaxThreads);
+
+ auto attrValue = [&](Expr *E) {
+ const auto *CE = cast<ConstantExpr>(E);
+ std::optional<llvm::APInt> Val = CE->getResultAsAPSInt();
+ assert(Val.has_value() && "Failed to get attribute value.");
assert does not need here. cast<> will give an assert for null attribute value.
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: ***@***.***>
|
clang/lib/Sema/SemaDeclAttr.cpp
Outdated
template <typename AttrInfo> | ||
static std::enable_if_t<std::is_base_of_v<Attr, AttrInfo>, SourceLocation> | ||
getAttrLoc(const AttrInfo &AL) { | ||
/// Helper functions to provide Attribute Location for the Attr types, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes can be removed now right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, I removed the overload but did not remember about that change, fixed now.
|
||
constexpr float A = 2.0; | ||
// expected-warning@+5{{'maxclusterrank' requires sm_90 or higher, CUDA arch provided: sm_70, ignoring 'max_work_groups_per_mp' attribute}} | ||
// expected-error@+3 {{integral constant expression must have integral or unscoped enumeration type, not 'float'}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh. Not a huge fan of the new diagnostics but I guess we can't do anything since I assume these are thrown from VerifyIntegerConstantExpression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's generated in VerifyIntegerConstantExpression
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thank you for making all the changes!
Not a problem, than you for a thorough review :) |
The attributes match to CUDA's launch bounds
minBlocksPerMultiprocessor
and
maxBlocksPerCluster
respectively. See: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#cluster-dimension-directives-maxclusterrank andhttps://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#launch-bounds for details.