-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[OMPIRBuilder] - Handle dependencies in createTarget
#93977
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[OMPIRBuilder] - Handle dependencies in createTarget
#93977
Conversation
…ased target codegen path in OMPIRBuilder
…ttests(?) and lit tests - basically more testing before PR
ping! Gentle reminder, I'd appreciate reviews on this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pranavb-ca! I did a first round and have a few comments.
DepArray = Builder.CreateAlloca(DepArrayTy, nullptr, ".dep.arr.addr"); | ||
|
||
unsigned P = 0; | ||
for (const OpenMPIRBuilder::DependData &Dep : Dependencies) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see there is a similar loop in OpenMPIRBuilder::createTask
. Can we outline this to a shared util used in both locations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, my intention was to switch OpenMPIRBuilder::createTask
to use this function but i had a couple of other changes to OpenMPIRBuilder::createTask
in mind so I decided to leave that out and roll it into a different PR. If you prefer, I can make the switch in this PR itself though. Just a preference, no strong opinions on this.
RTLoc, TargetTaskAllocaIP)); | ||
|
||
OI.ExitBB = Builder.saveIP().getBlock(); | ||
OI.PostOutlineCB = [this, ToBeDeleted, Dependencies, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I there are pieces in this lambda that can be outlined into shared utils and called from here and OpenMPIRBuilder::createTask
. For example, the logic to setup TaskData
(but, other parts as well).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is an active Todo which I was planning to add in a different PR because now this PR doesn't touch the handling of the task
construct at all. But that's only a preference, I'll be happy to roll those changes into this PR too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This passes ToBeDeleted
and Dependencies
by-value to the lambda, i.e. makes a copy of those lists. If called multiple times, items in ToBeDeleted
will be erased multiple times. I assume this is not intended?
OI.PostOutlineCB = [this, ToBeDeleted, Dependencies, | |
OI.PostOutlineCB = [this, &ToBeDeleted, &Dependencies, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This passes
ToBeDeleted
andDependencies
by-value to the lambda, i.e. makes a copy of those lists. If called multiple times, items inToBeDeleted
will be erased multiple times. I assume this is not intended?
In fact a copy is necessary to "own" the data. PostOutlineCB
is called only once so multiple deletions is not a problem. However, and what is more important to note, is that it is called long after ToBeDeleted
is destroyed (that is after we have returned from emitTargetTask
) making it necessary to keep a copy around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you Pranav, I think this looks good. I just have a few minor comments.
dds.emplace_back(dd); | ||
} | ||
} | ||
if (!taskOp.getDependVars().empty() && taskOp.getDepends()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I'd suggest moving this check into buildDependData
, since it always needs to be done. It would just return without adding anything to dds
if the condition is not met.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this comment hasn't been addressed. Let me know if you just missed it, if you disagree with it or if it isn't very clear and I should try to explain better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a full a review, but the first notes that I started with
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conceptually, looks quite good. Just some style comments.
AllocaInst *ArgStructAlloca = | ||
dyn_cast<AllocaInst>(StaleCI->getArgOperand(1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AllocaInst *ArgStructAlloca = | |
dyn_cast<AllocaInst>(StaleCI->getArgOperand(1)); | |
auto *ArgStructAlloca = | |
dyn_cast<AllocaInst>(StaleCI->getArgOperand(1)); |
while (!ToBeDeleted.empty()) { | ||
ToBeDeleted.top()->eraseFromParent(); | ||
ToBeDeleted.pop(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while (!ToBeDeleted.empty()) { | |
ToBeDeleted.top()->eraseFromParent(); | |
ToBeDeleted.pop(); | |
} | |
llvm::for_each(ToBeDeleted, [](Instruction *I) { I->ereaseFromParent(); } ); |
or a simple foreach-loop. ToBeDeleted
goes out of scope anyway.
OI.OuterAllocaBB = AllocaIP.getBlock(); | ||
|
||
// Add the thread ID argument. | ||
std::stack<Instruction *> ToBeDeleted; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::stack<Instruction *> ToBeDeleted; | |
SmallVector<Instruction *> ToBeDeleted; |
std::stack
uses std::deque
(by default), which is quite malloc-heavy. From cppreference:
deques typically have large minimal memory cost; a deque holding just one element has to allocate its full internal array (e.g. 8 times the object size on 64-bit libstdc++; 16 times the object size or 4096 bytes, whichever is larger, on 64-bit libc++).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Meinersbur -> The only reason, I didnt do it is because createFakeIntVal
is used by a few other functions not related to this PR. So changing ToBeDeleted
from std::stack
to SmallVector
would involve changing other functions (such as createTeams
) which are not related to this PR. I could go ahead and add that to this PR. Or simply open a separate PR just to make this change which should be quick. Let me know what you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Meinersbur - Please take a look. I have gone ahead and made this change. If you agree with my reason above then I'll simply back out that one commit and make it a new PR.
static void | ||
buildDependData(std::optional<ArrayAttr> depends, OperandRange dependVars, | ||
LLVM::ModuleTranslation &moduleTranslation, | ||
SmallVector<llvm::OpenMPIRBuilder::DependData> &dds) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SmallVector<llvm::OpenMPIRBuilder::DependData> &dds) { | |
SmallVectorImpl<llvm::OpenMPIRBuilder::DependData> &dds) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
… depend clause (sibling tasks)
@Meinersbur, @ergawy, @skatrak - Could you please give this another look. I have updated the testcase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on my previous comments. I just have a couple more minor ones.
dds.emplace_back(dd); | ||
} | ||
} | ||
if (!taskOp.getDependVars().empty() && taskOp.getDepends()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this comment hasn't been addressed. Let me know if you just missed it, if you disagree with it or if it isn't very clear and I should try to explain better.
@@ -3088,10 +3102,13 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder, | |||
if (!mapData.IsDeclareTarget[i] && !mapData.IsAMember[i]) | |||
kernelInput.push_back(mapData.OriginalValue[i]); | |||
} | |||
SmallVector<llvm::OpenMPIRBuilder::DependData> dds; | |||
buildDependData(targetOp.getOperation(), moduleTranslation, dds); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skatrak - Apologies for missing your previous comment about moving the empty depends list check into buildDependData
. I have done that now in this iteration of the PR.
@skatrak - I have updated the PR with changes based on your feedback. Could you please take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you Pranav, LGTM. I have just one more comment but no need to wait for another review by me before merging this. Just make sure @Meinersbur is happy with it as well!
@@ -682,6 +682,41 @@ convertOmpTeams(omp::TeamsOp op, llvm::IRBuilderBase &builder, | |||
return bodyGenStatus; | |||
} | |||
|
|||
static void | |||
buildDependData(Operation *op, LLVM::ModuleTranslation &moduleTranslation, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest taking std::optional<ArrayAttr> depends
and OperandRange dependVars
as arguments instead of Operation *op
here. This way you don't need to try casting op
to each supported operation below, but rather make the body of this function what the processDepends
lambda inside currently has.
dds.emplace_back(dd); | ||
} | ||
} | ||
buildDependData(taskOp.getOperation(), moduleTranslation, dds); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion to how this call could look if following my comment above:
buildDependData(taskOp.getOperation(), moduleTranslation, dds); | |
buildDependData(moduleTranslation, taskOp.getDepends(), taskOp.getDependVars(), dds); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Summary: This patch handles dependencies specified by the `depend` clause on an OpenMP target construct. It does this much the same way clang does it by materializing an OpenMP `task` that is tagged with the dependencies. The following functions are relevant to this patch - 1) `createTarget` - This function itself is largely unchanged except that it now accepts a vector of `DependData` objects that it simply forwards to `emitTargetCall` 2) `emitTargetCall` - This function has changed now to check if an outer target-task needs to be materialized (i.e if `target` construct has `nowait` or has `depend` clause). If yes, it calls `emitTargetTask` to do all the heavy lifting for creating and dispatching the task. 3) `emitTargetTask` - Bulk of the change is here. See the large comment explaining what it does at the beginning of this function Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60251117
This PR handles dependencies specified by the
depend
clause on an OpenMP target construct. It does this much the same way clang does it by materializing an OpenMPtask
that is tagged with the dependencies.To aid review, here are the relevant implementation details
createTarget
- This function itself is largely unchanged except that it now accepts a vector ofDependData
objects that it simply forwards toemitTargetCall
emitTargetCall
- This function has changed now to check if an outer target-task needs to be materialized (Yes, if target construct hasnowait
or hasdepend
clause). If yes, it callsemitTargetTask
to do all the heavy lifting for creating and dispatching the task.emitTargetTask
- Bulk of the PR is here. See the large comment explaining what it does at the beginning of the function.Things that will/should follow in subsequent PRs
createTask
andemitTargetTask