-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[AMDGPU] Set glc/slc on volatile/nontemporal SMEM loads #77443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5803,7 +5803,32 @@ in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx6-gfx9-table`. | |
|
||
- Must happen before | ||
any following volatile | ||
global/generic | ||
global/generic/private/constant | ||
load/store. | ||
- Ensures that | ||
volatile | ||
operations to | ||
different | ||
addresses will not | ||
be reordered by | ||
hardware. | ||
|
||
load *none* *none* - constant - !volatile & !nontemporal | ||
|
||
1. s_load/s_buffer_load | ||
|
||
- !volatile & nontemporal | ||
|
||
1. s_load/s_buffer_load glc=1 slc=1 | ||
|
||
- volatile | ||
|
||
1. s_load/s_buffer_load glc=1 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The VMEM case adds a waitcnt. Do we need to add There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We need a waitcny lgkm(0) here for the same reason. It is waiting for the proceeding scalar load to complete before continuing. That ensures each volatile memory is complete before moving to the next one. There is no need to wait for VMEM as any previous VMEM will have been followed by its own waitcnt vmem(0). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK. I have documented that but I don't understand There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the spec looks fine to me. But you must admit that the concept of a volatile constant load is a bit meaningless. Not sure how it could ever happen. By definition a volatile location cannot be constant. That may be why I left scalar out of the memory model description. Presumably this cahnge needs to be done to the memory model specification for the other targets too. |
||
2. s_waitcnt lgkmcnt(0) | ||
|
||
- Must happen before | ||
any following volatile | ||
global/generic/private/constant | ||
load/store. | ||
- Ensures that | ||
volatile | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -29,7 +29,6 @@ class SM_Pseudo <string opName, dag outs, dag ins, string asmOps, list<dag> patt | |
let mayStore = 0; | ||
let mayLoad = 1; | ||
let hasSideEffects = 0; | ||
let maybeAtomic = 0; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How does removing this cause SIMemoryLegalizer to process scalar loads and stores? Should the memory model in AMDGPUUsage be updated to include the scalar memory instructions? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Not setting
Probably. Are you saying that should be a prerequisite for this patch? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I think if we change the implementation of the memory model we should ensure the ABI documentation is updated to match. I think only the non-atomic rows need to be updated to include the SMEM instructions since we do not support scalar atomics. |
||
let UseNamedOperandTable = 1; | ||
let SchedRW = [WriteSMEM]; | ||
|
||
|
@@ -248,7 +247,6 @@ class SM_Atomic_Pseudo <string opName, | |
// Should these be set? | ||
let ScalarStore = 1; | ||
let hasSideEffects = 1; | ||
let maybeAtomic = 1; | ||
|
||
let IsAtomicNoRet = !not(isRet); | ||
let IsAtomicRet = isRet; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should say "constant address space and only if the compiler can prove that the address is uniform", but I'm not sure where to fit that into the table.
Should I mention scalar stores in the table? The text above says "Since the constant address space contents do not change during the execution of a kernel dispatch it is not legal to perform
stores".
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK we do not generate scalar stores as the instruction was removed from the isa.
I do not think you need to say "only if the address is uniform" as the compiler cannot generate scalar loads unless that condition is already met.