Commit 9abb3a9

Pierre-vh authored and bcahoon committed
[AMDGPU][SIMemoryLegalizer] Fix order of GL0/1_INV on GFX10/11 (llvm#81450)
Fixes SWDEV-443292

Change-Id: I2eeb68b9d82a560683a96efb0207e82a93de901a
1 parent 0001782 commit 9abb3a9

17 files changed: 1574 additions, 1572 deletions

llvm/docs/AMDGPUUsage.rst

Lines changed: 16 additions & 16 deletions
@@ -12100,8 +12100,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 before invalidating
 the caches.

-3. buffer_gl0_inv;
-buffer_gl1_inv
+3. buffer_gl1_inv;
+buffer_gl0_inv

 - Must happen before
 any following
@@ -12130,8 +12130,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 before invalidating
 the caches.

-3. buffer_gl0_inv;
-buffer_gl1_inv
+3. buffer_gl1_inv;
+buffer_gl0_inv

 - Must happen before
 any following
@@ -12237,8 +12237,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 invalidating the
 caches.

-3. buffer_gl0_inv;
-buffer_gl1_inv
+3. buffer_gl1_inv;
+buffer_gl0_inv

 - Must happen before
 any following
@@ -12268,8 +12268,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 invalidating the
 caches.

-3. buffer_gl0_inv;
-buffer_gl1_inv
+3. buffer_gl1_inv;
+buffer_gl0_inv

 - Must happen before
 any following
@@ -12464,8 +12464,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 the
 fence-paired-atomic.

-2. buffer_gl0_inv;
-buffer_gl1_inv
+2. buffer_gl1_inv;
+buffer_gl0_inv

 - Must happen before any
 following global/generic
@@ -13178,8 +13178,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 invalidating the
 caches.

-4. buffer_gl0_inv;
-buffer_gl1_inv
+4. buffer_gl1_inv;
+buffer_gl0_inv

 - Must happen before
 any following
@@ -13253,8 +13253,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 invalidating the
 caches.

-4. buffer_gl0_inv;
-buffer_gl1_inv
+4. buffer_gl1_inv;
+buffer_gl0_inv

 - Must happen before
 any following
@@ -13481,8 +13481,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 requirements of
 release.

-2. buffer_gl0_inv;
-buffer_gl1_inv
+2. buffer_gl1_inv;
+buffer_gl0_inv

 - Must happen before
 any following

llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp

Lines changed: 4 additions & 1 deletion
@@ -2030,8 +2030,11 @@ bool SIGfx10CacheControl::insertAcquire(MachineBasicBlock::iterator &MI,
     switch (Scope) {
     case SIAtomicScope::SYSTEM:
     case SIAtomicScope::AGENT:
-      BuildMI(MBB, MI, DL, TII->get(AMDGPU::BUFFER_GL0_INV));
+      // The order of invalidates matter here. We must invalidate "outer in"
+      // so L1 -> L0 to avoid L0 pulling in stale data from L1 when it is
+      // invalidated.
       BuildMI(MBB, MI, DL, TII->get(AMDGPU::BUFFER_GL1_INV));
+      BuildMI(MBB, MI, DL, TII->get(AMDGPU::BUFFER_GL0_INV));
       Changed = true;
       break;
     case SIAtomicScope::WORKGROUP:
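
The "outer in" reasoning in the new comment can be illustrated with a toy two-level cache model. This is only a sketch and not part of the commit: `ToyCacheHierarchy`, `InvalidateOrder`, and `loadAfterAcquire` are hypothetical names, and the load issued between the two invalidates is an assumed stand-in for whatever could refill L0 from L1 on real hardware. The model does not attempt to reflect actual GFX10/11 cache behaviour.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Toy model (hypothetical, for illustration only): memory plus two cache
// levels, where L0 is the inner cache and L1 is the outer cache.
struct ToyCacheHierarchy {
  std::unordered_map<uint32_t, int> Memory;
  std::unordered_map<uint32_t, int> L1; // stands in for GL1 (outer)
  std::unordered_map<uint32_t, int> L0; // stands in for GL0 (inner)

  // Look up L0, then L1, then memory, filling the caches on the way back.
  int load(uint32_t Addr) {
    auto InL0 = L0.find(Addr);
    if (InL0 != L0.end())
      return InL0->second;
    int Value;
    auto InL1 = L1.find(Addr);
    if (InL1 != L1.end()) {
      Value = InL1->second;
    } else {
      Value = Memory.at(Addr);
      L1[Addr] = Value;
    }
    L0[Addr] = Value;
    return Value;
  }

  void invalidateL0() { L0.clear(); }
  void invalidateL1() { L1.clear(); }
};

enum class InvalidateOrder { InnerFirst, OuterFirst };

// Returns the value seen by the first load after both invalidates, given
// that an assumed extra load slips in between the two invalidates.
int loadAfterAcquire(InvalidateOrder Order) {
  const uint32_t Addr = 0x100;
  ToyCacheHierarchy H;
  H.Memory[Addr] = 1;
  int Warm = H.load(Addr); // both cache levels now hold the old value 1
  assert(Warm == 1);
  (void)Warm;
  H.Memory[Addr] = 2; // another agent publishes a new value

  if (Order == InvalidateOrder::InnerFirst) {
    H.invalidateL0();
    H.load(Addr); // L0 misses and refills from the still-stale L1
    H.invalidateL1();
  } else {
    H.invalidateL1();
    H.load(Addr); // L0 still hits; nothing is pulled from L1
    H.invalidateL0();
  }
  return H.load(Addr);
}
```

In this model, `InvalidateOrder::InnerFirst` (the old GL0-then-GL1 sequence) leaves the stale value in L0, while `InvalidateOrder::OuterFirst` (the new GL1-then-GL0 sequence) forces the final load to go all the way to memory.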
