Commit da61c86

[RFC] Introduce convergence control intrinsics
This is a reboot of the original design and implementation by
Nicolai Haehnle <[email protected]>:
https://reviews.llvm.org/D85603

This change also obsoletes an earlier attempt at restarting the work on
convergence tokens:
https://reviews.llvm.org/D104504

Changes relative to D85603:

1. Clean up the definition of a "convergent operation", a convergent
   call and convergent function.
2. Clean up the relationship between dynamic instances, sets of threads
   and convergence tokens.
3. Redistribute the formal rules into the definitions of the convergence
   intrinsics.
4. Expand on the semantics of entering a function from outside LLVM,
   and the environment-defined outcome of the entry intrinsic.
5. Replace the term "cycle" with "closed path". The static rules are
   defined in terms of closed paths, and then a relation is established
   with cycles.
6. Specify that if a function contains a controlled convergent
   operation, then all convergent operations in that function must be
   controlled.
7. Describe an optional procedure to infer tokens for uncontrolled
   convergent operations.
8. Introduce controlled maximal convergence-before and controlled
   m-converged property as an update to the original properties in
   UniformityAnalysis.
9. Additional constraint that a cycle heart can only occur in the
   header of a reducible cycle (natural loop).

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D147116
1 parent e36dd3e commit da61c86

22 files changed: 2302 additions, 109 deletions

llvm/docs/ConvergenceAndUniformity.rst

Lines changed: 42 additions & 37 deletions
@@ -1,3 +1,5 @@
+.. _convergence-and-uniformity:
+
 ==========================
 Convergence And Uniformity
 ==========================
@@ -82,6 +84,8 @@ Diverged path
 either reaches a join node of the branch or reaches the end of the
 function without passing through any join node of the branch.

+.. _convergence-dynamic-instances:
+
 Threads and Dynamic Instances
 =============================

@@ -135,7 +139,7 @@ instance*. Informally, two threads that produce converged dynamic
 instances are said to be *converged*, and they are said to execute
 that static instance *convergently*, at that point in the execution.

-*Convergence order* is a strict partial order over dynamic instances
+*Convergence-before* is a strict partial order over dynamic instances
 that is defined as the transitive closure of:

 1. If dynamic instance ``P`` is executed strictly before ``Q`` in the
@@ -171,40 +175,26 @@ The fact that *convergence-before* is a strict partial order is a
 constraint on the *converged-with* relation. It is trivially satisfied
 if different dynamic instances are never converged. It is also
 trivially satisfied for all known implementations for which
-convergence plays some role. Aside from the strict partial convergence
-order, there are currently no additional constraints on the
-*converged-with* relation imposed in LLVM IR.
+convergence plays some role.

 .. _convergence-note-convergence:

 .. note::

-   1. The ``convergent`` attribute on convergent operations does
-      constrain changes to ``converged-with``, but it is expressed in
-      terms of control flow and does not explicitly deal with thread
-      convergence.
-
-   2. The convergence-before relation is not
+   1. The convergence-before relation is not
       directly observable. Program transforms are in general free to
       change the order of instructions, even though that obviously
       changes the convergence-before relation.

-   3. Converged dynamic instances need not be executed at the same
+   2. Converged dynamic instances need not be executed at the same
       time or even on the same resource. Converged dynamic instances
       of a convergent operation may appear to do so but that is an
-      implementation detail. The fact that ``P`` is convergence-before
+      implementation detail.
+
+   3. The fact that ``P`` is convergence-before
       ``Q`` does not automatically imply that ``P`` happens-before
       ``Q`` in a memory model sense.

-   4. **Future work:** Providing convergence-related guarantees to
-      compiler frontends enables some powerful optimization techniques
-      that can be used by programmers or by high-level program
-      transforms. Constraints on the ``converged-with`` relation may
-      be added eventually as part of the definition of LLVM
-      IR, so that guarantees can be made that frontends can rely on.
-      For a proposal on how this might work, see `D85603
-      <https://reviews.llvm.org/D85603>`_.
-
 .. _convergence-maximal:

 Maximal Convergence
@@ -217,8 +207,11 @@ relation is reasonable for real targets and is compatible with
 convergent operations.

 The maximal converged-with relation is defined in terms of cycle
-headers, which are not unique to a given CFG. Each cycle hierarchy for
-the same CFG results in a different maximal converged-with relation.
+headers, with the assumption that threads converge at the header on every
+"iteration" of the cycle. Informally, two threads execute the same iteration of
+a cycle if they both previously executed the cycle header the same number of
+times after they entered that cycle. In general, this needs to account for the
+iterations of parent cycles as well.

 **Maximal converged-with:**

@@ -235,6 +228,10 @@ the same CFG results in a different maximal converged-with relation.

 .. note::

+   Cycle headers may not be unique to a given CFG if it is irreducible. Each
+   cycle hierarchy for the same CFG results in a different maximal
+   converged-with relation.
+
    For brevity, the rest of the document restricts the term
    *converged* to mean "related under the maximal converged-with
    relation for the given cycle hierarchy".
@@ -269,7 +266,7 @@ Maximal convergence can now be demonstrated in the earlier example as follows:
 Dependence on Cycles Headers
 ----------------------------

-Contradictions in convergence order are possible only between two
+Contradictions in *convergence-before* are possible only between two
 nodes that are inside some cycle. The dynamic instances of such nodes
 may be interleaved in the same thread, and this interleaving may be
 different for different threads.
@@ -427,6 +424,8 @@ any use ``U`` outside the cycle receives a value from non-converged
 dynamic instances of ``N``. An output of ``U`` may be divergent,
 depending on the semantics of the instruction.

+.. _uniformity-analysis:
+
 Static Uniformity Analysis
 ==========================

@@ -458,20 +457,14 @@ hierarchy:


 Each node ``X`` in a given CFG is reported to be m-converged if and
-only if:
-
-1. ``X`` is a :ref:`top-level<cycle-toplevel-block>` node, in which
-   case, there are no cycle headers to influence the convergence of
-   ``X``.
+only if every cycle that contains ``X`` satisfies the following necessary
+conditions:

-2. Otherwise, if ``X`` is inside a cycle, then every cycle that
-   contains ``X`` satisfies the following necessary conditions:
-
-   a. Every divergent branch inside the cycle satisfies the
-      :ref:`diverged entry criterion<convergence-diverged-entry>`, and,
-   b. There are no :ref:`diverged paths reaching the
-      cycle<convergence-diverged-outside>` from a divergent branch
-      outside it.
+1. Every divergent branch inside the cycle satisfies the
+   :ref:`diverged entry criterion<convergence-diverged-entry>`, and,
+2. There are no :ref:`diverged paths reaching the
+   cycle<convergence-diverged-outside>` from a divergent branch
+   outside it.

 .. note::

@@ -700,3 +693,15 @@ Clearly, this can be determined only in a cycle hierarchy ``T`` where
 in a different cycle hierarchy ``T'`` where ``C`` is part of a larger
 cycle ``C'`` with the same header, but this does not contradict the
 conclusion in ``T``.
+
+Controlled Convergence
+======================
+
+:ref:`Convergence control tokens <dynamic_instances_and_convergence_tokens>`
+provide an explicit semantics for determining which threads are converged at a
+given point in the program. The impact of this is incorporated in a
+:ref:`controlled maximal converged-with <controlled_maximal_converged_with>`
+relation over dynamic instances and a :ref:`controlled m-converged
+<controlled_m_converged>` property of static instances. The :ref:`uniformity
+analysis <uniformity-analysis>` implemented in LLVM includes this for targets
+that support convergence control tokens.
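
Reviewer illustration (not part of the commit): the Maximal Convergence hunk above says, informally, that two threads execute the same iteration of a cycle if they executed the cycle header the same number of times after entering it. The Python sketch below is one way to read that sentence for a single cycle. The function names, the trace representation (a list of block names per thread) and the example CFG are all hypothetical. The sketch deliberately ignores parent cycles, which the text says must be accounted for in general, and it is not how LLVM implements uniformity analysis.

```python
def iteration_labels(trace, header, cycle_blocks):
    """Pair each block in a thread's trace with the number of times the
    thread has executed `header` since it last entered the cycle.
    Blocks outside the cycle are paired with None."""
    labels = []
    count = 0
    inside = False
    for block in trace:
        if block not in cycle_blocks:
            # Leaving the cycle resets the per-entry header count.
            inside = False
            count = 0
            labels.append((block, None))
            continue
        if not inside:
            inside = True
            count = 0
        if block == header:
            count += 1
        labels.append((block, count))
    return labels


def same_iteration(trace_a, idx_a, trace_b, idx_b, header, cycle_blocks):
    """True if the two indexed trace entries are the same block, executed
    in the same iteration of the cycle (single-cycle approximation)."""
    block_a, iter_a = iteration_labels(trace_a, header, cycle_blocks)[idx_a]
    block_b, iter_b = iteration_labels(trace_b, header, cycle_blocks)[idx_b]
    return block_a == block_b and iter_a is not None and iter_a == iter_b


# Hypothetical cycle {H, B, C} with header H; two threads take different
# paths through the body on each iteration.
cycle = {"H", "B", "C"}
t1 = ["entry", "H", "B", "H", "C", "exit"]
t2 = ["entry", "H", "C", "H", "B", "exit"]
```

With these traces, both threads visit ``H`` in iteration 1 and again in iteration 2, so those visits pair up. Thread 1 executes ``B`` in iteration 1 while thread 2 executes it in iteration 2, so under this reading the two executions of ``B`` are not in the same iteration.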
