rust-lang
diff --git a/‎llvm/docs/LangRef.rst
Lines changed: 2 additions & 0 deletions b/‎llvm/docs/LangRef.rst
Lines changed: 2 additions & 0 deletions
diff --git a/‎llvm/docs/LoopTerminology.rst
Lines changed: 218 additions & 96 deletions b/‎llvm/docs/LoopTerminology.rst
Lines changed: 218 additions & 96 deletions
@@ -20066,6 +20066,8 @@ not overflow at link time under the medium code model if ``x`` is an
 a constant initializer folded into a function body. This intrinsic can be
 used to avoid the possibility of overflows when loading from such a constant.
 
+.. _llvm_sideeffect:
+
 '``llvm.sideeffect``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 
@@ -7,119 +7,241 @@ LLVM Loop Terminology (and Canonical Forms)
 .. contents::
    :local:
 
-Introduction
-============
+Loop Definition
+===============
 
-Loops are a core concept in any optimizer.  This page spells out some
-of the common terminology used within LLVM code to describe loop
-structures.
+Loops are an important concept for a code optimizer. In LLVM, detection
+of loops in a control-flow graph is done by :ref:`loopinfo`. It is based
+on the following definition.
 
-First, let's start with the basics. In LLVM, a Loop is a maximal set of basic
-blocks that form a strongly connected component (SCC) in the Control
-Flow Graph (CFG) where there exists a dedicated entry/header block that
-dominates all other blocks within the loop. Thus, without leaving the
-loop, one can reach every block in the loop from the header block and
-the header block from every block in the loop.
+A loop is a subset of nodes from the control-flow graph (CFG; where
+nodes represent basic blocks) with the following properties:
 
-Note that there are some important implications of this definition:
+1. The induced subgraph (which is the subgraph that contains all the
+   edges from the CFG within the loop) is strongly connected
+   (every node is reachable from all others).
 
-* Not all SCCs are loops.  There exist SCCs that do not meet the
-  dominance requirement and such are not considered loops.  
+2. All edges from outside the subset into the subset point to the same
+   node, called the **header**. As a consequence, the header dominates
+   all nodes in the loop (i.e. every execution path to any of the loop's
+   node will have to pass through the header).
 
-* Loops can contain non-loop SCCs and non-loop SCCs may contain
-  loops.  Loops may also contain sub-loops.
+3. The loop is the maximum subset with these properties. That is, no
+   additional nodes from the CFG can be added such that the induced
+   subgraph would still be strongly connected and the header would
+   remain the same.
 
-* A header block is uniquely associated with one loop.  There can be
-  multiple SCC within that loop, but the strongly connected component
-  (SCC) formed from their union must always be unique.
+In computer science literature, this is often called a *natural loop*.
+In LLVM, this is the only definition of a loop.
 
-* Given the use of dominance in the definition, all loops are
-  statically reachable from the entry of the function.  
 
-* Every loop must have a header block, and some set of predecessors
-  outside the loop.  A loop is allowed to be statically infinite, so
-  there need not be any exiting edges.
+Terminology
+-----------
 
-* Any two loops are either fully disjoint (no intersecting blocks), or
-  one must be a sub-loop of the other.
+The definition of a loop comes with some additional terminology:
 
-* Loops in a function form a forest. One implication of this fact
-  is that a loop either has no parent or a single parent.
+* An **entering block** (or **loop predecessor**) is a non-loop node
+  that has an edge into the loop (necessarily the header). If there is
+  only one entering block entering block, and its only edge is to the
+  header, it is also called the loop's **preheader**. The preheader
+  dominates the loop without itself being part of the loop.
 
-A loop may have an arbitrary number of exits, both explicit (via
-control flow) and implicit (via throwing calls which transfer control
-out of the containing function).  There is no special requirement on
-the form or structure of exit blocks (the block outside the loop which
-is branched to).  They may have multiple predecessors, phis, etc...
+* A **latch** is a loop node that has an edge to the header.
 
-Key Terminology
-===============
+* A **backedge** is an edge from a latch to the header.
+
+* An **exiting edge** is an edge from inside the loop to a node outside
+  of the loop. The source of such an edge is called an **exiting block**, its
+  target is an **exit block**.
+
+.. image:: ./loop-terminology.svg
+   :width: 400 px
+
+
+Important Notes
+---------------
+
+This loop definition has some noteworthy consequences:
+
+* A node can be the header of at most one loop. As such, a loop can be
+  identified by its header. Due to the header being the only entry into
+  a loop, it can be called a Single-Entry-Multiple-Exits (SEME) region.
+
+
+* For basic blocks that are not reachable from the function's entry, the
+  concept of loops is undefined. This follows from the concept of
+  dominance being undefined as well.
+
+
+* The smallest loop consists of a single basic block that branches to
+  itself. In this case that block is the header, latch (and exiting
+  block if it has another edge to a different block) at the same time.
+  A single block that has no branch to itself is not considered a loop,
+  even though it is trivially strongly connected.
+
+.. image:: ./loop-single.svg
+   :width: 300 px
+
+In this case, the role of header, exiting block and latch fall to the
+same node. :ref:`loopinfo` reports this as:
+
+.. code-block:: console
+
+  $ opt input.ll -loops -analyze
+  Loop at depth 1 containing: %for.body<header><latch><exiting>
 
-**Header Block** - The basic block which dominates all other blocks
-contained within the loop.  As such, it is the first one executed if
-the loop executes at all.  Note that a block can be the header of
-two separate loops at the same time, but only if one is a sub-loop
-of the other.
-
-**Exiting Block** - A basic block contained within a given loop which has
-at least one successor outside of the loop and one successor inside the
-loop.  (The latter is a consequence of the block being contained within
-an SCC which is part of the loop.)  That is, it has a successor which
-is an Exit Block.  
-
-**Exit Block** - A basic block outside of the associated loop which has a
-predecessor inside the loop.  That is, it has a predecessor which is
-an Exiting Block.
-
-**Latch Block** - A basic block within the loop whose successors include
-the header block of the loop.  Thus, a latch is a source of backedge.
-A loop may have multiple latch blocks.  A latch block may be either
-conditional or unconditional.
-
-**Backedge(s)** - The edge(s) in the CFG from latch blocks to the header
-block.  Note that there can be multiple such edges, and even multiple
-such edges leaving a single latch block.  
-
-**Loop Predecessor** -  The predecessor blocks of the loop header which
-are not contained by the loop itself.  These are the only blocks
-through which execution can enter the loop.  When used in the
-singular form implies that there is only one such unique block. 
-
-**Preheader Block** - A preheader is a (singular) loop predecessor which
-ends in an unconditional transfer of control to the loop header.  Note
-that not all loops have such blocks.
-
-**Backedge Taken Count** - The number of times the backedge will execute
-before some interesting event happens.  Commonly used without
-qualification of the event as a shorthand for when some exiting block
-branches to some exit block. May be zero, or not statically computable.
-
-**Iteration Count** - The number of times the header will execute before
-some interesting event happens.  Commonly used without qualification to
-refer to the iteration count at which the loop exits.  Will always be
-one greater than the backedge taken count.  *Warning*: Preceding
-statement is true in the *integer domain*; if you're dealing with fixed
-width integers (such as LLVM Values or SCEVs), you need to be cautious
-of overflow when converting one to the other.
-
-It's important to note that the same basic block can play multiple
-roles in the same loop, or in different loops at once.  For example, a
-single block can be the header for two nested loops at once, while
-also being an exiting block for the inner one only, and an exit block
-for a sibling loop.  Example:
+
+* Loops can be nested inside each other. That is, a loop's node set can
+  be a subset of another loop with a different loop header. The loop
+  hierarchy in a function forms a forest: Each top-level loop is the
+  root of the tree of the loops nested inside it.
+
+.. image:: ./loop-nested.svg
+   :width: 350 px
+
+
+* It is not possible that two loops share only a few of their nodes.
+  Two loops are either disjoint or one is nested inside the other. In
+  the example below the left and right subsets both violate the
+  maximality condition. Only the merge of both sets is considered a loop.
+
+.. image:: ./loop-nonmaximal.svg
+   :width: 250 px
+
+
+* It is also possible that two logical loops share a header, but are
+  considered a single loop by LLVM:
 
 .. code-block:: C
 
-  while (..) {
-    for (..) {}
-    do {
-      do {
-        // <-- block of interest
-        if (exit) break;
-      } while (..);
-    } while (..)
+  for (int i = 0; i < 128; ++i)
+    for (int j = 0; j < 128; ++j)
+      body(i,j);
+
+which might be represented in LLVM-IR as follows. Note that there is
+only a single header and hence just a single loop.
+
+.. image:: ./loop-merge.svg
+   :width: 400 px
+
+The :ref:`LoopSimplify <loop-terminology-loop-simplify>` pass will
+detect the loop and ensure separate headers for the outer and inner loop.
+
+.. image:: ./loop-separate.svg
+   :width: 400 px
+
+* A cycle in the CFG does not imply there is a loop. The example below
+  shows such a CFG, where there is no header node that dominates all
+  other nodes in the cycle. This is called **irreducible control-flow**.
+
+.. image:: ./loop-irreducible.svg
+   :width: 150 px
+
+The term reducible results from the ability to collapse the CFG into a
+single node by successively replacing one of three base structures with
+a single node: A sequential execution of basic blocks, a conditional
+branching (or switch) with re-joining, and a basic block looping on itself.
+`Uncyclopedia <https://en.wikipedia.org/wiki/Control-flow_graph#Reducibility>`_
+has a more formal definition, which basically says that every cycle has
+a dominating header.
+
+
+* Irreducible control-flow can occur at any level of the loop nesting.
+  That is, a loop that itself does not contain any loops can still have
+  cyclic control flow in its body; a loop that is not nested inside
+  another loop can still be part of an outer cycle; and there can be
+  additional cycles between any two loops where one is contained in the other.
+
+
+* Exiting edges are not the only way to break out of a loop. Other
+  possibilities are unreachable terminators, [[noreturn]] functions,
+  exceptions, signals, and your computer's power button.
+
+
+* A basic block "inside" the loop that does not have a path back to the
+  loop (i.e. to a latch or header) is not considered part of the loop.
+  This is illustrated by the following code.
+
+.. code-block:: C
+
+  for (unsigned i = 0; i <= n; ++i) {
+    if (c1) {
+      // When reaching this block, we will have exited the loop.
+      do_something();
+      break;
+    }
+    if (c2) {
+      // abort(), never returns, so we have exited the loop.
+      abort();
+    }
+    if (c3) {
+      // The unreachable allows the compiler to assume that this will not rejoin the loop.
+      do_something();
+      __builtin_unreachable();
+    }
+    if (c4) {
+      // This statically infinite loop is not nested because control-flow will not continue with the for-loop.
+      while(true) {
+        do_something();
+      }
+    }
   }
 
+
+* There is no requirement for the control flow to eventually leave the
+  loop, i.e. a loop can be infinite. A **statically infinite loop** is a
+  loop that has no exiting edges. A **dynamically infinite loop** has
+  exiting edges, but it is possible to be never taken. This may happen
+  only under some circumstances, such as when n == UINT_MAX in the code
+  below.
+
+.. code-block:: C
+
+  for (unsigned i = 0; i <= n; ++i)
+    body(i);
+
+It is possible for the optimizer to turn a dynamically infinite loop
+into a statically infinite loop, for instance when it can prove that the
+exiting condition is always false. Because the exiting edge is never
+taken, the optimizer can change the conditional branch into an
+unconditional one.
+
+Note that under some circumstances the compiler may assume that a loop will
+eventually terminate without proving it. For instance, it may remove a loop
+that does not do anything in its body. If the loop was infinite, this
+optimization resulted in an "infinite" performance speed-up. A call
+to the intrinsic :ref:`llvm.sideeffect<llvm_sideeffect>` can be added
+into the loop to ensure that the optimizer does not make this assumption
+without proof.
+
+
+* The number of executions of the loop header before leaving the loop is
+  the **loop trip count** (or **iteration count**). If the loop should
+  not be executed at all, a **loop guard** must skip the entire loop:
+
+.. image:: ./loop-guard.svg
+   :width: 500 px
+
+Since the first thing a loop header might do is to check whether there
+is another execution and if not, immediately exit without doing any work
+(also see :ref:`loop-terminology-loop-rotate`), loop trip count is not
+the best measure of a loop's number of iterations. For instance, the
+number of header executions of the code below for a non-positive n
+(before loop rotation) is 1, even though the loop body is not executed
+at all.
+
+.. code-block:: C
+
+  for (int i = 0; i < n; ++i)
+    body(i);
+
+A better measure is the **backedge-taken count**, which is the number of
+times any of the backedges is taken before the loop. It is one less than
+the trip count for executions that enter the header.
+
+
+.. _loopinfo:
+
 LoopInfo
 ========
 
@@ -139,7 +261,7 @@ are important for working successfully with this interface.
   be removed from LoopInfo. If this can not be done for some reason,
   then the optimization is *required* to preserve the static
   reachability of the loop.
-  
+
 
 .. _loop-terminology-loop-simplify: