Commit e8b31fb

[mlir] fix latex formulas in the tutorial
1 parent c61f0a8 commit e8b31fb

1 file changed: +5 -4 lines changed
mlir/docs/Tutorials/transform/ChH.md

Lines changed: 5 additions & 4 deletions
@@ -583,10 +583,11 @@ LLVM IR and processed by the LLVM compiler to produce an executable or JITted.
 
 The generated code runs in ~420ms on an Intel processor with Skylake
 microarchitecture clocked at 2.0GHz. Given that the computation performs
-$5*80*100*128*(2*3*3*128 + 2) ~= 5.9 * 10^9$ floating point operations, it
-reaches ~14 GFlops. With 1 FMA unit available, the single-core performance of
-the test processor is 64 GFlops $16 * 2 * 2 * 10^9$, where 16 is the vector
-width), so only 22% of the theoretical peak is achieved.
+$`5 \cdot 80 \cdot 100 \cdot 128 \cdot (2 \cdot 3 \cdot 3 \cdot 128 + 2) \approx 5.9 * 10^9`$
+floating point operations, it reaches ~14 GFlops. With 1 FMA unit available,
+the single-core performance of the test processor is 64 GFlops
+($`16 \cdot 2 \cdot 2 \cdot 10^9`$, where 16 is the vector width), so only
+22% of the theoretical peak is achieved.
 
 The code produced by Halide runs in ~120ms on the same processor, a 3.5x
 improvement and 77% of peak. Let us analyze the generated assembly to understand
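
The throughput figures in the changed paragraph can be re-derived with a few lines of Python. This is only a sanity-check sketch and not part of the commit. The constant and function names are made up for illustration. The inputs are the rounded values quoted in the text (5.9 * 10^9 operations, ~420ms, ~120ms, 2.0GHz, vector width 16). Reading the remaining factor of 2 in the peak formula as two floating point operations per FMA is an assumption.

```python
# Rounded figures quoted in the tutorial paragraph; nothing here is measured.
STATED_FLOP_COUNT = 5.9e9   # operation count as stated in the text
MLIR_SECONDS = 0.420        # "~420ms"
HALIDE_SECONDS = 0.120      # "~120ms"

VECTOR_WIDTH = 16           # "16 is the vector width"
FLOPS_PER_FMA = 2           # assumption: one multiply plus one add per FMA
CLOCK_HZ = 2.0e9            # "clocked at 2.0GHz"


def gflops(flop_count: float, seconds: float) -> float:
    """Return throughput in GFlops for a given operation count and run time."""
    return flop_count / seconds / 1e9


# Peak for a single FMA unit: 16 * 2 * 2 * 10^9 = 64 GFlops.
peak_gflops = VECTOR_WIDTH * FLOPS_PER_FMA * CLOCK_HZ / 1e9

mlir_gflops = gflops(STATED_FLOP_COUNT, MLIR_SECONDS)
halide_gflops = gflops(STATED_FLOP_COUNT, HALIDE_SECONDS)
speedup = MLIR_SECONDS / HALIDE_SECONDS

# The operation-count expression from the text, evaluated literally.
literal_flop_count = 5 * 80 * 100 * 128 * (2 * 3 * 3 * 128 + 2)

print(f"peak:    {peak_gflops:.0f} GFlops")
print(f"MLIR:    {mlir_gflops:.1f} GFlops ({100 * mlir_gflops / peak_gflops:.0f}% of peak)")
print(f"Halide:  {halide_gflops:.1f} GFlops ({100 * halide_gflops / peak_gflops:.0f}% of peak)")
print(f"speedup: {speedup:.1f}x")
print(f"literal expression value: {literal_flop_count:.3e}")
```

Evaluated literally, the operation-count expression gives about 1.18 * 10^10, roughly twice the quoted 5.9 * 10^9. The sketch takes the quoted figure as given, since the ~14 GFlops, 22% and 77% values follow from it.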
