Commit b47c95f

Update Manifest.toml and expand a little on evaluating_loops.
1 parent 6030e21 commit b47c95f

2 files changed: 34 additions and 7 deletions

docs/Manifest.toml

Lines changed: 25 additions & 6 deletions
@@ -13,32 +13,51 @@ uuid = "8ba89e20-285c-5b6f-9357-94700520ee1b"

 [[DocStringExtensions]]
 deps = ["LibGit2", "Markdown", "Pkg", "Test"]
-git-tree-sha1 = "1df01539a1c952cef21f2d2d1c092c2bcf0177d7"
+git-tree-sha1 = "88bb0edb352b16608036faadcc071adda068582a"
 uuid = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
-version = "0.6.0"
+version = "0.8.1"

 [[Documenter]]
-deps = ["Base64", "DocStringExtensions", "InteractiveUtils", "LibGit2", "Logging", "Markdown", "Pkg", "REPL", "Random", "Test", "Unicode"]
-git-tree-sha1 = "a6db1c69925cdc53aafb38caec4446be26e0c617"
+deps = ["Base64", "Dates", "DocStringExtensions", "InteractiveUtils", "JSON", "LibGit2", "Logging", "Markdown", "REPL", "Test", "Unicode"]
+git-tree-sha1 = "d497bcc45bb98a1fbe19445a774cfafeabc6c6df"
 uuid = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
-version = "0.21.0"
+version = "0.24.5"

 [[InteractiveUtils]]
 deps = ["Markdown"]
 uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240"

+[[JSON]]
+deps = ["Dates", "Mmap", "Parsers", "Unicode"]
+git-tree-sha1 = "b34d7cef7b337321e97d22242c3c2b91f476748e"
+uuid = "682c06a0-de6a-54ab-a142-c8b1cf79cde6"
+version = "0.21.0"
+
 [[LibGit2]]
+deps = ["Printf"]
 uuid = "76f85450-5226-5b5a-8eaa-529ad045b433"

+[[Libdl]]
+uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"
+
 [[Logging]]
 uuid = "56ddb016-857b-54e1-b83d-db4d58db5568"

 [[Markdown]]
 deps = ["Base64"]
 uuid = "d6f4376e-aef5-505a-96c1-9c027394607a"

+[[Mmap]]
+uuid = "a63ad114-7e13-5084-954f-fe012c677804"
+
+[[Parsers]]
+deps = ["Dates", "Test"]
+git-tree-sha1 = "0c16b3179190d3046c073440d94172cfc3bb0553"
+uuid = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0"
+version = "0.3.12"
+
 [[Pkg]]
-deps = ["Dates", "LibGit2", "Markdown", "Printf", "REPL", "Random", "SHA", "UUIDs"]
+deps = ["Dates", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "UUIDs"]
 uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"

 [[Printf]]

docs/src/devdocs/evaluating_loops.md

Lines changed: 9 additions & 1 deletion
@@ -9,4 +9,12 @@ The cost estimate is based on the costs of individual instructions and the numbe
 - The `reciprocal throughput` is similar to the latency, but it measures the number of cycles per operation when many of the same operation are repeated in sequence. Continuing our hose analogy, think of it as the inverse of the flow rate at steady-state. It is typically ≤ the `scalar latency`.
 - The `register pressure` measures the register consumption by the operation

-Data on individual instructions for specific architectures can be found on [Agner Fog's website](https://agner.org/optimize/instruction_tables.pdf).
+Data on individual instructions for specific architectures can be found on [Agner Fog's website](https://agner.org/optimize/instruction_tables.pdf). Most of the costs used were those for the Skylake-X architecture.
+
+Examples of how these come into play:
+- Vectorizing a loop will result in each instruction evaluating multiple iterations, but the costs of loads and stores will change based on the memory layouts of the accessed arrays.
+- Unrolling can help reduce the number of times an operation must be performed, for example by letting us reuse a loaded value several times rather than reloading it every time it is needed.
+- When there is a reduction, such as performing a sum, there is a dependency chain. Each `+` has to wait for the previous `+` to finish executing before it can begin, so execution time is bounded by the latency of `+` rather than by the lower of the throughputs of the `+` and load operations. By unrolling the loop, we can create multiple independent dependency chains.
+
+
+
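The reduction example in the last bullet can be sketched in a few lines. This is only an illustration in Python, not code from the package: the function names (`sum_one_chain`, `sum_four_chains`, `estimated_cycles`) are hypothetical, the cycle formula is a deliberately simplified model rather than the cost model the library uses, and the latency and throughput values in the usage note are placeholders rather than figures taken from the instruction tables. Python itself will not run the unrolled version faster; the point is the shape of the dependency chains.

```python
def sum_one_chain(xs):
    """Plain reduction: every add depends on the result of the previous add."""
    acc = 0.0
    for x in xs:
        acc += x
    return acc


def sum_four_chains(xs):
    """Reduction unrolled by 4, with four independent accumulators.

    Adds into acc0..acc3 do not depend on one another, so a CPU could
    overlap them instead of waiting out the full add latency each time.
    """
    acc0 = acc1 = acc2 = acc3 = 0.0
    main = len(xs) - len(xs) % 4  # largest multiple of 4 not exceeding len(xs)
    for i in range(0, main, 4):
        acc0 += xs[i]
        acc1 += xs[i + 1]
        acc2 += xs[i + 2]
        acc3 += xs[i + 3]
    for x in xs[main:]:  # leftover elements when len(xs) is not a multiple of 4
        acc0 += x
    # Combine the partial sums once, after the loop.
    return (acc0 + acc1) + (acc2 + acc3)


def estimated_cycles(n_adds, latency, recip_throughput, chains):
    """Toy estimate (an assumption, not the library's model).

    With `chains` independent accumulators, each add waits roughly
    latency / chains cycles on its own chain, but the loop can never go
    faster than the reciprocal throughput allows.
    """
    return n_adds * max(recip_throughput, latency / chains)
```

With placeholder values of 4 cycles of latency and a reciprocal throughput of 0.5, `estimated_cycles(1024, 4.0, 0.5, 1)` is latency-bound, while eight chains (`chains=8`) are enough for the estimate to reach the throughput bound. Note that splitting a floating-point sum across accumulators reassociates the additions, so the result can differ from the single-chain sum in the last bits.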
