Skip to content

[MCA] Do not allocate space for DependenceEdge by default in DependencyGraphNode (NFC) #125080

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 31, 2025

Conversation

asi-sc
Copy link
Contributor

@asi-sc asi-sc commented Jan 30, 2025

For each instruction from the input assembly sequence, DependencyGraph has a dedicated node (DGNode). Outgoing edges (data, resource and memory dependencies) are tracked as SmallVector<..., 8> for each DGNode in the graph. However, it's unlikely that a usual input instruction will have approximately eight dependent instructions. Below is my statistics for several RISC-V input sequences:

Number of  | Number of nodes with
edges      | this # of edges
---------------------------------
         0 | 8239447
         1 | 464252
         2 | 6164
         3 | 6783
         4 | 939
         5 | 500
         6 | 545
         7 | 116
         8 | 2
         9 | 1
        10 | 1

Approximately the same distribution is produced by llvm-mca lit tests for X86, AArch and RISC-V (even modified ones with extra dependencies added).
On a rather big input asm sequences, the use of SmallVector<..., 8> dramatically increases memory consumption without any need for it. In my case, replacing it with SmallVector<...,0> reduces memory usage by ~28% or ~1700% of input file size (2.2GB in absolute values).

There is no change in execution time, I verified it on mca lit-tests and on my big test (execution time is ~30s in both cases).

This change was made with the same intention as #124904 and optimizes I believe quite an unusual scenario. However, if there is no negative impact on other known scenarios, I'd like to have the change in llvm-project.

…cyGraphNode (NFC)

For each instruction from the input assembly sequence, DependencyGraph has a
dedicated node (DGNode). Outgoing edges (data, resource and memory dependencies)
are tracked as SmallVector<..., 8> for each DGNode in the graph. However, it's
rather unlikely that a usual input instruction will have approximately eight
dependent instructions. Below is my statistics for several RISC-V input
sequences:

Number of  | Number of nodes with
edges      | this # of edges
--------------------------------------------
         0 | 8239447
         1 | 464252
         2 | 6164
         3 | 6783
         4 | 939
         5 | 500
         6 | 545
         7 | 116
         8 | 2
         9 | 1
        10 | 1

Approximately the same distribution is produced by llvm-mca lit tests (even
modified ones with extra dependencies added).
On a rather big input asm sequences, the use of SmallVector<..., 8> dramatically
increases memory consumption without any need for it. In my case, replacing it
with SmallVector<...,0> reduces memory usage by ~28% or ~1700% of input file size
(2.2GB in absolute values).

There is no change in execution time, I verified it on mca lit-tests and on my big
test (execution time is ~30s in both cases).
@llvmbot
Copy link
Member

llvmbot commented Jan 30, 2025

@llvm/pr-subscribers-tools-llvm-mca

Author: Anton Sidorenko (asi-sc)

Changes

For each instruction from the input assembly sequence, DependencyGraph has a dedicated node (DGNode). Outgoing edges (data, resource and memory dependencies) are tracked as SmallVector<..., 8> for each DGNode in the graph. However, it's unlikely that a usual input instruction will have approximately eight dependent instructions. Below is my statistics for several RISC-V input sequences:

Number of  | Number of nodes with
edges      | this # of edges
---------------------------------
         0 | 8239447
         1 | 464252
         2 | 6164
         3 | 6783
         4 | 939
         5 | 500
         6 | 545
         7 | 116
         8 | 2
         9 | 1
        10 | 1

Approximately the same distribution is produced by llvm-mca lit tests for X86, AArch and RISC-V (even modified ones with extra dependencies added).
On a rather big input asm sequences, the use of SmallVector<..., 8> dramatically increases memory consumption without any need for it. In my case, replacing it with SmallVector<...,0> reduces memory usage by ~28% or ~1700% of input file size (2.2GB in absolute values).

There is no change in execution time, I verified it on mca lit-tests and on my big test (execution time is ~30s in both cases).

This change was made with the same intention as #124904 and optimizes I believe quite an unusual scenario. However, if there is no negative impact on other known scenarios, I'd like to have the change in llvm-project.


Full diff: https://github.com/llvm/llvm-project/pull/125080.diff

1 Files Affected:

  • (modified) llvm/tools/llvm-mca/Views/BottleneckAnalysis.h (+1-1)
diff --git a/llvm/tools/llvm-mca/Views/BottleneckAnalysis.h b/llvm/tools/llvm-mca/Views/BottleneckAnalysis.h
index 529090cf543fc4..2621efb0413ae2 100644
--- a/llvm/tools/llvm-mca/Views/BottleneckAnalysis.h
+++ b/llvm/tools/llvm-mca/Views/BottleneckAnalysis.h
@@ -228,7 +228,7 @@ class DependencyGraph {
     unsigned Depth;
 
     DependencyEdge CriticalPredecessor;
-    SmallVector<DependencyEdge, 8> OutgoingEdges;
+    SmallVector<DependencyEdge, 0> OutgoingEdges;
   };
   SmallVector<DGNode, 16> Nodes;
 

Copy link
Member

@mshockwave mshockwave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Number of | Number of nodes with
edges | this # of edges

     0 | 8239447
     1 | 464252
     2 | 6164
     3 | 6783
     4 | 939
     5 | 500
     6 | 545
     7 | 116
     8 | 2
     9 | 1
    10 | 1

This is an interesting distribution. If my math is correct ~95% of the nodes have no out-going edges. Thus I think this is a reasonable change.

@adibiagio
Copy link
Collaborator

Thanks for this patch!

I left a minor comment. Otherwise, the change looks good to me.

@mshockwave
Copy link
Member

Thanks for this patch!

I left a minor comment. Otherwise, the change looks good to me.

I don't see any comment, I think you might have forgot to press the submit button :-)

@@ -228,7 +228,7 @@ class DependencyGraph {
unsigned Depth;

DependencyEdge CriticalPredecessor;
SmallVector<DependencyEdge, 8> OutgoingEdges;
SmallVector<DependencyEdge, 0> OutgoingEdges;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is useful to have a comment (not more than a couple of lines) explaining why zero was used for the number of inline elements. Otherwise it would be unclear for the reader why we chose that number in this case.

@asi-sc asi-sc merged commit 9cf8ee9 into llvm:main Jan 31, 2025
6 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants