[Dialect] [OneDNNGraph] Add ops lowering for llama2 mlp #107


Merged: 39 commits into main from longsheng/llma2_onednn_lower on Jun 7, 2024

Conversation

@LongshengDu (Contributor) commented May 29, 2024

Add llama2 MLP ops lowering, update matmul support for batch broadcast, and add matmul lowering for 3Dx2D flatten. A llama2 MLP graph has been placed in onednn-graph-llama2.mlir for codegen testing.
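
For context, here is a minimal sketch of the shape bookkeeping behind the 3Dx2D flatten lowering. The code and the concrete sizes are illustrative assumptions, not the PR's implementation: a [B, M, K] lhs is collapsed to [B*M, K], multiplied by the 2D [K, N] rhs, and the [B*M, N] result is expanded back to [B, M, N].

```cpp
// Sketch only (not the PR's implementation): shape bookkeeping for lowering a
// [B, M, K] x [K, N] matmul by flattening the 3D lhs to 2D and expanding back.
#include <array>
#include <cassert>
#include <cstdint>

int main() {
  std::array<int64_t, 3> lhs = {8, 32, 4096};  // [B, M, K]; sizes are assumptions
  std::array<int64_t, 2> rhs = {4096, 11008};  // [K, N]

  std::array<int64_t, 2> lhsFlat = {lhs[0] * lhs[1], lhs[2]};  // collapse -> [B*M, K]
  std::array<int64_t, 2> outFlat = {lhsFlat[0], rhs[1]};       // 2D matmul -> [B*M, N]
  std::array<int64_t, 3> out = {lhs[0], lhs[1], rhs[1]};       // expand -> [B, M, N]

  assert(lhs[2] == rhs[0]);                                     // K dims must match
  assert(outFlat[0] == out[0] * out[1] && outFlat[1] == out[2]);
  return 0;
}
```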

Note: if using conda, first `export LD_PRELOAD=path/to/libomp.so`, then run `gc-opt %s --gc-cpu-pipeline | gc-cpu-runner -e main -entry-point-result=void` to execute the generated binary.

Tracking: #117

@LongshengDu LongshengDu added the WIP work in progress label May 29, 2024
@LongshengDu LongshengDu requested a review from kurapov-peter June 3, 2024 04:22
@LongshengDu LongshengDu removed the WIP work in progress label Jun 3, 2024
@LongshengDu LongshengDu requested a review from ciyongch June 3, 2024 04:22
Longsheng Du added 3 commits June 3, 2024 12:27
@LongshengDu LongshengDu requested a review from xurui1995 June 3, 2024 06:32
Longsheng Du added 2 commits June 4, 2024 16:08
@kurapov-peter (Contributor) left a comment:

One concern I had while reading is the semantics of the shapes folding. I wasn't able to formulate any concrete examples and there might be none, but it still makes sense to give it yet another thought (maybe add complex cases where problems may arise).

```cpp
SmallVector<int64_t> getReducedShape(ShapeAdaptor operandShape,
                                     ArrayRef<int64_t> axes, bool keep_dims) {
  SmallVector<int64_t> outputShape;
```

```cpp
SmallVector<int64_t> canonicalizeKeepAxes(ArrayRef<int64_t> axes, int64_t rank,
```
Contributor:

It would be nice to have a short note on what is considered canonical, for future reference.

Contributor Author:

Added a comment.

Comment on lines 90 to 99
```cpp
Value newVal = op;
if (collapseShape.size() < opShape.size()) {
  assert(collapseShape.size() + bcastDims.size() == bcastShape.size());
  auto reassociation =
      computeReassociationByAnchor(keepDims, opTy.getRank());
  ShapedType collapseTy =
      RankedTensorType::get(collapseShape, opTy.getElementType());
  newVal = rewriter.create<tensor::CollapseShapeOp>(loc, collapseTy, newVal,
                                                    reassociation);
}
```
Contributor:

Could you please elaborate on what's going on? And what does an anchor mean in this context?

Contributor Author:

Consider the kept dims [a, b, ..., c] as the anchor, so reassociation = [[...a...], [...b...], ..., [...c...]].
E.g. for shape [16, 1, 32, 1, 64], rank = 5, kept dims = [0, 2, 4]:
[16, 1, 32, 1, 64] --collapse-> [16, 32, 64]
reassociation = [[0], [1, 2], [3, 4]]
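
As a standalone illustration of that rule, here is a minimal sketch (not the PR's computeReassociationByAnchor; the function name and the handling of trailing non-kept dims are assumptions): each kept ("anchor") dim absorbs every non-kept dim since the previous anchor.

```cpp
// Sketch of the anchor-based grouping described above; not the PR's code.
#include <cassert>
#include <cstdint>
#include <vector>

std::vector<std::vector<int64_t>>
reassociationByAnchor(const std::vector<int64_t> &keepDims, int64_t rank) {
  std::vector<std::vector<int64_t>> groups;
  int64_t next = 0;
  for (int64_t anchor : keepDims) {
    std::vector<int64_t> group;
    for (int64_t d = next; d <= anchor; ++d) // absorb dims since the last anchor
      group.push_back(d);
    groups.push_back(group);
    next = anchor + 1;
  }
  // Assumption: any trailing non-kept dims fold into the last group.
  for (int64_t d = next; d < rank && !groups.empty(); ++d)
    groups.back().push_back(d);
  return groups;
}

int main() {
  // Shape [16, 1, 32, 1, 64], kept dims [0, 2, 4] -> [[0], [1, 2], [3, 4]].
  auto r = reassociationByAnchor({0, 2, 4}, 5);
  assert(r.size() == 3);
  assert((r[1] == std::vector<int64_t>{1, 2}));
  assert((r[2] == std::vector<int64_t>{3, 4}));
  return 0;
}
```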

Contributor Author:

Added a comment.

Contributor:

Thanks. Feels like this can still be better in the following sense. First, we introduce an "anchor", which is the result of the collapse transformation. Kept dims are an array of indices into the shape. Finally, reassociation would be better off with a clear definition. None of these are obvious, and they make you read the code to understand. For example, what would happen to [16, 1, 1, 32], rank 4, kept dims [0, 3]? One would expect a [16, 32]. What should the reassociation look like?
So, for anchor: do we need another term at all? It overloads the term used for fusion as well. Kept dims are clear, I think. The reassociation we should also clarify.

Contributor Author:

For [16, 1, 1, 32] -> [16, 32], rank = 4, kept dims = [0, 3]:
reassociation = [[0], [1, 2, 3]]
anchor_dims is just an internal term/variable name in this file, representing which dims to collapse to / expand from; I don't think it will conflict with the term used for fusion.
reassociation is not introduced by us; it is the attribute of tensor.collapse_shape and tensor.expand_shape from the MLIR tensor dialect.
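
Plugging this case into the earlier sketch gives the same answer (again using the hypothetical reassociationByAnchor helper, not the PR's code):

```cpp
// Shape [16, 1, 1, 32], kept dims [0, 3] -> reassociation [[0], [1, 2, 3]],
// i.e. dim 3 absorbs the unit dims 1 and 2, and the collapse yields [16, 32].
auto groups = reassociationByAnchor({0, 3}, 4);
```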

```cpp
SmallVector<int64_t> lhsShape(lhsType.getShape());
SmallVector<int64_t> rhsShape(rhsType.getShape());
assert(lhsShape.size() >= 2 && rhsShape.size() >= 2);
// assuming last 2 input dims are row and col
```
Contributor:

What guarantees it?

Contributor Author:

In `else if (lRank > 1 && rRank > 1)`, it checks that both inputs have rank >= 2, meaning both inputs are matrices and may have batch dims.

Contributor:

I mean, what would happen with a transposed matrix that has a batch dimension at the last position for whatever reason? I guess what you are saying is that we will treat the last two as if they are not batch dimensions, and if they are, the shape/layout was just wrong. Correct?

Contributor Author (@LongshengDu) commented Jun 7, 2024:

The transpose attr only controls whether to transpose the last two dimensions, so batch dims always come before the last 2 dims according to the oneDNN spec. If the last two dimensions somehow contain a batch dim, it is definitely wrong.
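
A minimal sketch of that rule (an illustration only, not the PR's code): the transpose flag swaps just the trailing two dims and leaves any batch dims in place.

```cpp
// Illustration only: transpose swaps only the last two dims; batch dims stay put.
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

std::vector<int64_t> applyTranspose(std::vector<int64_t> shape, bool transpose) {
  if (transpose && shape.size() >= 2)
    std::swap(shape[shape.size() - 2], shape[shape.size() - 1]);
  return shape;
}

int main() {
  // [batch, rows, cols] = [4, 32, 64] -> [4, 64, 32]; the batch dim is untouched.
  assert((applyTranspose({4, 32, 64}, true) == std::vector<int64_t>{4, 64, 32}));
  return 0;
}
```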

@LongshengDu (Contributor Author) commented Jun 6, 2024:

> One concern I had while reading is the semantics of the shapes folding. I wasn't able to formulate any concrete examples and there might be none, but it still makes sense to give it yet another thought (maybe add complex cases where problems may arise).

@kurapov-peter Can you specify what you consider the semantics of the shapes folding? We can look into it.

@xurui1995 (Contributor) left a comment:

LGTM

```diff
@@ -143,6 +144,8 @@ class GCCPUPipeline : public impl::GCCPUPipelineBase<GCCPUPipeline> {
     auto op = getOperation();
     PassManager pm{op->getContext()};
     populateCPUPipeline(pm);
+    // TODO(longsheng): add a option to
+    // disable threading and enable pm.enableIRPrinting();
```
Contributor:

What was this TODO comment for previously?

Contributor Author:

There was no code here previously. I used to add pm.enableIRPrinting() to debug internal passes, but this function also requires disabling threading; we can figure it out later.
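
For reference, a minimal sketch of the debug setup being described, assuming MLIR's standard PassManager/MLIRContext API; this is not what the pipeline currently does, it is what the TODO would eventually make optional:

```cpp
// Debug-only sketch inside the pipeline's run method: IR printing instruments
// the pass manager and requires single-threaded execution, hence the pairing.
auto op = getOperation();
op->getContext()->disableMultithreading();

PassManager pm{op->getContext()};
pm.enableIRPrinting(); // print IR around each pass with the default callbacks
populateCPUPipeline(pm);
```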

@kurapov-peter kurapov-peter merged commit d7c3c0b into main Jun 7, 2024
4 checks passed
@LongshengDu LongshengDu deleted the longsheng/llma2_onednn_lower branch June 7, 2024 15:55
@LongshengDu LongshengDu linked an issue Jul 3, 2024 that may be closed by this pull request
Successfully merging this pull request may close these issues:
- Add support for Llama 2 MLP OPs on oneDNN Graph dialect