[LLVMCPU] Add an additional level of tiling #19027

pashu123 · 2024-11-05T14:11:05Z

Add an additional level of tiling, aka vector parallel, to the CPU default pipeline. Some linalg ops that are not specialized through any pipeline may hit a bufferization issue if passed through the default pipeline. Adding an extra level of tiling takes care of such cases.

Removes some ops (disabled for producer fusion) from dispatch region creation. They were added in #18777. For more info: #18900

hanhanW · 2024-11-05T18:26:24Z

compiler/src/iree/compiler/Codegen/LLVMCPU/KernelDispatch.cpp

+  // Add an extra level of tiling.
+  if (auto linalgOp = dyn_cast<linalg::LinalgOp>(*op)) {
+    SmallVector<int64_t> vecTileSizes = distTileSizes;
+    limitVectorTileSizes(linalgOp, vecTileSizes);
+    tileSizes.push_back(vecTileSizes);
+  }


I think we'll need the second level of tiling on any TilingInterface op, so please add a TODO. It is the key to getting a smaller problem size; you don't need the special logic in Passes.cpp. What I'd do is something like the below:

Suggested change

-  // Add an extra level of tiling.
-  if (auto linalgOp = dyn_cast<linalg::LinalgOp>(*op)) {
-    SmallVector<int64_t> vecTileSizes = distTileSizes;
-    limitVectorTileSizes(linalgOp, vecTileSizes);
-    tileSizes.push_back(vecTileSizes);
-  }
+  SmallVector<int64_t> vecTileSizes = distTileSizes;
+  // TODO: ...
+  if (auto linalgOp = dyn_cast<linalg::LinalgOp>(*op)) {
+    limitVectorTileSizes(linalgOp, vecTileSizes);
+  }
+  tileSizes.push_back(vecTileSizes);
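
For readers without the IREE sources at hand, the shape of the suggested change can be sketched in plain C++. This is only an illustration under assumptions: `TileSizes`, `clampVectorTileSizes`, `buildTileSizeLevels`, and the `maxVecTile` parameter are hypothetical names, and the clamp is a stand-in for `limitVectorTileSizes`, whose real heuristic is not reproduced here.

```cpp
#include <cstdint>
#include <vector>

using TileSizes = std::vector<int64_t>;

// Hypothetical stand-in for limitVectorTileSizes: clamp every tile size to
// `maxVecTile`. The real helper inspects the linalg op instead.
void clampVectorTileSizes(TileSizes &sizes, int64_t maxVecTile) {
  for (int64_t &size : sizes)
    if (size > maxVecTile)
      size = maxVecTile;
}

// Builds the per-level tile size list. Level 0 holds the distribution tile
// sizes and level 1 holds the vector-parallel tile sizes.
std::vector<TileSizes> buildTileSizeLevels(const TileSizes &distTileSizes,
                                           bool isLinalgOp,
                                           int64_t maxVecTile) {
  std::vector<TileSizes> tileSizes;
  tileSizes.push_back(distTileSizes);

  // The second level is always pushed, so every op ends up with two levels.
  // Only the linalg case shrinks it; other ops reuse the distribution sizes.
  TileSizes vecTileSizes = distTileSizes;
  if (isLinalgOp)
    clampVectorTileSizes(vecTileSizes, maxVecTile);
  tileSizes.push_back(vecTileSizes);
  return tileSizes;
}
```

The point of the restructuring is that the second level exists unconditionally, so later code can rely on two levels being present without checking the op kind.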

compiler/src/iree/compiler/DispatchCreation/FormDispatchRegions.cpp

hanhanW · 2024-11-05T18:32:39Z

compiler/src/iree/compiler/Codegen/LLVMCPU/Passes.cpp

@@ -653,8 +653,14 @@ void addCPULinalgExtTileAndVectorizePipeline(
  }
 }

-void addCPUDefaultPassPipeline(OpPassManager &funcPassManager) {
+void addCPUDefaultPassPipeline(OpPassManager &funcPassManager,
+                               FailureOr<TilingConfig> &tilingConfig) {
  addTileAndDistributePasses(funcPassManager);


It does not make sense to add these passes when the TilingConfig is not present, so perhaps we should just move it into the if-body?

Here is the context for other reviewers: some dispatches do not have any TilingInterface ops, so we can't find a TilingConfig on any op. I think we still need the bufferization because of the flow/hal binding ops. This is one such example: https://gist.github.com/pashu123/0b5329e8a60311b634ce9d2230381b20
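
One reading of this suggestion, sketched in standalone C++ rather than against the real pass manager: tiling passes are added only when a config was found, while bufferization is added either way. `TilingConfig` here is a hypothetical one-field stand-in, `std::optional` stands in for `FailureOr`, and the pass names are illustrative labels, not actual IREE pass names.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the real TilingConfig.
struct TilingConfig {
  unsigned numLevels;
};

// Returns the ordered list of (illustrative) pass labels for the default
// pipeline. An empty optional models a dispatch with no TilingInterface op.
std::vector<std::string>
buildDefaultPipeline(const std::optional<TilingConfig> &tilingConfig) {
  std::vector<std::string> passes;
  if (tilingConfig) {
    // Tiling only makes sense when some op carries a tiling config.
    passes.push_back("tile-and-distribute");
    passes.push_back("tile-vector-parallel");
  }
  // Bufferization is still required, e.g. for the binding ops.
  passes.push_back("bufferize");
  return passes;
}
```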

MaheshRavishankar

Could we add a test here? It seems to me that we are using the same tile sizes for distribution and vector size. I want to verify that this isn't the case.

hanhanW · 2024-11-06T19:53:42Z

Could we add a test here? It seems to me that we are using the same tile sizes for distribution and vector size. I want to verify that this isn't the case.

Yes, we're using the same tile sizes for distribution and vector sizes in this PR. That happens for non-linalg ops. For linalg ops, the vector sizes are reduced based on a heuristic (by looking at the indexing maps and limiting the tile sizes for some dimensions).

@pashu123 can you add some tests to https://github.com/iree-org/iree/blob/main/compiler/src/iree/compiler/Codegen/LLVMCPU/test/select_x86_64_lowering_strategy.mlir ? We can add a test case for a group convolution and a test case for any LinalgExt op (e.g., scan, topk, or others).
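
To illustrate what "looking at indexing maps and limiting tile sizes for some dimensions" can mean, here is a toy heuristic in standalone C++. It is not IREE's `limitVectorTileSizes`; the byte budget, the halving strategy, and the names `OperandInfo` and `limitTileSizesByFootprint` are all assumptions made for this sketch. Each operand lists the loop dimensions it indexes, and tile sizes are halved until that operand's tile fits the budget. All tile sizes are assumed to be positive.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical summary of one operand: which loop dims its indexing map
// uses, and the size of one element in bytes.
struct OperandInfo {
  std::vector<unsigned> dims;
  int64_t elemBytes;
};

// For each operand, repeatedly halve the largest tile size among the dims
// it indexes until the operand's tile footprint is at most `maxBytes`.
void limitTileSizesByFootprint(const std::vector<OperandInfo> &operands,
                               int64_t maxBytes,
                               std::vector<int64_t> &tileSizes) {
  for (const OperandInfo &operand : operands) {
    while (true) {
      int64_t bytes = operand.elemBytes;
      bool found = false;
      unsigned largest = 0;
      for (unsigned d : operand.dims) {
        bytes *= tileSizes[d];
        if (!found || tileSizes[d] > tileSizes[largest]) {
          largest = d;
          found = true;
        }
      }
      // Stop when the tile fits or nothing can be shrunk any further.
      if (!found || bytes <= maxBytes || tileSizes[largest] <= 1)
        break;
      tileSizes[largest] = (tileSizes[largest] + 1) / 2;
    }
  }
}
```

For a matmul-like op with loop dims (m, n, k) = (0, 1, 2), 4-byte elements, operands indexing {0, 2}, {2, 1}, and {0, 1}, starting tile sizes {64, 64, 64}, and a 1024-byte budget, this sketch ends at {16, 16, 16}. Only dimensions that an over-budget operand actually indexes are reduced, which is why the vector level can differ from the distribution level for linalg ops.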

hanhanW

Overall looks good to me. I made a bad review comment, and I put an update for it. Please take a look. (and sorry about that)

compiler/src/iree/compiler/Codegen/LLVMCPU/test/select_x86_64_lowering_strategy.mlir

compiler/src/iree/compiler/Codegen/LLVMCPU/KernelDispatch.cpp

hanhanW

Could we add a test here? It seems to me that we are using the same tile sizes for distribution and vector size. I want to verify that this isn't the case.

Overall looks good to me, just one final nit in the test file.

@MaheshRavishankar do you still have the concern? We have different vector sizes for Linalg ops, but we have the same vector sizes for other TilingInterface ops (there is a TODO item to improve that). It is true that the second tile size list is redundant for other TilingInterface ops for now, but it simplifies how we set up the pipeline (e.g., we don't need any special logic when adding passes to the pipeline).

compiler/src/iree/compiler/Codegen/LLVMCPU/test/select_x86_64_lowering_strategy.mlir

pashu123 requested review from hanhanW, MaheshRavishankar and IanWood1 as code owners November 5, 2024 14:11

pashu123 mentioned this pull request Nov 5, 2024

'func.func' op exceeded stack allocation limit of 32768 bytes for function. Got 524288 bytes nod-ai/SHARK-ModelDev#875

Closed

pashu123 force-pushed the newtiling branch 2 times, most recently from 399fadd to ba3344f Compare November 5, 2024 17:36

hanhanW requested changes Nov 5, 2024

pashu123 requested a review from hanhanW November 6, 2024 13:33

MaheshRavishankar requested changes Nov 6, 2024

pdhirajkumarprasad mentioned this pull request Nov 7, 2024

[Tracker] All the issue related with e2e shark test suite nod-ai/SHARK-ModelDev#812

Open

pashu123 requested a review from MaheshRavishankar November 7, 2024 14:17

hanhanW reviewed Nov 7, 2024

compiler/src/iree/compiler/Codegen/LLVMCPU/test/select_x86_64_lowering_strategy.mlir (outdated)

compiler/src/iree/compiler/Codegen/LLVMCPU/KernelDispatch.cpp (outdated)

pashu123 force-pushed the newtiling branch 2 times, most recently from c133b54 to 864d9df Compare November 10, 2024 17:53

pashu123 added 4 commits November 10, 2024 23:24

Address comments

1c1235a

Add a pipeline test case

1125f34

Update comments.

e19f918

pashu123 force-pushed the newtiling branch 2 times, most recently from c133b54 to e19f918 Compare November 10, 2024 17:55

pashu123 requested a review from hanhanW November 10, 2024 18:03

hanhanW reviewed Nov 11, 2024

compiler/src/iree/compiler/Codegen/LLVMCPU/test/select_x86_64_lowering_strategy.mlir (outdated)

hanhanW approved these changes Nov 11, 2024

Update tests

00276a3

pashu123 force-pushed the newtiling branch from c08eebc to 00276a3 Compare November 11, 2024 18:23

MaheshRavishankar approved these changes Nov 12, 2024

pashu123 merged commit da286ea into iree-org:main Nov 12, 2024
36 checks passed
