[Codegen][GPU] Add support for all other intrinsics to TileAndFuse #18179

qedawkins · 2024-08-09T17:27:14Z

This adds the ConcretizeMmaShapes pass to the LLVMGPUTileAndFuse pipeline so that it supports the remaining intrinsic types, in particular the MFMA and WMMA variants whose accumulator must be reshaped to match the requirements of the layout.

This also reworks the reshaping code to use SingleSubgroupLayout instead of VectorExt::PerDimLayoutAttr, which drops an unneeded dialect dependency and simplifies the IR in cases where reshaping is not needed. In particular, when a layout has a unit `outer` dimension, no additional reshaping is needed, so the reshapes are omitted in such cases. There is a future option to still do such reshaping so as to pre-swizzle the data needed for the MMA during the store to shared memory, but the details of how best to implement that are left as a TODO.
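As a rough illustration of the rule described above (skip the reshape when the `outer` dimension is a unit), here is a minimal Python model. It is only a sketch, not the code in Transforms.cpp: the helper names `needs_reshape` and `concretize_shape`, the choice to split a dimension into `(outer, size / outer)`, and the example tile shapes and `outer` values are all assumptions made for illustration.

```python
from typing import List, Sequence


def needs_reshape(outer: Sequence[int]) -> bool:
    """Hypothetical helper: a reshape is only needed if some outer dim is non-unit."""
    return any(o != 1 for o in outer)


def concretize_shape(shape: Sequence[int], outer: Sequence[int]) -> List[int]:
    """Hypothetical model of the expanded shape for one MMA operand tile.

    Each dimension with a non-unit `outer` count is split into
    (outer, size // outer); dimensions with a unit `outer` are left alone,
    so a layout with all-unit `outer` produces no reshape at all.
    """
    if len(shape) != len(outer):
        raise ValueError("shape and outer must have the same rank")
    expanded: List[int] = []
    for size, o in zip(shape, outer):
        if o == 1:
            # Unit outer dimension: keep the dimension as-is.
            expanded.append(size)
        else:
            if size % o != 0:
                raise ValueError(f"dimension {size} is not divisible by outer {o}")
            expanded.extend([o, size // o])
    return expanded


# Illustrative values only (assumed, not taken from the patch).
print(concretize_shape([32, 32], [4, 1]))  # non-unit outer on the first dim
print(concretize_shape([16, 16], [1, 1]))  # all-unit outer: shape unchanged
```

In this model the all-unit case returns the input shape untouched, which corresponds to the IR simplification mentioned above: no reshape ops need to be emitted at all.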

Max191 · 2024-08-12T13:27:18Z

cc @hanhanW @lialan just for visibility. This is a bit similar to the expand_shape portion of the GPUMaterializeEncoding patterns, so it may be interesting to you guys.

Max191

This is great, thanks Quinn! Let's add some simple tests for the remaining newly supported intrinsics, and then I think it's good to land. I also verified matmul correctness for MFMA_I32_32x32x16_I8 and MFMA_F32_32x32x8_F16 with this patch.

compiler/src/iree/compiler/Codegen/Dialect/GPU/Transforms/Transforms.cpp

compiler/src/iree/compiler/Codegen/Dialect/GPU/Transforms/test/concretize_mma_shapes.mlir

compiler/src/iree/compiler/Codegen/Dialect/GPU/Transforms/test/distribute_mma_to_lanes.mlir

compiler/src/iree/compiler/Codegen/LLVMGPU/test/ROCDL/pipeline_tile_and_fuse.mlir

qedawkins · 2024-08-12T21:05:31Z

@hanhanW was this the intended PR for this comment?

hanhanW · 2024-08-12T21:11:25Z

> @hanhanW was this the intended PR for this comment?

No, let me delete it. I was confused about where my comment was in llvm/llvm-project#102952 (comment). Sorry about that.


qedawkins requested a review from Max191 August 9, 2024 17:27

qedawkins requested review from MaheshRavishankar, kuhar, Groverkss and antiagainst as code owners August 9, 2024 17:27

qedawkins force-pushed the expand_and_add_all_intrinsics branch from 8edf5b1 to 2166c03 Compare August 9, 2024 17:52

Max191 requested review from lialan and hanhanW August 12, 2024 13:25

Max191 approved these changes Aug 12, 2024


qedawkins added 4 commits August 13, 2024 09:39

Add concretize tests

b0bc4f0

Add distribution tests

b9ff1c4

Add pipeline tests

0a6a163

qedawkins force-pushed the expand_and_add_all_intrinsics branch from 2166c03 to 0a6a163 Compare August 13, 2024 14:58

qedawkins enabled auto-merge (squash) August 13, 2024 14:59

qedawkins disabled auto-merge August 13, 2024 15:10

fix bazel

92550d3

qedawkins merged commit 7812c77 into iree-org:main Aug 13, 2024
45 checks passed

qedawkins deleted the expand_and_add_all_intrinsics branch August 13, 2024 15:33
