[Codegen][GPU] Add support for all other intrinsics to TileAndFuse #18179
Force-pushed from 8edf5b1 to 2166c03.
This is great, thanks Quinn! Let's add some simple tests for the remaining newly supported intrinsics, and then I think it's good to land. I verified correctness of matmul for `MFMA_I32_32x32x16_I8` and `MFMA_F32_32x32x8_F16` with this patch as well.
compiler/src/iree/compiler/Codegen/Dialect/GPU/Transforms/Transforms.cpp
compiler/src/iree/compiler/Codegen/Dialect/GPU/Transforms/test/concretize_mma_shapes.mlir
compiler/src/iree/compiler/Codegen/Dialect/GPU/Transforms/test/distribute_mma_to_lanes.mlir
compiler/src/iree/compiler/Codegen/LLVMGPU/test/ROCDL/pipeline_tile_and_fuse.mlir
@hanhanW was this the intended PR for this comment?
No... let me delete it. I was confused about where my comment is in llvm/llvm-project#102952 (comment). Sorry about that.
This adds the ConcretizeMmaShapes pass to the LLVMGPUTileAndFuse pipeline to support the other intrinsic types, in particular MFMA and WMMA variants that require reshaping the accumulator to match the requirements of the layout.
This also reworks the reshaping code to use SingleSubgroupLayout instead of VectorExt::PerDimLayoutAttr, which drops an unneeded dialect dependency and simplifies the IR for cases where reshaping is not needed. In particular, when a layout has a unit `outer` dimension, no additional reshaping is needed, so the reshapes are omitted in such cases. In the future, such reshaping could still be done to pre-swizzle the data needed for the MMA during the store to shared memory, but the details of how best to implement that are left as a TODO.
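To illustrate the unit-`outer` rule described above, here is a minimal standalone sketch. It is not the code from this PR: `LayoutSketch`, `needsReshape`, and `concretizeShape` are hypothetical names, the struct only loosely mirrors the idea of a per-subgroup layout, and the assumption that each dimension with a non-unit `outer` count is split into `{outer, thread * element}` is made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a single-subgroup MMA layout (not IREE's type).
struct LayoutSketch {
  std::vector<int64_t> outer;   // per-dimension repetition count
  std::vector<int64_t> thread;  // lanes along each dimension
  std::vector<int64_t> element; // contiguous elements held by each lane
};

// A reshape is only required when some dimension has a non-unit outer count.
bool needsReshape(const LayoutSketch &layout) {
  for (int64_t o : layout.outer)
    if (o != 1)
      return true;
  return false;
}

// Assumed expansion rule: a dimension with outer != 1 becomes
// {outer, thread * element}; a dimension with outer == 1 is left unsplit.
std::vector<int64_t> concretizeShape(const LayoutSketch &layout) {
  assert(layout.outer.size() == layout.thread.size() &&
         layout.thread.size() == layout.element.size());
  std::vector<int64_t> shape;
  for (std::size_t i = 0; i < layout.outer.size(); ++i) {
    if (layout.outer[i] != 1)
      shape.push_back(layout.outer[i]);
    shape.push_back(layout.thread[i] * layout.element[i]);
  }
  return shape;
}
```

With illustrative values `outer = {4, 1}`, `thread = {2, 32}`, `element = {4, 1}`, the sketch splits the first dimension and yields a three-dimensional shape, whereas a layout whose `outer` counts are all 1 keeps its original rank and needs no reshape ops at all.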
Force-pushed from 2166c03 to 0a6a163.