[Codegen][GPU] Add support for all other intrinsics to TileAndFuse #18179

Merged: qedawkins merged 5 commits into iree-org:main from expand_and_add_all_intrinsics on Aug 13, 2024

qedawkins

This adds the ConcretizeMmaShapes pass to the LLVMGPUTileAndFuse pipeline to support the other intrinsic types, in particular the MFMA and WMMA variants that require reshaping the accumulator to match the requirements of the layout.

This also reworks the reshaping code to use SingleSubgroupLayout instead of VectorExt::PerDimLayoutAttr, which drops an unneeded dialect dependency and simplifies the IR for cases where reshaping is not needed. In particular, when a layout has a unit `outer` dimension, no additional reshaping is needed, so the reshapes are omitted in such cases. In the future we could still do such reshaping in order to pre-swizzle the data needed for the MMA during the store to shared memory, but the details of how best to implement that are left as a TODO.
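
For readers unfamiliar with the layouts, the unit-`outer` rule described above can be sketched roughly as follows. This is only an illustration in Python, not the code in this PR: the `SubgroupLayout` class, the `needs_reshape` and `expanded_shape` helpers, and the example layout values are all assumptions made for the sketch, and the real pass operates on MLIR tensor types rather than Python lists.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SubgroupLayout:
    """Hypothetical stand-in for a per-operand single-subgroup layout.

    Each field holds one entry per tensor dimension (rows, cols).
    """
    outer: Tuple[int, int]    # repetitions of the thread tile per dimension
    thread: Tuple[int, int]   # threads per dimension
    element: Tuple[int, int]  # contiguous elements per thread per dimension


def needs_reshape(layout: SubgroupLayout) -> bool:
    # Assumption: a reshape is only required when some outer entry is non-unit.
    return any(o != 1 for o in layout.outer)


def expanded_shape(layout: SubgroupLayout) -> List[int]:
    # Split each dimension with a non-unit outer into [outer, thread * element];
    # dimensions with a unit outer are left as a single dimension.
    shape: List[int] = []
    for o, t, e in zip(layout.outer, layout.thread, layout.element):
        if o == 1:
            shape.append(t * e)
        else:
            shape.extend([o, t * e])
    return shape


# Illustrative accumulator layouts (assumed values, not taken from this PR).
acc_16x16 = SubgroupLayout(outer=(1, 1), thread=(4, 16), element=(4, 1))
acc_32x32 = SubgroupLayout(outer=(4, 1), thread=(2, 32), element=(4, 1))

print(needs_reshape(acc_16x16), expanded_shape(acc_16x16))
print(needs_reshape(acc_32x32), expanded_shape(acc_32x32))
```

Under these assumed layouts, the 16x16 accumulator keeps its shape because every `outer` entry is 1, while the 32x32 accumulator has its row dimension split into an outer part and a per-tile part.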

@qedawkins qedawkins requested a review from Max191 August 9, 2024 17:27
@qedawkins qedawkins force-pushed the expand_and_add_all_intrinsics branch from 8edf5b1 to 2166c03 Compare August 9, 2024 17:52
@Max191 Max191 requested review from lialan and hanhanW August 12, 2024 13:25

Max191 commented Aug 12, 2024

cc @hanhanW @lialan just for visibility. This is a bit similar to the expand_shape portion of the GPUMaterializeEncoding patterns, so may be interesting to you guys.

@Max191 Max191 left a comment

This is great, thanks Quinn! Let's add some simple tests for the remaining new supported intrinsics, and then I think it's good to land. I verified correctness of matmul for MFMA_I32_32x32x16_I8 and MFMA_F32_32x32x8_F16 with this patch as well.

@qedawkins

@hanhanW was this the intended PR for this comment?

hanhanW commented Aug 12, 2024

> @hanhanW was this the intended PR for this comment?

no... let me delete it. I was confused about where my comment in llvm/llvm-project#102952 (comment) was. Sorry about that.

@qedawkins qedawkins force-pushed the expand_and_add_all_intrinsics branch from 2166c03 to 0a6a163 Compare August 13, 2024 14:58
@qedawkins qedawkins enabled auto-merge (squash) August 13, 2024 14:59
@qedawkins qedawkins disabled auto-merge August 13, 2024 15:10
@qedawkins qedawkins merged commit 7812c77 into iree-org:main Aug 13, 2024
45 checks passed
@qedawkins qedawkins deleted the expand_and_add_all_intrinsics branch August 13, 2024 15:33