[Codegen][GPU] Add kernel config for LLVMGPUTileAndFuse #17791
I did some testing with this, and there are some compiler failures and correctness issues with some dispatches. We should debug these issues before we land this change.
Talked offline; the correctness issues appear to have been a floating-point precision ghost. I verified the end-to-end correctness of this patch on SDXL int8 by generating an image.
I think it is fine to land. There is a similar issue with VectorDistribute fusions, which may be causing minor precision differences. I talked with @MaheshRavishankar and we still want to track this precision difference. Ideally it should be possible to turn these precision differences off when necessary, so that results are bitwise exact regardless of the pipeline. This is not blocking progress, since the numerics are accurate overall, but we should track it.
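For readers following the precision discussion, here is a generic illustration of how regrouping a floating-point reduction can change the result. It is not a reproduction of the dispatches discussed in this PR, and whether this is the actual cause here is not confirmed; the function names and the input values are made up for the example.

```python
def sum_sequential(values):
    """Accumulate left to right in a single running total."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_tiled(values, tile):
    """Sum each tile separately, then sum the partial results.

    This mimics, very loosely, how a tiled or fused reduction may
    regroup the additions compared to a purely sequential loop.
    """
    partials = [
        sum_sequential(values[i:i + tile])
        for i in range(0, len(values), tile)
    ]
    return sum_sequential(partials)


# Illustrative input chosen so that the grouping matters in float64.
values = [1e16, 1.0, -1e16, 1.0]

print(sum_sequential(values))  # 1.0
print(sum_tiled(values, 2))    # 0.0
```

Both results are legitimate roundings of the same mathematical sum; only the association order differs, which is why two pipelines can both be numerically accurate without being bitwise identical.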
This adds kernel configuration logic for targeting simple thread distribution of linalg-based dispatches on LLVMGPU. The configuration logic is primarily copied from the equivalent logic on the SPIR-V side, because the heuristics there are already well tested across the varied target descriptions that SPIR-V has to support. Currently this is locked behind a flag `iree-codegen-llvmgpu-use-tile-and-fuse`.
This adds kernel configuration logic for targeting simple thread distribution of linalg-based dispatches on LLVMGPU. The configuration logic is primarily copied from the equivalent logic on the SPIR-V side, because the heuristics there are already well tested across the varied target descriptions that SPIR-V has to support.
Currently this is locked behind a flag `iree-codegen-llvmgpu-use-tile-and-fuse`. Future patches will add specialized logic for matmul.
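To give a rough idea of what "simple thread distribution" can mean, here is a toy sketch that splits a fixed thread budget across the parallel loops of a dispatch, innermost loop first. It is a hypothetical heuristic written for illustration only: the function name, the default of 64 threads, and the power-of-two rule are assumptions, and none of this is taken from `ConfigUtils.cpp` or the SPIR-V configuration logic.

```python
def distribute_threads(loop_bounds, num_threads=64):
    """Assign a thread count to each parallel loop, innermost first.

    Hypothetical toy heuristic (not IREE's implementation): for each
    loop, pick the largest power of two that divides the loop bound
    and still fits in the remaining thread budget.
    """
    counts = [1] * len(loop_bounds)
    remaining = num_threads
    for i in reversed(range(len(loop_bounds))):
        t = 1
        # Double t while it still fits the budget and divides the bound.
        while t * 2 <= remaining and loop_bounds[i] % (t * 2) == 0:
            t *= 2
        counts[i] = t
        remaining //= t
    return counts


print(distribute_threads([128, 256]))  # [1, 64]
print(distribute_threads([16, 8]))     # [8, 8]
print(distribute_threads([3, 5]))      # [1, 1]
```

Real heuristics also have to account for target limits such as subgroup size and maximum workgroup dimensions, which this sketch ignores.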