Support i1 datatype with an experimental flag. #18713
Alan and I had an offline sync, and he is revisiting the codegen-side changes. I'm not an expert on the host-side changes, so we need some input from Ben.
compiler/src/iree/compiler/Dialect/Stream/Conversion/FlowToStream/Patterns.cpp
a6f19b2 to 36561a9
6b327fe to 07de9c4
I'll revisit the PR now that we have support for cross-byte i1 loads. @lialan you're still working on the store part, right? Also, could you add an e2e i1 load test to the PR? I think we can iterate on that for the load support.
@hanhanW It looks like stores are not our highest priority at the moment. We primarily want i1 support for loads.
ba8c833 to 872d19e
compiler/src/iree/compiler/Codegen/Common/EmulateNarrowType.cpp
@lialan I'm not sure whether the PR is ready because you added a few commits. Can you mark it as a draft and reopen it when it's ready for review? If you need some input before it's ready for review, feel free to tag me or ping me on Discord.
Apologies, I'm not sure whether some automation or a mis-click on my part marked this as open. I turned it back to draft. There are some issues with the e2e tests and I am fixing them.
Turns out, when converting to flow dialect, some operations such as
b31fc83 to df005bf
I took a quick look at TypePropagationPass.cpp, and I think we are not able to remove it because we can't handle i3/i5/etc. at the moment.
The changes look okay to me, but I think there is a bug in the PR. Please take a look.
3020ae6 to df13e54
compiler/src/iree/compiler/Dialect/Stream/Transforms/test/encode_host_tensors_packing.mlir
Overall this looks good to me; just a few comments about the tests.
compiler/src/iree/compiler/Codegen/LLVMCPU/test/select_x86_64_lowering_strategy.mlir
compiler/src/iree/compiler/Dialect/Stream/Transforms/test/encode_host_tensors_packing.mlir
c0ee0a7 to 04ca8c7
compiler/src/iree/compiler/Dialect/Stream/Transforms/test/encode_host_tensors_packing_i1.mlir
LGTM, please update the checks in encode_host_tensors_pack.mlir.
I think it would be good to have @benvanik review the changes to ElementPackingUtils.cpp. I am unblocking it, but please wait for Ben to review.
@benvanik Ben, can you review it for us?
* Turn on with the `--iree-enable-i1` option.
* Also added e2e tests.

Signed-off-by: Alan Li <[email protected]>
Enable packed i1 datatype storage

This commit introduces support for packed storage of the `i1` (bit) datatype. When subbyte type packing is enabled via the `--iree-experimental-packed-i1-storage` option, vectors of `i1` elements will be stored in a compact packed representation.

For example, a `vector<6xi1>` will occupy a single byte of memory with the 6 bit elements packed together and 2 padding bits. A `vector<3x3xi1>` will take up 2 bytes, with the 9 bit elements packed across the bytes and 7 padding bits.

Limitations:
- To ensure correct behavior, the tiling configuration aligns the innermost dimension data loads with byte boundaries. This is necessitated by the current lack of emulation for unaligned subbyte vector loading/storing.
- Unaligned subbyte emulation support can be added in the future, though it may incur some performance overhead.

This change requires corresponding updates in the frontend to utilize the packed `i1` storage format.

Signed-off-by: Alan Li <[email protected]>
Enable packed i1 datatype storage

This commit introduces support for packed storage of the `i1` (bit) datatype. When subbyte type packing is enabled via the `--iree-experimental-packed-i1-storage` option, vectors of `i1` elements will be stored in a compact packed representation.

For example, a `vector<6xi1>` will occupy a single byte of memory with the 6 bit elements packed together and 2 padding bits. A `vector<3x3xi1>` will take up 2 bytes, with the 9 bit elements packed across the bytes and 7 padding bits.

Limitations:
- To ensure correct behavior, the tiling configuration aligns the innermost dimension data loads with byte boundaries. This is necessitated by the current lack of emulation for unaligned subbyte vector loading/storing.
- Unaligned subbyte emulation support can be added in the future, though it may incur some performance overhead.

This change requires corresponding updates in the frontend to utilize the packed `i1` storage format.

Signed-off-by: Alan Li <[email protected]>
Signed-off-by: Giacomo Serafini <[email protected]>
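The byte counts quoted in the commit message (`vector<6xi1>` in 1 byte with 2 padding bits, `vector<3x3xi1>` in 2 bytes with 7 padding bits) follow from rounding the total bit count up to a whole byte. The Python sketch below reproduces that arithmetic and the byte-alignment condition from the Limitations list. It is an illustration only, not IREE code: all function names are hypothetical, and the LSB-first bit order in `pack_bits_lsb_first` is an assumption that the commit message does not state.

```python
from math import prod

BITS_PER_BYTE = 8


def packed_i1_bytes(shape):
    """Bytes needed to hold prod(shape) i1 elements packed back to back."""
    num_bits = prod(shape)
    # Round up to the next whole byte.
    return (num_bits + BITS_PER_BYTE - 1) // BITS_PER_BYTE


def padding_bits(shape):
    """Unused bits left over in the last byte of the packed buffer."""
    return packed_i1_bytes(shape) * BITS_PER_BYTE - prod(shape)


def inner_dim_is_byte_aligned(tile_shape):
    """True if the innermost tile dimension covers a whole number of bytes.

    This mirrors the limitation in the commit message: without emulation
    for unaligned subbyte accesses, innermost loads must start and end on
    byte boundaries.
    """
    return tile_shape[-1] % BITS_PER_BYTE == 0


def pack_bits_lsb_first(bits):
    """Pack a flat list of 0/1 values into bytes.

    ASSUMPTION (not from the source): element i is stored at bit position
    i % 8 of byte i // 8, i.e. least-significant bit first.
    """
    out = bytearray((len(bits) + BITS_PER_BYTE - 1) // BITS_PER_BYTE)
    for i, bit in enumerate(bits):
        if bit:
            out[i // BITS_PER_BYTE] |= 1 << (i % BITS_PER_BYTE)
    return bytes(out)


if __name__ == "__main__":
    for shape in [(6,), (3, 3)]:
        print(shape, packed_i1_bytes(shape), "byte(s),",
              padding_bits(shape), "padding bit(s)")
```

Under these assumptions, a tile such as `(4, 16)` passes the alignment check while `(4, 6)` does not, which is the kind of configuration the tiling logic would have to avoid until unaligned subbyte emulation is available.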