
Commit

add the head_dim=96 dispatch for block attention
zeroRains committed Jan 11, 2025
1 parent 4549c74 commit dc14f43
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions paddle/phi/kernels/fusion/gpu/block_attn.h
@@ -1600,6 +1600,10 @@ void dispatch_blha_impl_headsize(const phi::GPUContext &dev_ctx,
       dispatch_blha_impl_blocksize<T, 64, 64>(
           params, dev_ctx.stream(), load_func, store_func, use_cachekv_int8);
       break;
+    case 96:
+      dispatch_blha_impl_blocksize<T, 96, 128>(
+          params, dev_ctx.stream(), load_func, store_func, use_cachekv_int8);
+      break;
     case 128:
       dispatch_blha_impl_blocksize<T, 128, 128>(
           params, dev_ctx.stream(), load_func, store_func, use_cachekv_int8);
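
For context, here is a minimal, hypothetical sketch of the dispatch pattern this hunk extends, not the PaddlePaddle implementation itself: a runtime head_dim is switched into a template instantiation whose head-dim and block-size parameters are fixed at compile time, and the new case pairs head_dim 96 with block size 128. The names blha_kernel_sketch and dispatch_headsize_sketch are invented for illustration.

    #include <cstdio>
    #include <stdexcept>

    // Hypothetical stand-in for the templated block-attention kernel launcher;
    // HeadDim and BlockSize are compile-time constants baked into the kernel.
    template <typename T, int HeadDim, int BlockSize>
    void blha_kernel_sketch() {
      std::printf("dispatched head_dim=%d block_size=%d\n", HeadDim, BlockSize);
    }

    // Hypothetical dispatcher: maps a runtime head_dim to one template
    // instantiation, mirroring the switch in dispatch_blha_impl_headsize.
    template <typename T>
    void dispatch_headsize_sketch(int head_dim) {
      switch (head_dim) {
        case 64:
          blha_kernel_sketch<T, 64, 64>();
          break;
        case 96:  // the case this commit adds: head_dim 96 paired with block size 128
          blha_kernel_sketch<T, 96, 128>();
          break;
        case 128:
          blha_kernel_sketch<T, 128, 128>();
          break;
        default:
          throw std::invalid_argument("unsupported head_dim in block attention sketch");
      }
    }

    int main() {
      dispatch_headsize_sketch<float>(96);  // prints "dispatched head_dim=96 block_size=128"
      return 0;
    }

Because the head dimension is a template parameter rather than a runtime value, each supported size needs its own case; without the case 96 branch, models whose attention head dimension is 96 would fall through to the unsupported-size path.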
