QKV Fine-grained Tiling

@DefTruth released this 03 Jan 08:51 · 31 commits to main since this release · 82f1d04

What's Changed

  • [ELU] support ELU F32/F16 kernel✔️ by @southkarl in #194
  • [HARDSHRINK][FP16] support HARDSHRINK F32/FP16 kernel by @southkarl in #195
  • [swizzle] update smem swizzle layout tools✔️ by @DefTruth in #196
  • [swizzle] update smem swizzle layout tools✔️ by @DefTruth in #197
  • [swizzle] add padding -> swizzle layout tools🎉 by @DefTruth in #198
  • [HGEMM] HGEMM TN A&B SMEM Swizzle✔️ by @DefTruth in #199
  • [HGEMM] HGEMM TN A&B SMEM Swizzle✔️ by @DefTruth in #200
  • [FA2] shared-qkv + HMMA F32F16F16F32✔️ by @DefTruth in #201
  • [HARDSWISH] HARDSWISH F32/F16 kernel✔️ by @southkarl in #202
  • [FA2] kOStorageAccFloat32 flag -> shared-qkv✔️ by @DefTruth in #203
  • [FA2] kOStorageAccFloat32 -> share-qkv✔️ by @DefTruth in #204
  • [FA2] kOStorageAccFloat32 -> share-qkv✔️ by @DefTruth in #205
  • [FA2] share-kv + MMA F32F16F16F16F32✔️ by @DefTruth in #206
  • [FA2] share-kv + MMA F32F16F16F16F32✔️ by @DefTruth in #207
  • [FA2] tiling-kv + MMA F32F16F16F16F32✔️ by @DefTruth in #208
  • [FA2] tiling-qkv + MMA F32F16F16F32✔️ by @DefTruth in #209
  • [FA2] tiling-qkv + MMA F32F16F16F32✔️ by @DefTruth in #210
  • [FA2] flash-attn-mma fully tiling-qkv🎉 by @DefTruth in #211
  • [FA2] flash-attn-mma fully tiling-qkv🎉 by @DefTruth in #212
  • [FA2] tiling-qkv F32/F16 + swizzle q/qk/qkv🎉 by @DefTruth in #213

Full Changelog: v2.6.13...v2.6.14