v0.3.4

@EricLBuehler EricLBuehler released this 28 Nov 19:27
· 43 commits to master since this release
68c078f

New features

  • Qwen2-VL support
  • Idefics 3/SmolVLM support
  • 🔥 6x prompt performance boost (all benchmarks faster than or comparable to MLX, llama.cpp)!
  • 🗂️ More efficient non-PagedAttention KV cache implementation!
  • Public tokenization API

Python wheels

The wheels now include support for Windows, Linux, and macOS on x86_64 and aarch64.

MSRV

1.79.0

What's Changed

New Contributors

Full Changelog: v0.3.2...v0.3.4