Thanks for the great work! I was wondering whether there are plans to support lower bit-width kernels (e.g., 2-bit combined with 2:4 sparsity).
For context, we have been working on a project that compresses the difference between a fine-tuned model and its base model, and it turns out this delta can be compressed quite aggressively (see: https://arxiv.org/abs/2312.05215). It would be great if we could leverage Marlin and Sparse Marlin to accelerate inference in this setting.
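To make the request concrete, here is a minimal NumPy sketch of the kind of compression involved: take the delta between hypothetical fine-tuned and base weights, apply a 2:4 sparsity pattern (keep the 2 largest-magnitude values in every group of 4), and quantize the survivors to 2 bits. All names and shapes here are illustrative assumptions, not the paper's or Marlin's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights: the delta between fine-tuned and base is small
w_base = rng.standard_normal((4, 8)).astype(np.float32)
w_ft = w_base + 0.01 * rng.standard_normal((4, 8)).astype(np.float32)
delta = w_ft - w_base

# 2:4 sparsity: in every group of 4 consecutive values, keep the 2 largest magnitudes
groups = delta.reshape(-1, 4)
keep_idx = np.argsort(np.abs(groups), axis=1)[:, 2:]   # indices of the 2 largest per group
mask = np.zeros_like(groups, dtype=bool)
np.put_along_axis(mask, keep_idx, True, axis=1)
sparse = np.where(mask, groups, 0.0).reshape(delta.shape)

# Symmetric 2-bit quantization: 4 signed levels {-2, -1, 0, 1} times a scale
scale = max(float(np.abs(sparse).max()) / 2, 1e-8)
q = np.clip(np.round(sparse / scale), -2, 1).astype(np.int8)
dequant = q.astype(np.float32) * scale
```

A fused kernel would keep `q` packed (4 values per byte) alongside the 2:4 metadata and dequantize on the fly during the matmul, which is where Marlin-style kernels would come in.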
Thanks in advance!
Best regards,
Xiaozhe
cc: @alexm-neuralmagic (I saw there was a PR for 8-bit support, but it was closed)