
[FR]: Olive Quantization #1588

Closed
7 tasks
miaoqiz opened this issue Jan 30, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

miaoqiz commented Jan 30, 2025

Proposal Summary

Hi,

Thanks very much for the tool and support! I appreciate it!

Two questions regarding `olive quantize`:

  1. Is there a way to use it to quantize an adapter? I understand that `olive convert-adapters` can perform int4 quantization; what about other precisions (e.g., uint4, uint8, uint16, fp4, fp8, fp16, nf4)?
  2. The documentation lists an argument `-a, --adapter_path` for `olive quantize`, but `olive quantize -h` does not show that argument.

Thanks!
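For reference, here is a rough sketch of the two commands I'm referring to. The flag names and values are my best guess from the CLI help and may not match your Olive version, so please treat them as illustrative only (the `DRY_RUN` variable just echoes the commands so the sketch can run without Olive installed):

```shell
# Illustrative sketch only: flag names are assumptions, not verified
# against a specific Olive release. Check `olive quantize -h` and
# `olive convert-adapters -h` for the flags your version accepts.
DRY_RUN="${DRY_RUN:-echo}"

# Quantize a full model with olive quantize.
$DRY_RUN olive quantize \
  --model_name_or_path microsoft/Phi-3-mini-4k-instruct \
  --algorithm awq \
  --output_path models/phi3-awq

# Convert a trained LoRA adapter for ONNX Runtime, with int4 quantization.
$DRY_RUN olive convert-adapters \
  --adapter_path ./my-lora-adapter \
  --output_path ./adapter.onnx_adapter \
  --dtype int4
```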

What component(s) does this request affect?

  • OliveModels
  • OliveSystems
  • OliveEvaluator
  • Metrics
  • Engine
  • Passes
  • Other
@miaoqiz miaoqiz added the enhancement New feature or request label Jan 30, 2025
@devang-ml
Contributor

The `olive quantize` command quantizes the entire model, or skips the adapter weights in the case of an ONNX model. It does not quantize adapters on their own.

What is your scenario? What precision are your master weights, and what precision do you intend for the adapter weights?

@miaoqiz
Author

miaoqiz commented Feb 11, 2025

Thanks very much! I appreciate the information.

@miaoqiz miaoqiz closed this as completed Feb 11, 2025