
[FR]: Olive Quantization #1588

Closed
7 tasks
miaoqiz opened this issue Jan 30, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

miaoqiz commented Jan 30, 2025

Proposal Summary

Hi,

Thanks very much for the tool and support! I appreciate it!

Two questions regarding `olive quantize`:

  1. Is there a way to use it to quantize an adapter? I understand that `olive convert-adapters` can perform int4 quantization; what about other precisions (e.g., uint4, uint8, uint16, fp4, fp8, fp16, nf4)?
  2. The documentation lists an argument `-a, --adapter_path` for `olive quantize`, but `olive quantize -h` does not show that argument.

Thanks!
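For reference, here is a rough sketch of the two commands I'm referring to. The flag names and values are my best guess from the CLI help and may not match your Olive version, so please treat them as illustrative only (the `DRY_RUN` variable just echoes the commands so the sketch can run without Olive installed):

```shell
# Illustrative sketch only: flag names are assumptions, not verified
# against a specific Olive release. Check `olive quantize -h` and
# `olive convert-adapters -h` for the flags your version accepts.
DRY_RUN="${DRY_RUN:-echo}"

# Quantize a full model with olive quantize.
$DRY_RUN olive quantize \
  --model_name_or_path microsoft/Phi-3-mini-4k-instruct \
  --algorithm awq \
  --output_path models/phi3-awq

# Convert a trained LoRA adapter for ONNX Runtime, with int4 quantization.
$DRY_RUN olive convert-adapters \
  --adapter_path ./my-lora-adapter \
  --output_path ./adapter.onnx_adapter \
  --dtype int4
```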

What component(s) does this request affect?

  • OliveModels
  • OliveSystems
  • OliveEvaluator
  • Metrics
  • Engine
  • Passes
  • Other
@miaoqiz miaoqiz added the enhancement New feature or request label Jan 30, 2025
@devang-ml
Contributor

The `olive quantize` command quantizes the entire model, or skips the adapter weights in the case of an ONNX model. It does not quantize adapters on their own.

What is your scenario? What precision are your master weights, and what precision do you intend for the adapter weights?

@miaoqiz
Author

miaoqiz commented Feb 11, 2025

Thanks very much! I appreciate the information.

@miaoqiz miaoqiz closed this as completed Feb 11, 2025