Release v0.9.0
Highlights
- Initial support for graph quantization, which programmatically generates a quantized model from a floating-point one. ImageNet examples with PTQ are available at https://github.com/Xilinx/brevitas/tree/master/src/brevitas_examples/imagenet_classification/ptq .
- Initial support for QuantMultiheadAttention, which is used, e.g., for ViT support in the PTQ examples above.
- Various improvements to graph equalization, which are leveraged in the PTQ examples above.
- New accumulation-aware quantizers, to train for low-precision accumulation, based on our A2Q paper https://arxiv.org/abs/2301.13376 .
- Experimental support for BatchQuant quantizer, based on https://arxiv.org/abs/2105.08952 , currently still untested.
- Initial support for learned rounding.
Overview of changes
Graph quantization
- Initial graph quantization support by @Giuseppe5 in #549 #574 #532 #579
Quantized layers
- Initial support for QuantMultiheadAttention #568
- Breaking change: rename Quant(Adaptive)AvgPool to Trunc(Adaptive)AvgPool by @volcacius in #562
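Because of the rename above, code that still references the old class names needs updating. A minimal sketch of a source-level rewrite follows. It assumes the old names appear as plain identifiers starting with `Quant` and containing `AvgPool` or `AdaptiveAvgPool`. The helper `migrate_avgpool_names` and the sample strings are hypothetical and are not part of Brevitas.

```python
import re

# Matches the old identifiers: QuantAvgPool* and QuantAdaptiveAvgPool*.
# The word boundary keeps unrelated names such as QuantConv2d untouched.
_OLD_NAME = re.compile(r"\bQuant((?:Adaptive)?AvgPool)")


def migrate_avgpool_names(source: str) -> str:
    """Rewrite Quant(Adaptive)AvgPool identifiers to Trunc(Adaptive)AvgPool."""
    return _OLD_NAME.sub(r"Trunc\1", source)


# Illustrative input only; the constructor arguments are placeholders.
old_code = "pool = QuantAvgPool2d(kernel_size=2)\nhead = QuantAdaptiveAvgPool2d(1)"
new_code = migrate_avgpool_names(old_code)
print(new_code)
```

A text-based rewrite like this is only a starting point; review the result by hand, since the rename may come with behavioural changes that a regex cannot capture.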
Quantizers
- Weight normalization-based integer quantizers by @i-colbert in #559
- Accumulator-aware weight quantization by @i-colbert in #567
- BatchQuant quantizers support by @volcacius in #563
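The motivation for accumulator-aware quantization can be illustrated with a worst-case bound: if the integer weights of a dot product have a small enough L1 norm, the accumulator cannot overflow for any input. The sketch below is pure Python and assumes unsigned N-bit inputs and a signed P-bit accumulator. The function names are hypothetical. It is a simplified, sufficient condition, not the Brevitas quantizer implementation and not necessarily the exact formulation used in the paper.

```python
def max_l1_norm(acc_bits: int, input_bits: int) -> float:
    """Largest weight L1 norm that is guaranteed not to overflow.

    Assumes unsigned inputs in [0, 2**input_bits - 1] and a signed
    accumulator whose largest positive value is 2**(acc_bits - 1) - 1.
    """
    max_input = 2 ** input_bits - 1
    max_acc = 2 ** (acc_bits - 1) - 1
    return max_acc / max_input


def fits_accumulator(int_weights, acc_bits: int, input_bits: int) -> bool:
    """Sufficient (conservative) overflow check based on the L1 norm."""
    l1 = sum(abs(w) for w in int_weights)
    return l1 <= max_l1_norm(acc_bits, input_bits)


weights = [3, 2, 5, 1]  # integer weights of one output channel, L1 norm = 11
ok_16 = fits_accumulator(weights, acc_bits=16, input_bits=8)
ok_12 = fits_accumulator(weights, acc_bits=12, input_bits=8)
print(ok_16, ok_12)
```

With 8-bit inputs, the worst-case sum for these weights is 11 * 255 = 2805, which fits a 16-bit accumulator but exceeds the 12-bit maximum of 2047. Training with such a constraint on the weights is what makes low-precision accumulation safe by construction.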
QuantTensor
- Support to move QuantTensor across devices by @Giuseppe5 in #528
- Initial support for interpolate and pixel_shuffle by @volcacius in #578
PTQ
- Batch Norm support in graph equalization by @Giuseppe5 in #531
- Mul support in graph equalization by @Giuseppe5 in #530
- Learned round support by @Giuseppe5 in #573
- MultiheadAttention and LayerNorm support in graph equalization by @Giuseppe5 in #555
- Fix calibration over large number of batches by @Giuseppe5 in #523
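The idea behind equalization can be sketched for two linear layers separated by a ReLU. The sketch assumes the standard cross-layer equalization scheme, in which each hidden channel is rescaled so that the weight ranges on both sides match while the network output stays the same. It is pure Python with hypothetical helper names, ignores biases, and assumes non-zero weight ranges. It is not the Brevitas implementation, which works on traced graphs and covers more layer types.

```python
import math


def relu(v):
    return [max(0.0, x) for x in v]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def equalize(w1, w2):
    """Rescale w1 (hidden x in) and w2 (out x hidden) channel by channel."""
    n_hidden = len(w1)
    scales = []
    for i in range(n_hidden):
        r1 = max(abs(x) for x in w1[i])       # range of w1 output channel i
        r2 = max(abs(row[i]) for row in w2)   # range of w2 input channel i
        scales.append(math.sqrt(r1 / r2))
    # Dividing row i of w1 and multiplying column i of w2 by the same
    # positive scale leaves w2 @ relu(w1 @ x) unchanged.
    w1_eq = [[x / scales[i] for x in w1[i]] for i in range(n_hidden)]
    w2_eq = [[row[i] * scales[i] for i in range(n_hidden)] for row in w2]
    return w1_eq, w2_eq


w1 = [[8.0, -4.0], [0.1, 0.2]]
w2 = [[0.5, 4.0], [-0.25, 2.0]]
x = [1.0, -3.0]

w1_eq, w2_eq = equalize(w1, w2)
y_ref = matvec(w2, relu(matvec(w1, x)))
y_eq = matvec(w2_eq, relu(matvec(w1_eq, x)))
print(y_ref, y_eq)
```

The rescaling works because ReLU commutes with multiplication by a positive constant. After equalization, each channel has the same weight range in both layers, which tends to make per-tensor quantization less lossy.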
Export
- Itemize scalar quantize args only in TorchScript QCDQ by @volcacius in #561
- Round avgpool export fixes by @volcacius in #562
CI, linting
- Linter isort by @Giuseppe5 in #505
- CI: bump isort from 5.10.1 to 5.11.5 by @Giuseppe5 in #540
- Test: enable parallelism with pytest-xdist by @Giuseppe5 in #513
- GHA workflow improvement by @Giuseppe5 in #507
- Add support for yapf by @Giuseppe5 in #511
FX
- Disable FX backport on 1.8.1+ by @volcacius in #504
Examples
- Pretrained ResNet18 example on CIFAR10 targeting FINN by @volcacius in #577
- Graph quantization + PTQ examples and benchmarking scripts by @Giuseppe5 in #547 #575 #576
Full Changelog: v0.8.0...v0.9.0