
Set scaling factor of a quantizer directly #1079

Closed
jurevreca12 opened this issue Oct 29, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@jurevreca12

Hello, I am trying to create a quantized convolutional neural network with Brevitas in which all scaling factors are equal to 1. However, I can't find a way to set the scaling factor.
This is the code I use:

import onnx
import torch.nn.functional as F
import brevitas.nn as qnn
from torch.nn import Module

class QuantWeightActLeNet(Module):
    def __init__(self):
        super(QuantWeightActLeNet, self).__init__()
        self.quant_inp = qnn.QuantIdentity(bit_width=4) 
        self.conv1 = qnn.QuantConv2d(3, 6, 5, bias=True, weight_bit_width=4)
        self.relu1 = qnn.QuantReLU(bit_width=4)
        self.conv2 = qnn.QuantConv2d(6, 16, 5, bias=True, weight_bit_width=4)
        self.relu2 = qnn.QuantReLU(bit_width=3)
        self.fc1   = qnn.QuantLinear(16*5*5, 120, bias=True, weight_bit_width=4)
        self.relu3 = qnn.QuantReLU(bit_width=4)
        self.fc2   = qnn.QuantLinear(120, 84, bias=True, weight_bit_width=4)
        self.relu4 = qnn.QuantReLU(bit_width=4)
        self.fc3   = qnn.QuantLinear(84, 10, bias=True)

    def forward(self, x):
        out = self.quant_inp(x)
        out = self.relu1(self.conv1(out))
        out = F.max_pool2d(out, 2)
        out = self.relu2(self.conv2(out))
        out = F.max_pool2d(out, 2)
        out = out.reshape(out.shape[0], -1)
        out = self.relu3(self.fc1(out))
        out = self.relu4(self.fc2(out))
        out = self.fc3(out)
        return out

model = QuantWeightActLeNet()
ishape = (1, 3, 32, 32)
# `transform` is assumed to be a local helper that exports the Brevitas model to QONNX.
qonnx_model = transform.brevitas_to_qonnx(model, ishape)
onnx.save(qonnx_model.model, 'QuantWeightActLeNet.onnx')

However, the QONNX model does not have unit scaling factors. Instead, the input scaling factor is 0.125
(as seen in the image below). How can I set the scaling factors of these quantizers? I know that I can use a
custom class for a quantizer, but I would like to create a plethora of models with different bit widths, and
I don't want to create a class for every bit width. What I would like is an argument in the
QuantIdentity constructor that sets the scaling factor directly, e.g. QuantIdentity(bit_width=4, scale_factor=1.0).
Is such a thing possible? If not, I think it would be a useful addition.

[Image: the exported QONNX model visualized, showing an input quantizer scale of 0.125]

jurevreca12 added the enhancement label on Oct 29, 2024
@Giuseppe5
Collaborator

Giuseppe5 commented Oct 29, 2024

bit_width = 4
scaling_init = 2 ** (bit_width - 1) - 1  # largest signed (narrow-range) integer at this bit width
self.conv1 = qnn.QuantConv2d(
    3, 6, 5, bias=True,
    weight_bit_width=bit_width,
    weight_scaling_impl_type='const',
    weight_scaling_init=scaling_init)

This should do the trick (I picked one layer from your example at random; adapt and reuse at will).
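
For a quick sanity check that the resulting scale is 1, something like the following sketch should work (assuming the default signed, narrow-range weight quantizer; quant_weight() returns a quantized tensor carrying the resolved scale):

import brevitas.nn as qnn

bit_width = 4
scaling_init = 2 ** (bit_width - 1) - 1  # 7: largest signed narrow-range 4-bit integer
conv = qnn.QuantConv2d(
    3, 6, 5, bias=True,
    weight_bit_width=bit_width,
    weight_scaling_impl_type='const',
    weight_scaling_init=scaling_init)

# scale = scaling_init / max_int = 7 / 7 = 1
print(conv.quant_weight().scale)  # expected: tensor(1.)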

@Giuseppe5
Collaborator

For activations, it is slightly different:

bit_width = 4
scaling_init = 2 ** bit_width - 1  # largest unsigned integer at this bit width
qnn.QuantReLU(bit_width=bit_width, scaling_impl_type='const', scaling_init=scaling_init)

The reason is that QuantReLU uses an unsigned integer quantizer, and there are subtle differences in how quant arguments are overridden between the different layers.
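
In other words, the scale works out to scaling_init divided by the largest integer the quantizer can represent: 2**(bit_width-1) - 1 for the signed weight quantizer, but 2**bit_width - 1 for the unsigned activation quantizer. A small sketch to verify (return_quant_tensor=True makes the layer return a quantized tensor whose scale can be inspected):

import torch
import brevitas.nn as qnn

bit_width = 4
relu = qnn.QuantReLU(
    bit_width=bit_width,
    scaling_impl_type='const',
    scaling_init=2 ** bit_width - 1,  # 15: largest unsigned 4-bit integer
    return_quant_tensor=True)

out = relu(torch.randn(1, 6, 28, 28))
# scale = scaling_init / max_int = 15 / 15 = 1
print(out.scale)  # expected: tensor(1.)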

For more info, check our notebook tutorials, or feel free to ask here.

@jurevreca12
Author

Thanks @Giuseppe5 I manged to get it working. I've actually being trying to figure this out for a while now, but it was quite hard to figure out how this dependency injection works.
