
Set scaling factor of a quantizer directly #1079

Closed
jurevreca12 opened this issue Oct 29, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@jurevreca12

Hello, I am trying to create a quantized convolutional neural network with Brevitas in which all scaling factors are equal to 1. However, I can't find a way to set the scaling factor.
This is the code I use:

import onnx
import torch.nn.functional as F
import brevitas.nn as qnn
from torch.nn import Module

class QuantWeightActLeNet(Module):
    def __init__(self):
        super(QuantWeightActLeNet, self).__init__()
        self.quant_inp = qnn.QuantIdentity(bit_width=4) 
        self.conv1 = qnn.QuantConv2d(3, 6, 5, bias=True, weight_bit_width=4)
        self.relu1 = qnn.QuantReLU(bit_width=4)
        self.conv2 = qnn.QuantConv2d(6, 16, 5, bias=True, weight_bit_width=4)
        self.relu2 = qnn.QuantReLU(bit_width=3)
        self.fc1   = qnn.QuantLinear(16*5*5, 120, bias=True, weight_bit_width=4)
        self.relu3 = qnn.QuantReLU(bit_width=4)
        self.fc2   = qnn.QuantLinear(120, 84, bias=True, weight_bit_width=4)
        self.relu4 = qnn.QuantReLU(bit_width=4)
        self.fc3   = qnn.QuantLinear(84, 10, bias=True)

    def forward(self, x):
        out = self.quant_inp(x)
        out = self.relu1(self.conv1(out))
        out = F.max_pool2d(out, 2)
        out = self.relu2(self.conv2(out))
        out = F.max_pool2d(out, 2)
        out = out.reshape(out.shape[0], -1)
        out = self.relu3(self.fc1(out))
        out = self.relu4(self.fc2(out))
        out = self.fc3(out)
        return out

model = QuantWeightActLeNet()
ishape = (1, 3, 32, 32)
# `transform` is assumed to be a local helper that exports the Brevitas model to QONNX.
qonnx_model = transform.brevitas_to_qonnx(model, ishape)
onnx.save(qonnx_model.model, 'QuantWeightActLeNet.onnx')

However, the QONNX model does not have unit scaling factors. Instead, the input scaling factor is 0.125
(as seen in the image below). How can I set the scaling factors of these quantizers? I know that I can use a
custom class for a quantizer, but I would like to create a plethora of models with different bit widths, and
I don't want to create a class for every bit width. What I would like is an argument in the
QuantIdentity constructor that sets the scaling factor directly, e.g. QuantIdentity(bit_width=4, scale_factor=1.0).
Is such a thing possible? If not, I think it would be a useful addition.

[Image: the exported QONNX model visualized, showing an input quantizer scale of 0.125]

jurevreca12 added the enhancement label on Oct 29, 2024
@Giuseppe5
Collaborator

Giuseppe5 commented Oct 29, 2024

bit_width = 4
scaling_init = 2 ** (bit_width - 1) - 1  # largest signed (narrow-range) integer at this bit width
self.conv1 = qnn.QuantConv2d(
    3, 6, 5, bias=True,
    weight_bit_width=bit_width,
    weight_scaling_impl_type='const',
    weight_scaling_init=scaling_init)

This should do the trick (I picked one layer from your example at random; adapt and reuse at will).
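
For a quick sanity check that the resulting scale is 1, something like the following sketch should work (assuming the default signed, narrow-range weight quantizer; quant_weight() returns a quantized tensor carrying the resolved scale):

import brevitas.nn as qnn

bit_width = 4
scaling_init = 2 ** (bit_width - 1) - 1  # 7: largest signed narrow-range 4-bit integer
conv = qnn.QuantConv2d(
    3, 6, 5, bias=True,
    weight_bit_width=bit_width,
    weight_scaling_impl_type='const',
    weight_scaling_init=scaling_init)

# scale = scaling_init / max_int = 7 / 7 = 1
print(conv.quant_weight().scale)  # expected: tensor(1.)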

@Giuseppe5
Collaborator

For activations, it is slightly different:

bit_width = 4
scaling_init = 2 ** bit_width - 1  # largest unsigned integer at this bit width
qnn.QuantReLU(bit_width=bit_width, scaling_impl_type='const', scaling_init=scaling_init)

The reason is that QuantReLU uses an unsigned integer quantizer, and there are subtle differences in how quant arguments are overridden between the different layers.
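
In other words, the scale works out to scaling_init divided by the largest integer the quantizer can represent: 2**(bit_width-1) - 1 for the signed weight quantizer, but 2**bit_width - 1 for the unsigned activation quantizer. A small sketch to verify (return_quant_tensor=True makes the layer return a quantized tensor whose scale can be inspected):

import torch
import brevitas.nn as qnn

bit_width = 4
relu = qnn.QuantReLU(
    bit_width=bit_width,
    scaling_impl_type='const',
    scaling_init=2 ** bit_width - 1,  # 15: largest unsigned 4-bit integer
    return_quant_tensor=True)

out = relu(torch.randn(1, 6, 28, 28))
# scale = scaling_init / max_int = 15 / 15 = 1
print(out.scale)  # expected: tensor(1.)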

For more info, check our notebook tutorials, or feel free to ask here.

@jurevreca12
Author

Thanks @Giuseppe5 I manged to get it working. I've actually being trying to figure this out for a while now, but it was quite hard to figure out how this dependency injection works.
