Hello, I noticed that in the quantization process you use the operation `w_bar = tf.round(tf.stop_gradient(w_hard - w_soft) + w_soft)`. However, `tf.round` is non-differentiable, so it blocks gradients from being backpropagated to the encoder, and the encoder parameters are never updated during training. I believe the correct operation should be `w_bar = tf.stop_gradient(w_hard - w_soft) + w_soft`.
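To illustrate the difference, here is a minimal sketch (TF 2.x eager mode, toy values, with `w_soft`/`w_hard` standing in for the soft and hard quantizer outputs from the issue) showing that the straight-through form lets gradients reach `w_soft`, while wrapping the whole expression in `tf.round` cuts the gradient path:

```python
import tensorflow as tf

# Toy stand-ins for the quantizer outputs (illustrative values only).
w_soft = tf.Variable([0.2, 0.7, 1.4])   # soft, differentiable assignment
w_hard = tf.round(w_soft)               # hard, non-differentiable assignment

# Proposed straight-through form: forward pass equals w_hard,
# backward pass flows through w_soft.
with tf.GradientTape() as tape:
    w_bar = tf.stop_gradient(w_hard - w_soft) + w_soft
    loss = tf.reduce_sum(w_bar ** 2)
print(tape.gradient(loss, w_soft))      # non-None: gradients reach w_soft

# Original form: the outer tf.round has no useful gradient,
# so nothing propagates back toward the encoder.
with tf.GradientTape() as tape:
    w_bar = tf.round(tf.stop_gradient(w_hard - w_soft) + w_soft)
    loss = tf.reduce_sum(w_bar ** 2)
print(tape.gradient(loss, w_soft))      # None: gradient path is blocked
```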