Hello, I noticed that in the quantization process you use the operation `w_bar = tf.round(tf.stop_gradient(w_hard - w_soft) + w_soft)`. However, `tf.round` is non-differentiable, so it blocks gradients from being backpropagated to the encoder, and the encoder parameters are never updated during training. I believe the correct operation should be `w_bar = tf.stop_gradient(w_hard - w_soft) + w_soft`.
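To illustrate the difference, here is a minimal sketch (TF 2.x eager mode, toy values, with `w_soft`/`w_hard` standing in for the soft and hard quantizer outputs from the issue) showing that the straight-through form lets gradients reach `w_soft`, while wrapping the whole expression in `tf.round` cuts the gradient path:

```python
import tensorflow as tf

# Toy stand-ins for the quantizer outputs (illustrative values only).
w_soft = tf.Variable([0.2, 0.7, 1.4])   # soft, differentiable assignment
w_hard = tf.round(w_soft)               # hard, non-differentiable assignment

# Proposed straight-through form: forward pass equals w_hard,
# backward pass flows through w_soft.
with tf.GradientTape() as tape:
    w_bar = tf.stop_gradient(w_hard - w_soft) + w_soft
    loss = tf.reduce_sum(w_bar ** 2)
print(tape.gradient(loss, w_soft))      # non-None: gradients reach w_soft

# Original form: the outer tf.round has no useful gradient,
# so nothing propagates back toward the encoder.
with tf.GradientTape() as tape:
    w_bar = tf.round(tf.stop_gradient(w_hard - w_soft) + w_soft)
    loss = tf.reduce_sum(w_bar ** 2)
print(tape.gradient(loss, w_soft))      # None: gradient path is blocked
```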