
Generative Adversarial Networks

Info GAN

  • The idea is to provide a latent code that has meaningful and consistent effects on the output.
  • The Generator input is split into two parts: the traditional noise vector and a new “latent code” vector.
  • The codes are then made meaningful by maximizing the Mutual Information between the code and the generator output.
  • To calculate the regularization term, you don’t need an estimation of the code itself, but rather you need to estimate the likelihood of seeing that code for the given generated input.
  • Therefore, the output of Q is not the code value itself, but instead the statistics of the distribution you chose to model the code.
  • Q outputs $Q(c|x)$, the probability distribution of the code $c$ given the image $x$.
  • For instance, if you use a continuous-valued code (e.g. between -1 and +1), you might model $Q(c|x)$ as a Normal/Gaussian distribution. In that case, $Q$ would output two values for this part of the code: the mean and the standard deviation.
  • Once you know the mean and standard deviation you can calculate the likelihood $Q(c|x)$, which is what you need for the regularization term.
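A minimal sketch of the last two bullets, assuming a 1-D continuous code modeled as a Gaussian (the function name and example values are hypothetical, not from InfoGAN's reference implementation): given the mean and standard deviation that Q predicts for a generated image, the log-likelihood of the sampled code is just the Gaussian log-density, which is what enters the mutual-information regularization term.

```python
import math

def gaussian_log_likelihood(code, mean, std):
    """Log-likelihood of a continuous latent code under the
    Gaussian N(mean, std^2) that Q predicts for a generated image."""
    var = std ** 2
    return -0.5 * math.log(2 * math.pi * var) - (code - mean) ** 2 / (2 * var)

# Hypothetical values: Q predicts mean=0.1, std=0.5 for a code sampled at 0.3.
ll = gaussian_log_likelihood(0.3, 0.1, 0.5)
```

Maximizing this term pushes Q's predicted mean toward the code that was actually fed to the generator, which is how the code becomes recoverable from (and hence meaningful for) the output.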

Zero centered gradient penalty

$$L_{reg} = \left( \frac{\partial D(x)}{\partial x} - 0\right)^2$$

  • The reason as to why this works is that this essentially weakens the discriminator, which can be really helpful at the beginning of training.
  • A powerful discriminator looks like a steep function, and as a result, there's no useful gradient signal for updating the generator.
  • The goal is to penalize the discriminator for deviating from the Nash equilibrium.
  • The Nash equilibrium is reached when the generator distribution produces the true data distribution and the discriminator is equal to 0 on the data manifold.
  • The simplest way to achieve this is to penalize the discriminator's gradient on real data alone.

Picture a sigmoid function that becomes steeper and steeper.
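The sigmoid picture and the penalty can be sketched together in a toy 1-D setting (a hand-rolled finite-difference sketch, not an autograd implementation; the function names are made up for illustration): a steeper sigmoid has a larger gradient at the decision boundary, so the zero-centered penalty $(\partial D(x)/\partial x)^2$ punishes it more, pushing the discriminator back toward a flat, weak function.

```python
import numpy as np

def zero_centered_penalty(D, x, eps=1e-4):
    """L_reg = (dD(x)/dx)^2 for a toy 1-D discriminator D,
    approximating the derivative with a central finite difference."""
    grad = (D(x + eps) - D(x - eps)) / (2 * eps)
    return grad ** 2

def make_sigmoid(k):
    """Toy discriminator: a sigmoid with slope k; larger k = steeper."""
    return lambda x: 1.0 / (1.0 + np.exp(-k * x))

weak = zero_centered_penalty(make_sigmoid(1.0), 0.0)    # gentle slope
strong = zero_centered_penalty(make_sigmoid(10.0), 0.0)  # steep slope
```

Since a sigmoid with slope $k$ has derivative $k/4$ at the boundary, the penalty there scales as $(k/4)^2$: the steep discriminator pays roughly 100x more than the gentle one, which is exactly the weakening effect described above.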
