
variational autoencoders

1. motivation

1.1. machine learning perspective

  • We want our latent space to be continuous and complete
    • continuous: nearby points in the latent space decode to similar outputs, i.e. similar representations correspond to similar entities
    • complete: every point in the latent space can be decoded to a reasonable output
  • our loss is composed of a reconstruction term and a regularization term
    • the reconstruction term encourages the output to be similar to the input
    • the regularization term encourages the latent representations to be continuous and complete
      • namely, the distribution of latent variables produced by the encoder should be close to zero mean and unit variance (a minimal sketch of the resulting loss follows after this list)
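
A minimal sketch of this two-term loss in PyTorch, assuming a Gaussian encoder \(q_{\phi}(z\mid x)\) with diagonal covariance, a Bernoulli decoder \(p_{\theta}(x\mid z)\) over inputs scaled to \([0, 1]\), and hypothetical encoder / decoder modules whose architecture doesn't matter here; the KL term is the closed-form divergence between \(\mathcal{N}(\mu, \sigma^{2})\) and \(\mathcal{N}(0, I)\):

  import torch
  import torch.nn.functional as F

  def vae_loss(x, encoder, decoder):
      # encoder maps x to the mean and log-variance of q_phi(z | x)
      mu, logvar = encoder(x)

      # reparameterization: z = mu + sigma * eps with eps ~ N(0, I),
      # so the sampling step stays differentiable w.r.t. phi
      eps = torch.randn_like(mu)
      z = mu + torch.exp(0.5 * logvar) * eps

      # reconstruction term: encourages decoder(z) to look like x
      # (binary cross-entropy = negative Bernoulli log-likelihood)
      x_hat = decoder(z)
      recon = F.binary_cross_entropy(x_hat, x, reduction="sum")

      # regularization term: KL(q_phi(z|x) || N(0, I)) in closed form,
      # pushing the latent distribution toward zero mean / unit variance
      kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())

      return recon + kl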

1.2. graphical models perspective

TODO

2. Loss

  • Jointly optimize \(\theta\) to make the reconstruction likelihood high (i.e. the reconstruction loss low)…
    • As well as tune \(\phi\) so that our approximation \(q_{\phi}(z\mid x)\) is as close as possible to the posterior \(p_{\theta}(z\mid x)\) – as measured by KL divergence
    • see Wikipedia for the full derivation; a compressed sketch is included after this list
  • But we eventually end up with this equation:
  • \(-\log p_{\theta}(x) + D_{KL}(q_{\phi}(z\mid x) \| p_{\theta}(z\mid x)) = -E_{z\sim q_{\phi}(z\mid x)}[\log p_{\theta}(x \mid z)] + D_{KL}(q_{\phi}(z\mid x) \| p_{\theta}(z))\)
  • The LHS is the quantity that we will use as our loss
  • The RHS is the way we will end up computing/back-propagating this loss
  • On the LHS we see two terms
    • One is the negative log likelihood of the data under the model, \(-\log p_{\theta}(x)\) – the quantity we actually want to drive down
    • The other is the distance between our approximation to the posterior and the true posterior – as parameterized by \(\theta\) at least
  • On the RHS we see two terms
    • One is an expectation that can be estimated by sampling at runtime and back-propagated through via the reparameterization trick (the Gumbel-max / Gumbel-softmax trick plays the analogous role for discrete latents)
    • Where does the other come from?
  • Question: I can see that we will eventually pull \(q_{\phi}(z\mid x)\) toward \(p_{\theta}(z\mid x)\), but what about this loss encourages \(p_{\theta}(z)\) and \(p_{\theta}(z\mid x)\) to be anything sensible? I guess I just have to trust the starting equation and believe that \(p_{\theta}(x)\) really is being optimized – the derivation sketch below makes this a little more concrete, since the RHS upper-bounds \(-\log p_{\theta}(x)\)
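
A compressed version of the derivation referenced above (the full one is on Wikipedia): expand the KL between the approximate and true posterior with Bayes' rule, \(p_{\theta}(z\mid x) = p_{\theta}(x\mid z)\,p_{\theta}(z)/p_{\theta}(x)\), and the identity falls out. Since the KL term on the left is non-negative, the RHS is an upper bound on \(-\log p_{\theta}(x)\) (the negative ELBO), so minimizing it simultaneously raises the data likelihood and pulls \(q_{\phi}(z\mid x)\) toward the true posterior.

\begin{align*}
D_{KL}(q_{\phi}(z\mid x) \| p_{\theta}(z\mid x))
  &= E_{z\sim q_{\phi}(z\mid x)}[\log q_{\phi}(z\mid x) - \log p_{\theta}(z\mid x)] \\
  &= E_{z\sim q_{\phi}(z\mid x)}[\log q_{\phi}(z\mid x) - \log p_{\theta}(x\mid z) - \log p_{\theta}(z)] + \log p_{\theta}(x) \\
\Rightarrow\; -\log p_{\theta}(x) + D_{KL}(q_{\phi}(z\mid x) \| p_{\theta}(z\mid x))
  &= -E_{z\sim q_{\phi}(z\mid x)}[\log p_{\theta}(x\mid z)] + D_{KL}(q_{\phi}(z\mid x) \| p_{\theta}(z))
\end{align*}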

3. useful links

4. sources

5. see also

Created: 2024-07-15 Mon 01:28