
log-derivative trick

1. the trick

\[ \nabla_{\theta} \log p_{\theta}(x) = \frac{\nabla_{\theta}p_{\theta}(x)}{p_{\theta}(x)} \]

  • this is just the chain rule applied to \(\log\): since \(\frac{d}{du}\log u = \frac{1}{u}\), differentiating \(\log p_{\theta}(x)\) with respect to \(\theta\) gives \(\frac{1}{p_{\theta}(x)}\nabla_{\theta}p_{\theta}(x)\)
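A quick numerical sanity check of the identity. The distribution (a unit-variance Gaussian with mean \(\theta\)) and the finite-difference comparison are illustrative choices, not from the note:

```python
import math

# p_theta(x) = N(x; theta, 1), a unit-variance Gaussian with mean theta
def p(theta, x):
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2 * math.pi)

def grad_log_p(theta, x):
    # closed form: d/dtheta log p_theta(x) = (x - theta)
    return x - theta

theta, x, eps = 0.3, 1.7, 1e-6
# finite-difference estimate of d/dtheta p_theta(x)
grad_p = (p(theta + eps, x) - p(theta - eps, x)) / (2 * eps)
# the identity says grad_p / p should match grad_log_p
print(abs(grad_p / p(theta, x) - grad_log_p(theta, x)))
```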

2. application

Often, we want to take the gradient of an expectation with respect to \(\theta\): \[\begin{align*} \nabla_{\theta} E_{x\sim p_{\theta}}[f(x)] &= \nabla_{\theta} \int f(x) p_{\theta}(x) dx\\ &= \int \nabla_{\theta}(p_{\theta}(x)) f(x) dx \end{align*}\]

  • The second line differentiates under the integral sign (see the Leibniz integral rule)
  • Why not just take the expectation of the gradient? The second line is close to an expectation, but it isn't one: the integrand is weighted by \(\nabla_{\theta}p_{\theta}(x)\), not by the density \(p_{\theta}(x)\). We can turn it into a proper expectation by multiplying and dividing by \(p_{\theta}(x)\):

\[\begin{align*} \int \nabla_{\theta}(p_{\theta}(x)) f(x) dx &= \int \nabla_{\theta}(p_{\theta}(x)) \frac{p_{\theta}(x)}{p_{\theta}(x)}f(x) dx \\ &= \int \nabla_{\theta}(\log p_{\theta}(x))p_{\theta}(x) f(x) dx \\ &= E_{x\sim p_{\theta}}[f(x)\nabla_{\theta}(\log p_{\theta}(x))] \end{align*}\]

  • the step from the first line to the second applies the log-derivative trick, in the form \(\nabla_{\theta}p_{\theta}(x) = p_{\theta}(x)\nabla_{\theta}\log p_{\theta}(x)\); the last line is an expectation, so it can be estimated by sampling \(x_i \sim p_{\theta}\) and averaging \(f(x_i)\nabla_{\theta}\log p_{\theta}(x_i)\)
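The final expression suggests a Monte Carlo estimator: sample from \(p_{\theta}\) and average \(f(x)\nabla_{\theta}\log p_{\theta}(x)\). A minimal sketch, assuming \(p_{\theta} = \mathcal{N}(\theta, 1)\) and \(f(x) = x^2\) (so \(E[f(x)] = \theta^2 + 1\) and the true gradient is \(2\theta\)); the function name and sample count are my choices:

```python
import random

def score_fn_grad(theta, n=200_000, seed=0):
    # Monte Carlo estimate of grad_theta E[f(x)] using the log-derivative trick:
    # average f(x) * grad_theta log p_theta(x) over samples x ~ p_theta.
    # Here p_theta = N(theta, 1), so grad_theta log p_theta(x) = (x - theta),
    # and f(x) = x**2.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(theta, 1.0)          # x ~ p_theta
        total += (x ** 2) * (x - theta)    # f(x) * score
    return total / n

theta = 0.5
print(score_fn_grad(theta))  # should be close to the true gradient 2*theta = 1.0
```

Note that the estimator only needs samples from \(p_{\theta}\) and the score \(\nabla_{\theta}\log p_{\theta}\); it never differentiates through \(f\), which is why this construction works even when \(f\) is non-differentiable.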

3. places where it shows up

  • policy gradients in reinforcement learning (the REINFORCE estimator)
  • black-box (score-function) gradient estimators in variational inference

Created: 2025-11-02 Sun 18:48