Jensen's Inequality

For a convex function \(f\), on the interval \([x_1, x_2]\), the secant line from \(f(x_1)\) to \(f(x_2)\) is always above \(f(x)\). So, for a weight \(t \in [0,1]\): \[ tf(x_1) + (1-t)f(x_2) \geq f(tx_1 + (1-t)x_2) \] This is the definition of convexity.

Stated more generally: for \(\alpha_1,...,\alpha_n\) non-negative real numbers such that \(\sum_{\alpha_i} = 1\), we have: \[ \sum_i\alpha_i f(x_i) \geq f\left(\sum_i\alpha_i x_i\right) \]

0.1. Proof

We will use induction. From the definition of convexity, we have \(n=2\). Now, assume that our hypothesis holds for \(n\). For \(n+1\): \[\begin{align*} f \left( \sum_{i}^{n+1} \alpha_i x_i \right) &= f \left( \alpha_1 x_1 + (1 - \alpha_1)\sum_{i=2}^{n+1} \frac{1}{1-\alpha_1} \alpha_i x_i \right)\\ & \leq \alpha_1 f(x_1) + (1-\alpha_1) f\left(\sum_{i=2}^{n+1} \frac{1}{1-\alpha_1} \alpha_i x_i \right)\\ & \leq \alpha_1 f(x_1) + \sum_{i=2}^{n+1} \alpha_i f\left(x_i \right) \end{align*}\] The second line comes from the definition of convexity. The last line comes from our inductive hypothesis and the fact that \(\sum_{i=2}^{n+1} \frac{\alpha_i}{1-\alpha_1} = 1\).

In the context of probability, you often see the following for convex \(p\): \[ \mathbb{E}[p(X)] \geq p(\mathbb{E}[X]) \] which follows from the above proof.

1. Concave functions

For concave functions, the inequality is reversed