
hypothesis testing

1. terminology

  • \(p_{\mathsf{H}}(H_m)\) is the a priori probability of hypothesis \(H_m\)
  • \(p_{\mathsf{y}\mid\mathsf{H}}(\cdot \mid H_m)\) is the conditional probability of the observed data under \(H_m\)
    • Can think of as "if \(H_m\) is true, then this is what the distribution would look like"
    • Or can think of as "\(\mathsf{H}\) and \(\mathbf{y}\) are random variables that label each outcome \(\omega\) in the probability space. \(p(\mathbf{y} = y \mid \mathsf{H} = H_m)\) is the probability of all the \(\omega\) that satisfy \(\mathbf{y}(\omega) = y\) and \(\mathsf{H}(\omega) = H_m\), normalized by all the probability mass that \(\mathsf{H} = H_m\) takes up."
  • \(p_{\mathsf{H} \mid \mathsf{y}} (H_m \mid \mathbf{y})\) is the a posteriori probability of a hypothesis given an observation
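
These three are tied together by Bayes' rule: the a posteriori probability is the likelihood times the prior, normalized over all hypotheses,

\[p_{\mathsf{H}\mid\mathsf{y}}(H_m \mid \mathbf{y}) = \frac{p_{\mathsf{y}\mid\mathsf{H}}(\mathbf{y} \mid H_m)\, p_{\mathsf{H}}(H_m)}{\sum_{m'} p_{\mathsf{y}\mid\mathsf{H}}(\mathbf{y} \mid H_{m'})\, p_{\mathsf{H}}(H_{m'})}\]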

2. statistical formulation

  • from 18.650 notes (here)
  • Consider \(X_1,...,X_n\) i.i.d. random variables
  • Consider statistical model \((\Omega, (P_\theta)_{\theta\in\Theta})\)
    • \(\theta\) are parameters
  • Then \(\Theta_0\) and \(\Theta_1\) are disjoint subsets of \(\Theta\)
  • We have the corresponding hypotheses:

\[\begin{cases} H_0 : \theta \in \Theta_0\\ H_1 : \theta \in \Theta_1 \end{cases} \]

\(H_0\) is the null hypothesis and \(H_1\) is the alternative hypothesis. If we believe that the true \(\theta\) is in one of the two sets, we test \(H_0\) against \(H_1\).

We work to reject \(H_0\) in favor of \(H_1\).

A test is a statistic \(\psi \in \{0,1\}\) such that

  • if \(\psi=0\), \(H_0\) is not rejected
  • if \(\psi=1\), \(H_0\) is rejected

Why the asymmetry? \(\psi=0\) can't be taken as evidence that \(H_0\) is probable, but \(\psi=1\) can be taken as evidence that \(H_0\) is improbable.

  • Coin example
    • \(H_0: p=1/2\) and \(H_1: p\neq 1/2\)
    • \(\psi = \mathbf{1}\{|\sqrt{n} \frac{\bar{X}_n - 0.5}{\sqrt{0.5(1-0.5)}} | > C\}\) for some threshold \(C\) (a numerical sketch of this test appears after this list)
  • the rejection region of a test \(\psi\) is the set of samples which result in rejecting the null hypothesis:

\(R_{\psi} = \{x\in \Omega^n : \psi(x) = 1 \}\)

  • the type 1 error is a mapping from the parameters \(\theta\) in the null hypothesis to the probability of rejecting \(H_0\) when \(H_0\) is actually true:
    • \(\alpha_{\psi}: \Theta_0 \rightarrow \mathbb{R}\)
    • \(\theta \mapsto P_{\theta}[\psi=1]\), where \(P_{\theta}\) is taken over all the samples in \(\Omega^n\), with probabilities given under the assumption that the parameter is \(\theta\)
  • the type 2 error maps each \(\theta\) in the alternative hypothesis to the probability of not rejecting \(H_0\) when \(H_1\) is actually true:
    • \(\beta_{\psi}: \Theta_1 \rightarrow \mathbb{R}\)
    • \(\theta \mapsto P_{\theta}[\psi=0]\)
  • the power of a test \(\psi\):
    • \(\pi_{\psi} = \inf_{\theta \in \Theta_1}(1-\beta_{\psi}(\theta))\)
    • for each \(\theta\) in the alternative hypothesis, \(1-\beta_{\psi}(\theta)\) is the probability that we don't erroneously "accept" the null hypothesis. Ideally this probability is very high everywhere on \(\Theta_1\), so that false negatives are rare. Taking the smallest value over all such \(\theta\) gives the worst-case, i.e. guaranteed, "power" of the test.
  • A test has level \(\alpha\) if \(\alpha_\psi(\theta) \leq \alpha\), \(\forall \theta \in \Theta_0\). Analogously to the above, this is an upper bound on the probability of getting a false positive.
  • A test has asymptotic level \(\alpha\) if \(\lim_{n\rightarrow \infty}\alpha_\psi(\theta) \leq \alpha\), \(\forall \theta \in \Theta_0\).
  • In general, a test \(\psi\) has the form \(\psi = \mathbb{1}\{T_n > c\}\) for some sample test statistic \(T_n\) and some threshold \(c\).
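
A rough numerical sketch of the coin example above (the sample size \(n=200\), the alternative bias \(p=0.6\), and the level \(\alpha=0.05\) are made-up values for illustration; assumes numpy and scipy are available): choose the threshold \(C\) as a Gaussian quantile so the test has asymptotic level \(\alpha\), then estimate the type 1 error and the power by Monte Carlo.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

n = 200        # sample size (made-up for illustration)
alpha = 0.05   # desired asymptotic level

# Under H0 : p = 1/2, T_n is approximately N(0, 1) by the CLT, so choosing
# C = q_{1 - alpha/2} makes P_{1/2}[|T_n| > C] approach alpha as n grows.
C = norm.ppf(1 - alpha / 2)

def psi(x, C=C):
    """The test psi = 1{ |sqrt(n) (Xbar_n - 0.5) / sqrt(0.5 * 0.5)| > C }."""
    n = len(x)
    T_n = np.sqrt(n) * (x.mean() - 0.5) / np.sqrt(0.5 * (1 - 0.5))
    return int(abs(T_n) > C)

def rejection_rate(p, n=n, trials=10_000):
    """Monte Carlo estimate of P_theta[psi = 1] when the coin's true bias is p."""
    flips = rng.binomial(1, p, size=(trials, n))
    return np.mean([psi(x) for x in flips])

print("type 1 error at p = 0.5:", rejection_rate(0.5))  # should be close to alpha
print("power        at p = 0.6:", rejection_rate(0.6))  # estimates 1 - beta_psi(0.6)
```

Evaluating rejection_rate at a \(\theta \in \Theta_0\) approximates \(\alpha_\psi(\theta)\); at a \(\theta \in \Theta_1\) it approximates \(1-\beta_\psi(\theta)\), whose infimum over \(\Theta_1\) is the power \(\pi_\psi\).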

3. decision rule

The solution to a hypothesis testing problem is a decision rule.
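
A common example (under the Bayesian setup from section 1, assuming we just want to minimize the probability of error) is the MAP rule: pick the hypothesis with the largest a posteriori probability,

\[\hat{H}(\mathbf{y}) = \arg\max_{H_m} \; p_{\mathsf{H}\mid\mathsf{y}}(H_m \mid \mathbf{y}) = \arg\max_{H_m} \; p_{\mathsf{y}\mid\mathsf{H}}(\mathbf{y} \mid H_m)\, p_{\mathsf{H}}(H_m)\]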

See Bayesian Inference note for more discussion.

4. see also

5. sources

Created: 2024-07-15 Mon 01:28