# hypothesis testing

## 1. terminology

- \(p_{\mathsf{H}}(H_m)\) is the *a priori* probability of hypothesis \(H_m\)
- \(p_{\mathsf{y}\mid\mathsf{H}}(\cdot \mid H_m)\) is the conditional probability of the observed data under \(H_m\)
  - Can think of it as "if \(H_m\) is true, then this is what the distribution of the data would look like"
  - Or can think of it as: "\(\mathsf{H}\) and \(\mathbf{y}\) are random variables that label each outcome \(\omega\) in the probability space. \(p(\mathbf{y} = y \mid \mathsf{H} = H_m)\) is the probability of all the \(\omega\) that satisfy \(\mathbf{y}(\omega) = y\) and \(\mathsf{H}(\omega) = H_m\), normalized by all the probability mass that the event \(\mathsf{H} = H_m\) takes up."

- \(p_{\mathsf{H} \mid \mathsf{y}} (H_m \mid \mathbf{y})\) is the *a posteriori* probability of a hypothesis given an observation
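
The three quantities above are tied together by Bayes' rule: the posterior is the prior times the likelihood, normalized over all hypotheses. A minimal sketch, assuming a finite list of hypotheses; the function name `posterior` and the numbers are illustrative, not from the notes:

```python
def posterior(prior, likelihood):
    """Return p(H_m | y) for each m via Bayes' rule.

    prior[m]      = p_H(H_m)
    likelihood[m] = p_{y|H}(y | H_m), evaluated at the observed y
    """
    joint = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(joint)  # p_y(y), the normalizing constant
    return [j / evidence for j in joint]


# Hypothetical example: two hypotheses about a coin, observed y = "heads".
prior = [0.5, 0.5]       # H_0: fair coin, H_1: biased coin (assumed priors)
likelihood = [0.5, 0.8]  # P(heads | H_0), P(heads | H_1) (assumed values)
post = posterior(prior, likelihood)
```

The returned list sums to 1, and the hypothesis under which the observation was more likely gains probability mass relative to its prior.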

## 2. statistical formulation

- from 18.650 notes
- Consider \(X_1, \ldots, X_n\) i.i.d. random variables
- Consider a statistical model \((\Omega, (P_\theta)_{\theta\in\Theta})\)
  - \(\theta\) are the parameters

- Let \(\Theta_0\) and \(\Theta_1\) be disjoint subsets of \(\Theta\)
- We have the corresponding hypotheses:

\[\begin{cases} H_0 : \theta \in \Theta_0\\ H_1 : \theta \in \Theta_1 \end{cases} \]

\(H_0\) is the null hypothesis and \(H_1\) is the alternative hypothesis. If we believe that the true \(\theta\) is in one of the two sets, we test \(H_0\) against \(H_1\).

We work to reject \(H_0\) in favor of \(H_1\).

A *test* is a statistic \(\psi \in \{0,1\}\) such that

- if \(\psi=0\), \(H_0\) is not rejected
- if \(\psi=1\), \(H_0\) is rejected

Why the asymmetry? \(\psi=0\) can't be taken as evidence that \(H_0\) is probable, but \(\psi=1\) *can* be taken as evidence that \(H_0\) is improbable.

- Coin example
  - \(H_0: p=1/2\) and \(H_1: p\neq 1/2\)
  - \(\psi = \mathbf{1}\{|\sqrt{n} \frac{\bar{X}_n - 0.5}{\sqrt{0.5(1-0.5)}} | > C\}\) for some threshold \(C\)
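
The coin test can be sketched directly from the formula for \(\psi\). The function name `coin_test` and the default threshold `C = 1.96` (the 97.5% standard normal quantile) are assumptions made for illustration; the notes leave \(C\) unspecified.

```python
import math


def coin_test(flips, C=1.96):
    """Return (T_n, psi) for H_0: p = 1/2 against H_1: p != 1/2.

    flips is a sequence of 0/1 outcomes. C = 1.96 is an assumed threshold,
    not a value taken from the notes.
    """
    n = len(flips)
    x_bar = sum(flips) / n
    # Standardized deviation of the sample mean from 1/2.
    t = math.sqrt(n) * (x_bar - 0.5) / math.sqrt(0.5 * (1 - 0.5))
    psi = 1 if abs(t) > C else 0
    return t, psi


t_fair, psi_fair = coin_test([1, 0] * 50)          # 50 heads in 100 flips
t_skew, psi_skew = coin_test([1] * 65 + [0] * 35)  # 65 heads in 100 flips
```

With 50 heads out of 100 the statistic is 0 and \(H_0\) is not rejected; with 65 heads out of 100 the statistic is about 3, which exceeds the assumed threshold, so \(H_0\) is rejected.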

- the *rejection region* of a test \(\psi\) is the set of samples that result in rejecting the null hypothesis:

\(R_{\psi} = \{x\in \Omega^n : \psi(x) = 1 \}\)

- the *type 1* error is a mapping from each \(\theta\) in the null hypothesis to the probability of rejecting \(H_0\) when \(H_0\) is actually true
  - \(\alpha_{\psi}: \Theta_0 \rightarrow \mathbb{R}\)
  - \(\theta \mapsto P_{\theta}[\psi=1]\), where \(P_{\theta}\) is taken over all the samples in \(\Omega^n\), with probabilities given under the assumption that the parameter is \(\theta\)

- the *type 2* error is the probability of not rejecting \(H_0\) when \(H_1\) is actually true
  - \(\beta_{\psi}: \Theta_1 \rightarrow \mathbb{R}\)
  - \(\theta \mapsto P_{\theta}[\psi=0]\)
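
For the coin example both error maps can be evaluated exactly, because the number of heads is Binomial\((n, \theta)\) and \(\psi\) depends on the sample only through that count. A minimal sketch, assuming the threshold `C = 1.96` and the helper names `reject` and `prob_reject`, none of which come from the notes:

```python
import math


def reject(k, n, C=1.96):
    """psi for k heads in n flips (coin test with an assumed threshold C)."""
    t = math.sqrt(n) * (k / n - 0.5) / 0.5
    return abs(t) > C


def prob_reject(theta, n, C=1.96):
    """P_theta[psi = 1]: Binomial(n, theta) mass of the rejection region."""
    return sum(
        math.comb(n, k) * theta**k * (1 - theta) ** (n - k)
        for k in range(n + 1)
        if reject(k, n, C)
    )


n = 100
alpha_at_half = prob_reject(0.5, n)     # type 1 error at theta = 1/2
beta_at_065 = 1 - prob_reject(0.65, n)  # type 2 error at theta = 0.65
```

Here \(\Theta_0 = \{1/2\}\), so \(\alpha_\psi\) is a single number, while \(\beta_\psi\) has to be evaluated separately at each alternative \(\theta\).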

- the *power* of a test \(\psi\):
  - \(\pi_{\psi} = \inf_{\theta \in \Theta_1}(1-\beta_{\psi}(\theta))\)
  - For each \(\theta\) in the alternative hypothesis, \(1-\beta_{\psi}(\theta)\) is the probability that we don't erroneously "accept" the null hypothesis. Ideally this probability is very high, so that false negatives are rare. The power is the smallest such value over all \(\theta \in \Theta_1\): the detection probability the test guarantees no matter which alternative is true.
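
The infimum in the definition of power can be approximated by a minimum over a grid of alternatives. In the coin example with \(\Theta_1 = \{p \neq 1/2\}\), the rejection probability is continuous in \(\theta\), so the infimum can be no larger than the type 1 error at \(1/2\). The sketch below therefore uses a hypothetical restricted alternative, \(|\theta - 1/2| \geq 0.15\), along with an assumed threshold `C = 1.96`; neither choice is from the notes.

```python
import math


def prob_reject(theta, n, C=1.96):
    """P_theta[psi = 1] for the coin test, summed over Binomial(n, theta)."""
    total = 0.0
    for k in range(n + 1):
        t = math.sqrt(n) * (k / n - 0.5) / 0.5
        if abs(t) > C:
            total += math.comb(n, k) * theta**k * (1 - theta) ** (n - k)
    return total


n = 100
# Grid over the assumed alternative set: theta = i/100 with |i - 50| >= 15.
grid = [i / 100 for i in range(1, 100) if abs(i - 50) >= 15]

# 1 - beta_psi(theta) at each alternative, then the worst case over the grid.
power_by_theta = {theta: prob_reject(theta, n) for theta in grid}
power = min(power_by_theta.values())
worst_theta = min(power_by_theta, key=power_by_theta.get)
```

The worst case sits at the alternatives closest to \(1/2\), which is the sense in which power is a guarantee rather than an average.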

- A test has *level* \(\alpha\) if \(\alpha_\psi(\theta) \leq \alpha\), \(\forall \theta \in \Theta_0\). Analogously to the above, this is an upper bound on the probability of getting a false positive.
- A test has *asymptotic level* \(\alpha\) if \(\lim_{n\rightarrow \infty}\alpha_\psi(\theta) \leq \alpha\), \(\forall \theta \in \Theta_0\).
- In general, a test \(\psi\) has the form \(\psi = \mathbf{1}\{T_n > c\}\) for some test statistic \(T_n\) and some threshold \(c\).
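
The difference between level and asymptotic level can be checked numerically for the coin example, where \(T_n\) is the absolute standardized sample mean and \(\Theta_0 = \{1/2\}\). This is a sketch under assumptions: the helper names `type1_error` and `has_level` and the thresholds tried are illustrative, not from the notes.

```python
import math


def type1_error(n, C):
    """Exact P_{1/2}[psi = 1] for the coin test psi = 1{T_n > C}."""
    # Under theta = 1/2 every sequence of n flips has probability 2**-n,
    # so count the sequences that fall in the rejection region.
    rejecting = sum(
        math.comb(n, k)
        for k in range(n + 1)
        if abs(math.sqrt(n) * (k / n - 0.5) / 0.5) > C
    )
    return rejecting / 2**n


def has_level(alpha, n, C):
    """True if the test has level alpha at sample size n (Theta_0 = {1/2})."""
    return type1_error(n, C) <= alpha


errors = {n: type1_error(n, 1.96) for n in (20, 100, 500)}
conservative = has_level(0.05, 100, 3.0)  # larger threshold, fewer rejections
nominal = has_level(0.05, 100, 1.96)      # normal-quantile threshold
```

In this sketch the exact type 1 error at \(n = 100\) with \(C = 1.96\) lands slightly above 0.05, so that threshold gives asymptotic level 5% (by the central limit theorem) without giving level 5% at that sample size. Raising the threshold restores the level, at the cost of a larger type 2 error.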

## 3. decision rule

The solution to a hypothesis testing problem is a decision rule.

See Bayesian Inference note for more discussion.