
random variable

From the 6.436 lecture notes here.

1. definition: random variables

Let \((\Omega, \mathcal{F})\) be a measurable space.

A function \(X: \Omega \rightarrow \mathbb{R}\) is a random variable if the set \(\{\omega \mid X(\omega) \leq c\}\) is \(\mathcal{F}\) -measurable (see measurable set) for every \(c \in \mathbb{R}\).

I think of \(X\) as a labelling function with a very specific property that describes how labels are distributed.
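
On a finite sample space, measurability can be checked by brute force. A minimal sketch of the definition — the sample space, \(\sigma\) -algebra, and the two labelling functions below are made up for illustration:

```python
# Finite measurable space: Omega = {1,2,3,4}, F generated by the atoms {1,2} and {3,4}.
omega = {1, 2, 3, 4}
F = [frozenset(), frozenset({1, 2}), frozenset({3, 4}), frozenset(omega)]

def is_random_variable(X, sigma_algebra, sample_space):
    """On a finite space it suffices to check the thresholds c that X actually
    takes: the set {w : X(w) <= c} must lie in the sigma-algebra for each."""
    for c in set(X.values()):
        preimage = frozenset(w for w in sample_space if X[w] <= c)
        if preimage not in sigma_algebra:
            return False
    return True

X_good = {1: 0.0, 2: 0.0, 3: 1.0, 4: 1.0}  # constant on each atom -> measurable
X_bad = {1: 0.0, 2: 1.0, 3: 1.0, 4: 1.0}   # splits the atom {1,2} -> not measurable

print(is_random_variable(X_good, F, omega))  # True
print(is_random_variable(X_bad, F, omega))   # False
```

The second function fails because \(\{X \leq 0\} = \{1\}\), which splits an atom of \(\mathcal{F}\): the labelling is "finer" than the \(\sigma\) -algebra can resolve.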

2. probability law of random variable

First some notation:

  • for a r.v. \(X\), the set \(\{\omega \mid X(\omega) \leq c\}\) is often written \(\{X \leq c\}\)
  • for a general subset \(B \subset \mathbb{R}\), the set \(\{\omega \mid X(\omega) \in B\}\) is often written \(X^{-1}(B)\) or \(\{X \in B\}\)

The definition only mentions sets of the form \(\{\omega \mid X(\omega) \leq c\}\), but in fact \(\{X \in B\}\) is \(\mathcal{F}\) -measurable for every Borel measurable set \(B\in \mathcal{B}\) (see Lebesgue Measure note). Why? Given a random variable \(X\), we know that every set \(\{X \leq c\}\) is \(\mathcal{F}\) -measurable. And the intervals \((-\infty, c]\) are all it takes to generate the Borel \(\sigma\) -field (again, see Lebesgue Measure note). The (short) proof would generate the Borel \(\sigma\) -algebra from these intervals, showing along the way that each generated set has an \(\mathcal{F}\) -measurable preimage.
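
One step of this generation argument, written out as a sketch: for any \(a < b\),

\[ \{a < X \leq b\} = \{X \leq b\} \setminus \{X \leq a\}, \]

which is \(\mathcal{F}\) -measurable because \(\sigma\) -algebras are closed under complements and intersections. More generally, the collection of sets \(B \subset \mathbb{R}\) for which \(\{X \in B\}\) is \(\mathcal{F}\) -measurable is itself a \(\sigma\) -algebra, and it contains every interval \((-\infty, c]\), so it contains all of \(\mathcal{B}\).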

3. definition

Let \((\Omega, \mathcal{F}, \mathbb{P})\) be a probability space. Let \(X : \Omega \rightarrow \mathbb{R}\) be a random variable. Then,

  1. Define for every Borel measurable set \(B\in \mathcal{B}\), \(\mathbb{P}_X(B) = \mathbb{P}(X \in B)\)
  2. The resulting \(\mathbb{P}_X : \mathcal{B} \rightarrow [0,1]\) is called the probability law of \(X\)
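
A small numeric sketch of this pushforward, on a made-up example (a fair die with \(X\) the indicator of an odd roll); the law \(\mathbb{P}_X\) of a set \(B\) is computed by pulling \(B\) back through \(X\):

```python
from fractions import Fraction

# Hypothetical example: fair die, Omega = {1,...,6} with uniform P,
# X(w) = w mod 2 (indicator of an odd roll).
omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}
X = {w: w % 2 for w in omega}

def law(B):
    """P_X(B) = P(X in B) = P({w : X(w) in B})."""
    return sum(P[w] for w in omega if X[w] in B)

print(law({1}))      # P(X = 1) = 1/2
print(law({0, 1}))   # P_X of the whole range = 1
```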

3.1. proposition

Let \((\Omega, \mathcal{F}, \mathbb{P})\) be a probability space. Then, the probability law \(\mathbb{P}_X\) of \(X\) is a probability measure on \((\mathbb{R}, \mathcal{B})\) (remember \(\mathcal{B}\) is the Borel \(\sigma\) -algebra).

4. definition: measurable functions

More generally, let \((\Omega_1, \mathcal{F}_1)\) and \((\Omega_2, \mathcal{F}_2)\) be two measurable spaces. Then, a function \(f : \Omega_1 \rightarrow \Omega_2\) is said to be \((\mathcal{F}_1, \mathcal{F}_2)\) measurable if \(f^{-1}(B) \in \mathcal{F}_1\) for every \(B \in \mathcal{F}_2\)

So, a random variable is a function that is \((\mathcal{F}, \mathcal{B})\) -measurable. A question I had about the above: does this mean that \(f\) needs to be invertible? No: \(f^{-1}(B)\) here denotes the preimage \(\{\omega \in \Omega_1 \mid f(\omega) \in B\}\), which is defined for any function, invertible or not.

5. theorem: assorted facts about random variables

Let \((\Omega, \mathcal{F})\) be a measurable space.

  1. (simple random variables) Let \(A \in \mathcal{F}\) and define the indicator function \(I_A(\omega) = 1\) when \(\omega \in A\) and \(I_A(\omega) = 0\) when \(\omega \not\in A\). Then \(I_A\) is measurable. More precisely, it is \((\mathcal{F}, \mathcal{B})\) -measurable
  2. Let \(A_1, ..., A_n\) be \(\mathcal{F}\) -measurable and let \(x_1,...,x_n\) be real numbers. Then, define the function \(X = \sum_i x_i I_{A_i}\). That is, \[ X(\omega) = \sum_{i} x_i I_{A_i}(\omega) \] for every \(\omega \in \Omega\). Then \(X\) is a random variable. (Called a simple random variable)
  3. Suppose \((\Omega, \mathcal{F}) = (\mathbb{R}, \mathcal{B})\). Let \(X : \mathbb{R} \rightarrow \mathbb{R}\) be a continuous function. Then \(X\) is a random variable.
  4. (functions of a random variable) Let \(X\) be a random variable and let \(f : \mathbb{R} \rightarrow \mathbb{R}\) be a continuous function (or, more generally, a \((\mathcal{B}, \mathcal{B})\) -measurable function). Then \(f(X)\) is a random variable.
  5. Let \(X_1,...,X_n\) be random variables. Let \(f: \mathbb{R}^n \rightarrow \mathbb{R}\) be a continuous function. Then \(f(X_1,..., X_n)\) is a random variable. In particular \(X_1 + X_2\) and \(X_1X_2\) are random variables.
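
Facts 1 and 2 can be sketched directly in code; the sets \(A_i\) and coefficients \(x_i\) below are made up for illustration:

```python
# Sketch of fact 2: a simple random variable X = sum_i x_i * I_{A_i}
# on a hypothetical finite sample space.
omega = range(10)
A = [set(range(0, 5)), set(range(3, 10))]  # A_1, A_2 (F-measurable sets)
x = [1.0, 2.0]                             # real coefficients x_1, x_2

def indicator(A_i):
    """Fact 1: the indicator function of a measurable set A_i."""
    return lambda w: 1 if w in A_i else 0

def X(w):
    return sum(x_i * indicator(A_i)(w) for x_i, A_i in zip(x, A))

print([X(w) for w in omega])
# w in A_1 only -> 1.0, w in both -> 3.0, w in A_2 only -> 2.0
```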

5.1. stubs of proofs

  1. \(\{I_A \leq c\} = \emptyset\) for \(c < 0\), \(\{I_A \leq c\} = A^C\) for \(0 \leq c < 1\), and \(\{I_A \leq c\} = \Omega\) for \(c \geq 1\). All three of these sets are \(\mathcal{F}\) -measurable.
  2. Consider the set \(\{\omega \mid X(\omega) \leq c\}\). We want to check that this set is \(\mathcal{F}\) -measurable. Consider every possible combination of indicator values \((I_{A_1}(\omega), ..., I_{A_n}(\omega))\) such that \(\sum_i x_i I_{A_i}(\omega) \leq c\). Each combination corresponds to an intersection of \(\mathcal{F}\) -measurable sets of the form \(\{I_{A_i} = 1\}\) or \(\{I_{A_i} = 0\}\). There are finitely many such combinations, so \(\{X \leq c\}\) is a finite union of \(\mathcal{F}\) -measurable sets.
  3. From the definition of a measurable function, the preimage under \(f\) of every Borel measurable set is Borel measurable. The set of values \(u\) such that \(u\leq c\) is the Borel measurable set \(B = (-\infty, c]\), so \(f^{-1}(B)\) is a Borel measurable set. And \(X^{-1}(f^{-1}(B))\) is an \(\mathcal{F}\) -measurable set because \(X\) is a random variable. So \(f(X)\) is a random variable on \((\Omega, \mathcal{F})\).

6. Theorem: inf, sup, and limits of random variables

Let \(f_n : \Omega \rightarrow \mathbb{R}\) be a function for every \(n\). Then consider some new functions that we can define:

  • \(f(\omega) = \inf_{n} f_n(\omega)\) for all \(\omega \in \Omega\)
  • \(f(\omega) = \sup_{n} f_n(\omega)\) for all \(\omega \in \Omega\)
  • \(f(\omega) = \liminf_{n\rightarrow \infty} f_n(\omega)\) for all \(\omega \in \Omega\)
  • \(f(\omega) = \limsup_{n\rightarrow \infty} f_n(\omega)\) for all \(\omega \in \Omega\).

If \(\lim_{n\rightarrow\infty} f_n(\omega)\) exists for every \(\omega\), then we say that \(\{f_n\}\) converges pointwise to the function defined by \(f(\omega) = \lim_{n\rightarrow\infty} f_n(\omega)\)

Then, let \((\Omega, \mathcal{F})\) be a measurable space. If \(X_n\) is a random variable for every \(n\), THEN

  • \(\inf_{n} X_n\)
  • \(\sup_{n} X_n\)
  • \(\liminf_{n\rightarrow \infty} X_n\)
  • \(\limsup_{n\rightarrow \infty} X_n\)

are all random variables, and if \(\{X_n\}\) converges pointwise to \(X = \lim_{n\rightarrow \infty} X_n\), then \(X\) is also a random variable.
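
The key identity behind the sup case (a proof sketch): for every \(c \in \mathbb{R}\),

\[ \left\{\sup_n X_n \leq c\right\} = \bigcap_{n} \{X_n \leq c\}, \]

a countable intersection of \(\mathcal{F}\) -measurable sets, hence \(\mathcal{F}\) -measurable. The inf case uses \(\{\inf_n X_n \geq c\} = \bigcap_n \{X_n \geq c\}\), and \(\liminf\) / \(\limsup\) are then built from countably many infs and sups.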

7. definition: discrete random variable

  1. A random variable \(X\) on a probability space \((\Omega, \mathcal{F}, \mathbb{P})\) is a discrete random variable if the range of \(X\) is finite or countable
  2. If this is the case, then we can define a probability mass function (PMF) \(p_X : \mathbb{R} \rightarrow [0,1]\) defined by \(p_X(x) = \mathbb{P}(X=x)\)
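
A quick sketch, using the (hypothetical) example of the sum of two fair dice: a discrete r.v. with finite range, whose PMF can be tabulated by counting outcomes:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Omega = ordered pairs of two fair dice rolls, X = their sum.
outcomes = list(product(range(1, 7), repeat=2))
counts = Counter(a + b for a, b in outcomes)

# p_X(x) = P(X = x), with the uniform measure on the 36 outcomes
p_X = {x: Fraction(n, len(outcomes)) for x, n in counts.items()}

print(p_X[7])             # 1/6
print(sum(p_X.values()))  # a PMF sums to 1 over the range
```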

8. definition: continuous random variable

A random variable \(X:\Omega \rightarrow \mathbb{R}\) on a probability space is a continuous r.v. if there exists a non-negative measurable function \(f : \mathbb{R} \rightarrow [0,\infty)\) such that \[ F_X(x) = \mathbb{P}(X \leq x) = \int_{\mathbb{R}} 1_{(-\infty, x]} f \, d\lambda \] for all \(x\in \mathbb{R}\), where \(\lambda\) is the Lebesgue Measure and the integral is a Lebesgue integral.
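
A numerical sketch, assuming the standard exponential density \(f(u) = e^{-u}\) for \(u \geq 0\) as the example: a plain Riemann sum over \([-10, x]\) stands in for the Lebesgue integral, and matches the closed form \(F_X(x) = 1 - e^{-x}\):

```python
import math

def f(u):
    """Exp(1) density: e^{-u} on [0, inf), 0 elsewhere."""
    return math.exp(-u) if u >= 0 else 0.0

def F_approx(x, lo=-10.0, n=200_000):
    """Left Riemann sum of f over [lo, x], approximating F_X(x).
    (For this density, mass below lo = -10 is negligible.)"""
    if x <= lo:
        return 0.0
    h = (x - lo) / n
    return sum(f(lo + i * h) for i in range(n)) * h

# Compare to the closed form F_X(x) = 1 - e^{-x} for x >= 0
print(abs(F_approx(1.0) - (1 - math.exp(-1.0))) < 1e-3)  # True
```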

9. More about PMFs: Marginal, Joint, and Conditional

9.1. for discrete random variables

9.1.1. marginal

The random variable \(X\) has marginal distribution \(p_X\), given by the PMF. Note: two random variables can have the same marginal distribution, but return different results on an experiment (remember that an r.v. is a function). So, for example, let \(X\) be an r.v. with a pmf that is symmetric about \(0\). Then, let \(Y = -X\). Then, \(Y\) and \(X\) have the same pmf, but they disagree on every outcome \(\omega\) with \(X(\omega) \neq 0\).
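
The \(X\) vs. \(Y = -X\) example, sketched numerically (the three-point sample space is made up for illustration):

```python
from collections import Counter
from fractions import Fraction

# Omega = {-1, 0, 1} with uniform P; X(w) = w is symmetric about 0, Y = -X.
omega = [-1, 0, 1]
X = {w: w for w in omega}
Y = {w: -X[w] for w in omega}

def pmf(Z):
    """Tabulate p_Z(z) = P(Z = z) under the uniform measure on omega."""
    c = Counter(Z.values())
    return {z: Fraction(n, len(omega)) for z, n in c.items()}

print(pmf(X) == pmf(Y))                  # True: same marginal distribution
print(any(X[w] != Y[w] for w in omega))  # True: different as functions
```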

9.1.2. joint

The joint pmf for r.v.'s \(X\) and \(Y\) is a function \(p_{X,Y} : \mathbb{R}^2 \rightarrow [0,1]\): \[ p_{X,Y}(x,y) = \mathbb{P}(X = x, Y =y) \]

Where \(\{X = x, Y=y\}\) is the event \(\{\omega\in\Omega : X(\omega)=x, Y(\omega)=y\}\).

9.1.2.1. marginalizing

The relation between the joint and marginal is given by: \[ p_X(x) = \mathbb{P}(\{X=x\}) = \sum_{y}\mathbb{P}(\{X=x, Y=y\}) = \sum_{y} p_{X,Y}(x,y) \] The second equality follows from the fact that \(\{X =x\}\) is the countable union of the disjoint sets \(\{X=x, Y=y\}\) (remember that \(Y\) is discrete).
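
Marginalizing in code, with a made-up joint PMF on \(\{0,1\}^2\):

```python
from fractions import Fraction

# Hypothetical joint pmf p_{X,Y} on {0,1} x {0,1}.
p_XY = {
    (0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8),
}

def marginal_X(x):
    """p_X(x) = sum over y of p_{X,Y}(x, y)."""
    return sum(p for (x_, y), p in p_XY.items() if x_ == x)

print(marginal_X(0))  # 1/4 + 1/4 = 1/2
print(marginal_X(1))  # 1/8 + 3/8 = 1/2
```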

9.1.3. conditional

The conditional PMF of \(X\) given \(Y\) is a function defined by: \[ p_{X\mid Y}(x\mid y) = \mathbb{P}(X=x \mid Y=y) = \frac{p_{X,Y}(x,y)}{p_Y(y)} \] if \(\mathbb{P}(Y=y) > 0\), otherwise it is undefined.

Note that \(\sum_x p_{X\mid Y}(x \mid y) = 1\), so that \(p_{X\mid Y}\) is just like a normal PMF, but taken over the (normalized) slice of the probability space where \(Y=y\).
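
A sketch of the conditional PMF with a made-up joint PMF on \(\{0,1\}^2\), including the undefined case \(\mathbb{P}(Y=y) = 0\) and the check that the conditional sums to 1 over \(x\):

```python
from fractions import Fraction

# Hypothetical joint pmf p_{X,Y} on {0,1} x {0,1}.
p_XY = {
    (0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8),
}

def p_Y(y):
    """Marginal of Y, obtained by summing the joint over x."""
    return sum(p for (x, y_), p in p_XY.items() if y_ == y)

def cond(x, y):
    """p_{X|Y}(x|y) = p_{X,Y}(x,y) / p_Y(y); undefined when P(Y=y) = 0."""
    if p_Y(y) == 0:
        raise ValueError("conditional pmf undefined when P(Y=y) = 0")
    return p_XY.get((x, y), Fraction(0)) / p_Y(y)

print(cond(0, 1))                       # (1/4) / (5/8) = 2/5
print(sum(cond(x, 1) for x in (0, 1)))  # normalizes to 1, like a regular PMF
```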

10. helpful links

Created: 2024-07-15 Mon 01:26