# random variable

From the MIT 6.436 lecture notes.

## 1. definition: random variables

Let \((\Omega, \mathcal{F})\) be a measurable space.

A function \(X: \Omega \rightarrow \mathbb{R}\) is a random variable if the set \(\{\omega \mid X(\omega) \leq c\}\) is \(\mathcal{F}\)-measurable (see measurable set) for every \(c \in \mathbb{R}\).

I think of \(X\) as a labelling function with a very specific property that describes how labels are distributed.

## 2. probability law of random variable

First some notation:

- for a r.v. \(X\), the set \(\{\omega \mid X(\omega) \leq c\}\) is often written \(\{X \leq c\}\)
- for a general subset \(B \subset \mathbb{R}\), the set \(\{\omega \mid X(\omega) \in B\}\) is often written \(X^{-1}(B)\) or \(\{X \in B\}\)

The definition only talks about sets \(\{\omega \mid X(\omega) \leq c\}\), but in fact \(\{X \in B\}\) is \(\mathcal{F}\)-measurable for every Borel measurable set \(B\in \mathcal{B}\) (see Lebesgue Measure note). Why? Given a random variable \(X\), we know that each set \(\{X \leq c\}\) is \(\mathcal{F}\)-measurable, and the intervals \((-\infty, c]\) are all it takes to generate the Borel \(\sigma\)-field (again, see Lebesgue Measure note). A short proof would generate the Borel \(\sigma\)-algebra from these intervals and show that each generated set \(B\) has an \(\mathcal{F}\)-measurable preimage \(\{X \in B\}\).
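To fill in the key step of that argument (my own sketch, not from the notes): because preimages commute with complements and countable unions, the collection \(\{B \subseteq \mathbb{R} \mid \{X \in B\} \in \mathcal{F}\}\) is itself a \(\sigma\)-algebra; since it contains every \((-\infty, c]\), it must contain all of \(\mathcal{B}\). For example,

\[
\{X > a\} = \{X \leq a\}^C \in \mathcal{F}, \qquad
\{X \in (a, b]\} = \{X \leq b\} \setminus \{X \leq a\} \in \mathcal{F},
\]
\[
\left\{X \in \bigcup_n B_n\right\} = \bigcup_n \{X \in B_n\} \in \mathcal{F}.
\]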

## 3. definition

Let \((\Omega, \mathcal{F}, \mathbb{P})\) be a probability space. Let \(X : \Omega \rightarrow \mathbb{R}\) be a random variable. Then,

- Define for every Borel measurable set \(B\in \mathcal{B}\), \(\mathbb{P}_X(B) = \mathbb{P}(X \in B)\)
- The resulting \(\mathbb{P}_X : \mathcal{B} \rightarrow [0,1]\) is called the *probability law* of \(X\)
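The pushforward in this definition can be sketched on a finite probability space. A minimal sketch, assuming a fair die for \(\Omega\) and parity for \(X\) (both choices are illustrative, not from the notes):

```python
from collections import defaultdict

# A finite probability space: Omega = {1,...,6} with uniform P (a fair die).
Omega = range(1, 7)
P = {omega: 1 / 6 for omega in Omega}

# A random variable X: Omega -> R; here X(omega) is the parity of omega.
def X(omega):
    return omega % 2

# The probability law P_X(B) = P(X in B), computed by pushing P forward
# through X: each outcome's mass is credited to its label X(omega).
def law(X, P):
    P_X = defaultdict(float)
    for omega, p in P.items():
        P_X[X(omega)] += p
    return dict(P_X)

# Each parity class collects three outcomes, so each gets probability 1/2.
print(law(X, P))
```

Note that `law` only ever needs the values \(X(\omega)\) and the measure \(\mathbb{P}\); measurability is automatic on a finite space because every subset is \(\mathcal{F}\)-measurable.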

### 3.1. proposition

Let \((\Omega, \mathcal{F}, \mathbb{P})\) be a probability space. Then, the probability law \(\mathbb{P}_X\) of \(X\) is a probability measure on \((\mathbb{R}, \mathcal{B})\) (remember \(\mathcal{B}\) is the Borel \(\sigma\)-algebra).

## 4. definition: measurable functions

More generally, let \((\Omega_1, \mathcal{F}_1)\) and \((\Omega_2, \mathcal{F}_2)\) be two measurable spaces. Then, a function \(f : \Omega_1 \rightarrow \Omega_2\) is said to be \((\mathcal{F}_1, \mathcal{F}_2)\) measurable if \(f^{-1}(B) \in \mathcal{F}_1\) for every \(B \in \mathcal{F}_2\)

So, a random variable is a function that is \((\mathcal{F}, \mathcal{B})\)-measurable. A question I have for the above: does this mean that \(f\) needs to be invertible? (No: \(f^{-1}(B)\) here denotes the preimage \(\{\omega \mid f(\omega) \in B\}\), which is defined for any function, invertible or not.)

## 5. theorem: assorted facts about random variables

Let \((\Omega, \mathcal{F})\) be a measurable space.

- (*simple random variables*) Let \(A \in \mathcal{F}\) and define the indicator function \(I_A(\omega) = 1\) when \(\omega \in A\) and \(I_A(\omega) = 0\) when \(\omega \not\in A\). Then \(I_A\) is measurable. More precisely, it is \((\mathcal{F}, \mathcal{B})\)-measurable.
- Let \(A_1, ..., A_n\) be \(\mathcal{F}\)-measurable and let \(x_1,...,x_n\) be real numbers. Then, define the function \(X = \sum_i x_i I_{A_i}\). That is,
\[
X(\omega) = \sum_{i}x_i I_{A_i}(\omega)
\] for every \(\omega \in \Omega\). Then \(X\) is a random variable. (Called a *simple random variable*.)
- Suppose \((\Omega, \mathcal{F}) = (\mathbb{R}, \mathcal{B})\). Let \(X : \mathbb{R} \rightarrow \mathbb{R}\) be a continuous function. Then \(X\) is a random variable.
- (functions of a random variable) Let \(X\) be a random variable. Let \(f : \mathbb{R} \rightarrow \mathbb{R}\) be a continuous function (or, more generally, a \((\mathcal{B}, \mathcal{B})\)-measurable function). Then \(f(X)\) is a random variable.
- Let \(X_1,...,X_n\) be random variables. Let \(f: \mathbb{R}^n \rightarrow \mathbb{R}\) be a continuous function. Then \(f(X_1,..., X_n)\) is a random variable. In particular \(X_1 + X_2\) and \(X_1X_2\) are random variables.
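The simple-random-variable construction can be sketched concretely. A minimal sketch on a finite \(\Omega\); the particular sets \(A_1, A_2\) and weights \(x_1, x_2\) below are made up for illustration:

```python
# A simple random variable X = x1*I_{A1} + x2*I_{A2} on a finite Omega.
Omega = {1, 2, 3, 4, 5, 6}
A1, A2 = {1, 2, 3}, {2, 4, 6}   # F-measurable events (every subset, here)
x1, x2 = 10.0, 1.0

def indicator(A):
    """I_A(omega) = 1 if omega is in A, else 0."""
    return lambda omega: 1 if omega in A else 0

def X(omega):
    return x1 * indicator(A1)(omega) + x2 * indicator(A2)(omega)

# X takes finitely many values -- one per cell of the partition that
# A1 and A2 generate (in A1 only, in A2 only, in both, in neither).
print(sorted({X(omega) for omega in Omega}))  # [0.0, 1.0, 10.0, 11.0]
```

The finite range is exactly why measurability is easy to check for simple random variables: \(\{X \leq c\}\) is always a finite union of partition cells.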

### 5.1. stubs of proofs

- \(\{I_A \leq c\} = \emptyset\) for \(c < 0\); \(\{I_A \leq c\} = A^C\) for \(0 \leq c < 1\); and \(\{I_A \leq c\} = \Omega\) for \(c \geq 1\). All three are \(\mathcal{F}\)-measurable.
- Consider the set \(\{\omega \mid X(\omega) \leq c\}\). We want to check that this set is \(\mathcal{F}\)-measurable. \(X\) takes finitely many values, one per subset \(S \subseteq \{1,...,n\}\) of indicators that are "on". Each such subset corresponds to an intersection of \(\mathcal{F}\)-measurable sets (\(A_i\) for \(i \in S\), \(A_i^C\) for \(i \notin S\)), and \(\{X \leq c\}\) is the finite union of those intersections whose value \(\sum_{i \in S} x_i\) is \(\leq c\).
- From the definition of a measurable function, the preimage under \(f\) of every Borel set is again a Borel set. The set of values \(u\) such that \(u\leq c\) is the Borel set \(B = (-\infty, c]\). So \(f^{-1}(B)\) is a Borel set, and \(X^{-1}(f^{-1}(B)) = \{f(X) \leq c\}\) is an \(\mathcal{F}\)-measurable set because \(X\) is a random variable. So \(f(X)\) is a random variable on \((\Omega, \mathcal{F})\).

## 6. Theorem: inf, sup, and limit of random variable

Let \(f_n : \Omega \rightarrow \mathbb{R}\) be a function for every \(n\). Then consider some new functions that we can define:

- \(f(\omega) = \inf_{n} f_n(\omega)\) for all \(\omega \in \Omega\)
- \(f(\omega) = \sup_{n} f_n(\omega)\) for all \(\omega \in \Omega\)
- \(f(\omega) = \liminf_{n\rightarrow \infty} f_n(\omega) = \lim_{n\rightarrow\infty} \inf_{k \geq n} f_k(\omega)\) for all \(\omega \in \Omega\)
- \(f(\omega) = \limsup_{n\rightarrow \infty} f_n(\omega) = \lim_{n\rightarrow\infty} \sup_{k \geq n} f_k(\omega)\) for all \(\omega \in \Omega\).

If \(\lim_{n\rightarrow\infty} f_n(\omega)\) exists for every \(\omega\), then we say that \(\{f_n\}\) converges pointwise to the function defined by \(f(\omega) = \lim_{n\rightarrow\infty} f_n(\omega)\).

Then, let \((\Omega, \mathcal{F})\) be a measurable space. If \(X_n\) is a random variable for every \(n\), then

- \(\inf_{n} X_n\)
- \(\sup_{n} X_n\)
- \(\liminf_{n\rightarrow \infty} X_n\)
- \(\limsup_{n\rightarrow \infty} X_n\)

are all random variables, and if \(\{X_n\}\) converges pointwise to \(X = \lim_{n\rightarrow \infty} X_n\), then \(X\) is also a random variable.
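A sketch of the first step (my addition): the sup and inf cases reduce to countable set operations, which \(\mathcal{F}\) is closed under:

\[
\Big\{\sup_n X_n \leq c\Big\} = \bigcap_{n} \{X_n \leq c\} \in \mathcal{F},
\qquad
\Big\{\inf_n X_n < c\Big\} = \bigcup_{n} \{X_n < c\} \in \mathcal{F}.
\]

The \(\liminf\) and \(\limsup\) cases then follow by composing: e.g. \(\limsup_n X_n = \inf_n \sup_{k \geq n} X_k\), an inf of random variables, each of which is a sup of random variables.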

## 7. definition: discrete random variable

- A random variable \(X\) on a probability space \((\Omega, \mathcal{F}, \mathbb{P})\) is a *discrete random variable* if the range of \(X\) is countable or finite
- If this is the case, then we can define a probability mass function (PMF) \(p_X : \mathbb{R} \rightarrow [0,1]\) defined by \(p_X(x) = \mathbb{P}(X=x)\)

## 8. definition: continuous random variable

A random variable \(X:\Omega \rightarrow \mathbb{R}\) on a probability space is a continuous r.v. if there exists a non-negative measurable function \(f : \mathbb{R} \rightarrow [0,\infty)\) such that \[ F_X(x) = \mathbb{P}(X \leq x) = \int_{\mathbb{R}} 1_{(-\infty, x]} f \, d\lambda \] for all \(x\in \mathbb{R}\), where \(\lambda\) is the Lebesgue Measure and the integral is a Lebesgue integral.
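This definition can be checked numerically for a concrete density. A rough sketch, assuming the exponential(1) density \(f(u) = e^{-u}\) for \(u \geq 0\) (my choice, not from the notes), with a midpoint Riemann sum standing in for the Lebesgue integral:

```python
import math

# Density f of an exponential(1) random variable: f(u) = e^{-u} for u >= 0.
def f(u):
    return math.exp(-u) if u >= 0 else 0.0

# Approximate F_X(x) = integral of 1_{(-inf, x]} * f dlambda with a midpoint
# Riemann sum over [lo, x]; lo = -10 is far enough left that f is ~0 there.
def F(x, lo=-10.0, n=200_000):
    if x <= lo:
        return 0.0
    du = (x - lo) / n
    return sum(f(lo + (i + 0.5) * du) for i in range(n)) * du

# Compare against the closed form F_X(x) = 1 - e^{-x} for x >= 0.
print(F(1.0), 1 - math.exp(-1.0))
```

For a density this smooth the two numbers agree to several decimal places; the only real error source is the jump of \(f\) at \(0\), which the midpoint rule smears over one grid cell.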

## 9. More about PMFs: Marginal, Joint, and Conditional

### 9.1. for discrete random variables

#### 9.1.1. marginal

The random variable \(X\) has marginal distribution \(p_X\), given by the PMF. Note: two random variables can have the same marginal distribution but return different results on an experiment (remember that an r.v. is a function). For example, let \(X\) be an r.v. with a pmf that is symmetric about \(0\), and let \(Y = -X\). Then \(Y\) and \(X\) have the same pmf, but different values on every outcome \(\omega\) with \(X(\omega) \neq 0\).
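The \(Y = -X\) example can be sketched directly. A minimal sketch; the three-point space and the measure below are made up for illustration:

```python
# A three-outcome space with a pmf for X that is symmetric about 0.
P = {'a': 0.25, 'b': 0.5, 'c': 0.25}           # P on Omega = {'a', 'b', 'c'}
X = {'a': -1, 'b': 0, 'c': 1}                  # X as a table of values
Y = {omega: -X[omega] for omega in X}          # Y = -X

def pmf(Z, P):
    # p_Z(z) = P(Z = z): sum the mass of every outcome mapped to z.
    p = {}
    for omega, prob in P.items():
        p[Z[omega]] = p.get(Z[omega], 0.0) + prob
    return p

print(pmf(X, P) == pmf(Y, P))   # True: identical marginal distributions
print(X['a'], Y['a'])           # -1 1: yet different values on the same outcome
```

So the marginal distribution forgets the labelling: it records how much mass lands on each value, not which outcomes carry it.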

#### 9.1.2. joint

The joint pmf for r.v.'s \(X\) and \(Y\) is a function \(p_{X,Y} : \mathbb{R}^2 \rightarrow [0,1]\): \[ p_{X,Y}(x,y) = \mathbb{P}(X = x, Y =y) \]

Where \(\{X = x, Y=y\}\) is the event \(\{\omega\in\Omega : X(\omega)=x, Y(\omega)=y\}\).

##### 9.1.2.1. marginalizing

The relation between the joint and marginal is given by: \[ p_X(x) = \mathbb{P}(\{X=x\}) = \sum_{y}\mathbb{P}(\{X=x, Y=y\}) = \sum_{y} p_{X,Y}(x,y) \] The second equality follows from the fact that \(\{X =x\}\) is the countable union of the disjoint sets \(\{X=x, Y=y\}\) (remember that \(Y\) is discrete).
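Marginalizing is just that sum over \(y\). A minimal sketch; the joint table below is made up for illustration:

```python
# A joint pmf p_{X,Y} as a dict keyed by (x, y); entries sum to 1.
p_XY = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

def marginal_X(p_XY):
    # p_X(x) = sum over y of p_{X,Y}(x, y)
    p_X = {}
    for (x, y), p in p_XY.items():
        p_X[x] = p_X.get(x, 0.0) + p
    return p_X

# p_X(0) = 0.1 + 0.3 = 0.4 and p_X(1) = 0.2 + 0.4 = 0.6 (up to float rounding).
print(marginal_X(p_XY))
```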

#### 9.1.3. conditional

The conditional PMF of \(X\) given \(Y\) is a function defined by: \[ p_{X\mid Y}(x\mid y) = \mathbb{P}(X=x \mid Y=y) = \frac{p_{X,Y}(x,y)}{p_Y(y)} \] if \(\mathbb{P}(Y=y) > 0\), otherwise it is undefined.

Note that \(\sum_x p_{X\mid Y}(x \mid y) = 1\), so that \(p_{X\mid Y}\) is just like a normal PMF, but taken over the (normalized) slice of the probability space where \(Y=y\).
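The conditioning-as-a-normalized-slice view can be sketched with the same kind of joint table (made up for illustration):

```python
# A joint pmf p_{X,Y} as a dict keyed by (x, y); entries sum to 1.
p_XY = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

def conditional_X_given_Y(p_XY, y):
    # p_{X|Y}(x|y) = p_{X,Y}(x, y) / p_Y(y); undefined when p_Y(y) = 0.
    p_y = sum(p for (x, yy), p in p_XY.items() if yy == y)
    if p_y == 0:
        raise ValueError("conditional pmf undefined when P(Y=y) = 0")
    return {x: p / p_y for (x, yy), p in p_XY.items() if yy == y}

cond = conditional_X_given_Y(p_XY, y=0)
print(cond)                # the y=0 slice renormalized: p(0|0)=1/3, p(1|0)=2/3
print(sum(cond.values()))  # total is 1 (up to float rounding)
```

Dividing by \(p_Y(y)\) is exactly the renormalization: the slice \(\{Y=y\}\) of the joint table is rescaled so that it sums to \(1\), making \(p_{X\mid Y}(\cdot \mid y)\) a genuine PMF.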