lebesgue integral
From the 6.436 lecture notes here and Brent Nelson's notes here.
1. key picture to remember
The Riemann integral can be approximated by:
- dividing the \(x\) axis into intervals of width \(w\)
- finding the average height \(h\) of the function in each interval
- in each interval, drawing a box of width \(w\) and height \(h\)
- summing up the areas of the boxes
Analogy: Finding the volume of a mountain by laying a grid on the ground and finding the height of the mountain in each grid cell
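The box picture above can be sketched in a few lines of Python. This is only an illustration: the function name `riemann_box_sum` is made up for this note, and the value of the function at the midpoint of each interval is used as a stand-in for its average height there.

```python
def riemann_box_sum(f, a, b, n):
    """Approximate the integral of f over [a, b] with n equal-width boxes.

    Assumption: the height of each box is f at the midpoint of its interval,
    used as a cheap proxy for the average height of f on that interval.
    """
    w = (b - a) / n  # common width of every box
    total = 0.0
    for i in range(n):
        midpoint = a + (i + 0.5) * w
        total += f(midpoint) * w  # area of one box: height times width
    return total


# Example: the area under x^2 on [0, 1] is 1/3.
approx = riemann_box_sum(lambda x: x * x, 0.0, 1.0, 1000)
```

With 1000 boxes, `approx` agrees with \(1/3\) to about six decimal places.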
The Lebesgue integral can be approximated by:
- dividing the \(y\) axis into different heights
- for a given height \(h\), finding the "width" of the region on the \(x\) axis such that the function has height \(h\) on that region
- This region is not necessarily an interval. It could be something really weird. We're relying on our measure to tell us its "width"
- multiplying the width by the height to get an area
- summing up the areas for all heights
Analogy: Finding the volume of a mountain by drawing contour lines and then finding the volumes that fall under each contour.
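The same idea can be sketched numerically by slicing the \(y\) axis instead of the \(x\) axis. This is a rough illustration under stated assumptions: the function is non-negative and bounded by `f_max`, and the "width" of each level set is estimated by counting evenly spaced sample points, which is only a crude stand-in for a real measure. The names `lebesgue_level_sum`, `f_max`, and `n_samples` are made up for this note.

```python
def lebesgue_level_sum(f, a, b, n_levels, f_max, n_samples=100_000):
    """Approximate the integral of f over [a, b] by slicing the y axis.

    Assumptions: 0 <= f(x) <= f_max on [a, b]. The measure of each level set
    {x : k*dy <= f(x) < (k+1)*dy} is estimated by counting sample points.
    """
    dy = f_max / n_levels          # thickness of each horizontal slice
    w = (b - a) / n_samples        # "width" represented by one sample point
    counts = [0] * (n_levels + 1)  # counts[k] = samples landing in level k
    for j in range(n_samples):
        x = a + (j + 0.5) * w
        k = min(int(f(x) / dy), n_levels)  # index of the level containing f(x)
        counts[k] += 1
    total = 0.0
    for k, c in enumerate(counts):
        height = k * dy   # lower edge of level k
        width = c * w     # estimated measure of the level set
        total += height * width
    return total


# Example: the area under x^2 on [0, 1] is 1/3.
approx = lebesgue_level_sum(lambda x: x * x, 0.0, 1.0, n_levels=1000, f_max=1.0)
```

Because each height is rounded down to the lower edge of its level, the result slightly underestimates \(1/3\), by at most about the slice thickness `dy`.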
2. recall: measurable functions
Recall from our discussion in the random variable note that a function \(f:\mathbb{R}\rightarrow\mathbb{R}\) is Lebesgue measurable if for every Borel set \(S \in \mathcal{B}\), we have \(f^{-1}(S) \in \mathcal{L}\), where \(\mathcal{L}\) is the collection of Lebesgue measurable sets.
3. recall a.e.
"\(p\) holds almost everywhere (a.e.)" means that \(p\) holds everywhere outside a set of measure 0 (see discussion of null-sets in Lebesgue Measure note).
4. recall extended value random variable
Recall that an extended-valued random variable \(X: \Omega \rightarrow \bar{\mathbb{R}}\) is a random variable that can take any value in \(\mathbb{R} \cup \{\infty, -\infty\}\).
5. outline
Our objective is to define \(\int g \, d\mu\), sometimes written \(\int g(\omega) \, d\mu(\omega)\) where \(g:\Omega \rightarrow \bar{\mathbb{R}}\) is a Lebesgue measurable function defined on a measure space \((\Omega, \mathcal{F}, \mu)\). We will follow the 6.436 lecture notes in progressively working our way towards a definition for general functions.
- First, we will define the Lebesgue integral for non-negative simple functions. Simple functions are measurable, and their range only contains a finite set of finite values.
- Then, we will generalize to non-negative (not necessarily simple) functions. In this case, we will define the integral by approximating from below using simple functions. (Think of how the Riemann integral involves approximations using rectangles)
- Finally, to handle functions that take negative values, we will decompose \(g\) into \(g^+\) and \(g^-\) and let \(\int g\, d\mu = \int g^+ \, d\mu - \int g^- \,d\mu\).
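For concreteness, the usual choice of decomposition in the last step (stated here as the standard convention, since the outline does not spell it out) is:
\[ g^+(\omega) = \max\{g(\omega), 0\}, \qquad g^-(\omega) = \max\{-g(\omega), 0\} \]
Both parts are non-negative, and \(g = g^+ - g^-\) while \(|g| = g^+ + g^-\). For example, at a point where \(g(\omega) = -3\) we get \(g^+(\omega) = 0\) and \(g^-(\omega) = 3\). The difference \(\int g^+ \, d\mu - \int g^- \, d\mu\) only makes sense when at least one of the two integrals is finite, since \(\infty - \infty\) is undefined.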
6. properties
There are a number of properties that turn out to be true of the integrals that we define. These properties generally match up with what we would expect from an integral, e.g.
- integral of \(g\) where \(g\geq 0\) is \(\geq 0\)
- integral of \(g\), where \(g=0\) a.e. is \(=0\).
A full list can be found in the 6.436 notes and on the wikipedia page.
There are two properties that we will highlight in particular
6.1. monotone convergence theorem
\[ 0 \leq g_n \uparrow g \Rightarrow \int g_n \, d\mu \uparrow \int g \, d\mu \] and \[ 0 \leq g_n \uparrow g \text{ a.e.} \Rightarrow \int g_n \, d\mu \uparrow \int g \, d\mu \] This says that if a sequence of non-negative measurable functions \(\{g_n\}\) increases pointwise (or pointwise a.e.) to \(g\), then the integrals \(\int g_n \, d\mu\) increase to \(\int g \, d\mu\). In other words, the limit and the integral can be exchanged. This is noteworthy, because the same cannot be said of the Riemann integral (see below).
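A small worked example, chosen here for illustration and not taken from the source notes: on \([0,1]\) with Lebesgue measure \(\lambda\), let \(g(x) = x^{-1/2}\) for \(x > 0\) and let \(g_n = \min(g, n)\). Then \(0 \leq g_n \uparrow g\), and since \(g(x) \geq n\) exactly when \(x \leq 1/n^2\):
\[ \int g_n \, d\lambda = n \cdot \frac{1}{n^2} + \int_{1/n^2}^{1} x^{-1/2} \, dx = \frac{1}{n} + 2\left(1 - \frac{1}{n}\right) = 2 - \frac{1}{n} \uparrow 2 \]
So the theorem gives \(\int g \, d\lambda = 2\), even though \(g\) is unbounded near 0.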
6.2. linearity of expectation
\[ \int(g + h)\, d\mu = \int g \, d\mu + \int h \, d\mu \] When \(\mu\) is a probability measure and \(g\) and \(h\) are random variables (see below), then we get the linearity of expectation (see properties of expectation).
6.3. relation to probability measure
If \((\Omega, \mathcal{F}, \mathbb{P})\) is a probability space and \(X:\Omega \rightarrow \bar{\mathbb{R}}\) is measurable (i.e. a random variable), then \(\int X \, d\mathbb{P}\) is also denoted \(\mathbb{E}[X]\) and called the expectation of \(X\).
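On a finite probability space the integral reduces to a weighted sum, which makes it easy to check by hand. The sketch below uses a hypothetical example (a fair six-sided die, with the names `omega`, `prob`, and `expectation` made up for this note) and also checks linearity on that example.

```python
from fractions import Fraction

# Hypothetical finite probability space: a fair six-sided die.
omega = [1, 2, 3, 4, 5, 6]
prob = {w: Fraction(1, 6) for w in omega}


def expectation(X):
    """Integral of X with respect to P on a finite space:
    the sum of X(w) * P({w}) over all outcomes w."""
    return sum(X(w) * prob[w] for w in omega)


def X(w):
    return w          # face value of the die


def Y(w):
    return w % 2      # 1 if the face is odd, 0 otherwise


e_x = expectation(X)                         # 7/2
e_y = expectation(Y)                         # 1/2
e_sum = expectation(lambda w: X(w) + Y(w))   # 4, which equals e_x + e_y
```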
7. limitations of Riemann integral
7.1. definition of Riemann integral
Recall the definition of the Riemann integral \(\int_{a}^{b} g(x) \, dx\). We divide the interval \([a,b]\) using a finite sequence of points \(\sigma = (x_1, \ldots, x_n)\) that satisfy \(a = x_1 < x_2 < \cdots < x_n = b\). Then, we define: \[ U(\sigma) = \sum_{i=1}^{n-1}\left(\sup_{x_i \leq x < x_{i+1}} g(x)\right) \cdot (x_{i+1} - x_i) \] and \[ L(\sigma) = \sum_{i=1}^{n-1}\left(\inf_{x_i \leq x < x_{i+1}} g(x)\right) \cdot (x_{i+1} - x_i) \] (We use \(\sup\) and \(\inf\) rather than \(\max\) and \(\min\), since a maximum or minimum need not be attained on a half-open interval.)
\(U(\sigma)\) and \(L(\sigma)\) are approximations of the area under \(g\) from above and below respectively. You can see that they are the sums of rectangles, where the rectangles in \(U\) have heights above \(g\) and the rectangles in \(L\) have heights below \(g\).
Then, taking the supremum and infimum over all partitions \(\sigma\), the Riemann integral of \(g\) is defined as the common value \(c\) in: \[ \sup_{\sigma} L(\sigma) = \inf_{\sigma} U(\sigma) = c \] provided that the two quantities are finite and equal.
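To see the upper and lower sums squeeze together, here is a minimal sketch. It makes a simplifying assumption: \(g\) is continuous and non-decreasing, so on each sub-interval the infimum is the value at the left endpoint and the supremum is the value at the right endpoint. The helper name `upper_lower_sums` is made up for this note.

```python
from fractions import Fraction


def upper_lower_sums(g, sigma):
    """Return (U, L) for the partition sigma.

    Assumption: g is continuous and non-decreasing, so on [left, right)
    the infimum is g(left) and the supremum is g(right).
    """
    upper = Fraction(0)
    lower = Fraction(0)
    for left, right in zip(sigma, sigma[1:]):
        width = right - left
        lower += g(left) * width   # rectangle sitting below g
        upper += g(right) * width  # rectangle sitting above g
    return upper, lower


def square(x):
    return x * x


coarse = [Fraction(i, 4) for i in range(5)]      # 0, 1/4, 1/2, 3/4, 1
fine = [Fraction(i, 100) for i in range(101)]    # 0, 1/100, ..., 1

U4, L4 = upper_lower_sums(square, coarse)        # 15/32 and 7/32
U100, L100 = upper_lower_sums(square, fine)      # gap U - L shrinks to 1/100
```

For \(g(x) = x^2\) on \([0,1]\), both sums bracket \(1/3\), and the gap \(U(\sigma) - L(\sigma)\) equals the mesh width for these uniform partitions.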
7.2. \(1_{\mathbb{Q}}\) is not Riemann integrable
Let \(1_{\mathbb{Q}}\) be the Dirichlet function: \[1_{\mathbb{Q}}(x) = \begin{cases} 1 & \text{if } x\in \mathbb{Q}\\ 0 & \text{if } x\not\in \mathbb{Q} \end{cases}\]
Every interval \([x_i, x_{i+1})\) contains both a rational number and an irrational number. So \(L(\sigma) = 0\) and \(U(\sigma) = b - a\) for any \(\sigma\) (for example, \(U(\sigma) = 1\) on \([0,1]\)). Since the lower and upper sums never agree, the Riemann integral does not exist.
7.3. failure of monotone convergence theorem
From the wikipedia page.
Let's define a sequence of functions, such that each function is Riemann integrable, but the limit is not Riemann integrable.
Let \(\{a_k\}\) be a countable enumeration of all rational numbers.
Define the function \(g_k\) by \[g_k(x) = \begin{cases} 1 & \text{if } x = a_j \text{ for some } j\leq k\\ 0 & \text{otherwise} \end{cases}\] Then \(g_k\) is 0 except at finitely many points, so each \(g_k\) is Riemann integrable, with integral 0. The sequence is also non-negative and increasing in \(k\). But \(g_k \rightarrow 1_{\mathbb{Q}}\) pointwise as \(k\rightarrow \infty\), and \(1_{\mathbb{Q}}\) is not Riemann integrable.
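By contrast, the Lebesgue integral handles this sequence without trouble. A short computation, assuming the standard fact that Lebesgue measure \(\lambda\) assigns measure 0 to every single point, and using the definition for simple functions given later in this note:
\[ \int g_k \, d\lambda = \sum_{j=1}^{k} 1 \cdot \lambda(\{a_j\}) = 0 \quad \text{for every } k \]
Since \(0 \leq g_k \uparrow 1_{\mathbb{Q}}\), the monotone convergence theorem gives \(\int 1_{\mathbb{Q}} \, d\lambda = \lim_{k \rightarrow \infty} \int g_k \, d\lambda = 0\), which matches \(\lambda(\mathbb{Q}) = 0\).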
8. Lebesgue integral
8.1. simple function
A function \(g : \Omega \rightarrow \mathbb{R}\) is simple if it is Lebesgue measurable, finite, and can only take finitely many values. In particular, a simple function can be written as: \[\begin{equation} g(\omega) = \sum_{i=1}^{k} a_i 1_{A_i}(\omega) \label{eq1} \end{equation}\]
for all \(\omega \in \Omega\). That is, it can be written as a weighted sum of indicator functions: think of a generalized piecewise constant (step) function.
Note that the representation on the RHS is not necessarily unique for \(g\). However, it is unique if we require the \(a_i\) to be distinct and the sets \(A_i\) to form a partition of \(\Omega\). A representation of this kind is called a canonical representation.
8.2. Lebesgue integral for non-negative simple functions
Then, for non-negative simple functions \(g\) of the form in \(\ref{eq1}\), we define the Lebesgue integral: \[ \int g \, d\mu = \sum_{i=1}^{k} a_i \mu(A_i) \] where if \(a_i = 0\) and \(\mu(A_i) = \infty\), we use the convention \(a_i\mu(A_i) = 0\).
Note that this does not require \(g\) to be in canonical form. Quick check: if \(g\) has two distinct representations, will the integrals computed from them be equal? Yes, because both representations can be reduced to the canonical form, so both give the same integral as the canonical form.
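The quick check above can be tried on a toy example. The sketch below assumes a hypothetical finite measure space (six points, each with mass \(1/2\)); the names `point_mass`, `mu`, `evaluate`, and `integrate_simple` are made up for this note. It writes the same simple function in two ways and confirms the integrals agree.

```python
from fractions import Fraction

# Hypothetical finite measure space: Omega = {0, ..., 5}, each point has mass 1/2.
point_mass = {w: Fraction(1, 2) for w in range(6)}


def mu(A):
    """Measure of a subset A of Omega: the sum of its point masses."""
    return sum(point_mass[w] for w in A)


def evaluate(terms, w):
    """Value at w of the simple function sum_i a_i * 1_{A_i}."""
    return sum(a for a, A in terms if w in A)


def integrate_simple(terms):
    """Integral of sum_i a_i * 1_{A_i}, computed as sum_i a_i * mu(A_i)."""
    return sum(a * mu(A) for a, A in terms)


# g = 2 on {0, 1, 2}, 5 on {3}, 0 on {4, 5}: canonical form (distinct values, partition).
canonical = [(2, {0, 1, 2}), (5, {3}), (0, {4, 5})]
# The same g written with overlapping sets: 2 * 1_{0,1,2,3} + 3 * 1_{3}.
overlapping = [(2, {0, 1, 2, 3}), (3, {3})]

same_function = all(evaluate(canonical, w) == evaluate(overlapping, w) for w in range(6))
integral_canonical = integrate_simple(canonical)      # 2 * 3/2 + 5 * 1/2 = 11/2
integral_overlapping = integrate_simple(overlapping)  # 2 * 2 + 3 * 1/2 = 11/2
```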
8.3. Lebesgue integral for non-negative functions
For a non-negative (not necessarily simple) measurable function \(g: \Omega \rightarrow [0, \infty]\), let \(S(g)\) be the set of all non-negative simple functions \(q\) that satisfy \(0\leq q \leq g\). Then, we define the Lebesgue integral for \(g\) as: \[ \int g \, d\mu = \sup_{q\in S(g)} \int q \, d\mu \]
8.3.1. intuition
At first I was confused, because I knew that \(g\) could be very weird looking: it could be nowhere continuous. So how could we hope to approximate it using simple functions? I think it helped to remember that the Riemann integral involves approximations using rectangles. This is kind of like that, where instead of using finer meshes of intervals, we are using finer simple functions. The next section also helped.
8.4. A concrete example
The above definition tells us how to find \(\int g \, d\mu\) given \(S(g)\), but doesn't tell us how to find useful elements of \(S(g)\). Here, we explicitly give one example to help with understanding. For a non-negative measurable function \(g\), we will define a sequence \(q_r\) such that \(\int q_r \, d\mu \rightarrow \int g \, d\mu\).
For any positive integer \(r\), we define \(q_r : \Omega \rightarrow \mathbb{R}\): \[q_r(\omega) = \begin{cases} r &\text{if } g(\omega) \geq r\\ \frac{i}{2^r} &\text{if } \frac{i}{2^r} \leq g(\omega) < \frac{i+1}{2^r}, \; i=0, 1, \ldots, r2^r - 1 \end{cases}\]
In words: \(q_r\) is capped by \(r\), otherwise it snaps to the nearest multiple of \(\frac{1}{2^r}\), rounding down. The resolution gets better as \(r\) increases.
Then, the two things to notice are:
- every \(q_r\) is simple and measurable.
- Simple because it takes only finitely many values (at most \(r2^r + 1\) of them), all of which are finite
- Measurable because \(g\) is measurable, so each set \(\{\omega : \frac{i}{2^r} \leq g(\omega) < \frac{i+1}{2^r}\}\) and the set \(\{\omega : g(\omega) \geq r\}\) is the pre-image of an interval, and hence measurable.
- We have \(q_r \uparrow g\). That is, \(q_r(\omega) \uparrow g(\omega)\) for every \(\omega\)
Now, you can use the mental image of these functions to understand what is going on in \(S(g)\).
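To make the mental image concrete, here is a minimal sketch of \(q_r\) evaluated at a single point. The function name `q` and the choice of test value \(g(\omega) = \pi\) are illustrative assumptions only; the code takes the value \(g(\omega)\) directly rather than \(\omega\) itself.

```python
import math


def q(r, g_value):
    """Value of q_r at a point where g takes the value g_value >= 0.

    Capped at r; otherwise rounded down to the nearest multiple of 1 / 2**r.
    """
    if g_value >= r:
        return float(r)
    return math.floor(g_value * 2**r) / 2**r


# Watch q_r(omega) climb towards g(omega) = pi as r grows.
values = [q(r, math.pi) for r in range(1, 11)]
# values starts 1.0, 2.0, 3.0 (the cap is active), then 3.125, 3.125, 3.140625, ...
```

For small \(r\) the cap dominates; once \(r\) exceeds \(g(\omega)\), the gap between \(q_r(\omega)\) and \(g(\omega)\) is less than \(1/2^r\). At a point where \(g(\omega) = \infty\), `q(r, math.inf)` returns `r`, which still increases to \(\infty\).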