mutual information
The mutual information $I(X;Y)$ between random variables $X$ and $Y$ is the Kullback-Leibler divergence between the joint distribution $p(x,y)$ and the product of the marginals $p(x)\,p(y)$:

$$I(X;Y) = D_{\mathrm{KL}}\big(p(x,y) \,\|\, p(x)\,p(y)\big) = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}$$

Expressed in terms of entropy:

$$\begin{aligned}
I(X;Y) &= H(X) - H(X \mid Y) \\
       &= H(Y) - H(Y \mid X) \\
       &= H(X) + H(Y) - H(X,Y)
\end{aligned}$$

In the first line, you can think of:

- $H(X)$ as the uncertainty in $X$
- the conditional entropy $H(X \mid Y)$ as the uncertainty remaining in $X$ after $Y$ is known
- $I(X;Y)$ as the amount that knowing $Y$ reduces the uncertainty in $X$
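Here is a minimal Python sketch (NumPy only, with a made-up joint distribution chosen just for illustration) that computes $I(X;Y)$ both from the KL-divergence definition and from the entropy decomposition $H(X) + H(Y) - H(X,Y)$, confirming that the two agree:

```python
import numpy as np

# Hypothetical joint distribution p(x, y) over two binary variables
# (the numbers are made up); rows index x, columns index y.
p_xy = np.array([[0.30, 0.20],
                 [0.10, 0.40]])

p_x = p_xy.sum(axis=1)  # marginal p(x)
p_y = p_xy.sum(axis=0)  # marginal p(y)

def entropy(p):
    """Shannon entropy in bits, skipping zero-probability entries."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# KL form: I(X;Y) = sum_{x,y} p(x,y) * log[ p(x,y) / (p(x) p(y)) ]
mask = p_xy > 0
mi_kl = np.sum(p_xy[mask] * np.log2((p_xy / np.outer(p_x, p_y))[mask]))

# Entropy form: I(X;Y) = H(X) + H(Y) - H(X,Y)
mi_entropy = entropy(p_x) + entropy(p_y) - entropy(p_xy.ravel())

print(mi_kl, mi_entropy)  # both give the same value (about 0.12 bits)
```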
You can also think of $I(X;Y)$ as a measure of the information shared by $X$ and $Y$. How much does knowing one variable reduce uncertainty about the other? If $X$ and $Y$ are independent, then $I(X;Y)$ is zero, because knowing the value of $Y$ doesn't change the distribution of $X$ at all. However, if $X$ is a deterministic function of $Y$, then $H(X \mid Y) = 0$, since there is no uncertainty left in $X$ after observing $Y$. The mutual information is then $I(X;Y) = H(X) - H(X \mid Y) = H(X)$. That is, the amount of uncertainty reduction that we get from observing $Y$ is exactly all the uncertainty that $X$ had to begin with.
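These two extreme cases can be checked numerically. In this sketch the joint tables are invented for illustration: in the independent case the joint is an outer product of the marginals, and in the deterministic case $X$ is a function of $Y$ (here, an indicator of whether $Y < 2$):

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits, skipping zero-probability entries."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(p_xy):
    """I(X;Y) in bits from a joint table (rows index x, columns index y)."""
    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
    mask = p_xy > 0
    return np.sum(p_xy[mask] * np.log2((p_xy / np.outer(p_x, p_y))[mask]))

# Independent: p(x, y) = p(x) p(y), so I(X;Y) = 0.
p_indep = np.outer([0.5, 0.5], [0.25, 0.25, 0.25, 0.25])
print(mutual_information(p_indep))       # 0.0

# Deterministic: Y is uniform over {0, 1, 2, 3} and X = 1 if Y < 2 else 0,
# so H(X | Y) = 0 and I(X;Y) = H(X) = 1 bit.
p_det = np.array([[0.00, 0.00, 0.25, 0.25],   # x = 0  (Y >= 2)
                  [0.25, 0.25, 0.00, 0.00]])  # x = 1  (Y < 2)
print(mutual_information(p_det))         # 1.0
print(entropy(p_det.sum(axis=1)))        # H(X) = 1.0
```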
You can also take a closer look at the definition. Just like entropy, mutual information is obtained by averaging a quantity over a distribution. Here, we average over the joint distribution $p(x,y)$, and at each point $(x,y)$ the quantity being averaged, $\log \frac{p(x,y)}{p(x)\,p(y)}$, measures how far we are from independence: it is zero exactly when $p(x,y) = p(x)\,p(y)$.
It turns out that $I(X;Y) = 0$ if and only if $X$ and $Y$ are independent.
Note that mutual information is a symmetric measure: $I(X;Y) = I(Y;X)$.
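One way to see this directly from the definition: swapping the roles of $X$ and $Y$ leaves every term of the sum unchanged, since $p(x,y) = p(y,x)$ and $p(x)\,p(y) = p(y)\,p(x)$:

$$I(X;Y) = \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)} = \sum_{x,y} p(y,x)\,\log\frac{p(y,x)}{p(y)\,p(x)} = I(Y;X)$$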
Notice that $I(X;Y)$ can be thought of as a distance from independence: it measures how far the joint distribution $p(x,y)$ is from the product of the marginals $p(x)\,p(y)$.