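Note that the Fisher information is a function of $\theta$. Let's first consider the term $\frac{\partial^2}{\partial\theta^2} \log f(X;\theta)$, the second derivative of the log-likelihood with respect to $\theta$.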
Given that $X$ is fixed, let's say that $\theta$ is the maximum likelihood estimate (MLE). Then what is this quantity? It is the curvature of the log-likelihood curve: how much the log-likelihood changes as $\theta$ is wiggled. If the curve were very flat, then all the $\theta$'s around the MLE would have about the same likelihood. This can be interpreted as meaning that the observation $X$ doesn't tell us which $\theta$ generated it. Indeed, if we gave all $\theta$'s the same prior and had to choose based on maximum likelihood, a flat curve would make this choice very hard. On the other hand, if the curve were sharply peaked around the MLE, we could be very certain that this $\theta$ generated $X$.
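To make "flat" versus "steep" concrete, here is a minimal sketch. It assumes a model that is not part of the discussion above: a single observation from a normal distribution with unknown mean $\theta$ and known standard deviation `sigma`. The observed value and the helper names are made up for illustration. It estimates the second derivative of the log-likelihood at the MLE by finite differences.

```python
import math

def log_likelihood(theta, x, sigma):
    """Log density of x under a Normal(theta, sigma^2) model (illustrative choice)."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - theta) ** 2 / (2 * sigma**2)

def curvature_at(theta, x, sigma, h=1e-3):
    """Central finite-difference estimate of the second derivative in theta."""
    return (
        log_likelihood(theta + h, x, sigma)
        - 2 * log_likelihood(theta, x, sigma)
        + log_likelihood(theta - h, x, sigma)
    ) / h**2

x = 1.7        # one fixed observation (made-up value)
theta_mle = x  # for a single normal observation, the MLE of the mean is x itself

flat = curvature_at(theta_mle, x, sigma=10.0)   # noisy model: close to -1/sigma^2 = -0.01
steep = curvature_at(theta_mle, x, sigma=0.1)   # precise model: close to -1/sigma^2 = -100

print(f"sigma=10.0 -> curvature {flat:.4f}")
print(f"sigma=0.1  -> curvature {steep:.2f}")
```

With the large `sigma`, nearby values of $\theta$ are almost equally likely, so the observation barely distinguishes them. With the small `sigma`, the log-likelihood drops off quickly as $\theta$ moves away from the MLE.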
Now, how hard is our choice on average? Take the expectation over all the possible values of the observation $X$. Finally, if we're taking the expectation over $X$, we need to draw $X$ from some distribution. Remember that $\theta$ is given, and that the Fisher information is a function of $\theta$, so $X$ is drawn from the distribution specified by that same $\theta$.
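The averaging step can be sketched in a case where the expectation is a two-term sum. The example below assumes a single Bernoulli($\theta$) observation, which is an illustrative choice and not something taken from the discussion above. The function names are hypothetical. It uses the usual sign convention in which the Fisher information is the negative of the expected second derivative.

```python
def second_derivative(theta, x):
    """Second derivative in theta of log f(x; theta) for one Bernoulli(theta) observation.

    Here log f(x; theta) = x*log(theta) + (1 - x)*log(1 - theta).
    """
    return -x / theta**2 - (1 - x) / (1 - theta) ** 2

def fisher_information(theta):
    """Negative expected curvature, with X distributed as Bernoulli(theta) itself."""
    prob = {1: theta, 0: 1 - theta}  # distribution of X for the given theta
    return -sum(p * second_derivative(theta, x) for x, p in prob.items())

for theta in (0.1, 0.5, 0.9):
    # Under these assumptions the result should agree with 1 / (theta * (1 - theta)).
    print(theta, fisher_information(theta), 1 / (theta * (1 - theta)))
```

Note that the same $\theta$ appears twice: once as the point where the curvature is measured and once as the parameter of the distribution that $X$ is averaged over. That is why the result depends only on $\theta$.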