Primer on Statistical Inference
The term inverse modelling is widely used in the atmospheric sciences for the method of estimating some physical quantity, in our case the emissions from some source, using indirect measurements. This is a statistical problem: a physical relationship has to be turned into a statistical model. The methods used to infer useful information from this statistical model rely on probability theory -- that is, the output from a simulation of a physical process informs metrics such as the most probable emissions, or the expected emissions with some degree of uncertainty.
This document talks about probability, the choice of statistical models, and how to make inferences from them.
Basic concepts of probability are intuitive.
A basic question such as 'if there are 5 apples in a bag, 3 red and 2 green, what is the probability of pulling out a green apple?' is well understood, and the answer is clearly $2/5$.
Here the dividing line between green and red apples is unambiguous, so the outcomes are discrete and can simply be counted.
This concept becomes a little harder for continuous variables.
Let's instead think of a thermometer -- we use thermometers to measure temperature.
However, thermometers never measure temperature directly.
Instead they measure some other quantity, such as a volume or an electrical resistance, from which we infer temperature.
There will be some error in the measurement of temperature, which we can assume is random about the true value.
Let $R$ be the measured resistance, $T$ the true temperature and $\epsilon$ a random error, so that

$$R = f(T) + \epsilon.$$

This means that the measurement of resistance is some function of the true temperature plus a bit of error.
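As a concrete illustration, here is a minimal Python sketch of this measurement model. The linear form of $f$, the temperature and the noise level are all made up for the example:

```python
import numpy as np

rng = np.random.default_rng(42)

def f(temperature_c):
    """Hypothetical linear resistance-temperature relationship (illustrative only)."""
    return 100.0 + 0.4 * temperature_c  # resistance in ohms

true_temperature = 20.0   # the (unknown) true temperature in degrees C
noise_sd = 0.2            # standard deviation of the measurement error in ohms

# One measurement: the true relationship plus a bit of random error
measured_resistance = f(true_temperature) + rng.normal(0.0, noise_sd)
print(f"Measured resistance: {measured_resistance:.2f} ohms")
```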
If the relationship $f$ is known, a measured resistance can be converted back into a temperature.
As there is some error involved, however, we will never know the true temperature exactly; instead we need to infer something about it.
This means that we have to work with probabilities. What we ultimately want is the probability of the temperature given a measurement of resistance, i.e. $p(T \mid R)$. The natural starting point is our statistical model of the measurement: if the error is assumed to be Normally distributed, the probability of measuring a resistance $R$ given a true temperature $T$ is

$$p(R \mid T) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{\left(R - f(T)\right)^2}{2\sigma^2}\right),$$

where $\sigma$ is the standard deviation of the measurement error.
In the above case we are talking about the probability of a measured resistance given a temperature, but we are often more interested in the temperature given a measured resistance.
Conceptually (and mathematically) these things are very similar, and we tend to call the distribution above the likelihood of the temperature given the measured resistance.
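To see what the likelihood of the temperature looks like in practice, a small sketch (reusing the same illustrative $f$ and noise level, with a made-up measurement) evaluates the Normal likelihood over a grid of candidate temperatures:

```python
import numpy as np
from scipy.stats import norm

def f(temperature_c):
    """Hypothetical linear resistance-temperature relationship (illustrative only)."""
    return 100.0 + 0.4 * temperature_c

noise_sd = 0.2
measured_resistance = 108.1  # a single, made-up resistance measurement in ohms

# Likelihood of each candidate temperature: how probable the measured resistance
# would be if that candidate were the true temperature.
candidate_temps = np.linspace(15.0, 25.0, 201)
likelihood = norm.pdf(measured_resistance, loc=f(candidate_temps), scale=noise_sd)

best_guess = candidate_temps[np.argmax(likelihood)]
print(f"Most likely temperature given one measurement: {best_guess:.2f} C")
```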
Unfortunately, with only one measurement we can't do much better than say that our best estimate is the temperature that our resistance measurement gives, i.e. $\hat{T} = f^{-1}(R)$.
Let's say that we make $n$ independent measurements of resistance, $R_1, R_2, \ldots, R_n$, all of the same true temperature. The joint likelihood is then the product of the individual likelihoods,

$$p(R_1, \ldots, R_n \mid T) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{\left(R_i - f(T)\right)^2}{2\sigma^2}\right),$$

where each measurement is assumed to have the same error standard deviation $\sigma$. Products of exponentials are awkward to handle, so it is usually easier to work with the log-likelihood, which turns the product into a sum.

To find the maximum likelihood estimate (the most likely value) for $T$ we choose the value that maximises the log-likelihood, which is the same value that minimises the sum of squared residuals,

$$\hat{T} = \arg\max_{T} \, \log p(R_1, \ldots, R_n \mid T) = \arg\min_{T} \, \sum_{i=1}^{n} \left(R_i - f(T)\right)^2,$$

and the uncertainty, or confidence interval, in this estimate can be summarised by its standard deviation, which for independent measurements shrinks like $1/\sqrt{n}$ as more measurements are added.
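A minimal sketch of this procedure, again with the invented linear thermometer: simulate repeated measurements, invert each one, and summarise the maximum likelihood estimate and its standard error (all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def f(temperature_c):
    """Hypothetical linear resistance-temperature relationship (illustrative only)."""
    return 100.0 + 0.4 * temperature_c

def f_inverse(resistance):
    """Invert the linear relationship to get a temperature from one measurement."""
    return (resistance - 100.0) / 0.4

true_temperature = 20.0
noise_sd = 0.2
n = 50

# n independent measurements of the same true temperature
measurements = f(true_temperature) + rng.normal(0.0, noise_sd, size=n)

# For a linear f with Gaussian errors, the maximum likelihood estimate is the
# mean of the individually inverted measurements ...
temperature_mle = f_inverse(measurements).mean()

# ... and its uncertainty shrinks like sigma / sqrt(n)
temperature_se = f_inverse(measurements).std(ddof=1) / np.sqrt(n)

print(f"MLE: {temperature_mle:.2f} C +/- {temperature_se:.2f} C")
```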
This approach holds when we have a robust statistical model which describes the physical system; that is, we can justify our statistical model from an understanding of how our thermometer actually behaves if we repeat the measurement many, many times. Unfortunately, in the geosciences we often don't have this luxury. The statistical models aren't robust in the same way: for example, we don't have a solid physical understanding of the error characteristics of the relationship between the emissions of carbon dioxide from a power plant in Italy and a measurement of carbon dioxide in the atmosphere made in Germany. Needless to say, it is unlikely that this truthfully fits a simple, common statistical model. For these occasions we need Bayesian statistics.
Bayesian statistics accounts for the fact that extra information, beyond the measurements themselves, is often needed to inform the inference.
Take the example at the end of the previous section of estimating carbon dioxide emissions from measurements.
Our statistical model based on measurements alone is probably not good enough to robustly estimate the emissions from a power plant, so we fold in additional prior information using Bayes' theorem,

$$p(\mathbf{x} \mid \mathbf{y}) = \frac{p(\mathbf{y} \mid \mathbf{x})\, p(\mathbf{x})}{p(\mathbf{y})},$$

where we have generalised this to be a vector of desired quantities $\mathbf{x}$ (here the emissions) and a vector of measurements $\mathbf{y}$. The term $p(\mathbf{y} \mid \mathbf{x})$ is the likelihood of the measurements given the emissions, $p(\mathbf{x})$ is the prior probability describing what we know about the emissions before any measurements are made, and $p(\mathbf{y})$ is a normalisation constant.

Now for making inference using Bayesian statistics. We usually don't have to worry about the normalisation constant, so we can just say that

$$p(\mathbf{x} \mid \mathbf{y}) \propto p(\mathbf{y} \mid \mathbf{x})\, p(\mathbf{x}).$$
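As a numerical illustration of this proportionality, the sketch below stays with the single-temperature thermometer example rather than the full emissions problem: it evaluates the likelihood times a made-up prior on a grid of temperatures and normalises the result numerically.

```python
import numpy as np
from scipy.stats import norm

def f(temperature_c):
    """Hypothetical linear resistance-temperature relationship (illustrative only)."""
    return 100.0 + 0.4 * temperature_c

measured_resistance = 108.1        # a single, made-up measurement in ohms
noise_sd = 0.2                     # measurement error standard deviation
prior_mean, prior_sd = 19.0, 1.0   # made-up prior knowledge about the temperature

grid = np.linspace(14.0, 26.0, 1201)
dx = grid[1] - grid[0]

# Posterior is proportional to likelihood times prior ...
unnormalised = (norm.pdf(measured_resistance, loc=f(grid), scale=noise_sd)
                * norm.pdf(grid, loc=prior_mean, scale=prior_sd))

# ... and we can normalise numerically rather than computing p(y) analytically.
posterior = unnormalised / (unnormalised.sum() * dx)

posterior_mean = (grid * posterior).sum() * dx
print(f"Posterior mean temperature: {posterior_mean:.2f} C")
```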
If both probability distributions on the right hand side of the proportionality (the likelihood and the prior probability) are Normal, then, much like before, it is often easier to work with log-probabilities,

$$-\log p(\mathbf{x} \mid \mathbf{y}) \propto \frac{1}{2}\left(\mathbf{y} - \mathbf{H}\mathbf{x}\right)^{T}\mathbf{R}^{-1}\left(\mathbf{y} - \mathbf{H}\mathbf{x}\right) + \frac{1}{2}\left(\mathbf{x} - \mathbf{x}_{a}\right)^{T}\mathbf{B}^{-1}\left(\mathbf{x} - \mathbf{x}_{a}\right),$$

where here $\mathbf{H}$ is the (linear) model relating the emissions to the measurements, $\mathbf{R}$ is the measurement error covariance, $\mathbf{x}_{a}$ is the prior estimate of the emissions and $\mathbf{B}$ is its error covariance. Maximising the posterior probability is then the same as minimising this quadratic cost function, which gives the maximum a posteriori estimate

$$\hat{\mathbf{x}} = \mathbf{x}_{a} + \mathbf{B}\mathbf{H}^{T}\left(\mathbf{H}\mathbf{B}\mathbf{H}^{T} + \mathbf{R}\right)^{-1}\left(\mathbf{y} - \mathbf{H}\mathbf{x}_{a}\right),$$

and the uncertainty in this estimate is,

$$\hat{\mathbf{P}} = \left(\mathbf{H}^{T}\mathbf{R}^{-1}\mathbf{H} + \mathbf{B}^{-1}\right)^{-1}.$$

Note here that the posterior distribution, that is our estimate of $\mathbf{x}$ once the measurements have been taken into account, is itself Normal and is fully described by its mean $\hat{\mathbf{x}}$ and covariance $\hat{\mathbf{P}}$.
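Under these Gaussian assumptions the posterior mean and covariance can be computed directly. The sketch below uses an entirely invented two-source, three-measurement toy problem; the matrices $\mathbf{H}$, $\mathbf{R}$, $\mathbf{B}$ and the prior $\mathbf{x}_{a}$ are illustrative values only:

```python
import numpy as np

# Toy problem: 2 unknown emission sources, 3 measurements (all values invented)
H = np.array([[1.0, 0.2],
              [0.5, 0.8],
              [0.1, 1.0]])          # linear model mapping emissions to measurements
R = np.diag([0.1, 0.1, 0.1])        # measurement error covariance
x_a = np.array([1.0, 1.0])          # prior estimate of the emissions
B = np.diag([0.5, 0.5])             # prior error covariance
y = np.array([1.3, 1.5, 1.2])       # the measurements

# Maximum a posteriori estimate (posterior mean)
gain = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
x_hat = x_a + gain @ (y - H @ x_a)

# Posterior covariance: the uncertainty in the estimate
P_hat = np.linalg.inv(H.T @ np.linalg.inv(R) @ H + np.linalg.inv(B))

print("Posterior mean:", x_hat)
print("Posterior standard deviations:", np.sqrt(np.diag(P_hat)))
```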
The use of 'Bayesian inference' is widespread in atmospheric sciences and inverse theory. Often, however, what is being done isn't Bayesian at all but a regularisation, i.e. guiding values towards some optimum using sensible constraints. I won't dwell on this -- many would likely disagree -- but it can be seen in many inverse procedures. One telltale sign is an arbitrary choice of likelihood and prior distributions, made for computational convenience or to guide some solver to a 'better value'. This is obviously incorrect, so any posterior probabilities will likely also be incorrect, and caution should be taken over what these 'probabilities' and uncertainties actually mean.
The proportionality in Bayes' law above lets us infer the desired quantity $\mathbf{x}$ from the measurements $\mathbf{y}$ without ever evaluating the normalisation constant $p(\mathbf{y})$.
In these equations a comma inside a probability just means 'and', so for example $p(R_1, \ldots, R_n \mid T)$ is the probability of observing all $n$ resistance measurements together, given the temperature $T$.