One of the problems involves the posterior predictive distribution. The idea is that once we have the posterior distribution of the parameters θ given the observed data x, we can multiply it by the likelihood of new observations y, integrate out θ, and obtain a distribution on y given x that predicts what kinds of observations we would expect in the future. For example, we may have observed the orbit of a planet or the weather at various points in the past; the posterior predictive distribution would allow us to predict the position of the planet in the future or the weather at a later date.
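As a concrete illustration, here is a minimal R sketch of posterior predictive simulation for the beta-binomial setup we have been working with. The data values s and n, the prior parameters a and b, and the future sample size m are all placeholders, not the numbers from class.

```r
# Posterior predictive simulation for a binomial likelihood with a beta prior.
s <- 11; n <- 27          # observed data: s successes in n trials (placeholder values)
a <- 1; b <- 1            # Beta(a, b) prior parameters (flat prior here)
m <- 10                   # size of a hypothetical future sample

# Draw theta from the posterior Beta(a + s, b + n - s), then, for each draw,
# simulate a future binomial count.  The counts are a sample from the
# posterior predictive distribution of the future data.
theta    <- rbeta(100000, a + s, b + n - s)
y.future <- rbinom(100000, size = m, prob = theta)
table(y.future) / length(y.future)   # estimated predictive probabilities
```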
Jeff showed why we can ignore constants independent of the parameters, like Choose(n,s) in the likelihood and Γ(a+b)/(Γ(a)Γ(b)) in the prior. Because these constants appear in both the numerator and the denominator, and because, being independent of the parameters, they can be taken outside the integral that defines the marginal distribution of the data, they simply cancel out.
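To make the cancellation explicit, here is the beta-binomial case written out (my own sketch of Jeff's point, with a Beta(a, b) prior and s successes in n trials):

```latex
p(\theta \mid x)
  = \frac{\binom{n}{s}\,\theta^{s}(1-\theta)^{n-s}\cdot
          \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\,\theta^{a-1}(1-\theta)^{b-1}}
         {\int_0^1 \binom{n}{s}\,\theta^{s}(1-\theta)^{n-s}\cdot
          \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\,\theta^{a-1}(1-\theta)^{b-1}\,d\theta}
  = \frac{\theta^{s+a-1}(1-\theta)^{n-s+b-1}}
         {\int_0^1 \theta^{s+a-1}(1-\theta)^{n-s+b-1}\,d\theta}
```

Both constants factor out of the integral in the denominator and cancel against the identical factors in the numerator.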
I noted that care needs to be taken when comparing models or averaging over models. For example, one might have several models in mind (e.g., a linear and a quadratic model to fit a run of data). Since the models are different, their likelihoods and priors are also different and carry different normalizing factors. In such a case you cannot ignore those factors: dropping them would put the models' marginal likelihoods on arbitrary, incompatible scales, and the comparison would be meaningless.
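In symbols (a sketch in standard notation, not taken from the charts): comparing models M1 and M2 by their Bayes factor requires the fully normalized marginal likelihoods,

```latex
B_{12} = \frac{p(x \mid M_1)}{p(x \mid M_2)}
       = \frac{\int p(x \mid \theta_1, M_1)\, p(\theta_1 \mid M_1)\, d\theta_1}
              {\int p(x \mid \theta_2, M_2)\, p(\theta_2 \mid M_2)\, d\theta_2}
```

Dropping a constant from one model's likelihood or prior rescales that model's marginal likelihood by an arbitrary factor that does not cancel between the two models.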
We ran the R code and looked at the posterior distribution, as given by a sample of size 100,000. From this we noted that the results were fairly insensitive to the choice of prior: a flat prior and the wide beta prior that Albert used led to much the same posterior. We noted that we will be making our inferences directly from the sample, using methods that don't require us to know the normalizing constant of the posterior distribution.
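The comparison would have looked something like this (a reconstruction, not Albert's actual code; the data and prior parameter values are illustrative):

```r
# Compare posteriors under a flat Beta(1, 1) prior and an informative beta
# prior, each summarized by 100,000 posterior draws.  Illustrative values only.
s <- 11; n <- 27                              # s successes in n trials
flat <- rbeta(100000, 1 + s, 1 + n - s)       # posterior under the flat prior
wide <- rbeta(100000, 3 + s, 7 + n - s)       # posterior under a Beta(3, 7) stand-in prior

summary(flat); summary(wide)                  # similar location and spread
quantile(flat, c(0.025, 0.975))               # interval estimates barely move
quantile(wide, c(0.025, 0.975))
```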
We went on to Chart Set 4. We discussed the posterior mean and median as summaries, and pointed out that each arises by minimizing an expected loss: the mean minimizes the expected square of the difference between the true value and the estimate, and the median minimizes the expected absolute value of that difference.
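This is easy to check numerically from a posterior sample (a sketch; the Beta(12, 17) draw below is just a stand-in for whatever posterior sample is at hand):

```r
# Expected posterior loss of a point estimate d, approximated from a sample.
theta <- rbeta(100000, 12, 17)                 # any posterior sample will do
sq.loss  <- function(d) mean((theta - d)^2)    # squared-error loss
abs.loss <- function(d) mean(abs(theta - d))   # absolute-error loss

# The minimizers should match the sample mean and median, respectively.
optimize(sq.loss,  interval = c(0, 1))$minimum   # ~ mean(theta)
optimize(abs.loss, interval = c(0, 1))$minimum   # ~ median(theta)
```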
One chart had "HDR" on it as a label. This stands for "Highest Density Region" and is a particular kind of Bayesian credible interval. My recollection now is that the book from which I got this terminology, Samuel Schmitt's Measuring Uncertainty: An Elementary Introduction to Bayesian Statistics, actually used it for the shortest credible interval (which, for a unimodal posterior, comes to the same thing).
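The shortest credible interval is easy to approximate from a posterior sample (a sketch, not code from the charts):

```r
# Shortest interval containing a fraction `level` of a posterior sample.
# For a unimodal posterior this approximates the highest-density region.
shortest.interval <- function(sample, level = 0.95) {
  x <- sort(sample)
  k <- floor(level * length(x))            # points each candidate window must span
  widths <- x[(k + 1):length(x)] - x[1:(length(x) - k)]
  i <- which.min(widths)                   # left endpoint of the narrowest window
  c(x[i], x[i + k])
}
shortest.interval(rbeta(100000, 12, 17))
```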
We discussed confidence intervals and credible intervals. A confidence interval describes the distribution of the data over repeated hypothetical trials, given a particular value of the parameter; it is a way of describing the statistical properties, under repeated sampling, of the procedure used to calculate the interval. A credible interval describes the distribution of the parameter, given the particular data set we have actually observed.
Although we can sometimes interpret a confidence interval numerically as a corresponding credible interval (e.g., linear regression with normal errors), and although credible intervals sometimes have good frequentist coverage properties and so can, again numerically, be used as confidence intervals, they are not the same thing at all. In Bayesian theory the data x, once observed, are regarded as fixed, and they enter the result through the likelihood function, which is not a probability density on the parameters; there it is θ that is treated as a random variable. In frequentist theory, by contrast, θ is not a random variable, and x is one realization of the random variable X. So it is important to keep these two ideas separate.
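The repeated-sampling idea can be made concrete with a small simulation (a sketch: it estimates the frequentist coverage of a 95% equal-tailed credible interval under a flat prior, holding one true θ fixed):

```r
# Frequentist coverage of a 95% equal-tailed credible interval (flat prior),
# estimated by repeated sampling with the true parameter held fixed.
true.theta <- 0.4; n <- 27; trials <- 10000
covered <- replicate(trials, {
  s  <- rbinom(1, size = n, prob = true.theta)     # one hypothetical data set
  ci <- qbeta(c(0.025, 0.975), 1 + s, 1 + n - s)   # credible interval for that data
  ci[1] <= true.theta && true.theta <= ci[2]
})
mean(covered)   # proportion of intervals that cover the true value
```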
In my experience with physical scientists, a large fraction of them want to interpret a confidence interval as a distribution on θ. This is of course wrong, but that is a natural tendency. So it seems that many physical scientists are naturally Bayesians!
Jeff did a blackboard calculation to show that, regardless of the prior, in the limit of large data n→∞ the posterior mean in the sleep problem tends to s/n, the observed proportion. With lots of data, the prior is pretty much irrelevant. (There are exceptions, which we will discuss later.)
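For the beta-binomial case the calculation is essentially one line (a reconstruction, assuming a Beta(a, b) prior and s successes in n trials, as before):

```latex
E[\theta \mid x] = \frac{a+s}{a+b+n}
                 = \frac{s/n + a/n}{1 + (a+b)/n}
                 \;\longrightarrow\; \frac{s}{n}
\quad\text{as } n \to \infty \text{ (with } s/n \text{ held fixed)}.
```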
Friday, January 30, 2009