I made some remarks on the homework. Won't repeat them here since they will appear on the papers that were turned back.
We talked about Jacobians (named for Carl Gustav Jacob Jacobi, a 19th-century mathematician who was German, not French as I had misremembered). I gave my reasoning, which relied on the change of variables when doing an integral:
∫ p(u) du = ∫ p(u(v)) (du/dv) dv = ∫ p(v) dv
The idea here is that if u is a variable and we transform to v, the integral has to be unchanged since it is the same probability that we're talking about.
The factor du/dv is the Jacobian. Whenever you change variables in our probability statements, you have to include a Jacobian. So the above equation says that
p(v)=p(u)(du/dv).
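A quick worked example (my own, not from lecture): take p(u) = e^(-u) for u > 0 and transform to v = u², so that u(v) = √v and du/dv = 1/(2√v). The rule then gives p(v) = e^(-√v)/(2√v) for v > 0, and substituting w = √v in the integral confirms that this density still integrates to 1, as it must.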
Jeff gave a different (but entirely equivalent) argument. His argument (which is hard to reproduce here, so I hope you got good notes) also starts with an integral, this time to compute the probability that θ is less than x; that is, an integral from -∞ to x of the density of θ. But if we transform to λ via θ = g(λ), with g increasing, then you get
Pr(θ < x) = Pr(λ < g⁻¹(x)) = ∫_{-∞}^{g⁻¹(x)} p(λ) dλ
To go from the distribution (Pr) to the density (p) you take the derivative with respect to x. When you do this, you use the fundamental theorem of calculus (the derivative of the integral is the function under the integral) and the chain rule (you have to multiply by dg⁻¹(x)/dx, which is the Jacobian). The result is p(x) = p(g⁻¹(x)) · dg⁻¹(x)/dx, the same Jacobian rule as before.
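To make this concrete, here is a minimal R sketch (my own illustration, not Jeff's code) that checks the Jacobian rule by simulation, using the same transformation as the worked example above: λ ~ Exp(1) and θ = g(λ) = λ².

set.seed(1)
lambda <- rexp(100000)   # draws from p(lambda) = exp(-lambda), lambda > 0
theta <- lambda^2        # theta = g(lambda) = lambda^2, so g⁻¹(x) = sqrt(x)
# By the Jacobian rule, the density of theta is exp(-sqrt(x)) / (2*sqrt(x)):
hist(theta, breaks = 1000, freq = FALSE, xlim = c(0, 10))
curve(exp(-sqrt(x)) / (2 * sqrt(x)), from = 0.01, to = 10, add = TRUE, col = "red")

The red curve should trace the histogram closely.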
On the homework for Thursday, Jeff generalized the problem to the following condition: assume that there does not exist a k(x,y), independent of θ, such that
f_x(x|θ) f_y(y|θ) = k(x,y) f(x,y|θ)
Then you should be able to show that you can't pretend that x and y are independent when you calculate the posterior distribution on θ given x and y. You have to use the factorization f(x,y|θ) = f(y|x,θ) f(x|θ); in other words, the full conditional probability formula is required.
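To spell out the logic (my gloss, not from the homework): the correct posterior is p(θ|x,y) ∝ f(x,y|θ) p(θ), while pretending independence would give something proportional to f_x(x|θ) f_y(y|θ) p(θ). These define the same distribution exactly when the ratio f_x(x|θ) f_y(y|θ) / f(x,y|θ) equals some k(x,y) free of θ, because a θ-free factor cancels in the normalization. So when no such k(x,y) exists, the independence shortcut changes the posterior.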
The corrections to the Albert book are here. You have the most recent (third) printing, in all likelihood, so most of these won't apply.
We then discussed the normal likelihood with both the mean (μ) and the variance (σ²) unknown.
The charts have the equations. We took a prior that is flat on μ and proportional to 1/σ² on σ². We'll justify these priors as "noninformative" later in the course. Both priors are improper, but if the posterior is proper there will not be a problem.
The key observation here is that the posterior has a piece that is independent of μ, times another piece that involves both μ and σ². This means that we can factorize the posterior as g(σ²)g(μ|σ²). We observed that the latter piece would be normal, with mean the mean of the y's and with variance σ²/n, where σ² is obtained by sampling from the first piece. Then we found that the marginal distribution of σ² is a scaled inverse chi-square distribution with (n-1) degrees of freedom, where n is the number of data points. Jeff discussed several ways of sampling from an inverse chi-square distribution, and they are included in his R code, which can be found here.
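Here is a minimal R sketch of that two-step sampler (my own illustration of the scheme, not Jeff's actual code; the data vector y is hypothetical):

set.seed(42)
y <- c(9.8, 10.4, 10.1, 9.5, 10.7, 10.2)  # hypothetical data
n <- length(y)
ybar <- mean(y)
s2 <- var(y)  # sample variance, with denominator n-1

nsim <- 10000
# Step 1: sigma^2 | y is scaled inverse chi-square with n-1 degrees of
# freedom and scale s2; if X ~ chi-square(n-1), then (n-1)*s2/X is a draw.
sigma2 <- (n - 1) * s2 / rchisq(nsim, df = n - 1)
# Step 2: mu | sigma^2, y ~ Normal(ybar, sigma^2/n)
mu <- rnorm(nsim, mean = ybar, sd = sqrt(sigma2 / n))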
Jeff then used the code to plot contours of the posterior distribution, plot sample points, and compute quantiles and other quantities of interest. It is known that the frequentist confidence interval on μ is given by a t distribution with (n-1) degrees of freedom, and Jeff computed that as well as the Bayesian credible interval from the sample. They coincided. This is one example where the credible interval and the confidence interval are the same.
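Continuing the sketch above (same y and the same draws of mu), that comparison looks roughly like this:

quantile(mu, c(0.025, 0.975))  # 95% Bayesian credible interval
ybar + qt(c(0.025, 0.975), df = n - 1) * sqrt(s2 / n)  # 95% frequentist t interval

With these priors the two intervals agree up to simulation noise.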
Tuesday, February 10, 2009