I have posted a revised Chart Set #5. Jeff noted some errors, and they have been corrected. Unfortunately, since I moved to the latest version of MacOS, I am no longer able to produce 4-up PDF files, so this chart set (and some of the later ones) will be full size. I apologize for this and will consult with Small Dog to see whether this can be fixed.
Jeff asked why the two forms of the likelihood for problem #4 (due 2/5) are equivalent. You should address this question in your turned-in assignment. Note that vectorized programming (avoiding explicit loops) is much faster, so you should attempt to do that calculation without loops.
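As a sketch of the loop-free style (the Bernoulli data and parameter grid here are invented for illustration and are not problem #4 itself), numpy broadcasting lets you evaluate a likelihood over an entire grid of parameter values in one expression:

```python
import numpy as np

# Illustrative Bernoulli data (0/1 outcomes); not the actual problem #4.
y = np.array([1, 0, 1, 1, 0, 1, 1, 1])
n, s = len(y), y.sum()                   # number of trials and successes

theta = np.linspace(0.001, 0.999, 999)   # grid of parameter values

# Sufficient-statistic form, evaluated over the whole grid at once:
like = theta**s * (1.0 - theta)**(n - s)

# Product-over-observations form via broadcasting (a 999 x 8 array),
# also loop-free; the two forms agree to rounding error.
like2 = np.prod(theta[:, None]**y * (1.0 - theta[:, None])**(1 - y), axis=1)
assert np.allclose(like, like2)
```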
On Chart Set #4, Chart 24, we had been discussing robustness. We noted that the mean and standard deviation of the posterior distribution do depend (although fairly insensitively) on the prior. In particular, the posterior under the beta prior had a smaller standard deviation, and its mode was shifted to the left, toward the peak of the beta prior, relative to where the mode was under the flat prior. This led to the notion of stable estimation: when we have a lot of data, or very precise data, and the prior doesn't change much over the region where the likelihood peaks, the results won't be sensitive to the prior.
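Here is a minimal sketch of this effect (the counts and the informative Beta(3, 9) prior are made up for illustration, not the numbers on Chart 24): with 20 observations the two priors give noticeably different posterior summaries, while with 800 they nearly coincide, and the posterior standard deviation shrinks as the data accumulate.

```python
from scipy.stats import beta

def posterior_summary(a, b, s, f):
    """A Beta(a, b) prior with s successes and f failures gives a
    Beta(a + s, b + f) posterior; return its mean and sd."""
    post = beta(a + s, b + f)
    return post.mean(), post.std()

# Made-up data: s successes, f failures (not the Chart 24 numbers).
for s, f in [(7, 13), (280, 520)]:
    flat = posterior_summary(1, 1, s, f)   # flat Beta(1, 1) prior
    info = posterior_summary(3, 9, s, f)   # informative beta prior
    print(s + f, flat, info)
```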
We considered continuous examples, of which the beta-binomial model is one. Jeff justified the density approximation P(a<Y<b | c−ε<X<c+ε) ≈ ∫_a^b f(c,y)/g(c) dy, where f is the joint density and g is the marginal density of X. Thus we can use ratios of joint to marginal densities in the continuous case, just as we use ratios of joint to marginal probabilities in the discrete case.
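In outline, the justification (a standard argument; this is my sketch, not necessarily Jeff's exact chalkboard steps) writes the conditional probability as a ratio of integrals and treats each integrand as approximately constant over the short interval (c−ε, c+ε):

```latex
P(a<Y<b \mid c-\epsilon<X<c+\epsilon)
  = \frac{\int_a^b \int_{c-\epsilon}^{c+\epsilon} f(x,y)\,dx\,dy}
         {\int_{c-\epsilon}^{c+\epsilon} g(x)\,dx}
  \approx \frac{2\epsilon \int_a^b f(c,y)\,dy}{2\epsilon\, g(c)}
  = \int_a^b \frac{f(c,y)}{g(c)}\,dy .
```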
We saw how increasing the amount of data tightens the posterior around the true value.
We then skipped to Chart Set #5. We will return to Chart Set #4 later.
We discussed the Poisson distribution and motivated it. Many things are well modeled as Poisson events in addition to the ones in the chart set: requests to google.com, stars per square degree, and so on.
We went on to the heart transplant mortality problem from Albert's book. The exposure for each patient is the probability that that particular patient will die within a given time frame after the operation. It depends on things like the patient's age, conditions such as diabetes, and so on, and is presumed known from other studies. The exposures are then summed over all the patients to estimate the overall exposure (risk) for the particular hospital; this is the expected number of patients that will die. The notes use the letter 'e', but Jeff remarked that this is easy to confuse with the base of the natural logarithm, so he changed it to 'd' in his chalkboard discussion. Then, if Y is the random variable representing the observed number of deaths, the likelihood is Y ~ Pois(λd), and λ is the parameter we wish to estimate for the hospital.
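Written out, this is just the standard Poisson likelihood with mean λd:

```latex
p(y \mid \lambda) = \frac{e^{-\lambda d}\,(\lambda d)^{y}}{y!},
\qquad y = 0, 1, 2, \ldots
```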
A gamma prior is chosen. It is flexible, with two adjustable parameters, and pedagogically easy because it is the conjugate prior for a Poisson likelihood. However, it is a bad idea to use a prior simply because it makes the calculations easy: modern sampling techniques allow us to use any prior we wish, and if we know better, we should use that knowledge.
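Because of the conjugacy, the update is a one-liner: a Gamma(α, β) prior (shape α, rate β) with observed deaths y and summed exposure d gives a Gamma(α + y, β + d) posterior for λ. A minimal sketch with made-up numbers (not the data from Albert's book):

```python
from scipy.stats import gamma

# Assumed illustrative values, not the numbers from Albert's book.
alpha, beta_ = 2.0, 20.0   # gamma prior: shape alpha, rate beta_
y, d = 3, 40.0             # observed deaths and summed exposure

# Conjugate update: Gamma(alpha + y, beta_ + d) posterior for lambda.
# scipy parameterizes gamma by shape and scale = 1/rate.
post = gamma(a=alpha + y, scale=1.0 / (beta_ + d))
print(post.mean(), post.std())   # posterior mean and sd of lambda
print(post.interval(0.95))       # central 95% credible interval
```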
I've corrected Slide #8 on which z and o were transposed. See the chart set published above.
The heart transplant mortality problem will be continued next time.
Wednesday, February 4, 2009