Today we discussed criminal trials from the juror's point of view. We decided, after some discussion, that the worst outcome would be to convict someone who was actually innocent. We know from the Innocence Project that an unacceptably high proportion of people in prison are probably innocent. We set up a decision tree with the branches Convict Innocent and Acquit Innocent (the worst and best outcomes) in a probability fork, with u being the probability of CI and (1-u) the probability of AI, and with Acquit Guilty as the "for certain" branch of the tree. After some discussion we decided on something like u=0.01, which would mean that 99% of people sent to prison would actually be guilty (assuming we can evaluate that probability as jurors). With a loss of 0 for AI and 1000 for CI, we found that a loss of 10 for AG would make us indifferent between the two branches.
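The indifference calculation is just the arithmetic from the tree above (u = 0.01, losses of 0 and 1000); a short Python sketch makes it explicit:

```python
u = 0.01          # probability of the worst outcome (Convict Innocent) in the fork
loss_CI = 1000.0  # loss for convicting an innocent person (worst outcome)
loss_AI = 0.0     # loss for acquitting an innocent person (best outcome)

# Expected loss of the probability fork between CI and AI:
fork = u * loss_CI + (1 - u) * loss_AI

# For indifference, the "for certain" branch (Acquit Guilty) must carry
# exactly this loss, which pins down loss(AG):
loss_AG = fork
print(loss_AG)  # 10.0
```

Changing u or the loss for CI moves loss(AG) proportionally, which is why the seriousness of the penalty (discussed below) matters.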
We also discussed the case of Convict Guilty. Although some thought that AI would personally be better than CG (both are correct decisions), this didn't seem to hold up when we replaced AG with CG in the decision tree we drew: CG for certain seemed better than CI even with a probability as small as 0.001.
We also discussed whether the seriousness of the case and the harshness of the punishment should also change our losses. Surely, some thought, a traffic ticket is not as onerous as 20 years in prison for a serious crime if the person accused were actually innocent, and the death penalty is even more unacceptable if the accused were actually innocent. (Even though Vermont doesn't have the death penalty, a recent Vermont jury did impose it in a federal case, so the question isn't entirely moot even for Vermonters.) So, some said, the loss for CI ought to be larger if the penalty is more serious. One student would never give the death penalty; for that student, the loss is effectively infinite.
The next several classes will be devoted to discussing the practice problems for the second test. We will pick up the juror discussion again after the test.
Monday, October 27, 2008
Sunday, October 26, 2008
Class, 10/24
We decided on November 7 as the date for the next test. A study guide can be found here, which we will be discussing next week and the following week. You'll get a paper copy on Monday.
Today we talked about jurors' decisions. We didn't do any math, but we discussed the various options available to a juror, including the consequences of making the wrong decision (convicting someone who is innocent, acquitting someone who is guilty). We'll pick the discussion up again on Monday and try to quantify the results of Friday's conversation.
Friday, October 24, 2008
An interesting short article
The New York Times published this interesting article on statistics, baseball and health care today.
Wednesday, October 22, 2008
Class, 10/22
OK, so today I revisited the insurance problem, but from the point of view of expected loss rather than expected utility. It's not any different, but I drew a picture on the blackboard showing that there is a range of insurance premiums between the minimum premium at which the insurance company would sell the policy (p=m/h; see the previous posting for definitions) and the maximum the homeowner would pay for it (p=loss(m)/loss(h)). The homeowner hopes that between these two limits, competition among insurance companies will give him or her a good deal on insurance.
We then discussed testing various hypotheses based on data observed.
First, we discussed testing a coin which may be fair or unfair.
We decided that if it was fair, a reasonable prior would be P(fair)=0.5, and P(unfair)=0.5.
But then, what does P(unfair) mean? If the coin is fair, Heads comes up with probability 0.5; but if it is unfair, what is the probability of Heads? We decided to split the probability up equally among the possibilities, which we chose to be 0.05, 0.15, 0.25, ..., 0.85, 0.95, with each possibility having prior probability 0.05 (after some discussion that reminded us that the total probability has to be 1, and we had already spent 0.5 on the "null hypothesis" that the coin is fair).
So we set up a "spreadsheet" calculation, and discussed how we would actually carry it out in Excel.
We didn't actually do the calculation, but I will tell you that the result is: with 60 heads and 40 tails, the probability of obtaining a result that extreme or more extreme (60 or more heads, or 40 or fewer heads) is about 0.05, but the probability that the coin is fair, given that we have observed the data (60 heads and 40 tails), is about 0.5.
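For those who want to check these numbers, here is one way the "spreadsheet" might look in Python rather than Excel. The layout is my own; the priors and data are the ones from class:

```python
from math import comb

heads, tails = 60, 40
n = heads + tails

# Prior: P(fair) = 0.5 on theta = 0.5, and 0.05 on each of the
# ten alternatives 0.05, 0.15, ..., 0.95.
states = [0.5] + [0.05 + 0.1 * i for i in range(10)]
priors = [0.5] + [0.05] * 10

# Likelihood of the data (60 heads in 100 tosses) under each state:
likes = [comb(n, heads) * th**heads * (1 - th)**tails for th in states]

# Bayes' rule: posterior is proportional to prior times likelihood.
joint = [p * l for p, l in zip(priors, likes)]
post_fair = joint[0] / sum(joint)
print(round(post_fair, 3))   # a bit over 0.5, the "about 0.5" above

# For comparison, the classical two-sided p-value: the probability of
# 60 or more heads, or 40 or fewer heads, under a fair coin.
p_value = sum(comb(n, k) * 0.5**n for k in range(heads, n + 1)) \
        + sum(comb(n, k) * 0.5**n for k in range(0, tails + 1))
print(round(p_value, 3))     # roughly 0.05
```

The two printed numbers are the pair contrasted in the text: the data look "significant" classically, yet the coin is still about as likely fair as not.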
This is very interesting: the standard test of statistical significance, based on how extreme the result is, gives a very different answer from the Bayesian calculation.
We also discussed the problem of estimating the probability that an unknown proportion (here, the bias of the coin, or in the problem set, the cure rate of the new drug) is greater than some fixed value (say 0.2). The spreadsheet is the same except that no value gets special treatment: we crossed out the 0.5 line and gave prior probability 0.1 to each of the alternatives 0.05, 0.15, ..., 0.95. To determine the probability that the new drug is better than the old one, we just add the posterior probabilities for the states of nature that are greater than 0.2. (Again, we didn't do the actual calculation.)
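The modified spreadsheet might look like this in Python. The cure-rate data here (4 cures in 10 patients) are made up purely for illustration, since we didn't fix numbers in class; the problem set has its own data:

```python
from math import comb

# Hypothetical data for illustration only: 4 cures in 10 patients.
cures, patients = 4, 10

# Uniform prior: 0.1 on each of theta = 0.05, 0.15, ..., 0.95
# (no special weight on any one value this time).
states = [0.05 + 0.1 * i for i in range(10)]
priors = [0.1] * 10

likes = [comb(patients, cures) * th**cures * (1 - th)**(patients - cures)
         for th in states]
joint = [p * l for p, l in zip(priors, likes)]

# P(theta > 0.2 | data): add the posterior probabilities of the
# states of nature above 0.2.
p_better = sum(j for j, th in zip(joint, states) if th > 0.2) / sum(joint)
print(round(p_better, 3))  # most of the posterior mass sits above 0.2
```

With 4 cures in 10 patients, almost all the posterior probability lands on cure rates above 0.2, so the answer comes out close to 1.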
We finished with a challenge: how do you decide, if you are on a jury, whether to convict or acquit a defendant in a criminal case? More precisely, what does "beyond a reasonable doubt" mean?
Tuesday, October 21, 2008
Class, 10/20
We talked about utilities. First we looked at the shapes of the curves that you all derived over the weekend. We learned that a straight line is neutral, a utility curve that curves up is risk-seeking and one that curves down is risk-averse.
We then discussed insurance on a house. We found that if h is the value of the house, m is the premium we pay for the insurance, and p is the probability of disaster (e.g., a fire burns the house down), then the insurance company will demand that p be less than m/h. On the other hand, the owner of the house (if her utility is neutral) will demand that p be greater than m/h, and no transaction can take place. But insurance is bought and sold, so there has to be an explanation for this. And there is: although insurance companies have a nearly neutral utility curve except for truly huge amounts, people do not, and most people have risk-averse utility curves. They will demand that p be greater than u(-m)/u(-h), where u(-) means the value of the curve at the point in question (the quantities are negative because in both cases, the premium and the potential catastrophe, the person ends up with fewer assets). But if your utility curve curves down, the ratio u(-m)/u(-h) will be less than m/h, so you will be willing to buy the insurance after all. Therefore, the insurance company can set a value of m that the consumer is willing to pay and that also gives a profit to the company, thus keeping the stockholders happy.
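To make the premium range concrete, here is a small Python sketch. The house value, fire probability, and the particular risk-averse curve u(x) = 1 - exp(-x/R) are my own assumptions for illustration, not numbers from class:

```python
from math import exp, log

h = 200_000.0   # value of the house (assumed)
p = 0.001       # probability of the fire (assumed)
R = 100_000.0   # risk-tolerance parameter of the assumed utility curve

def u(x):
    # Risk-averse (downward-curving) utility with u(0) = 0.
    return 1.0 - exp(-x / R)

# Risk-neutral insurance company: sells only if the premium m covers
# the expected payout, i.e. m >= p*h.
m_min = p * h

# Risk-averse homeowner: buys as long as u(-m) >= p*u(-h), that is, the
# sure loss of the premium is no worse than the gamble.  Solving
# u(-m_max) = p*u(-h) for m_max gives:
m_max = R * log(1 - p + p * exp(h / R))

print(m_min, m_max)  # m_min < m_max, so there is room for a deal
```

With these numbers the company needs at least $200 while the homeowner would pay up to roughly $600-700, and the text's point appears: the downward-curving utility opens a window in which both sides gain.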
I remarked that this is actually how all commerce works. There are two parties, a seller and a buyer. They are willing to make a transaction because their utility curves are different, and so it is a "win-win" situation where both the buyer and the seller feel better off (in terms of utilities) than they did before the transaction took place.
One student had remarked in class and in journals that this approach (decision theory) might not be adequate when considering lotteries, where there is a huge payoff of very low probability. Should someone wager to win the lottery if taxes and annuitization still left a positive payoff on an expected-return basis? My answer is, "Not really." The reason is that we don't (or shouldn't) make decisions based on expected return; we should make decisions based on expected utility or expected loss. I posed the question: would you rather have $280 million with probability 1/2, or $10 million for sure? The overwhelming choice of the class was to take the $10 million. This means that to most of the people in the class, having $280 million isn't that much better than having $10 million. It also means that in the lottery problem, you probably won't want to use $280 million as the leaves on the ends of the decision tree; you will make just as good a decision if you put $10 million there. And if you did this, your decision would be just as rational, and it would tell you that the lottery is not really a good place to invest your money (unless your only reward is the thrill of entering!). A final comment: the student who raised this issue agreed that once utilities or losses are used as the payoff, the problem goes away.
I finally drew on the board a generally useful way to estimate utilities for any events whatsoever. I used the Monty Hall example of a car, a goat, and a trip to Hawaii. Presumably the car is the best and the goat is the worst, with the Hawaii trip in between. Draw a decision tree, put the car and the goat on the probability branches and the Hawaii trip on the "get for certain" branch. Then, pick a probability for getting the car that makes you neutral between the two branches of the decision tree. There should be a point where you are neutral, for if the probability of getting the car is 1, you'd take the car for sure, but if the probability of getting the car is 0, you'd pick the Hawaii trip for sure. One student volunteered p=0.8. That means that her utility for the Hawaii trip is 0.8, since at that point, both branches of the decision tree have exactly the same value.
I remarked finally that if you use this method for evaluating utilities, then the utilities so calculated are actually probabilities!
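In code, the whole elicitation method is one line of arithmetic, which also shows why the elicited utility is literally a probability:

```python
# Utility scale: worst outcome (goat) = 0, best outcome (car) = 1.
u_goat, u_car = 0.0, 1.0

# The student is indifferent between the sure Hawaii trip and a gamble
# that gives the car with probability p = 0.8 (and the goat otherwise).
p = 0.8
u_hawaii = p * u_car + (1 - p) * u_goat
print(u_hawaii)  # 0.8 -- the indifference probability *is* the utility
```

Any intermediate outcome can be placed on the 0-to-1 scale this way: find the probability p that makes you neutral, and p is its utility.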
Saturday, October 18, 2008
Class, 10/17
Today we spent most of the class discussing the lottery problem I left you with last time.
What we need to compute is the probability that no one wins the lottery, the probability that exactly one person wins the lottery, exactly two people, and so forth.
After some discussion we decided that the probability that no one wins is the probability that the first person loses AND the second person loses AND ... AND the last person loses. Since these are independent events, we multiply the probabilities of each of these events (AND always means multiply the probabilities). If I write w=1/80,000,000 for the probability that a given ticket wins the lottery, then the probability that that ticket loses is (1-w). The probability that everyone loses is (1-w)^N, where N=200,000,000 is the number of tickets sold. Although that looks terrible to compute, a hand calculator correctly computed this number to be 0.082.
To get the probability that exactly one person wins, we decided that it is equal to the probability that (the first person wins AND all the others lose) OR (the second person wins AND all the others lose) OR ... OR (the last person wins AND all the others lose). The AND means multiplication, and the OR means adding probabilities. The probability that one specified person wins AND all the others lose is w*(1-w)^(N-1), which is hardly different from w*(1-w)^N since the extra factor of (1-w) is very, very close to 1. But there are N tickets, so the OR means we add this number to itself N times, and the probability that exactly one person wins, one of the N tickets, is (Nw)*0.082, or 0.205.
For two people we follow the same principle: we compute the probability that (the first person wins AND exactly one of the other people wins AND all the rest lose) OR (the second person wins AND exactly one of the others wins AND all the rest lose) OR ... OR (the last person wins AND exactly one of the others wins AND all the rest lose). We just computed the probability that one of the other people wins AND all the rest lose; it's 0.205. The probability that a particular person wins is still w. And there are N identical numbers OR'ed together, so we multiply by N again, getting (Nw)*0.205. However, there is a complication: if you look at the first two terms above, both contain a piece that comes from the first person winning AND the second person winning, and a similar thing can be said about any pair of terms. So each pair of winners appears twice in the sum and is counted twice as often as it should be. The probability we want therefore has to be divided by 2, and the answer is: the probability that exactly two people win is (Nw)*0.205/2=0.257.
In a similar way, the probability that exactly three people win can be computed as (Nw)*0.257/3=0.214; as in the case of two people, each triple of tickets gets counted three times, so we divide by 3. The same pattern continues for exactly four, five, six, and more winners. Once you get to seven people, there's less than a 1% chance that that many will win.
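The whole chain of calculations above fits in a short Python sketch; it just applies the multiply-by-Nw-and-divide-by-k rule we derived:

```python
w = 1 / 80_000_000   # probability a single ticket wins
N = 200_000_000      # tickets sold
lam = N * w          # Nw = 2.5, the expected number of winners

# P(no one wins) = (1-w)^N, which the calculator gave as about 0.082:
p = [(1 - w) ** N]

# Each further winner count multiplies by Nw and divides by k -- the
# over-counting correction worked out above:
for k in range(1, 8):
    p.append(p[-1] * lam / k)

for k, pk in enumerate(p):
    print(k, round(pk, 3))   # 0.082, 0.205, 0.257, 0.214, ...
```

The printed values reproduce the numbers in the text, and the probability of exactly seven winners indeed drops below 1%.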
Now we can compute the expected value of a ticket. Your probability of winning is w. If no one else wins (probability 0.082) you would win $280M. If exactly one other person wins (probability 0.205) you would win $140M. And so forth. Adding it all up, the expected value of a ticket is $1.29. Compare this with the rough estimate of about $1.40 from last time, obtained by dividing the $280M jackpot among the expected 2.5 winners (2.5 = 200M tickets times the 1/80M chance per ticket) and multiplying the resulting $112M share by w.
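The expected value can be checked the same way; the cutoff of 60 winners in the sum is arbitrary (the probabilities beyond it are negligible), and the winner counts are for the *other* N-1 tickets:

```python
w = 1 / 80_000_000
N = 200_000_000
jackpot = 280_000_000
lam = (N - 1) * w    # expected number of *other* winners, essentially 2.5

# Probability that exactly k of the other tickets win (same recursion as
# before), and the share of the jackpot you'd get in each case:
p, value = (1 - w) ** (N - 1), 0.0
for k in range(0, 60):
    value += p * jackpot / (k + 1)   # you split the jackpot k+1 ways
    p *= lam / (k + 1)               # advance to P(k+1 other winners)

# Your ticket's expected value: chance you win, times expected share.
print(round(w * value, 2))  # about $1.29, as stated above
```

Splitting among the expected winners first ($1.40) overstates the value a little, because averaging 1/(k+1) is not the same as taking 1 over the average number of winners.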
But still, taxes and the fact that you can't take home the entire jackpot if you want it all at once means that it is not worth it (from an expected value point of view). The only reason to buy a ticket is for the fun of it.
I gave out some worksheets for estimating your utility function for money. You should work them out this weekend. We'll discuss them on Monday.
Thursday, October 16, 2008
Decision tree diagram
Here is the photo I took on Wednesday. The quality isn't great, but you should be able to copy it to your clipboard and look at it in more detail. In fact, I just clicked on it (double click on a Mac, I don't know what you do with a PC) and Firefox presented it in a separate window, and the numbers were easily readable.

In addition to the decision tree we discussed the PowerBall lottery: p=1/80,000,000 to win, 200,000,000 tickets sold, $280,000,000 jackpot. The question is, does it pay (in an expected-return sense) to enter? We immediately noticed that there might be more than one winner, and we estimated roughly 2.5 winners on average. This makes a ticket worth about $1.40, so at first sight it appears to be a good idea to enter. But there are several problems, which we uncovered on further discussion. One is taxes: you would be taxed at the highest bracket, which is in the 35-39% range (depending on the tax law), as well as paying Vermont income tax. Also, you don't get the money all at once, but in installments over 20 years. To get the money at once, you have to take a discount of about 50% (since the way the lottery works, the state buys you an annuity that pays out over 20 years, and you would only get the amount that the state would pay the insurance company for the annuity). Thus, it seems that it isn't worthwhile after all.
I left you with the problem of trying to get a more precise estimate of the expected return, considering the probability that 1, 2, 3,... more winners will win the lottery other than you.
