We decided on November 7 as the date for the next test. A study guide can be found here, which we will be discussing next week and the following week. You'll get a paper copy on Monday.
Today we talked about jurors' decisions. We didn't do any math, but we discussed the various options available to a juror, including the consequences of making the wrong decision (convicting someone who is innocent, acquitting someone who is guilty). We'll pick the discussion up on Monday and try to quantify the results of Friday's conversation.
Sunday, October 26, 2008
Friday, October 24, 2008
An interesting short article
The New York Times published this interesting article on statistics, baseball and health care today.
Wednesday, October 22, 2008
Class, 10/22
OK, so today I revisited the insurance problem, but from the point of view of expected loss rather than expected utility. It's not any different, but I drew a picture on the blackboard showing that there is a range of insurance premiums between the minimum premium the insurance company would sell the policy for (p=m/h; see the previous posting for definitions) and the maximum the homeowner would pay for it (p=loss(m)/loss(h)). The homeowner hopes that between these two limits, competition among insurance companies will give him or her a good deal on insurance.
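To make the picture concrete, here is a small Python sketch of that premium range. The specific loss curve loss(x) = x^1.5 and the dollar figures are my own illustrative assumptions; the class only assumed the curve bends the risk-averse way.

```python
# Range of viable insurance premiums, with a hypothetical
# risk-averse loss curve loss(x) = x**1.5.
h = 300_000     # value of the house
p = 1 / 1000    # probability of disaster (a fire burns the house down)

# Risk-neutral insurance company: sells only if the premium m exceeds
# the expected payout, i.e., m > p*h (equivalently p < m/h).
m_min = p * h

# Homeowner: buys as long as p > loss(m)/loss(h), i.e., (m/h)**1.5 < p,
# so the largest premium she will accept is:
m_max = h * p ** (1 / 1.5)

print(f"insurer's minimum premium:   ${m_min:,.0f}")
print(f"homeowner's maximum premium: ${m_max:,.0f}")
```

With these numbers the range runs from $300 up to about $3,000; any premium in between leaves both parties better off, which is exactly the gap the homeowner hopes competition will fill.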
We then discussed testing various hypotheses based on data observed.
First, we discussed testing a coin which may be fair or unfair.
We decided that a reasonable prior would be P(fair)=0.5 and P(unfair)=0.5.
But then, what does P(unfair) mean? If the coin is fair, it comes up heads with probability 0.5. But if it is unfair, what is that probability? We decided to split the probability equally amongst the possibilities, which we chose to be 0.05, 0.15, 0.25, ..., 0.85, 0.95, with each possibility having prior probability 0.05 (after some discussion reminding us that the total probability has to be 1, and we have already expended 0.5 on the "null hypothesis" that the coin is fair).
So we set up a "spreadsheet" calculation. We discussed how to actually do it if we were doing it with Excel.
We didn't actually do the calculation, but I will tell you the result: with 60 heads and 40 tails, the probability of obtaining a result that extreme or more extreme (60 or more heads, or 40 or fewer) is about 0.05, but the probability that the coin is fair, given that we have observed the data (60 heads and 40 tails), is about 0.5.
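The whole "spreadsheet" fits in a few lines of Python, for anyone who wants to check the numbers quoted above (a sketch, using exactly the priors we chose in class):

```python
# Bayesian "spreadsheet" for the possibly-unfair coin: P(fair) = 0.5,
# and prior 0.05 on each alternative bias 0.05, 0.15, ..., 0.95.
from math import comb

heads, tails = 60, 40
n = heads + tails

def likelihood(theta):
    # P(60 heads in 100 flips | the coin's heads probability is theta)
    return comb(n, heads) * theta ** heads * (1 - theta) ** tails

alternatives = [0.05 + 0.1 * k for k in range(10)]
numerator = 0.5 * likelihood(0.5)
posterior_fair = numerator / (numerator
                              + sum(0.05 * likelihood(t) for t in alternatives))

# Classical tail probability: 60 or more heads, or 40 or fewer.
p_value = sum(comb(n, k) * 0.5 ** n for k in range(heads, n + 1)) + \
          sum(comb(n, k) * 0.5 ** n for k in range(0, tails + 1))

print(round(p_value, 3), round(posterior_fair, 2))   # about 0.057 and 0.53
```

The tail probability comes out near 0.05 while the posterior probability of fairness stays near one half, which is the contrast discussed in class.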
This is very interesting: the standard test of statistical significance, based on how extreme the result is, can give a very different answer from the Bayesian calculation.
We also discussed the problem of estimating the probability that an unknown proportion (here, the bias of the coin; in the problem set, the cure rate of the new drug) is greater than some fixed value (say 0.2). The spreadsheet is the same except there is no special treatment of 0.5; we crossed out that line and used prior probability 0.1 for each of the alternatives 0.05, 0.15, ..., 0.95. To find the probability that the new drug is better than the old one, we just add the posterior probabilities for the states of nature that are greater than 0.2. (Again, we didn't do the actual calculation.)
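A sketch of that second spreadsheet in Python. Since the drug data live in the problem set, I've plugged in made-up counts (3 cures in 10 patients) purely to show the mechanics:

```python
# Flat prior: 0.1 on each state of nature 0.05, 0.15, ..., 0.95.
from math import comb

cures, failures = 3, 7          # hypothetical data, not the problem set's
n = cures + failures
states = [0.05 + 0.1 * k for k in range(10)]

weights = {t: 0.1 * comb(n, cures) * t ** cures * (1 - t) ** failures
           for t in states}
total = sum(weights.values())
posterior = {t: wt / total for t, wt in weights.items()}

# P(cure rate > 0.2): add the posterior probabilities of the
# states of nature above 0.2.
p_better = sum(pr for t, pr in posterior.items() if t > 0.2)
print(round(p_better, 2))
```

The last line is the only step beyond the coin spreadsheet: summing posterior mass over the states above the fixed value.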
We finished with a challenge: how to decide, if you are on a jury, whether to convict or acquit a defendant in a criminal case. More precisely, what does "beyond a reasonable doubt" mean?
Tuesday, October 21, 2008
Class, 10/20
We talked about utilities. First we looked at the shapes of the curves that you all derived over the weekend. We learned that a straight-line utility curve is risk-neutral, one that curves upward is risk-seeking, and one that curves downward is risk-averse.
We then discussed insurance on a house. We found that if h is the value of the house, m is the premium we pay for the insurance, and p is the probability of disaster (e.g., a fire burns the house down), then the insurance company will demand that p be less than m/h. On the other hand, the owner of the house (if her utility is neutral) will demand that p be greater than m/h, and no transaction can take place. But insurance is bought and sold, so there has to be an explanation. The explanation is that although insurance companies have a nearly neutral utility curve (except for truly huge amounts), people do not: most people have risk-averse utility curves. A homeowner will demand that p be greater than u(-m)/u(-h), where u(-) means the value of her utility curve at the point in question (the quantities are negative because, for both the premium and the potential catastrophe, the person ends up with fewer assets). But if your utility curve curves downward, the ratio u(-m)/u(-h) will be less than m/h, so the person will be willing to buy the insurance after all. The insurance company can therefore set a value of m that the consumer is willing to pay and that also gives a profit to the company, keeping the stockholders happy.
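A quick numerical check of that argument, with a log utility curve standing in for "risk-averse" (the utility function and all the dollar amounts here are my own illustrative assumptions):

```python
from math import log

wealth = 310_000   # homeowner's total assets
h = 300_000        # value of the house
p = 0.001          # probability the house burns down
m = 600            # premium, twice the actuarial value p*h = $300

u = log            # a downward-curving (risk-averse) utility

eu_insure    = u(wealth - m)                            # pay the premium, house is safe
eu_uninsured = p * u(wealth - h) + (1 - p) * u(wealth)  # gamble on no fire

print(eu_insure > eu_uninsured)   # True: the risk-averse owner insures
print(m > p * h)                  # True: the insurer still expects a profit
```

Even though the premium is double the expected payout, the homeowner's expected utility is higher with insurance, so both parties gain, just as the blackboard argument says.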
I remarked that this is actually how all commerce works. There are two parties, a seller and a buyer. They are willing to make a transaction because their utility curves are different, so it is a "win-win" situation: both the buyer and the seller feel better off (in terms of utilities) than they did before the transaction took place.
One student had remarked in class and in journals that this approach (decision theory) might not be adequate for lotteries, where there is a huge payoff of very low probability. Should someone play the lottery if, even after taxes and annuitization, it has a positive expected return? My answer is, "Not really." The reason is that we don't (or shouldn't) make decisions based on expected return; we should make decisions based on expected utility or expected loss. I posed the question: would you rather have $280 million with probability 1/2, or $10 million for sure? The overwhelming choice of the class was to take the $10 million. This means that, to most of the people in the class, having $280 million isn't that much better than having $10 million, so in the lottery problem you probably won't want to use $280 million as the leaves at the ends of the decision tree. You will make just as good a decision if you put $10 million there, and if you do, your analysis will be just as rational and will tell you that the lottery is not really a good place to invest your money (unless your only reward is the thrill of entering!). A final comment: the student who raised this issue agreed that when utilities or losses were used as the payoff, it would not be a problem.
I finally drew on the board a generally useful way to estimate utilities for any events whatsoever. I used the Monty Hall example of a car, a goat, and a trip to Hawaii. Presumably the car is the best and the goat is the worst, with the Hawaii trip in between. Draw a decision tree, put the car and the goat on the probability branches and the Hawaii trip on the "get for certain" branch. Then, pick a probability for getting the car that makes you neutral between the two branches of the decision tree. There should be a point where you are neutral, for if the probability of getting the car is 1, you'd take the car for sure, but if the probability of getting the car is 0, you'd pick the Hawaii trip for sure. One student volunteered p=0.8. That means that her utility for the Hawaii trip is 0.8, since at that point, both branches of the decision tree have exactly the same value.
I remarked finally that if you use this method for evaluating utilities, then the utilities so calculated are actually probabilities!
Saturday, October 18, 2008
Class, 10/17
Today we spent most of the class discussing the lottery problem I left you with last time.
What we need to compute is the probability that no one wins the lottery, the probability that exactly one person wins the lottery, exactly two people, and so forth.
After some discussion, we decided that the probability that no one wins is the probability that the first person loses AND the second person loses AND ... AND the last person loses. Since these are independent events, we multiply the probabilities of each of these events (AND always means multiply the probabilities). If I write w = 1/80,000,000 for the probability that a given person wins the lottery, then the probability that that person loses is (1-w). The probability that everyone loses is (1-w)^N, where N = 200,000,000 is the number of tickets sold. Although that looks terrible to compute, a hand calculator correctly computed this number to be 0.082.
To get the probability that exactly one person wins, we decided that it equals the probability that (the first person wins AND all the others lose) OR (the second person wins AND all the others lose) OR ... OR (the last person wins AND all the others lose). The AND means multiplying probabilities, and the OR means adding them. The probability that one specified person wins AND all the others lose is w·(1-w)^(N-1), which is hardly different from w·(1-w)^N since the extra factor of (1-w) is very, very close to 1. But there are N tickets, so the OR means we add this number to itself N times, and the probability that exactly one person wins is (Nw)·0.082, or 0.205.
For two people we follow the same principle: we compute the probability that (the first person wins AND exactly one of the other people wins AND all the rest lose) OR (the second person wins AND exactly one of the others wins AND all the rest lose) OR ... OR (the last person wins AND exactly one of the others wins AND all the rest lose). We just computed the probability that one of the other people wins AND all the rest lose; it's 0.205. The probability that a particular person wins is still w, and there are N identical numbers OR'ed together, so we multiply by N again, getting (Nw)·0.205. However, there is a little complication: if you look at the first two terms above, both contain a piece in which the first person wins AND the second person wins, and a similar thing can be said about any pair of terms. Each pair of people thus appears twice in the sum and is counted twice as often as it should be, so the sum has to be divided by 2. The probability that exactly two people win is therefore (Nw)·0.205/2 = 0.257.
In a similar way, the probability that exactly three people win can be computed as (Nw)*0.257/3=0.214; similarly to the case of two people, we note that each triple of tickets gets counted three times, so we have to divide by 3. And so forth for the case of exactly four, five, six and so on. Once you get to seven people, there's less than a 1% chance that that many people will win.
Now we can compute the expected value of a ticket. Your probability of winning is w. If no one else wins (probability 0.082), you win the full $280M. If exactly one other person wins (probability 0.205), you win $140M. And so forth. Adding it all up, the expected value of a ticket is $1.29. Compare this with the rough figure of about $1.40 we got last time by dividing the $280M jackpot among the expected 2.5 winners (2.5 = 200M tickets times the 1-in-80M chance of winning) and multiplying by w.
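The whole calculation can be carried out in Python rather than on a calculator. The divide-by-k bookkeeping above is exactly the recursion P(k) = (Nw)·P(k-1)/k:

```python
w = 1 / 80_000_000       # chance a single ticket wins
N = 200_000_000          # tickets sold
jackpot = 280_000_000

probs = [(1 - w) ** N]                        # P(no winners), about 0.082
for k in range(1, 20):
    probs.append(N * w * probs[k - 1] / k)    # P(exactly k winners)

# Expected value of your ticket: you win with probability w and split the
# jackpot with however many others also win (with N this large, the count
# of other winners has essentially the same distribution as probs).
ev = w * jackpot * sum(pk / (k + 1) for k, pk in enumerate(probs))

print([round(pk, 3) for pk in probs[:4]])   # [0.082, 0.205, 0.257, 0.214]
print(round(ev, 2))                         # 1.29
```

Twenty terms is plenty; as noted above, seven or more winners already has under a 1% chance.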
But still, taxes and the fact that you can't take home the entire jackpot if you want it all at once mean that it is not worth it (from an expected-value point of view). The only reason to buy a ticket is for the fun of it.
I gave out some worksheets for estimating your utility function for money. You should work them out this weekend. We'll discuss them on Monday.
Thursday, October 16, 2008
Decision tree diagram
Here is the photo I took on Wednesday. The quality isn't great, but you should be able to copy it to your clipboard and look at it in more detail. In fact, I just clicked on it (double click on a Mac, I don't know what you do with a PC) and Firefox presented it in a separate window, and the numbers were easily readable.

In addition to the decision tree we discussed the PowerBall lottery. p=1/80,000,000 to win, 200,000,000 tickets sold, $280,000,000 jackpot. The question is, does it pay (in an expected return sense) to enter? We immediately noticed that there might be more than one winner, and we estimated roughly 2.5 winners on average. This makes a ticket worth $1.41, so at first sight it appears to be a good idea to enter. But there are several problems, which we uncovered on further discussion. One is taxes: You would be taxed at the highest bracket, which is in the 35-39% range (depending on the tax law), as well as Vermont income tax. Also, you don't get the money all at once, but in installments over 20 years. To get the money at once, you have to take a discount of about 50% (since the way the lottery works, the state buys you an annuity that pays out over 20 years, and you would only get the amount that they would pay the insurance company to buy the annuity). Thus, it seems that it isn't worthwhile after all.
I left you with the problem of trying to get a more precise estimate of the expected return, considering the probability that 1, 2, 3,... more winners will win the lottery other than you.

Tuesday, October 14, 2008
Class, 10/13
We went through the test, and I won't repeat what we talked about except to note several things that I want to emphasize.
1) On the Fermi problems, several tips. Don't try to be too fancy, as in estimating low-, middle-, and high-income populations and housing prices and rolling them together into an average. It is better to estimate something close to the median cost of a house and just multiply by the number of houses; very few people have expensive houses, and factoring that in isn't going to increase your accuracy. Also, don't forget to divide the population of the U.S. (300 million) by the average family size (around 4).
2) On the coin problem, the easiest way is to recognize that the method chosen (pick coin at random and flip) has an equal probability of seeing any particular side. Cross off the tails (5 instances) and there are 7 ways to get a head. Of these, 4 will have a head on the other side. If you use the "spreadsheet" method, recognize that there are three states of nature, HH, HT and TT, with prior probability of 2/6, 3/6 and 1/6, respectively. The same idea works if you use a tree...recall that the branches at the base of any probability tree will always be the states of nature and their prior probabilities. This is why it is important to start any analysis by identifying the distinct states of nature, and then their prior probabilities, no matter what method you use. Then the likelihoods are branches off the base branches, identified as the data observed and the probability of observing that data (H in this case) given that the corresponding state of nature is true.
3) Here the important thing to recognize is that the taxis continue to drive around, so it is sampling with replacement, and the three factors in the likelihood will not change from observation to observation since the number of taxis available does not change.
4) This one is probably best solved with the natural frequencies method. Take 1500 students in the group. 150 will have taken the drug and 1350 will not have taken it. Of those that took it, 147 will be caught by the test and 3 will escape detection. Of those that did not take it, 40.5 will be falsely caught and 1309.5 will correctly be identified as not taking the drug. This gives the answer to the last part of the question (40.5), and by computing the ratio 147/(147+40.5)=0.78 we get the probability that a student has taken the drug, given that he tests positive.
5) The table is dependent, since the entries in at least one cell do not equal the product of the marginal probabilities in the corresponding row and column. To get an independent table, just multiply those marginals and enter the product into the corresponding row and column.
6) The easiest way to do this one is to focus on the gains and losses, rather than the absolute amount that you get back at the end. Thus, the gain is $500 for the bond, and for the mutual fund it is 0.7*$9700*0.15-0.3*$9700*0.1=$727.50. However, you have to pay a commission out of this (-$300 tollgate), so you'll actually have an expected profit of $427.50. Since this is less than $500, you'll prefer the bond.
If you focus on how much you get back, you have to be careful, because on the mutual fund branch you will already have subtracted the commission so you don't want a tollgate or you'd be paying the commission twice.
Generally speaking, it's a lot easier to do these problems by focusing on the gain or loss rather than the amount you get back after a year.
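For problem 4, the natural-frequencies tree translates directly into code (a sketch; the 10% usage rate, 98% detection rate, and 3% false-positive rate are read off the counts in the solution above):

```python
students = 1500
took  = students // 10          # 150 took the drug (10%)
clean = students - took         # 1350 did not

caught      = took * 98 // 100  # 147 caught by the test
missed      = took - caught     # 3 escape detection
false_alarm = clean * 3 / 100   # 40.5 falsely caught
cleared     = clean - false_alarm   # 1309.5 correctly cleared

p_drug_given_positive = caught / (caught + false_alarm)
print(false_alarm)                       # 40.5
print(round(p_drug_given_positive, 2))   # 0.78
```

Notice that the answer comes from comparing the two "positive" branches of the tree, 147 true positives against 40.5 false ones, exactly as in the natural-frequencies method.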