Tuesday, October 14, 2008

Class, 10/13

We went through the test, and I won't repeat what we talked about except to note several things that I want to emphasize.

1) On the Fermi problems, several tips. Don't try to be too fancy, for example by estimating low-, middle- and high-income populations and housing prices and rolling them together to get an average. It is better to estimate something close to the median cost of a house and just multiply by the number of houses. Very few people have expensive houses, and trying to factor that information in isn't going to increase your accuracy. Also, don't forget to divide the population of the U.S. (300 million) by the average family size (around 4) to get the number of households.
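As a rough illustration of the "keep it simple" approach, here is a minimal Python sketch. The population and family size are the figures above; the median house price is a made-up placeholder (the notes don't give one), so treat the final number as illustrative only.

```python
# Figures taken from the notes above.
us_population = 300_000_000
average_family_size = 4

# ASSUMPTION: hypothetical placeholder, not a figure from the notes.
median_house_price = 200_000  # dollars

# One house per family: divide the population by the family size.
households = us_population // average_family_size

# Fermi estimate: number of houses times a typical (median) price.
total_value = households * median_house_price

print(f"households: {households:,}")
print(f"total housing value: ${total_value:.1e}")
```

Changing the placeholder price by a factor of two changes the answer by a factor of two, which is about the accuracy a Fermi estimate aims for anyway.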

2) On the coin problem, the easiest way is to recognize that the method chosen (pick a coin at random and flip it) makes every side of every coin equally likely to be the one you see. Cross off the tails (5 instances) and there are 7 ways to get a head. Of these, 4 will have a head on the other side, so the probability is 4/7. If you use the "spreadsheet" method, recognize that there are three states of nature, HH, HT and TT, with prior probabilities of 2/6, 3/6 and 1/6, respectively. The same idea works if you use a tree. Recall that the branches at the base of any probability tree will always be the states of nature and their prior probabilities. This is why it is important to start any analysis by identifying the distinct states of nature, and then their prior probabilities, no matter what method you use. The likelihoods are then branches off the base branches, each labeled with the data observed and the probability of observing that data (H in this case) given that the corresponding state of nature is true.
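The "spreadsheet" method for the coin problem can be sketched in a few lines of Python. The priors are the ones given above; the likelihoods of seeing a head (1 for HH, 1/2 for HT, 0 for TT) follow from the coin types. The variable names are my own.

```python
from fractions import Fraction

# States of nature with their prior probabilities (from the notes).
priors = {"HH": Fraction(2, 6), "HT": Fraction(3, 6), "TT": Fraction(1, 6)}

# Likelihood of the observed data (a head) under each state of nature.
likelihood_H = {"HH": Fraction(1), "HT": Fraction(1, 2), "TT": Fraction(0)}

# Joint = prior * likelihood, one "spreadsheet row" per state of nature.
joint = {s: priors[s] * likelihood_H[s] for s in priors}

# Marginal probability of seeing a head: the sum of the joint column.
marginal = sum(joint.values())

# Posterior = joint / marginal.
posterior = {s: joint[s] / marginal for s in joint}

print(posterior["HH"])  # 4/7
```

This agrees with the counting argument: 4 of the 7 head faces have a head on the other side.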

3) Here the important thing to recognize is that the taxis continue to drive around, so it is sampling with replacement, and the three factors in the likelihood will not change from observation to observation since the number of taxis available does not change.
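To see why replacement matters, here is a small sketch. It assumes the problem is the usual one where taxis carry numbers 1 through N and each sighting is equally likely to be any taxi; that setup, the function names and the three observed numbers are all my assumptions for illustration, not details from the test.

```python
from fractions import Fraction

def likelihood_with_replacement(n_taxis, observations):
    """P(observations | n_taxis) when every sighting draws from all n_taxis."""
    if n_taxis < max(observations):
        return Fraction(0)  # a taxi number larger than N is impossible
    result = Fraction(1)
    for _ in observations:
        result *= Fraction(1, n_taxis)  # same factor for every observation
    return result

def likelihood_without_replacement(n_taxis, observations):
    """For contrast: the pool shrinks by one taxi after each sighting."""
    if n_taxis < max(observations):
        return Fraction(0)
    result = Fraction(1)
    for k in range(len(observations)):
        result *= Fraction(1, n_taxis - k)  # factor changes each time
    return result

observations = [3, 7, 5]  # hypothetical taxi numbers
print(likelihood_with_replacement(10, observations))     # 1/1000
print(likelihood_without_replacement(10, observations))  # 1/720
```

With replacement the three factors are identical, which is the point made above.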

4) This one is probably best solved with the natural frequencies method. Take 1500 students in the group. 150 will have taken the drug and 1350 will not have taken it. Of those that took it, 147 will be caught by the test and 3 will escape detection. Of those that did not take it, 40.5 will be falsely caught and 1309.5 will correctly be identified as not taking the drug. This gives the answer to the last part of the question (40.5), and by computing the ratio 147/(147+40.5)=0.78 we get the probability that a student has taken the drug, given that he tests positive.
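The natural frequencies bookkeeping can be checked with a short Python sketch. The three rates below are inferred from the counts above (150/1500, 147/150 and 40.5/1350); the variable names are mine.

```python
from fractions import Fraction

group = 1500
base_rate = Fraction(10, 100)            # fraction who took the drug
sensitivity = Fraction(98, 100)          # users caught by the test
false_positive_rate = Fraction(3, 100)   # non-users falsely caught

took = group * base_rate                 # 150
did_not_take = group - took              # 1350

true_positives = took * sensitivity                    # 147
missed = took - true_positives                         # 3
false_positives = did_not_take * false_positive_rate   # 40.5
true_negatives = did_not_take - false_positives        # 1309.5

# P(took the drug | positive test) = true positives / all positives.
posterior = true_positives / (true_positives + false_positives)

print(float(false_positives))  # 40.5
print(float(posterior))        # 0.784
```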

5) The table shows dependence, since the entry in at least one cell does not equal the product of the marginal probabilities in the corresponding row and column. To get an independent table, just multiply those marginals and enter the product into the corresponding cell.
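Here is a minimal sketch of that check in Python. The 2x2 table is a hypothetical example of my own, not the table from the test.

```python
from fractions import Fraction

# ASSUMPTION: hypothetical joint probability table, for illustration only.
joint = [
    [Fraction(3, 10), Fraction(2, 10)],
    [Fraction(1, 10), Fraction(4, 10)],
]

# Marginals: sum across each row and down each column.
row_marginals = [sum(row) for row in joint]
col_marginals = [sum(col) for col in zip(*joint)]

# Independent table with the same marginals: each cell is row * column.
independent = [[r * c for c in col_marginals] for r in row_marginals]

# The table is independent only if every cell matches the product.
is_independent = joint == independent

print(is_independent)  # False
```

In this example the top-left cell is 3/10 while the product of its marginals is 1/2 * 2/5 = 1/5, so one mismatched cell is enough to show dependence.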

6) The easiest way to do this one is to focus on the gains and losses, rather than the absolute amount that you get back at the end. Thus, the gain is $500 for the bond, and for the mutual fund it is 0.7*$9700*0.15-0.3*$9700*0.1=$727.50. However, you have to pay a commission out of this (-$300 tollgate), so you'll actually have an expected profit of $427.50. Since this is less than $500, you'll prefer the bond.

If you focus on how much you get back, you have to be careful: on the mutual fund branch you will already have subtracted the commission, so you don't want a tollgate as well, or you'd be paying the commission twice.

Generally speaking, it's a lot easier to do these problems by focusing on the gain or loss rather than the amount you get back after a year.
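The gain-and-loss comparison from problem 6 can be sketched as follows. All the numbers are the ones used above; the variable names are mine.

```python
# Figures from problem 6 above.
invested_in_fund = 9700          # dollars actually invested in the fund
p_up, gain_rate = 0.7, 0.15      # 70% chance of a 15% gain
p_down, loss_rate = 0.3, 0.10    # 30% chance of a 10% loss
commission = 300                 # the tollgate on the mutual fund branch
bond_gain = 500                  # certain gain on the bond

# Expected gain on the fund before the commission.
expected_fund_gain = (p_up * invested_in_fund * gain_rate
                      - p_down * invested_in_fund * loss_rate)

# Pay the commission exactly once, as a tollgate on the gain.
net_fund_gain = expected_fund_gain - commission

choice = "bond" if bond_gain > net_fund_gain else "mutual fund"

print(f"expected fund gain: ${expected_fund_gain:.2f}")
print(f"net of commission:  ${net_fund_gain:.2f}")
print(f"prefer the {choice}")
```

Working in gains keeps the commission in one place, which avoids the double-counting trap described above.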
