Saturday, November 1, 2008

Class, 10/31

We finished discussing the drug testing problem. From the data we have posterior probabilities for various cure rates for each drug. By multiplying them, we obtain the posterior probability that, for example, drug A has cure rate 0.25 and drug B has cure rate 0.35. We can then add up all those joint probabilities for which drug B has the greater cure rate, to get the probability that drug B is better than drug A.
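The adding-up step can be sketched in a few lines of Python. This is only an illustration: the two posterior tables below are made-up numbers on a coarse grid, not the posteriors from class, and the function name `prob_b_better` is hypothetical. Multiplying the two posteriors treats the drugs as independent, and ties (equal cure rates) are not counted as "B is better."

```python
def prob_b_better(post_a, post_b):
    """Return P(cure rate of B > cure rate of A).

    post_a and post_b map cure rate -> posterior probability.
    The two posteriors are treated as independent, so each joint
    probability is just the product of the two marginals.
    """
    total = 0.0
    for rate_a, p_a in post_a.items():
        for rate_b, p_b in post_b.items():
            if rate_b > rate_a:          # ties are not counted as "better"
                total += p_a * p_b       # joint posterior of this (A, B) pair
    return total


# Made-up posteriors on a coarse grid of cure rates (NOT the class data).
example_post_a = {0.25: 0.5, 0.35: 0.3, 0.45: 0.2}
example_post_b = {0.25: 0.2, 0.35: 0.3, 0.45: 0.5}

p_b_better = prob_b_better(example_post_a, example_post_b)
print(f"P(B better than A) = {p_b_better:.3f}")
```

With the real posteriors from your notes substituted for the two example tables, the same loop gives the probability we computed in class.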

We then discussed the marketing problem. We identified two stages, the first of which was to spend $20 million on testing the drug to FDA standards. We recognized that the drug might not pass this test; most experimental drugs do not. Based on our preliminary test, there is a 75% chance that the drug will pass. We discussed "sunk costs," that is, costs that cannot be recovered. Getting to the point where we are (with 100 subjects tested) did cost some money, but we can never get it back, so we may as well call our loss or utility exactly zero at this point.

You can calculate using either losses or utilities. It's purely a matter of convenience. Since the drug company is very wealthy, its utility or loss function will be linear or nearly so.

We illustrated the process by imagining that if the company decides to continue the development of the drug, it will pass through a toll gate worth $20M. Then there will be a 75% probability that we will go on to develop the drug to marketing stage, which will cost an additional $80M and require another toll gate. We guessed that the drug had a 20% chance of commercial success (revenues of 20 years at $1B per year) and an 80% chance of "failure" (20 years at $10M/year, if I recall). These figures may not be what I wrote on the board, but you should have the correct numbers and the tree in your notes. We decided that the company should go ahead with the plan, given the figures we used.
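For anyone who wants to check the arithmetic, here is a rough sketch of the tree in Python, working backward from the leaves. It uses the figures quoted above (which may differ from the ones on the board) and makes some simplifying assumptions of my own: utility is linear in dollars, there is no discounting of future revenue, and a failed FDA test ends the project with no further costs or revenues. All variable names are just for illustration.

```python
# All amounts are in millions of dollars; figures are the ones quoted above.
TEST_COST = 20.0                 # first toll gate: testing to FDA standards
P_PASS = 0.75                    # chance the drug passes that test
DEVELOP_COST = 80.0              # second toll gate: develop to marketing stage
P_SUCCESS = 0.20                 # chance of commercial success
REVENUE_SUCCESS = 20 * 1000.0    # 20 years at $1B per year
REVENUE_FAILURE = 20 * 10.0      # 20 years at $10M per year


def expected_value_of_continuing():
    """Fold the tree back from the leaves to the first decision."""
    # Chance node after paying the development cost.
    ev_market = (P_SUCCESS * REVENUE_SUCCESS
                 + (1 - P_SUCCESS) * REVENUE_FAILURE
                 - DEVELOP_COST)
    # Decision node after passing the test: develop, or abandon (worth 0).
    ev_after_pass = max(ev_market, 0.0)
    # Chance node at the first toll gate. Assumption: failing the test
    # ends the project with no further cash flow.
    return -TEST_COST + P_PASS * ev_after_pass


ev_continue = expected_value_of_continuing()
ev_stop = 0.0  # sunk costs are ignored, so stopping now is worth exactly zero
decision = "continue" if ev_continue > ev_stop else "stop"
print(f"EV(continue) = ${ev_continue:,.0f}M, EV(stop) = ${ev_stop:,.0f}M -> {decision}")
```

Changing the constants at the top lets you see how sensitive the decision is to the guessed probabilities and revenues.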

We then discussed the fish problem. The states of nature (SON) are the numbers from 1 to 100, and we took the prior to be uniform (0.01 on each SON). One student asked why, since we know there have to be at least 15 fish in the lake, we should not set the prior on the smaller values to zero. Another student pointed out that doing so requires looking at the data; the prior is supposed to reflect what you know before you look at the data, so this would be "cheating." We had a false start on the likelihood, which was my fault, as I should have steered us to the correct solution more quickly. Several students were puzzled, so we started over. Since there are N fish in the lake, and 10 of them are tagged, the probability of picking 5 tagged and 5 untagged is as follows:

For N=15, it's (10*9*8*7*6)*(5*4*3*2*1)/((15*14*13*12*11)*(10*9*8*7*6)). We get this because the probability of picking one tagged fish is 10/15, the probability of picking the second tagged fish is 9/14, and so on through the 5 tagged fish; then for the untagged fish it is 5/10 for the first one, 4/9 for the second one, and so on through the 5 untagged fish. As each fish is caught, the number of fish of that kind (tagged or untagged) decreases by 1, as does the total number of fish in the lake.

Similarly, if N=20, it's (10*9*8*7*6)*(10*9*8*7*6)/((20*19*18*17*16)*(15*14*13*12*11)), with the same kind of reasoning.
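The same reasoning works for any N, so the whole likelihood column and the posterior can be sketched in Python as below. The function and variable names are my own. The product computed here is the probability of catching the fish in one particular order (5 tagged, then 5 untagged); counting all possible orders would multiply every entry by the same factor of 252, which cancels when the posterior is normalized. Exact fractions are used so that the N=15 and N=20 values can be compared directly with the expressions above.

```python
from fractions import Fraction

TAGGED_IN_LAKE = 10    # tagged fish in the lake
TAGGED_CAUGHT = 5      # tagged fish in the catch
UNTAGGED_CAUGHT = 5    # untagged fish in the catch


def likelihood(n_fish):
    """P(5 tagged then 5 untagged | n_fish in the lake), without replacement."""
    untagged_in_lake = n_fish - TAGGED_IN_LAKE
    if untagged_in_lake < UNTAGGED_CAUGHT:
        return Fraction(0)               # fewer than 15 fish: data impossible
    prob = Fraction(1)
    remaining = n_fish                   # fish still in the lake
    for i in range(TAGGED_CAUGHT):
        prob *= Fraction(TAGGED_IN_LAKE - i, remaining)
        remaining -= 1
    for i in range(UNTAGGED_CAUGHT):
        prob *= Fraction(untagged_in_lake - i, remaining)
        remaining -= 1
    return prob


states = range(1, 101)                   # states of nature: N = 1, ..., 100
prior = Fraction(1, 100)                 # uniform prior, 0.01 on each state
joint = {n: prior * likelihood(n) for n in states}
evidence = sum(joint.values())           # marginal probability of the data
posterior = {n: joint[n] / evidence for n in states}

for n in (15, 19, 20, 25):
    print(f"N = {n:3d}: posterior = {float(posterior[n]):.4f}")
```

Note that the states below 15 end up with zero posterior probability automatically, through the likelihood, even though the prior puts 0.01 on each of them.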

We'll discuss this more on Monday and then go on to the remaining problems on the study sheet.
