Monday, October 6, 2008

Interesting things in this week's journals

One person remarked on the volatility of the stock market, particularly as we are experiencing it now. Generally speaking, the stock market is a risky proposition in the sense that over short periods of time it can be quite volatile; over the past several months it is down close to 30%. You should not put money into the stock market that you are going to need soon, even within the next five years or so; the market is where you should put money you won't need for a decade or more. The second thing is diversification, which is best achieved by investing in mutual funds that represent a broad cross-section of the market. The third thing is to have a mix of stocks and fixed-income investments like bonds and money market funds, whose volatility is much less, even though their long-term potential for return is lower than that of stocks. (Historically, stocks have returned on the order of 10% per year over long periods, although they can be down sharply in any given year. Bonds typically return a few percent over inflation.)
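To get a feel for what those historical average rates mean over different horizons, here is a rough compound-growth sketch. The 10% and 4% figures are assumptions standing in for "about 10% per year" and "a few percent over inflation"; they are long-run averages, not predictions for any given year.

```python
# Rough illustration of compound growth at assumed long-run average rates.
stock_rate = 0.10   # ~10% per year, the historical stock average mentioned above
bond_rate = 0.04    # "a few percent over inflation" for bonds (assumed value)

def grow(principal, rate, years):
    """Value of an investment compounding annually at a fixed rate."""
    return principal * (1 + rate) ** years

for years in (5, 10, 20):
    print(f"{years:2d} years: stocks ${grow(1000, stock_rate, years):8.2f}, "
          f"bonds ${grow(1000, bond_rate, years):8.2f}")
```

The gap widens dramatically with time, which is exactly why the long horizon matters: over a decade or two the higher average return dominates, while over a year or two the volatility can easily swamp it.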

Another person also talked about the stock market and mentioned the recent volatility. I pointed out that the recent swings, although bad, are by no means a percentage record: a 700-point drop in one day is about 7%, and there have been much larger percentage drops in history, although 700 points may be a record in points. Only percentage changes reflect what is really happening.

Several people reported that they'd like more clarification of various points. There are many places where you can get this kind of response. Your journal is one; class is another, and you know by now that I'm happy to get your questions. You can also talk to me outside of class, post questions as comments to this blog, or send me email. I welcome all of these.

One person had read Lewis' book and asked about the Prisoner's Dilemma problem. This is of course a problem in game theory, which isn't really part of this course. But it is an interesting problem nonetheless, as it raises the question of whether there is a way for the prisoners to avoid falling into the jailors' trap, thus ending up with sentences that are more favorable to both. There are approaches that can do this, by embedding this particular game into a larger one (for example, a repeated game). You might find information about this on Wikipedia or elsewhere on the web.
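To make the "trap" concrete, here is a sketch of the dilemma with hypothetical sentence lengths (the specific numbers are my illustration, not from Lewis' book). The point is that confessing is each prisoner's best response no matter what the other does, yet both confessing leaves both worse off than mutual silence.

```python
# The classic Prisoner's Dilemma with hypothetical sentence lengths
# (years in prison; smaller is better). Moves: "C" = stay silent
# (cooperate with your partner), "D" = confess (defect).
sentences = {
    ("C", "C"): (1, 1),    # both silent: light sentences for both
    ("C", "D"): (10, 0),   # you stay silent, partner confesses
    ("D", "C"): (0, 10),
    ("D", "D"): (5, 5),    # both confess: the jailors' trap
}

def best_response(partner_move):
    """The move that minimizes my own sentence, given the partner's move."""
    return min(["C", "D"], key=lambda me: sentences[(me, partner_move)][0])

# Whatever the partner does, confessing is individually better...
print(best_response("C"), best_response("D"))        # D D
# ...yet mutual confession is worse for both than mutual silence:
print(sentences[("D", "D")], "vs", sentences[("C", "C")])
```

Embedding this one-shot game in a larger repeated game changes the analysis, because today's defection can be punished in later rounds.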

One person did an analysis of all of the possible shooting orders for the truel problem, not just ABC but also all other permutations such as CBA, CAB, BAC, etc. In all of these situations, the result is that the best shooter has the best probability of surviving, and the worst shooter has the second-best probability of surviving. It is Bob who has the lowest probability of making it out alive. Very interesting!
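A truel like this is easy to explore by simulation. The sketch below uses hypothetical accuracies and an assumed rule (players fire in a fixed order, and each aims at the most accurate opponent still standing); the course problem's exact numbers and rules may differ, so treat this as a template for checking a shooting order, not a reproduction of the journal's analysis.

```python
import random

# Monte Carlo sketch of a three-way duel ("truel").
# Accuracies and targeting rule are assumptions for illustration.
ACCURACY = {"A": 0.3, "B": 0.5, "C": 1.0}  # hypothetical hit probabilities

def one_truel(rng, order=("A", "B", "C")):
    """Play one truel to the last survivor; return the winner's name."""
    alive = list(order)
    while len(alive) > 1:
        for shooter in list(alive):
            if shooter not in alive or len(alive) == 1:
                continue
            # Aim at the most accurate remaining opponent.
            target = max((p for p in alive if p != shooter),
                         key=lambda p: ACCURACY[p])
            if rng.random() < ACCURACY[shooter]:
                alive.remove(target)
    return alive[0]

rng = random.Random(0)
trials = 20_000
wins = {p: 0 for p in ACCURACY}
for _ in range(trials):
    wins[one_truel(rng)] += 1
for p in sorted(ACCURACY):
    print(f"P({p} survives) ~ {wins[p] / trials:.3f}")
```

Changing the `order` argument lets you rerun the same experiment for CBA, CAB, BAC, and so on, which is exactly the permutation analysis the journal described.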

One person asked about Fermi problems, "If you are possibly starting with a completely wrong number, what's the point?" The point is that you are often in a situation where a decision must be made with imperfect knowledge, so you have to make such estimates. It is a good skill to develop, and practice makes perfect: the more practice you have, the more skilled and confident you become, and the better you'll do.
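The reason a "wrong" starting number is tolerable is that a Fermi estimate multiplies several rough guesses, and their errors tend to partially cancel, leaving the right order of magnitude. Here is the classic piano-tuners example (not a problem from this course); every input is a deliberate guess, which is the point.

```python
# A classic Fermi estimate: piano tuners in a large city.
# Every number below is a rough, defensible guess.
population = 3_000_000            # people in the city (guess)
people_per_household = 2          # (guess)
households_with_piano = 1 / 20    # one household in twenty (guess)
tunings_per_piano_per_year = 1
tunings_per_tuner_per_day = 4
working_days_per_year = 250

pianos = population / people_per_household * households_with_piano
tunings_needed = pianos * tunings_per_piano_per_year
tuner_capacity = tunings_per_tuner_per_day * working_days_per_year
print(round(tunings_needed / tuner_capacity), "tuners (order of magnitude)")
```

Even if each guess is off by a factor of two, the final answer is still "tens of tuners, not thousands," which is usually all a decision needs.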

Another person asked about polls taken over a period of time: does each poll have its own bell-shaped curve? The answer is yes, each poll has some uncertainty and therefore its own bell-shaped curve. But people's opinions change over time, so we can't just average polls taken over several months to figure out what is going to happen next; the average of several polls is more likely to reflect what opinion was halfway through the polling period. More sophisticated approaches (something statisticians call "regression," which is the subject of a Burack lecture next Monday afternoon) would be required.
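Here is a small sketch of why the plain average lags. The poll numbers below are made up for illustration: support is drifting upward over six weeks, and a least-squares trend line (the simplest kind of regression) tracks "now" better than the average, which sits near the middle of the period.

```python
# Hypothetical weekly poll results for one candidate (percent support).
weeks = [0, 1, 2, 3, 4, 5]
support = [44.0, 45.0, 45.5, 46.5, 47.0, 48.0]

n = len(weeks)
mean_x = sum(weeks) / n
mean_y = sum(support) / n

# Least-squares slope and intercept, computed from first principles.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, support))
         / sum((x - mean_x) ** 2 for x in weeks))
intercept = mean_y - slope * mean_x

print(f"plain average of the polls : {mean_y:.2f}")   # reflects mid-period opinion
print(f"fitted trend at week 5     : {intercept + slope * 5:.2f}")
```

With drifting opinion, the fitted value at the most recent week is higher than the six-poll average, matching the point above that the average reflects opinion halfway through the polling period.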

One person made a mathematical mistake, which I want to point out. If we have several probabilities expressed as percentages, e.g., 3% and 2%, you cannot multiply the percentages to get the probability of a joint event as 6%. That's because, expressed as probabilities, these are 0.03 and 0.02, respectively, so the probability of the joint event is 0.0006, or 0.06%.
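In code the rule is: convert percentages to probabilities first, multiply, then convert back if you want a percentage.

```python
# Percentages must be converted to probabilities before multiplying.
p_a = 3 / 100   # 3%
p_b = 2 / 100   # 2%
joint = p_a * p_b            # assumes the two events are independent
print(f"{joint:.4f} = {joint:.2%}")   # 0.0006 = 0.06%, not 6%
```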

One person mentioned an interest in political science and polls, and I mentioned that Prof. Andrew Gelman at Columbia University has a blog to which he posts daily. He is a Bayesian statistician and a political scientist, and the author of an interesting book, "Red State, Blue State." Some of what he posts is advanced, but much is quite accessible to nonstatisticians. His blog can be found here. I read it every day.

A very interesting problem was posed by one person, who mentioned hanging out with a friend and finding, within a short distance of each other, two four-leaf clovers. If the probability of finding one four-leaf clover is 10^-4 = 1/10,000, does this mean that the probability of finding two near each other is 10^-8? Actually, it probably isn't, for several reasons. The first is basically a fallacy: 10^-8 may be correct for any two people sitting at random places around the earth, but once you find one four-leaf clover, your attention is drawn to a low-probability event that has already happened. So the probability that is really relevant, once you have found one, is P(find a second | found one), and that is at least 10^-4. If you hadn't found a second one, the first one probably wouldn't have been written about. The same fallacy underlies the occasional news story in which someone who has already won the lottery wins again. The probability that you win a second time, given that you won once, is the same as the probability that you win once (assuming independence); the only reason the event made the news is the second win. It is a mistake to be very surprised that occasionally someone wins twice.

The other reason is that four-leaf clovers are (as the person mentioned) due to genetics, or to soil conditions, or to other external factors. That means it probably isn't the case that P(find a second one | found one) = 10^-4: because of these factors the two finds are not independent, and P(find a second one | found one) might be much larger than P(find one). It's not unlikely that four-leaf clovers grow in proximity, that is, in clusters.
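A toy model makes the clustering effect quantitative. All the numbers below are assumptions for illustration: suppose a small fraction of patches of ground have the right genetics or soil ("good" patches), and searches within a patch succeed independently at a rate depending on the patch type. One application of Bayes' rule shows that a single find makes "good patch" vastly more likely, and so makes a second find vastly more likely too.

```python
# Toy clustering model (all numbers assumed for illustration).
p_good_patch = 0.001      # fraction of patches with the right genetics/soil
p_find_good = 0.05        # chance one search succeeds in a good patch
p_find_bad = 0.00005      # chance one search succeeds elsewhere

# Unconditional probability that one search succeeds:
p_first = p_good_patch * p_find_good + (1 - p_good_patch) * p_find_bad

# Bayes' rule: probability we are in a good patch, given one find.
p_good_given_find = p_good_patch * p_find_good / p_first

# Probability a second search in the SAME patch succeeds, given the first did:
p_second = (p_good_given_find * p_find_good
            + (1 - p_good_given_find) * p_find_bad)

print(f"P(find one)           = {p_first:.6f}")   # roughly 10^-4 by construction
print(f"P(second | found one) = {p_second:.6f}")  # hundreds of times larger
```

In this model P(find one) comes out near 10^-4, but P(find a second | found one) is hundreds of times larger, because the first find is strong evidence that you are standing in a cluster.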

Another person asked about the formula sqrt(N*p*(1-p)) for the expected uncertainty in the number of heads among N coin flips, or of voters voting for a candidate, where p is the true probability in the entire population. I pointed out that this formula isn't part of the course but was brought up to answer a question asked in class; you are not responsible for it, and I will not derive it. One thing puzzled this person, though: the uncertainty is smaller the farther p is from 0.5. But it really is true. One way to see this is to consider the case p = 0. In that case, the voters are unanimous in favoring candidate B, or the coin has two tails. You have no variation at all, so the formula evaluates to 0, as it should. As p moves away from zero, the variation increases, and because of symmetry (after all, heads and tails are symmetric states; voting for A and voting for B are similarly symmetric), it decreases again for values of p greater than 0.5, reaching zero again at p = 1.
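You can also just evaluate the formula and watch the behavior directly. For N = 1000 (a value I've chosen for illustration), the uncertainty vanishes at p = 0 and p = 1, peaks at p = 0.5, and is symmetric about it:

```python
import math

# The uncertainty sqrt(N * p * (1 - p)) for N = 1000 flips or voters,
# evaluated at several values of p.
N = 1000
for p in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    print(f"p = {p:.1f}: uncertainty = {math.sqrt(N * p * (1 - p)):6.2f}")
```

Note the table reads the same from both ends: p = 0.1 and p = 0.9 give identical values, which is the heads/tails (A/B) symmetry mentioned above.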
