On paper it was already decided. The Duke Blue Devils would be the 2010 NCAA Tournament Champions. On paper. Not because Duke was a #1 seed and Butler was a #5 seed. Not because the undersized Butler did not match up well. But because the numbers told me so.
Sometime Sunday afternoon I decided to put what I am learning in school to good use. As an actuarial science student, I have a very strong background in statistics, which includes calculating sample variances and confidence intervals. I figured I could use this to “predict” the outcome of Monday night’s championship game.
I was first introduced to this concept in one of my statistics classes. The idea is that you take two teams, find their average points scored per game (PPG) and the variance of their PPG, and then create two normal curves using PPG as the sample mean and the sample variance as the variance. When you lay the two curves on top of each other, the area where they overlap is treated as the probability that the team that averages fewer PPG will score more than its opponent.
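Here is a rough Python sketch of the idea, for anyone who wants to play along. It assumes each team’s score is an independent normal variable, and the function names are made up for illustration. One caveat: the overlap area and the chance that the lower-scoring team wins are not the same number (two identical curves overlap completely, yet either team wins only half the time), so the sketch computes both. The inputs are the averages and variances quoted later in this post.

```python
from math import sqrt
from statistics import NormalDist


def curve_overlap(mean_a, var_a, mean_b, var_b):
    """Area shared by the two normal curves (a value between 0 and 1)."""
    curve_a = NormalDist(mean_a, sqrt(var_a))
    curve_b = NormalDist(mean_b, sqrt(var_b))
    return curve_a.overlap(curve_b)


def upset_probability(mean_fav, var_fav, mean_dog, var_dog):
    """P(underdog outscores favorite), ASSUMING independent normal scores.

    The difference (underdog - favorite) is then normal with mean
    mean_dog - mean_fav and variance var_fav + var_dog.
    """
    diff = NormalDist(mean_dog - mean_fav, sqrt(var_fav + var_dog))
    return 1.0 - diff.cdf(0.0)


if __name__ == "__main__":
    # Duke: 77.4 PPG, variance 178.82; Butler: 68.9 PPG, variance 79
    print("overlap area:     ", round(curve_overlap(77.4, 178.82, 68.9, 79.0), 3))
    print("upset probability:", round(upset_probability(77.4, 178.82, 68.9, 79.0), 3))
```

Under these assumptions Butler’s chance of outscoring Duke comes out at roughly 30%, which is hardly a lock for Duke.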
So I tried to put this to use. I found the average points scored for both Duke (77.4) and Butler (68.9). Then I found the sample variances (178.82 and 79 respectively). What can be seen from this is that the number of points Duke scores varies more than Butler’s. So then I created the confidence intervals. With 99% confidence, I could say that Butler would score between 65 and 73 points, and Duke would score between 71 and 83 points.
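A minimal sketch of how an interval like this can be computed, assuming the usual normal (z-based) formula: mean plus or minus z times the square root of variance over games played. The post does not say how many games went into the averages, so `N_GAMES = 38` below is a hypothetical placeholder and not a figure from the data; the function name is also just for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(mean, variance, n_games, confidence=0.99):
    """z-based confidence interval for a team's average points per game."""
    # Two-sided critical value, e.g. about 2.576 for 99% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * sqrt(variance / n_games)
    return mean - half_width, mean + half_width


if __name__ == "__main__":
    N_GAMES = 38  # HYPOTHETICAL: the number of games is not given in the post
    print("Butler:", mean_confidence_interval(68.9, 79.0, N_GAMES))
    print("Duke:  ", mean_confidence_interval(77.4, 178.82, N_GAMES))
```

Keep in mind that this is an interval for a team’s long-run average, not for its score in any single game. A one-game score carries the full game-to-game variance, so it can land well outside a range like this.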
I then decided to calculate a confidence interval for the difference of the two means. The 99% confidence interval showed that Duke’s margin of victory would fall between 1 and 15 points (specifically 1.76 and 15.12, but you can’t score a fraction of a point). The range may seem somewhat large, but that is the result of trying to be as accurate as possible with such a small sample size. I also decided to see what the total points scored would be. The confidence interval was (139.65, 153.01).
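The same approach extends to the margin and the total, sketched below under the assumption that the two teams’ scores are independent. In that case the variances add for both the difference and the sum, which is why the two intervals above have exactly the same width (13.36 points). As before, the game count of 38 is a hypothetical placeholder, not a number from the post, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def margin_and_total_intervals(mean_a, var_a, mean_b, var_b, n_games,
                               confidence=0.99):
    """z-based intervals for (mean_a - mean_b) and (mean_a + mean_b).

    ASSUMES independent scores and the same number of games for both teams.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Variance of a difference or a sum of independent averages is the same
    half_width = z * sqrt(var_a / n_games + var_b / n_games)
    margin = mean_a - mean_b
    total = mean_a + mean_b
    return ((margin - half_width, margin + half_width),
            (total - half_width, total + half_width))


if __name__ == "__main__":
    N_GAMES = 38  # HYPOTHETICAL: the number of games is not given in the post
    margin_ci, total_ci = margin_and_total_intervals(
        77.4, 178.82, 68.9, 79.0, N_GAMES)
    print("Duke margin:", margin_ci)
    print("Total points:", total_ci)
```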
So let’s see how I did. Butler scored 59 points. Not quite in the confidence interval. Duke scored 61. Not even close. Duke won, and by 2 points. Well, I got those right. Total points scored was 120. Also, not even close. So overall I went 2 for 5. Pretty good for baseball, but terrible if I were trying to make some money off of this. I’m definitely going to have to tweak some factors if I want this to work. Moral of the story: there’s a reason why they play the game.
That’s all for now. I’m going to try to post a little more often now that the show is all but over, so stay tuned.
-BGram