The importance of statistics


Statistics tell us reality isn't what we think.

Many people, when confronted with odd occurrences, tend to wonder what mystical forces are at work that conspire to cause them.  Usually, these people don’t understand statistics.  If they did understand statistics, they’d realize there’s really nothing odd going on at all. Here’s an example.

A few days ago, I noticed something odd. In three consecutive observations of the odometer in my car, I saw these readings: 40000, 40040, and 40044.  It came to me that some people might wonder what this could mean.  After all, it’s one thing to see something odd once in a while; but to see three such cases in a row: surely, it must mean something…

I don’t believe it does, and I’ll prove it with simple statistics, the kind one hopefully learns in high school.  I claim that this sort of thing is perfectly normal and is not worthy of any special consideration.

Let’s work out the odds.  We want to know what the odds are of my looking at the odometer three times in a row and each time seeing a reading made up of only two distinct digits.

Let’s start by working out the odds of a single odometer reading being made up of only two distinct digits.

The odometer has a five-digit display.  The first digit can be any digit, so the probability of an acceptable digit in the first place is 1.  Since we’re considering the case of any two digits, the second digit can also be any digit; again, a probability of 1.

The other three digits can only be one of the two digits occurring in the first two places.  For each of those places, then, the odds are 2 in 10, or 0.2.

The probability that all five places of the odometer show only two digits is then the product of these probabilities.  Order is not important in multiplication, so it doesn’t matter which places have a probability of 1 and which have a probability of 0.2, so long as only two places have the former and the other three have the latter.

The probability of an odometer reading containing only two digits is thus 1 * 1 * 0.2 * 0.2 * 0.2 = 0.008 (or 1 in 125).  So of the 100,000 possible readings on my odometer, only 800 will be made up of only two digits.
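As a quick cross-check of the arithmetic above, here is a minimal sketch in Python. The function names are illustrative, not from any library. It computes the product estimate and then counts by brute force every five-place reading, leading zeros included, that uses at most two distinct digits. The two need not agree exactly, because the product treats the first two places as the ones that fix the two digits.

```python
def product_estimate(places=5):
    """Estimate from the text: two free places, the rest at 2 in 10 each."""
    return 1 * 1 * 0.2 ** (places - 2)


def count_two_digit_readings(places=5):
    """Brute-force count of readings that use at most two distinct digits."""
    total = 10 ** places
    return sum(
        1
        for reading in range(total)
        # Zero-pad so that, e.g., 44 is treated as the display "00044".
        if len(set(f"{reading:0{places}d}")) <= 2
    )


estimate = product_estimate()
print(f"product estimate: {estimate:.3f}, or about {round(estimate * 100_000)} of 100,000")
print(f"brute-force count: {count_two_digit_readings()} of 100,000")
```

The brute-force count comes out higher than the estimate (1,360 readings rather than 800), so the 1 in 125 used in the rest of this post is, if anything, conservative.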

This doesn’t answer our question, though, because we have not included the odds of our actually seeing the odometer when it’s in one of these 800 specific configurations.

I look at my odometer irregularly, but perhaps more often than many others, because the odometer in my car is just under the dashboard clock, and I look at the latter fairly often, especially when I’m commuting during rush hour.  Indeed, when the car was new and had only a few hundred kilometres on it, I often confused the time with the mileage.

There are different ways to do the math.  I’m going to start by considering how often an odd odometer reading might occur in an average day.  Some mental math informs me I drive about 30 kilometres a day (including weekends).  So in one day, I could see 30 different odometer readings.

I’m going to assume that odd odometer readings are scattered evenly over the range of possible odometer readings.  This obviously isn’t true.  (Example: 40040 and 40044 are only 4 readings apart, but the next odd reading after those would be 40400, which is quite “far” from 40044.)  Still, over all 100,000 possible readings, I think it’s reasonable to think that the differences will even out in the long run.
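To get a feel for how unevenly these readings are spaced, here is a small sketch in Python. The helper names are illustrative, and “special” is taken to mean at most two distinct digits on a five-place display. It steps forward from a reading to the next special one.

```python
def is_special(reading, places=5):
    """True if the zero-padded reading uses at most two distinct digits."""
    return len(set(f"{reading:0{places}d}")) <= 2


def next_special(reading, places=5):
    """Return the first special reading after `reading`, wrapping at rollover."""
    limit = 10 ** places
    candidate = (reading + 1) % limit
    while not is_special(candidate, places):
        candidate = (candidate + 1) % limit
    return candidate


for start in (40040, 40044):
    nxt = next_special(start)
    print(f"{start} -> {nxt} (gap of {nxt - start})")
```

Run on the two readings from the example, it shows a gap of 4 followed by a gap of 356, which is the kind of clumping the even-scatter assumption smooths over.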

To see every one of the 30 readings in a day, I’d have to look at the odometer 30 times.  But I don’t.  Having thought about it, I think I look at the odometer about 10 times a day (remember, my odometer is right by the clock).  That means I’ll only see one third of the possible readings, which cuts the odds to a third of what they were: we’re now at 1 in 375.

If the odds of my seeing one of the 800 special readings is 1 in 375, then the odds of seeing three such readings in a row is 1/375 cubed – 1 in 52,734,375.
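The chain of fractions in the last few paragraphs is easy to verify with exact arithmetic. A minimal sketch in Python, using the standard-library `fractions` module (the variable names are illustrative):

```python
from fractions import Fraction

p_special = Fraction(1, 125)      # chance a reading is "special", from the text
share_seen = Fraction(10, 30)     # 10 looks a day out of 30 possible readings
p_seen = p_special * share_seen   # chance of actually seeing a special reading
p_three_in_a_row = p_seen ** 3    # three such sightings in a row

print(p_seen)                     # 1/375
print(p_three_in_a_row)           # 1/52734375
print(f"{float(p_three_in_a_row):.1e}")
```

Using `Fraction` rather than floats keeps the "1 in N" form intact, so the result can be compared directly with the figures in the text.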

Now, that may seem like a very small number – about 0.0000019% – but we’ve forgotten one very important thing: how many times I look at the odometer.

Ten times a day, 365 days a year: that’s 3,650 observations a year.  I’ve been driving since I was 18, not as often as I do now, but nearly so.  Call it about 25 years.  That’s 91,250 odometer readings.

This increases the odds to about 0.0017, or a little under 0.2%.  If we assume I still have a good 15 years of driving left in me, the odds go up to nearly 0.3%.

A chance of nearly 0.3% that I’ll see that peculiar configuration of odometer readings in my lifetime.  That’s not bad.  That’s like saying that about one in 360 people will see this configuration of odometer readings.  Even if you only count those of us who drive, we still have millions of people all over the world who will see the same kind of weird odometer readings that I saw.
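The scaling step can be sketched in a few lines of Python. This is only an illustration: the function name and its parameters are made up for the example, and it follows the text in simply multiplying the number of looks by the chance of a run of three, which is a reasonable approximation only while the result stays small.

```python
from fractions import Fraction

# Chance of three special sightings in a row, from the text: 1 in 52,734,375.
P_TRIPLE = Fraction(1, 375) ** 3


def lifetime_chance(years, looks_per_day=10, days_per_year=365):
    """Approximate chance of seeing a run of three over `years` of driving."""
    looks = looks_per_day * days_per_year * years
    return looks * P_TRIPLE


for years in (25, 40):
    chance = lifetime_chance(years)
    print(f"{years} years: {float(chance):.2%} (about 1 in {round(1 / chance)})")
```

Changing `looks_per_day` or `years` shows how sensitive the final figure is to the guesses that went into it.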

Remember: the odds of being hit by lightning once in your life (if you live in the USA) are about 1 in 3,000 (1).

So, is it weird to see peculiar odometer readings? Not at all. I would argue that most of the things we see that we think are weird, aren’t weird at all.  And it’s statistics that tell us why.

