Dec 10, 2009

Lies, Damned Lies, and Statistics

Statistics are often used as a vehicle for proving something. You see them every day and in innumerable forms, such as opinion polls, news stories, or weather forecasts. Many people use the statistics they hear to form opinions, allocate funds, and plan their lives. In fact, we here at HippoCampus are 99.9% sure you’re looking at a statistic right now. But what do these numbers really mean? Are they total truth, total lies, or somewhere in the median? Read on (if you’re in the mode).

The practice of manipulating statistical data to elicit a predetermined reaction is commonplace in every corner of the worldwide media. Sometimes it is the person reporting the data who does this, and sometimes it is the researcher who produced the data. Either way, it is important to be able to spot it when it happens.

Let’s use an example. Say we wanted to know how “green” a certain car manufacturer is, so we decide to find the average fuel mileage for the cars that it produces. There are a total of six different models: one gets 12 MPG, two get 14, one gets 16, one gets 20, and one, an ultra-high-efficiency hybrid that’s powered by veggie-diesel, solar panels, and bad puns, gets 200 MPG. Using this data set (12, 14, 14, 16, 20, 200), we can find the average using several different methods, the most common being the mean, median, and mode.

Let’s find the median and mode first. The median is essentially the “middle value” of the sorted list. Our data set has an even number of values, so there is no single middle value; instead, the median is the number halfway between the two middle values, 14 and 16, which is 15. The mode is the number that occurs most frequently in the list. In our data, this is 14, as it is the only value that occurs more than once.
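For readers who like to check this sort of thing by machine, here is a minimal Python sketch of the median and mode calculations described above. The names `mpg`, `median`, and `mode` are our own choices for illustration, and this simple `mode` assumes there is one clear winner, as there is in our data.

```python
from collections import Counter

# Fuel mileage (MPG) for the six models in the example.
mpg = [12, 14, 14, 16, 20, 200]

def median(values):
    """Return the middle value, or the midpoint of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Even count: average the two values on either side of the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2

def mode(values):
    """Return the most frequent value (assumes a single clear winner)."""
    return Counter(values).most_common(1)[0][0]

print(median(mpg))  # 15.0
print(mode(mpg))    # 14
```

Python’s standard `statistics` module also provides `median` and `mode` functions that would do the same job; they are written out by hand here only to make the steps visible.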

Now let’s find the mean. This is the method most people are familiar with when it comes to finding an average. To calculate the mean, add together all the values in the data set and divide by the number of values in the set:

12+14+14+16+20+200 = 276

276/6 = 46
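The same arithmetic can be sketched in a few lines of Python. The variable names here (`mpg`, `total`, `mean`) are ours, chosen just for illustration.

```python
# Fuel mileage (MPG) for the six models in the example.
mpg = [12, 14, 14, 16, 20, 200]

total = sum(mpg)         # add together all the values
mean = total / len(mpg)  # divide by the number of values

print(total)  # 276
print(mean)   # 46.0
```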

Notice that depending on the method we use, we can get vastly different numbers: 14, 15, or 46. Now suppose the car manufacturer wanted to market itself as environmentally friendly and fuel efficient. Which of the preceding methods do you think it would use in its press releases and advertisements about average fuel economy? It would not be difficult to misrepresent the mean of 46 MPG as “evidence” that all of the cars produced by the manufacturer are fuel-efficient, even though five of the six models get 20 MPG or less.
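To see just how much work that one hybrid is doing, here is a small Python sketch (again, the variable names are our own) that counts how many models actually reach the 46 MPG mean and recomputes the mean with the hybrid left out.

```python
# Fuel mileage (MPG) for the six models in the example.
mpg = [12, 14, 14, 16, 20, 200]

mean = sum(mpg) / len(mpg)

# Which models actually get the "average" mileage or better?
at_or_above_mean = [m for m in mpg if m >= mean]

# What does the mean look like without the 200 MPG hybrid?
without_hybrid = [m for m in mpg if m != 200]
mean_without_hybrid = sum(without_hybrid) / len(without_hybrid)

print(at_or_above_mean)     # [200]
print(mean_without_hybrid)  # 15.2
```

Only one of the six models meets the advertised average, and dropping it pulls the mean down to roughly where the median and mode already were.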

This is, of course, just one way that the unyielding veracity of numbers can be bent to a certain purpose. Look for more on this topic in future posts!
