Bad Math

Chris Burand
Nov 12, 2020

The COVID-19 pandemic has revealed that a large proportion of the people who cite numbers when writing about the virus are mathematically challenged at best and completely incompetent at worst. Please bear in mind, this is not an accusation and therefore not mudslinging. This is just an obvious fact.

For example, the number of articles and news stories I've seen where the writer did not know the difference between a "percentage" and a "percentage point" has been almost infinite. A journalist should not be allowed to write an article involving percentages without knowing that difference, but it does not stop there. I have even read reports by scientists who don't seem to know the difference between the two. This lack of knowledge results in hugely mistaken conclusions. For example, if a number goes from 20% to 25%, this is a five percentage point increase. However, it is a twenty-five percent increase. Confusing five and twenty-five is huge! Also, going from 25% back down to 20% is only a twenty percent decrease. Going up and then down by the same number of points does not result in the same percentage change, because the base is different each time, but the percentage point change on an absolute basis is the same.
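The arithmetic behind the 20% and 25% example can be sketched in a few lines of Python. The function names here are my own, chosen only for illustration.

```python
def point_change(old_pct, new_pct):
    """Absolute change, measured in percentage points."""
    return new_pct - old_pct


def percent_change(old_pct, new_pct):
    """Relative change, measured as a percent of the starting value."""
    return (new_pct - old_pct) * 100 / old_pct


# Going up from 20% to 25%
up_points = point_change(20, 25)        # 5 percentage points
up_percent = percent_change(20, 25)     # 25 percent

# Going back down from 25% to 20%
down_points = point_change(25, 20)      # -5 percentage points
down_percent = percent_change(25, 20)   # -20 percent

print(f"Up:   {up_points:+} points, {up_percent:+.0f} percent")
print(f"Down: {down_points:+} points, {down_percent:+.0f} percent")
```

The point change is five either way. The percent change differs because it is divided by a different starting value each time.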

Lack of context is another mathematical issue. Since math is objective while context is subjective, when reporting percentages, reasonable context is usually required to clarify your point. For example, a claim that "the virus danger has doubled!" grabs headlines. But if the increase is from .1% to .2%, the doubling is pretty much meaningless in real terms. A death rate that increases 50% when the base is .005 (.5%) means an increase to .0075 (.75%). The number .0075 is seventy-five ten-thousandths, i.e., a rounding error.
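To put those relative increases into absolute terms, here is a minimal sketch. The helper names and the per-100,000 framing are my own assumptions for illustration. The rates are the ones used above.

```python
def apply_relative_increase(base_rate, percent_increase):
    """Return the new rate after a relative (percent) increase."""
    return base_rate * (1 + percent_increase / 100)


def cases_per_100k(rate):
    """Express a rate given as a fraction (0.001 means .1%) per 100,000 people."""
    return rate * 100_000


# "Doubled": going from .1% to .2% is a 100% relative increase.
before = 0.001
after = apply_relative_increase(before, 100)
extra_cases = cases_per_100k(after) - cases_per_100k(before)

# A 50% increase on a base of .005 (.5%).
new_death_rate = apply_relative_increase(0.005, 50)
point_increase = (new_death_rate - 0.005) * 100  # in percentage points

print(f"Doubling adds about {extra_cases:.0f} cases per 100,000 people")
print(f"New rate {new_death_rate:.4f}, up {point_increase:.2f} percentage points")
```

The relative change makes the headline. The absolute change is what tells you how much actually moved.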

This issue is seen with cancer statistics all the time. No one sells articles with headlines like, "Eating whatever increases your chance of developing cancer from 5.4% to 5.9%." Instead, the headline reads, "Eating whatever increases your chance of developing cancer more than 9%!" A .5 percentage point increase does not scare but a 9% increase may scare. Alarming headlines sell papers and advertising.

It's pretty easy to use the above example because it is simple and widely misused by inept or sometimes manipulative people. Often the only solution to get to the root of the figures is to understand math yourself because inept and manipulative people are not going to explain it to you.

I see insurance consultants and carriers make the same errors over and over. I read an advertisement from a retention consultant claiming to increase retention by 2%. That does not seem like much to me. However, I think they meant they could increase retention from 90% to 92%, or two full points, which is impressive.

A more common error in the insurance industry is to use averages blindly. The average commercial auto liability combined ratio is around 108%. That seems awful, but the use of an average is misleading. It is not the right metric. If the three carriers that are completely incompetent at writing commercial auto (a combined five-year LOSS ratio of around 130%) are excluded, then the average decreases to profitable levels. The average makes it seem like the entire line is unprofitable when in reality the line is plenty profitable -- if the incompetent carriers are eliminated. The measure that tells the story more accurately at the most basic level is the median. Other metrics, such as the standard deviation, paint the picture with even more clarity.
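Here is a small sketch of how a few bad results drag an average. The ten combined ratios below are invented purely for illustration and are not actual carrier data. It also uses a simple unweighted average, which is an assumption on my part, since an industry figure may well be premium-weighted.

```python
from statistics import mean, median, stdev

# HYPOTHETICAL combined ratios (%) for ten carriers, invented for illustration.
# Seven carriers are profitable and three are badly unprofitable.
combined_ratios = [94, 95, 96, 96, 97, 98, 99, 128, 130, 132]

average = mean(combined_ratios)    # pulled upward by the three worst carriers
middle = median(combined_ratios)   # the typical carrier
spread = stdev(combined_ratios)    # how widely the results are scattered

# Drop the three worst carriers and average the rest.
without_worst_three = sorted(combined_ratios)[:-3]
average_without_worst = mean(without_worst_three)

print(f"Average of all ten:        {average:.1f}")
print(f"Median of all ten:         {middle:.1f}")
print(f"Standard deviation:        {spread:.1f}")
print(f"Average without worst 3:   {average_without_worst:.1f}")
```

With these made-up numbers the average sits above 100 while the median sits below it, so the average alone would suggest an unprofitable line even though seven of the ten carriers are profitable.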

Using averages incorrectly is an epidemic in this industry. A benchmarking firm published average sales by category. When all the averages were totaled, the total exceeded actual revenues by a material amount. Obviously, an impossibility exists when producers generate more sales for an agency than the agency records. The data was correct, but the wrong metric, an average, was used.

To know when using an average as a metric is applicable, a person must know more about the data in general and the distribution of the data in particular. If a person does not know this, there is no way to know if an average is even applicable.

A different benchmarking firm advises carriers as to whether their contingencies are higher or lower than their competitors'. They base their conclusions on the contingency-to-premium ratio. This inherently assumes every single carrier, every single year, has exactly the same loss ratio and growth rate (among other assumptions). Contingency calculations are usually based on at least three major variables and often up to five major variables. Therefore, an average that only considers two variables will be as right as a broken clock.

Furthermore, averages are usually just not that important. There are times when we all must use them for various benchmarking purposes, but for a truly accurate analysis and opportunity identification, averages are not usually all that applicable. The reason is that in the real business world, distribution does not typically follow a normal curve. The normal curve applies better in the natural world. For example, distribution of height follows a normal curve and therefore, average height is useful. In the business world, performance does not follow the normal curve all that often. Instead, performance follows a Pareto curve. A great current example is the S&P 500. There are 500 firms, but five firms generate approximately 25% of the total value. The results are skewed. Applying average values undervalues those five firms significantly and overvalues the other 495 firms.
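A rough sketch of that skew follows. The firm values are invented so that five firms hold exactly 25% of the total, and they are not actual S&P 500 data. The `top_share` helper is my own name for illustration.

```python
from statistics import mean, median


def top_share(values, k):
    """Fraction of the total held by the k largest values."""
    ranked = sorted(values, reverse=True)
    return sum(ranked[:k]) / sum(ranked)


# HYPOTHETICAL values: 5 very large firms and 495 small ones.
# The numbers are chosen only so the top five hold 25% of the total.
firm_values = [99] * 5 + [3] * 495

share_of_top_five = top_share(firm_values, 5)
average_value = mean(firm_values)
median_value = median(firm_values)

print(f"Top five share of total: {share_of_top_five:.0%}")
print(f"Average firm value:      {average_value:.2f}")
print(f"Median firm value:       {median_value:.2f}")
```

In this made-up set the average is 3.96. That is higher than the value of every one of the 495 small firms and a tiny fraction of the value of each of the five large ones, so it describes nobody.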

If you do not know the distribution of your samples, don't assume an average is applicable. People, writers, and consultants who don't know the difference should not use numbers in this context.

Another example of a misused metric is cause and effect. Below is a chart that shows how cities caused farmers to quit farming. The correlation is about perfect.

Even when using correlations correctly, drawing correct cause-and-effect conclusions is beyond many individuals' abilities. Excel and scatter graphs have made it easy to create impressive charts. My personal opinion is that unless a person has taken adequate statistics classes, they should not be allowed to use this part of Excel. Too many people think that because they can create a graph they can accurately interpret the data.

I recently saw a highly educated insurance person make this mistake. He created a scatter graph where the R² was .22. The audience did not know much about statistics, so they were enthralled. I have a random number generator that produces numbers showing an R² of .22 and sometimes higher. In other words, the correlation shown on that scatter graph was pretty much meaningless. Any conclusions drawn were likely based on random numbers. How assured of their conclusion should someone be if their conclusion is based on a random outcome?
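Anyone can try this with a few lines of Python. The sketch below is not the generator I mentioned, just a minimal stand-in. It assumes a straight-line fit (where R² is the squared correlation) and only ten points per chart. The number of points is my assumption, and the answer depends heavily on it.

```python
import random


def r_squared(xs, ys):
    """R-squared of a straight-line fit, i.e. the squared Pearson correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def share_of_random_charts_at_or_above(threshold, points, trials, seed=0):
    """Share of purely random scatter charts whose R-squared meets the threshold."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # x and y are unrelated random numbers: there is nothing to find.
        xs = [rng.random() for _ in range(points)]
        ys = [rng.random() for _ in range(points)]
        if r_squared(xs, ys) >= threshold:
            hits += 1
    return hits / trials


# ASSUMPTION: 10 points per chart. The chart in question may have had more or fewer.
few_points = share_of_random_charts_at_or_above(0.22, points=10, trials=2000)
many_points = share_of_random_charts_at_or_above(0.22, points=200, trials=200)

print(f"Random charts with R-squared >= .22 using 10 points:  {few_points:.1%}")
print(f"Random charts with R-squared >= .22 using 200 points: {many_points:.1%}")
```

With only a handful of points, a noticeable share of completely random charts reach .22. With a couple hundred points it essentially stops happening, which is why the number of data points behind a chart matters as much as the R² printed on it.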

The world needs people who actually understand math and statistics in order to navigate successfully through the pandemic. That way, the conclusions drawn will be based on solid correlations and differences exceeding rounding errors. Insurance agencies and carriers need leaders and advisors who understand math and statistics well to identify what actually drives success. Otherwise, luck will play an excessive role. Would you rather control your success or gamble for your success?

NOTE: The information provided herein is intended for educational and informational purposes only and it represents only the views of the authors. It is not a recommendation that a particular course of action be followed. Burand & Associates, LLC and Chris Burand assume, and will have, no responsibility for liability or damage which may result from the use of any of this information.