One of the bigger problems when it comes to rational discussion today is that we don’t quite know what we are comparing.
Everybody has heard the cliché that “you cannot compare apples and oranges.” But guess what: we compare apples and oranges all the time. This usually happens when we go grocery shopping. We may compare them based upon price per ounce, vitamin C per ounce, or our own flavor preferences and the perceived quality of the fruit that day.
Of course, if we are buying ingredients to make an apple pie it would be “fruitless” to compare apples and oranges.
The important point here is to be very clear what we are comparing so as to evaluate whether the comparison is appropriate or not. I present below a number of common “comparison fallacies” in everyday discussion that often trip up individuals and lead them to poor conclusions. Learning to spot and avoid these fallacies can greatly help your decision-making process.
Facts don’t always lie at the center
Opinion-slingers often like to cite specific points of aggregated data to make their case. Usually this is the mean (average) or the median; sometimes it is another benchmark such as a particular percentile. The question becomes: how indicative is that single point of the other points in the distribution being compared?
The Oregonian recently headlined an article “Minimum wage workers can’t afford a typical 1-bedroom apartment in 31 Oregon counties.” Outrage ensued. How could our nation be so cruel as to not provide affordable housing to those with low incomes?
This news article relied upon a report by the National Low Income Housing Coalition that tried to calculate the affordability of “modest” housing, which is defined by the Department of Housing and Urban Development as “the 40th percentile of rents that a family can be expected to pay.” This measure represents a rent ten percentile points below the median monthly price of a rental unit.
Averages and specific percentile points (e.g., the 40th) contain important information, but how people are dispersed along the distribution is also critical to understanding the full picture. Statisticians are concerned both about measures of central tendency and dispersion (e.g., standard deviation), but too often we just put our attention on a single point of the distribution (the mean or a chosen percentile point in the report above).
Here we have a comparative argument regarding two different distributions: 1) the distribution of rental rates; and 2) the distribution of income. In the case of the former (rents), The Oregonian chose a baseline point that is near the middle of the distribution (the 40th percentile). Choosing this point as typical for modest housing seems reasonable and the NLIHC report is clear with this definition. So far, no problem.
But the critical question is whether a minimum wage earner is “typical” of workers in roughly the same way that the “typical” modest rental unit is typical of rentals. More specifically, do minimum wage workers sit at the 40th percentile of wage earners?
The answer is no. According to the Bureau of Labor Statistics’s 2017 Characteristics of Minimum Wage Workers, the percentage of the wage-earning population in the US earning the federal minimum is roughly 2.3% (or below the third percentile). Within the Pacific Region that includes Oregon, the share at or below the minimum wage is approximately 0.7% (see Table 2 in the linked report above). State and municipal minimum wage rates are almost always higher than the federal minimum, and individual states and counties may have higher percentages of minimum wage workers, but that percentage is nowhere near the 40th percentile of workers. Why would we expect someone in the third percentile of income to afford housing at the 40th percentile?
Thus, the argument put forth about a minimum wage worker being able to afford a “typical” (40th percentile) rental unit is misleading. Minimum wage workers are not likely to afford “typical” housing because they themselves are not “typical” of all wage earners based on the same point estimate used to define “typical” for housing. It is more plausible that minimum wage workers are acquiring housing that is in the lower percentiles of rental rates (perhaps the 10th percentile or less), or living with parents or roommates who earn more than the minimum wage. A more accurate argument would account for this. While affordable housing may still be a legitimate concern, the apparent crisis of minimum wage workers being unable to afford housing is much less severe than it initially appeared.
The important point here is that when somebody makes a comparison based upon measures of central tendency (e.g., the mean or median), or a specific point estimate (e.g., 40th percentile) it is important to ask what the dispersion of the distribution looks like. Averages are great summaries of data, but it is equally important to think about how the data are distributed when making comparisons.
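To make the pitfall concrete, here is a minimal sketch in Python. The numbers are made up for illustration (they are not the actual NLIHC or BLS figures); the point is that the same person can occupy wildly different percentile ranks in the two distributions being compared:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, illustrative data only -- not the NLIHC or BLS figures.
# Hourly wages: right-skewed, as wage distributions typically are.
wages = rng.lognormal(mean=3.2, sigma=0.5, size=100_000)
min_wage = 7.25  # federal minimum, used here purely for illustration

# Percentile rank of a minimum wage earner within this wage distribution.
wage_rank = (wages <= min_wage).mean() * 100

# Monthly rents: also right-skewed in this toy model.
rents = rng.lognormal(mean=7.0, sigma=0.4, size=100_000)
modest_rent = np.percentile(rents, 40)  # the "typical" (40th percentile) unit

print(f"Minimum wage earner: near the {wage_rank:.1f}th percentile of wages")
print(f"'Modest' housing: priced at the 40th percentile of rents "
      f"(about ${modest_rent:.0f}/mo)")
```

Comparing a bottom-percentile earner against a 40th-percentile rent is exactly the apples-to-oranges move the article describes; the code simply makes the two percentile ranks visible side by side.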
Learn to live with randomness
Life is filled with random variation. We often attribute extreme events to common, ongoing processes and assume the latter cause the former. When comparing a single event against long-term trends, we need to account for the normal variation that occurs in data.
Take tornadoes, for instance. The year 2011 was a particularly bad one for tornadoes, with over 80 category F3 (strong-to-violent) incidents recorded, some of them highly deadly, destructive, and newsworthy. Such devastation makes for scary headlines and helps activists raise calls to action. Not surprisingly, arguments appeared linking increased carbon emissions and climate change to the increase in tornadoes.
Alas, North America experienced a significantly lower rate of severe tornadoes, and tornadoes overall, in the subsequent years (with 2018 being on pace for one of the quietest seasons in decades). If there was an obvious causal connection between steadily-increasing atmospheric carbon, climate change, and tornadoes, we would expect to see a similar upward trend in tornadoes over time. Data from NOAA going back to the 1950s show no trend in increased or decreased tornado activity, but merely a great deal of random variation.
Yes, 2011’s tornado season was a “rare” event, but rare events do happen, albeit rarely. A once-in-a-century storm can be expected to occur roughly once every century. Attributing a single “record-breaking” day or season to a long-standing pattern is a poor way to argue, particularly if there is a high degree of random variability in the data.
This is not to say that climate change is irrelevant to long-term weather patterns; it may well be. The broader point is that arguments that blame episodic, “freak” events on long-term trends need to be approached cautiously. If over the course of twenty years we see ten or fifteen “once-in-a-century” storms, then it is time to worry and think more deeply about what’s causing this.
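A short simulation illustrates how much a flat, trendless process can vary. The long-run rate below is an assumed, hypothetical figure (not actual NOAA data): we draw sixty seasons of severe-tornado counts from a constant Poisson process and see that “record” seasons show up even though nothing is trending.

```python
import numpy as np

rng = np.random.default_rng(42)

# Sixty seasons of severe-tornado counts from a *flat* process:
# a Poisson distribution with a constant, hypothetical rate of 40 per season.
seasons = rng.poisson(lam=40, size=60)

mean_count = seasons.mean()
record = seasons.max()
quietest = seasons.min()

# Even with no trend at all, some seasons look alarmingly extreme
# and others look reassuringly calm -- pure chance in both cases.
print(f"long-run average: {mean_count:.1f} per season")
print(f"record season:    {record}")
print(f"quietest season:  {quietest}")
```

Running this with different seeds produces different “record” and “quietest” years, which is the article’s point: a single extreme season is weak evidence of a trend when the underlying process is this noisy.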
A simple solution for poor comparisons
I have pointed to two common comparative problems that occur in everyday argumentation. Is there any quick fix for avoiding such errors in analysis? The answer is yes. The easiest solution involves a simple question and a bit of intellectual creativity.
Whenever confronted with a data-based argument, it is important always to ask the simple question: Compared to what?
The next step is to imagine all the possible bases for comparing the data. Are we comparing averages when we should be comparing variances? Does our comparison account for random variation in the data? Are we overlooking evidence that is difficult to see, but still may be important? And are we making a comparison against a standard that can never be met instead of a trend that may be improving?
Having these questions in one’s intellectual toolbox will improve the quality of discussion and will improve your thinking in the process. In the final analysis, the best way to get to a solid answer is to ask the right questions.