A Field Guide to Lies: Critical Thinking in the Information Age
Authors: Daniel J. Levitin
When the Probability of Events Is Informed by Other Events
The multiplication rule only applies if the events are independent of one another. What events are not independent? The weather, for example. Freezing tonight and freezing tomorrow night are not independent events: weather patterns tend to remain for more than one day, and although freak freezes are known to occur, your best bet about tomorrow’s overnight temperatures is to look at today’s. You could calculate the number of nights in the year in which temperatures drop below freezing (let’s say it’s thirty-six where you live) and then state that the probability of a freeze tonight is 36 out of 365, or roughly 10 percent, but that doesn’t take the dependencies into account. If you say that the probability of it freezing two nights in a row during winter is 10% × 10% = 1% (following the multiplication rule), you’d be underestimating the probability because the two nights’ events are not independent; tomorrow’s weather forecast is informed by today’s.
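The gap between the two calculations can be sketched in a few lines of Python. The 36-in-365 figure comes from the passage above; the value 0.6 for the chance of a freeze tomorrow given a freeze tonight is purely an illustrative assumption, not a measured number.

```python
# Figure from the text: 36 freezing nights out of 365.
p_freeze = 36 / 365

# Multiplication rule, valid only if the two nights are independent.
p_two_nights_independent = p_freeze * p_freeze

# ASSUMPTION (illustrative, not from the text): if it freezes tonight,
# the probability that it also freezes tomorrow night is 0.6.
p_freeze_tomorrow_given_tonight = 0.6

# General rule for dependent events: P(A and B) = P(A) * P(B | A).
p_two_nights_dependent = p_freeze * p_freeze_tomorrow_given_tonight

print(f"P(freeze on a given night):        {p_freeze:.3f}")
print(f"Two nights, assuming independence: {p_two_nights_independent:.4f}")
print(f"Two nights, with assumed dependence: {p_two_nights_dependent:.4f}")
```

Under any assumption in which a freeze tonight makes a freeze tomorrow more likely than the base rate, the dependent figure exceeds the naive 1 percent.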
The probability of an event can also be informed by the particular sample that you’re looking at. The probability of it freezing tonight is obviously affected by the area of the world you’re talking about. That probability is higher at the forty-fourth parallel than the tenth. The probability of finding someone over six foot six is greater if you’re looking at a basketball practice than at a tavern frequented by jockeys. The subgroup of people or things you’re looking at is relevant to your probability estimate.
Conditional Probabilities
Often when looking at statistical claims, we’re led astray by examining an entire group of random people when we really should be looking at a subgroup. What is the probability that you have pneumonia? Not very high. But if we know more about you and your particular case, the probability may be higher or lower. This is known as a conditional probability.
We frame two different questions:
The second question involves a conditional probability. It’s called that because we’re not looking at every possible condition, only those people who match the condition specified. Without running through the numbers, we can guess that the probability of pneumonia is greater in the second case. Of course, we can frame the question so that the probability of having pneumonia is lower than for a person drawn at random:
Along the same lines, the probability of you developing lung cancer is not independent of your family history. The probability of a waiter bringing ketchup to your table is not independent of what you ordered. You can calculate the probability of any person selected at random developing lung cancer in the next ten years, or the probability of a waiter bringing ketchup to a table calculated over all tables. But we’re in the lucky position of knowing that these events are dependent on other behaviors. This allows us to narrow the population we’re studying in order to obtain a more accurate estimate. For example, if your father and mother both had lung cancer, you want to calculate the probability of you contracting lung cancer by looking at other people in this select group, people whose parents had lung cancer. If your parents didn’t have lung cancer, you want to look at the relevant subgroup of people who lack a family history of it (and you’ll likely come up with a different figure). If you want to know the probability that your waiter will bring you ketchup, you might look at only the tables of those patrons who ordered hamburgers or fries, not those who ordered tuna tartare or apple pie.
Ignoring the dependence of events (assuming independence) can have serious consequences in the legal world. One was the case of Sally Clark, a woman from Essex, U.K., who stood trial for murdering her second child. Her first child had died in infancy, and his death had been attributed to SIDS (sudden infant death syndrome, or crib death). The prosecutors argued that the odds of having two children die of SIDS were so low that she must have murdered the second child. The prosecution’s witness, a pediatrician named Dr. Meadow, cited a study that said SIDS occurred in 1 out of 8,543 infant deaths. (Dr. Meadow’s expertise in pediatrics does not make him an expert statistician or epidemiologist. This sort of confusion is the basis for many faulty judgments and is discussed in Part 3 of this book; an expert in one domain is not automatically an expert in another, seemingly related, domain.)
Digging deeper, we might question the figure of 8,543 deaths. How do they know that? SIDS is a diagnosis of exclusion—that is, there is no test that medical personnel can perform to conclude a death was by SIDS. Rather, if doctors are not able to find the cause, and they’ve ruled out everything else, they label it SIDS. Not being able to find something is not proof that it didn’t occur, so it is plausible that some of the deaths attributed to SIDS were actually the result of less mysterious causes, such as poisoning, suffocation, heart defect, etc.
For the sake of argument, however, let’s assume that SIDS is the cause of 1 out of 8,543 infant deaths as the expert witness testified. He further testified that the odds of two SIDS deaths occurring in the same family were 1/8,543 × 1/8,543, or 1 in 73 million. (“Coincidence? I think not!” the prosecutor might have shouted during his summation.) This calculation, an application of the multiplication rule, assumes the deaths are independent, but they might not be. Whatever caused Mrs. Clark’s first child to die suddenly might be present for both children by virtue of them being in the same household: Two environmental factors associated with SIDS are secondhand smoke and putting a baby to sleep on its stomach. Or perhaps the first child suffered from a congenital defect of some sort; this would have a relatively high probability of appearing in the second child’s genome (siblings share 50 percent of their DNA). By this way of thinking, there was a 50 percent chance that the second child would die due to a factor such as this, and so now Mrs. Clark looks a lot less like a child murderer. Eventually, her husband found evidence in the hospital archives that the second child’s death had a microbiological cause. Mrs. Clark was acquitted, but only after serving three years in prison for a crime she didn’t commit.
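The expert’s arithmetic, and how sensitive it is to the independence assumption, can be checked with a short Python sketch. The 1-in-8,543 figure is the one quoted in the testimony; the `relative_risk` of 10 is a hypothetical number chosen only to show the effect of dependence and is not an estimate from the case or from any study.

```python
from fractions import Fraction

# Figure quoted by the expert witness.
p_sids = Fraction(1, 8543)

# Multiplication rule, which assumes the two deaths are independent.
p_two_independent = p_sids * p_sids
print(f"Independent: 1 in {p_two_independent.denominator:,}")  # 1 in 72,982,849

# ASSUMPTION (hypothetical, for illustration only): shared household or
# genetic factors make a second SIDS death 10 times as likely once a
# first has occurred.
relative_risk = 10
p_second_given_first = p_sids * relative_risk

# Dependent events: P(first and second) = P(first) * P(second | first).
p_two_dependent = p_sids * p_second_given_first
print(f"Dependent:   1 in {float(1 / p_two_dependent):,.0f}")
```

Whatever the true degree of dependence, any factor that raises the second probability above the base rate shrinks the headline “1 in 73 million” by the same factor.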
There’s a special notation for conditional probabilities. The probability of a waiter bringing you ketchup, given that you just ordered a hamburger, is written:
P(ketchup | hamburger)
where the vertical bar | is read as “given.” Note that this notation leaves out a lot of the words from the English-language description, so that the mathematical expression is succinct.
The probability of a waiter bringing you ketchup, given that you just ordered a hamburger and you asked for the ketchup, is noted:
P(ketchup | hamburger ∧ asked)
where the ∧ is read as “and.”
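As a supplement to the notation, the standard textbook definition of conditional probability ties it to ordinary probabilities. The symbols A and B are generic placeholders introduced here, not taken from the passage:

```latex
P(A \mid B) = \frac{P(A \land B)}{P(B)}, \qquad P(B) > 0

% Applied to the running example:
P(\text{ketchup} \mid \text{hamburger})
  = \frac{P(\text{ketchup} \land \text{hamburger})}{P(\text{hamburger})}
```

In words: restrict attention to the cases where B happened, then ask what fraction of those cases also include A.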
Visualizing Conditional Probabilities
The relative incidence of pneumonia in the United States in one year is around 2 percent: six million people out of the 324 million in the country are diagnosed each year (of course there are no doubt many undiagnosed cases, as well as individuals who may have more than one case in a year, but let’s ignore these details for now). Therefore the probability of any person drawn at random having pneumonia is approximately 2 percent. But we can home in on a better estimate if we know something about that particular person. If you show up at the doctor’s office with coughing, congestion, and a fever, you’re no longer a person drawn at random; you’re someone in a doctor’s office showing these symptoms. You can methodically update your belief that something is true (that you have pneumonia) in light of new evidence. We do this by applying Bayes’s rule to calculate a conditional probability: What is the probability that I have pneumonia given that I show symptom x? This kind of updating can become increasingly refined the more information you have. What is the probability that I have pneumonia given that I have these symptoms, and given that I have a family history of it, and given that I just spent three days with someone who has it? The probabilities climb higher and higher.
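The updating described here can be sketched in Python. Only the prior (six million out of 324 million) comes from the text. The four likelihood values are made-up illustrative assumptions, and the second update additionally assumes that the two pieces of evidence are independent of each other once pneumonia status is known; the function name `bayes_posterior` is likewise just a label for this sketch.

```python
def bayes_posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Return P(H | evidence) using Bayes's rule."""
    numerator = p_evidence_given_h * prior
    denominator = numerator + p_evidence_given_not_h * (1 - prior)
    return numerator / denominator

# Prior from the text: 6 million diagnoses among 324 million people.
prior = 6 / 324

# ASSUMPTIONS (illustrative only, not real clinical figures):
p_symptoms_given_pneumonia = 0.90
p_symptoms_given_no_pneumonia = 0.10
p_exposure_given_pneumonia = 0.30
p_exposure_given_no_pneumonia = 0.05

# First update: you show the symptoms.
after_symptoms = bayes_posterior(
    prior, p_symptoms_given_pneumonia, p_symptoms_given_no_pneumonia
)

# Second update: you also spent time with someone who has pneumonia.
# The previous posterior becomes the new prior.
after_exposure = bayes_posterior(
    after_symptoms, p_exposure_given_pneumonia, p_exposure_given_no_pneumonia
)

print(f"Prior:                   {prior:.3f}")
print(f"After symptoms:          {after_symptoms:.3f}")
print(f"After symptoms+exposure: {after_exposure:.3f}")
```

With these made-up likelihoods, each new piece of evidence raises the estimate, which is the pattern the passage describes; different inputs would give different numbers.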
You can calculate the probabilities using the formula for Bayes’s rule (found in the Appendix), but an easy way to visualize and compute conditional probabilities is with the fourfold table, describing all possible scenarios: You did or didn’t order a hamburger, and you did or didn’t receive ketchup:
|                  |     | Ordered Hamburger |    |
|                  |     | YES               | NO |
| Received Ketchup | YES |                   |    |
|                  | NO  |                   |    |
Then, based on experiments and observation, you fill in the various values, that is, the frequencies of each event. Out of sixteen customers you observed at a restaurant, there was one instance of someone ordering a hamburger with which they received ketchup, and two instances with which they didn’t. These become entries in the left-hand column of the table:
|                  |     | Ordered Hamburger |    |
|                  |     | YES               | NO |
| Received Ketchup | YES | 1                 | 5  |
|                  | NO  | 2                 | 8  |
Similarly, you found that five people who didn’t order a hamburger received ketchup, and eight people did not. These are the entries in the right-hand column.
Next, you sum the rows and columns:
|                  |       | Ordered Hamburger |    |       |
|                  |       | YES               | NO | Total |
| Received Ketchup | YES   | 1                 | 5  | 6     |
|                  | NO    | 2                 | 8  | 10    |
|                  | Total | 3                 | 13 | 16    |
Now, calculating the probabilities is easy. If you want to know the probability that you received ketchup given that you ordered a hamburger, you start with the given. That’s the left-hand vertical column.
|                  |       | Ordered Hamburger |    |       |
|                  |       | YES               | NO | Total |
| Received Ketchup | YES   | 1                 | 5  | 6     |
|                  | NO    | 2                 | 8  | 10    |
|                  | Total | 3                 | 13 | 16    |
Three people ordered hamburgers altogether; that’s the total at the bottom of the column. Now what is the probability of receiving ketchup given you ordered a hamburger? We look now at the “YES received ketchup” square in the “YES ordered hamburger” column, and that number is 1. The conditional probability, P(ketchup | hamburger), is then just one out of three. And you can visualize the logic: three people ordered a hamburger; one of them got ketchup and two didn’t. We ignore the right-hand column for this calculation.
We can use this to calculate any conditional probability, including the probability of receiving ketchup if you didn’t order a hamburger: Thirteen people didn’t order a hamburger, five of them got ketchup, so the probability is five out of thirteen, or about 38 percent. In this particular restaurant, you’re more likely to get ketchup if you didn’t order a hamburger than if you did. (Now fire up your critical thinking. How could this be? Maybe the data are driven by people who ordered fries. Maybe all the hamburgers served already have ketchup on them.)
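The fourfold-table procedure can be sketched in Python using the sixteen observations from the restaurant example. The dictionary layout and the function name `p_ketchup_given` are choices made for this illustration only.

```python
from fractions import Fraction

# Keys are (received_ketchup, ordered_hamburger); values are the
# observed counts from the fourfold table.
counts = {
    (True, True): 1,    # ketchup, hamburger
    (True, False): 5,   # ketchup, no hamburger
    (False, True): 2,   # no ketchup, hamburger
    (False, False): 8,  # no ketchup, no hamburger
}

total = sum(counts.values())

def p_ketchup_given(ordered_hamburger):
    """P(ketchup | hamburger status): look only at the matching column."""
    column_total = sum(
        n for (_, hamburger), n in counts.items()
        if hamburger == ordered_hamburger
    )
    return Fraction(counts[(True, ordered_hamburger)], column_total)

p_given_burger = p_ketchup_given(True)
p_given_no_burger = p_ketchup_given(False)

print(f"Customers observed: {total}")
print(f"P(ketchup | hamburger)    = {p_given_burger} = {float(p_given_burger):.0%}")
print(f"P(ketchup | no hamburger) = {p_given_no_burger} = {float(p_given_no_burger):.0%}")
```

Conditioning amounts to discarding every column except the one named after the bar, then dividing the cell of interest by that column’s total.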