Author: Stephen Baker
As Google evolves, Norvig said, it will start to replicate some of Watson's headier maneuvers, combining data from different sources. “If someone wants per capita income in a certain country, or in a list of countries, we might bring two tables together,” he said. For that, though, the company might require more detailed queries. “It might get to the point where we ask users to elaborate, and to write entire sentences,” he said. In effect, the computer will be demanding something closer to a Jeopardy clue, albeit with fewer puns and riddles.
This would represent a turnaround. For more than a decade, the world's Web surfers have learned to hone their queries. In effect, they've used their human smarts to reverse-engineer Google's algorithms, and to understand how a search engine “thinks.” Each word summons a universe of connections. Picture each one as a circle in a Venn diagram: the goal is to organize three or four words (3.5 is the global average) whose circles have the smallest possible overlap. For many, this analysis has become almost reflexive. Yet as the computer gets smarter, these sophisticated users stand to get poorer results than those who type long sentences, even paragraphs, and treat the computer as if it were human.
And many of the computer systems showing up in our lives will have a far more human touch than Watson. In fact, some of the most brilliant minds in AI are focusing on engineering systems whose very purpose is to leech intelligence from people. Luis Von Ahn, a professor at Carnegie Mellon, is perhaps the world's leader in this field. As he explains it, “For the first time in history, we can get one hundred or two hundred million people all working on a project together. If we can use their brains for even ten or fifteen seconds, we can create lots of value.” To this end, he has dreamed up online games to attract what he calls brain cycles. In one of them, the ESP game, two Web surfers who don't know each other are shown an image. If they type in the same word to describe it, another image pops up. They race ahead, trying to match descriptions and finish fifteen images in two and a half minutes. While they play, they're tagging photographs with metadata, a job that computers have not yet mastered. This dab of human intelligence enables search engines to find images. Von Ahn licensed the technology to Google in 2006. Another of his innovations, ReCaptcha, presents squiggly words to readers, who fill them in to enter Web sites or complete online purchases. By typing the distorted letters, they prove they're human (and not spam engines). This is where the genius comes in. The ReCaptchas are drawn from the old books in libraries. By completing them, the humans are helping, word by crooked word, to digitize world literature, making it accessible to computers (and to Google, which bought the technology in 2009).
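The appeal of the ESP game is how little machinery its core mechanic needs: two strangers label the same image, and a word they independently agree on becomes a tag. Here is a minimal sketch of that agreement step in Python; the function name, the sample label streams, and the taboo-word list are illustrative assumptions, not Von Ahn's implementation.

```python
def play_image(labels_a, labels_b, taboo=()):
    """Return the first label both players agree on, or None.

    labels_a / labels_b are the words each player types, in order.
    'taboo' holds tags the image already has, which don't count as a match.
    (Illustrative sketch only -- not Von Ahn's actual code.)
    """
    seen_a, seen_b = set(), set()
    for word_a, word_b in zip(labels_a, labels_b):
        seen_a.add(word_a.lower())
        seen_b.add(word_b.lower())
        agreed = (seen_a & seen_b) - set(taboo)
        if agreed:
            return agreed.pop()   # this word becomes the image's new tag
    return None

# Hypothetical round: the players converge on "bridge" by the third guess.
print(play_image(["water", "bridge", "sunset"], ["river", "sky", "bridge"]))
```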
This type of blend is likely to become the rule as smarter computers spread into the marketplace. It makes sense. A computer like Watson, after all, is an exotic beast, one developed at great cost to play humans in a game. The segregated scene on the Jeopardy stage, the machine separated from the two men, is in fact a contrivance. The question-answering contraptions that march into the economy, Watson's offspring and competitors alike, will be operating under an entirely different rubric: What works and at what cost? The winners, whether they're hunting for diseases or puzzling out marketing campaigns, will master different blends. They'll figure out how to turbocharge thinking machines with a touch of human smarts and, at the same time, to augment human reasoning with the speed and range of machines. Each side has towering strengths and glaring vulnerabilities. That's what gives the Jeopardy match its appeal. But outside the Jeopardy studio, stand-alones make little sense.
THE TIME FOR BIG fixes was over. As the forest down the hill from the Yorktown lab took on its first dabs of yellow and red, researchers were putting the finishing touches on the question-answering machine. On the morning of September 10, 2010, five champion Jeopardy players walked into the Yorktown labs to take on a revamped and invigorated Watson. IBM's PR agency, Ogilvy, had a film crew in the studio to interview David Ferrucci and his team during the matches. The publicists were not to forget that the focus of the campaign, which would extend into television commercials and Web videos over the coming months, would be on the people behind the machine. Big Blue was about people. That was the message. And the microphones on this late summer day would attempt to capture every word.
Over the previous four months, since the end of the first round of sparring sessions, Watson's creators had put their machine through a computer version of a graduate seminar. Watson boasted new algorithms to help sidestep disastrous categories, so-called train wrecks. Exhaustive new fact-checking procedures were in place to guide it to better responses in Final Jeopardy, and it had a profanity filter to steer it away from embarrassing gaffes. Also, it now received the digital read of Jeopardy answers after each clue so it could learn on the fly. This new intelligence clued Watson into its rivals' answers. It was as if the deaf machine had sprouted ears. It also sported its new finger. Encased in plastic, the apparatus gripped a Jeopardy buzzer and plunged it with its metal stub in three staccato bursts when Watson had enough confidence to bet. Even Watson's body was new. Over the summer, Eddie Epstein and his team had moved the entire system to IBM's latest generation of Power 7 servers. If Watson was going to promote the company, it had to be running on the hardware Big Blue was selling.
In the remaining months leading up to the match against Ken Jennings and Brad Rutter, most of the adjustments would address Watson's game strategy: which categories to pick and how much to wager. It was getting too late to lift the machine's IQ. If Watson misunderstood clues and botched answers, they'd have to live with it. But the researchers could continue to fine-tune its betting strategy. Even at this late date, Watson could learn to make smarter decisions.
Though the final match was only months away, the arrangements between Jeopardy and IBM remained maddeningly fluid. An agreement was in place, but the contract had not yet been signed. Rumors about the match spread wildly on the Quiz Bowl circuits, yet the command from Culver City was to maintain secrecy. Under no circumstances were the names of the two participants to be released, not even the date of the match. On his blog, Jennings continued with his usual word games, stories about his children, and details of a trip to Manchester, England, which sparked connections in his fact-swimming mind to songs by Melissa Manchester and one from the musical Hair (“Manchester, England, across the Atlantic Sea . . .”). Nothing about his upcoming encounter with Watson.
Behind the scenes, Jeopardy officials maneuvered to get Jennings and Rutter a preview of this digital foe they'd soon be facing. Could they visit the Yorktown labs to see Watson in action, perhaps in early November? This inquiry led to further concerns. If the humans saw Watson and its weaknesses, they'd know what to prepare for. Ferrucci worried that they would focus on its electronic answer panel, which showed its top five responses to every clue. “That's a look inside its brain,” he said. One Friday, as a sparring match took place in the Jeopardy lab and visiting computer scientists from universities around the country cheered Watson on, Ferrucci stood to one side with Rocky Schmidt and discussed just how much Jennings and Rutter would see, if they were granted access at all.
It was during this period that surprising news emerged from Jeopardy. A thirty-three-year-old computer scientist from the University of Delaware, Roger Craig, had just broken Ken Jennings's one-game scoring record with a $77,000 payday. “This Roger Craig guy,” Jennings blogged a day later, from England, “is a monster. . . . I only wish I could have been in the Jeopardy studio audience to cheer him on in person, like Roger Maris's widow or something. Great great stuff.” Jennings, like Craig himself, noted that Craig shared the name of a San Francisco 49er running back from the great Super Bowl squads of the 1980s. (Jeopardy luminaries recite such facts as naturally as the rest of us breathe or sweat. They can hardly help themselves.) Craig went on to win $231,200 over the course of six victories. What distinguished him more than his winnings were his methods. As a computer scientist, he used the tools of his trade to prepare for Jeopardy. He programmed himself, optimizing his own brain for the game. As the Watson team and the two human champions marched toward the matchup, each side busy devising its own strategy, Roger Craig stood at the intersection of the two domains.
Several weeks later, in mid-October, Craig sat at a pub in Newark, Delaware, discussing his methods over multiple refills of iced tea. With his broad face, wire-rimmed glasses, and a hairline in retreat, Craig looked the part of a cognitive warrior. Like many Jeopardy champions, he had spent his high school and college years in Quiz Bowl competitions and stuck with it even for the first couple of years of his graduate schooling at the University of Delaware. He originally studied biology, with the idea of becoming a doctor. But like Ferrucci, he had veered from medicine into computing. “I realized I didn't like the sight of blood,” he said. After a short stint researching plant genomics at DuPont, he went on to study computational biology at the computer science school at Delaware. When he appeared on Jeopardy, he was within months of finishing his dissertation, which featured models of protein interactions within a cell. This, he hoped, would soon land him a lofty research post in a pharmaceutical lab or academia. But it also provided him with the know-how and the software tools for his hobby, and he easily created software to train himself for Jeopardy. “It's nice to know how to program. You get some Perl script,” he said, referring to a popular programming language. “Then it's just chop, chop, chop, boom!”
Much like the researchers at IBM, Craig divided his personal Jeopardy program into steps. First, he said, he developed the statistical landscape of the game. Using sites like J! Archive, he could calculate the probability that certain categories, from European capitals to anagrams, would pop up. Mapping the Jeopardy canon, as he saw it, was simply a data challenge. “Data is king,” he said. Then, with the exacting style of a Jeopardy champ, he corrected himself. “It should be data are king, since it's plural. Or I guess if you go to the Latin, Datum is king . . .”
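Craig did his tallying with Perl scripts, which the passage doesn't show, but the idea is straightforward. Below is a minimal sketch of the same calculation in Python, assuming the archived clues have already been scraped into a CSV with one clue per row and a category column; the file name and column layout are hypothetical, not J! Archive's actual format.

```python
import csv
from collections import Counter

def category_frequencies(path):
    """Estimate how often each category turns up across archived games.

    Assumes a CSV with one clue per row and a 'category' column;
    the file name and layout here are hypothetical.
    """
    counts = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            counts[row["category"].strip().upper()] += 1
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.most_common()}

# Print the ten most common categories and their share of all clues.
for cat, share in list(category_frequencies("clues.csv").items())[:10]:
    print(f"{cat}: {share:.2%}")
```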
The program he put together tested him on categories, gauged his strengths (sciences, NFL football) and weaknesses (fashion, Broadway shows), and then directed him toward the preparation most likely to pay off in his own match. To patch these holes in his knowledge, Craig used a free online tool called Anki, which provides electronic flash cards for hundreds of fields of study, from Japanese vocabulary to European monarchs. The program, in Craig's words, is based on psychological research on “the forgetting curve.” It helps people find holes in their knowledge and determines how often they need those areas to be reviewed to keep them in mind. In going over world capitals, for example, the system learns quickly that a user like Craig knows London, Paris, and Rome, so it might spend more time reinforcing the capital of, say, Kazakhstan. (And what would be the Kazakh capital? “Astana,” Craig said in a flash. “It used to be Almaty, but they moved it.”)
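Anki's real scheduler is more elaborate (it descends from the SuperMemo SM-2 algorithm), but the forgetting-curve idea it relies on can be sketched in a few lines: cards you keep answering correctly come back at ever-longer intervals, while a miss resets the card so it returns almost immediately. The doubling rule and the example card below are simplifications for illustration, not Anki's actual scheduling.

```python
from datetime import date, timedelta

class Card:
    """A single flash card with a naive spaced-repetition schedule."""

    def __init__(self, question, answer):
        self.question = question
        self.answer = answer
        self.interval = 1          # days until the next review
        self.due = date.today()

    def review(self, correct, today=None):
        """Stretch the interval after a correct answer; reset it after a miss."""
        today = today or date.today()
        self.interval = self.interval * 2 if correct else 1
        self.due = today + timedelta(days=self.interval)

# A weak spot keeps coming back until it sticks.
card = Card("Capital of Kazakhstan?", "Astana")
card.review(correct=False)   # due again tomorrow
card.review(correct=True)    # due in 2 days
card.review(correct=True)    # due in 4 days
```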
At times, the results of Craig's studies were uncanny. His program, for example, had directed him to polish up on monarchs. One day, looking over a list of Danish kings, he noticed that certain names repeated through the centuries. “I said, ‘OK, file that away,’” he recalled. (Psychologists call such decisions to tag certain bits of information for storage “judgments of learning.” Jeopardy players spend many of their waking hours forming such judgments.) In his third Jeopardy game, aired on September 15, Craig found himself in a tight battle with Kevin Knudson, a math professor from the University of Florida. Going into Final Jeopardy, Craig led, $13,800 to $12,200. The final category was Monarchs, and Craig wagered most of his money, $10,601. Then he saw the clue: “From 1513 to 1972, only men named Christian & Frederick alternated as rulers of this nation.” It was precisely the factoid he had filed away, and he was the only one who knew it was Denmark. Only days before these games were taped, in mid-July, Craig had seen the sci-fi movie Inception, in which Leonardo DiCaprio plunges into dream worlds. “I really wondered if I was dreaming,” he said. After three matches, it was lunchtime. Roger Craig had already pocketed $138,401.
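The book doesn't spell out how Craig arrived at $10,601, but it matches exactly the standard cover bet for a leader in his position: assume the trailing player wagers everything and answers correctly, then bet just enough to finish a dollar ahead. With $12,200, Knudson could reach at most $24,400, so Craig needed $24,401, which meant risking at least $24,401 - $13,800 = $10,601. A quick check:

```python
def cover_bet(leader, trailer):
    """Smallest wager that keeps the leader $1 ahead even if the
    trailer bets everything and gets Final Jeopardy right."""
    return max(0, 2 * trailer + 1 - leader)

print(cover_bet(13800, 12200))  # -> 10601
```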
Craig had been following IBM's Jeopardy project and was especially curious about Watson's statistically derived game strategy. He understood that language processing was a far greater challenge for the IBM team. But as a human, Craig had language down. What he didn't have was a team of Ph.D.s to run millions of game simulations on a cluster of powerful computers. This would presumably lead to the ideal strategy for betting and picking clues at each step of the game. His interest in this was hardly idle. By winning his six games, Craig would likely qualify for Jeopardy's Tournament of Champions in 2011. Watson's techniques could prove invaluable. As soon as his shows had aired in mid-September (and he was free to discuss his victories), he e-mailed Ferrucci, asking for a chance to visit IBM and spar with Watson. Ferrucci's response, while cordial, was noncommittal. Jeopardy, not IBM, was in charge of selecting Watson's sparring partners.
Before going on Jeopardy, Craig had long relied on traditional strategies. He'd read books on the game, including the 1998 How to Get on Jeopardy and Win, by Michael DuPee. He'd also gone to Google Scholar, the search engine's repository of academic works, and downloaded papers on Final Jeopardy betting. Craig was steeped in the history and lore of the games, as well as various strategies, many of them named for players who had made them famous. One Final Jeopardy technique, Marktiple Choice, involves writing down a number of conceivable answers and then eliminating the unlikely ones. Formulated by a 2003 champion, Mark Dawson, it prods players to extend the search beyond the first response that pops into their mind. (In that sense, it's similar to the more systematic approach used by Watson.) Then there's the Forrest Bounce, a tactic named for a 1986 champion, Chuck Forrest, who disoriented his foes by jumping from one category to the next. “You can confuse your opponents,” said Craig, who went on to use the technique. (This irked even some viewers. On a Jeopardy online bulletin board, one North Carolinian wrote, “I could have done without Roger winning . . . I can't stand players that hop all over the board. It drives me nuts.”)