
“You have got a specific ‘so over that’ emotion and you’re assuming that superintelligence would have it too,” Yudkowsky said. “That is anthropomorphism. AI does not work like you do. It does not have a ‘so over that’ emotion.”

But, he said, there is one exception: human minds uploaded into computers. That’s another route to AGI and beyond, sometimes confused with reverse engineering the brain. Reverse engineering seeks first to learn about the human brain in fine-grained detail, then to represent what the brain does in hardware and software. At the end of the process you have a computer with human-level intelligence. The Blue Brain Project, run on IBM supercomputers, intends to accomplish this by the early 2020s.

Mind uploading, on the other hand, also called whole brain emulation, is the idea of modeling a human mind, like yours, in a computer. At the end of the process you still have your brain (unless, as experts warn, the scanning and transfer process destroys it), but another thinking, feeling “you” exists in the machine.

“If you had a superintelligence that started out as a human upload and began improving itself and became more and more alien over time, that might turn against humanity for reasons roughly analogous to the ones that you are thinking of,” Yudkowsky said. “But for a nonhuman-derived synthesized AI to turn on you, that can never happen because it is more alien than that. The vast majority of them would still kill you but not for that. Your whole visualization would apply only to a super-intelligence that came from human stock.”

*   *   *

As my inquiry continued, I would find that many experts took issue with Friendly AI, for reasons different from mine. The day after meeting Yudkowsky I got on the phone with Dr. James Hughes, chairman of the Department of Philosophy at Trinity College and executive director of the Institute for Ethics and Emerging Technologies (IEET). Hughes probed a weakness in the idea that an AI’s utility function couldn’t change.

“One of the dogmas of the Friendly AI people is that if you are careful you can design a superintelligent being with a goal set that will become unchanging. And they somehow have ignored the fact that we humans have fundamental goals of sex, food, shelter, security. These morph into things like the desire to be a suicide bomber and the desire to make as much money as possible, and things which are completely distant from those original goal sets but were built on through a series of steps which we can watch in our mind.

“And so we are able then to examine our own goals and change them. For example, we can become intentionally celibate—that’s totally against our genetic programming. The idea that a superintelligent being with as malleable a mind as an AI would have wouldn’t drift and change is just absurd.”

The Web site of Hughes’s think tank, IEET, shows they are equal-opportunity critics, suspicious not just of the dangers of AI, but of nanotech, biotech, and other risky endeavors. Hughes believes that superintelligence is dangerous, but that the chances of its emerging anytime soon are remote. However, it is so dangerous that the risk has to be graded equally with imminent threats, such as sea level rise and giant asteroids plunging from the sky (both go in the first category in H. W. Lewis’s ranking of risks, from chapter 2). Hughes concurs with my other concern: the baby steps of AI development leading up to superintelligence (which Hughes calls “god in a box”) are dangerous, too.

“MIRI just dismisses all of that because they are focused on god jumping out of a box. And when god jumps out of a box there is nothing that human beings can do to stop or change the course of action. You either have to have a good god or a bad god and that’s the MIRI approach. Make sure it’s a good god!”

*   *   *

The idea of god jumping out of a box reminded me of other unfinished business—the AI-Box Experiment. To recap, Eliezer Yudkowsky played the role of an ASI contained in a computer that had no physical connection to the outside world—no cables or wires, no routers, no Bluetooth. Yudkowsky’s goal: escape the box. The Gatekeeper’s goal: keep him in. The game was played in a chat room, with the participants conversing in text. Each session lasted a maximum of two hours. Keeping silent and boring the Gatekeeper into surrendering was a permitted but never-used tactic.

Between 2002 and 2005, Yudkowsky played against five Gatekeepers. He escaped three times, and stayed in the box twice. How did he escape? I had learned online that one of the rules of the AI-Box Experiment was that the transcripts of the contests could not be revealed, so I didn’t know the answer. Why the secrecy?

Put yourself in Yudkowsky’s shoes. If you, playing the AI in the box, had an ingenious means of escape, why reveal it and tip off the next Gatekeeper, should you ever choose to play again? And second, to try to simulate the persuasive power of a creature a thousand times more intelligent than the smartest human, you might want to go a little over the edge of what’s socially acceptable dialogue. Or you might want to go way over the edge. And who wants to share that with the world?

The AI-Box Experiment is important because one of the likely outcomes of a superintelligence operating without human interference is human annihilation, and that seems to be a showdown we humans cannot win. The fact that Yudkowsky won three times while playing the AI made me all the more concerned and intrigued. He may be a genius, but he’s not a thousand times more intelligent than the smartest human, as an ASI could be. A bad or indifferent ASI needs to get out of the box just once.

The AI-Box Experiment also fascinated me because it’s a riff on the venerable Turing test. Devised in 1950 by mathematician, computer scientist, and World War II code breaker Alan Turing, the eponymous test was designed to determine whether a machine can exhibit intelligence. In it, a judge asks both a human and a computer a set of written questions. If the judge cannot tell which respondent is the computer and which is the human, the computer “wins.”

But there’s a twist. Turing knew that thinking is a slippery subject, and so is intelligence. Neither is easily defined, though we know each when we see it. In Turing’s test, the AI doesn’t have to think like a human to pass the test, because how could anyone know how it was thinking anyway? However, it does have to convincingly pretend to think like a human, and output humanlike answers. Turing himself called it “the imitation game.” He rejected the criticism that the machine might not be thinking like a human at all. He wrote, “May not machines carry out something which ought to be described as thinking but which is very different from what a man does?”

In other words, he anticipated and rejected the assertion John Searle would later make with his Chinese Room thought experiment: if it doesn’t think like a human, it’s not intelligent. Most of the experts I’ve spoken with concur. If the AI does intelligent things, who cares what its program looks like?

Well, there may be at least two good reasons to care. First, the transparency of the AI’s “thought” process, before it evolves beyond our understanding, is crucial to our survival. If we’re going to try to imbue an AI with friendliness or any moral quality or safeguard, we need to know how it works at a high-resolution level before it is able to modify itself. Once that starts, our input may be irrelevant. Second, if the AI’s cognitive architecture is derived from human brains, or from a human brain upload, it may not be as alien as a purely new AI. But there’s a vigorous debate among computer scientists about whether that connection to mankind would solve problems or create them.

No computer has yet passed the Turing test, though each year the controversial Loebner Prize, sponsored by philanthropist Hugh Loebner, is offered to the maker of one that does. But while the $100,000 grand prize goes unclaimed, an annual contest awards $7,000 to the creator of the “most humanlike computer.” For the last few years the winners have been chatbots—programs created to simulate conversation, with little success. Marvin Minsky, one of the founders of the field of artificial intelligence, has offered $100 to anyone who can talk Loebner into revoking his prize. That would, said Minsky, “spare us the horror of this obnoxious and unproductive annual publicity campaign.”

*   *   *

How did Yudkowsky talk his way out of the box? He had many variations of the carrot and stick to choose from. He could have promised wealth, cures for illness, inventions that would end all want, decisive dominance over enemies. On the stick side, fear-mongering is a reliable social-engineering tactic—what if at this moment your enemies are training an ASI against you? In a real-world situation this might work—but what about an invented situation, like the AI-Box Experiment?

When I asked Yudkowsky about his methods he laughed, because everyone anticipates a diabolically clever solution to the AI-Box Experiment—some logical sleight of hand, prisoner’s-dilemma tactics, maybe something disturbing. But that’s not what happened.

“I did it the hard way,” he said.

Those three successful times, Yudkowsky told me, he simply wheedled, cajoled, and harangued. The Gatekeepers let him out, then paid up. And the two times he lost he had also begged. Afterward he didn’t like how it made him feel. He swore to never do it again.

*   *   *

Leaving Yudkowsky’s condo, I realized he hadn’t told me the whole truth. What variety of begging could work against someone determined not to be persuaded? Did he say, “Save me, Eliezer Yudkowsky, from public humiliation? Save me from the pain of losing?” Or maybe, as someone who’s devoted his life to exposing the dangers of AI, Yudkowsky would have negotiated a meta deal. A deal about the AI-Box Experiment itself. He could have asked whoever played the Gatekeeper to join him in exposing the dangers of AGI by helping out with his most persuasive stunt—the AI-Box Experiment. He could’ve said, “Help me show the world that humans aren’t secure systems, and shouldn’t be trusted to contain AI!”

Which would be good for propaganda, and good for raising support. But no lesson at all about going up against real AI in the real world.

Now, back to Friendly AI. If it seems unlikely, does that mean an intelligence explosion is inevitable? Is runaway AI a certainty? If you, like me, thought computers were inert if left alone, not troublemakers, this comes as a surprise. Why would an AI do anything, much less cajole, threaten, or escape?

To find out I tracked down AI maker Stephen Omohundro, president of Self-Aware Systems. He’s a physicist and elite programmer who’s developing a science for understanding smarter-than-human intelligence. He claims that self-aware, self-improving AI systems will be motivated to do things that will be unexpected, even peculiar. According to Omohundro, if it is smart enough, a robot designed to play chess might also want to build a spaceship.

 

Chapter Five

Programs that Write Programs

… we are beginning to depend on computers to help us evolve new computers that let us produce things of much greater complexity. Yet we don’t quite understand the process—it’s getting ahead of us. We’re now using programs to make much faster computers so the process can run much faster. That’s what’s so confusing—technologies are feeding back on themselves; we’re taking off. We’re at that point analogous to when single-celled organisms were turning into multi-celled organisms. We are amoebas and we can’t figure out what the hell this thing is that we’re creating.

—Danny Hillis, founder of Thinking Machines Corporation

You and I live at an interesting and sensitive time in human history. By about 2030, less than a generation from now, it could be our challenge to cohabit Earth with superintelligent machines, and to survive. AI theorists return again and again to a handful of themes, none more urgent than this one: we need a science for understanding them.

So far we’ve explored a disaster scenario called the Busy Child. We’ve touched on some of the remarkable powers AI could have as it achieves and surpasses human intelligence through the process of recursive self-improvement, powers including self-replication, swarming a problem with many versions of itself, super high-speed calculations, running 24/7, mimicking friendliness, playing dead, and more. We’ve proposed that an artificial superintelligence won’t be satisfied with remaining isolated; its drives and intelligence would thrust it into our world and put our existence at risk. But why would a computer have drives at all? Why would they put us at risk?

To answer these questions, we need to predict how powerful AI will behave. Fortunately, someone has laid the foundation for us.

Surely no harm could come from building a chess-playing robot, could it?… such a robot will indeed be dangerous unless it is designed very carefully. Without special precautions, it will resist being turned off, will try to break into other machines and make copies of itself, and will try to acquire resources without regard for anyone else’s safety. These potentially harmful behaviors will occur not because they were programmed in at the start, but because of the intrinsic nature of goal driven systems.

This paragraph’s author is Steve Omohundro. Tall, fit, energetic, and pretty darn cheerful for someone who’s peered deep into the maw of the intelligence explosion, he’s got a bouncy step, a vigorous handshake, and a smile that shoots out rays of goodwill. He met me at a restaurant in Palo Alto, the city next to Stanford University, where he graduated Phi Beta Kappa on the way to U.C. Berkeley and a Ph.D. in physics. He turned his thesis into the book Geometric Perturbation Theory in Physics, on the new developments in differential geometry. For Omohundro, it was the start of a career of making hard things look easy.

He’s been a highly regarded professor of artificial intelligence, a prolific technical author, and a pioneer in AI milestones like lip reading and recognizing pictures. He codesigned the computer languages StarLisp and Sather, both built for use in programming AI. He was one of just seven engineers who created Wolfram Research’s Mathematica, a powerful calculation system beloved by scientists, engineers, and mathematicians everywhere.
