Marissa assured me that I was not the only one misappropriating her ideas. I was just the most recent. And, she wanted to know, why didn't she get credit for her work on the homepage promotion lines, which after all, should really be her responsibility, not marketing's?
I wasn't sure what credit there was to give for a single line of text on the Google.com homepage, or who else in the company might care, but I offered to publicly acknowledge her contributions whenever she made them. I wasn't willing to cede control over them, though. The marketing text on the homepage was the most valuable promotional medium we employed. It reached millions of people, and since promotion was a marketing responsibility, not a product-management one, I insisted that marketing should control the space.
As we headed back into the building, I assured Marissa, with complete sincerity, that I respected her intelligence and opinions and the enormous contribution she made to Google. I viewed her as my most important colleague in terms of the work that lay ahead. We had been working together to improve Google for more than three years, I reminded her. Despite our differing points of view in the past—and probably going forward as well—it was essential we maintain a direct channel of communication. I encouraged her to bring future issues to me and assured her I would do the same.
I told Cindy later that our chat had poured oil on some troubled waters. But, I added, I didn't expect it to be the last conversation of its kind.
Meanwhile, the privacy discussion had grown a thousand heads and was consuming vast quantities of time and mental effort among the engineers and the product team. Was our goal to make Google the most trusted organization on the planet? Or the best search engine in the world? Both goals put user interests first, but they might be mutually exclusive.
Matt Cutts characterized the two main camps in what he termed "the Battle Royal" as hawks and doves, where hawks wanted to keep as much user information as we could gather and doves wanted to delete search data as quickly as we got it. Larry and Sergey were hawks. Matt considered himself one as well.
"We never know how we might use this data," Matt explained. "It's a reflection of what the world is thinking, so how can that not be useful?" As someone who worked on improving the quality of Google's search results, Matt saw limitless possibilities. For example, "You can learn spelling correction even in languages that you don't understand. You can look at the actions of users refining their queries and say, if you see someone type in
x,
it should be spell-corrected to
y.
"
Well, some engineers asked, why don't we just tell people how we use cookie data to improve our products? We could give Matt's example about the spell checker, which also relied on user data to work its magic with names like the often misspelled "Britney Spears."
We don't tell them, Larry explained, because we don't want our competitors to know how our spell checker works. Larry opposed any path that would reveal our technological secrets or stir the privacy pot and endanger our ability to gather data. People didn't know how much data we collected, but we were not doing anything evil with it, so why begin a conversation that would just confuse and concern everyone? Users would oversimplify the issue with baseless fears and then refuse to let us collect their data. That would be a disaster for Google, because we would suddenly have less insight into what worked and what didn't. It would be better to do the right thing and not talk about it.
Matt understood Larry's position. He also sympathized with Googlers who wanted to compromise by anonymizing the data or encrypting the logs and then throwing away the keys every month. That would keep some data accessible, but the unique identifiers would disappear.
Not that Matt thought it would do any good in stemming public concerns. "Part of the problem," he told me, "was explaining that in real-world terms. As soon as you start talking about symmetric encryption and keys that rotate out, people's eyes turn to glass." The issue was too complicated to offer an easy solution. Even if we agreed to delete data, we couldn't be sure we erased all of it, because of automatic backups stored in numerous places for billing advertisers or maintaining an audit trail. I began to understand the hesitation to even engage in the discussion with users.
What if we let users opt out of accepting our cookies altogether? I liked that idea, but Marissa raised an interesting point. We would clearly want to set the default as "accept Google's cookies." If we fully explained what that meant to most users, however, they would probably prefer not to accept our cookie. So our default setting would go against users' wishes. Some people might call that evil, and evil made Marissa uncomfortable. She was disturbed that our current cookie-setting practices made the argument a reasonable one. She agreed that at the very least we should have a page telling users how they could delete their cookies, whether set by Google or by some other website.
Describing how to delete cookies fit neatly with a state-of-the-brand analysis I had been working on. In it, I laid out my thoughts about redirecting our identity from "search and only search" to a leadership role on issues affecting users online. I forecast that user privacy, our near monopoly in search, and censorship demands by foreign governments would be the three trials to bedevil us in the coming year. We needed to prepare—to get out in front and lead the parade rather than be trampled by it. Marissa complimented my analysis but had reservations about my recommendations. Just as I had thought "Don't be evil" overpromised, she feared taking public stands about our ethical positions would result in overly heightened expectations and negative reactions if we failed to live up to them. I understood that perspective (and shared it) but believed we didn't need to claim to be ethically superior. We just needed our actions to demonstrate that we were. Users could draw their own conclusions.
Sergey's feedback was less encouraging. "I find documents like this frightening," he stated. "It's vague and open-ended, which makes specific feedback impossible." Lest I take his lack of comments for assent, he asked me to detail the next steps I intended to take. I had already done that, but evidently he hadn't read past the first page. I wondered if my communication with Sergey would improve if I took him for a walking chat, as I had with Marissa—perhaps along a high cliff overlooking the ocean.
Meanwhile, the privacy discussion bubbled and boiled until at last a meeting could be arranged to hash out once and for all policies on employee access to user data, data retention, and user education about privacy issues.
The meeting raised many other questions, and answered none of them. Eric Schmidt half-jokingly suggested that our privacy policy should start off with the full text of the Patriot Act. Larry argued we should keep all our data until—well, until the time we should get rid of it. If we thought the government was overreaching, we could just encrypt everything and make it unreadable. Besides, Ashcroft would most likely go after the ISPs first, since they had much better data than we did about what users did online.
The meeting ended, but the debate continued for months.
My idea for blazing a path on educating users about privacy never gained the endorsement of Larry and Sergey, and so did not come to fruition. Perhaps they were right that it would have opened a Pandora's box. The issue of privacy would never go away, and trying to explain our rationale might only make things more confusing. Why not let the issue come to us instead of rushing out to meet it? We weren't willing to talk about the wonderful benefits of users sharing their data with us, because we weren't willing to share any information about how we used that data. If we couldn't say something nice, why say anything at all?
That didn't stop me from assuming the most aggressive possible stance when it came to communicating with users about privacy each time a new product launched. I repeated the Yada Yada story to every Googler who would listen, though I found few converts to my vision of users making fully informed decisions about the data they shared with us. Most engineers felt the tradeoff was too high. If users came to Google looking for information about online privacy, they figured, we would help them the way we always did—by sending them somewhere else for answers.
Larry refused to talk directly to users about cookies and log files, and he tried to keep the public from getting curious by minimizing their exposure to the data we collected. He wasn't always successful.
For example, a display of "real time" Google search queries crawled across a video monitor suspended over the receptionist's desk in our lobby. I sometimes sat on the red couch and watched to find out what the world was looking for. The terms scrolled by silently in a steady stream:
new employment in Montana
scheduled zip backup
greeting cards free
nervous system
lynyrd skynyrd tabliature Tuesday
datura metal
tamron lense 500mm
mode chip for playstation
the bone collector
singles chat
Journalists who came to Google stood in the lobby mesmerized by this peek into the global gestalt and later waxed poetical about the international impact of Google and the deepening role search plays in all our lives. Visitors were so entranced that they stared up at the display as they signed in for their temporary badges, not bothering to read the restrictive non-disclosure agreements they were agreeing to.
The query scroll was carefully filtered for offensive terms that might clash with our wholesome image.
Offensive terms written in English, anyway. I recall a group of Japanese visitors pointing and smirking at some of the katakana characters floating across the page. The inability to identify foreign-language porn is just one of the reasons we never used the query scroll widely for marketing purposes, despite its ability to instantly turn esoteric technology into voyeuristic entertainment.
Larry never cared for the scrolling queries screen. He constantly monitored the currents of public paranoia around information seepage, and the scrolling queries set off his alarm. He felt the display could inadvertently reveal personal data, because queries could contain names or information that users would prefer to remain private (for example, "John Smith DUI arrest in Springfield" or "Mary Jones amateur porn movie"). Moreover, it might cause people to think more about their own queries and stir what he deemed to be ungrounded fears over what information was conveyed with each search.
Larry tried to kill the Google Zeitgeist, too. Zeitgeist was a year-end feature that the PR team put together recapping the trends in search terms over the previous twelve months. The press loved Zeitgeist because it gave them another way to wrap up the year, but to Larry it raised too many questions about how much Google knew about users' searches and how long we kept their data. Cindy asked me to come up with a list of reasons to continue the tradition, and my rationale evidently convinced Larry the risk was acceptable, because the year-end Zeitgeist is still published on Google.com.
All the while we wrestled with the issues of what to tell users, our ability to mine their data became better and better. Amit Patel, as his first big project at Google, had built a rudimentary system to make sense of the logs that recorded user interactions with our site. Ironically, the same engineer who did the most to seed the notion of "Don't be evil" in the company's consciousness also laid the cornerstone of a system that would bring into question the purity of Google's intentions.
Amit's system was a stopgap measure. It took three years and an enormous effort from a team of Googlers led by legendary coder Rob Pike to perfect the technology that, since it processed logs, came to be designated "Sawmill." The power of Sawmill when it was activated in 2003 gave Google a clear understanding of user behavior, which in turn enabled our engineers to serve ads more effectively than Yahoo did, to identify and block some types of robotic software submitting search terms, to report revenue accurately enough to meet audit requirements, and to determine which UI features improved the site and which confused users. If engineers were reluctant to delete logs data before Sawmill, they were adamant about retaining it afterward.
Larry's refusal to engage the privacy discussion with the public always frustrated me. I remained convinced we could start with basic information and build an information center that would be clear and forthright about the tradeoffs users made when they entered their queries on Google or any other search engine. I didn't really believe many people would read all the pages, or particularly care what they said. In fact, I somewhat cynically counted on that. The mere fact that we had the explanation available would allay many of their concerns. Those who truly cared would see we were being transparent. Even if they didn't like our policies on data collection or retention, they would know what they were. If they went elsewhere to search, they would be taking a chance that our competitors' practices were far worse than ours.
To Larry the risks were just too high. Once we squeezed the toothpaste out of the tube, we would not be able to put it back. And he would never do anything that might cost us complete access to what Wayne Rosing called our "beautiful, irreplaceable data."
Ka-chunk. Ka-chunk. Over the rest of 2003, the product-development group under Jonathan worked with engineering to stamp out a series of features, services, and infrastructure updates.