Surveillance or Security? The Risks Posed by New Wiretapping Technologies
Susan Landau
Einstein 3 represented a major change. Einstein 2 monitors communications at the federal network access points. In Einstein 3, agency-bound traffic is diverted to an intrusion-prevention system. The communications are searched using NSA tools and hardware.127 Traffic matching known or suspected malware will be blocked.128 As of 2009, the effort is in experimental stages, being tested on the AT&T network, which sought Department of Justice written approval before it agreed to participate.129
Einstein 3 differs from its predecessors in that it is designed to check
packet content in addition to transactional information.130 That creates
one type of privacy risk. Another is that the system might pick up purely
private communications. At least one DHS lawyer did not see this as a
problem. Stewart Baker, who had once been the NSA chief counsel and
who was in 2008 the DHS assistant secretary for policy, dismissed concerns,
saying, "If, by mistake, some private communication is directed through this system, the result will be that the malware ... won't be delivered, to
which the right response is, 'Thank you very much,' not 'You've violated
my rights'.""'
With Einstein 2, the federal government made clear that users accessing
government sites and government workers have no privacy rights with
respect to their Internet communications. Einstein 3 is potentially more
intrusive, and the program leaves more questions open than answered.
Einstein 3 will collect personally identifiable information. What are the
rules governing the use of this information? How long will the data collected by Einstein 3 be retained? What type of auditing will be employed
in Einstein 3? Under what circumstances would information gained from
Einstein 3 be shared with law enforcement or national security agencies?
There are fundamental questions to ask here: What privacy and security risks are created by the Einstein program? Is there a more effective way to pursue the same strategic goals? That conversation, like many about surveillance technologies, has not taken place in the public sphere.
Note added in proof. In a recent development, the Administrative Office of the U.S.
Courts changed its methods of data collection on encryption information. Encryption information had been included only in the yearly summary reports submitted
to the office. In August 2009, the Administrative Office revised its forms so that data
on encryption would be in the reports submitted at the conclusion of each wiretap.
This new data collection method should result in more accurate reporting, and quite possibly an increase in the encryption numbers.
In the nineteenth and early twentieth centuries, the invention of the telegraph, telephone, and telex changed the practice of business and government. The process was gradual, but when it was complete, it had fundamentally transformed the way work was done. Although written letters and signed documents still had
their place, the world was now one in which important communications
took place electronically. That transformation took decades.
A second communications revolution began in the mid-1990s. The
opening up of the Internet to commercial traffic, the massive laying down
of fiber-optic cable around the globe, and the worldwide adoption of
cellular phones occurred at a rate many times faster than the original communications revolution. Its consequences, which are unfolding in under a generation, will be far more profound than those of the original transformation. Interactions between people and businesses, between citizens and their government, and between corporations and other corporations, were revolutionized. Major new businesses were created (including, of course, the usual suspects of Amazon, eBay, and Google), and others, including some one might not have expected (e.g., newspapers), were completely changed. Major new modes of production, including open source,1 began or were greatly expanded.
The question is what comes next. While this book is not a study of the future of the Internet, the security risks of communications surveillance change as
communications technologies do. In this chapter I discuss nascent communications technologies likely to soon have an impact on business,
government, and people.
6.1 The Networked World
One of DARPA's motivations in sponsoring the research that led to the
Internet was that data and programs at one site did not need to be replicated, but instead could be accessed as needed. By the late 1980s the connectivity
provided by the Internet and the high number of unused cycles on many
users' machines enabled a new paradigm: farm computation out. At the
time, Internet communication was largely limited to email, net news, and
file transfer. So the methods used to parcel out data for the computation
were fairly primitive, at least by current standards.
The first such project was factoring the ninth Fermat number,2 F9 = 2^512 + 1, an integer of 155 digits, into primes (a prime is an integer whose only factors are itself and one). Fast algorithms for factorization were important for cryptanalysis, and Fermat numbers were interesting numbers with which to test the factoring algorithms. People interested in helping the Fermat number project contacted one of the project members by email or phone and were shipped a portion of the computation to do on their machine.3 F9 was factored in under half the time it would have taken on a single supercomputer.
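To make the idea concrete, here is a minimal sketch of parceling a factoring job into independent work units. It is only a toy: it uses trial division over disjoint ranges rather than the sieving methods actually used for F9, local processes stand in for volunteers' machines, and the composite and chunk size are chosen purely for illustration.

```python
# A toy sketch of "ship each volunteer a portion of the computation": the factor
# search is cut into disjoint trial-division ranges and handed out to workers.
# Local processes stand in for volunteers' machines; N and CHUNK are illustrative.
from multiprocessing import Pool

N = 3215031751          # small composite for illustration; its smallest prime factor is 151
CHUNK = 10_000          # size of each trial-division range given to a worker

def search_range(bounds):
    """Trial-divide N by every candidate in [start, stop); return a factor or None."""
    start, stop = bounds
    for d in range(max(start, 2), stop):
        if N % d == 0:
            return d
    return None

if __name__ == "__main__":
    limit = int(N ** 0.5) + 1
    work_units = [(lo, min(lo + CHUNK, limit)) for lo in range(2, limit, CHUNK)]
    with Pool() as pool:                      # the "volunteers"
        for factor in pool.imap_unordered(search_range, work_units):
            if factor:
                print(f"{N} = {factor} * {N // factor}")
                break
```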
These days a number of problems are done in such a distributed fashion,
but there is nothing ad hoc about the process. Consider SETI@home, the
Search for Extraterrestrial Intelligence, which is run out of the University of
California at Berkeley. SETI@home sorts through radio signals from space.
Because narrow-bandwidth radio signals (signals confined to a small part of the spectrum) are not known to occur naturally, their appearance would
be an indication of extraterrestrial intelligence. The problem is that radio
telescope signals are extremely noisy because of a combination of noise
from the radio telescope itself and celestial background noise. So separating
out the narrow-band signals is computationally expensive. But it is easy
to divide up the radio signals by the segments of the sky from which
they come. Because these segments can be farmed out to users, the search for narrow-bandwidth signals is a perfect candidate for a distributed calculation. With five million participants in the project (who can resist the glamour of participating in a search for extraterrestrial life?), SETI certainly distributes the computation. The project calls itself the "world's largest supercomputer."
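As a rough illustration, and not SETI@home's actual signal-processing pipeline, the per-chunk work might look like the following: a volunteer's machine receives a block of digitized samples and reports how strongly the power is concentrated in a single narrow frequency bin. The sample rate, tone frequency, and scoring rule are all assumptions made for the example.

```python
# A minimal sketch of the per-work-unit test: broadband noise spreads power across
# the spectrum, while a narrow-band signal piles power into one bin.
import numpy as np

SAMPLE_RATE = 1_000_000      # samples per second (assumed for the example)

def narrowband_score(chunk: np.ndarray) -> float:
    """Peak bin power divided by median bin power; large values suggest a narrow-band signal."""
    spectrum = np.abs(np.fft.rfft(chunk)) ** 2
    return float(spectrum.max() / np.median(spectrum))

# Simulated work unit: 0.1 s of noise with a faint 49 kHz tone buried in it.
rng = np.random.default_rng(0)
t = np.arange(SAMPLE_RATE // 10) / SAMPLE_RATE
work_unit = rng.normal(size=t.size) + 0.05 * np.sin(2 * np.pi * 49_000 * t)

print(f"narrow-band score: {narrowband_score(work_unit):.0f}")
```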
The same "world's largest supercomputer" is also being used for calculations at the microscopic level. Folding@home is a Stanford University
distributed computing project to determine the shape of proteins. Strings
of amino acids folded in complex ways, proteins are the building blocks
of our bodies. Folded correctly, proteins digest food, coordinate biological
action, attack viruses and bacteria, and enable life. Folded incorrectly,
proteins are culprits in Alzheimer's, "mad cow" disease, and a host of
other ailments. Thus understanding protein folding is part of determining
potential treatments of these illnesses. Protein folding obeys Newton's laws of motion applied to each atom in the system, and is thus hard to study.
Mimicking those laws, and the slight changes that can cause an error, for the thousands and thousands of atoms in a single protein is computationally intensive, making it perfect for a distributed computation project.
Folding@home has users produce hundreds of thousands of simulations,
each with its own trajectory starting at an initial position and with a
random velocity. This enables the Folding@home researchers to estimate
the rate of folding in the protein4 and has already resulted in numerous advances.5
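The statistical trick can be sketched in a few lines. The toy below replaces each volunteer's molecular-dynamics work unit with a stand-in probability; it shows only how a folding rate is estimated from many short, independent trajectories, and the trajectory count, simulated time, and per-trajectory probability are made up for the example.

```python
# Toy sketch of rate estimation from many independent short trajectories.
# A Bernoulli draw stands in for each volunteer's molecular-dynamics run.
import random

N_TRAJECTORIES = 100_000       # work units farmed out to volunteers (assumed)
TRAJ_TIME = 20e-9              # simulated seconds per trajectory (assumed)
P_FOLD = 0.0004                # stand-in for the per-trajectory folding probability

random.seed(1)
folded = sum(random.random() < P_FOLD for _ in range(N_TRAJECTORIES))

# For rare events, P(fold within time t) is roughly k * t, so k is roughly
# (fraction of trajectories that folded) / t.
rate = (folded / N_TRAJECTORIES) / TRAJ_TIME
print(f"{folded} of {N_TRAJECTORIES} trajectories folded; estimated rate ~ {rate:.2e} per second")
```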
SETI@home and Folding@home are projects in which users actively join
in solving a complex computation. A joint computation of a very different
sort is one in which users' questions are put to work. This is Google
Trends,6 an aggregate compilation of the Google searches people make.
Just as the Centers for Disease Control and Prevention (CDC) can track
outbreaks of diseases by compiling data from hospitals, individual healthcare providers, and labs, so can Google track issues of interest by combining search data to spot trends. Google Trends studies aggregate user data
to determine what users are interested in at the moment. Tracking economic data, such as people's searches for "unemployment insurance" or
"foreclosure," means that Google Trends may be a potential predictor
of an economic crisis before the U.S. government collects sufficient data
to make a more definitive prediction. Similarly, by studying queries on
influenza-like illnesses, Google Flu Trends has been able to spot flu outbreaks two weeks ahead of the CDC, a rather impressive difference.7
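A minimal sketch of the underlying idea, and not Google's actual model: count flu-related queries per week across many users and flag weeks that jump well above the recent baseline. The query terms, window, and threshold here are assumptions chosen for illustration.

```python
# Toy aggregation of search queries to spot an unusual rise in flu-related interest.
from collections import Counter

FLU_TERMS = {"flu symptoms", "fever", "influenza treatment"}   # assumed example terms

def weekly_counts(query_log):
    """query_log: iterable of (week, query) pairs aggregated across many users."""
    counts = Counter()
    for week, query in query_log:
        if query in FLU_TERMS:
            counts[week] += 1
    return counts

def flag_outbreak_weeks(counts, window=4, threshold=2.0):
    """Flag weeks whose count exceeds `threshold` times the mean of the prior `window` weeks."""
    weeks = sorted(counts)
    flagged = []
    for i, week in enumerate(weeks[window:], start=window):
        baseline = sum(counts[w] for w in weeks[i - window:i]) / window
        if counts[week] > threshold * baseline:
            flagged.append(week)
    return flagged

log = [(1, "fever"), (1, "flu symptoms"), (2, "fever"), (3, "fever"),
       (4, "flu symptoms"), (5, "fever"), (5, "flu symptoms"),
       (6, "fever"), (6, "flu symptoms"), (6, "influenza treatment"),
       (6, "fever"), (6, "flu symptoms")]
print(flag_outbreak_weeks(weekly_counts(log)))   # -> [6]
```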
The network effect, describing the rise in value of a network as more
people use it, is far larger than anyone might have anticipated as packet-switching protocols were being designed, or even when search engines were being developed. This effect has expanded in highly unexpected ways, and is likely to continue to do so for quite some time.
6.2 Cloud Computing
Another form of distributing computation is cloud computing, where computing is done not locally, but within the Internet cloud.8 This is computing
as a utility, much like electricity or water. While the notion may seem
surprising to those who are accustomed to having an IT department in their
midst, this is not so odd an idea. It has happened for other services. In the
nineteenth century each factory had its own power source and generated
its own electricity. Alternating current enabled low-cost transmission of electricity over long distances. Once that scientific understanding took hold, the advantage of centralizing the production of electricity was clear,
and electric power companies were born. The economies of scale they
provide in producing electricity far outweigh the costs of transporting the
electricity. Computing is similar. With high bandwidth and ubiquitous
connectivity increasingly available across the United States and the world,
outsourcing of computing is possible. It carries all sorts of advantages:
economies of scale, better predictability and use of resources, and better
ability to adjust the amount of computing used to the instantaneous
demand.
No longer would each business or individual user need to provide sufficient computation power for all their needs; users would pay only for the
capacity they actually used. No longer would each business need to maintain its own data center; it could use a cloud service like Amazon's Elastic Compute Cloud (EC2) to supply it. Even individuals could benefit.
A cloud service such as Google Calendar, in which multiple people can post and read entries and which is available anywhere with access to the network, is infinitely useful for scheduling anything involving more than one person. Similarly, no longer would a user need to maintain copies of a
document on their laptop; they could use Google Docs storage. (Such
remote storage is particularly useful for group projects, since the revised
document would not need to be shipped back and forth between users.)
Rather than have local servers or personal computers do the work, the work
is done in the "cloud," and accessed when needed.
Google Calendar and Google Docs are manifestations of Software as a
Service (SaaS) in which functionality moves from the desktop to the cloud.
Amazon's EC2 is an example of Infrastructure as a Service (IaaS); the service
provided is raw computing and storage. IaaS is especially useful to start-ups,
which can avoid initial capital costs by purchasing computing power only
when it is needed in a pay-as-you-go model. IaaS allows businesses to scale
up quickly.
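In code, the pay-as-you-go model is essentially an API call. Below is a minimal sketch using boto3, a later Python SDK for Amazon's services that is not discussed in the text; the machine image ID and region are placeholders, and real use would also require account credentials.

```python
# A minimal pay-as-you-go sketch against EC2 using boto3 (assumed SDK; the AMI ID
# and region are placeholders). Rent one small machine, then give it back.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Start a single small instance only when the capacity is needed.
response = ec2.run_instances(
    ImageId="ami-12345678",    # placeholder machine image
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
)
instance_id = response["Instances"][0]["InstanceId"]
print(f"started {instance_id}")

# ... do the work on the rented machine ...

# Terminate the instance (and stop paying) when the work is done.
ec2.terminate_instances(InstanceIds=[instance_id])
```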
SaaS and IaaS are two ends of the cloud computing spectrum; in between
lies Platform as a Service (PaaS), which provides support for user-created
services. Thus while SaaS is a single service or suite of services, PaaS provides a computing platform for the user to create her own applications.
PaaS provides such underpinnings as concurrency management and security, and such mechanisms as visualization tools for the user to see how
customers are using the company's services, or database tools to help track
user behavior. SaaS, IaaS, and PaaS are not so much fundamentally different computing structures as they are different levels of software support
offered on what may be a common network of computers.
By moving where the data are stored, cloud computing changes many
playing fields at once. It makes computing and data storage much cheaper.
It consolidates the industry. And it changes risk. Data on a laptop, desktop,
or company's server is only as secure as the protections provided by the
user or company's system administrator. Data in the cloud is likely to be better secured (a cloud provider that cannot protect the data it is storing is a cloud provider that will not be in business very long), but the data-protection equation is fundamentally changed. Cloud computing means
that a company's proprietary and private information is exposed on a
public network. If the network is wiretapped, that exposure becomes a novel source of risk to cloud customers.