Onion routing
Onion routing is a technique for anonymous communication over a computer network. In an onion network, messages are encapsulated in layers of encryption, analogous to layers of an onion. The encrypted data is transmitted through a series of network nodes called onion routers, each of which "peels" away a single layer, uncovering the data's next destination. When the final layer is decrypted, the message arrives at its destination. The sender remains anonymous because each intermediary knows only the location of the immediately preceding and following nodes. While onion routing provides a high level of security and anonymity, there are methods to break the anonymity of this technique, such as timing analysis.
Development and implementation
Onion routing was developed in the mid-1990s at the U.S. Naval Research Laboratory by employees Paul Syverson, Michael G. Reed, and David Goldschlag to protect U.S. intelligence communications online. It was further developed by the Defense Advanced Research Projects Agency and patented by the Navy in 1998.Computer scientists Roger Dingledine and Nick Mathewson joined Syverson in 2002 to develop what would become the largest and best known implementation of onion routing, Tor, then called The Onion Routing project. After the Naval Research Laboratory released the code for Tor under a free license, Dingledine, Mathewson and five others founded The Tor Project as a non-profit organization in 2006, with the financial support of the Electronic Frontier Foundation and several other organizations.
Data structure
An onion is the data structure formed by "wrapping" a message with successive layers of encryption to be decrypted by as many intermediary computers as there are layers before arriving at its destination. The original message remains hidden as it is transferred from one node to the next, and no intermediary knows both the origin and final destination of the data, allowing the sender to remain anonymous.Onion creation and transmission
To create and transmit an onion, the originator selects a set of nodes from a list provided by a "directory node". The chosen nodes are arranged into a path, called a "chain" or "circuit", through which the message will be transmitted. To preserve the anonymity of the sender, no node in the circuit is able to tell whether the node before it is the originator or another intermediary like itself. Likewise, no node in the circuit is able to tell how many other nodes are in the circuit and only the final node, the "exit node", is able to determine its own location in the chain.Using asymmetric key cryptography, the originator obtains a public key from the directory node to send an encrypted message to the first node, establishing a connection and a shared secret. Using the established encrypted link to the entry node, the originator can then relay a message through the first node to a second node in the chain using encryption that only the second node, and not the first, can decrypt. When the second node receives the message, it establishes a connection with the first node. While this extends the encrypted link from the originator, the second node cannot determine whether the first node is the originator or just another node in the circuit. The originator can then send a message through the first and second nodes to a third node, encrypted such that only the third node is able to decrypt it. The third, as with the second, becomes linked to the originator but connects only with the second. This process can be repeated to build larger and larger chains, but is typically limited to preserve performance.
When the chain is complete, the originator can send data over the Internet anonymously. When the final recipient of the data sends data back, the intermediary nodes maintain the same link back to the originator, with data again layered, but in reverse such that the final node this time removes the first layer of encryption and the first node removes the last layer of encryption before sending the data, for example a web page, to the originator.
Weaknesses
Timing analysis
One of the reasons typical Internet connections are not considered anonymous, is the ability of Internet service providers to trace and log connections between computers. For example, when a person accesses a particular website, the data itself may be secured through a connection like HTTPS such that the user's password, emails, or other content is not visible to an outside party, but there is a record of the connection itself, what time it occurred, and the amount of data transferred. Onion routing creates and obscures a path between two computers such that there's no discernible connection directly from a person to a website, but there still exist records of connections between computers. Traffic analysis searches those records of connections made by a potential originator and tries to match timing and data transfers to connections made to a potential recipient. If an attacker has compromised both ends of a route, a sender may be seen to have transferred an amount of data to an unknown computer a specified amount of seconds before a different unknown computer transferred data of the same exact size to a particular destination. Factors that may facilitate traffic analysis include nodes failing or leaving the network and a compromised node keeping track of a session as it occurs when chains are periodically rebuilt.Garlic routing is a variant of onion routing associated with the I2P network that encrypts multiple messages together, which both increases the speed of data transfer and makes it more difficult for attackers to perform traffic analysis.
Exit node vulnerability
Although the message being sent is transmitted inside several layers of encryption, the job of the exit node, as the final node in the chain, is to decrypt the final layer and deliver the message to the recipient. A compromised exit node is thus able to acquire the raw data being transmitted, potentially including passwords, private messages, bank account numbers, and other forms of personal information. Dan Egerstad, a Swedish researcher, used such an attack to collect the passwords of over 100 email accounts related to foreign embassies.Exit node vulnerabilities are similar to those on unsecured wireless networks, where the data being transmitted by a user on the network may be intercepted by another user or by the router operator. Both issues are solved by using a secure end-to-end connection like SSL or secure HTTP. If there is end-to-end encryption between the sender and the recipient, and the sender isn't lured into trusting a false SSL certificate offered by the exit node, then not even the last intermediary can view the original message.