Onion encryption and its application for Tor anonymous communication

From CryptoWiki
Revision as of 19:26, 5 May 2019 by 19-StukanovAA (Talk | contribs)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Onion encryption (onion routing) is a method for anonymously exchanging information using a computer network. In the onion network, messages are encapsulated in several layers of encryption, similar to onion layers. Encrypted data is transmitted through a set of network nodes, called onion routers, each of which “removes” one level of encryption, receiving information about which node it is necessary to send the received data further. The message arrived at its destination when the last layer of encryption was decrypted. The sender retains its anonymity, as the intermediate nodes have information only about the immediately preceding and subsequent nodes.

Contents

Background

The use of a switched communications network should not automatically entail disclosure of who communicates with whom. Onion routing is a flexible communications infrastructure that is robust to both eavesdropping and traffic analysis. Onion routing achieves this goal by separating identification from routing. Connections to be established are always anonymous, but communications need not be anonymous. Information exchange can be anonymous by removing the identifying information from the data stream. Onion routing can be used by a wide variety of different Internet applications without any modifications by using proxy servers (non-invasive procedure) or by changing the network protocol stack on a particular machine that will be connected to the network (moderately or highly invasive procedure).

Network traffic analysis can be used to identify communicating parties in a public network. For example, in packet-switched networks, each packet has a header for routing and a payload containing user data. The header, which should be visible to the network (and to attackers watching the network), contains the identity of the sender and the receiver of the packet. Even if the header was somehow hidden, the packet could still be traced as it moves through the network. Encrypting user data is also inefficient, since the purpose of traffic analysis is to identify the participants in the information exchange, and not (at least, not directly) to gain the content of the messages. The efficiency and accessibility of the public Internet are strong motivators for companies to use the Internet instead of internal intranets. However, such companies may want to protect their interests. For example, a researcher using the Internet can expect that the objects of his research will remain known only to him, or the existence of cooperation between some companies will remain confidential. Individuals may also need to protect their privacy. The identity of the participants in the exchange of electronic messages should be known only to the participants themselves. A person shopping on the Internet may not want his browsing history to be tracked by someone. And someone spending electronic currency always expects that the source of the transaction cannot be traced.

Initially, onion-based routing prototypes were developed on Sun Solaris 2.5.1 / 2.6. The prototype included a set of proxy servers for anonymous web browsing, remote connection and anonymization of the proxy servers themselves, which remove the identifying information from the data stream. The motivation for developing onion routing is not the provision of an anonymous method of communication itself, but the separation of identification and routing. Applications and users can reveal their identities to each other, but the use of the public network should not automatically lead to the disclosure of the identities of communication participants to third parties.

Traffic analysis problem

Letters sent by mail are usually placed in an envelope, which is marked with the addresses of the sender and the recipient. We trust the postmen that they will not look into the contents of the envelope, as it is private. In addition, we trust the postal service that no one will keep track of who sends letters to whom, because such information is also private.

These two types of confidential information, the contents of the envelope and addresses on the envelope are also relevant for communication over the Internet. As well as post mail, e-mails travel in electronic envelopes, and protecting the confidentiality of these messages requires the protection of the contents of these envelopes and their addresses. Despite the fact that the participants of communication are usually known to each other, the use of public networks should not entail the disclosure of the participants' identities and the contents of their communication to third parties. The first problem is traffic analysis, the second is eavesdropping.

By obstructing traffic analysis and eavesdropping, confidentiality of communications can be protected. But what about anonymity? Can the two parties communicate if one or both of them want to hide their identity from the other? If the contents of the electronic envelope remain secret and the addresses on the envelope are also hidden, then any identifying information can only be inside the envelope. Thus, for anonymous communication, it is necessary to remove the identifying information from the contents of the envelope. These goals seem unreachable: is it possible to keep the contents of the envelope confidential? How can a letter reach the addressee if its address is hidden? Can two parties communicate without revealing their identities to each other? Can all this be achieved without the use of a third party which will have to be trusted to not memorize addresses or open envelopes?

Onion routing uses a set of well-known network and cryptographic methods to protect both privacy and anonymity of communication over the Internet from eavesdropping and analyzing traffic. Onion encryption is widely used in various Internet applications to ensure confidentiality and anonymity of communication.

Onion Routing

If you protect the communication channel from eavesdropping and analyzing traffic, and remove the identifying information from the data stream, you can achieve confidential and anonymous communication. Onion encryption provides socket connections that are resistant to both eavesdropping and traffic analysis. The confidentiality of these socket connections has been transferred below the application level and made independent of the applications. Standard Internet applications can use these anonymous connections by using proxy servers. The principle of onion routing is as follows: the application, instead of opening a connection directly to the destination machine, opens connections to the onion proxy server. This proxy server creates an anonymous connection through a set of other onion routers to the destination. Each onion router has information only about its neighboring routers along the information path. Before sending data over an anonymous connection, the first router adds one level of encryption for each router along the way. In the process of transferring data over an anonymous connection, each router removes one level of encryption, thus, the data eventually arrive in plain text. When moving data in the opposite direction, this layering of encryption occurs in the reverse order. The data sent over the anonymous connection seems different for each onion router and thus cannot be tracked during the transmission, and the compromised routers will not be able to help the intruders. After the connection is broken, all information about this connection is deleted on all onion routers.

Users can also use onion routing in a more invasive way. A new layer is being injected into the TCP / IP network protocol stack, which redirects all TCP traffic through onion routing. This method is not implemented for Tor, as it goes against the ease of use (no need to be a superuser for installing and using Tor) and platform independence. In the first generation of onion routing, this was possible by using the NRaD redirector (for Windows 95 / NT). This program transparently routed TCP / IP connections through a network of onion routers. Research into such methods, combining ease of use, interoperability, and deployability, which are the goals of onion routing, is still ongoing.

Onion routing differs from other ways of preserving anonymity in three ways: communication takes place in real time and is bidirectional; anonymous connections are application independent; and there is no centralized trusted component. Applications can choose for themselves, whether to identify their users via an anonymous connection or not. However, the use of public switched networks should not automatically reveal the participants of communication.

Tor

Tor is a distributed overlay network created to anonymize applications that use low latency TCP connections, such as a web browser, secure shell, instant messaging. Clients choose a path through a network and create a so-called “circuit”, in which each node (onion router) in the path knows only the previous and subsequent node of the path, but not the other nodes of the circuit. The traffic passing through the circuit is sent in so-called “cells” of fixed size, from which encryption levels are removed using symmetric keys on each node, and which are then sent on.

Users can use onion proxies to access the Tor network. Onion routers maintain connections using TLS between each other, and the onion proxies connect to the network by establishing a TLS connection with one or more onion routers. All peer-to-peer communications are carried out through these TLS connections. The purpose of this network is to prevent third parties or participating nodes from associating the sender of the message with the recipient. User data is transmitted via fixed circuits. A circuit is a path within the Tor network consisting of two or more onion routers (three by default). At the beginning of the circuit is an onion proxy server that transmits data across the circuit; on the Tor network, the onion proxy is not considered part of the circuit. The output node is the onion router, which is responsible for delivering user data to the intended recipient (which may be outside the network). By default, the last node of the circuit acts as an output node, but other nodes can also serve as an output node; This characteristic is called “leaky pipes”. Such variable output nodes are supported in Tor due to the fact that several data streams can pass through one circuit. To simplify further material, the last node of the circuit will be considered as the output node.

Circuit establishing

Onion proxy server is responsible for creating circuits. For this purpose, the proxy server selects a sequence of onion routers, in which each node enters only once. The proxy server sequentially establishes symmetric keys with each of the routers using the circuit extend protocol: at each step the chain is extended by one node in a telescopic way so that establishing a key with an i + 1 node in the circuit passes through the current partial circuit, in which the i-th node temporarily acts as an output node. Establishing a circuit on the Tor network allows you to create a two-way communication channel between the onion proxy server and the output node. Once the circuit is established, the onion-based proxy server has one symmetric key with each node in this chain. Moreover, each onion router communicates with each adjacent node in the circuit using unique identifiers that will be used only in this circuit. Then the onion proxy server uses the established circuit to instruct the output node on which address and port to establish a TCP connection.

Data transmission

Data that is intended for transmission over an established connection is encapsulated into relay cells, and the relay protocol protects each cell with a checksum and several encryption levels: the onion proxy adds one encryption level (AES in counter mode with a key length of 128 bits) for each onion router in the circuit. After receiving the relay cell, the onion router finds the corresponding identifier of the circuit to which the cell belongs and uses the corresponding key to remove one level of encryption. If the cell is routed from the onion proxy server, the onion router checks the checksum of the resulting cell. The checksum consists of two zero bytes and four bytes of a one-way function with a key from the transmitted data. If the check is passed, then this relay cell is directed to this onion router (each node of the circuit can be an output node). Otherwise, the router finds the identifiers of the circuit and the next node for transmitting data in the circuit, replaces the circuit identifier and forwards the relay cell to the next router in the circuit. If a router at the end of the circuit receives an unidentified relay cell, an error occurs and the circuit is destroyed. The onion proxy handles incoming relay cells in a similar way: it iteratively removes one level of encryption for each router in the circuit, starting with the closest and ending at the farthest. If at some point the checksum is correct, then the cell was sent by a router whose encryption level has just been removed.

Cells

In the Tor network, all data exchanged between nodes is encapsulated into cells. In most cases, the cells have a fixed size. In version 4 and above, the fixed cell size is 514 bytes (512 bytes for versions below 4), and the cells consist of a header and a payload. The header consists of a chain identifier (id) field (4 bytes in size) and a one-byte command field (cmd), which indicates what needs to be done with the payload of the cell. The circuit identifiers are tied to the corresponding connections, that is, in the process of moving the cell along the circuit, it will have different identifiers for each connection between the routers through which it passes. The cell payload is protected by onion encryption, in which each cell is additionally encrypted using TLS during transmission between routers. Depending on the value of the command field in the header, the cells are divided into control cells, which are interpreted by the node to which they come, and relay cells that carry the end-to-end data flow. Control cells are used to create, maintain, and destroy circuits. Relay cells have an additional relay header that is in front of the payload and consists of a two-byte stream identifier, a six-byte checksum, a two-byte field for length, and a one-byte field for the relay command. The stream identifier allows you to transfer multiple data streams on the same circuit. The checksum is used by the onion routers to determine if the cell is intended for this router. The length field indicates the length in bytes of the payload to be retransmitted. The onion proxy server exchanges relay commands with the output node to support TCP connections. For example, to send the output node a command to open a TCP connection to the address specified in the body of the relay cell.

Onion Encryption Modelling

The presented model abstracts some aspects which can be important in real life (for example, key distribution and traffic analysis), but which are to some degree orthogonal to the cryptographic component of the problem. The model considers two types of roles, corresponding to onion proxies and onion routers in Tor. Onion routers and proxy servers are modeled by nodes in a (oriented) graph in which direct communications correspond to edges. In this model, there is an assumption that independent unidirectional secure channels correspond to the edges, and the graph is fully oriented, which allows any party to communicate safely (but not anonymously) directly with any other party. All participants have public unique identifiers and can play the role of a proxy server or router. As with Tor, onion proxies are responsible for circuit initialization and message encryption. The circuit consists of the onion proxy server and the path p along the graph of the onion routers. This path cannot contain cycles and its length is denoted by L. The circuit is represented as a vector of nodes p[1], . . . , p[L], where p[0] denotes a proxy server.

In each circuit, the terms sending node, the receiving node and the redirecting node respectively denote the onion proxy server, the last node of the path p [L] and intermediate nodes p[1], . . . , p[L-1] on the way. Proxy servers and routers have their own state vector, containing a state for each circuit to which they belong. Proxy servers use their state for encryption and know which circuit the encrypted cell is sent to. Routers, when receiving a cell, first determine which circuit it belongs to. Accordingly, the decryption is divided into two stages D1 and D2, where D1 is responsible for determining the corresponding circuit and D2 is responsible for processing the cell. The sender ID of a cell can be used to identify a circuit, but not necessarily uniquely, since multiple circuits can travel along a single edge. The node stores its individual decryption states in two separate state vectors τ1 and τ2, respectively for D1 and D2. Each time a new circuit is created, each receiving and redirecting node adds a new component to its state vectors τ1 and τ2.

Onion Encryption

The (symmetric) onion encryption scheme OEformula.png is a set of four algorithms for which the message space MsgSpc.png and the cell space CelSpc.png are defined.

  • The algorithm for creating circuits with states G is an abstraction of how circuits are created (in reality, this process is interactive). This algorithm takes as input p path in which there should not be loops (you cannot use the same router more than once), and includes the proxy server p[0]. The algorithm updates its internal state% (initially equal to ε) and returns the initial encryption state σ (transmitted p[0]) and two vectors t1 and t2 consisting of the initial components of the decryption states, each component of the vectors corresponds to one router in the path, that is, T1t2p.png. When receiving the corresponding t1 and t2 elements, the routers add these values ​​to a pair of their decryption states (τ1, τw). Thus, if Pabcde.png, the individual decryption states are updated in the following order τ1b.append ( t1 [1]), τ2b.append ( t2 [1]),. . . , τ1e.append ( t1 [4]), τ2e.append ( t2 [4]). Similarly, the proxy decryption status vector is updated: σa.append (σ).
  • Algorithm E is used by the proxy server to send a message along one of the circuits. Having the current encryption state σ [w] for the circuit locally designated w, and the message MMsgSp.png, the algorithm updates the encryption state and returns the initial CClSp.png along with the identifier d for the router to which the cell will be transmitted.
  • The deterministic algorithms D1 and D2 together are responsible for processing the incoming cell by the router. At the first stage, the D1 algorithm determines which circuit the incoming cell c belongs to, and the algorithm can also use the node identifier s, from which the given cell was received. In addition, the D1 algorithm takes as input the first decryption state of the current node τ1, but cannot change it. The algorithm returns a local index w, which shows which circuit the cell was associated with, and thus which component of the second decryption state vector τ2 should be used to process the cell by the D2 algorithm. The symbol ⊥ means that the corresponding chain was not found for the cell. At the second stage, the D2 algorithm takes as input parameters to the state component τ2 [w] together with the node from which the cell was obtained, and the cell itself. The algorithm can update the decryption state component (but no other part of the state) and returns the output string x and the destination node d. The value of d indicates the node to which the string x will be transmitted, where XClSp.png. If d = {}, then the router understands that the message is intended for him, in this case XMsgBot.png.

Correctness

Figure 1 - TRANSMIT game used to determine the correctness of the onion encryption scheme

Correctness ensures that the honestly generated cells are correctly redirected and decrypted to the original messages at the destination in the same order in which they were sent. Correctness should be maintained regardless of which circuits are created and when, or the order in which cells are processed, as long as the order of cells belonging to the same circuit is maintained.

Correctness is formulated through the game (see Fig. 1), in which the scheduler is allowed to create circuits, select messages sent by nodes, and determine the order in which individual routers process cells in different circuits. Reordering cells belonging to different circuits simulates the unpredictability of delays in the physical network, as well as the (limited) freedom of the router to mix cell processing to make it difficult to analyze traffic. The scheduler, however, cannot change cells.

Specifically, for each channel i, a list of mi messages is maintained, sent via this channel (using ENC), and a check at the end of the router (PASS) is performed whether the messages arrive in the correct order (counter ctri indicates the next message to mi to be received). Moreover, for each chain i and each of |pi| routers in their path (counted using j), a queue of Qij is maintained to track cells waiting to be processed by this router. Thus, forwarding a cell by a router causes it to be removed from the queue for the current router and chain and queued for the next router.

Definition 1 (Correctness). The OE onion-based encryption scheme is said to be correct if the following is true for all scheduling algorithms S (including computationally unbounded):

Transmit.png

where the game TRANSMIT is presented in Figure 1.


The integrity of the plaintext

Figure 2 - The PINT game used to define plaintext integrity for onion encryption schemes

The integrity of the plaintext ensures that even in the presence of an attacker with an almost complete knowledge of all states and full control over the network, an honest receiving node can be sure that the messages it receives correspond to the messages being sent (provided that the sending node was not compromised). This reflects an attacker's inability to add, modify, reorder, or replay messages.

The PINT game (Figure 2) models the integrity of the plaintext. For each circuit with an honest proxy, we maintain a mi list to check if messages arrive unchanged and in the correct order to an honest recipient. Oracle PROC (s, v, c) models the processing of node v of cell c, obtained from s.

Similar to the correctness game, the ctrn counter indicates the following message to mn to be received. As with determining the integrity of plain text for regular channels, if an honest recipient accepts a message that has not been sent (in that order), the opponent wins. In addition — and this concept is unique to routing networks — if the intermediate forwarding router assumes that the cell contains a valid message intended for it, the opponent wins. This victory is a consequence of the choice not to allow "leaky pipes."

Also note that the indexing of a nonexistent vector component returns a special character outside the set {0, 1}. That is, for any vector m and any k> 0, if Mt.png, then Mtk.png. In particular, if the onion encryption scheme allows for “dummy” cells that are decoded into an empty string, an attacker capable of faking such a dummy cell is considered successful in the PINT game.

Definition 2. The plaintext integrity advantage of adversary A against OE is defined by

Def2.png

where the game PINT is shown in Figure 2.

Confidentiality

Figure 3 - The LOR game used to define left-or-right indistinguishability for onion encryption schemes

Confidentiality ensures that the attacker does not receive knowledge of the contents of the messages sent along the circuit, provided that both the receiving and sending nodes will not be compromised. Otherwise, the attacker may have full knowledge of all states (except%) and full control over the network. As usual for confidentiality, both the passive CPA (Chosen Plaintext Attack) and the active CCA (Chosen Ciphertext Attack) can be considered. The cited game (Figure 3) represents Chosen Cell Attacks (Attacks on selected cells).

The mi lists are the same as before, although, since the concept of "left or right" is considered, they will contain either all the "left" or all the "right" messages. Oracle PROC plays the role of an oracle of decryption, which can lead to trivial victories, if an attacker is allowed to know the result of decrypting the last cell. As long as the receiving router is synchronized with the proxy server, the message that should be displayed will be suppressed by the attacker (by returning the special character "lightning" instead of it). As soon as one message is rejected (which includes the error symbol ⊥), the receiving node is considered unsynchronized and henceforth its output will no longer be suppressed. Intermediate nodes are considered to be unsynchronized from the very beginning, so their output will never be suppressed.

Definition 3. The plaintext confidentiality advantage of adversary A against OE is defined by:

Def3.png

where the LOR game is shown in Figure 3.

Glossary

Bibliography

Got to Bibliography

Back to top

Stukanov А.А., 2019