Block ciphers and their cryptanalysis
In cryptography, a block cipher is a deterministic algorithm operating on fixed-length groups of bits, called blocks, with an unvarying transformation that is specified by a symmetric key. Block ciphers are important elementary components in the design of many cryptographic protocols, and are widely used to implement encryption of bulk data.
The modern design of block ciphers is based on the concept of an iterated product cipher. Product ciphers were suggested and analyzed by Claude Shannon in his seminal 1949 publication Communication Theory of Secrecy Systems as a means to effectively improve security by combining simple operations such as substitutions and permutations.Iterated product ciphers carry out encryption in multiple rounds, each of which uses a different subkey derived from the original key. One widespread implementation of such ciphers is called a Feistel network, named after Horst Feistel, and notably implemented in the DES cipher. Many other realizations of block ciphers, such as the AES, are classified as substitution-permutation networks.
The publication of the DES cipher by the U.S. National Bureau of Standards (now National Institute of Standards and Technology, NIST) in 1977 was fundamental in the public understanding of modern block cipher design. In the same way, it influenced the academic development of cryptanalytic attacks. Both differential and linear cryptanalysis arose out of studies on the DES design. Today, there is a palette of attack techniques against which a block cipher must be secure, in addition to being robust against brute force attacks.
Even a secure block cipher is suitable only for the encryption of a single block under a fixed key. A multitude of block cipher modes of operation|modes of operation have been designed to allow their repeated use in a secure way, commonly to achieve the security goals of confidentiality and authentication|authenticity. However, block ciphers may also be used as building blocks in other cryptographic protocols, such as universal hash functions and pseudo-random number generators.
Contents |
Definition
A block cipher consists of two paired algorithms, one for encryption, E, and the other for decryption, D.Both algorithms accept two inputs: an input block of size n bits and a key of size k bits; and both yield an n-bit output block. The decryption algorithm D is defined to be the inverse function of encryption. More formally a block cipher is specified by an encryption function
which takes as input a key K of bit length k, called the key size, and a bit string P of length n, called the block size, and returns a string C of n bits. P is called the plaintext, and C is termed the ciphertext. For each K, the function (P) is required to be an invertible mapping on {0,1}^{n}. The inverse for E is defined as a function taking a key K and a ciphertext C to return a plaintext value P, such that
Design
Iterated block ciphers
Most block cipher algorithms are classified as iterated block ciphers which means that they transform fixed-size blocks of plain-text into identical size blocks of ciphertext, via the repeated application of an invertible transformation known as the round function, with each iteration referred to as a round.
Usually, the round function R takes different round keys K_{i} as second input, which are derived from the original key: where M_{0} is the plaintext and M_{r} the ciphertext, with r being the round number.
Frequently, key whitening is used in addition to this. At the beginning and the end, the data is modified with key material (often with XOR, but simple arithmetic operations like adding and subtracting are also used):
Substitution-permutation networks
One important type of iterated block cipher known as a substitution-permutation network (SPN) takes a block of the plaintext and the key as inputs, and applies several alternating rounds consisting of a substitution stage followed by a permutation stage—to produce each block of ciphertext output. The non-linear substitution stage mixes the key bits with those of the plaintext, creating Shannon's confusion. The linear permutation stage then dissipates redundancies, creating diffusion.
A substitution box (S-box) substitutes a small block of input bits with another block of output bits. This substitution must be one-to-one, to ensure invertibility (hence decryption). A secure S-box will have the property that changing one input bit will change about half of the output bits on average, exhibiting what is known as the avalanche effect—i.e. it has the property that each output bit will depend on every input bit.
A permutation box (P-box) is a permutation of all the bits: it takes the outputs of all the S-boxes of one round, permutes the bits, and feeds them into the S-boxes of the next round. A good P-box has the property that the output bits of any S-box are distributed to as many S-box inputs as possible.
At each round, the round key (obtained from the key with some simple operations, for instance, using S-boxes and P-boxes) is combined using some group operation, typically XOR.
Decryption is done by simply reversing the process (using the inverses of the S-boxes and P-boxes and applying the round keys in reversed order).
Feistel ciphers
In a Feistel cipher, the block of plain text to be encrypted is split into two equal-sized halves. The round function is applied to one half, using a subkey, and then the output is XORed with the other half. The two halves are then swapped.
Let F be the round function and let K_{0},K_{1},...,K_{n} be the sub-keys for the rounds 0,1,...,n respectively.
Then the basic operation is as follows:
Split the plaintext block into two equal pieces, (L_{0},R_{0})
For each round i =0,1,...,n, compute
Then the ciphertext is (R_{n+1}, L_{n+1}).
Decryption of a ciphertext (R_{n+1}, L_{n+1}) is accomplished by computing for i=n,n-1,...,0
Then (L_{0},R_{0}) is the plaintext again.
One advantage of the Feistel model compared to a substitution-permutation network is that the round function F does not have to be invertible.
Operating modes
Cryptanalysis
Cryptanalysis is the art and science of analyzing information systems in order to study the hidden aspects of the systems. Cryptanalysis is used to breach cryptographic security systems and gain access to the contents of encrypted messages, even if the cryptographic key is unknown. In addition to mathematical analysis of cryptographic algorithms, cryptanalysis also includes the study of side-channel attacks that do not target weaknesses in the cryptographic algorithms themselves, but instead exploit weaknesses in their implementation.
Brute-force search
In computer science, brute-force search or exhaustive search, also known as generate and test, is a very general problem-solving technique that consists of systematically enumerating all possible candidates for the solution and checking whether each candidate satisfies the problem's statement.
A brute-force algorithm to find the divisors of a natural number n would enumerate all integers from 1 to the square root of n, and check whether each of them divides n without remainder. A brute-force approach for the eight queens puzzle would examine all possible arrangements of 8 pieces on the 64-square chessboard, and, for each arrangement, check whether each (queen) piece can attack any other.
While a brute-force search is simple to implement, and will always find a solution if it exists, its cost is proportional to the number of candidate solutions - which in many practical problems tends to grow very quickly as the size of the problem increases. Therefore, brute-force search is typically used when the problem size is limited, or when there are problem-specific heuristics that can be used to reduce the set of candidate solutions to a manageable size. The method is also used when the simplicity of implementation is more important than speed.
This is the case, for example, in critical applications where any errors in the algorithm would have very serious consequences; or when using a computer to prove a mathematical theorem. Brute-force search is also useful as "baseline" method when benchmarking other algorithms or metaheuristics. Indeed, brute-force search can be viewed as the simplest metaheuristic. Brute force search should not be confused with backtracking, where large sets of solutions can be discarded without being explicitly enumerated (as in the textbook computer solution to the eight queens problem above). The brute-force method for finding an item in a table — namely, check all entries of the latter, sequentially — is called linear search.
Differential cryptanalysis
Differential cryptanalysis is a general form of cryptanalysis applicable primarily to block ciphers, but also to stream ciphers and cryptographic hash functions. In the broadest sense, it is the study of how differences in an input can affect the resultant difference at the output. In the case of a block cipher, it refers to a set of techniques for tracing differences through the network of transformations, discovering where the cipher exhibits non-random behavior, and exploiting such properties to recover the secretkey. Differential cryptanalysis is usually a chosen plaintext attack, meaning that the attacker must be able to obtain encrypted ciphertexts for some set of plaintexts of his choosing. The scheme can successfully cryptanalyze DES with an effort on the order 2^{47} chosen plaintexts. There are, however, extensions that would allow a known plaintext or even a ciphertext-only attack. The basic method uses pairs of plaintext related by a constant difference; difference can be defined in several ways, but the eXclusive OR (XOR) operation is usual. The attacker then computes the differences of the corresponding ciphertexts, hoping to detect statistical patterns in their distribution. The resulting pair of differences is called a differential. Their statistical properties depend upon the nature of the S-boxes used for encryption, so the attacker analyses differentials (Δ_{X}, Δ_{Y}), where Δ_{Y} = S(X ⊕ Δ_{X}) ⊕ S(X) (and ⊕ denotes exclusive or) for each such S-box S. In the basic attack, one particular ciphertext difference is expected to be especially frequent; in this way, the cipher can be distinguished from random. More sophisticated variations allow the key to be recovered faster than exhaustive search.
Linear cryptanalysis
In cryptography, linear cryptanalysis is a general form of cryptanalysis based on finding affine approximations to the action of a cipher. Attacks have been developed for block ciphers and stream ciphers. Linear cryptanalysis is one of the two most widely used attacks on block ciphers; the other being differential cryptanalysis.
The discovery is attributed to Mitsuru Matsui, who first applied the technique to the FEAL cipher (Matsui and Yamagishi, 1992).Subsequently, Matsui published an attack on the Data Encryption Standard (DES), eventually leading to the first experimental cryptanalysis of the cipher reported in the open community (Matsui, 1993; 1994). The attack on DES is not generally practical, requiring 2^{47} known plaintexts.A variety of refinements to the attack have been suggested, including using multiple linear approximations or incorporating non-linear expressions, leading to a generalized partitioning cryptanalysis. Evidence of security against linear cryptanalysis is usually expected of new cipher designs.
Integral cryptanalysis
In cryptography, integral cryptanalysis is a cryptanalytic attack that is particularly applicable to block ciphers based on substitution-permutation networks. It was originally designed by Lars Knudsen as a dedicated attack against Square, so is commonly known as the Square attack. It was also extended to a few other ciphers related to Square: CRYPTON, Rijndael, and SHARK. Stefan Lucks generalized the attack to what he called a saturation attack and used it to attack Twofish, which is not at all similar to Square, having a radically different Feistel network structure. Forms of integral cryptanalysis have since been applied to a variety of ciphers, including Hierocrypt, IDEA, Camellia, Skipjack, MISTY1, MISTY2, SAFER++, KHAZAD, and FOX (now called IDEA NXT).
Unlike differential cryptanalysis, which uses pairs of chosen plaintexts with a fixed XOR difference, integral cryptanalysis uses sets or even multisets of chosen plaintexts of which part is held constant and another part varies through all possibilities. For example, an attack might use 256 chosen plaintexts that have all but 8 of their bits the same, but all differ in those 8 bits. Such a set necessarily has an XOR sum of 0, and the XOR sums of the corresponding sets of ciphertexts provide information about the cipher's operation. This contrast between the differences of pairs of texts and the sums of larger sets of texts inspired the name "integral cryptanalysis", borrowing the terminology of calculus.
Related-key attack
In cryptography, a related-key attack is any form of cryptanalysis where the attacker can observe the operation of a cipher under several different keys whose values are initially unknown, but where some mathematical relationship connecting the keys is known to the attacker. For example, the attacker might know that the last 80 bits of the keys are always the same, even though he doesn't know, at first, what the bits are. This appears, at first glance, to be an unrealistic model; it would certainly be unlikely that an attacker could persuade a human cryptographer to encrypt plaintexts under numerous secret keys related in some way. However, modern cryptography is implemented using complex computer protocols, often not vetted by cryptographers, and in some cases a related-key attack is made very feasible.
Notable block ciphers
DES
DES is the archetypal block cipher— an algorithm that takes a fixed-length string of plaintext bits and transforms it through a series of complicated operations into another ciphertext bitstring of the same length. In the case of DES, the block size is 64 bits. DES also uses a key to customize the transformation, so that decryption can supposedly only be performed by those who know the particular key used to encrypt. The key ostensibly consists of 64 bits; however, only 56 of these are actually used by the algorithm. Eight bits are used solely for checking parity, and are thereafter discarded. Hence the effective key length is 56 bits, and it is always quoted as such.
The key is nominally stored or transmitted as 8 bytes, each with odd parity. According to ANSI X3.92-1981, section 3.5:
Like other block ciphers, DES by itself is not a secure means of encryption but must instead be used in a mode of operation. FIPS-81 specifies several modes for use with DES.Further comments on the usage of DES are contained in FIPS-74.
Overall structure
The algorithm's overall structure is shown in Figure 1: there are 16 identical stages of processing, termed rounds. There is also an initial and final permutation, termed IP and FP, which are inverses (IP "undoes" the action of FP, and vice versa). IP and FP have no cryptographic significance, but were included in order to facilitate loading blocks in and out of mid-1970s 8-bit based hardware.
Before the main rounds, the block is divided into two 32-bit halves and processed alternately; this criss-crossing is known as the Feistel scheme. The Feistel structure ensures that decryption and encryption are very similar processes — the only difference is that the subkeys are applied in the reverse order when decrypting. The rest of the algorithm is identical. This greatly simplifies implementation, particularly in hardware, as there is no need for separate encryption and decryption algorithms.
The ⊕ symbol denotes the exclusive-OR (XOR) operation. The F-function scrambles half a block together with some of the key. The output from the F-function is then combined with the other half of the block, and the halves are swapped before the next round. After the final round, the halves are swapped; this is a feature of the Feistel structure which makes encryption and decryption similar processes.
The Feistel (F) function
The F-function, depicted in Figure 2, operates on half a block (32 bits) at a time and consists of four stages:
- Expansion — the 32-bit half-block is expanded to 48 bits using the expansion permutation, denoted E in the diagram, by duplicating half of the bits. The output consists of eight 6-bit (8 * 6 = 48 bits) pieces, each containing a copy of 4 corresponding input bits, plus a copy of the immediately adjacent bit from each of the input pieces to either side.
- Key mixing — the result is combined with a subkey using an XOR operation. 16 48-bit subkeys — one for each round — are derived from the main key using the key schedule (described below).
- Substitution — after mixing in the subkey, the block is divided into eight 6-bit pieces before processing by the S-boxes, or substitution boxes. Each of the eight S-boxes replaces its six input bits with four output bits according to a non-linear transformation, provided in the form of a lookup table. The S-boxes provide the core of the security of DES — without them, the cipher would be linear, and trivially breakable.
- Permutation — finally, the 32 outputs from the S-boxes are rearranged according to a fixed permutation, the P-box. This is designed so that, after permutation, each S-box's output bits are spread across 4 different S boxes in the next round.
The alternation of substitution from the S-boxes, and permutation of bits from the P-box and E-expansion provides so-called "confusion and diffusion" respectively, a concept identified by Claude Shannon in the 1940s as a necessary condition for a secure yet practical cipher.
GOST 28147-89
The GOST block cipher, defined in the standard GOST 28147-89, is a Soviet and Russian government standard symmetric key block cipher. Also based on this block cipher is the GOST hash function.
Developed in the 1970s, the standard had been marked "Top Secret" and then downgraded to "Secret" in 1990. Shortly after the dissolution of the USSR, it was declassified and it was released to the public in 1994. GOST 28147 was a Soviet alternative to the United States standard algorithm, DES.
GOST has a 64-bit block size and a key length of 256 bits. Its S-boxes can be secret, and they contain about 354 (log_{2}(16!^{8})) bits of secret information, so the effective key size can be increased to 610 bits; however, a chosen-key attack can recover the contents of the S-Boxes in approximately 2^{32} encryptions. GOST is a Feistel network of 32 rounds. Its round function is very simple: add a 32-bit subkey modulo 2^{32}, put the result through a layer of S-boxes, and rotate that result left by 11 bits. The result of that is the output of the round function. In the diagram to the right, one line represents 32 bits.
The subkeys are chosen in a pre-specified order. The key schedule is very simple: break the 256-bit key into eight 32-bit subkeys, and each subkey is used four times in the algorithm; the first 24 rounds use the key words in order, the last 8 rounds use them in reverse order.
The S-boxes accept a four-bit input and produce a four-bit output. The S-box substitution in the round function consists of eight 4 × 4 S-boxes. The S-boxes are implementation-dependent – parties that want to secure their communications using GOST must be using the same S-boxes. For extra security, the S-boxes can be kept secret. In the original standard where GOST was specified, no S-boxes were given, but they were to be supplied somehow. This led to speculation that organizations the government wished to spy on were given weak S-boxes. One GOST chip manufacturer reported that he generated S-boxes himself using a pseudorandom number generator.
For example, the Central Bank of Russian Federation uses the following S-boxes:
# | S-Box |
---|---|
1 | 4 10 9 2 13 8 0 14 6 11 1 12 7 15 5 3 |
2 | 14 11 4 12 6 13 15 10 2 3 8 1 0 7 5 9 |
3 | 5 8 1 13 10 3 4 2 14 15 12 7 6 0 9 11 |
4 | 7 13 10 1 0 8 9 15 14 4 6 12 11 2 5 3 |
5 | 6 12 7 1 5 15 13 8 4 10 9 14 0 3 11 2 |
6 | 4 11 10 0 7 2 1 13 3 6 8 5 9 12 15 14 |
7 | 13 11 4 1 3 15 5 9 0 10 14 7 6 8 2 12 |
8 | 1 15 13 0 5 7 10 4 9 2 3 14 6 11 8 12 |
Cryptanalysis of GOST Compared to DES, GOST has a very simple round function. However, the designers of GOST attempted to offset the simplicity of the round function by specifying the algorithm with 32 rounds and secret S-boxes.
Another concern is that the avalanche effect is slower to occur in GOST than in DES. This is because of GOST's lack of an expansion permutation in the round function, as well as its use of a rotation instead of a permutation. Again, this is offset by GOST's increased number of rounds.
There is not much published cryptanalysis of GOST, but a cursory glance says that it seems secure. The large number of rounds and secret S-boxes makes both linear and differential cryptanalysis difficult. Its avalanche effect may be slower to occur, but it can propagate over 32 rounds very effectively.
However, GOST is not fully defined by its standard: It does not specify the S-boxes (replacement tables). On the one hand, this can be additional secure information (in addition to key). On the other hand, the following problems arise:
- different algorithm implementations can use different replacement tables, and thus, can be incompatible to each other
- possibility of deliberate weak replacement table usage
- possibility (standard does not forbid it) to use replacement tables in which nodes are not commutation, that may lead to extreme security downfall
Despite its apparently strong construction, GOST is vulnerable to generic attacks based on its short (64-bit) block size, and should therefore never be used in contexts where more than 2^{32} blocks could be encrypted with the same key.
Since 2007, several attacks were developed against GOST implementations with reduced number of rounds and/or keys with additional special properties.
In 2011 several authors discovered more significant flaws in GOST cipher, being able to attack full 32-round GOST with arbitrary keys for the first time. It has been even called "a deeply flawed cipher" by Nicolas Courtois.First attacks were able to reduce time complexity from 2^{256} to 2^{228} at the cost of huge memory requirements,and soon they were improved up to 2^{178} time complexity (at the cost of 2^{70} memory and 2^{64} data).
As of December 2012 the best known attack on GOST (2^{101}) is on par with the best known attack (2^{100}, based on another weakness noted by Nicolas Courtois) on widely used Advanced Encryption Standard.
GOST has been submitted to ISO standardization in 2010.
AES
The Advanced Encryption Standard (AES) is a specification for the encryption of electronic data established by the U.S. National Institute of Standards and Technology (NIST) in 2001. For AES, NIST selected three members of the Rijndael family, each with a block size of 128 bits, but three different key lengths: 128, 192 and 256 bits.
AES is based on a design principle known as a substitution-permutation network, and is fast in both software and hardware.Unlike its predecessor DES, AES does not use a Feistel network. AES has a fixed block size of 128 bits, and a key size of 128, 192, or 256 bits.
AES operates on a 4×4 column-major order matrix of bytes, termed the state, although some versions of Rijndael have a larger block size and have additional columns in the state. Most AES calculations are done in a special Finite field arithmetic|finite field.
The key size used for an AES cipher specifies the number of repetitions of transformation rounds that convert the input, called the plaintext, into the final output, called the ciphertext. The number of cycles of repetition are as follows:
- 10 cycles of repetition for 128-bit keys.
- 12 cycles of repetition for 192-bit keys.
- 14 cycles of repetition for 256-bit keys.
Each round consists of several processing steps, each containing four similar but different stages, including one that depends on the encryption key itself. A set of reverse rounds are applied to transform ciphertext back into the original plaintext using the same encryption key.
High-level description of the algorithm
- KeyExpansion—round keys are derived from the cipher key using Rijndael's key schedule. AES requires a separate 128-bit round key block for each round plus one more.
- InitialRound
- AddRoundKey—each byte of the state is combined with a block of the round key using bitwise xor.
- Rounds
- SubBytes—a non-linear substitution step where each byte is replaced with another according to a lookup table.
- ShiftRows—a transposition step where each row of the state is shifted cyclically a certain number of steps.
- MixColumns—a mixing operation which operates on the columns of the state, combining the four bytes in each column.
- AddRoundKey
- Final Round (no MixColumns)
- SubBytes
- ShiftRows
- AddRoundKey.
The SubBytes step
In the SubBytes step, each byte a_{i,j} in the state matrix is replaced with a SubByte S(a_{i,j}) using an 8-bit substitution box, the Rijndael S-box. This operation provides the non-linearity in the cipher. The S-box used is derived from the multiplicative inverse over GF(2^{8}), known to have good non-linearity properties. To avoid attacks based on simple algebraic properties, the S-box is constructed by combining the inverse function with an invertible affine transformation. The S-box is also chosen to avoid any fixed points (and so is a derangement), i.e., , and also any opposite fixed points, i.e., . While performing the decryption, Inverse SubBytes step is used, which requires first taking the affine transformation and then finding the multiplicative inverse( just reversing the steps used in SubBytes step).
The ShiftRows step
The ShiftRows step operates on the rows of the state; it cyclically shifts the bytes in each row by a certain offset. For AES, the first row is left unchanged. Each byte of the second row is shifted one to the left. Similarly, the third and fourth rows are shifted by offsets of two and three respectively. For blocks of sizes 128 bits and 192 bits, the shifting pattern is the same. Row n is shifted left circular by n-1 bytes. In this way, each column of the output state of the ShiftRows step is composed of bytes from each column of the input state. (Rijndael variants with a larger block size have slightly different offsets). For a 256-bit block, the first row is unchanged and the shifting for the second, third and fourth row is 1 byte, 3 bytes and 4 bytes respectively—this change only applies for the Rijndael cipher when used with a 256-bit block, as AES does not use 256-bit blocks. The importance of this step is to avoid the columns being linearly independent, in which case, AES degenerates into four independent block ciphers.
The MixColumns step
In the MixColumns step, the four bytes of each column of the state are combined using an invertible linear transformation. The MixColumns function takes four bytes as input and outputs four bytes, where each input byte affects all four output bytes. Together with ShiftRows, MixColumns provides diffusion in the cipher.
During this operation, each column is multiplied by a fixed matrix:
Matrix multiplication is composed of multiplication and addition of the entries, and here the multiplication operation can be defined as this: multiplication by 1 means no change, multiplication by 2 means shifting to the left, and multiplication by 3 means shifting to the left and then performing XOR with the initial unshifted value. After shifting, a conditional XOR with 0x1B should be performed if the shifted value is larger than 0xFF. (These are special cases of the usual multiplication in GF(2^{8}).) Addition is simply XOR.
In more general sense, each column is treated as a polynomial over GF(2^{8}) and is then multiplied modulo x^{4}+1 with a fixed polynomial c(x) = 0x03 · x^{3} + x^{2} + x + 0x02. The coefficients are displayed in their hexadecimal equivalent of the binary representation of bit polynomials from GF(2)[x]. The MixColumns step can also be viewed as a multiplication by the shown particular MDS matrix in the finite field GF(2^{8}). This process is described further in the article Rijndael mix columns.
The AddRoundKey step
In the AddRoundKey step, the subkey is combined with the state. For each round, a subkey is derived from the main key using Rijndael's key schedule; each subkey is the same size as the state. The subkey is added by combining each byte of the state with the corresponding byte of the subkey using bitwise XOR.
Optimization of the cipher
On systems with 32-bit or larger words, it is possible to speed up execution of this cipher by combining the SubBytes and ShiftRows steps with the MixColumns step by transforming them into a sequence of table lookups. This requires four 256-entry 32-bit tables, and utilizes a total of four kilobytes (4096 bytes) of memory — one kilobyte for each table. A round can then be done with 16 table lookups and 12 32-bit exclusive-or operations, followed by four 32-bit exclusive-or operations in the AddRoundKey step.
If the resulting four-kilobyte table size is too large for a given target platform, the table lookup operation can be performed with a single 256-entry 32-bit (i.e. 1 kilobyte) table by the use of circular rotates.
Using a byte-oriented approach, it is possible to combine the SubBytes, ShiftRows, and MixColumns steps into a single round operation.
Glossary
- DES
- AES
- Encryption
- Decryption
- Cryptographic protocol
- Key
- Cryptanalysis
- Cryptosystem
- Plaintext
- Decryption algorithm
- Cipher
- Ciphertext
- Block cipher
- Hash Function
References
Go to list of references
Shchukin E.,
Serebryakov D.