Deciphering Encryption: The AES Block Cipher

By November 29, 2016 Cybersecurity Readiness

Last time we explored the once-popular Data Encryption Standard (DES) block cipher. We observed that it implements the confusion and diffusion principles described by Claude Shannon. These are what made DES a useful cipher, but its short key length (just 56 bits) was weak. We saw that it’s possible to make DES securer by encrypting three times (i.e., Triple DES), but this is extra work that shouldn’t be necessary. The death of DES in the 1990s paved the way for the adoption of a new symmetric encryption standard.

The Advanced Encryption Standard

In 1997, the National Institute of Standards and Technology (NIST) put out a call for a new Advanced Encryption Standard (AES) to replace the DES cipher.[1] All in all, 15 algorithms were proposed by 1999. After much testing, analysis, and deliberation, NIST chose Joan Daemen and Vincent Rijmen’s Rijndael cipher. Thus, in 2001, the Rijndael cipher became the new US government standard for symmetric encryption.

How widespread has AES become? Without doubt, it’s used extensively in governmental applications. It has also become the most popular symmetric cipher used in nongovernmental applications. For example, AES has replaced weak ciphers like RC4 in SSL/TLS and the WPA2 Wi-Fi standard. The popularity of the AES cipher is due to its impressive strength and resistance to attacks. Let’s look more closely at AES and see why it’s so popular.

The Basis for AES

Like DES, AES is a symmetric block cipher, but that’s essentially where the similarities end. The AES cipher takes in 128-bit input blocks and produces 128-bit ciphertext blocks. Unlike DES, the AES cipher allows variable-length keys of 128, 192, or 256 bits. Whereas DES is based on simple Feistel networks, AES is based on special algebraic structures called Galois Fields (GFs). We’ll spare the nitty-gritty details of GFs, but let’s identify the key properties as they relate to AES.

 

ENCBlogDCPart4.1

There are different GFs, but AES uses a special one. Because computers operate on binary digits (bits), we want a GF that’s based on the number two (so we can represent zero and one). The designers of AES chose GF(28),[2] which is known as the “AES Field”. For our purposes, it’s worth noting that the AES Field is finite—it has a fixed size. With finite fields, it’s much easier to perform arithmetic operations because we know how large the field is.

We can perform modular addition, subtraction, and multiplication with these types of fields. Even more important for AES, we can perform modular inversion using the Extended Euclidean Algorithm. This inversion operation allows us to perform a kind of division, which completes our arithmetic toolkit.[3] Without getting bogged down in the mathematics of GFs, let’s see how AES encryption works.

Encryption with AES

In AES, there are three possible key lengths. A 128-bit key is standard and fairly secure. A 192-bit key is securer, and a 256-bit key is very secure. The key length determines how many rounds the cipher uses during encryption: A 128-bit key requires 10 rounds, a 192-bit key requires 12 rounds, and a 256-bit key requires 14 rounds. In each round, the entire 128-bit input block is encrypted.[4] Each round (except for the last) also performs identical operations, and before the first round, we add a subkey to the 128-bit input block to add an extra layer of security.[5] We also use a key-scheduling algorithm to determine each round’s subkey.[6]

ENCBlogDCPart4.2[1]

Each normal AES encryption round consists of four parts. First we perform a byte substitution. This provides the confusion element necessary to obscure the relationship between plaintext and ciphertext. We then perform two diffusion functions: ShiftRows and MixColumns. Finally, we add the round’s subkey to the output. The result is then fed through to the next encryption round. The final encryption round performs these same operations but omits the MixColumns function. The end product is a 128-bit ciphertext block. We’ll skip the deep mathematical details, but let’s briefly explore the high-level functions of the AES internals.

ENCBlogDCPart4.3

We say that AES takes a 128-bit input block, but that block is treated as 16 bytes.[7] In effect, instead of operating on the entire 128-bit block at a time, we operate on each of the 16 bytes that comprise that 128-bit block. We begin encryption with a byte substitution function.

The byte substitution divides our 16-byte input into four groups so that each group consists of four bytes. Each byte then undergoes an S-Box substitution. So, we have 16 S-Boxes that each operate on one byte. Each S-Box operates in parallel, and each is identical. We’ll skip the mathematical details, but the S-Box substitution table is precomputed using Galois Field arithmetic. Because the table is fixed by the AES standard, we can efficiently perform substitutions without the complex computations associated with GFs. After all 16 bytes have been substituted, we put them through the ShiftRows function.

The ShiftRows function is fairly simple, more so than the previous byte substitution. Here we shuffle (i.e., permute) the 16 bytes. In RC4 and DES we performed pseudorandom shuffles, but in AES, we perform a rather systematic permutation. If we imagine the 16 bytes as a four-by-four matrix, we can see what is meant by “ShiftRows”. Essentially, we cyclically shift each row in the matrix one position to the left by an increasing amount.[8] For the first row, we shift by zero (i.e., the row stays the same). For the second row, we shift left by one position. For the third row, we shift left by two positions, and for the last row, we shift left by three positions. That concludes the permutation.

ENCBlogDCPart4.4

In the ShiftRows function we used the rows of the four-by-four matrix, but now for the MixColumns function we’re going to use the columns of that matrix. We take each column and perform matrix multiplication with a fixed matrix of Galois Field values.[9] For each 4-byte column, the multiplication produces a 4-byte output. Taken together, they form a 16-byte output. Why are we doing this? This matrix multiplication step actually provides very strong diffusion by ensuring that any bit flipping is spread across the bytes. So, if one bit is flipped early on in the encryption process, MixColumns will ensure that 32 bits (i.e., four bytes) are affected. In effect, within one encryption round, one-quarter of the whole 128-bit input block will be affected. After we finish MixColumns, we put these bytes through one more encryption step.

We end an encryption round with a byte addition. We simply perform an exclusive-or (XOR) on our 16 bytes using the round’s subkey. The result of this XOR is taken as our output that we feed into the next encryption round. If we’re in the last encryption round, then the output of the XOR is taken as our 128-bit ciphertext block. That concludes the AES encryption process.

Decryption with AES

We used Feistel networks in DES, which gave us the neat ability to decrypt ciphertexts by simply running the cipher in reverse. The same principle can be applied to AES. To decrypt in AES, we can simply invert and reverse the cipher’s internals and its key-scheduling algorithm.

We essentially perform the steps of each encryption round in reverse. Each decryption round consists of inverted encryption functions in reversed order. So, we perform a subkey addition and invert the MixColumns function, the ShiftRows function, and the byte substitution. We perform one final subkey addition, which inverts the first subkey addition we performed in encryption.[10]

For each decryption round, we use the reverse order of the encryption subkeys. That is, the first round of decryption uses the last encryption subkey, and the last round of decryption uses the first encryption subkey. The result of this process is the decrypted plaintext message. Neat!

Security and Alternatives

Simply put, AES is probably the strongest publicly available block cipher. It sees widespread application in everything from bank machines to Wi-Fi security. Indeed, it’s actively used as a government standard. As a testament to the cipher’s strength, the NSA formerly used proprietary algorithms until it switched to allow for AES encryption of classified data up to TOP SECRET status using 192-bit or 256-bit keys. If AES is good enough for the NSA, then it must be pretty secure.

That said, advances in quantum computing may put AES at risk. In the next 10-20 years, quantum computers will be powerful enough to break today’s strongest encryptions. Increases in numbers of qubits and improvements in quantum entanglement will enable these computers to solve computationally complex tasks, like discovering AES keys. For now, though, AES is fairly safe and secure.

The major drawback to AES is that it’s a symmetric cipher. This means that encryption and decryption use the same key. This, in turn, means that if Alice and Bob want to use AES, they must either agree upon a key or exchange one. That key-exchange process is a major security liability, especially if a secured channel is unavailable. In the final parts of this series, we’ll see how asymmetric algorithms approach this problem.

[1] It’s of historical interest to note here that the AES competition was fully available to the public. That is, unlike the development of the DES cipher, candidate algorithms were subject to public scrutiny and cryptanalysis. This ensured that weak ciphers were removed from consideration to avoid another situation like with DES.
[2] If you’re curious, 28 = 256. That number should look familiar: It’s the largest key length supported by AES.
[3] It’s worth noting that the GF(28) field contains polynomials with coefficients of either zero or one (think binary). When we perform the actual arithmetic, we don’t add, subtract, multiply, or invert these polynomials. Rather, we operate on the bit vectors that represent them. For example, the polynomial x7 + x3 + x + 1 is represented by (0, 1, 0, 0, 0, 1, 0, 1, 1). We can use vector addition to produce another vector that represents a polynomial in GF(28). The neat implication here is that AES is encrypting bits, but those bits actually represent polynomials in GF(28).
[4] Remember in DES that the 64-bit input block is divided into two 32-bit halves. In each DES encryption round, only one 32-bit half is encrypted. By contrast, all 128 bits of the input block in AES are encrypted per round.
[5] This is known as “key whitening” and wasn’t around when DES was designed. This has become common practice in modern ciphers.
[6] Unfortunately, we don’t have space in this blog to look at the AES key-scheduling algorithm in any great detail. However, it’s worth noting that the subkeys generated by the algorithm are statistically uncorrelated to the original key, which is a huge boon to the cipher’s security.
[7] Here we’re assuming the computer represents each byte as eight bits. So, 16*8=128.
[8] When we shift cyclically, we’re actually rotating the row. As numbers “fall off” the row, they are placed at the end of the row. So, if the row “45 32 C2 24” is shifted two positions to the left, the result would be “C2 24 45 32”.
[9] Remember that we represent polynomials as bit vectors. The bit vector (0, 0, 0, 0, 0, 0, 1, 1) corresponds to 000000112 = 0316. We can use that byte in our matrix operations. Really, though, that byte represents a polynomial in GF(28).
[10] We’re glossing over some detail here, but we figure out the inversion process using Galois Field mathematics. Luckily, the details are specified by the AES standard and are fixed in standards-compliant implementations.