Introduction
With the development of symmetric encryption, the DES data encryption standard is no longer suitable for the security requirements of today's distributed open networks because of its short key length (56 bits). Therefore, in 1997, NIST publicly solicited a new data encryption standard, namely AES[1]. After three rounds of screening, the Rijndael algorithm submitted by Joan Daemen and Vincent Rijmen of Belgium was selected as the AES algorithm. It has become the new data encryption standard in the United States and is widely used in various fields. Although opinions on AES still differ, in general AES, as a new generation of data encryption standard, offers strong security, high performance, high efficiency, ease of use, and flexibility. AES is designed with three key lengths: 128, 192, and 256 bits. A 128-bit AES key has about 10^21 times as many possible values as a 56-bit DES key (2^128 / 2^56 = 2^72 ≈ 4.7 × 10^21)[2]. The AES algorithm mainly involves three aspects: the round transformation, the number of rounds, and key expansion. This article takes the 128-bit key as an example to introduce the basic principles of the algorithm and implements AES in AVR assembly language.
1 AES encryption and decryption algorithm principle and AVR implementation
AES is a block cipher. In the configuration described here, the algorithm takes a 128-bit data block as input, and the key length is also 128 bits. Nr denotes the number of encryption rounds applied to a data block (the relationship between the number of rounds and the key length is listed in Table 1). Each round requires an expanded key Expandedkey(i) of the same length as the input block. Since the length of the external encryption key K is limited, a key expansion routine (Keyexpansion) expands K into a longer bit string that supplies the encryption and decryption keys for each round.
1.1 Round Transformation
Each round transformation of AES consists of the following three layers:
nonlinear layer - Subbyte transformation;
linear mixing layer - ShiftRow and MixColumn operations;
key addition layer - AddRoundKey operation.
① The Subbyte transformation is a nonlinear byte substitution applied independently to each byte of the state. It can be implemented as a lookup in a precomputed S-box table.
Schange:
    ldi zh,$01      ; point ZH at the S-box table (base address $0100)
    mov zl,r2       ; use the state byte as the low byte of the pointer (table index)
    ld temp,z+      ; read the substituted byte from the table
    mov r2,temp     ; write it back to the state to complete the lookup
.
.
.
ret
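As a hedged illustration in Python (not the authors' AVR code), the sketch below builds the standard AES S-box from its definition, the multiplicative inverse in GF(2^8) followed by an affine transformation, and then applies it as one table lookup per byte, which is what `Schange` does with the Z pointer. The names `gf_mul`, `build_sbox`, and `sub_bytes` are illustrative and do not come from the original program.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox():
    """Return the 256-entry AES S-box as a list of ints."""
    sbox = [0x63] * 256  # the inverse of 0 is defined as 0, so S(0) = 0x63
    for value in range(1, 256):
        # brute-force search for the multiplicative inverse of value
        inv = next(c for c in range(1, 256) if gf_mul(value, c) == 1)
        # affine transformation: XOR inv with its four left rotations, then 0x63
        out = inv
        for shift in range(1, 5):
            out ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox[value] = out ^ 0x63
    return sbox


SBOX = build_sbox()


def sub_bytes(state):
    """Replace every state byte by its S-box entry (one lookup per byte)."""
    return [SBOX[b] for b in state]
```

On a microcontroller the table would normally be computed offline and stored as constant data; the brute-force inverse search here is only meant to make the definition explicit.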
② ShiftRow is a byte permutation. It cyclically shifts the rows of the state by different offsets, and the offsets depend on the block size Nb [3]. For Nb = 4, rows 0 to 3 are shifted left by 0, 1, 2, and 3 bytes respectively.
Shiftrow:               ; cyclic row shift of the 4×4 state
; state before            state after
;   r2  r6  r10 r14         r2  r6  r10 r14   (row 0: unchanged)
;   r3  r7  r11 r15         r7  r11 r15 r3    (row 1: shifted left by 1)
;   r4  r8  r12 r17         r12 r17 r4  r8    (row 2: shifted left by 2)
;   r5  r9  r13 r18         r18 r5  r9  r13   (row 3: shifted left by 3)
    mov temp,r3         ; row 1
    mov r3,r7
    mov r7,r11
    mov r11,r15
    mov r15,temp
    mov temp,r4         ; row 2
    mov temp1,r8
    mov r4,r12
    mov r8,r17
    mov r12,temp
    mov r17,temp1
    mov temp,r18        ; row 3
    mov r18,r13
    mov r13,r9
    mov r9,r5
    mov r5,temp
    ret
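The same permutation can be written compactly in Python. This is a minimal sketch, assuming the state is held as a flat list of 16 bytes in column-major order (index = row + 4 × column), which corresponds to the register order r2, r3, r4, r5, r6, ... used in the listing above. The function name `shift_rows` is illustrative.

```python
def shift_rows(state):
    """Cyclically shift row r of the 4x4 state left by r positions.

    state is a list of 16 bytes in column-major order: index = row + 4*col.
    """
    return [state[row + 4 * ((col + row) % 4)]
            for col in range(4)
            for row in range(4)]


# Label every byte with its original index to make the movement visible.
shifted = shift_rows(list(range(16)))
print(shifted)
```

In the output, position 1 (register r3 in the listing) holds the byte that was at position 5 (r7), matching the first `mov r3,r7` of the assembly routine.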
③ In the MixColumn transformation, each column of the state is regarded as a polynomial a(x) with coefficients in GF(2^8) and is multiplied by the fixed polynomial c(x) = 03·x^3 + 01·x^2 + 01·x + 02. The product b(x) is not an ordinary polynomial product; it is reduced modulo x^4+1:
$$b(x) = c(x) \cdot a(x) \bmod (x^4 + 1)$$
For the first coefficient this gives
$$b_0 = 02 \cdot a_0 + 03 \cdot a_1 + a_2 + a_3$$
Let
$$\mathrm{xtime}(a_0) = 02 \cdot a_0$$
where the symbol "·" denotes multiplication in GF(2^8), that is, multiplication modulo an irreducible polynomial of degree 8 (for AES, x^8+x^4+x^3+x+1), and "+" denotes bitwise XOR [3].
    mov temp,a0     ; MixColumn subroutine: compute b0 for one column
    rcall xtime     ; temp = 02·a0
    mov a0,temp     ; a0 = 02·a0
    mov temp,a1
    rcall xtime     ; temp = 02·a1
    eor a0,a1       ; a0 = 02·a0 + a1
    eor a0,temp     ; a0 = 02·a0 + 03·a1
    eor a0,a2
    eor a0,a3       ; a0 = b0 = 02·a0 + 03·a1 + a2 + a3
.
.
.
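xtime:                  ; temp = 02·temp in GF(2^8)
    ldi temp1,$1b       ; low byte of the reduction polynomial x^8+x^4+x^3+x+1
    lsl temp            ; multiply by x; the old bit 7 moves into the carry flag
    brcs next1          ; if bit 7 was 1, the result must be reduced
next:  ret              ; otherwise the shifted value is already the result
next1: eor temp,temp1   ; reduce by XORing with $1b
    rjmp next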
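To make the arithmetic above easier to check, here is a hedged Python sketch of `xtime` and of MixColumn for a single column. It follows the formula for b0 given earlier and the standard AES coefficients for b1 to b3; the function name `mix_column` is illustrative. Unlike the assembly fragment, which overwrites a0 with b0, the sketch keeps the four input bytes unchanged until all four outputs are computed.

```python
def xtime(a):
    """Multiply a byte by 02 in GF(2^8): shift left, reduce with 0x11B on overflow."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11B
    return a


def mix_column(col):
    """Apply MixColumn to one column [a0, a1, a2, a3]; 03*a is xtime(a) ^ a."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ (xtime(a1) ^ a1) ^ a2 ^ a3,   # b0 = 02*a0 + 03*a1 + a2 + a3
        a0 ^ xtime(a1) ^ (xtime(a2) ^ a2) ^ a3,   # b1 = a0 + 02*a1 + 03*a2 + a3
        a0 ^ a1 ^ xtime(a2) ^ (xtime(a3) ^ a3),   # b2 = a0 + a1 + 02*a2 + 03*a3
        (xtime(a0) ^ a0) ^ a1 ^ a2 ^ xtime(a3),   # b3 = 03*a0 + a1 + a2 + 02*a3
    ]


print([hex(b) for b in mix_column([0xDB, 0x13, 0x53, 0x45])])
```

The XOR with `0x11B` on a 9-bit intermediate is equivalent to the `eor temp,temp1` with `$1b` in the listing, where the ninth bit has already been shifted out into the carry flag.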
For the inverse transformation, the coefficient matrix C built from c(x) is replaced by the corresponding matrix D built from the inverse polynomial d(x), that is, b(x)=d(x)·a(x) mod (x^4+1).
④ The key addition operation (addround) XORs each byte of the state, bit by bit, with the corresponding byte of the round key.
⑤ Because each of these transformations is invertible [1], decryption applies the inverse transformations in reverse order. It is not described in detail here.
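Key addition is the simplest case of an invertible step, because XOR is its own inverse: applying the same round key twice restores the state. The following Python sketch illustrates this; the function name `add_round_key` and the use of flat 16-byte lists are assumptions made for the example.

```python
def add_round_key(state, round_key):
    """XOR each state byte with the corresponding round-key byte."""
    return [s ^ k for s, k in zip(state, round_key)]


state = list(bytes.fromhex("00112233445566778899aabbccddeeff"))
round_key = list(bytes.fromhex("000102030405060708090a0b0c0d0e0f"))

mixed = add_round_key(state, round_key)
restored = add_round_key(mixed, round_key)   # the same operation undoes itself
print(bytes(mixed).hex(), restored == state)
```

The other layers need separate inverse routines (an inverse S-box, a right shift of the rows, and the inverse column transformation).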
1.2 Number of Rounds
The number of rounds depends on the block length and the key length, as listed in Table 1. For the 128-bit block used by AES, Nr is 10, 12, and 14 for key lengths of 128, 192, and 256 bits respectively.
1.3 Key Expansion
The AES algorithm takes the external input key K (Nk words long) and derives an expanded key of 4(Nr+1) words through the key expansion procedure. It involves the following three modules:
① Position transformation (rotword) - change a 4-byte sequence [A, B, C, D] into [B, C, D, A];
② S-box transformation (subword) - apply the S-box substitution to each byte of a 4-byte word;
③ Transformation Rcon[i] - Rcon[i] is the 32-bit word [x^(i-1), 00, 00, 00], where x is the element (02) of GF(2^8), for example
Rcon[1]=[01, 00, 00, 00]; Rcon[2]=[02, 00, 00, 00]; Rcon[3]=[04, 00, 00, 00]...
Generation of the expanded key: the first Nk words of the expanded key are the external key K. Each subsequent word W[i] is the XOR of the previous word W[i-1] and the word Nk positions earlier, W[i-Nk], that is, W[i] = W[i-1] ⊕ W[i-Nk]. However, if i is a multiple of Nk, then W[i] = W[i-Nk] ⊕ Subword(Rotword(W[i-1])) ⊕ Rcon[i/Nk].
The program implements key expansion mainly by calling the subroutines above:
Keyexpansion:
    rcall rotword   ; rotate the word: [A, B, C, D] -> [B, C, D, A]
    rcall subword   ; S-box substitution on each byte of the word
    rcall Rcon      ; XOR with the round constant Rcon[i/Nk]
.
.
.
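The key schedule described above can be sketched in Python for the 128-bit case (Nk = 4, Nr = 10). This is an illustration under stated assumptions rather than a transcription of the AVR routine: the S-box is generated at start-up instead of being stored as a table, and the names `key_expansion`, `rot_word`, and `sub_word` are chosen for the example.

```python
def xtime(a):
    """Multiply a byte by 02 in GF(2^8)."""
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def build_sbox():
    """Build the AES S-box: field inverse followed by the affine transformation."""
    sbox = [0x63] * 256
    for value in range(1, 256):
        inv = next(c for c in range(1, 256) if gf_mul(value, c) == 1)
        out = inv
        for shift in range(1, 5):
            out ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox[value] = out ^ 0x63
    return sbox


SBOX = build_sbox()


def rot_word(word):
    """[A, B, C, D] -> [B, C, D, A]"""
    return word[1:] + word[:1]


def sub_word(word):
    """Apply the S-box to each byte of a 4-byte word."""
    return [SBOX[b] for b in word]


def key_expansion(key):
    """Expand a 16-byte key into 44 words (Nk = 4, Nr = 10 only)."""
    nk, nr = 4, 10
    words = [list(key[4 * i:4 * i + 4]) for i in range(nk)]
    rcon = 0x01                                  # first byte of Rcon[1]
    for i in range(nk, 4 * (nr + 1)):
        temp = words[i - 1]
        if i % nk == 0:
            temp = sub_word(rot_word(temp))
            temp[0] ^= rcon                      # Rcon[i/Nk] = [rcon, 00, 00, 00]
            rcon = xtime(rcon)                   # next power of 02
        words.append([a ^ b for a, b in zip(words[i - nk], temp)])
    return words


w = key_expansion(bytes.fromhex("000102030405060708090a0b0c0d0e0f"))
print(len(w), bytes(w[4]).hex())
```

For 256-bit keys (Nk = 8) the standard adds one more Subword step when i mod Nk = 4; that case is deliberately left out of this sketch.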
The encryption and decryption process of AES is shown in Figure 1.
Figure 1 AES encryption and decryption process
2 Optimization of AES encryption and decryption algorithm
From the process described above, it is clear that the most time-consuming part of the program is the round transformation, so this is where optimization effort should be focused. Within the round transformation, the step that can be optimized is the column transformation (MixColumn), because it consists of modular multiplications in GF(2^8). Since the encryption and decryption procedures of AES are not symmetric in cost, without optimization decryption is much slower than encryption [1].
① Encryption operation. The column transformation (Mixcolumn) can be optimized by calling the xtime subroutine. The specific algorithm [1] is implemented as follows:
Another effective optimization is to precompute a column transformation table offline, so that encryption can be sped up by simple table lookups.
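The table approach can be sketched as follows in Python. The assumption here is that two 256-byte tables are precomputed, one for multiplication by 02 and one for multiplication by 03 (512 bytes in total); the article does not state the exact table layout, so `MUL2`, `MUL3`, and `mix_column_tables` are hypothetical names for one possible arrangement.

```python
def xtime(a):
    """Multiply a byte by 02 in GF(2^8)."""
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a


# Precomputed offline: products of every byte with 02 and with 03.
MUL2 = [xtime(a) for a in range(256)]
MUL3 = [MUL2[a] ^ a for a in range(256)]


def mix_column_tables(col):
    """MixColumn for one column using only table lookups and XOR."""
    a0, a1, a2, a3 = col
    return [
        MUL2[a0] ^ MUL3[a1] ^ a2 ^ a3,
        a0 ^ MUL2[a1] ^ MUL3[a2] ^ a3,
        a0 ^ a1 ^ MUL2[a2] ^ MUL3[a3],
        MUL3[a0] ^ a1 ^ a2 ^ MUL2[a3],
    ]


print([hex(b) for b in mix_column_tables([0xD4, 0xBF, 0x5D, 0x30])])
```

The trade-off is the usual one: no shifts or conditional branches at run time, at the cost of extra table storage.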
② Optimization of the decryption algorithm. The coefficients of the decryption column transformation are 09, 0E, 0B, and 0D. Implementing these multiplications directly on the AVR microcontroller takes considerable time, which reduces decryption performance.
Optimization method 1: Decompose the column changes to reduce the number of multiplications.
A careful study of the coefficients of the decryption matrix shows that the decryption matrix and the encryption matrix are related: the decryption matrix equals the product of the encryption matrix and another matrix. This relationship can be used to optimize the algorithm:
In this way, the column transformation can be computed with only a few additional simple XOR operations, which reduces the number of multiplications and increases the speed of decryption.
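One well-known decomposition with this property is d(x) = (04·x^2 + 05)·c(x) mod (x^4+1); whether it is exactly the one the authors used is not stated in the text, so the Python sketch below should be read as an assumption. It preprocesses the column with two doubled XOR terms and then reuses the forward MixColumn, and it is checked against the direct multiplication by 0E, 0B, 0D, and 09. All function names are illustrative.

```python
def xtime(a):
    """Multiply a byte by 02 in GF(2^8)."""
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a


def gf_mul(a, b):
    """General GF(2^8) multiplication, used only for the reference version."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def mix_column(col):
    """Forward MixColumn for one column (coefficients 02, 03, 01, 01)."""
    a0, a1, a2, a3 = col
    d0, d1, d2, d3 = (xtime(a) for a in col)
    return [d0 ^ d1 ^ a1 ^ a2 ^ a3,
            a0 ^ d1 ^ d2 ^ a2 ^ a3,
            a0 ^ a1 ^ d2 ^ d3 ^ a3,
            d0 ^ a0 ^ a1 ^ a2 ^ d3]


def inv_mix_column(col):
    """Inverse MixColumn as a preprocessing step followed by forward MixColumn."""
    a0, a1, a2, a3 = col
    u = xtime(xtime(a0 ^ a2))        # 04 * (a0 + a2)
    v = xtime(xtime(a1 ^ a3))        # 04 * (a1 + a3)
    return mix_column([a0 ^ u, a1 ^ v, a2 ^ u, a3 ^ v])


def inv_mix_column_direct(col):
    """Reference version: multiply directly by 0E, 0B, 0D and 09."""
    a0, a1, a2, a3 = col
    m = gf_mul
    return [m(a0, 0x0E) ^ m(a1, 0x0B) ^ m(a2, 0x0D) ^ m(a3, 0x09),
            m(a0, 0x09) ^ m(a1, 0x0E) ^ m(a2, 0x0B) ^ m(a3, 0x0D),
            m(a0, 0x0D) ^ m(a1, 0x09) ^ m(a2, 0x0E) ^ m(a3, 0x0B),
            m(a0, 0x0B) ^ m(a1, 0x0D) ^ m(a2, 0x09) ^ m(a3, 0x0E)]


column = [0x8E, 0x4D, 0xA1, 0xBC]
print(inv_mix_column(column) == inv_mix_column_direct(column))
```

In this form the inverse transformation costs four `xtime` calls more than the forward one, and no extra tables are needed.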
Optimization method 2: construct a table.
As with the encryption tables, four tables can be constructed: T[ea]=e×a; T[9a]=9×a; T[da]=d×a; T[ba]=b×a. In this way, decryption needs only table lookups and simple XOR operations. Although this method requires additional storage for the tables, it is an effective method.
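A minimal Python sketch of the four-table method follows. It assumes one 256-byte table per coefficient (1024 bytes in total); the table names `T9`, `TB`, `TD`, `TE` and the function `inv_mix_column_tables` are illustrative, and the exact storage layout used on the AVR is not given in the article.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


# Built once, offline: products of every byte with 09, 0B, 0D and 0E.
T9 = [gf_mul(a, 0x09) for a in range(256)]
TB = [gf_mul(a, 0x0B) for a in range(256)]
TD = [gf_mul(a, 0x0D) for a in range(256)]
TE = [gf_mul(a, 0x0E) for a in range(256)]


def inv_mix_column_tables(col):
    """Inverse MixColumn for one column using only lookups and XOR."""
    a0, a1, a2, a3 = col
    return [
        TE[a0] ^ TB[a1] ^ TD[a2] ^ T9[a3],
        T9[a0] ^ TE[a1] ^ TB[a2] ^ TD[a3],
        TD[a0] ^ T9[a1] ^ TE[a2] ^ TB[a3],
        TB[a0] ^ TD[a1] ^ T9[a2] ^ TE[a3],
    ]


print([hex(b) for b in inv_mix_column_tables([0x8E, 0x4D, 0xA1, 0xBC])])
```

Each output byte costs four lookups and three XORs, which is why this variant trades memory for speed.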
3 Experimental simulation of AES encryption and decryption
According to the above experimental steps and optimization methods, the experimental results listed in Tables 2 and 3 are obtained.
Assume that the master key is 000102030405060708090A0B0C0D0E0F (128 bits).
Encryption, input plaintext: 00112233445566778899AABBCCDDEEFF.
Encryption, output ciphertext: 69C4E0D86A7B0430D8CDB78070B4C55A.
Decryption, input ciphertext: 69C4E0D86A7B0430D8CDB78070B4C55A.
Decryption, output plaintext: 00112233445566778899AABBCCDDEEFF.
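The key, plaintext, and ciphertext above can be cross-checked with a compact reference implementation. The following Python sketch of AES-128 encryption is only a verification aid under stated assumptions (flat column-major state, S-box generated at start-up, illustrative function names); it is not the AVR program and makes no attempt at speed or side-channel resistance.

```python
def xtime(a):
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a


def gf_mul(a, b):
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def build_sbox():
    sbox = [0x63] * 256
    for value in range(1, 256):
        inv = next(c for c in range(1, 256) if gf_mul(value, c) == 1)
        out = inv
        for shift in range(1, 5):
            out ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox[value] = out ^ 0x63
    return sbox


SBOX = build_sbox()


def expand_key(key):
    """Return the 11 round keys (16 bytes each) for a 128-bit key."""
    words = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 0x01
    for i in range(4, 44):
        temp = words[i - 1]
        if i % 4 == 0:
            temp = [SBOX[b] for b in temp[1:] + temp[:1]]
            temp[0] ^= rcon
            rcon = xtime(rcon)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return [sum(words[4 * r:4 * r + 4], []) for r in range(11)]


def add_round_key(state, round_key):
    return [s ^ k for s, k in zip(state, round_key)]


def shift_rows(state):
    return [state[row + 4 * ((col + row) % 4)]
            for col in range(4) for row in range(4)]


def mix_columns(state):
    out = []
    for c in range(0, 16, 4):
        a0, a1, a2, a3 = state[c:c + 4]
        out += [xtime(a0) ^ xtime(a1) ^ a1 ^ a2 ^ a3,
                a0 ^ xtime(a1) ^ xtime(a2) ^ a2 ^ a3,
                a0 ^ a1 ^ xtime(a2) ^ xtime(a3) ^ a3,
                xtime(a0) ^ a0 ^ a1 ^ a2 ^ xtime(a3)]
    return out


def encrypt_block(plaintext, key):
    """Encrypt one 16-byte block with a 16-byte key (10 rounds)."""
    round_keys = expand_key(key)
    state = add_round_key(list(plaintext), round_keys[0])
    for rnd in range(1, 10):
        state = mix_columns(shift_rows([SBOX[b] for b in state]))
        state = add_round_key(state, round_keys[rnd])
    state = shift_rows([SBOX[b] for b in state])     # final round: no MixColumn
    return bytes(add_round_key(state, round_keys[10]))


key = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
plaintext = bytes.fromhex("00112233445566778899aabbccddeeff")
print(encrypt_block(plaintext, key).hex().upper())
```

The printed value can be compared directly with the ciphertext listed above. Decryption is not included in this sketch.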
In short, although AES is a symmetric cipher, its encryption and decryption procedures are not symmetric: decryption is more complicated and time-consuming than encryption. Decryption optimization method 1 does not increase the storage space; it works by decomposing the column transformation, and the resulting program is smaller and faster than the unoptimized one. Decryption optimization method 2 (table lookup) is the fastest and most efficient, but it increases the system's storage requirement, so its program is also the largest.
Note: AES128 data encryption and decryption program can be found on the website of this journal (www.dpj.com.cn).
Conclusion
The AES advanced data encryption algorithm is superior to the DES data encryption algorithm in security, efficiency, and key flexibility. It will gradually replace DES and be widely used in the future. This paper implements the AES algorithm on the AVR, taking advantage of its high-speed computing performance, and optimizes the algorithm in assembly language. The appropriate optimization method can be selected according to the needs of the actual application.
References
1 Song Zhen, et al. Cryptography. Beijing: China Water Resources and Hydropower Press, 2002
2 Yang Yixian. New Theory of Modern Cryptography. Beijing: Science Press, 2002
3 Gu Dawu, et al. Advanced Encryption Standard (AES) Algorithm - Design of Rijndael. Beijing: Tsinghua University Press, 2003
4 Geng Degen, et al. AVR Microcontroller Application Technology. Beijing: Beijing University of Aeronautics and Astronautics Press, 2002
5 Song Jianguo, et al. Principles and Applications of AVR High-Speed Embedded Microcontrollers. Beijing: Beijing University of Aeronautics and Astronautics Press, 2001
6 NIST. Advanced Encryption Standard (AES). Federal Information Processing Standards Publication 197, 2001