Implementation of RSA Algorithm on TMS320C54x DSP

Publisher: 小悟空111  Latest update: 2012-07-14  Source: 单片机及嵌入式系统应用 (Microcontrollers & Embedded Systems)  Keywords: DSP

Introduction

In today's telecommunications era, large-scale use of computers to process data has greatly accelerated the transmission of information, but it has also raised a pressing concern: information security. Data encryption is the standard way to protect information. Encrypting data in transit on the network and data stored in the system greatly improves the security of both. The RSA public-key cryptosystem is widely used because of its high security and occupies an important position in modern security systems. RSA is hard to implement efficiently because encryption and decryption require a large amount of numerical computation, and a pure software implementation pays for this with long computation times. This paper therefore implements the RSA algorithm with a combination of software and hardware.

A DSP (digital signal processor) is a microprocessor designed for real-time digital signal processing. The TMS320C54x series uses a Harvard architecture with separate program and data spaces, has a dedicated hardware multiplier, makes extensive use of pipelining, and provides special DSP instructions, so it can run a wide range of digital signal processing algorithms quickly. These characteristics make the TMS320C54x series well suited to RSA encryption and decryption of serial data.

1 RSA Algorithm

The RSA algorithm is a public-key algorithm proposed jointly by Rivest, Shamir and Adleman in 1978 and named after them. Its encryption key is public, while the decryption key is kept secret. It rests on a simple number-theoretic observation: "Multiplying two large prime numbers is easy, but factoring their product is very hard."

What distinguishes RSA is that it relies on very large prime numbers (generally hundreds to thousands of digits) whose product is hard to factor. To keep government and military data confidential, keys built from numbers of more than a thousand digits are commonly used. The algorithm poses two main difficulties: ① It consists mainly of modular arithmetic, which is expensive on numbers with thousands of digits and makes practical use of the algorithm hard. ② Deciding whether a number is prime is a problem that mathematicians have studied for hundreds of years. Fermat's little theorem gives a condition that every prime satisfies, but some composite numbers satisfy it too, so it cannot by itself prove that a number is prime. Finding a prime with thousands of digits is therefore harder still.
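
In practice, candidates for p and q are usually screened with a probabilistic primality test. The sketch below uses the Miller-Rabin test, which the article does not name; it is shown only as one common choice. The function name `is_probable_prime` and the default of 20 rounds are assumptions made for illustration.

```python
import random


def is_probable_prime(n, rounds=20, rng=None):
    """Miller-Rabin test: False means composite, True means probably prime."""
    rng = rng or random.Random()
    if n < 2:
        return False
    # Quick trial division by a few small primes.
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)  # random base (an assumption of this sketch)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True
```

A plain Fermat check with base 2, `pow(2, n - 1, n) == 1`, accepts the composite number 56052361 = 211 × 421 × 631, whereas the test above is expected to reject it; this is why a stronger test is normally used when searching for large primes.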

(1) Principle of RSA algorithm

The RSA algorithm is based on the theory of congruences in number theory. Let m denote the plaintext, c the ciphertext, E(m) the encryption operation and D(c) the decryption operation, and let $x \equiv y \pmod{z}$ mean that x and y are congruent modulo z. Encryption and decryption can then be written as follows:

Encryption algorithm: $c = E(m) = m^e \bmod n$

Decryption algorithm: $m = D(c) = c^d \bmod n$

Here n and the key e are public, while the key d is kept secret.

The keys are obtained as follows:

① Select two large random prime numbers p and q (kept secret);

② Let $n = p \times q$;

③ Compute Euler's function $\varphi(n) = (p-1)(q-1)$ (kept secret);

④ Select a positive integer e that is relatively prime to $\varphi(n)$, that is, one that satisfies $\gcd(\varphi(n), e) = 1$ and $0 < e < \varphi(n)$;

⑤ Calculate d (kept secret) such that $e \times d \equiv 1 \pmod{\varphi(n)}$, that is, d and e are inverses of each other modulo $\varphi(n)$.

From these principles it follows that the core operation of RSA is modular exponentiation, and that its security rests on the difficulty of factoring large composite numbers.
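
The key-generation steps and the two formulas can be tried out with a toy example. The sketch below uses deliberately tiny primes (p = 61, q = 53) and e = 17, values chosen here purely for illustration; real keys use primes hundreds of digits long. The function names are hypothetical, and Python's built-in `pow` stands in for the modular arithmetic that the article implements on the DSP.

```python
from math import gcd


def make_keypair(p, q, e):
    """Build a toy RSA key pair from primes p, q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)          # Euler's function of n
    if not (0 < e < phi) or gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to phi(n)")
    d = pow(e, -1, phi)              # inverse of e modulo phi(n), Python 3.8+
    return (n, e), (n, d)


def encrypt(m, public_key):
    n, e = public_key
    return pow(m, e, n)              # c = m^e mod n


def decrypt(c, private_key):
    n, d = private_key
    return pow(c, d, n)              # m = c^d mod n


public, private = make_keypair(61, 53, 17)   # n = 3233, d = 2753
ciphertext = encrypt(65, public)             # 2790
plaintext = decrypt(ciphertext, private)     # 65 again
```

The plaintext must be smaller than n for the round trip to recover it, which is one reason real moduli are so large.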

(2) Implementation of modular exponentiation

Modular exponentiation is both the core operation of the RSA algorithm and its most time-consuming one, so a fast modular exponentiation routine is the key to overall speed. Modular operations are usually built from additions and subtractions, because those instructions execute quickly. The TMS320C54x chips, however, contain a dedicated 17-bit × 17-bit multiplier, so a multiplication instruction takes exactly as long as an addition or subtraction instruction. This design therefore uses multiplication to carry out the modular operations.
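
The article does not list its multiplication routine. As a rough illustration of how a word-oriented multiplier can serve large operands, the sketch below multiplies two numbers stored as arrays of 16-bit words, with one multiply-accumulate per pair of words. The word layout, the function names, and the use of Python's `%` for the final reduction are all assumptions of this sketch, not the author's DSP code.

```python
WORD_BITS = 16
WORD_MASK = (1 << WORD_BITS) - 1


def to_words(x, count):
    """Split a non-negative integer into `count` 16-bit words, least significant first."""
    return [(x >> (WORD_BITS * i)) & WORD_MASK for i in range(count)]


def from_words(words):
    """Reassemble an integer from 16-bit words, least significant first."""
    return sum(w << (WORD_BITS * i) for i, w in enumerate(words))


def multiply_words(a, b):
    """Schoolbook product of two word arrays, one multiply-accumulate per word pair."""
    result = [0] * (len(a) + len(b))
    for i, ai in enumerate(a):
        carry = 0
        for j, bj in enumerate(b):
            # The sum is at most 2**32 - 1, so it fits easily in a 40-bit accumulator.
            acc = result[i + j] + ai * bj + carry
            result[i + j] = acc & WORD_MASK
            carry = acc >> WORD_BITS
        result[i + len(b)] = carry
    return result


def modmul(a, b, n, words):
    """(a * b) mod n, with the product formed word by word.

    The reduction uses Python's % for brevity; a DSP routine would need
    its own multi-word reduction step.
    """
    product = from_words(multiply_words(to_words(a, words), to_words(b, words)))
    return product % n
```

Both operands must fit in `words` 16-bit words; a 1024-bit operand, for example, needs 64 words.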

For modular exponentiation, the exponent e (k bits long) is usually written in binary form, that is,

$e = \sum_{i=0}^{k-1} e_i 2^i$

where $e_i \in \{0, 1\}$, $i = 0, 1, \ldots, k-1$.

To compute $m^e \bmod n$, a squaring is then performed for each bit, followed by a multiplication that depends on the value of $e_i$, which reduces the cost of the modular exponentiation.
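
A minimal sketch of this bit-by-bit scheme is given below. It scans the exponent from the most significant bit down, which is one common ordering and is assumed here because the article does not state the direction; the function name is hypothetical.

```python
def modexp_binary(m, e, n):
    """Compute m**e mod n by square-and-multiply over the bits of e."""
    c = 1 % n
    for bit in bin(e)[2:]:        # bits of e, most significant first
        c = (c * c) % n           # always square
        if bit == "1":
            c = (c * m) % n       # multiply only when the bit is 1
    return c
```

For example, `modexp_binary(65, 17, 3233)` gives the same result as Python's built-in `pow(65, 17, 3233)`.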

Because e is very large in practice, its bits can be processed in groups to speed up the calculation. Suppose $m^e \bmod n$ is computed with e taken in groups of four bits (hexadecimal digits). Then:

① Write e in hexadecimal as $e = \sum_{i=0}^{t-1} e_i 16^i$, where $e_i \in \{0, 1, 2, \ldots, 15\}$ and $t = k/4$;

② Compute $m^2, m^3, \ldots, m^{15} \pmod{n}$;

③ Set the variable $c := 1$;

④ For $i = t-1, t-2, \ldots, 1, 0$, repeat:

a. $c := c^2 \bmod n$ (square);

b. $c := c^2 \bmod n$ (fourth power);

c. $c := c^2 \bmod n$ (eighth power);

d. $c := c^2 \bmod n$ (16th power);

e. If $e_i \neq 0$, then $c := c \times m^{e_i} \bmod n$.

⑤ The final value of c is the result.
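
The grouped procedure above can be sketched as follows. The function name and the use of a Python list for the table of powers are assumptions; on the DSP the table would live in data memory and each product would go through a multi-word routine.

```python
def modexp_window4(m, e, n):
    """Compute m**e mod n, taking the exponent four bits at a time."""
    # Table of powers: table[j] = m**j mod n for j = 0..15.
    table = [1 % n] * 16
    for j in range(1, 16):
        table[j] = (table[j - 1] * m) % n

    # Hexadecimal digits of e, most significant first.
    digits = [int(h, 16) for h in format(e, "x")]

    c = 1 % n
    for digit in digits:
        for _ in range(4):            # four squarings raise c to the 16th power
            c = (c * c) % n
        if digit != 0:
            c = (c * table[digit]) % n
    return c
```

Each hexadecimal digit costs four squarings and at most one table multiplication, so the result matches the bit-by-bit method while using fewer multiplications.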

This analysis shows that the grouped method needs fewer multiplications than the bit-by-bit method: the number of squarings is unchanged, but at most one multiplication is needed per four-bit group (plus the 14 multiplications that build the table in step ②) instead of up to one per bit. Choosing it for modular exponentiation therefore improves speed. With the basic approach and steps in place, the RSA algorithm can be developed on the TMS320C54x DSP chip.
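
To make the saving concrete, the sketch below counts squarings and multiplications for both schemes. The counting functions are hypothetical helpers; they count every pass of the loop, including the trivial squarings of 1 at the start, and charge the grouped method 14 multiplications for building the table of powers.

```python
def count_binary(e):
    """(squarings, multiplications) for bit-by-bit square-and-multiply."""
    bits = bin(e)[2:]
    return len(bits), bits.count("1")


def count_window4(e):
    """(squarings, multiplications) for the four-bit grouped method."""
    digits = format(e, "x")
    squarings = 4 * len(digits)
    # 14 products build m^2 .. m^15, then one product per non-zero digit.
    multiplications = 14 + sum(1 for h in digits if h != "0")
    return squarings, multiplications


worst_case = (1 << 1024) - 1          # a 1024-bit exponent with every bit set
binary_cost = count_binary(worst_case)
window_cost = count_window4(worst_case)
```

For this worst-case 1024-bit exponent, `binary_cost` is `(1024, 1024)` and `window_cost` is `(1024, 270)`: the squarings are the same, while the multiplications drop by almost three quarters.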

2 Hardware and Software Implementation

In embedded applications, a microcontroller is clearly inadequate for multiplications on very large numbers. The TMS320C54x DSP, by contrast, has exactly the characteristics the RSA algorithm requires, which makes it the preferred chip for this implementation. This project uses the TMS320C5402 from Texas Instruments.


(1) Overview of TMS320C5402 chip

The TMS320C54x is a family of fixed-point DSP chips designed for low power consumption and high performance, used mainly in wireless communication systems and embedded telecommunication systems. The TMS320C5402 used in this article is a typical member of the family. It keeps the advantages of earlier devices and adds more hardware resources. Its main features are:

① High speed: a 10 ns instruction cycle, giving 100 MIPS;

② Powerful addressing: up to 1M × 16 bits of addressable external memory, with 16K × 16 bits of on-chip RAM and 4K × 16 bits of on-chip ROM;

③ A 40-bit arithmetic logic unit (ALU), with two independent 40-bit accumulators and a 40-bit barrel shifter;

④ One 17-bit × 17-bit hardware multiplier and one dedicated 40-bit adder. The multiplier/adder unit completes a multiply-accumulate (MAC) operation in a single pipeline cycle;

⑤ An advanced multi-bus architecture (3 data buses, 1 program bus and 4 address buses). Several data items can be read over the data buses at the same time, which makes the instruction set more powerful and more efficient.

(2) Hardware design

In this design, the peripheral supplies serial data at standard RS-232 levels. After level conversion to TTL levels that the DSP can handle, the signals are connected directly to the asynchronous receive and transmit pins of the DSP chip. The DSP encrypts or decrypts the received data and stores the result in external data memory, where it waits to be read by the interrupt routine.

The block diagram of the circuit is shown in Figure 1.

In this DSP system, the SRAM interfaced to the DSP chip provides 32K words of external program memory and 16K words of external data memory. The external program memory occupies addresses 48000H~4FFFFH and the external data memory occupies 4000H~7FFFH. A parallel 8-bit EPROM interfaced to the DSP chip serves as a 32 KB boot-loader EPROM at addresses 8000H~FFFFH, which allows the DSP system to run stand-alone.
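
The region sizes follow directly from the address ranges, as the short check below shows. The dictionary name and region labels are made up for this illustration; only the addresses come from the text.

```python
# Address ranges quoted in the text (inclusive), written as hexadecimal literals.
REGIONS = {
    "external program SRAM": (0x48000, 0x4FFFF),
    "external data SRAM": (0x4000, 0x7FFF),
    "boot EPROM": (0x8000, 0xFFFF),
}


def region_size(start, end):
    """Number of addressable locations in an inclusive address range."""
    return end - start + 1


for name, (start, end) in REGIONS.items():
    print(f"{name}: {region_size(start, end) // 1024}K locations")
```

The program range spans 32K locations, the data range 16K, and the EPROM range 32K, which for an 8-bit device corresponds to 32 KB.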

When the DSP chip works in microcomputer mode (MP/MC = 0), the boot loader, in external parallel 8-bit mode, reads the boot table from the external EPROM after reset and loads the program code into the off-chip program memory of the DSP. In this mode, the software wait-state register (SWWSR) and the bank-switching control register (BSCR) can be configured so that the high-speed DSP chip can read data from the relatively slow external EPROM. The default setting is 7 wait states.

Hardware design is the most critical part. The timing of the DSP must be analyzed rigorously, including the time that instructions take to execute and whether that time matches the operating speed of the peripheral devices. If the software works but the hardware that supports it does not, the design as a whole fails.

(3) Software design

Software development proceeds as follows: write the source files in any text editor; compile, assemble and link them to produce executable COFF object code for the DSP; download the object code to the DSP target system through the emulator and run it; then debug with the debugging tools until the design requirements are met. Once the program is debugged, the Hex conversion tool converts the code into a binary file, which a programmer then writes into the external EPROM to form a stand-alone DSP system.

Development languages fall into assembly language and high-level languages. Assembly code is highly efficient, but the assembly languages of DSP chips from different manufacturers differ greatly, their instructions and addressing modes differ even more, and assembly code is neither very readable nor portable. To overcome this shortcoming, most manufacturers provide tools that support high-level languages such as C. Compiled C code, however, is less efficient than hand-written assembly, especially when handling low-level hardware. An optimized, efficient DSP application is therefore written in a mix of high-level language and assembly language.

Conclusion

This article has introduced the basic principles of the RSA algorithm and its implementation on the TMS320C5402 DSP chip. With its distinctive hardware architecture and flexible software programmability, the DSP chip is well suited to implementing RSA. Practice has shown that the RSA algorithm implemented in this way is considerably improved in speed and security performance, so it can be applied in fields such as the Internet and distributed control systems.
