1 Introduction
With advances in communication and information technology and the spread of digital products, digital signal processors (DSPs) are used in a growing range of digital systems. Texas Instruments (TI), a leading DSP supplier, developed the DSP/BIOS real-time kernel for its DSP products in the 1990s and has published a series of DSP software reference frameworks (RF) to help application developers shorten software development. Domestic research on DSP application framework design remains limited, however: most work concentrates on how to use the existing TI reference frameworks, and few studies discuss their limitations or propose improvements.
This article first briefly introduces RF5, the DSP application reference framework proposed by TI, and the fields it suits. Building on RF5, it then proposes ERF5 (Enhanced Reference Framework Level 5), an improved DSP application framework for the increasingly common multi-processor systems. ERF5 defines a mechanism by which the DSP communicates with an external general-purpose processor (GPP) that acts as the core control unit of the system, so that the GPP can control DSP task scheduling and execution and the DSP can switch efficiently among several independent sets of digital signal processing algorithms.
2 RF5 Application Framework
Under its eXpressDSP concept, TI has proposed a series of DSP application reference frameworks for different application needs, including RF1 (Reference Framework Level 1), RF3 (Reference Framework Level 3) and RF5[1][2]. RF5 is the most capable of the three and targets multi-channel, multi-algorithm, high-density DSP applications. It supports both static and dynamic creation of DSP/BIOS module objects, 1 to 100 data processing channels and XDAIS algorithms, thread scheduling built on the DSP/BIOS task object TSK, and thread blocking. It is therefore widely used in complex digital signal processing systems such as audio and video processing. Figure 1 shows a DSP application framework based on RF5.
Figure 1 RF5 application framework
TI defines four basic data processing elements for RF5: tasks, channels, cells, and XDAIS algorithms. The task is the highest-level element and consists of one or more channels. It controls the data flow at a high level by communicating with device drivers or with other tasks. Each task body reduces to an iterative loop of getting data, processing the signal in each channel, and sending the result data. Each channel consists of a series of cells (algorithm units) that execute in sequence. A cell wraps an XDAIS algorithm and provides a standard set of interfaces between that algorithm and the rest of the application; every cell must implement the ICELL interface module.
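The nesting of task, channel and cell can be pictured with a small C model. This is only a sketch under stated assumptions: the types `Cell` and `Channel` and the functions `channel_execute` and `task_iteration` are hypothetical stand-ins, not the real RF5 CHAN or ICELL interfaces, and "getting" and "sending" data are reduced to array copies.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAME 16   /* assumed upper bound on samples per frame */

/* Hypothetical, simplified model of the RF5 hierarchy; these are NOT the
 * real RF5 CHAN/ICELL types, only an illustration of the nesting. */
typedef struct Cell {
    /* Stand-in for the wrapped XDAIS algorithm's processing call. */
    void (*process)(const int *in, int *out, size_t n);
} Cell;

typedef struct Channel {
    Cell  *cells;       /* executed strictly in order */
    size_t num_cells;
} Channel;

/* Run every cell of one channel over a frame of n samples.
 * buf holds the input on entry and the result on return. */
void channel_execute(const Channel *ch, int *buf, int *scratch, size_t n)
{
    assert(ch != NULL && buf != NULL && scratch != NULL);
    for (size_t c = 0; c < ch->num_cells; ++c) {
        ch->cells[c].process(buf, scratch, n);
        for (size_t i = 0; i < n; ++i) {
            buf[i] = scratch[i];   /* output of this cell feeds the next */
        }
    }
}

/* Two toy "algorithms" used only for the example. */
void cell_gain2(const int *in, int *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) out[i] = in[i] * 2;
}

void cell_offset1(const int *in, int *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) out[i] = in[i] + 1;
}

/* One iteration of a task body: get data, process each channel,
 * leave the results in output ready to be sent. */
void task_iteration(const Channel *chans, size_t num_chans,
                    const int *input, int *output, size_t n)
{
    int scratch[MAX_FRAME];
    assert(n <= MAX_FRAME);
    for (size_t k = 0; k < num_chans; ++k) {
        int *frame = output + k * n;
        for (size_t i = 0; i < n; ++i) {
            frame[i] = input[k * n + i];              /* get data */
        }
        channel_execute(&chans[k], frame, scratch, n); /* process */
    }
}
```

A channel built from `cell_gain2` followed by `cell_offset1` turns each input sample x into 2x + 1, which shows that cells run in sequence and that each cell's output is the next cell's input.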
In addition to these four data processing elements, RF5 defines data communication elements to ensure efficient data transfer between tasks and DSP peripherals, between tasks, and between cells. Data communication in applications built on DSP/BIOS falls into two levels: task-level and algorithm-level. At the task level, RF5 uses SIO (Stream I/O) objects and SCOM (Synchronized Communication) messages. At the algorithm level, it uses ICC (Inter-Cell Communication) objects and ICC object lists.
3 Improved DSP application framework ERF5
As embedded systems grow more complex, the DSP is poorly suited to handling system-level process control. In recent designs the DSP is therefore often used as a coprocessor, freed from heavy and complex control tasks, while a general-purpose processor (GPP) handles process control for the whole system, so that the two complement each other. RF5, however, has serious shortcomings in inter-processor communication. It is not well suited to multi-processor systems, and especially not to systems in which the DSP is a slave device. In addition, RF5 implements a single-function multi-task system: its multitasking consists only of splitting one function into three sub-tasks for input, processing and output. It does not provide a true multi-function multi-task system, in which each task is an independent signal processing function.
For these two reasons, RF5 needs to be improved to meet the requirements of complex multi-processor signal processing systems. Figure 2 shows the system block diagram of the ERF5 framework proposed in this article. Task 1, Task 2 and Task 3 are three tasks defined in the system; they have equal priority and are scheduled in turn by the DSP/BIOS task scheduler. Each task contains three modules (input preprocessing, core signal processing and output post-processing) that together form a complete, independent signal processing task. Each task consists of one or more data processing channels (Channel), and each channel consists of a series of algorithm units (Cell). The GPP controls task execution on the DSP through the DSP operation control register DSP_CNTL, and the DSP responds by reporting its operating status in the DSP operation status register DSP_STAT. In summary, ERF5 improves on RF5 in the following three respects:
(1) An effective communication mechanism between the DSP and the GPP is defined and implemented.
(2) A task implementation framework is provided for the case where the DSP must implement several sets of signal processing functions and the GPP fully controls which set executes.
(3) The unreasonably fine task splitting of RF5 is merged to reduce the impact of DSP/BIOS task scheduling on system performance.
Figure 2 ERF5 application framework
3.1 Master-slave communication mode
We define two registers in the DSP's memory space: the DSP operation control register (DSP_CNTL) and the DSP operation status register (DSP_STAT). DSP_CNTL holds a set of control fields that represent the operations the external host can request of the DSP, and DSP_STAT holds corresponding fields that describe the DSP's current operating status. The GPP commands the DSP by setting the appropriate fields in DSP_CNTL; after acting on the command, the DSP updates DSP_STAT to inform the GPP of its current operating status.
To simplify data exchange between the DSP and the host, ERF5 also reserves two buffers in the DSP's memory space dedicated to DSP-GPP data exchange, used as a ping-pong pair. A buffer flag bit, PPFLG, in DSP_STAT tells the host whether the buffer it may currently access is "ping" or "pong", so that the host and the DSP can access data largely independently of each other.
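The command and status handshake can be sketched in plain C. The article names DSP_CNTL, DSP_STAT and PPFLG but gives no field positions, so every mask below is an assumption, and the struct `DspMailbox` and the helper functions are hypothetical. On real hardware the two registers would live at fixed addresses visible to both processors; here they are ordinary struct members so the logic can be exercised on a host.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout: field positions are assumptions for illustration. */
#define CNTL_RUN_ENCODE   (1u << 0)
#define CNTL_RUN_DECODE   (1u << 1)
#define CNTL_RUN_PASSTHRU (1u << 2)
#define CNTL_RUN_MASK     (CNTL_RUN_ENCODE | CNTL_RUN_DECODE | CNTL_RUN_PASSTHRU)
#define STAT_PPFLG        (1u << 31)  /* 0: host may access "ping", 1: "pong" */

typedef struct {
    volatile uint32_t dsp_cntl;  /* written by the GPP, read by the DSP */
    volatile uint32_t dsp_stat;  /* written by the DSP, read by the GPP */
} DspMailbox;

/* GPP side: request which task(s) the DSP should run. */
void gpp_command(DspMailbox *mb, uint32_t run_bits)
{
    assert((run_bits & ~CNTL_RUN_MASK) == 0u);
    mb->dsp_cntl = run_bits;
}

/* DSP side: mirror the accepted command into DSP_STAT, keeping PPFLG. */
void dsp_acknowledge(DspMailbox *mb)
{
    uint32_t ppflg = mb->dsp_stat & STAT_PPFLG;
    mb->dsp_stat = (mb->dsp_cntl & CNTL_RUN_MASK) | ppflg;
}

/* DSP side: hand the other exchange buffer to the host. */
void dsp_swap_buffers(DspMailbox *mb)
{
    mb->dsp_stat = mb->dsp_stat ^ STAT_PPFLG;
}

/* GPP side: which exchange buffer may the host touch? 0 = ping, 1 = pong. */
int gpp_host_buffer(const DspMailbox *mb)
{
    return (mb->dsp_stat & STAT_PPFLG) ? 1 : 0;
}
```

Keeping PPFLG in the status register means the host needs only one read to learn both what the DSP is doing and which buffer it may access.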
3.2 Task Implementation Model
With the host-DSP communication mode defined, the next problem is to give the application framework a task implementation model that supports effective host control of the DSP while keeping DSP/BIOS task scheduling overhead as low as possible. We use our own project as an example. In our H.264 hybrid codec system, the DM642 runs three independent tasks: video encoding, video decoding and video pass-through. At any moment, whether the core processing of each of these task threads runs is entirely controlled by the GPP.
For performance reasons, the three tasks are defined statically in the DSP/BIOS configuration, which avoids the run-time overhead of dynamic task creation. The tasks must share the same priority. Because the DSP/BIOS real-time kernel is preemptive, a higher-priority task would otherwise always take the CPU from lower-priority tasks, even at times when the GPP has not started it. In addition, since DSP/BIOS periodically schedules every task that is in the ready state, the code in each task that decides whether the main processing runs, together with the task-switching logic, must be as short as possible, because it executes frequently. Finally, the task-switching logic should use the TSK_sleep(…) function so that a task the GPP has not currently enabled blocks for a period of time (at least the maximum execution period of any periodic task in the system). Otherwise the DSP/BIOS task scheduler keeps scheduling that task and disturbs the normal execution of the others.
Figure 3 shows the execution flow of the video pass-through task as an example.
Figure 3 Flowchart of the video pass-through task execution
3.3 Task Splitting and Merging
The DSP/BIOS real-time kernel ensures that all tasks running on it are scheduled correctly at the appropriate time. In general, the more tasks a system runs, the more time DSP/BIOS spends on task scheduling; a single-task system spends the least. An application framework should therefore choose task granularity carefully, because dividing tasks either too finely or too coarsely hurts system performance. In ERF5, each functionally independent signal processing module is defined as one task thread that also contains the input preprocessing and output post-processing for that function. Within a task thread, processing that can be carried out by peripheral modules such as EDMA is separated from processing that must run on the CPU, and double buffering between the two emulates a pipeline. What was communication between task threads in RF5 thus becomes communication between algorithm units inside a single task thread. Because the threads are independent, communication and data exchange between them are minimized, which effectively avoids system deadlocks caused by inter-thread communication.
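The double-buffering idea can be illustrated with a sequential simulation. This is a sketch under stated assumptions, not the ERF5 implementation: the struct `PingPong`, the functions `pingpong_init` and `pingpong_step`, and the frame length are hypothetical, a plain copy loop stands in for the EDMA transfer, and a multiply-by-two loop stands in for the CPU algorithm. On the real device the two stages would overlap in time; here they are executed one after the other within each step.

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_LEN 4   /* assumed frame size for the example */

typedef struct {
    int buf[2][FRAME_LEN];
    int fill_idx;      /* buffer the transfer stage writes next */
    int has_pending;   /* 1 once the other buffer holds a complete frame */
} PingPong;

void pingpong_init(PingPong *pp)
{
    pp->fill_idx = 0;
    pp->has_pending = 0;
}

/* One pipeline step: the CPU stage consumes the frame filled in the
 * previous step (if any) while the transfer stage fills the other
 * buffer with `next`. Returns 1 if `out` was written. */
int pingpong_step(PingPong *pp, const int *next, int *out)
{
    int produced = 0;
    assert(pp != NULL && next != NULL && out != NULL);

    if (pp->has_pending) {
        const int *ready = pp->buf[1 - pp->fill_idx];
        for (size_t i = 0; i < FRAME_LEN; ++i) {
            out[i] = ready[i] * 2;              /* toy CPU-side algorithm */
        }
        produced = 1;
    }

    for (size_t i = 0; i < FRAME_LEN; ++i) {
        pp->buf[pp->fill_idx][i] = next[i];     /* stand-in for the EDMA copy */
    }

    pp->fill_idx = 1 - pp->fill_idx;            /* swap ping and pong */
    pp->has_pending = 1;
    return produced;
}
```

The first step only fills a buffer and produces nothing; from the second step on, every call delivers the result for the frame supplied one step earlier, which is the one-frame latency typical of a two-stage pipeline.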
4 Performance Analysis
This section uses CPU load as the metric to compare the proposed application framework with RF5. To make the comparison convincing, we use the MPEG2 codec routine of the TMS320DM642 evaluation board as the RF5 implementation and implement the same MPEG2 codec system with ERF5. Both implementations use the same MPEG2 codec algorithm library, which complies with the XDAIS algorithm standard. Here we define the CPU load as:
A video signal processing system is generally required to process 25-30 frames of image data per second. This serves as the real-time requirement for the video codec system above: the maximum time available to encode or decode one frame is 33-40 milliseconds. Figure 4 shows the CPU load of RF5 and of the improved framework calculated with the formula above. The CPU load of ERF5 is essentially the same as that of RF5, and even slightly lower. In video signal processing applications its CPU load is only 7.92%-9.50%, which fully meets practical requirements.
Figure 4 Comparison of CPU load between ERF5 and RF5 in MPEG2 codec system
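As a worked check on these figures, assume that CPU load is the per-frame processing time $T_{\text{proc}}$ divided by the real-time frame period $T_{\text{frame}}$; the exact expression used for the measurements may differ. Under that assumption the quoted 7.92%-9.50% range corresponds to a single processing time of roughly 3.17 ms per frame, a value inferred here rather than reported:

$$
\text{CPU load} = \frac{T_{\text{proc}}}{T_{\text{frame}}} \times 100\%,
\qquad
\frac{3.17\ \text{ms}}{40\ \text{ms}} \approx 7.9\%,
\qquad
\frac{3.17\ \text{ms}}{33.3\ \text{ms}} \approx 9.5\%
$$

The two ends of the range therefore match the 25 frames per second and 30 frames per second cases respectively.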
5 Conclusion
This article briefly introduced TI's DSP reference framework RF5 and proposed the ERF5 application framework. ERF5 addresses the problem that RF5 cannot be applied effectively to complex multi-processor digital signal processing systems in which the DSP is a coprocessor, while keeping CPU load comparable to that of RF5. Our project experience shows that RF5 suits single-processor signal processing systems in which a TI DSP is both the main controller and the main processing unit, where it performs well, whereas ERF5 supports multi-processor systems and has been applied successfully in a complex H.264 hybrid codec system.