Data communication method between ARM and DSP in an embedded machine vision system

DSPs have powerful computing capabilities for digital signals and numerical algorithms, so they are widely used in signal acquisition and processing, but they are weak at task management, real-time control, and human-computer interaction. ARM microcontrollers have strong control functions, can run embedded operating systems, and provide good human-computer interaction, task management, and network communication. Combining the strengths of both processors in an ARM+DSP structure has therefore become a hot topic in embedded systems research. Using the design of an embedded machine vision system as an example, this article explains how ARM and DSP can be combined, focusing on the data communication between them.

1. Overall solution of the embedded machine vision system

The overall structure of the machine vision system based on the ARM+DSP structure is shown in Figure 1. Samsung's high-performance ARM processor S3C2440 is used as the main controller, running a configured and ported Linux operating system. TI's DSP chip TMS320C5402 is used as the image processing coprocessor. The two are joined through a hardware connection to the HPI interface of the DSP chip and a matching driver, so that each plays to its strengths and they cooperate to complete target acquisition, processing, and visual tracking.

First, the smart camera captures images of the moving target on site. The ARM stores the image data in the memory area shared by ARM and DSP and notifies the DSP, the signal processing module, to run the required algorithms (frame difference, image segmentation, feature extraction, centroid calculation, etc.) on the video images and to recognize and locate the target. The DSP returns the result to the ARM processor, which drives the stepper motors to adjust the attitude of the PTZ camera (P for pan, horizontal rotation; T for tilt, vertical rotation; Z for zoom, depth-of-field change) so that it stays aimed at the moving target, achieving real-time tracking. The ARM processor is also responsible for multi-task management, human-computer interaction, and interrupt alarms.

Both cores of this dual-core system perform well individually, so whether the ARM host and the DSP coprocessor can exchange data quickly and reliably directly determines the operating efficiency of the machine vision system. In hardware, the host port interface (HPI) connects the ARM host directly to the DSP coprocessor. The design is simple, and the interface clock frequency can reach 1/5 of the DSP clock frequency, which supports high-speed data transmission between the DSP and the host. In software, the embedded Linux operating system is ported and the driver is written by treating the HPI as a character device.

2. Dual-machine communication hardware design

2.1 Introduction to HPI Interface

The host port interface (HPI) is an on-chip peripheral of TI's C54x series fixed-point digital signal processors (DSPs) that makes it easy to connect the DSP to a host processor. There are three main types of host interface in the C54x: the standard 8-bit host interface HPI-8, the enhanced 8-bit host interface HPI-8, and the enhanced 16-bit host interface HPI-16. The enhanced HPI allows the host to access all on-chip RAM of the DSP, while the standard host interface only allows access to a fixed 2 K block of on-chip RAM. The TMS320C5402 used in this article has an enhanced HPI-16 host interface. It consists of the following 5 parts:

1) HPI memory (DARAM): transfers data between the host and the DSP. It can be accessed twice in one machine cycle and can be used as general-purpose dual-access data RAM or program RAM.

2) HPI address register (HPIA): only the host can access it directly. This register stores the address of the currently addressed HPI memory location.

3) HPI control register (HPIC): address 002CH. Both the host and the DSP can access it directly. It stores the control and status bits of HPI operations.

4) HPI data latch (HPID): only the host can access it directly. For a read operation, the HPID holds the data read from the HPI memory; for a write operation, it holds the data to be written to the HPI memory.

5) HPI control logic: processes the interface signals between the HPI and the host.

2.2 Interface circuit and its working principle

3. Software Design

3.1 Linux Driver

The Linux operating system introduces the concept of device files, that is, each device is regarded as a file and the device is operated like a file. Under the Linux operating system, there are three main types of device files: character devices, block devices, and network devices, each corresponding to a type of device driver. The driver of the HPI interface designed in this article belongs to the character device driver.

A Linux driver implements a set of basic functions for its device and registers them by filling in the file_operations structure, which defines the various operation functions, as shown below:

Among them, open and release open and close the device, and mmap performs the memory address mapping. By implementing these operation functions, the driver gives the application layer a uniform interface to the device.

Below is the device entry hpi_open function of the driver of the HPI interface in this article, which is responsible for opening and preparing the device.

Whenever the character device (HPI) interface is opened, the device's open entry point (hpi_open) will be called. Therefore, the open function (hpi_open) must make necessary preparations for the upcoming I/O operations (reading and writing data to the DSP). For example, if the device is exclusive, the open function (hpi_open) must mark the device as busy, as shown in the two lines at position ① in the above routine.
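The exclusive-open check described above can be modeled in user space. This is a minimal sketch, not the article's driver code: the names hpi_busy, model_hpi_open, and model_hpi_release are hypothetical, and a C11 atomic flag stands in for whatever busy marker the real hpi_open uses.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical busy marker: clear = free, set = device already opened. */
static atomic_flag hpi_busy = ATOMIC_FLAG_INIT;

/* Model of an exclusive open: returns 0 on success, -EBUSY if in use. */
int model_hpi_open(void)
{
    /* test_and_set returns the previous value, so only one caller wins. */
    if (atomic_flag_test_and_set(&hpi_busy))
        return -EBUSY;
    return 0;
}

/* Model of release: mark the device as free again. */
void model_hpi_release(void)
{
    atomic_flag_clear(&hpi_busy);
}
```

A second open attempt fails with -EBUSY until the first holder calls the release function. Testing and setting the flag in one atomic step avoids the window in which two callers could both see the device as free.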

3.2 Implementation of mapping in driver

In a Linux system, user applications cannot directly access the memory space used by a driver, so a memory mapping mechanism is required. Memory mapping means mapping a specific region of kernel memory into the address space of a user process. For a driver, memory mapping gives user programs direct access to device memory.

The mmap system call maps a device, which means associating a segment of address in user space with the device memory. This means that as long as the program reads or writes within the allocated address range, it is actually accessing the device.

The mmap method is part of the file_operations structure. To implement the mapping, two steps are required:

1) Call the remap_page_range function in the kernel. It builds new page tables that map a range of physical addresses, which realizes the mapping between kernel space and user space. Its prototype is as follows:

int remap_page_range(unsigned long virt_add, unsigned long phys_add, unsigned long size, pgprot_t prot);

The function parameters have the following meanings: unsigned long virt_add is the virtual address where the remapping starts; the function builds page tables for the virtual address range from virt_add to virt_add+size. unsigned long phys_add is the physical address to which the virtual address should be mapped. unsigned long size is the size of the area to be remapped, in bytes. pgprot_t prot is the "protection" attribute requested for the new VMA. The driver does not have to modify the protection; the value found in vma->vm_page_prot can be used without change.

The code using the mmap call in this project is as follows:

In this way, a new page table is constructed between vma->vm_start and vma->vm_end for the bus physical address corresponding to the HPI interface of the DSP: 0x10000000 (corresponding to nGCS2).

2) Call mmap in the application: hpi_mmap_add = mmap (NULL, length, PROT_READ|PROT_WRITE, MAP_SHARED, hpi_fd, 0). The first parameter (start) gives the starting address, in the process address space, of the mapping for the "file" (that is, the /dev/hpi device) referred to by the descriptor hpi_fd. It must be page-aligned and is usually set to NULL so that the kernel selects the starting address automatically. In either case, the return value of mmap is the starting address of the mapped region. Reading or writing through hpi_mmap_add therefore accesses the memory segment that starts at physical address 0x10000000.
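The user-space side of the mapping can be sketched as a small helper. The helper names hpi_map_window and hpi_unmap_window are hypothetical, and the 16-bit access width is an assumption. Only the mmap arguments follow the call quoted above. On a development PC, a regular file of at least length bytes can stand in for /dev/hpi.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: open a device node (e.g. /dev/hpi) and map `length`
 * bytes of it for reading and writing. Returns NULL on failure. */
volatile uint16_t *hpi_map_window(const char *path, size_t length)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    /* start = NULL lets the kernel choose the user-space address. */
    void *p = mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The mapping stays valid after the descriptor is closed. */
    close(fd);
    return (p == MAP_FAILED) ? NULL : (volatile uint16_t *)p;
}

/* Release a window obtained from hpi_map_window(). Returns 0 on success. */
int hpi_unmap_window(volatile uint16_t *base, size_t length)
{
    return munmap((void *)base, length);
}
```

The pointer is declared volatile so that the compiler does not cache or reorder accesses that are meant to reach the device.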

3.3 Driver kernel loading method

After completing the writing and testing of the embedded Linux driver, the next step is to load the written driver into the system kernel to complete the work of driving the hardware. There are usually two ways to do this:

1) The driver is compiled directly into the kernel. A driver built this way is already in memory when the kernel starts, so it does not need to be loaded at runtime, and dedicated memory space can be reserved for it.

2) The driver is loaded as a module. A driver built this way is stored in the file system as a module and loaded into the kernel dynamically when needed, so it is loaded on demand and uses no memory when it is not in use. In addition, the driver is relatively independent of the kernel, which makes upgrading and licensing flexible. This article adopts this method.

Because the module loading method is adopted, two important functions, init_module() and cleanup_module(), are needed to register and unload the module. The specific source code can be found in /usr/src/linux/kernel/module.c. Since kernel version 2.3, these two functions can be given other names, for example example_init() instead of init_module() and example_cleanup() instead of cleanup_module(). The chosen functions are then declared at the end of the program with the following two lines of code:

module_init(S3C2440_HPI_init);
module_exit(S3C2440_HPI_exit);

3.4 Design method for specific interface applications

Using the prepared driver, users can write different application interface programs. The following is a method for writing in auto-increment (self-increment) mode:

According to the interface circuit in Figure 2, the ARM address lines A2 to A5 drive the HPI control inputs, including HCNTL0, HCNTL1, and HHWIL. When the first half word is written in auto-increment mode, their values should be HCNTL0=0, HCNTL1=1, and HHWIL=0, that is, A[5:2]=0010. The control lines are driven by adding an offset to HPI_VA_BASE, so the first half word is written at offset 00001000 (binary), that is, 0x08. When the second half word is written in auto-increment mode, the values should be HCNTL0=0, HCNTL1=1, and HHWIL=1, that is, A[5:2]=1010, so the offset is 00101000 (binary), that is, 0x28. The following macro definition is used to write the address of the HPI control register:
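The step from an A[5:2] pattern to a byte offset is a two-bit left shift, because A2 is bit 2 of the byte address. The helper below is a small sketch of that arithmetic. The name hpi_ctrl_offset is hypothetical and does not come from the article's macro definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: place a 4-bit A[5:2] pattern on address bits 5..2
 * and return the resulting byte offset from the mapped window base. */
uint32_t hpi_ctrl_offset(unsigned a5_to_a2)
{
    assert(a5_to_a2 <= 0xFu);        /* only four address lines are involved */
    return (uint32_t)a5_to_a2 << 2;  /* A2 is bit 2 of the byte address */
}
```

For example, the pattern A[5:2]=1010 (0xA) yields 0x28, the offset used for the second half word.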

In addition, before an auto-increment write, two things must be settled for the DSP as the receiving end: 1) whether the HPI is ready to accept data. In HPI-16, the state of HRDY can be read from the HPIC register; HRDY = 1 means the HPI is ready. 2) the start address of the area to be written, that is, dsp_add_w=(hpi.hpi_dsp_add). This parameter is passed in from the application. The code and comments for self-increment are as follows:
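The sequence just described (poll HRDY, set the start address, then stream half words) might look like the sketch below. This is not the original listing. Every offset and the HRDY bit mask are passed in through a hypothetical hpi_layout structure and must be taken from the real board wiring and the DSP manual. The 16-bit width of dsp_add_w and of each half word is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the mapped HPI window. Every field is an
 * assumption to be filled in from the board wiring and the DSP manual. */
struct hpi_layout {
    size_t   hpic_off;     /* byte offset that selects HPIC */
    size_t   hpia_off;     /* byte offset that selects HPIA */
    size_t   data_first;   /* auto-increment data port, first half word */
    size_t   data_second;  /* auto-increment data port, second half word */
    uint16_t hrdy_mask;    /* HPIC bit that reads 1 when the HPI is ready */
};

static volatile uint16_t *hpi_reg(volatile uint8_t *base, size_t off)
{
    return (volatile uint16_t *)(base + off);
}

/* Write `count` half words (count must be even) starting at DSP address
 * dsp_add_w. Returns 0 on success, -1 if count is odd or HRDY stays 0. */
int hpi_autoinc_write(volatile uint8_t *base, const struct hpi_layout *lay,
                      uint16_t dsp_add_w, const uint16_t *half, size_t count,
                      unsigned max_polls)
{
    unsigned polls = 0;

    if (count % 2 != 0)
        return -1;

    /* Step 1: poll HRDY, giving up after max_polls unsuccessful reads. */
    while ((*hpi_reg(base, lay->hpic_off) & lay->hrdy_mask) == 0) {
        if (++polls >= max_polls)
            return -1;
    }

    /* Step 2: load the start address of the write area into HPIA. */
    *hpi_reg(base, lay->hpia_off) = dsp_add_w;

    /* Step 3: alternate between the two data ports; on real hardware the
     * HPI advances HPIA by itself in auto-increment mode. */
    for (size_t i = 0; i < count; i += 2) {
        *hpi_reg(base, lay->data_first)  = half[i];
        *hpi_reg(base, lay->data_second) = half[i + 1];
    }
    return 0;
}
```

When base points at ordinary memory instead of the device, nothing auto-increments, so each data port simply holds the last value written to it. That is enough to check the order of operations on a PC.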

4. Conclusion

Through an embedded machine vision system engineering example, this paper explains the design method of using the ARM+DSP dual-core structure to load the Linux operating system in the embedded system, and communicating and exchanging data through the HPI interface. The hardware circuit connected by the HPI interface and the driver in the Linux environment are designed, and the specific application design method of the interface is described.

The ARM+DSP dual-core system is a new way to build embedded machine vision systems. The dual-machine communication method designed here, which exchanges data through the HPI interface, has been applied successfully in machine vision projects and has been shown to reach a data transfer rate of 10 Mb/s, which meets the real-time requirements of embedded systems and gives it broad application prospects. Note, however, that reading and writing through the HPI interface involves shared resources: the HPI control, address, and data registers, and the memory that the HPI exposes to the host for reading and writing. Therefore, applications and drivers must protect these operations with a mutual exclusion mechanism such as a semaphore; otherwise, reads and writes will interfere with each other.
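The need for mutual exclusion can be illustrated with a toy user-space model. It is a sketch under stated assumptions: a POSIX mutex is used here in place of the kernel semaphore the text mentions, and hpi_model, hpi_lock, and hpi_locked_write are hypothetical names. The point is that setting the address and transferring the data must not be interleaved with another caller's accesses.

```c
#include <pthread.h>
#include <stdint.h>

#define MODEL_WORDS 64  /* hypothetical size of the modeled HPI memory */

/* Toy model of the shared HPI state: one address register plus memory. */
struct hpi_model {
    uint16_t hpia;
    uint16_t mem[MODEL_WORDS];
};

static pthread_mutex_t hpi_lock = PTHREAD_MUTEX_INITIALIZER;

/* Set the address, then write the data, as one indivisible sequence.
 * Returns 0 on success, -1 if addr is outside the modeled memory. */
int hpi_locked_write(struct hpi_model *hpi, uint16_t addr, uint16_t value)
{
    if (addr >= MODEL_WORDS)
        return -1;

    pthread_mutex_lock(&hpi_lock);
    hpi->hpia = addr;             /* step 1: load the address register */
    hpi->mem[hpi->hpia] = value;  /* step 2: data lands where HPIA points */
    pthread_mutex_unlock(&hpi_lock);
    return 0;
}
```

Without the lock, a second thread could change hpia between the two steps, and the data would land at the wrong address.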

Author: Mao Xiaobo, Liu Guodong, Chen Tiejun, Huang Yunfeng
