【DM642】Porting of H.264 source code on DM642

灞波儿奔

TI provides CCS (Code Composer Studio) for development in C. The platform includes an optimizing ANSI C compiler, so DSP programs can be written in C. This greatly speeds up DSP development, improves the readability and portability of DSP programs, and makes them easy to modify. Because CCS is a dedicated DSP development environment, C under CCS differs from C on a general-purpose computer in several ways. DSP C does not include the extensions tied to host peripherals. Compilation takes two steps: the C source is first compiled into assembly (asm) code, and the assembly code is then assembled into object (obj) code; the C and the assembly correspond directly, which makes manual optimization convenient. DSP code must be placed at absolute addresses when it is linked, whereas on a host the operating system places the program when it is loaded. Finally, the C code generated for the DSP is highly efficient and well suited to embedded systems.

The following issues should be addressed when porting:

1. Changes to library files

Because CCS is a development environment dedicated to the DSP, its support libraries are tied to the DSP hardware, and some of the library headers included by the VC code must be changed to suit CCS. For example, the dynamic memory allocation functions such as malloc and calloc are all declared in stdlib.h in CCS, so include directives in the original source that name a different header for them must be modified.
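As a minimal sketch of the header change, assuming the original VC source pulled in malloc.h for its allocation calls: the VC-only header is kept behind the _MSC_VER macro that Visual C++ predefines, and every other compiler, CCS included, relies on stdlib.h alone. The helper alloc_plane is a hypothetical name used only for illustration.

```c
/* Select the allocation header per toolchain. _MSC_VER is predefined by
   Visual C++; every other compiler (CCS included) takes the stdlib.h path. */
#if defined(_MSC_VER)
#include <malloc.h>
#endif
#include <stdlib.h>   /* malloc, calloc, free */

/* Hypothetical helper: allocate one zero-filled 8-bit image plane.
   Returns NULL if the dimensions are invalid or the heap is exhausted. */
unsigned char *alloc_plane(int width, int height)
{
    if (width <= 0 || height <= 0)
        return NULL;
    return (unsigned char *)calloc((size_t)width * (size_t)height, 1);
}
```

The caller releases the plane with free, which is also declared in stdlib.h.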

2. Adjustment of variable access method

In CCS, a program is stored in sections (segments). The contents of the main sections are as follows:

.text: stores executable code

.cinit: stores the initialization tables for global and static variables

.switch: stores the jump tables for switch statements

.bss: stores global and static variables

.far: stores global and static variables declared far

.stack: stores the system stack

.sysmem: stores the heap used for dynamic memory allocation

The CCS C compiler supports two memory models, small and large, which differ in how variables in the .bss section are accessed. Global and static variables are placed in .bss by default. In the small model their total size cannot exceed 32K bytes, so larger data must either be declared far or the program must be built with the large model.
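One way to keep a large table out of .bss while still building the same source on a PC is sketched below. It assumes the far keyword of the TI C6000 compiler and the _TMS320C6X macro that this compiler predefines; the FAR macro, the table name, and its size are hypothetical.

```c
#include <stddef.h>

#if defined(_TMS320C6X)
#define FAR far      /* TI C6000 compiler: the object goes to .far, not .bss */
#else
#define FAR          /* host build: the keyword does not exist, expand to nothing */
#endif

/* Hypothetical table that alone would exceed the 32K limit of .bss */
#define COEFF_TABLE_SIZE (64 * 1024)

FAR unsigned char coeff_table[COEFF_TABLE_SIZE];

/* Report the table size so callers do not hard-code it. */
size_t coeff_table_bytes(void)
{
    return sizeof coeff_table;
}
```

Only the declaration changes; code that reads or writes coeff_table is the same on both platforms.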

3. Adjustment of data types

In CCS, no long long type is defined. long is a 40-bit integer and double is a 64-bit floating-point type, whereas in VC both long and int are 32-bit integers. Because the general-purpose registers of the C64 series are all 32 bits wide, a 40-bit value occupies two registers, and every access to it reads or writes both. To save CPU time, variables declared long in the PC code should be changed to a 32-bit type such as int wherever 32 bits are sufficient.
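A common way to make this adjustment in one place is a small header of fixed-width typedefs. The sketch below is an assumption about how such a header could look, not part of the original port: the names int8 through uint32 and the function clip_to_pixel are hypothetical, and the negative-array-size trick makes compilation fail if a type does not have the expected width.

```c
#include <limits.h>

/* Hypothetical project-wide integer types. None of them uses long,
   so nothing becomes 40 bits wide under CCS. */
typedef signed char    int8;
typedef unsigned char  uint8;
typedef short          int16;
typedef unsigned short uint16;
typedef int            int32;
typedef unsigned int   uint32;

/* Compile-time checks: the array size is -1, and compilation fails,
   if a width assumption does not hold on the target. */
typedef char int16_must_be_16_bits[(sizeof(int16) * CHAR_BIT == 16) ? 1 : -1];
typedef char int32_must_be_32_bits[(sizeof(int32) * CHAR_BIT == 32) ? 1 : -1];

/* Example use: clip a reconstructed sample to the 8-bit pixel range. */
int32 clip_to_pixel(int32 v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return v;
}
```

With the typedefs in place, the ported code replaces long by int32 where 32 bits are enough, and the checks flag any platform on which the assumption breaks.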

4. Storage space allocation

In CCS, memory is allocated by configuring the linker command (.cmd) file. Before allocating memory, you must know the size of all the internal and external memory available to the chip; otherwise the compiled program may run away because it accesses a memory region that does not exist. As a rule, frequently called code should be placed on-chip to increase execution speed.

Second, the heap and stack sizes should be set again. They can be configured with -heap and -stack in the .cmd file. The heap supplies dynamically allocated memory and corresponds to the .sysmem section; the stack holds function return addresses, arguments, and local variables and corresponds to the .stack section. During video decoding, storing reference frames and other structures takes a large amount of dynamic memory, so the heap should be made as large as possible. The DM642 has only 256K bytes of on-chip memory, however, so the heap can only be placed in the 32M bytes of off-chip memory. The stack size can be set according to actual usage. When the program behaves abnormally, check whether the stack has overflowed.
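Because the heap size is fixed at link time, an allocation fails outright once the heap is used up, so the decoder should check every allocation. The sketch below illustrates this for reference frames, assuming 8-bit YUV 4:2:0 frames stored as one contiguous buffer each; the function names are hypothetical.

```c
#include <stdlib.h>

/* Allocate `count` frame buffers of width x height pixels in YUV 4:2:0
   (1.5 bytes per pixel). Returns NULL if any allocation fails, after
   releasing whatever was already allocated. */
unsigned char **alloc_ref_frames(int count, int width, int height)
{
    size_t frame_bytes;
    unsigned char **frames;
    int i;

    if (count <= 0 || width <= 0 || height <= 0)
        return NULL;

    frame_bytes = (size_t)width * (size_t)height * 3 / 2;
    frames = (unsigned char **)malloc((size_t)count * sizeof *frames);
    if (frames == NULL)
        return NULL;

    for (i = 0; i < count; i++) {
        frames[i] = (unsigned char *)malloc(frame_bytes);
        if (frames[i] == NULL) {
            /* Heap exhausted: undo the partial allocation. */
            while (i-- > 0)
                free(frames[i]);
            free(frames);
            return NULL;
        }
    }
    return frames;
}

/* Release frames obtained from alloc_ref_frames. */
void free_ref_frames(unsigned char **frames, int count)
{
    int i;

    if (frames == NULL)
        return;
    for (i = 0; i < count; i++)
        free(frames[i]);
    free(frames);
}
```

A NULL return at startup is a direct sign that the -heap value is too small for the chosen resolution and number of reference frames.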

5. Initialization of some structures

Under VC, structure members that are never explicitly initialized will often read as 0, but under CCS they are not cleared and usually hold some large arbitrary value. Code that relies on such members without initializing them fails easily, so these structures must be initialized explicitly.
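A minimal sketch of explicit initialization follows. The DecoderState structure and its members are hypothetical stand-ins for the decoder's real structures; the point is only that every member gets a defined value before first use.

```c
#include <string.h>

/* Hypothetical decoder state, for illustration only. */
typedef struct {
    int frame_num;
    int mb_x;
    int mb_y;
    unsigned char *bitstream;
} DecoderState;

/* Give every member a defined value instead of relying on the
   environment to clear the memory. */
void decoder_state_init(DecoderState *s)
{
    memset(s, 0, sizeof *s);   /* clear all integer members */
    s->bitstream = NULL;       /* set the pointer explicitly */
}
```

Calling such a function right after a structure is declared or allocated removes the dependence on how the platform fills uninitialized memory.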

6. Adjustment of data width

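In the DM642, the registers are all 32 bits wide and 32 bits of data can be processed at a time. If an address used for a 32-bit load is not a legal word address (a multiple of 4), the LDW instruction does not report an error; it automatically adjusts the address to a legal word address, so the program loads a different word from the one it intended. Data that is accessed as 32-bit words must therefore be aligned on 4-byte boundaries.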
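The effect can be sketched in portable C. The first function models the address adjustment described above by clearing the low two address bits; the second reads a 32-bit value byte by byte, which is safe at any address. Both function names are hypothetical, and the byte order assumes a little-endian configuration.

```c
/* Model of the adjustment: the low two bits of the address are
   dropped, giving the enclosing 4-byte word address. */
unsigned long ldw_effective_address(unsigned long addr)
{
    return addr & ~3UL;
}

/* Alignment-safe 32-bit read, assuming little-endian byte order:
   assemble the value from four single-byte loads. */
unsigned int load_u32_le(const unsigned char *p)
{
    return  (unsigned int)p[0]
         | ((unsigned int)p[1] << 8)
         | ((unsigned int)p[2] << 16)
         | ((unsigned int)p[3] << 24);
}
```

Casting a byte pointer into the bitstream to a 32-bit pointer and dereferencing it works on a PC but is exactly the pattern that goes wrong here; a byte-wise read costs more instructions, so it is best kept for places where alignment cannot be guaranteed.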

7. Deletion of redundant code

Much of the code in the PC implementation is irrelevant to the DSP implementation and can be deleted to make the code run more efficiently. For example, the original code contains a great deal of debug output, trace output, assert checks, and printf calls, all of which were needed for debugging while the code was being written; they can be removed from the DSP build. There are also data analysis functions, such as the SNR calculation and a large number of statistics functions, that are computationally expensive. A compact codec does not need them, and the DSP does not have to compute them alongside the coding itself.
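Rather than deleting every trace call by hand, the calls can be compiled out with a macro. The sketch below is one common C idiom and not taken from the original source: H264_DEBUG, TRACE, and sum_abs_diff are hypothetical names, and the double parentheses let the macro take a printf-style argument list without variadic macros.

```c
#include <stdio.h>

#ifdef H264_DEBUG
#define TRACE(args) printf args      /* PC debug build: print the message */
#else
#define TRACE(args) ((void)0)        /* DSP build: the call disappears */
#endif

/* Sum of absolute differences over n samples, with an optional trace. */
int sum_abs_diff(const unsigned char *a, const unsigned char *b, int n)
{
    int i;
    int sad = 0;

    for (i = 0; i < n; i++) {
        int d = (int)a[i] - (int)b[i];
        sad += (d < 0) ? -d : d;
    }
    TRACE(("sad over %d samples = %d\n", n, sad));
    return sad;
}
```

Building without H264_DEBUG removes the printf call and its format string from the DSP image, while the PC build keeps the diagnostics.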