Introduction
SOPC is a programmable system-on-chip solution proposed by Altera. It integrates the modules required for system design, such as the CPU, memory, I/O interfaces, DSP blocks and phase-locked loops, into a single FPGA to form a programmable system-on-chip, optimizing the design in terms of scale, reliability, size, power consumption, functionality, time to market, development cycle, product maintenance and hardware upgradeability [1].
Currently, IP cores for controllers including UART, SPI, Ethernet, SDRAM, Flash, and DMA are integrated into Altera's SOPC Builder. In addition, users can design their own IP cores or purchase them from third-party vendors according to system needs, and bind them into the system through the Avalon bus as easily as assembling building blocks. IP cores are functionally verified intellectual-property cores. Using them offers the following advantages: (1) improved design performance; (2) reduced product development cost; (3) a shorter design cycle; (4) strong design flexibility; (5) convenient simulation; and (6) risk-free evaluation through OpenCore Plus.
Of course, the IP core described in this paper is not that feature-rich. It is really just user logic whose functions have been verified, and there is still a gap between it and a commercial IP core. The main work of this paper is to describe, in a hardware description language, the logic for video signal acquisition, distribution, storage and color-space conversion, and to verify that these functions are correct.
1. Principle of the Video Codec User Logic Camera_Show
In addition to the necessary power supply circuitry, the embedded camera control system also includes storage, communication and download circuits, all connected to the Avalon bus. Here we mainly introduce the user logic interface Camera_Show, which converts analog video data into digital video data and displays it on a VGA monitor. It mainly comprises the acquisition, distribution (performed by the serial-to-parallel conversion circuit), storage (performed by the storage control logic and on-chip RAM) and color-space conversion of the analog video signal. The functional block diagram is shown in Figure 1.
Figure 1 Principle block diagram of the user logic Camera_Show
2. Design of the Video Codec IP Core Camera_Show
The main functions of the video codec IP core are the acquisition, distribution, storage and color-space conversion of video signals. After passing through the ADV7181B, the analog video signal becomes a YUV digital signal that complies with ITU-R 656. To process the YUV signal, however, the three components must be handled separately and in parallel, so the stream must first be acquired and distributed; this is the function implemented in Section 2.1. Since the analog video signal is interlaced while a CRT display is progressive, leaving it unprocessed would inevitably cause line staggering, so the data must be stored and converted from interlaced to progressive under control logic; this is the function implemented in Section 2.2. Finally, the three processed YUV components must undergo color-space conversion to become RGB signals; this is the function implemented in Section 2.3.
2.1 YUV Signal Acquisition and Distribution [2]
In the embedded camera control system, the ADV7181B is mainly responsible for decoding the video data from the analog camera, converting analog signals such as CVBS into YUV signals of the ITU-R 656 standard. Figure 2 shows the functional block diagram of the ADV7181B.
Figure 2 ADV7181B functional block diagram
As can be seen from the figure, the ADV7181B converts input analog signals such as CVBS into YUV output signals together with the line synchronization signal HS and the field synchronization signal VS. These are the required digital video signals, which solves the problem of the digital video source. Figure 2 also shows the composition and arrangement of the YUV stream. The byte sequence FF, 00, 00 marks the start of each timing reference code (SAV or EAV), so a detection circuit needs to be built. Note that both SAV and EAV begin with FF, 00, 00; they differ only in the XY status word that follows. According to the chip datasheet, XY[4] is the H bit, which separates the useful signal from the blanking: H = 0 indicates SAV, otherwise it is EAV. XY[6] is the field flag F: 0 for the odd field and 1 for the even field.
One line of the analog signal occupies 1716 clock cycles, of which the useful (active) signal occupies 1440. During acquisition and distribution only the useful portion needs to be captured, so detection of SAV is used as the trigger to start the distribution process.
Since the Y, U and V samples are interleaved in the serial stream, a signal-selection circuit is required. A counter performs the selection: when the count is 0 or 2 the byte is a chroma (UV) sample, and when the count is 1 or 3 it is a luma (Y) sample. What this actually accomplishes is a serial-to-parallel conversion. The whole process is represented by the block diagram of Figure 3.
Figure 3 Schematic diagram of YUV signal acquisition and distribution
In a hardware description language, completing the above process is relatively simple. For the detection circuit, for example, only a shift register needs to be described. The specific code is as follows:
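The original listing is not reproduced in the source; the following is a minimal Verilog sketch of such a preamble detector. TD_D and Y_check are named in the surrounding text; the clock name, widths and reset behavior are assumptions.

```verilog
// Hypothetical sketch of the FF,00,00 preamble detector described above.
module sav_detect (
    input            TD_CLK27,   // 27 MHz pixel clock from the ADV7181B (assumed name)
    input      [7:0] TD_D,       // ITU-R 656 byte stream
    output           Y_check     // high when the last three bytes were FF,00,00
);
    reg [7:0] sh0, sh1, sh2;     // three-stage, 8-bit shift register

    always @(posedge TD_CLK27) begin
        sh2 <= sh1;
        sh1 <= sh0;
        sh0 <= TD_D;
    end

    // The flag is set as soon as the preamble FF,00,00 has shifted in;
    // the byte now on TD_D is the XY status word.
    assign Y_check = (sh2 == 8'hFF) && (sh1 == 8'h00) && (sh0 == 8'h00);
endmodule
```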
The wire Y_check is a flag that is set to 1 when the sequence FF, 00, 00 is detected. As described above, SAV and EAV are distinguished by XY[4], and odd and even fields by XY[6]. Since the signal distribution circuit should be enabled only when the following status word is an SAV, judgment logic must be described. The code is as follows:
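As a hedged sketch of this judgment logic (only START, TD_D and Y_check come from the text; the clock name is an assumption):

```verilog
// SAV/EAV judgment: Y_check comes from the preamble detector, and bit 4 of
// the XY status word distinguishes SAV (0) from EAV (1).
module sav_judge (
    input        TD_CLK27,
    input  [7:0] TD_D,      // XY status word is on TD_D while Y_check is high
    input        Y_check,
    output reg   START      // 1 between SAV and EAV (active video region)
);
    always @(posedge TD_CLK27) begin
        if (Y_check) begin
            if (TD_D[4] == 1'b0)   // H bit = 0: SAV, start collecting
                START <= 1'b1;
            else                   // H bit = 1: EAV, stop
                START <= 1'b0;
        end
    end
endmodule
```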
The START signal is the flag that starts signal collection and distribution. The distribution circuit works only when TD_D[4] = 0 (i.e., the status word is an SAV), at which point START = 1. The serial-to-parallel circuit code is as follows:
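A minimal sketch of the serial-to-parallel (demultiplexing) circuit. Per the text, counts 0 and 2 carry the chroma samples and counts 1 and 3 the luma samples, and YPix_clock is the 27 MHz clock divided by two. Names other than TD_D, START, Cbb, YY, Crr and YPix_clock are assumptions.

```verilog
module yuv_demux (
    input            TD_CLK27,
    input      [7:0] TD_D,
    input            START,       // from the SAV/EAV judgment logic
    output reg [7:0] Cbb,
    output reg [7:0] YY,
    output reg [7:0] Crr,
    output reg       YPix_clock   // 13.5 MHz: 27 MHz divided by two
);
    reg [1:0] cnt;                // position within the Cb,Y,Cr,Y group

    always @(posedge TD_CLK27) begin
        if (!START) begin
            cnt        <= 2'd0;
            YPix_clock <= 1'b0;
        end else begin
            case (cnt)
                2'd0: Cbb <= TD_D;   // chroma blue
                2'd1: YY  <= TD_D;   // luma
                2'd2: Crr <= TD_D;   // chroma red
                2'd3: YY  <= TD_D;   // luma
            endcase
            YPix_clock <= cnt[0];    // toggles once per luma sample
            cnt        <= cnt + 2'd1;
        end
    end
endmodule
```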
The above code completes the function of Figure 3. The input signal is called TD_D, and the three output signals are Cbb, YY, and Crr. Note that there is also a YPix_clock, which is the 27 MHz clock divided by two (13.5 MHz). This clock is very useful and will be explained in detail below.
2.2 Storage of YUV Signals
To convert interlaced video signals to progressive, there are two solutions:
The first method is to store a full frame of data. Odd and even fields can be distinguished by XY[6]. During the write cycle, because even-field lines must be interleaved between the odd-field lines, the write address has to jump: lines are delimited by the line synchronization signal (or by SAV), and at each line change the address is advanced by an extra 720 to leave room for the even-field line that belongs between them. When the even field arrives (XY[6] = 1), the address is switched to the initial base address plus 720, and the remaining lines are handled the same way as the odd lines. The specific address allocation is shown in Figure 4.
Figure 4 Address allocation table
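The address-jumping scheme of Figure 4 might be sketched as follows: within each field, consecutive lines are placed 2 × 720 apart so that lines of the other field fit in between, and the even field starts at base address 720. The module and signal names here are assumptions.

```verilog
// Illustrative write-address generator for method 1 (frame store).
module frame_waddr (
    input             wclk,       // 13.5 MHz write clock
    input             new_field,  // pulse at the start of each field
    input             new_line,   // pulse at each SAV
    input             even_field, // XY[6]: 0 = odd field, 1 = even field
    output     [18:0] waddr       // write address driven to the frame store
);
    reg [18:0] line_base;         // base address of the current line
    reg  [9:0] pix;               // pixel offset within the line (0..719)

    assign waddr = line_base + pix;

    always @(posedge wclk) begin
        if (new_field) begin
            line_base <= even_field ? 19'd720 : 19'd0;  // even field offset
            pix       <= 10'd0;
        end else if (new_line) begin
            line_base <= line_base + 19'd1440;  // own line + skipped line
            pix       <= 10'd0;
        end else
            pix <= pix + 10'd1;
    end
endmodule
```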
In the read cycle, the data only needs to be read out sequentially. Note that the write clock is 13.5 MHz, the read clock is 27 MHz, and the Y, U and V components must be stored separately.
The second method is to store one line of data. Since 1716 clock cycles exactly equal the time of two VGA lines, the valid video data can be read out twice during this period, and the odd-line signal can substitute for the even-line signal, achieving interlaced-to-progressive conversion. In terms of implementation, only two RAM blocks performing a ping-pong operation are needed, as explained in detail later.
Comparing the two implementations: the advantage of method 1 is that the image is not distorted, since the odd and even lines remain properly interleaved, which method 2 cannot achieve. Method 1 can also raise throughput with a ping-pong scheme, but because the read and write clocks are asynchronous, each storage area must still be read twice. The difference is that method 2 reads each line twice, while method 1 reads each frame twice.
The disadvantage of method 1 is the amount of data to be stored: the Y component of one frame alone is 8 bit × 720 × 525 = 3,024,000 bit = 378 KB. This is too much to handle in SRAM and requires SDRAM, but operating SDRAM is relatively complicated. Method 2 is therefore generally preferred: it needs very little space and can be implemented with the FPGA's on-chip resources. When the image data is refreshed quickly, the human eye essentially cannot distinguish odd-field from even-field signals, so method 2 is feasible. Before discussing method 2, it is necessary to understand the ping-pong operation often used in pipelined designs. This is a common design concept and technique in programmable logic, frequently applied to data-flow control. A typical ping-pong operation is shown in Figure 5 [3][4].
Figure 5 Ping-Pong operation diagram
The processing flow of the ping-pong operation is as follows. The input data stream is distributed by the "input data stream selection unit" to two data buffer modules in turn. The buffers may be any storage element; dual-port RAM (DPRAM), single-port RAM (SPRAM) and FIFOs are the most common choices. In the first buffer cycle, the input stream is cached in data buffer module 1. In the second cycle, the input selection unit switches so that the input stream is cached in data buffer module 2, while the data cached in module 1 during the first cycle is passed, via the "output data stream selection unit", to the "data stream operation processing module". In the third cycle the roles swap back: the input stream is again cached in module 1 while the data cached in module 2 is processed. This cycle repeats indefinitely.
The most significant feature of the ping-pong operation is that, through the rhythmic, coordinated switching of the input and output selection units, the buffered data stream reaches the "data stream operation processing module" without any pause. Viewed as a whole from both ends of the module, the input and output data streams are both continuous, so the structure is well suited to pipelined processing; the ping-pong method is therefore often used in pipelined algorithms to achieve seamless buffering and processing of data.
In FPGA design, the use of the ping-pong operation is an embodiment of the area-speed trade-off principle.
Method 2 can be implemented as follows: use an Altera megafunction to construct a dual-port RAM inside the FPGA. The hardware description language definition of the dual-port RAM's input and output signals is as follows:
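The port list described in the text might be declared as below. The signal names follow the text; the widths (8-bit data, a 10-bit address covering one 720-sample line) and the behavioral body are assumptions, since the original is wizard-generated.

```verilog
// Sketch of the dual-port line RAM with the signal pairs listed in the text.
module line_dpram (
    input            clock_a, clock_b,       // independent port clocks
    input      [9:0] address_a, address_b,
    input      [7:0] data_a, data_b,
    input            wren_a, wren_b,         // per-port write enables
    output reg [7:0] q_a, q_b
);
    reg [7:0] mem [0:719];                   // one line of 720 samples

    always @(posedge clock_a) begin
        if (wren_a) mem[address_a] <= data_a;
        q_a <= mem[address_a];
    end

    always @(posedge clock_b) begin
        if (wren_b) mem[address_b] <= data_b;
        q_b <= mem[address_b];
    end
endmodule
```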
The signals used include the data inputs data_a and data_b; the write-enable signals wren_a and wren_b; the address signals address_a and address_b; the clock signals clock_a and clock_b; and the data outputs q_a and q_b. All signals appear in pairs precisely for ping-pong data transfer. The memory is divided into two areas, A and B, equivalent to data buffer modules 1 and 2 in the ping-pong scheme described above. The two areas are read and written alternately (determined by I_a and I_b), and the output data stream is likewise selected by these flags. As mentioned earlier, the write clock is 13.5 MHz and the read clock is 27 MHz, so clock_a and clock_b must each be switched between the read and write clocks, and the address counters differ accordingly: addresses advance at 13.5 MHz during the write cycle and at 27 MHz during the read cycle. Each line of data is therefore read twice, which converts the interlaced video to progressive. Figure 6 shows a simulation of the RAM ping-pong operation in Quartus II:
Figure 6 RAM Ping-Pong Operation Simulation Diagram
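The alternating role switch governed by I_a and I_b, as described above, could be as simple as a toggle that swaps the write/read roles of areas A and B at every line boundary (all names besides I_a and I_b are assumptions):

```verilog
// Ping-pong role selection: while one RAM area is written at 13.5 MHz,
// the other is read twice at 27 MHz; the roles swap on every line.
module pingpong_sel (
    input       clk,
    input       new_line,   // pulses at each SAV (line boundary)
    output reg  I_a,        // 1: area A is being written, area B read
    output      I_b         // 1: area B is being written, area A read
);
    assign I_b = ~I_a;

    always @(posedge clk)
        if (new_line) I_a <= ~I_a;  // swap write/read roles each line
endmodule
```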
The allocation table of the RAM block for ping-pong operation signals is as follows:
The final output DATA signal enters the next-level unit, which performs the conversion from YUV to RGB.
2.3 Design of the Color-Space Conversion Part [5]
Why is this conversion necessary? Both TV sets and CRT monitors are ultimately driven by RGB signals, yet video is transmitted and processed in YUV form, because transmitting RGB directly has the following drawbacks:
(1) Incompatible with black and white images;
(2) Occupies too much bandwidth;
(3) Poor anti-interference ability.
The conversion from YUV (YCbCr) to RGB is:

R = 1.164(Y - 16) + 1.596(Cr - 128);
G = 1.164(Y - 16) - 0.813(Cr - 128) - 0.392(Cb - 128);
B = 1.164(Y - 16) + 2.017(Cb - 128);
From the above formulas we can see that the conversion requires multiplication and addition, and the coefficients are fractional, so they must be scaled up (here by 256). After a reasonable transformation, the formulas become:
R = (1/256) * (298*Y + 409*Cr - 57065);
G = (1/256) * (298*Y - 100*Cb - 208*Cr + 34718);
B = (1/256) * (298*Y + 516*Cb - 70861);
The conversion from YUV to RGB is implemented in Verilog HDL. The design consists of three modules and one simulation stimulus. The module const_mult mainly implements the multiplication. The main code is as follows:
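A hedged reconstruction of const_mult, using the parameter names IN_SIZE, OUT_SIZE and CST_MULT mentioned in the text. The original may realize the product as shift-adds; a direct multiply is shown here for clarity, and the default values are assumptions.

```verilog
// Parameterized constant multiplier for the color-space conversion.
module const_mult #(
    parameter IN_SIZE  = 8,     // input operand width
    parameter OUT_SIZE = 18,    // product width
    parameter CST_MULT = 298    // constant coefficient
) (
    input  [IN_SIZE-1:0]  din,
    output [OUT_SIZE-1:0] dout
);
    assign dout = din * CST_MULT;   // e.g. 298 * Y
endmodule
```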
In csc.v, the const_mult module is instantiated; the values of the parameters IN_SIZE, OUT_SIZE, and CST_MULT are overridden through parameter passing, and the addition is then implemented.
Taking R = (1/256) * (298*Y + 409*Cr - 57065) as an example, the main code is as follows:
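A sketch of this R-channel datapath: two const_mult instances plus a signed addition, producing the intermediate R_full named below. The widths and instance names are assumptions.

```verilog
// R_full = 298*Y + 409*Cr - 57065 (before the final divide by 256).
module csc_r (
    input  [7:0]         Y,
    input  [7:0]         Cr,
    output signed [18:0] R_full
);
    wire [17:0] y_term, cr_term;

    const_mult #(.IN_SIZE(8), .OUT_SIZE(18), .CST_MULT(298))
        mult_y  (.din(Y),  .dout(y_term));    // 298 * Y
    const_mult #(.IN_SIZE(8), .OUT_SIZE(18), .CST_MULT(409))
        mult_cr (.din(Cr), .dout(cr_term));   // 409 * Cr

    assign R_full = $signed({1'b0, y_term}) + $signed({1'b0, cr_term})
                  - 19'sd57065;
endmodule
```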
The code for G and B is similar and is not repeated here. The following code implements the R_full * 1/256 function.
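One possible implementation of this step is an arithmetic right shift by 8 with saturation to the 0..255 output range; the clipping behavior is an assumption, since the original listing is not reproduced.

```verilog
// R = R_full / 256, saturated to an 8-bit unsigned pixel value.
module clip_div256 (
    input  signed [18:0] R_full,
    output        [7:0]  R
);
    wire signed [10:0] shifted = R_full >>> 8;   // arithmetic divide by 256

    assign R = (shifted < 0)   ? 8'd0   :        // clip negative results
               (shifted > 255) ? 8'd255 :        // clip overflow
               shifted[7:0];
endmodule
```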
The top-level module yuv2rgb instantiates the submodules and is simulated with ModelSim. The simulation waveform is shown in Figure 7:
Figure 7 YUV to RGB conversion simulation diagram
3. Conclusion
This paper designs a video codec controller IP core based on SOPC. Following a top-down design concept, the IP core is partitioned into hierarchical functions, then simulated and verified, achieving the acquisition, distribution, storage and color-space conversion of video signals. The IP core is highly portable and can easily be applied to any Nios II-based embedded system that requires video codec controller functionality.