Design of high dynamic range image signal processing based on FPGA

Publisher: 感恩的7号 | Last updated: 2011-11-01
Video image signal processing (ISP) has come a long way since the analog era. Today, digital signal processing enables image data to be manipulated at the bit level, providing unprecedented control over image quality. Digital signal processing should not be confused with digital signal processors (DSPs). While DSPs have been widely used for digital video image signal processing, an ISP can be implemented on a variety of processing devices: DSPs, ASICs, ASSPs, and, increasingly, field programmable gate arrays (FPGAs).

Why Use FPGAs?

There are several reasons driving the growing popularity of FPGAs. Two of them reflect recent trends in security cameras, which have greatly increased the amount of image data that needs to be processed; the third is economic, namely the bill of materials (BOM) cost of the camera components.

Security Camera Trends: There are two major trends that are changing the way security cameras are designed:

1. The advent of megapixel sensors

2. The need for high (or wide) dynamic range (HDR/WDR)

Megapixel sensor

There was a time when VGA-resolution sensors were sufficient for security cameras, which were usually monitored by human operators or simply archived for later review. However, as the number of security cameras deployed worldwide has increased dramatically, there are no longer enough human operators to watch them, so the security industry has come to rely on software that analyzes video for anomalies in "areas of interest", either in real time or after the fact. Sophisticated video analytics (VA) algorithms have been developed to distinguish abnormal situations from normal ones; to be effective, however, these algorithms need more detail than VGA-resolution cameras can provide.

Cameras need higher resolution so that VA can identify general activity in both confined and large areas, such as a parking lot. A camera needs about 30 pixels/foot for license plate recognition, and about 150 pixels/foot to discern more detailed activity, such as reading a cash register transaction. One megapixel covers that level of detail within a 7-foot x 7-foot area, and it takes four VGA cameras to equal one megapixel camera.
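The pixel-density figures above can be checked with simple arithmetic (the square-sensor geometry here is an illustrative assumption): a 1-megapixel sensor is roughly 1000 x 1000 pixels, so spread across a 7-foot-wide area it delivers about 143 pixels/foot, consistent with the roughly 150 pixels/foot needed to see detailed activity.

```python
import math

def pixels_per_foot(total_pixels, field_width_ft):
    """Linear pixel density when a square sensor covers a square area."""
    side_px = math.sqrt(total_pixels)   # pixels along one side
    return side_px / field_width_ft

density = pixels_per_foot(1_000_000, 7)  # 1 MP over a 7 ft x 7 ft area
print(round(density))                    # 143 pixels/foot

# A VGA sensor (640 x 480) holds ~0.3 MP, so four of them
# approximate one 1.2 MP (1280 x 960) megapixel camera.
print((1280 * 960) // (640 * 480))       # 4
```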

Image sensors with resolutions of 1, 2, 5, and even 10 megapixels have been developed and are commercially available. Obviously, as the pixel count increases, so does the amount of data that must be processed to take advantage of the higher resolution.

High Dynamic Range (HDR)

HDR, also known as Wide Dynamic Range (WDR), measures how well the sensor and ISP distinguish between dark and bright areas. We are all familiar with amateur family photos taken with the sun in the background: while the sunlit landscape is bright and clear, the faces of the people are very dark. This is because the (usually automatic) camera adjusts the exposure for the sunlit scene, and that exposure is too short to resolve darker objects. If you manually lengthen the exposure or open the aperture to let in more light, you can distinguish detail in the dark areas, but the bright areas are then overexposed or even completely washed out. Neither result is good for the operator or the VA software, because much of the detail in the area of interest has been lost.

HDR sensors solve this problem in a creative way: they take multiple exposures of the same scene at different exposure times, and the ISP pipeline then combines and fuses these images to preserve detail in both the bright and dark areas of interest. Obviously, when multiple exposures are captured per image, the amount of data to be processed grows accordingly. For example, when a camera with an HDR sensor outputs full HD 1080p at 60 frames per second with 3 exposures per frame, the ISP pipeline in the camera is actually processing 60 x 3 = 180 frames per second.
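A toy sketch of multi-exposure fusion: each sample is normalized by its exposure time, and clipped samples are discarded before averaging. The clipping thresholds and flat weighting here are illustrative only; a production ISP uses far more sophisticated per-pixel weighting.

```python
def fuse_exposures(pixels, exposure_times):
    """Naive exposure fusion: normalize each exposure by its time,
    then average samples that are neither under- nor over-exposed."""
    MAX = 255
    fused = []
    for samples in pixels:                      # samples: one value per exposure
        usable = [v / t for v, t in zip(samples, exposure_times)
                  if 5 < v < MAX - 5]           # reject clipped samples
        fused.append(sum(usable) / len(usable) if usable else 0.0)
    return fused

# Three exposures of the same two pixels (short, medium, long shutter):
# the long exposure of the bright pixel saturates at 255 and is rejected.
print(fuse_exposures([(10, 40, 160), (60, 240, 255)], [1, 4, 16]))  # [10.0, 60.0]
```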

FPGAs and increased processing load

The combination of megapixel sensors and HDR significantly increases the processing load on the ISP pipeline. DSP devices are essentially sequential processing engines and have difficulty handling such huge data loads. A high-end DSP might still manage the 1080p60 HDR pipeline of our example above, but at a prohibitive cost and power consumption. FPGAs, with their inherent parallelism, are well suited to the increased load of high-definition, high dynamic range image signal processing.

The Importance of Programmability

In addition to providing high performance at very low power and cost, FPGAs are by definition programmable, which offers significant advantages over ASICs and ASSPs. ASICs are extremely expensive to design and build, and once completed they cannot be changed. ASSP-based camera designs may be limited by the functionality of standard parts that have already been built and cannot be modified. In fact, some DSPs and other ASSP devices in the video image processing market require an FPGA to bridge between the sensor and the standard part, to accommodate the new serial interfaces that sensor manufacturers are adopting to transmit megapixel data from their sensors. With FPGA-based implementations, camera manufacturers can use this programmability to quickly adopt new sensors and technologies in their designs, or to quickly change their ISP algorithms.


Implementing ISP with HDR in FPGA

To implement an ISP with HDR in an FPGA, at least the ISP blocks in the image signal processing pipeline shown in Figure 1 must be implemented.

Figure 1 Image signal processing pipeline


The following ISP modules are required:

Sensor port with automatic black level correction: needed to detect and configure the image sensor registers and to capture image data.

Black level correction: Each color channel has a temperature- and time-dependent offset. Color processing requires linear signal processing, so all signals must be free of any offset. CMOS image sensors provide so-called dark lines, whose output measures the average offset of each color channel. Black level correction subtracts this per-channel offset, derived from the dark-line baseline, to achieve an accurate black level.
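A minimal sketch of this step; the per-channel offsets and the single-row Bayer mapping below are hypothetical values chosen for illustration.

```python
def black_level_correct(row, dark_offsets, channel_of):
    """Subtract the per-channel offset measured from the sensor's dark lines.
    `channel_of(i)` maps a pixel index to its color channel."""
    return [max(0, v - dark_offsets[channel_of(i)]) for i, v in enumerate(row)]

# Hypothetical offsets measured from dark lines
offsets = {"R": 64, "G": 60, "B": 68}
bayer = lambda i: "RG"[i % 2]   # one R/G row of a Bayer mosaic
print(black_level_correct([164, 80, 60, 260], offsets, bayer))  # [100, 20, 0, 200]
```

Clamping at zero matters: a pixel darker than the measured offset must not wrap around to a large value.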

Automatic Exposure: The purpose of the automatic exposure module is to continuously adjust the exposure to changing light conditions in real time.
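A minimal sketch of one auto-exposure control step; the target luminance, loop gain, and exposure limits are hypothetical, and real implementations use richer scene statistics and faster convergence strategies.

```python
def adjust_exposure(exposure, mean_luma, target=118, gain=0.05, lo=1, hi=10_000):
    """One step of a proportional auto-exposure loop: nudge the exposure
    toward the target mean luminance, clamped to the sensor's valid range."""
    error = target - mean_luma
    new_exposure = exposure * (1 + gain * error / target)
    return max(lo, min(hi, new_exposure))

print(adjust_exposure(100.0, 59))    # scene too dark  -> exposure increases
print(adjust_exposure(100.0, 236))   # scene too bright -> exposure decreases
```

Running this every frame is what lets the module track changing light conditions in real time.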

Linearization: For example, the Aptina MT9M024/034 HDR sensors capture 20 bits of information per color channel. To minimize the number of physical output lines from the sensor, Aptina uses a smart companding scheme to compress this data to 12 bits. Linearization is the process of decompressing the 12-bit data back to the original 20 bits.
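The decompression can be sketched as a piecewise-linear lookup. The knee points below are purely illustrative, not the actual Aptina companding table, which is documented in the sensor datasheet.

```python
import bisect

# Hypothetical 3-segment companding curve: compressed 12-bit -> linear 20-bit
KNEES_IN  = [0, 2048, 3072, 4096]       # compressed-domain breakpoints
KNEES_OUT = [0, 2048, 65536, 1048576]   # linear-domain breakpoints

def decompand(code12):
    """Piecewise-linear decompression of a 12-bit companded sample
    back to the sensor's native 20-bit linear response."""
    seg = bisect.bisect_right(KNEES_IN, code12) - 1
    seg = min(seg, len(KNEES_IN) - 2)
    x0, x1 = KNEES_IN[seg], KNEES_IN[seg + 1]
    y0, y1 = KNEES_OUT[seg], KNEES_OUT[seg + 1]
    return y0 + (code12 - x0) * (y1 - y0) // (x1 - x0)

print(decompand(1024))   # 1024: the first segment is 1:1
print(decompand(4095))   # 1047616: near the 20-bit full scale
```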

Defective pixel correction: Dead or hot pixels, caused by the manufacturing process, need to be repaired by the defective pixel correction module. This module replaces a defective pixel value by interpolating adjacent pixels of the same color channel. Typical methods detect cold (dead) or hot pixels and estimate a replacement using the median or mean of the current pixel's neighborhood.
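A minimal sketch of median-based correction on a Bayer mosaic; the detection threshold is an arbitrary illustrative value. Same-color neighbors in a Bayer pattern sit two rows or columns away, hence the step of 2.

```python
import statistics

def correct_pixel(img, r, c, threshold=200):
    """Replace a pixel with the median of its same-color neighbors
    when it deviates strongly from them."""
    neighbors = [img[r + dr][c + dc]
                 for dr in (-2, 0, 2) for dc in (-2, 0, 2)
                 if not (dr == 0 and dc == 0)
                 and 0 <= r + dr < len(img) and 0 <= c + dc < len(img[0])]
    med = statistics.median(neighbors)
    return med if abs(img[r][c] - med) > threshold else img[r][c]

img = [[100] * 5 for _ in range(5)]
img[2][2] = 4095                    # a stuck-high ("hot") pixel
print(correct_pixel(img, 2, 2))     # 100: the hot pixel is replaced
```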

2-D noise reduction: In addition to permanently cold or hot pixels, sensor pixels can randomly become noisy within a frame, outputting an intensity that is too high or too low compared with their neighbors. 2-D noise reduction corrects noisy pixels by interpolating neighboring pixels of the same color channel, much as the defective pixel correction module does.

De-Bayering (color filter array interpolation): Each pixel on the sensor sits under a so-called Bayer filter of one of three colors: red, green, or blue. Two-thirds of the color data is therefore missing, and the raw image is a mosaic of the three colors. To obtain a full-color image, demosaicing algorithms interpolate a complete set of red, green, and blue values for every pixel.
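As a sketch of the interpolation idea, not a production demosaicer: the missing green value at a red (or blue) pixel site can be estimated by averaging its four green neighbors. A full bilinear demosaicer repeats this for every channel and site type.

```python
def green_at_red(img, r, c):
    """Bilinear de-Bayer step: estimate the missing green value at a
    red/blue pixel site by averaging its four green neighbors."""
    g = [img[r - 1][c], img[r + 1][c], img[r][c - 1], img[r][c + 1]]
    return sum(g) / len(g)

# Tiny toy mosaic: the center (1, 1) is a red site, its four
# orthogonal neighbors (50, 48, 50, 52) are green sites.
mosaic = [
    [200,  50, 210],
    [ 48, 120,  50],
    [205,  52, 198],
]
print(green_at_red(mosaic, 1, 1))   # 50.0
```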

Color Correction Matrix (CCM): Image sensors often reproduce colors inaccurately due to so-called cross-color effects, caused by signal crosstalk between pixels. This effect yields images with incorrect colors (for example, greens that look bluish). Color correction applies a matrix multiplication to the pixel data to recover pure colors.
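The per-pixel matrix multiplication can be sketched directly. The coefficients below are hypothetical, chosen so each row sums to 1.0, which keeps grays neutral while boosting saturation.

```python
def apply_ccm(rgb, matrix):
    """Color correction: multiply each RGB pixel by a 3x3 matrix
    to undo cross-color crosstalk."""
    return tuple(sum(m * v for m, v in zip(row, rgb)) for row in matrix)

# Hypothetical CCM: boosted diagonal, negative off-diagonal crosstalk terms
ccm = [[ 1.4, -0.3, -0.1],
       [-0.2,  1.5, -0.3],
       [-0.1, -0.4,  1.5]]

print(apply_ccm((100, 100, 100), ccm))   # gray input stays (approximately) gray
```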

Auto White Balance (AWB): Sensors cannot directly "recognize" white. Using the so-called "gray world" algorithm, AWB infers a white reference from the image and adjusts the other colors accordingly. By estimating the color of the illuminating light, AWB renders an image with natural colors.
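A minimal sketch of the gray-world assumption: the average of a scene is taken to be gray, so per-channel gains are chosen to equalize the channel means. Using green as the reference channel is a common convention; the scene data here is made up.

```python
def gray_world_gains(pixels):
    """Gray-world AWB: compute per-channel gains that equalize the
    channel means against the green reference."""
    n = len(pixels)
    means = [sum(p[ch] for p in pixels) / n for ch in range(3)]
    g = means[1]                      # green is the reference channel
    return tuple(g / m for m in means)

# A scene lit by a bluish source: the blue mean is inflated
scene = [(80, 100, 140), (40, 60, 90), (120, 140, 190)]
gains = gray_world_gains(scene)
balanced = [(r * gains[0], g * gains[1], b * gains[2]) for r, g, b in scene]
print([round(x, 2) for x in gains])   # [1.25, 1.0, 0.71]: blue is attenuated
```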

Gamma Correction: Sensor pixels respond linearly to incident light intensity. To supply pixel data to common video systems, which expect the non-linear (gamma) response of a picture tube, conversion to a non-linear encoding may be necessary. Gamma correction provides this conversion.
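A simple power-law encoder illustrates the conversion; real video standards such as BT.709 add a linear segment near black, which is omitted here.

```python
def gamma_encode(linear, gamma=2.2, max_val=255):
    """Map a linear sensor value to the non-linear code expected
    by display pipelines (simple power-law encoding)."""
    normalized = linear / max_val
    return round((normalized ** (1 / gamma)) * max_val)

print(gamma_encode(0))     # 0
print(gamma_encode(64))    # 136: mid-shadows get lifted
print(gamma_encode(255))   # 255
```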

High/Wide Dynamic Range (HDR/WDR) processing: This is the block that maps the 20-bit sensor data into 8-bit RGB data so that both the bright and dark areas of the scene are represented in the displayed image. A wide internal pipeline is required to ensure that details in the shadows are not lost, even when an intruder shines a flashlight directly into the camera lens. HDR works closely with the fast auto-exposure algorithm to adjust exposure quickly under changing light conditions.
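As an illustration only, a global logarithmic operator shows how 20-bit linear data can be compressed into 8 bits; production HDR pipelines use locally adaptive, content-aware tone mapping rather than a single global curve.

```python
import math

def tone_map(value20, in_bits=20, out_max=255):
    """Logarithmic tone mapping: compress a 20-bit linear HDR value
    into 8 bits so shadows and highlights both stay distinguishable."""
    in_max = 2 ** in_bits - 1
    return round(out_max * math.log1p(value20) / math.log1p(in_max))

print(tone_map(0))            # 0
print(tone_map(1000))         # 127: shadows keep about half the output range
print(tone_map(2**20 - 1))    # 255
```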

Figure 2 HDR processed image: Strong flash hits the lens directly from a distance of 10 inches, causing no image loss


Table 1 below shows the typical FPGA resources required to implement all of the above ISP blocks in a low-cost, low-power FPGA with 33K lookup tables (LUTs):

Table 1 FPGA resource usage of the ISP pipeline in the Lattice ECP3-35 FPGA


In addition to the ISP modules already mentioned, the actual implementation includes a statistics engine and image histograms used by specific modules in the system, a Lattice Mico32 soft processor for dynamic pipeline control, an I2C master for controlling the various devices, an HDMI PHY module for driving HDMI signals directly from the FPGA, and even a logo graphics overlay. This shows that a low-cost, low-power FPGA such as the Lattice ECP3-35 can implement the entire image signal processing pipeline plus HDMI output. The internal HDR pipeline is 32 bits wide and can provide a high dynamic range of 192 dB (20 log10(2^32)). In this implementation a sensor with a dynamic range of 120 dB is used, limiting the HDR to 120 dB, still the highest value achievable by any FPGA. The implementation processes 1080p images at 60 frames per second while providing a high dynamic range of 120 dB.
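The dynamic-range figures quoted above follow directly from the pipeline and sensor bit widths:

```python
import math

def dynamic_range_db(bits):
    """Dynamic range of an n-bit linear pipeline: 20 * log10(2**n) dB."""
    return 20 * bits * math.log10(2)

print(round(dynamic_range_db(32), 1))   # 192.7: the 32-bit internal pipeline
print(round(dynamic_range_db(20), 1))   # 120.4: the 20-bit sensor data
```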

Advantages of using FPGA in HDR image signal processing

Low cost

As mentioned above, a simple, low-cost 33-KLUT FPGA can easily handle a 1080p60 pipeline. The bill of materials of a 1080p60 HDR camera implemented with the Lattice ECP3-35 consists mainly of the sensor, the FPGA and its associated clock oscillator, resistors and capacitors, voltage regulators, an HDMI connector, and the lens assembly.

High performance

The demonstrated implementation delivers 120 dB HDR, 1080p60 performance, the industry's fastest auto-exposure, and extremely high-quality auto white balance.

Low power consumption

The Lattice ECP3 has extremely low static and dynamic power consumption compared with competing FPGAs or DSPs.

DDR3 support: Manufacturers who wish to include frame buffer memory in their designs can take advantage of the FPGA's support for high-performance, low-cost DDR3 memory.

Low-power SERDES: A low-power FPGA with SERDES functionality enables manufacturers to implement the HDMI PHY directly in the FPGA, providing HDMI functionality without the cost of adding an external HDMI chip.

Summary

Low-cost, low-power FPGAs are ideal for handling the massive increase in signal processing load caused by the need to use megapixel sensors and HDR features in cameras in security applications. Programmable FPGAs also provide unprecedented flexibility. FPGA implementations provide high-performance ISP pipelines with high dynamic range at a cost equal to or lower than traditional image signal processing methods.
