Implementation of a Fast Algorithm for Connected Region Labeling in Binary Images Based on FPGA

Publisher:NanoScribeLatest update time:2011-05-27 Source: 贺明Keywords:FPGA Reading articles on mobile phones Scan QR code
Read articles on your mobile phone anytime, anywhere

1 Introduction

In the process of automatic target recognition and tracking in images, the image targets are first segmented and extracted by threshold. The binary image obtained usually contains multiple connected areas. The system uses the shape characteristics of the image targets to automatically identify suspicious high-threat flying targets. Therefore, it is necessary to detect and judge each connected area block separately. This paper uses an improved fast marking algorithm suitable for FPGA implementation to detect and extract each connected domain.

The following methods are commonly used to detect connected objects in binary images [1] [2] [3]: Region growing method: First, scan the image row (column) by row (column). Whenever an unlabeled "1" pixel is encountered, it is assigned an unused label. Then its area is detected. If there is an unlabeled "1" pixel, the same label is assigned. Repeat this operation until there are no "1" pixels whose labels should be propagated. Then continue to scan the image row (column). If an unlabeled "1" pixel is detected, it is assigned a new label and the same processing is performed as above. When the entire image is scanned, the algorithm terminates. This method can accurately detect various types of connected objects. However, the processing time is also long because the neighborhood of each "1" pixel must be detected one by one, and repeated scanning of "1" pixels occurs. Tracking algorithm: Each pixel with a value of "1" in the binary image is marked with a label related to its coordinates, such as a number consisting of a string of n and m. After heating,

Scan the labeled image and change the label of every ten pixels to the minimum label in its neighborhood. Repeat this process until no label changes are needed. This method converges slowly when dealing with small and convex objects.

This paper proposes a fast binary image connected domain labeling algorithm with computational regularity for the purpose of FPGA implementation. Compared with the traditional binary image labeling algorithm, this algorithm has the characteristics of computational simplicity, regularity and scalability, and is suitable for FPGA implementation. Under the working clock of 100MHz, the processing of 384×288 pixel infrared images can reach a labeling speed of more than 400 frames/second, which is enough to meet the requirements of real-time target recognition systems. The processing speed can meet the requirements of most real-time target recognition systems. The algorithm can also be applied to embedded DSP systems in software programming.

2 Algorithm Description

First, before the marking algorithm is performed, an independent image marking cache and connectivity relationship array are opened up by hardware. Then, during the acquisition and transmission of the video stream, the image is scanned row by row in a pipeline manner in the order of video transmission. Then, the neighborhood of each pixel is checked for connectivity and the equivalent marking relationship is merged in the counterclockwise and horizontal directions respectively. The detected results are used to update the marking equivalence array and the marking cache. After a frame of image acquisition and transmission is completed, the preliminary marking result of the image and the connectivity relationship between the preliminary markings are obtained. Finally, the labels of the connectivity relationship array are merged from small to large according to the label transmission process. The merged connectivity relationship array is used to replace the labels in the image marking cache. The replaced image is the final marking result, and the connected domain is assigned a unique continuous natural number according to the scanning order.

Marking algorithm process

Figure 1 Marking algorithm flow

This paper's fast binary image connected domain labeling algorithm is divided into three parts:

1. Preliminary image labeling: assign a temporary label to each pixel and record the equivalence relationship of the temporary labels in the equivalence table

2. Organize the equivalence table: This step is divided into two steps:

(1) All temporary marks with equivalence relations are made equivalent to their minimum values;

(2) Renumber the connected regions in natural number order to obtain an equivalence relationship between the temporary labels and the final labels.

3. Image replacement: Replace the image pixel by pixel, and replace the temporary mark with the final mark. After three steps of processing, the algorithm outputs the marked image. The connected domains in the image are marked with continuous natural numbers in the order from top to bottom and from left to right. [page]

2.1 Initial Image Labeling

Marking algorithm symbol convention: When the algorithm detects connected domains in the counterclockwise direction, w1 and w2 are used to represent two consecutive lines of image data. When the algorithm detects connected domains in the clockwise direction, k0 and k are used to represent two consecutive lines of image data after counterclockwise marking. Their positions in the working window are respectively illustrated in Figures 2 and 3; the initial counterclockwise temporary mark is represented by Z. The initial mark value of Z is 1.

The binary image connected domain labeling algorithm adopts the 8-connectivity judgment criterion and eliminates the image boundary effect by narrowing the labeling range. In order to simplify the labeling process and complete the labeling process within the hardware transmission operation time of one frame of image, the labeling process is divided into two continuous types using the intermediate data cache. Type 1 is used for direct image sequence transmission. When the hardware initiates image sequence transmission, type 1 uses counter-clockwise connected domain detection to initially label the binary pixels in the 2×3 working window. Type 2 performs horizontal connected domain detection and merging on the image data initially labeled by type 1, and then stores the labeling results in the image storage area.

Image initial tag type 1:

Step 1 reads pixels w1(2), w1(1), w1(0), w0(2), w0(1), and the corresponding binary pixel values.

Step 2: Read pixel w0(1) and compare it with w1(0), w1(1), w1(2), and w0(2) in a counterclockwise direction. If w0(1) = w1(0), then k0(1) = k(2); if w0(1) = w1(1), then k0(1) = k(1); if w0(1) = w1(2), then k0(1) = k(0); if w0(1) = w0(2), then k0(1) = k0(0); otherwise (i.e., w0(1) ≠ (w1(2), w1(1), w1(0), w0(2)), k0(1) = Z; Z++.

Step 3 writes the equivalence relation table, and writes Z into the equivalence relation array with Z as the address.

Counterclockwise initial marking work window

Figure 2 Counterclockwise initial marking work window

Image initial tag type 2:

Step 1 determines that after counterclockwise marking, if w0(1) = w0(2) = 1, and the marking grayscale k0(1) ≠ k0(0), proceed to the next step.

Step 2: Assume k0(1)>k0(0), determine if lab(k0(1))=k0(1) or lab(k0(1))=k0(0), then lab(k0(1))=k0(0), otherwise perform tracking permutation on the label array. Jump to step 3.

Step 3: Assume k0(1) < k0(0), determine whether lab(k0(0)) = k0(0) or lab(k0(0)) = k0(1), then lab(k0(0)) = k0(1), otherwise perform tracking permutation on the label array.

Tracking permutation method: In step 2, the tracking permutation is set to t = lab (k0 (0)); if lab (t) ≠ t, then set t = lab (t), and repeat until lab (t) = t; in step 3, the tracking permutation is set to t1 = lab (k0 (1)), and the above tracking process is also performed on lab (k0 (1)).

Horizontal initial marking work window

Figure 3 Horizontal initial marking work window

2.2 Equivalence table arrangement and image substitution

First, scan the equivalence table starting from address 1.

The equivalence table is checked in turn to see if each temporary tag has an equivalence relationship. If so, the equivalence table is updated with the tag value as the data of the equivalence table address. Since the sorting process starts from the equivalence table address 1, the scanning of the entire equivalence table can be completed in one go.

The image substitution process replaces each pixel in the temporary labeled image to generate the final labeled image. The specific method is: let the temporary label value of the pixel with coordinates (n, m) in the image be S, then write lab(S) to the position (n, m) in the image. In the image obtained after substitution, the connected areas are marked with unique natural numbers in the order from top to bottom and from left to right. [page]

2.3 Analysis of Algorithm Characteristics

The algorithm design has the following characteristics:

a. The algorithm in this paper marks the identification and tracking of aerial targets, and can remove the image boundaries that have no effect on the identification of aerial targets. The marking of other pixels in the image is completed using the algorithm described in 2.1. Therefore, the operation has regularity and same address. When using hardware implementation, only two algorithm framework functions need to be defined for cyclic use, which saves algorithm memory resources.

b. During the initial image labeling process, the equivalence table is preliminarily sorted out while recording the labeling equivalence information. This arrangement can ensure that when there are complex connectivity relationships between regions, the equivalence table can save all the equivalence relationships that have been detected; on the other hand, when the labeling algorithm is implemented in hardware circuits, the initial image labeling and the initial sorting of the equivalence table can be executed in parallel. The preliminary sorting of the equivalence table can simplify the subsequent sorting of the equivalence table, which is equivalent to compressing the entire labeling execution process.

c. In this algorithm, two measures are taken to reduce the number of temporary marks: first, all the mark information generated within the 8-neighborhood range is repeatedly used, and after marking the 8-neighborhood range in counterclockwise order, the equivalent marks are merged in the horizontal direction with the help of the order of image transmission, which reduces the probability of assigning new mark values; second, when merging equivalent marks in the equivalence table, the equivalent marks are compared and replaced in the order of the equivalence table address from small to large, so that the equivalent marks take smaller values ​​and no equivalent marks are omitted. Third, combined with the video data stream transmission method, the ping-pong storage structure is used for pipeline processing, and image marking and image mark replacement are performed at the same time. This enables image marking to achieve real-time processing.

3 FPGA Implementation of Algorithm

FPGA (Field Programmable Gate Array) is a large-scale programmable logic device that can be used in various digital logic systems, especially in real-time processing, and has unique advantages. In the real-time implementation of this algorithm, the cost-effective Cyclone II EP2C8 FPGA[4] from Altera is used. The device has 8256 LEs, 18 DSP modules, and 165888 bits of storage units. These storage units can be configured as memories of different sizes and bits, which can reduce the use of external memory, reduce the size of hardware, and facilitate the miniaturization of circuits.

Hardware structure of fast marking algorithm FPGA implementation

Figure 4 Hardware structure of the fast marking algorithm FPGA implementation

The hardware structure of the fast marking algorithm FPGA implementation shown in Figure 4 is mainly composed of binary video stream delay FIFO serial-to-parallel conversion, counterclockwise marking unit, merge data transmission interface, horizontal direction mark merging unit, mark equivalence relationship table, mark equivalence relationship sorting unit, image mark substitution and other units.

The video stream acquisition unit inside the FPGA binarizes the collected grayscale data according to the segmentation threshold and outputs a binary video stream; the serial binary video data stream is converted into two lines of parallel data through the serial-to-parallel conversion of the delay FIFO; the counterclockwise direction marking unit uses the shift register to form a window as shown in Figure 2 for the received parallel data stream, performs a counterclockwise connectivity check on the data in the window, generates an initial equivalence relationship table and temporarily marks the pixel data; the horizontal direction marking merging unit immediately follows the counterclockwise direction marking, and forms a data window as shown in Figure 3 for the pixel data after the initial marking through the shift register, performs a marking equivalence judgment on the data in the data window in the horizontal direction, merges the marking values ​​belonging to the same area, and tracks the permutation marking equivalence relationship table, unifies all the equivalent marking values ​​before this into the smallest marking value, and finally stores the merged parallel marking video stream in the memory group composed of the subsequent dual-port RAM; the external dual-port RAM stores the marked pixel data of the previous row in the external dual-port RAM when processing the next row of video data stream.

After a frame of image data is marked, the equivalence relation table array is sorted and merged in the gap between video data. When the next frame of video data is transmitted, the previous frame of data is taken out from the external dual-port RAM to replace the marked image and complete the final image marking. [page]

The tag cache adopts a ping-pong structure through the dual-port RAM in the FPGA. When marking two lines of image data, the external dual-port RAM interface stores the marked line of image data. As shown in Figure 5, the tag cache structure uses a total of three 384*10bit dual-port RAMs. Each dual-port RAM corresponds to a line of image tag data. The horizontal merge unit and the external DPRAM interface interact to store data. When the horizontal merge unit stores two of the dual-port RAMs at the same time, the external DPRAM interface performs a storage operation on the remaining third dual-port RAM, forming a tag cache ping-pong structure storage operation. The external storage interface uses 4 times the pixel clock to move the data in the cache to ensure that the external data is moved after the remaining two dual-port RAMs are marked. Experiments have shown that 3 lines of tag data can be moved in the time it takes to transmit a line of tag data.

Tag cache structure

Figure 5 Tag cache structure

In order to meet the real-time pipeline processing of the labeling, the peripheral dual-port RAM also adopts a ping-pong structure. While storing the label data of a frame of image, the data is retrieved to replace the label image, and the shape of the label result image is recognized during the data retrieval process.

The internal structure is shown in Figure 6. The peripheral dual-port RAM is set to two areas of image frame A and B, and two data address ports are used to operate on the two areas A and B at the same time. Therefore, the marking algorithm delays the transmission time of one frame of video data as a whole, which has strong real-time performance.

4 Experimental results and analysis

Figures 7, 8, 9, and 10 show the marking results of four 384×288 binary infrared images. Since this paper is designed for flying targets in the sky, the image edges are removed (the background targets at the edge of the image are not marked and are set to 0); the marking algorithm rules are unified to reduce the impact of edge background on target recognition.

Figures 7, 8, 9, and 10 all show aircraft formations in the sky with a cloudy background. For clouds of different shapes, there are complex complex connectivity relationships, which generate complex equivalence table operations and are ultimately correctly assigned the same label. The simulation results are shown in Table 1. Among them, the FPGA working clock is 100 MHz; n is the number of connected regions; N (MAX) is the maximum temporary label; T1 is the execution time of TI's 6416DSP through the traditional convergence labeling algorithm; T2 is the execution time of TI's 6416DSP through the fast labeling algorithm designed in this paper; T3 is the execution time of FPGA through the fast labeling algorithm designed in this paper.

Internal partitioning of peripheral dual-port RAM

Figure 6 Internal partitioning of the peripheral dual-port RAM

The simulation results show that when the traditional convergence labeling algorithm runs in software on TI's DSP6416 system, the algorithm processing speed is uncertain and depends on the shape and number of connected domains in the image; when the binary segmentation image labeling fast algorithm described in this article runs in software on TI's DSP6416 system, the algorithm can process 384×288 pixel images at a processing speed of 50 frames; however, when the algorithm is implemented in FPGA, it can reach a processing speed of 400 frames/second at a working clock of 100MHz.

Comparison of image labeling simulation results

Table 1 Comparison of image labeling simulation results [page]

Binary infrared image and marked image containing 4 aircraft targets
Figure 7 Binary infrared image with 4 aircraft targets and the labeled image
Binary infrared image containing two aircraft targets and the marked image
Figure 8 Binary infrared image with 2 aircraft targets and the labeled image
Binary infrared image and marked image containing 4 aircraft targets
Figure 9 Binary infrared image with 4 aircraft targets and the labeled image
Binary infrared image containing two aircraft targets and the marked image
Figure 10 Binary infrared image with 2 aircraft targets and the labeled image

References

[1] Zhang Shusheng. A fast detection method for connected objects in binary images based on line-based label propagation[J]. Computer Research and Development, 1994, 31(10): 51-54

[2]Ranganathan N,Mehrotra R.Subramanmian SA high speed systolic architecture for labeling connected components in an image[J].IEEE Transaction on Systems.Man.and Cybernetics, 1995, 25(3):415-423

[3]Nicol CJ,A systolic approach for reahime connected component labeling[J].CVGIP,Image Understanding, 1995, 61(1):17-31

[4]Altera.cycloneII_handbook[R].published by Altera,2005

Keywords:FPGA Reference address:Implementation of a Fast Algorithm for Connected Region Labeling in Binary Images Based on FPGA

Previous article:Swapping Bits to Improve FPGA-PWM Counter Performance
Next article:Radio Interface Conversion Module Based on FPGA

Recommended ReadingLatest update time:2024-11-16 22:41

Design of catenary fault signal analyzer implemented in FPGA chip implanted with 8051 microprocessor
introduction As chips become larger and larger and resources become more abundant, the design complexity of chips also increases greatly. In fact, after the chip design is completed, some controls sometimes need to be changed according to the situation, which is often encountered during use. It would be very undesirab
[Microcontroller]
Design of catenary fault signal analyzer implemented in FPGA chip implanted with 8051 microprocessor
Design of General SSR Signal Processor Based on DSP+FPGA
The secondary surveillance radar (SSR) target identification system can conduct a "question-and-answer" interrogation of a target equipped with a transponder by transmitting a specific radio frequency pulse sequence, and obtain the target's altitude, number and other information from the transponder's reply pulse code
[Embedded]
Design of General SSR Signal Processor Based on DSP+FPGA
Application of FPGA in Virtual Instrument Design
As we all know, virtual instrument technology is a system in which general test hardware functions are defined by software according to user needs. By applying reconfigurable hardware to a virtual instrument system, engineers can use software to develop algorithms and apply them to an embedded chip, thereby extendin
[Test Measurement]
FPGA Implementation of Multi-Phase Representation Structure of Sampling Rate Converter
FPGA is an efficient means of implementing digital signal processing. In the field of high-bandwidth signal processing, FPGA technology can achieve higher computing speeds than general-purpose DSP chips through multi-level computing units on a chip . Since sampling rate conversion can be implemented in a parallel way,
[Embedded]
FPGA Implementation of Multi-Phase Representation Structure of Sampling Rate Converter
Dual-channel rotary transformer angle measurement system based on FPGA
O Introduction A rotary transformer is an electromagnetic induction sensor used to measure the angular displacement and angular velocity of a rotating object. It consists of a stator and a rotor. The stator winding serves as the primary side of the transformer and receives the excitation voltage. The rotor wind
[Embedded]
Dual-channel rotary transformer angle measurement system based on FPGA
How Anlu Technology uses FPGA to break through Industry 4.0
On August 27, at the 2020 World Semiconductor Forum, Anlu Technology was named China's IC Unicorn of 2019. This is the second year that Anlu Technology has won the honor of FPGA unicorn. With the development of new generation information technology, the world has entered an unprecedented era of innovation-intensive
[Mobile phone portable]
Design of Storage Test System Based on FPGA
0 Introduction Dynamic testing technology is a comprehensive technology that aims to capture and process various dynamic information. It plays an important role in contemporary science and technology and is widely used in aerospace, instrumentation, transportation, military, medical and other research. Common
[Test Measurement]
Design of Storage Test System Based on FPGA
Design of a programmable traffic control system based on new rules
This system consists of a single-chip microcomputer system, keyboard, LCD display, and traffic light demonstration system. The system includes sidewalks, left turns, right turns, and basic traffic light functions. In addition to the basic traffic light functions, the system also has countdown, time setting, eme
[Embedded]
Design of a programmable traffic control system based on new rules
Latest Embedded Articles
Change More Related Popular Components

EEWorld
subscription
account

EEWorld
service
account

Automotive
development
circle

About Us Customer Service Contact Information Datasheet Sitemap LatestNews


Room 1530, 15th Floor, Building B, No.18 Zhongguancun Street, Haidian District, Beijing, Postal Code: 100190 China Telephone: 008610 8235 0740

Copyright © 2005-2024 EEWORLD.com.cn, Inc. All rights reserved 京ICP证060456号 京ICP备10001474号-1 电信业务审批[2006]字第258号函 京公网安备 11010802033920号