Design and implementation of document image information acquisition system

Publisher: LuckyDaisy  Latest update time: 2011-03-14

With the continuing development of information technology and the steady improvement of social rules and regulations, people rely on a variety of certificates to represent their personal information. Collecting and identifying the information on these certificates is a necessary means of obtaining personal information and maintaining security, and it helps the relevant departments supervise and serve the public. A complete certificate information collection system consists of the collection equipment and its supporting software. The equipment connects to a PC through an interface; the supporting software running on the PC extracts and enters the certificate information automatically, and an information management system manages and maintains the collected data. Such systems can be widely used in public security, banking, telecommunications, hotels, transportation, securities, civil affairs, education, entry and exit administration, and other departments and industries, where they save considerable manpower and material resources and help keep public places safe. The automation technologies commonly used for certificate information collection are optical character recognition (OCR) and radio frequency identification (RFID). At present, only the second-generation resident ID card in China uses RFID; other certificates, such as passports and driver's licenses, do not. Optical character recognition, or more generally image acquisition and recognition, is therefore an indispensable part of a general-purpose certificate information collection system.
A good document image information acquisition system (hereinafter the document system) should offer a short acquisition time, high information acquisition accuracy, high imaging quality, a long service life, and a user-friendly system and appearance design. With these factors in mind, this paper uses a macro wide-angle lens to capture document images. The captured images are preprocessed in the acquisition hardware, the data is transmitted to the host PC over USB 2.0, the most widely used high-speed peripheral interface, and the processing software on the host extracts and classifies the image information. This development model combines a data acquisition device with PC-based application software. It uses the computing power of an ordinary PC to run computationally complex image processing algorithms quickly, which makes complex information processing algorithms practical and satisfies the response speed and real-time requirements of many application fields. The document system comprises hardware and software. The lens is a macro wide-angle lens with a focal length of 1.8 mm and a viewing angle of more than 140°, so a document can be fully framed about 45 mm in front of the lens, which allows the device to be miniaturized. The scalability and reusability of the document system lie mainly in its ability to recognize more types of documents (such as identity cards, passports, and household registration books) and in its general-purpose image data processing algorithms. To address scalability and reusability, this paper adopts the MVC (Model-View-Controller) architecture [1] for the software part of the document system and achieves loose coupling between the software modules.
1 Overall system framework
The document system is an independent data collection and information acquisition system made up of its own hardware and software. The hardware comprises a camera module, a data transmission module, an LED light module, and a power supply module. If the system is extended to read the RFID chip of the second-generation ID card by radio frequency sensing, an RFID sensing module is added as well. Because use of the ID card RFID module must be certified by the public security department, the system described in this paper does not implement the RFID module, but the corresponding interface was reserved in the design. The software consists mainly of the following subroutines running on the PC: the camera driver, image acquisition, image processing (distortion compensation, color enhancement, segmentation, and so on), information extraction, and database storage and management. The overall system framework is shown in Figure 1.

The document image information acquisition device communicates with the PC through the USB2.0 interface. The application software running on the PC host can send control commands to the USB camera through this interface to capture the corresponding image or video data. If there is an RFID sensing module, the sensed data is also transmitted to the application through this interface for processing.
2 System Hardware
The document image information acquisition device is mainly composed of four parts: image acquisition module, transmission interface module, power module, and LED light source module (excluding RFID sensing module). The hardware block diagram is shown in Figure 2.

2.1 CMOS image acquisition module
A CMOS camera is used for image acquisition because it delivers clear images at low power consumption. A short-focal-length wide-angle lens images the object; it widens the field of view while shortening the distance between the object and the CMOS sensor, which reduces the size of the instrument, makes the product more portable, and even allows it to be installed in the chassis of an ordinary PC. The system implemented in this paper uses a macro wide-angle lens with a focal length of 1.8 mm and a viewing angle greater than 140°, so the document can be fully framed about 45 mm in front of the lens. Including the height of the lens, the total thickness is about 60 mm. A wide-angle lens introduces barrel distortion, which the host program compensates for in software to restore the image as closely as possible.
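As a rough plausibility check on the figures above, the following sketch computes the viewing angle that an ideal, distortion-free lens would need in order to frame a flat document corner to corner from a given working distance. The function name RequiredViewAngleDeg and the 85.6 mm × 54.0 mm card size (the common ID-1 card format) are assumptions made for illustration and do not come from the paper; the real lens has barrel distortion, so the result is only an approximation.

```cpp
#include <cmath>

// Full viewing angle, in degrees, that an ideal distortion-free lens would need
// to frame a flat rectangle corner to corner from the given working distance.
// All lengths are in millimetres. Hypothetical helper, not part of the paper.
inline double RequiredViewAngleDeg(double widthMm, double heightMm, double distanceMm) {
    const double kPi = 3.14159265358979323846;
    // Half of the rectangle's diagonal is the farthest point from the optical axis.
    const double halfDiagonal = 0.5 * std::sqrt(widthMm * widthMm + heightMm * heightMm);
    // Angle between the axis and that point, doubled and converted to degrees.
    return 2.0 * std::atan(halfDiagonal / distanceMm) * 180.0 / kPi;
}
```

Under these assumptions, RequiredViewAngleDeg(85.6, 54.0, 45.0) gives roughly 97°, which lies well inside the stated viewing angle of more than 140° and leaves margin for larger documents.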
2.2 Transmission interface module
The system uses a USB 2.0 interface with a signaling rate of 480 Mb/s for high-speed transmission. Once data acquisition is complete, the acquisition module can quickly transfer the data to the host computer through this interface. Because the system is eventually to be extended with an RFID sensing module that reads second-generation ID cards, the same USB interface must also carry data from the RFID module. This only requires adding a USB hub chip so that both devices share the interface. A common USB hub chip such as the GL850A [2] can combine the two signal paths and meets the system requirements.
2.3 LED light source module
Because document images are captured in an environment similar to a darkroom, an illumination source is required. LEDs are available in a wide range of color temperatures, and color temperature directly affects image quality. The design therefore uses LEDs with relatively weak directionality and an emission color close to white, arranged in two LED light source groups. In actual shooting, this design ensures that the final image shows neither artificial bright spots caused by the directionality of the light source nor excessive color distortion caused by a color temperature that deviates too far from the normal range.
2.4 Power supply module
An AC-DC converter converts the 220 V mains supply to the 12 V required by the LED light source groups. The CMOS image sensor module is powered directly through the USB cable and operates at 5 V.
3 System software
3.1 System software composition

Because most of the image data processing is done by the supporting software running on a PC, the main task of the software design is to build a system that is efficient, easy to extend, and suited to Windows. The software is divided into an image acquisition module, an image processing module, and an information extraction module, which together provide image acquisition, processing, information extraction, display of the results, and editing. Because the whole system runs on the Windows operating system, the image acquisition module is developed with DirectShow [3], part of DirectX, to acquire image and video data efficiently on Windows. The image distortion compensation and enhancement algorithms use OpenCV [4], the efficient and reliable open source image processing library from Intel. Image information extraction uses the MODI (Microsoft Office Document Imaging) recognition control [5] included with Office 2003, whose Chinese character recognition engine is Tsinghua Unigroup's OCR engine. The graphical user interface is built with Microsoft's MFC [6]. Figure 3 shows these relationships.

DirectX is based on Component Object Model (COM) technology, the OpenCV library is developed mainly in C/C++, and components such as MODI and MFC also expose C++ interfaces. For these reasons, the software system is developed in C++ with Visual Studio 6.0. A C++ compiler generates programs with high execution efficiency, and an object-oriented design provides good data encapsulation.
3.2 System scalability design
The MVC software architecture maximizes loose coupling between modules and ensures the scalability and reusability of the system [1]. Figure 4 shows the main UML class diagram of the system. The design extends the document system by extending the model, the view, and the controller.

(1) Model part
The model part comprises the data-related operations: image acquisition, image processing, and text recognition. In the design, the operations on image data are encapsulated in the CDibImage class, which describes all data processing related to document images. Based on an analysis of the characteristics of document images in systems of this kind, a processing flow is proposed and implemented with the OpenCV image library. It consists of automatic white balance color compensation, barrel distortion compensation based on spline function modeling [7], tilt correction using a tilt angle detected from contour information and the Hough transform, coarse and fine segmentation guided by prior knowledge of the image layout, and text recognition. The flow is general: it correctly segments each information block of a document image and is intended to apply to document images of all kinds, so it can accommodate new document types in the future. The model can be extended in two ways: by enlarging its function set or by widening the range of objects it is applied to. An example of the former is adding an image gamma correction algorithm, which is done by adding new data processing functions or by encapsulating them in dedicated classes. An example of the latter is using the existing model to process a new type of certificate, which only requires redefining the layout of the information fields on that certificate and then following the flow above.
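The paper does not say which white balance algorithm the model uses. As one possible illustration of the first step in the flow, the sketch below applies the well-known gray-world method to an interleaved 8-bit RGB buffer. The choice of gray-world, the function name GrayWorldWhiteBalance, and the buffer layout are assumptions for this example; it deliberately avoids OpenCV so that it stays self-contained.

```cpp
#include <cstddef>
#include <vector>

// Gray-world white balance on an interleaved 8-bit RGB buffer (R,G,B,R,G,B,...).
// Assumption: gray-world is used for illustration only; the paper does not name
// its algorithm. Each channel is scaled so that its mean matches the overall mean.
inline void GrayWorldWhiteBalance(std::vector<unsigned char>& rgb) {
    const std::size_t pixels = rgb.size() / 3;
    if (pixels == 0) return;

    // Accumulate the per-channel sums.
    double sum[3] = {0.0, 0.0, 0.0};
    for (std::size_t i = 0; i < pixels; ++i)
        for (int c = 0; c < 3; ++c)
            sum[c] += rgb[3 * i + c];

    // Target level: the average of the three channel means.
    const double gray = (sum[0] + sum[1] + sum[2]) / (3.0 * pixels);

    for (int c = 0; c < 3; ++c) {
        const double mean = sum[c] / pixels;
        if (mean <= 0.0) continue;  // an all-zero channel cannot be rescaled
        const double gain = gray / mean;
        for (std::size_t i = 0; i < pixels; ++i) {
            const double v = rgb[3 * i + c] * gain + 0.5;  // round to nearest
            rgb[3 * i + c] = static_cast<unsigned char>(v > 255.0 ? 255.0 : v);
        }
    }
}
```

In a design like the one described, a step of this kind would sit behind a member function of the image class, so that adding a further correction such as gamma adjustment means adding one more function of the same shape.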
(2) Controller part
To ensure logical consistency, a system designed with MVC usually has only a single controller [9]. That controller is called from many places in the system, so the system designs it with the singleton pattern [8]. Applying the singleton pattern to the MVC controller makes development safer: it removes the risks of giving the controller global scope (which would make it equivalent to a global variable) and makes it easier to scale the software. Typically, the singleton class itself is responsible for holding its only instance and provides a global access point through a static member function [9].
A controller class Controller is defined with a static member function GetInstance that provides the global access point. The class also defines a static member variable, static Controller* singleton, that holds a pointer to its only instance. Clients access the instance only through Controller::GetInstance. The pointer singleton is initialized to 0; when GetInstance is called and the pointer is still 0, it creates the only instance and stores its address, and in every case it returns the pointer. This is lazy initialization: the instance is not created and saved until the first access. The constructor of Controller is private, so code that tries to instantiate Controller directly fails to compile, which ensures that only one instance can be created.
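The description above can be sketched as follows. Only the names Controller, GetInstance, and singleton come from the text; HandleCommand, LastCommand, and the two data members are hypothetical additions that give the controller some observable state. The sketch is not thread-safe, which the paper does not discuss.

```cpp
#include <string>

// Minimal sketch of the singleton controller described in the text.
class Controller {
public:
    // Global access point: creates the instance on first call (lazy initialization).
    static Controller* GetInstance() {
        if (singleton == nullptr) {  // nullptr corresponds to the value 0 in the text
            singleton = new Controller();
        }
        return singleton;
    }

    // Hypothetical operation: remember the command and count the calls.
    int HandleCommand(const std::string& command) {
        lastCommand = command;
        return ++commandCount;
    }

    const std::string& LastCommand() const { return lastCommand; }

    // Copying would create a second instance, so it is disabled.
    Controller(const Controller&) = delete;
    Controller& operator=(const Controller&) = delete;

private:
    Controller() : commandCount(0) {}  // private: direct instantiation will not compile

    static Controller* singleton;  // pointer to the only instance
    int commandCount;              // hypothetical state
    std::string lastCommand;       // hypothetical state
};

// The static pointer starts out empty; the instance is created on first access.
Controller* Controller::singleton = nullptr;
```

Any two calls to Controller::GetInstance() return the same pointer, so state changed through one caller is visible to every other caller, while a statement such as Controller c; is rejected by the compiler.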
(3) View part
The view part of the software is the graphical user interface that interacts with the user. The design uses the CTabCtrl tab control as a container for the display pages of the various documents. A class CTabSheet is designed for this purpose; its UML class attributes are shown in Figure 5.

With the CTabSheet class, the main interface only needs to maintain a single object of this class, CTabSheet m_tabSheetCard, to manage and control every page. For example, calling CTabSheet::AddPage() adds a page, and calling CTabSheet::SetCurSel() switches to a newly selected page. All certificate sub-dialog classes in the system inherit from the CCertificate class, as shown in Figure 6, and CCertificate in turn inherits from the MFC class CDialog. To add a new passport page, the MFC wizard can be used to generate the corresponding CPassport class.
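The extension mechanism can be illustrated without MFC. The sketch below is a plain C++ stand-in, not the paper's code: Certificate, IdCardPage, PassportPage, and TabSheet are hypothetical analogues of CCertificate, the certificate sub-dialogs, and CTabSheet, and the signatures of AddPage and SetCurSel are assumed, since the paper does not list them.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for CCertificate: the common base of every certificate page.
class Certificate {
public:
    virtual ~Certificate() = default;
    virtual std::string Title() const = 0;
};

class IdCardPage : public Certificate {
public:
    std::string Title() const override { return "ID card"; }
};

// Supporting a new certificate type only requires a new derived class.
class PassportPage : public Certificate {
public:
    std::string Title() const override { return "Passport"; }
};

// Stand-in for CTabSheet: owns the pages and tracks the selected one.
class TabSheet {
public:
    // Takes ownership of the page and returns its index.
    std::size_t AddPage(std::unique_ptr<Certificate> page) {
        pages_.push_back(std::move(page));
        return pages_.size() - 1;
    }

    // Selects a page; returns false if the index is out of range.
    bool SetCurSel(std::size_t index) {
        if (index >= pages_.size()) return false;
        current_ = index;
        return true;
    }

    // Title of the selected page, or an empty string if there are no pages.
    std::string CurrentTitle() const {
        return pages_.empty() ? std::string() : pages_[current_]->Title();
    }

    std::size_t PageCount() const { return pages_.size(); }

private:
    std::vector<std::unique_ptr<Certificate>> pages_;
    std::size_t current_ = 0;
};
```

Because the container only depends on the base class, adding the passport page does not change TabSheet or the existing ID card page, which is the loose coupling the view design aims for.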

4 Experiments
In the experiment, the hardware device uses the wide-angle lens described above, and the PC is a Lenovo Qitian M6900 with a Pentium dual-core CPU. The system first applies spline function distortion compensation to the captured image; the corrected ID card image is shown in Figure 7. The processed ID card image is then recognized, and the extracted ID card information is shown in Figure 8. Across the whole acquisition and recognition process, the document system designed here effectively performs image input, capture, processing, and ID card information recognition.

This paper proposes a document information collection system and describes its hardware and software. The system uses a macro wide-angle lens with a focal length of 1.8 mm and a viewing angle greater than 140°, so the document can be fully framed about 45 mm in front of the lens. The document image information collection software is designed with the MVC architecture, which achieves loose coupling between modules and ensures the scalability and reusability of the software. The experiment shows that the document system meets practical document information collection requirements and scales well, providing a good software platform for adding new document types and newer image processing algorithms in the future. The system is also an effective functional complement to RFID-based document recognition systems.
References
[1] Sun Weiqin, Li Hongcheng. Detailed Explanation of Tomcat and Java Web Development Technology [M]. Beijing: Electronic Industry Press, 2006.
[2] Genesys Logic, Inc. GL850A USB-2.0 Low Power Hub Controller DataSheet [M]. 2007.
[3] PESCE M D. Programming Microsoft DirectShow for Digital Video and Television [M]. Washington: Microsoft Press, 2003.
[4] The open computer vision library [EB/OL]. http://sourceforge.net/projects/opencvlibrary/.
[5] OCR Images Using Microsoft Office2003 SDK [EB/OL]. http://www.print-driver.com/sdk/postprint/ocr_office2003_vc6.html.
[6] Microsoft Corporation. MFC Library Reference: CTabCtrl Class, Microsoft Visual Studio 2005 Documentation [M].
[7] Wang Zhanbin, Zhao Hui, Tao Wei, et al. Spline function correction method for barrel distortion of wide-angle lens [J]. Optoelectronic Engineering, 2008, 35(4): 140-144.
[8] GAMMA E. Design Patterns: The Basis of Reusable Object-Oriented Programs [M]. Translated by Li Yingjun. Beijing: Machinery Industry Press, 2000.
[9] Lu Qiming. Selected DirectShow Practices [M]. Beijing: Science Press, 2004.
