Composition of the Vision System: What Are the Commonly Used Interfaces in Machine Vision?


Composition of the Vision System

From the definition of machine vision, one might infer that a vision system is simply a camera combined with a processor. That description is incomplete: a vision system does require a camera and a processor, but it has several more components, as shown in Figure 8.

[Figure 8]

The camera needs a lens to form an optical image. The lens must provide the right working distance (lens-to-scene distance), the correct magnification so that the scene fills the camera's field of view, and the ability to resolve detail. A light source is also needed to ensure that the camera gets enough appropriate lighting to create a usable and reliable image. There needs to be an interface between the camera and the processor. Software is also needed to perform the required application analysis. Finally, the processor must support the required inputs and outputs with other connected devices.

Types of Vision Systems

Typically, the camera and processor are separate units. This allows the camera to be small while the processor provides all the required computing power and input/output capability, and it allows one processor to serve two or more cameras. A common form of this type of vision system uses a personal computer as the processor (see Figure 9). PC-based vision systems offer the greatest flexibility: there are many camera interfaces and software packages to choose from, and considerable flexibility in input and output configuration. However, this flexibility increases the engineering effort the application requires. For more demanding environments, rugged industrial computers are available.

[Figure 9]

A similar configuration replaces the personal computer with a proprietary processor (see Figure 10). This processor comes with proprietary software. It can use standard or proprietary camera interfaces and is usually flexible in terms of input and output. This configuration is often designed for factory environments. The application is less difficult to engineer than a PC-based vision system.

[Figure 10]

There is also a trend toward merging the camera and processor into a single compact device, called a smart camera or smart sensor (see Figure 11). This has proven very effective for single-camera applications and IoT devices. Because of the compact size, processing power and input/output capability are limited. Although smart cameras sometimes come with built-in lighting, other light sources are usually still required. In most smart cameras the software is already built in, making these the easiest vision systems to apply. Some products are designed for OEMs who can take on more engineering work; these do not come with pre-installed software.

[Figure 11]

Another category is the application-specific machine vision (ASMV) system. Figure 12 shows an example. This is a vision system designed for a specific application. It requires little engineering work beyond installation and product-specific configuration before it is ready for use. The higher cost of an ASMV system can offset the advantage of requiring little application engineering. On the other hand, an ASMV system typically comes with more comprehensive field support than a vision system engineered for a single installation.

The last category is embedded vision systems. There is no completely consistent definition of what an embedded vision system is. In practice, we can think of it as a vision system designed to be integrated into a final product, with the camera and processor tightly coupled, as shown in Figure 13. It is intended mainly for OEMs to integrate into their products.

[Figure 13]

Camera

Let's look at the camera in more detail. Figure 14 shows the components of a camera.

The image sensor is the main component of a camera and gives it most of its important characteristics.

The electronics control the timing of the image sensor, adjust functions such as the image sensor's gain, and provide other features unique to the camera.

The lens mount holds the lens so that it projects the optical image onto the image sensor.

The interface connects the camera to the processor.

[Figure 14]

Image Sensor

An image sensor, shown in Figure 15, consists of a set of sensing elements, which are shown in an enlarged view, along with circuitry to read the signals from the elements and convert them into digital signals.

[Figure 15]

When exposed to light, the sensing element converts incident photons into electrical charges (see Figure 16).

[Figure 16]

During the exposure, these photogenerated electrons accumulate in the sensing element. When the sensing element is read out, its voltage is proportional to the number of photogenerated electrons. After the readout, the sensing element is reset, clearing out all the photogenerated electrons and making it ready for the next exposure. You might infer from this that the sensing element is essentially a photon counter. That is almost right, except that it is an imperfect counter: it does not count every photon. The percentage of photons it does count is called its quantum efficiency, or QE for short. Quantum efficiency varies with the wavelength of the light, as shown in Figure 17.
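
As a rough illustration of what QE means for signal level, the sketch below (with made-up numbers) estimates how many electrons a sensing element collects for a given photon count and quantum efficiency, and checks the result against an assumed full-well capacity.

```python
# Rough illustration of quantum efficiency (illustrative numbers only).
incident_photons = 20_000      # photons hitting one sensing element during the exposure
quantum_efficiency = 0.65      # fraction of photons converted to electrons (65% QE)
full_well_capacity = 15_000    # maximum electrons the element can hold (assumed value)

collected_electrons = incident_photons * quantum_efficiency

# The element saturates if the collected charge exceeds the full well.
saturated = collected_electrons > full_well_capacity

print(f"Collected electrons: {collected_electrons:.0f}, saturated: {saturated}")
```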

[Figure 17]

Color imaging

There are two main approaches to making image sensors for color cameras. The first uses three separate image sensors together with optics that split the incoming light into its red, green, and blue components so that each sensor sees only one of the colors (see Figure 19). Each pixel then has three values, one from each of the three image sensors.

[Figure 19]

The other approach places red, green, and blue filters over the individual sensing elements. A common arrangement of these filters is the Bayer pattern (see Figure 18). Each pixel still has three values: one sensed directly through its color filter, and the other two interpolated from neighboring pixels.

[Figure 18]
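
To make the interpolation step concrete, here is a minimal sketch that demosaics a raw Bayer image into a three-channel color image using OpenCV. The file name and the specific Bayer layout (BGGR here) are assumptions; the correct conversion code depends on the sensor's actual filter arrangement.

```python
import cv2

# Load a raw Bayer-pattern image as a single-channel array
# ("bayer_raw.png" is a hypothetical file name).
raw = cv2.imread("bayer_raw.png", cv2.IMREAD_GRAYSCALE)

# Demosaic: each output pixel gets one directly sensed color value and
# two values interpolated from its neighbors, as described above.
# COLOR_BayerBG2BGR assumes a BGGR layout; other layouts (RGGB, GRBG, GBRG)
# use the corresponding conversion codes.
color = cv2.cvtColor(raw, cv2.COLOR_BayerBG2BGR)

cv2.imwrite("demosaiced.png", color)
```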

High Dynamic Range

Noise levels in cameras typically limit data to 8 or 10 bits per pixel. This equates to 256 or 1,024 grayscale levels. For many applications, this is perfectly adequate. However, some applications require the ability to perceive detail in both the highlight (bright) and shadow (dark) areas of an image. A solution to this challenge is high dynamic range (HDR) imaging. This involves taking two (or more) photos with different exposure times. The upper bits of the pixel with the shorter exposure time will retain detail in the highlights, while the lower bits of the pixel with the longer exposure time will retain detail in the shadows. By combining the bits from each exposure to generate a new pixel value, a higher dynamic range is achieved, as shown in Figure 20.

[Figure 20]
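
As a simplified sketch of the idea (not a production HDR pipeline), the code below merges a short and a long exposure of the same scene: saturated pixels in the long exposure are replaced by rescaled values from the short exposure, extending the usable dynamic range. The exposure ratio and saturation threshold are assumed values.

```python
import numpy as np

def merge_two_exposures(short_exp, long_exp, exposure_ratio=16.0, saturation=4000):
    """Combine a short and a long exposure of the same scene (e.g. 12-bit data)
    into a single higher-dynamic-range image.

    exposure_ratio : how much longer the long exposure is than the short one.
    saturation     : pixel value above which the long exposure is clipped.
    (Both are illustrative values.)
    """
    short_f = short_exp.astype(np.float64)
    long_f = long_exp.astype(np.float64)

    # Shadows: trust the long exposure (better signal-to-noise ratio).
    # Highlights: the long exposure is saturated, so use the short exposure
    # rescaled to the long exposure's brightness scale.
    merged = np.where(long_f >= saturation, short_f * exposure_ratio, long_f)
    return merged
```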

Lens Installation

There are many different lens mounts used on machine vision cameras, but the most common is the C-mount. A lens mount is characterized by its thread (unless it is a bayonet mount) and its flange focal length, the distance from the lens mounting flange to the image sensor. For the C-mount, the thread is 1 inch in diameter with 32 threads per inch, and the flange focal length is 17.52 mm (0.69 in). Another common lens mount is the CS-mount. It has the same thread as the C-mount, but its flange focal length is 12.52 mm. The lens and the camera usually have the same mount; when they differ, an adapter can sometimes be used.
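
A practical consequence of those two flange focal lengths, shown as a small sketch below: a C-mount lens can be used on a CS-mount camera by adding a spacer that makes up the difference, while the reverse combination cannot reach focus. The arithmetic simply uses the values quoted above.

```python
# Flange focal lengths quoted above, in millimetres.
C_MOUNT_FFL = 17.52
CS_MOUNT_FFL = 12.52

# A C-mount lens on a CS-mount camera sits too close to the sensor,
# so a spacer (extension ring) of this thickness restores focus.
spacer_mm = C_MOUNT_FFL - CS_MOUNT_FFL
print(f"C-mount lens on a CS-mount camera needs a {spacer_mm:.2f} mm spacer")

# A CS-mount lens on a C-mount camera would need the sensor to sit
# closer than the mount allows, so that combination cannot reach focus.
```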

Camera interface

There are many different camera interfaces. The following six are commonly used in machine vision:

Camera Link

GigE Vision

USB3 Vision

CoaXPress

Camera Link HS

MIPI

Most of these standards have different options, some with many options. Therefore, a detailed description of each interface is beyond the scope of this article. These interfaces differ in the following important characteristics:

Bandwidth/Speed - How fast image data can be transferred from the camera to the processor (see the sketch after this list).

Latency - The time from the start of an image data transfer to its completion. The variation in latency is usually more important than its absolute value.

Data reliability - If data is corrupted during transmission, can the interface detect and correct it?

Cable Length - How long a cable can be used and still have adequate bandwidth and data reliability.

Cables/Connectors - What types of cables and connectors are used. Many standards provide different cable and connector options. Most interface standards also support the use of fiber-optic cables.

Frame grabber (frame acquisition card) - Whether the interface requires a special adapter card in the processor, called a frame grabber.

Power - Can the camera's power be supplied via the same cable as the data?
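
As a quick way to compare an application's needs against these interfaces' bandwidth (see the Bandwidth/Speed item above), the sketch below estimates the raw data rate a camera produces; the resolution, bit depth, and frame rate are illustrative values.

```python
# Estimate the raw data rate a camera produces (illustrative numbers).
width, height = 2048, 1536        # pixels
bits_per_pixel = 8                # e.g. Mono8
frames_per_second = 60

bytes_per_frame = width * height * bits_per_pixel / 8
data_rate_mb_s = bytes_per_frame * frames_per_second / 1e6

print(f"Required bandwidth: {data_rate_mb_s:.0f} MB/s")
# This figure is then compared with the usable bandwidth of the candidate
# interface; for example, standard GigE Vision over 1 Gbit/s Ethernet tops
# out at roughly 100-120 MB/s, so the camera above would need a faster link.
```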

Another important interface standard is GenICam. GenICam is not a physical standard; it has nothing to do with cables, data rates, and so on. It is a standard for the camera to describe its own characteristics to the processor it is connected to. Under GenICam, the camera contains an XML file that the processor can read. This file describes the camera and all of its settings in detail. The software in the processor queries this file to learn everything it needs to interface with and control the camera. Without such a standard, even different cameras from the same manufacturer might have to be controlled differently. GenICam allows vision system developers to program their applications without having to care about which camera is connected.
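
As one way to see this in practice, here is a minimal sketch using the open-source Harvesters Python library, which talks to cameras through GenICam/GenTL. The .cti producer path is a placeholder, and exact method names can vary between library versions; the point is that features such as ExposureTime are addressed by their standard names, regardless of which camera is attached.

```python
from harvesters.core import Harvester

h = Harvester()
# The GenTL "producer" (.cti file) is supplied by the camera or frame-grabber
# vendor; the path below is a placeholder.
h.add_file("/path/to/producer.cti")
h.update()

# Connect to the first camera that was discovered.
ia = h.create_image_acquirer(0)

# These feature names come from the camera's GenICam XML description,
# so the same code works with any compliant camera that exposes them.
ia.remote_device.node_map.ExposureTime.value = 10000.0   # microseconds
ia.remote_device.node_map.PixelFormat.value = "Mono8"

ia.start_acquisition()
with ia.fetch_buffer() as buffer:
    component = buffer.payload.components[0]
    print(component.width, component.height)
ia.stop_acquisition()

ia.destroy()
h.reset()
```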

Lenses

The lens is a critical component of a vision system. Together with the camera's image sensor, it determines the working distance and the field of view that encompasses the scene. The lens and its aperture also determine how much light energy reaches the image sensor, the depth of field (the range of distances that remain in focus), and how much detail is resolved in the optical image projected onto the image sensor. Most experienced machine vision engineers have learned how to select lenses for typical applications; those new to machine vision, or facing a very complex application, may need the help of an experienced optical engineer. There are three basic types of lenses: normal or endocentric lenses (Figure 22), macro lenses (Figure 23), and telecentric lenses (Figure 25). Most applications can be solved with endocentric lenses. These lenses can be focused over a range of working distances, extending to infinity, and usually have an adjustable aperture.
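
To illustrate how the sensor size, working distance, and field of view tie together when selecting a lens, here is a back-of-the-envelope sketch using the thin-lens model. The sensor width, field of view, and working distance are made-up values, and the working distance is treated as the lens-to-object distance.

```python
# Thin-lens estimate of the focal length needed for a given setup
# (illustrative numbers only).
sensor_width_mm = 11.3       # active width of the image sensor
field_of_view_mm = 200.0     # width of the scene to be imaged
working_distance_mm = 400.0  # lens-to-object distance

# Magnification required so the scene just fills the sensor.
magnification = sensor_width_mm / field_of_view_mm

# Thin-lens relation: f = working_distance * m / (1 + m).
focal_length_mm = working_distance_mm * magnification / (1 + magnification)

print(f"Magnification: {magnification:.3f}")
print(f"Approximate focal length: {focal_length_mm:.1f} mm")
```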

