4094 views|31 replies

3386

Posts

0

Resources
The OP
 

MicroPython Hands-on (08) - Learn Color Recognition with MicroPython from Scratch

 
 

I searched "color recognition" on Baidu this morning and got a general idea of it. I'll stick to my usual approach: work hands-on, run more experiments, and keep moving forward. I'd also appreciate your guidance.

OpenCV (Baidu Encyclopedia)
is a cross-platform computer vision library released under the BSD license (open source) that runs on Linux, Windows, Android, and Mac OS. It is lightweight and efficient: it consists of a series of C functions and a small number of C++ classes, provides interfaces for languages such as Python, Ruby, and MATLAB, and implements many common algorithms in image processing and computer vision. OpenCV is written in C++ and its main interface is in C++, but it still retains a large number of C interfaces. The library also has extensive Python, Java, and MATLAB/Octave (version 2.5) interfaces, whose API functions are documented online. Support for C#, Ch, Ruby, and Go is also available today.

OpenCV was founded by Intel in 1999 and is now supported by Willow Garage. It offers a cross-platform mid- and high-level API with more than 500 C functions and does not depend on external libraries, although some can be used. OpenCV provides a transparent interface to Intel Integrated Performance Primitives (IPP): if IPP libraries optimized for a specific processor are present, OpenCV automatically loads them at runtime. (Note: the code of OpenCV 2.0 has been significantly optimized and no longer needs IPP for performance, so version 2.0 no longer provides IPP interfaces.)




This content was originally created by EEWORLD forum user eagler8. Reprinting it or using it for commercial purposes requires the author's consent and an indication of the source.

Latest reply

"Support one, I didn't expect to be able to play this effect" (published 2020-4-8 08:02)
 
 

 

OpenCV Overview
Its full name is Open Source Computer Vision Library. In other words, it is an open-source API library for computer vision. This means:
(1) It can be used for both scientific research and commercial applications;
(2) The source code of all API functions is public, and you can see the program steps implemented internally;
(3) You can modify the source code of OpenCV and compile it to generate the specific API functions you need. However, as a library, it only provides some commonly used, classic, and popular algorithm APIs.

A typical computer vision algorithm should include the following steps:
(1) data acquisition (for OpenCV, it is images);
(2) preprocessing;
(3) feature extraction;
(4) feature selection;
(5) classifier design and training;
(6) classification discrimination;
and OpenCV provides APIs for each of these six stages (keep this pipeline in mind).
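The six steps above can be sketched as a chain of functions. Everything here is a hypothetical placeholder (a toy 2x2 "image", a one-number feature, a trivial threshold "classifier") invented purely to show how the stages feed into one another; none of it is an OpenCV API.

```python
def acquire():               # (1) data acquisition: a tiny 2x2 "image"
    return [[10, 200], [12, 198]]

def preprocess(img):         # (2) preprocessing: clamp values to 0..255
    return [[min(max(p, 0), 255) for p in row] for row in img]

def extract_features(img):   # (3) feature extraction: mean brightness
    pixels = [p for row in img for p in row]
    return {"mean": sum(pixels) / len(pixels)}

def select_features(feats):  # (4) feature selection: keep only what the classifier needs
    return feats["mean"]

def train(samples):          # (5) classifier "training": pick a threshold
    return sum(samples) / len(samples)

def classify(value, threshold):  # (6) classification discrimination
    return "bright" if value > threshold else "dark"

img = preprocess(acquire())
value = select_features(extract_features(img))
label = classify(value, train([50, 150]))   # learned threshold = 100
print(label)                                # -> bright (mean 105 > 100)
```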

 
 
 

 

Color recognition based on OpenCV

Color model
The models commonly used in digital image processing are the RGB (red, green, blue) model and the HSV (hue, saturation, value) model. RGB is widely used in color monitors and color video cameras, and everyday pictures are generally in the RGB model. The HSV model is closer to the way people describe and interpret colors: its color description is natural and very intuitive.

HSV model
The color parameters in the HSV model are hue (H), saturation (S), and value (V, brightness). The color space was created by A. R. Smith in 1978 and is also known as the hexcone model.

Hue (H): measured in degrees, ranging from 0° to 360°, starting from red and counting counterclockwise, red is 0°, green is 120°, and blue is 240°. Their complementary colors are: yellow is 60°, cyan is 180°, and magenta is 300°;
Saturation (S): ranges from 0.0 to 1.0, the larger the value, the more saturated the color.
Brightness (V): ranges from 0.0 (black) to 1.0 (white); in OpenCV it is scaled to 0 (black) through 255 (white).

Convert RGB to HSV
Let (r, g, b) be the red, green, and blue coordinates of a color, each a real number between 0 and 1. Let max be the maximum of r, g, and b, and min the minimum. We want the (h, s, v) value in HSV space, where h ∈ [0, 360) is the hue angle and s, v ∈ [0, 1] are the saturation and brightness. OpenCV has a function that converts the RGB model directly to the HSV model; in OpenCV, H ∈ [0, 180), S ∈ [0, 255], and V ∈ [0, 255].

The H component basically represents the color of an object, but S and V must also lie within a certain range: S measures how much the color given by H is mixed with white (the smaller S is, the whiter, i.e. lighter, the color), while V measures how much it is mixed with black (the smaller V is, the darker the color). For blue, H is roughly 100 to 140, and S and V roughly 90 to 255.
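The conversion described above can be sketched with the standard library's colorsys module; the scaling to OpenCV's ranges (the hue angle halved to fit 8 bits, S and V stretched to 0..255) follows the ranges quoted in the text. The function name is my own.

```python
import colorsys

def rgb_to_hsv_opencv(r, g, b):
    """Convert 8-bit RGB to OpenCV-style HSV: H in 0..179, S and V in 0..255.

    A sketch built on the stdlib colorsys module; OpenCV itself performs the
    same mapping internally when cvtColor() is given COLOR_RGB2HSV."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # colorsys returns h, s, v in 0..1; rescale to OpenCV's 8-bit ranges
    return round(h * 180) % 180, round(s * 255), round(v * 255)

print(rgb_to_hsv_opencv(255, 0, 0))   # pure red  -> (0, 255, 255)
print(rgb_to_hsv_opencv(0, 0, 255))   # pure blue -> (120, 255, 255), inside the 100..140 band
```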

 
 
 

 

OpenCV color recognition ideas
1. Create a slider: used to adjust the threshold and identify different colors.

2. Color space conversion: convert RGB to the HSV model so that different colors can be identified with different HSV thresholds. In OpenCV this is done with cvtColor(). Color images are generally in the RGB color space, but the HSV color space model is even more common in everyday life: it underlies TV remote controls, painting palettes, and brightness adjustment while watching TV, because it matches the way people describe colors - what color it is, how deep the color is, and how bright it is. Note that in OpenCV the H, S, and V value ranges are [0, 180), [0, 255], and [0, 255] respectively, rather than the model's nominal [0, 360), [0, 1], [0, 1].

3. Histogram equalization: because of lighting, each frame read from the camera may be too bright or too dark. Histogram equalization spreads the pixel values more evenly across the intervals, giving the image more contrast and detail; in OpenCV it is done with the equalizeHist() function. Histogram equalization stretches the original histogram so that it covers the entire grayscale range, enhancing the contrast of the image: the central idea is to turn a grayscale histogram concentrated in a narrow band into a uniform distribution over the whole grayscale range.
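The stretching idea can be sketched in pure Python on a flat list of grayscale values. This is only an illustration of the CDF-remapping behind equalizeHist(), not OpenCV's implementation.

```python
def equalize_hist(pixels, levels=256):
    """Histogram equalization for a flat list of grayscale values:
    remap each value so the cumulative distribution becomes (roughly) linear."""
    n = len(pixels)
    # build the histogram and its cumulative distribution function
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:                 # constant image: nothing to stretch
        return pixels[:]
    # classic equalization formula, scaled to the full 0..levels-1 range
    return [round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1)) for p in pixels]

# Values bunched in the middle get stretched toward the full 0..255 range:
print(equalize_hist([100, 100, 100, 100, 180, 180, 180, 180]))
# -> [0, 0, 0, 0, 255, 255, 255, 255]
```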

4. Binarization: set each pixel's grayscale value to either 0 or 255, making the whole image distinctly black and white. Binarizing a grayscale image highlights information within a chosen range: pixel values inside the set range [a, b] become 255, and values outside it become 0. Different choices of a and b produce very different binarization results.
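The in-range test above is a one-liner; this is a sketch of the idea (OpenCV's inRange() and threshold() do the equivalent work on whole images in C++).

```python
def binarize(pixels, lo, hi):
    """Map grayscale values inside [lo, hi] to 255 and everything else to 0."""
    return [255 if lo <= p <= hi else 0 for p in pixels]

row = [12, 80, 130, 200, 250]
print(binarize(row, 100, 220))   # -> [0, 0, 255, 255, 0]
```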

5. Opening operation: used to remove noise, i.e. interference, from the image; in OpenCV the structuring element is configured with the getStructuringElement() function. The opening operation is built from dilation and erosion. Dilation "grows" the highlighted parts of the image, so the result has a larger highlighted area than the original; erosion eats away at the highlighted area, so the result has a smaller highlighted area. Opening erodes the image first and then dilates it, eliminating small objects. Mathematically, a convolution kernel B is defined and applied to the target image; kernels of different shapes and sizes produce different effects.

6. Closing operation: after opening there may still be some disconnected regions; closing joins these unconnected areas and makes the image more complete. Closing is the opposite of opening: it dilates first and then erodes, and is used to fill small black holes. Its principle is the same as that of opening.
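The opening and closing operations of steps 5 and 6 can be sketched in pure Python on a small binary image. The 3x3 square kernel and the zero-padded borders are simplifying assumptions of this sketch; in OpenCV the same thing is done with morphologyEx() and a kernel from getStructuringElement().

```python
def erode(img):
    """3x3 erosion on a binary (0/255) image; out-of-bounds pixels count as 0."""
    h, w = len(img), len(img[0])
    def at(y, x):
        return img[y][x] if 0 <= y < h and 0 <= x < w else 0
    return [[255 if all(at(y + dy, x + dx) == 255
                        for dy in (-1, 0, 1) for dx in (-1, 0, 1)) else 0
             for x in range(w)] for y in range(h)]

def dilate(img):
    """3x3 dilation: a pixel turns white if any pixel in its neighbourhood is white."""
    h, w = len(img), len(img[0])
    def at(y, x):
        return img[y][x] if 0 <= y < h and 0 <= x < w else 0
    return [[255 if any(at(y + dy, x + dx) == 255
                        for dy in (-1, 0, 1) for dx in (-1, 0, 1)) else 0
             for x in range(w)] for y in range(h)]

def opening(img):   # erode then dilate: removes small white noise
    return dilate(erode(img))

def closing(img):   # dilate then erode: fills small black holes
    return erode(dilate(img))

# A 3x3 white block plus one isolated white noise pixel at (5, 5):
img = [[0] * 7 for _ in range(7)]
for y in range(1, 4):
    for x in range(1, 4):
        img[y][x] = 255
img[5][5] = 255
opened = opening(img)
print(opened[5][5], opened[2][2])   # -> 0 255: noise removed, block centre survives
```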

 
 
 

 

I happened to have a Rubik's Cube with five colors, so I used it as an experimental prop for color identification.

 
 
 

 

Open the MaixPy IDE and select Tools - Machine Vision - Threshold Editor

 
 
 

 

For the source image location, select the frame buffer

 
 
 

 

Adjust the LAB values, working mainly from the binary image pane: the white pixels are the ones being tracked

 
 
 

 

The LAB color model (Baidu Encyclopedia)
is based on an international standard for measuring color established by the Commission Internationale de l'Éclairage (CIE) in 1931, improved and named in 1976. The Lab color model makes up for the shortcomings of the RGB and CMYK color models: it is a device-independent color model based on human physiology. It consists of three components: L for lightness, plus the two color channels a and b. The a channel runs from deep green (low values) through gray (middle values) to bright pink-red (high values); the b channel runs from bright blue (low values) through gray to yellow (high values). Mixing these channels therefore produces colors with bright effects.

The Lab mode depends on neither light nor pigment; it is a color model defined by the CIE that theoretically includes all colors visible to the human eye, making up for the shortcomings of the RGB and CMYK modes. Compared with RGB, Lab is a less common color space. It was established on the basis of the international color-measurement standard formulated by the CIE in 1931, revised in 1976, and officially named CIELab. It is a device-independent color system based on physiological characteristics: it describes human visual perception numerically. The L component represents the lightness of a pixel, with a range of [0, 100] from pure black to pure white; a represents the green-to-red axis, with a range of [-128, 127]; b represents the blue-to-yellow axis, also with a range of [-128, 127].

The Lab color space covers a gamut larger than that of a computer display, and even than human vision; a Lab bitmap needs more pixel data than an RGB or CMYK bitmap to reach the same precision. The Lab mode defines the most colors, is independent of both light and device, and is processed about as fast as RGB - much faster than CMYK - so you can use it in image editing with confidence. Moreover, colors in Lab mode are not lost or replaced when converted to CMYK, so the best way to avoid color loss is to edit the image in Lab mode and then convert it to CMYK for printing.

 
 
 

 

Thoroughly understand Lab color space

Name
Before we begin, let's clarify the name of the Lab color space:

  • The full name of Lab is CIELAB, sometimes written as CIE L*a*b*
  • The CIE here stands for International Commission on Illumination, which is an international authoritative organization on lighting, color, etc.

The Lab channels
Lab is composed of one lightness channel and two color channels; each color is represented by three numbers, L*, a*, and b*. The meaning of each component is as follows:

  • L* represents brightness
  • a* represents the component from green to red
  • b* represents the component from blue to yellow

Perceptual uniformity
Lab is designed around human color perception; more specifically, it is perceptually uniform. Perceptual uniformity means that if the numbers (the L, a, and b values above) change by the same amount, the visual change they produce is also roughly the same. Compared with color spaces such as RGB and CMYK, Lab matches human vision better and is easier to adjust: to change the brightness (leaving aside the Helmholtz–Kohlrausch effect, see the note below), adjust the L channel; to change only the color balance, adjust a and b separately.

Note: The Helmholtz–Kohlrausch effect is an optical illusion of the human eye—colors appear brighter when their saturation is high.

Device independence
Lab has a very useful property: it is device-independent. Given the white point of the color space, the space unambiguously determines how each color is created and displayed, regardless of the display medium. Note that Lab defines colors relative to the white point: only after the white point is defined (for example, as CIE standard illuminant D50) are the other colors determined.

Numerical range
Theoretically, L*, a*, and b* are all real numbers, but in practice they are generally limited to integer ranges:

  • The larger L* is, the brighter the color: L* = 0 represents black and L* = 100 represents white.
  • When a* and b* are both 0, the color is gray.
  • As a* goes from negative to positive, the corresponding color changes from green to red.
  • As b* goes from negative to positive, the corresponding color changes from blue to yellow.
  • In practical applications, the color channels are often limited to -100..+100 or -128..127.

Visualization
L*a*b* has three components, so it can be presented in three-dimensional space. In two dimensions, a chromaticity diagram is often used: fix the lightness L* and look at how a* and b* vary. Note that such visualizations are not exact; they only help build intuition.

 
 
 

 

#MicroPython Hands-on (08) - Learn MicroPython from scratch to recognize colors
#Experimental program 1: find red blob Dynamically identify red blocks

Try to choose the best color-tracking threshold by adjusting the sliders. The red blocks are highlighted in the binary image (shown in white).

The measured LAB values are: 55, 70, 42, 65, 52, 8

 
 
 

 

#MicroPython Hands-on (08) - Learn MicroPython from scratch to recognize colors
#Experimental program 1: find red blob Dynamically identify red blocks


import sensor
import image
import lcd
import time

lcd.init(freq=15000000)
sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.run(1)

# LAB threshold tuple: (L min, L max, A min, A max, B min, B max)
red_threshold = (55, 70, 42, 65, 52, 8)

while True:
    img = sensor.snapshot()
    blobs = img.find_blobs([red_threshold])
    if blobs:
        for b in blobs:
            img.draw_rectangle(b[0:4])     # blob bounding box (x, y, w, h)
            img.draw_cross(b[5], b[6])     # cross at the blob center (cx, cy)
            c = img.get_pixel(b[5], b[6])  # color at the blob center
    lcd.display(img)

 
 
 

 

The Rubik's cube used for the experiment; it has five colors

 
 
 

 

After running, you can see the rectangle and the + mark, which indicates that the system has accurately identified the target.

 
 
 

 

Experimental scene in front of the window on a cloudy day

 
 
 

 

#MicroPython Hands-on (08) - Learn to recognize colors with MicroPython from scratch
#Experimental program 2: find green blob Dynamically identify green blocks

The measured green LAB threshold values are 0, 88, -42, -6, -9, 13

 
 
 

 
#MicroPython Hands-on (08) - Learn to recognize colors with MaixPy from scratch
#Experimental program 2: find green blob Dynamically identify green blocks

import sensor
import image
import lcd
import time

lcd.init(freq=15000000)
sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.run(1)

# LAB threshold tuple: (L min, L max, A min, A max, B min, B max)
green_threshold = (0, 88, -42, -6, -9, 13)

while True:
    img = sensor.snapshot()
    blobs = img.find_blobs([green_threshold])
    if blobs:
        for b in blobs:
            img.draw_rectangle(b[0:4])     # blob bounding box (x, y, w, h)
            img.draw_cross(b[5], b[6])     # cross at the blob center (cx, cy)
            c = img.get_pixel(b[5], b[6])  # color at the blob center
    lcd.display(img)

 
 
 

 

#MicroPython Hands-on (08) - Learn to recognize colors with MicroPython from scratch
#Experimental program 3: find orange blob Dynamically identify orange blocks

The measured orange LAB threshold values are 0, 80, 66, -20, 80, 50 (the range overlaps with red)

 
 
 
