Smart car based on ESP32 road sign recognition
Title of Work
Smart car based on ESP32 road sign recognition
Author: Wang Zhiyuan
1. Introduction
This smart car uses an ESP32 as the master controller to drive a Mecanum wheel chassis, and an OpenMV as the slave controller to detect the road direction and recognize traffic signs. The OpenMV sends the road direction and the recognized traffic sign to the ESP32. The ESP32 then steers the car along the track and chooses the direction of travel according to the traffic signs on the road, so that the car drives itself.
2. System Block Diagram
3. Functional description of each part
OpenMV slave controller:
The OpenMV-H7 is a low-power machine vision board built around an STM32H7 MCU and programmable in Python 3 (MicroPython). Paired with a camera sensor, it supports a wide range of image processing functions and neural networks. It is programmed through a cross-platform IDE that can view the camera's frame buffer, access the sensor controls, and upload scripts to the camera over USB serial (or WiFi/BLE if available). The OpenMV-H7 baseboard is based on the STM32H743 MCU running at 400MHz, with 1MB SRAM, 2MB Flash, an FPU, DSP instructions, and a hardware JPEG encoder. The baseboard uses a modular design that keeps the image sensor on a separate module, so the camera can support multiple sensors, including the OV7725, MT9V03x global shutter sensors, and FLIR Lepton 1, 2, and 3 thermal sensors.
The STM32H743 is the MCU of the OpenMV-H7. It is a 32-bit chip with a Cortex-M7 core that includes a double-precision floating-point unit (FPU), runs at up to 400MHz, and has 1MB of built-in RAM and 2MB of Flash. Figure 2 shows the architecture of the STM32H743 chip. Physical picture of the OpenMV-H7 chip:
Figure 1 OpenMV-H7 chip
ESP32 master control:
The ESP32 integrates classic Bluetooth, Bluetooth Low Energy (BLE), and Wi-Fi, which gives it a wide range of uses: Wi-Fi covers a large communication range and can connect directly to the Internet through a router, while Bluetooth lets users connect to a mobile phone or broadcast BLE beacons for signal detection. The ESP32 chip contains two hardware timer groups, each with two general-purpose hardware timers. All of them are 64-bit general-purpose timers based on 16-bit prescalers and 64-bit up/down counters with automatic reload. Physical picture of ESP32:
Figure 2 ESP32 master control
OV7725 Camera:
The OV7725 is a 0.3-megapixel (300,000-pixel) camera module that outputs at up to VGA resolution and at a maximum frame rate of 60fps. It can be configured to output video streams in RAW RGB, RGB (GRB422, RGB565/RGB444), and YUV422 formats. In plain terms: the OV7725 has only 0.3 megapixels, but although its resolution is low, its frame rate is high, so it works well in applications such as color tracking.
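The pixel count and the memory cost of one frame can be checked with a little arithmetic. The sketch below is only illustrative: the helper name `frame_bytes` is made up for this post, and the byte sizes assume 2 bytes per pixel for RGB565 and 1 byte per pixel for grayscale.

```python
def frame_bytes(width, height, bytes_per_pixel):
    """Size in bytes of one uncompressed frame."""
    return width * height * bytes_per_pixel


vga_pixels = 640 * 480                 # VGA pixel count, about 0.3 megapixels
vga_rgb565 = frame_bytes(640, 480, 2)  # one VGA frame in RGB565
qqvga_gray = frame_bytes(160, 120, 1)  # one QQVGA grayscale frame

print(vga_pixels)   # 307200
print(vga_rgb565)   # 614400
print(qqvga_gray)   # 19200
```

A QQVGA grayscale frame is 32 times smaller than a VGA RGB565 frame, which is one reason the line-tracking script in section 4 works at QQVGA in grayscale.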
L298N Motor Driver:
The L298N contains a 4-channel logic drive circuit made up of two H-bridges. It is a high-voltage, high-current dual full-bridge driver that can drive two DC motors. Its enable ports A and B are connected to PWM channels configured in the ESP32 code, and the PWM duty cycle sets the running speed of each motor. The four logic input pins on each of the Mecanum wheel car's two motor drivers are connected to microcontroller pins to control the forward and reverse rotation of the wheels. Actual picture of the motor drive:
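The direction and speed logic described above can be sketched in plain Python. This is a minimal sketch, not the author's `car` module: the function name `l298n_command`, the 10-bit duty range (0 to 1023), and the choice of which input is high for "forward" are all assumptions for illustration, and the real polarity depends on the wiring.

```python
def l298n_command(speed, max_duty=1023):
    """Map a signed speed to (in1, in2, duty) for one H-bridge channel.

    speed > 0  -> forward (in1 high, in2 low)
    speed < 0  -> reverse (in1 low, in2 high)
    speed == 0 -> stop (both inputs low, PWM duty 0)
    """
    duty = min(abs(int(speed)), max_duty)  # clamp the PWM duty to the valid range
    if speed > 0:
        return 1, 0, duty
    if speed < 0:
        return 0, 1, duty
    return 0, 0, 0


print(l298n_command(300))    # (1, 0, 300)
print(l298n_command(-2000))  # (0, 1, 1023)
print(l298n_command(0))      # (0, 0, 0)
```

On the real board, the first two values would be written to the two logic input pins of a channel and the third to the PWM channel on its enable port.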
Figure 3 Motor drive actual picture
Mecanum Wheel Car:
A Mecanum wheel is more complex than an ordinary tire. Its rim carries many spindle-shaped rollers mounted obliquely, at 45° to the wheel axis, and each roller is thicker in the middle than at its two ends, so the forces to the left and right are evenly distributed while driving. This roller arrangement is what lets a Mecanum wheel car change its driving direction at will: by combining friction forces of different magnitudes and directions on the four wheels, the car can even move sideways. The following is a force analysis diagram of the car when it moves:
Figure 4 Force analysis diagram of the Mecanum wheel car
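How the four wheel speeds combine into forward, sideways, and rotating motion can be illustrated with the commonly used inverse kinematics of a four-wheel Mecanum chassis. This is a sketch under stated assumptions, not the code in the author's `car` module: the function name, the chassis dimensions (`half_length`, `half_width`), and the sign convention (x forward, y to the left, counter-clockwise rotation positive, rollers in the usual X layout seen from above) are all assumed here.

```python
def mecanum_wheel_speeds(vx, vy, omega, half_length=0.1, half_width=0.1):
    """Return rim speeds (front_left, front_right, rear_left, rear_right).

    vx:    forward speed of the car body
    vy:    leftward speed of the car body
    omega: counter-clockwise rotation rate in rad/s
    """
    k = half_length + half_width  # lever arm of the rotation term
    front_left = vx - vy - k * omega
    front_right = vx + vy + k * omega
    rear_left = vx + vy - k * omega
    rear_right = vx - vy + k * omega
    return front_left, front_right, rear_left, rear_right


print(mecanum_wheel_speeds(1.0, 0.0, 0.0))  # (1.0, 1.0, 1.0, 1.0)   straight ahead
print(mecanum_wheel_speeds(0.0, 1.0, 0.0))  # (-1.0, 1.0, 1.0, -1.0) sideways to the left
```

Driving straight turns all four wheels the same way, while pure sideways motion turns diagonal wheel pairs in opposite directions.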
Circuit principle:
Because the separate modules were large and hard to fix to the car, and the many jumper wires were messy and worked loose from time to time, I decided to buy an ESP32 module and replace the wiring with a PCB, which simplifies the car's main board. The circuit analysis diagram and physical diagram of the car are shown in the following figure:
Tracking principle:
The classes recognized by OpenMV are red light, green light, go straight, U-turn, stop, turn left, turn right, and background. While the car is driving, OpenMV binarizes the image, fits a line to the black road by linear regression, and from that line determines the heading angle deviation and the lateral distance deviation between the car and the road. If a traffic sign is recognized at the same time, the result is sent to the ESP32 over UART. The ESP32 controls the car according to the recognition result: it stops at a red light, moves forward at a green light, and otherwise performs the maneuver that the sign indicates. If there is no traffic sign, the car keeps tracking the road ahead. The traffic signs and roads are shown in the figure below:
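The binarize-then-regress step can be sketched in plain Python. This is a stand-in for illustration only, not the OpenMV firmware routine or the author's script: the function name `line_deviation`, the threshold of 25, and the sign conventions (positive distance means the line is to the right of the image center, positive angle means the line leans right toward the bottom of the image) are assumptions.

```python
import math


def line_deviation(gray, threshold=25):
    """Fit x = a*y + b to the dark pixels of a grayscale image.

    gray is a list of rows of 0..255 values. Returns (distance_px, angle_deg),
    or None when there are too few dark pixels to fit a line.
    """
    # Binarize: keep the coordinates of pixels dark enough to be road
    pts = [(x, y) for y, row in enumerate(gray)
           for x, v in enumerate(row) if v <= threshold]
    n = len(pts)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    syy = sum((y - mean_y) ** 2 for _, y in pts)
    if syy == 0:
        return None  # all dark pixels lie on one row, slope is undefined
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / syy  # least-squares dx/dy, since the road runs roughly vertically
    height, width = len(gray), len(gray[0])
    # x position of the fitted line at the middle row, relative to the image center
    x_mid = mean_x + slope * ((height - 1) / 2 - mean_y)
    distance = x_mid - (width - 1) / 2
    angle = math.degrees(math.atan(slope))
    return distance, angle


# A 5x5 test image with a vertical black line one pixel right of center
image = [[255] * 5 for _ in range(5)]
for row in image:
    row[3] = 0
print(line_deviation(image))
```

Regressing x on y rather than y on x avoids an infinite slope when the road points straight ahead.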
4. Source Code
ESP32 master control main program:
from machine import Pin, UART, Timer
import time
import bluetooth
from car import CAR
from neopixel import NeoPixel

# RGB colour tuples
RED = (25, 0, 0)
GREEN = (0, 25, 0)
LEFT = (0, 25, 25)
RIGHT = (25, 0, 25)
BLACK = (0, 0, 0)

num = 8
pin = Pin(5, Pin.OUT)
np = NeoPixel(pin, 25)  # 25 pixels driven from GPIO5
buf = bytes(0)
BLE_MSG = ""
# UART1 at 115200 baud: TX on GPIO17, RX on GPIO16
uart = UART(1, baudrate=115200, tx=17, rx=16)

if __name__ == "__main__":
    speed = 200
    Car = CAR()
    Car.setspeed(speed)
    while True:
        if uart.any():
            buf = uart.readline()
            if buf is None:
                continue
            print('Echo Byte: {}'.format(buf))
            # Each line is expected to hold "<distance> <angle>"
            try:
                x = buf.decode("utf-8").strip().split(" ", 1)
                distance = float(x[0])
                angle = float(x[1])
            except (ValueError, IndexError):
                continue  # skip incomplete or malformed lines
            Car.setspeed(300)
            Car.diserror(distance, angle)
            Car.angleerror(distance, angle)
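The line parsing in the main loop can be pulled out into a small function that is easy to test on a PC. This is a sketch, assuming the frame format `<distance> <angle>` followed by a newline that the loop above expects; the function name `parse_frame` is hypothetical and is not part of the author's code.

```python
def parse_frame(raw):
    """Parse a UART line such as b'12.5 -3\\n' into (distance, angle) floats.

    Returns None when the line is not exactly two numeric fields.
    """
    try:
        fields = raw.decode("utf-8").split()
        if len(fields) != 2:
            return None
        return float(fields[0]), float(fields[1])
    except ValueError:  # also covers UnicodeDecodeError
        return None


print(parse_frame(b"12.5 -3\n"))  # (12.5, -3.0)
print(parse_frame(b"42\n"))       # None
print(parse_frame(b"a b\n"))      # None
```

Returning None for a bad line lets the control loop simply skip it and keep the previous motor command.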
OpenMV slave main program:
import sensor, image, time, math
from pyb import UART, LED

uart = UART(3, 115200)

# Grayscale range treated as the black road
GRAYSCALE_THRESHOLD = [(0, 25)]
# Each ROI is (x, y, w, h, weight); the ROI at the bottom of the
# frame has the largest weight
ROIS = [(0, 100, 160, 20, 0.7),
        (0, 50, 160, 20, 0.3),
        (0, 0, 160, 20, 0.1)]
weight_sum = 0
for r in ROIS:
    weight_sum += r[4]

sensor.reset()
sensor.set_pixformat(sensor.GRAYSCALE)
sensor.set_framesize(sensor.QQVGA)  # 160x120
sensor.skip_frames(time=2000)
sensor.set_auto_gain(False)
sensor.set_auto_whitebal(False)
sensor.set_hmirror(True)
sensor.set_vflip(True)
clock = time.clock()

while True:
    clock.tick()
    img = sensor.snapshot()
    centroid_sum = 0
    for r in ROIS:
        blobs = img.find_blobs(GRAYSCALE_THRESHOLD, roi=r[0:4], merge=True)
        if blobs:
            # Use the largest dark blob in this ROI as the road
            largest_blob = max(blobs, key=lambda b: b.pixels())
            img.draw_rectangle(largest_blob.rect())
            img.draw_cross(largest_blob.cx(),
                           largest_blob.cy())
            centroid_sum += largest_blob.cx() * r[4]
    # Weighted average x position of the road across the ROIs
    center_pos = centroid_sum / weight_sum
    # Offset from the image center (x = 80) converted to a steering angle
    deflection_angle = -math.atan((center_pos - 80) / 60)
    deflection_angle = math.degrees(deflection_angle)
    uart.write("%d\n" % deflection_angle)
    print("Turn Angle: %d\n" % deflection_angle)
    time.sleep_ms(100)
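The angle calculation in the loop above can be reproduced on a PC without a camera. The sketch below repeats the same arithmetic as the listing (weights 0.7, 0.3, 0.1, image center x = 80, divisor 60); the function name `deflection_angle_deg` and the use of None for "no blob found in this ROI" are assumptions made for this example.

```python
import math

ROI_WEIGHTS = (0.7, 0.3, 0.1)  # same weights as the ROIS table in the listing


def deflection_angle_deg(centroids_x, weights=ROI_WEIGHTS, center_x=80, divisor=60):
    """Weighted-centroid steering angle in degrees.

    centroids_x holds one blob x position per ROI, or None when no blob was
    found. As in the listing, a missing blob adds nothing to the weighted sum
    but its weight still counts in the total.
    """
    weight_sum = sum(weights)
    centroid_sum = sum(cx * w for cx, w in zip(centroids_x, weights)
                       if cx is not None)
    center_pos = centroid_sum / weight_sum
    return -math.degrees(math.atan((center_pos - center_x) / divisor))


print(round(deflection_angle_deg((80, 80, 80)), 3))     # road centered
print(round(deflection_angle_deg((140, 140, 140)), 3))  # road far to the right
```

With the road centered in every ROI the angle is about 0 degrees, and with every centroid at x = 140 it is about -45 degrees, because (140 - 80) / 60 = 1.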
5. Demonstration video of the work’s functions
Click to view >> Demo video
6. Project Summary
Experience:
By taking part in this Digi-Key Innovation Competition, I had the chance to learn the neural network development process systematically and gained a deeper understanding of the whole image recognition pipeline. Before this competition, although I had taken part in technology contests such as smart car races, electronic design contests, and the Challenge Cup, my understanding of embedded systems was only superficial: I thought a microcontroller plus an operating system was an embedded system. After this competition, my understanding has changed completely.
The extensive research in the early stage of the competition also made me realize how urgent it is to deploy intelligent algorithms in real products. We often have excellent algorithms that could benefit us, but they are not widely deployed because of their complexity, their demands on the computing power of the hardware platform, and the hardware cost. For this work, we chose the ESP32 chip and OpenMV as the hardware platform. The project involves embedded development, image recognition, computer network communication, and other areas. Although there was a lot to learn, we made full use of our spare time and raced against the clock to pick up new knowledge bit by bit. As we persisted step by step, we found that it was not as difficult as we had imagined. On the contrary, this competition improved my overall ability and gave me a new understanding of the principles, construction, training, and deployment of AI algorithms. I believe this knowledge will be of great help to my future research and study.
Share this post:
Kit sharing: https://bbs.eeworld.com.cn/thread-1209483-1-1.html
7. Neural Network Model Training
For the neural network part, we chose a YOLOv2 detection network and a ResNet-18 classification network. The detection data set contains about 4,000 images, and the classification data set contains about 3,600 images.
Figure 5. Schematic diagram of data set
Training took about 48 hours in total, and the model converged after about 12 hours. On the Linux side, the validation set accuracy of the detection network and the classification network was 97% and 98% respectively, which met our expectations.
Figure 6. Validation set accuracy at the end of training
After training, the final weight file is saved and the trained weights are tested on the local Linux system. As shown in the figure, we set the output confidence threshold to 70% during testing. The results show that when several types of road sign appear at once, most of them are classified correctly, and because YOLOv2 detects quickly, all the captured signs are identified in a short time. When a single type of road sign appears, its type is identified accurately. This meets our expected goal, so the model can move on to quantization.
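Applying the 70% output threshold amounts to discarding every detection whose confidence is below 0.70. The sketch below shows the idea only: the function name `filter_detections`, the dictionary layout, and the sample confidence values are hypothetical and are not taken from our test results.

```python
def filter_detections(detections, threshold=0.70):
    """Keep detections with confidence >= threshold, highest confidence first."""
    kept = [d for d in detections if d["confidence"] >= threshold]
    return sorted(kept, key=lambda d: d["confidence"], reverse=True)


# Made-up sample detections, for illustration only
sample = [
    {"label": "turn left", "confidence": 0.91},
    {"label": "stop", "confidence": 0.64},
    {"label": "green light", "confidence": 0.78},
]
for det in filter_detections(sample):
    print(det["label"], det["confidence"])
# turn left 0.91
# green light 0.78
```

A higher threshold reduces false detections at the cost of missing signs that the network is less sure about.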
Figure 7 PC-side prediction results of different road signs