
Share a fire sensor solution using MLX90640

Source: Internet | Publisher: 狂妄火龙果 | Keywords: Thermal Imager, MLX90640 | Updated: 2024/12/13

This project provides firefighters with a form of infrared vision to help them rescue people from burning buildings.

Background

Thermal imaging cameras (TICs) have proven to be a valuable tool for firefighters. They enable firefighters to locate victims faster, exit burning buildings more reliably, and reduce the time required to complete searches. However, some issues remain. When searching for victims, time is spent pulling out the TIC, setting it up, and reading the display. Handheld TIC displays are difficult to read accurately in thick smoke, which slows down locating victims. Using the display also requires the operator to take their eyes and attention away from their surroundings, leading to a loss of situational awareness and increasing the risk of tunnel vision.

A sensory substitution device that uses machine learning to give firefighters an infrared (IR) sense would be a useful tool. It would reduce how often firefighters look away from their surroundings at a display (also reducing the risk of tunnel vision), because they would no longer need to constantly monitor the TIC display: the important IR information would be streamed to them continuously. They could react as soon as the relevant stimulus is sensed rather than having to refer to the display, reducing the time it takes to reach a victim. Streaming critical information directly to the firefighter would also avoid situations where poor visibility of the display in heavy smoke becomes an issue. Overall, firefighters with IR awareness would be more effective in search and rescue operations.

How to set up the project
We connect the Qwiic cable (a 4-pin breadboard jumper) to the MLX90640 SparkFun IR Array Breakout (MLX). The four wires (black, red, yellow, blue) carry GND, VIN (3.3 V), SCL, and SDA respectively.

[Image: Qwiic cable connected to the MLX90640 breakout]

Because our Qwiic cable does not have female jumper ends, we use four female-to-female (FF) breadboard wires to connect the Qwiic cable pins to the Raspberry Pi (Pi). In the image below, you can see that the black, red, yellow, and blue Qwiic pins are connected to the brown, red, yellow, and orange FF breadboard wires respectively.

[Image: Qwiic cable pins joined to female-to-female breadboard wires]

The colour of the FF breadboard wires is not important; what matters is that the Qwiic cable's GND, VIN, SCL, and SDA pins connect to the appropriate Pi pins (physical pins 6, 1, 5, and 3 respectively). Below you can see a guide to the Pi pinout and our connections to the Pi.

[Image: Raspberry Pi pinout guide and our wiring connections]
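With the wiring done, it can help to confirm that the Pi can see the sensor before going further. The following check is not part of the original write-up; it is a minimal sketch that scans the I2C bus with the same Blinka/busio stack used later, assuming the MLX90640 is at its default address of 0x33.

import board, busio

i2c = busio.I2C(board.SCL, board.SDA)
while not i2c.try_lock():        # busio requires the bus to be locked before scanning
    pass
try:
    found = [hex(addr) for addr in i2c.scan()]
    print("I2C devices found:", found)   # expect '0x33' if the MLX is wired correctly
finally:
    i2c.unlock()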

Once that was done, we created a small cardboard stand to support the camera by cutting off one side of a pizza box. This step is not necessary, so feel free to skip it or make your own.

[Image: cardboard camera stand made from a pizza box]

Implementation Overview

Connecting the Camera to the Raspberry Pi
Using the Adafruit library, we are able to read data from the MLX90640 thermal camera.

import adafruit_mlx90640
import time,board,busio

i2c = busio.I2C(board.SCL, board.SDA, frequency=1000000) # setup I2C
mlx = adafruit_mlx90640.MLX90640(i2c) # begin MLX90640 with I2C comm
mlx.refresh_rate = adafruit_mlx90640.RefreshRate.REFRESH_2_HZ # set refresh rate

frame = [0]*768 # setup array for storing all 768 temperatures

mlx.getFrame(frame)

At this point, we have the temperature data stored in an array, ready for preprocessing.
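One practical detail worth noting (not shown in the original listing): reads from the MLX90640 occasionally fail partway through a frame, so the Adafruit examples typically retry until a clean frame is returned, for example:

while True:
    try:
        mlx.getFrame(frame)      # read a full 24x32 frame into the list
        break
    except ValueError:
        continue                 # frame was corrupted; simply try again

print("Frame range: %.1f C to %.1f C" % (min(frame), max(frame)))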

Preprocessing the raw temperature data and feeding it to the classification service
Our preprocessing step converts the temperature data into an image that can later be fed into our Edge Impulse model. This is done using the Matplotlib library; the image is then written to an in-memory buffer and encoded as a base64 string.

import io, base64
import numpy as np
import matplotlib.pyplot as plt

mlx.getFrame(frame)                            # grab the latest 24x32 frame

mlx_shape = (24, 32)
fig = plt.figure(frameon=False)                # borderless figure: image only

ax = plt.Axes(fig, [0., 0., 1., 1.])           # axes filling the whole figure
ax.set_axis_off()
fig.add_axes(ax)

thermal_image = ax.imshow(np.zeros(mlx_shape), aspect='auto')

MIN = 18.67                                    # colour-scale bounds (degrees C)
MAX = 43.68

data_array = np.reshape(frame, mlx_shape)      # reshape to 24x32
thermal_image.set_data(np.fliplr(data_array))  # flip left to right
thermal_image.set_clim(vmin=MIN, vmax=MAX)     # set bounds

buf = io.BytesIO()
fig.savefig(buf, format='jpg', facecolor='#FCFCFC', bbox_inches='tight')
img_b64 = base64.b64encode(buf.getvalue()).decode()

buf.close()
plt.close(fig)

The base64 string and raw frame data are sent to our classification microservice with an HTTP POST request. The microservice runs the image through the Edge Impulse model and, after some post-processing (described below), returns a flag stating whether a person was detected in the frame and, if so, where they are, i.e. left, centre, or right.
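As a rough illustration of the Pi side of this exchange (the service URL is an assumption; the "image" and "frame" field names come from the service code below):

import requests

payload = {"image": img_b64, "frame": list(frame)}
resp = requests.post("http://localhost:3000/classify", json=payload)  # assumed endpoint
result = resp.json()   # e.g. {"hasPerson": True, "direction": 2}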

Classification Service
The classification service accepts a POST request containing two data objects: the raw temperature array and the image represented in base64. The image needs to be preprocessed before being passed to the classifier. First, we have to extract the raw features of the image: the base64 string is decoded into an image buffer, which is then converted into a hexadecimal representation. This hexadecimal string is split into individual RGB pixel values and converted into integers, ready to be passed to the classifier.

let raw_features = [];
let img_buf = Buffer.from(request.body.image, 'base64')

try {
    let buf_string = img_buf.toString('hex');

    // store each RGB pixel value (6 hex characters) and convert it to an integer
    for (let i = 0; i < buf_string.length; i += 6) {
        raw_features.push(parseInt(buf_string.slice(i, i + 6), 16));
    }

} catch (error) {
    throw new Error("Error Processing Incoming Image");
}

The raw features are fed into the classifier, which returns an object containing the confidence values for two labels: "person" (a person is in the image) and "no person" (no person is in the image).

let result = {"hasPerson": false}
let classifier_result = classifier.classify(raw_features);

let no_person_value = 0
let person_value = 0

if (classifier_result["results"][0]["label"] === "no person") {
    no_person_value = classifier_result["results"][0]["value"]
} else {
    throw new Error("Invalid Model Classification Post Processing")
}

if (classifier_result["results"][1]["label"] === "person") {
    person_value = classifier_result["results"][1]["value"]
} else {
    throw new Error("Invalid Model Classification Post Processing")
}

These two label values are then compared to our confidence thresholds to determine whether a person has been seen in the frame. If not, the classifier service responds to the POST request with an object containing a single field:

"hasPerson" = false.

However, if the confidence value meets or exceeds the threshold criteria, the raw temperature data is used to determine where in the frame the heat source is coming from.

if (person_value > person_threshold
    && no_person_value < no_person_threshold) {
    result["hasPerson"] = true

    // If a person is present, find the bright spot in the image
    let frame_data = request.body.frame
    let column_average = new Array(32)

    let index_count = 0;
    for (let j = 0; j < 24; j++) {
        for (let i = 0; i < 32; i++) {
            column_average[i] = (column_average[i] || 0)
                + parseFloat(frame_data[index_count])
            index_count++
        }
    }

    let left_avg = 0
    let centre_avg = 0
    let right_avg = 0

    for (let i = 0; i < 16; i++) {
        left_avg = left_avg + column_average[i]
    }
    for (let i = 8; i < 24; i++) {
        centre_avg = centre_avg + column_average[i]
    }
    for (let i = 17; i < 32; i++) {
        right_avg = right_avg + column_average[i]
    }

    var direction
    if (left_avg > centre_avg && left_avg > right_avg) {
        direction = 1
    } else if (centre_avg > left_avg && centre_avg > right_avg) {
        direction = 2
    } else if (right_avg > left_avg && right_avg > centre_avg) {
        direction = 3
    } else {
        direction = 4
    }

    result["direction"] = direction
}
A response object is then returned from the POST request with two values:

"hasPerson" = true

"Direction" = <direction value>

We used the Neosensory SDK for Python to send motor commands to the Buzz after pairing it with the Pi. We chose to use spatiotemporal scans ("patterns encoded in space and time") because a study by Novich and Eagleman [2] found that they encode data onto the skin better than purely spatial patterns or patterns composed of a single motor vibrating one area of the skin. The higher recognition performance of spatiotemporal scans allows for greater information transfer (IT) through the skin, which means the firefighter can receive more useful information (and the IR sense obtained is potentially more effective). Since we only care about three direction values, three scan arrays are created to describe whether a person is on the left, right, or centre of the frame, as shown below:

sweep_left = [255,0,0,0,0,255,0,0,0,0,255,0,0,0,0,255,0,0,0,0]

sweep_right = [0,0,0,255,0,0,255,0,0,255,0,0,255,0,0,0,0,0,0,0]

sweep_centre = [255,0,0,0,0,255,0,0,0,0,0,255,0,0,255,0,0,0,0,0]
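Our reading of these scan arrays (an illustration, not from the original write-up): the Buzz has four vibration motors, so each consecutive group of four intensity values (0 to 255) is one frame, and a 20-value array plays five frames in sequence, producing a sweep across the wrist:

def sweep_frames(sweep, motors=4):
    # split a flat intensity list into per-frame groups of four motor values
    return [sweep[i:i + motors] for i in range(0, len(sweep), motors)]

print(sweep_frames(sweep_left))
# [[255, 0, 0, 0], [0, 255, 0, 0], [0, 0, 255, 0], [0, 0, 0, 255], [0, 0, 0, 0]]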

When the Pi receives a response object indicating that there is a person in the frame and where they are in the frame, it sends a vibration motor command to Buzz:

if response['hasPerson'] == True:
    print("has person")
    if response['direction']:

        print(response['direction'])

        if response['direction'] == 1:
            await my_buzz.vibrate_motors(sweep_right)
            print("Right")

        elif response['direction'] == 2:
            await my_buzz.vibrate_motors(sweep_centre)
            print("Centre")

        elif response['direction'] == 3:
            await my_buzz.vibrate_motors(sweep_left)
            print("Left")

        else:
            print("inconclusive")
else:
    print("no person")

We found that we had to put the Buzz into pairing mode every time we ran the code to minimize the chances of the Buzz not vibrating when we sent a command.

Edge Impulse

Dataset Collection
When the MLX is pointed at a subject, the array of 768 temperature values is saved to a .txt file. The files are organized to make uploading to Edge Impulse easy: frames where a person is in view, frames where another object is in view (such as a radiator or a dog), and frames where there is nothing.

[Image: dataset .txt files organised for upload to Edge Impulse]
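A minimal sketch of how each capture could be written out (the directory layout and file naming here are assumptions, not the project's exact script):

import time

def save_frame(frame, label, out_dir="dataset"):
    # one comma-separated line of 768 temperatures per file, labelled by prefix
    fname = "%s/%s_%d.txt" % (out_dir, label, int(time.time()))
    with open(fname, "w") as f:
        f.write(",".join("%.2f" % t for t in frame))

# e.g. save_frame(frame, "person") or save_frame(frame, "no_person")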

Data Preprocessing
The way we preprocessed the data for training the model was similar to how we processed the live feed from the camera above, except that the data source was text files containing temperature values. We wrote a Python script to loop through all of these files and output them as images. When converting temperatures to images, we need a minimum and a maximum value to anchor the colour range. To find these, we took the lowest and highest temperatures across our dataset of about 400 temperature arrays.
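A sketch of such a script (assuming the .txt files hold 768 comma-separated values each, as above; the paths are placeholders):

import glob
import numpy as np
import matplotlib.pyplot as plt

files = glob.glob("dataset/*.txt")
frames = [np.loadtxt(f, delimiter=",").reshape(24, 32) for f in files]

vmin = min(f.min() for f in frames)   # lowest temperature across the dataset
vmax = max(f.max() for f in frames)   # highest temperature across the dataset

for path, data in zip(files, frames):
    fig = plt.figure(frameon=False)
    ax = plt.Axes(fig, [0., 0., 1., 1.])
    ax.set_axis_off()
    fig.add_axes(ax)
    ax.imshow(np.fliplr(data), aspect='auto', vmin=vmin, vmax=vmax)
    fig.savefig(path.replace(".txt", ".jpg"))
    plt.close(fig)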

Creating the Model
Using Edge Impulse, we uploaded our collected data with the labels "person" and "no person" and allowed the data to be automatically split between training and testing. For the impulse design, we tried different combinations of processing and learning blocks (e.g., Image and Neural Network (Keras)), but we found that the Image processing block and the Transfer Learning block performed best with our data.

[Image: Edge Impulse impulse design with the Image and Transfer Learning blocks]

We use RGB as the color depth parameter and then generate features.

[Image: feature generation with RGB colour depth]

We set the number of training epochs to 20, the learning rate to 0.0005, and the minimum confidence to 0.6.

[Image: transfer learning training settings]

We deploy the impulse as a WebAssembly library with the default optimizations.

The model runs well, but there are some issues, as it occasionally outputs false positives and false negatives. Here are some screenshots of the model output running in our Node.js server and in the Python code on the Pi when someone is sitting directly in front of the MLX and the model correctly identifies them as being in the centre of the frame:

[Image: model output in the Node.js server and in the Python code on the Pi]

Future Possibilities

Improved Thermal Imaging Cameras
While the MLX is a great TIC for the home hobbyist, we don't think it is up to the task of helping firefighters save lives. A "high-performance" TIC such as the FLIR K53 has an impressive 320x240 resolution (76,800 pixels, a hundred times the MLX's 768) and a 60 Hz refresh rate. A more sophisticated TIC with a higher resolution and refresh rate would make it easier for the model to accurately detect human shapes, and would be necessary to turn this project into a product.

Further training of the model
We also believe a more sophisticated model is needed to turn this project into a product. The next step will be to develop a model that can detect multiple people in the same frame and output not only whether a person is detected but also their position in the frame, along both the x-axis and y-axis. If possible, information about the person's distance from the camera should also be conveyed, and even how "visible" the person is (for example, if only an arm is detected, the model should convey that the person is "partially visible" due to an obstacle, such as a bed or rubble).

For this project, we had a very simple tactile language that only had to convey information about three locations. Going forward, as the model outputs more information, a more refined tactile language will need to be designed. This could be used with a larger array of actuators located on the firefighter's body to facilitate richer spatiotemporal scanning. The final product might be designed to allow firefighters to wear tactile sleeves or vests instead of a Buzz.
