[Silicon Labs Development Kit Review] TensorFlow Application Code Analysis Based on Deep Learning
1. The most interesting addition in this version of the Silicon Labs SDK is the introduction of the TensorFlow deep learning framework into embedded development. A sample program, tensorflow_lite_micro_helloworld, is provided and is worth analyzing in detail.
2. Create the sample project following the usual process, then compile it, download it to the board, and debug.
The program computes a sine waveform and uses PWM to gradually vary the brightness of the on-board LED accordingly; the corresponding x and y values are also printed over the serial port.
3. Code Analysis
The overall logic follows the standard framework: main.c starts the application code in app.c, and app.c in turn calls into the core code in the tensorflow subdirectory of the project.
main.c
int main(void)
{
  // Initialize Silicon Labs device, system, service(s) and protocol stack(s).
  // Note that if the kernel is present, processing task(s) will be created by
  // this call.
  sl_system_init();

  // Initialize the application. For example, create periodic timer(s) or
  // task(s) if the kernel is present.
  app_init();

#if defined(SL_CATALOG_KERNEL_PRESENT)
  // Start the kernel. Task(s) created in app_init() will start running.
  sl_system_kernel_start();
#else // SL_CATALOG_KERNEL_PRESENT
  while (1) {
    // Do not remove this call: Silicon Labs components process action routine
    // must be called from the super loop.
    sl_system_process_action();

    app_process_action();

#if defined(SL_CATALOG_POWER_MANAGER_PRESENT)
    // Let the CPU go to sleep if the system allows it.
    sl_power_manager_sleep();
#endif
  }
#endif // SL_CATALOG_KERNEL_PRESENT
}
app.c
void app_init(void)
{
#if defined(SL_CATALOG_MVP_PRESENT)
  sli_mvp_init_t init = { .use_dma = false };
  sli_mvp_init(&init);
#endif
  tensorflow_lite_micro_helloworld_init();
}

/***************************************************************************//**
 * App ticking function.
 ******************************************************************************/
void app_process_action(void)
{
  tensorflow_lite_micro_helloworld_process_action();
}

void tensorflow_lite_micro_helloworld_init(void)
{
  sl_pwm_start(&sl_pwm_led0);
  setup();
}

/***************************************************************************//**
 * Ticking function.
 ******************************************************************************/
void tensorflow_lite_micro_helloworld_process_action(void)
{
  // Delay between model inferences to simplify visible output
  sl_sleeptimer_delay_millisecond(100);
  loop();
}
The core code is in main_functions.c
// The name of this function is important for Arduino compatibility.
void setup() {
  // Set up logging. Google style is to avoid globals or statics because of
  // lifetime uncertainty, but since this has a trivial destructor it's okay.
  // NOLINTNEXTLINE(runtime-global-variables)
  static tflite::MicroErrorReporter micro_error_reporter;
  error_reporter = &micro_error_reporter;

  // Map the model into a usable data structure. This doesn't involve any
  // copying or parsing, it's a very lightweight operation.
  model = tflite::GetModel(g_model);
  if (model->version() != TFLITE_SCHEMA_VERSION) {
    TF_LITE_REPORT_ERROR(error_reporter,
                         "Model provided is schema version %d not equal "
                         "to supported version %d.",
                         model->version(), TFLITE_SCHEMA_VERSION);
    return;
  }

  // This pulls in all the operation implementations we need.
  // NOLINTNEXTLINE(runtime-global-variables)
  static tflite::AllOpsResolver resolver;

  // Build an interpreter to run the model with.
  static tflite::MicroInterpreter static_interpreter(
      model, resolver, tensor_arena, kTensorArenaSize, error_reporter);
  interpreter = &static_interpreter;

  // Allocate memory from the tensor_arena for the model's tensors.
  TfLiteStatus allocate_status = interpreter->AllocateTensors();
  if (allocate_status != kTfLiteOk) {
    TF_LITE_REPORT_ERROR(error_reporter, "AllocateTensors() failed");
    return;
  }

  // Obtain pointers to the model's input and output tensors.
  input = interpreter->input(0);
  output = interpreter->output(0);

  // Keep track of how many inferences we have performed.
  inference_count = 0;
}
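setup() references several file-scope objects that are not shown in the listing. In the upstream hello_world example they are declared at the top of main_functions.cc roughly as follows; the arena size here is the upstream default and is only an assumption, the Silicon Labs port may size it differently:

namespace {
// Globals, kept static for the lifetime of the application.
tflite::ErrorReporter* error_reporter = nullptr;
const tflite::Model* model = nullptr;
tflite::MicroInterpreter* interpreter = nullptr;
TfLiteTensor* input = nullptr;
TfLiteTensor* output = nullptr;
int inference_count = 0;

// Scratch memory from which AllocateTensors() carves out the model's
// input, output, and intermediate tensors. 2000 bytes is the upstream default.
constexpr int kTensorArenaSize = 2000;
uint8_t tensor_arena[kTensorArenaSize];
}  // namespace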
// The name of this function is important for Arduino compatibility.
void loop() {
  // Calculate an x value to feed into the model. We compare the current
  // inference_count to the number of inferences per cycle to determine
  // our position within the range of possible x values the model was
  // trained on, and use this to calculate a value.
  float position = static_cast<float>(inference_count) /
                   static_cast<float>(kInferencesPerCycle);
  float x_val = position * kXrange;

  // Place our calculated x value in the model's input tensor
  input->data.f[0] = x_val;

  // Run inference, and report any error
  TfLiteStatus invoke_status = interpreter->Invoke();
  if (invoke_status != kTfLiteOk) {
    TF_LITE_REPORT_ERROR(error_reporter, "Invoke failed on x_val: %f\n",
                         static_cast<double>(x_val));
    return;
  }

  // Read the predicted y value from the model's output tensor
  float y_val = output->data.f[0];

  // Output the results. A custom HandleOutput function can be implemented
  // for each supported hardware target.
  HandleOutput(error_reporter, x_val, y_val);

  // Increment the inference_counter, and reset it if we have reached
  // the total number per cycle
  inference_count += 1;
  if (inference_count >= kInferencesPerCycle) inference_count = 0;
}
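The two constants used in loop() are not shown in this project listing. In the upstream hello_world example they are defined in constants.cc roughly as below (the number of inferences per cycle is an upstream default and may differ in the Silicon Labs port); together they sweep x across one full period of the sine wave:

// The model was trained on x values between 0 and 2*pi.
const float kXrange = 2.f * 3.14159265359f;
// Number of inference steps used to sweep across that range per cycle.
const int kInferencesPerCycle = 20;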
In effect, the application calls the TensorFlow Lite Micro API directly. The key objects are declared as follows:
const tflite::Model* model = nullptr;
tflite::MicroInterpreter* interpreter = nullptr;
TfLiteTensor* input = nullptr;
TfLiteTensor* output = nullptr;
First, load the trained model:
model = tflite::GetModel(g_model);
Then write the calculated x value into the model's input tensor:
input->data.f[0] = x_val;
Run inference:
TfLiteStatus invoke_status = interpreter->Invoke();
and read the prediction directly from the output tensor:
float y_val = output->data.f[0];
Finally, based on the x and y values, the serial output and LED brightness control are performed:
HandleOutput(error_reporter, x_val, y_val);
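HandleOutput() is the target-specific part and its body is not shown in the generated project above. Below is a minimal sketch of what it could look like on this board, assuming the Gecko SDK PWM driver (sl_pwm_set_duty_cycle) and the sl_pwm_led0 instance started earlier; the header names, scaling, and logging are illustrative assumptions, not the actual Silicon Labs implementation:

#include "sl_pwm.h"            // Gecko SDK PWM driver (assumed available)
#include "sl_pwm_instances.h"  // provides sl_pwm_led0 (assumed)
#include "tensorflow/lite/core/api/error_reporter.h"

void HandleOutput(tflite::ErrorReporter* error_reporter,
                  float x_value, float y_value) {
  // Map y from [-1, 1] to a PWM duty cycle between 0 and 100 percent.
  uint8_t duty_cycle = static_cast<uint8_t>((y_value + 1.0f) * 50.0f);
  sl_pwm_set_duty_cycle(&sl_pwm_led0, duty_cycle);

  // Print the x/y pair to the serial console.
  TF_LITE_REPORT_ERROR(error_reporter, "x_value: %f, y_value: %f",
                       static_cast<double>(x_value),
                       static_cast<double>(y_value));
}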
4. Development and application workflow for deep learning.
4.1 First, look at the model data.
#include "tensorflow/lite/micro/examples/hello_world/model.h"
// Keep model aligned to 8 bytes to guarantee aligned 64-bit accesses.
alignas(8) const unsigned char g_model[] = {
    0x1c, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33, 0x00, 0x00, 0x12, 0x00,
    0x1c, 0x00, 0x04, 0x00, 0x08, 0x00, 0x0c, 0x00, 0x10, 0x00, 0x14, 0x00,
    0x00, 0x00, 0x18, 0x00, 0x12, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00,
    0x60, 0x09, 0x00, 0x00, 0xa8, 0x02, 0x00, 0x00, 0x90, 0x02, 0x00, 0x00,
    0x3c, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00,
    ........
};
This is the standard format used by Google's TensorFlow Lite. A trained deep learning model is normally a set of floating-point parameters, but TensorFlow Lite quantizes them into integer data, with only a small loss of accuracy. The conversion and quantization tooling can be downloaded from Google's website.
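In the upstream hello_world example the converted flatbuffer is exposed to the C/C++ code through a small header, roughly like this (the names follow the upstream example):

// model.h -- the quantized model, compiled into the firmware as a byte array.
extern const unsigned char g_model[];
extern const int g_model_len;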
4.2 The model must first be trained offline: build a deep learning model, train it with Google's AI tooling until it converges, then compress (quantize) it and embed it into the development board's code. Many models start at several megabytes, which is unusable in typical embedded development.
This process is fairly painful. In this example it has already been done, and the model is provided directly as a data array for easy evaluation.
5. Porting and developing deep learning applications.
Although this example program is easy to use, porting deep learning to embedded targets is difficult: getting a model to converge and pruning it takes considerable experience, and the learning curve is steep. On a lightweight system, computing performance is limited and inference is relatively slow; where a conventional algorithm can do the job, it is far more efficient than running deep learning inference in a loop.
Even so, this is a worthwhile example that broadens a project's capabilities and sparks the imagination.