Engineers say | Introduction to R-Car DNN simulator (2)

Last updated: 2023-07-13

Summary

Renesas provides several DNN simulators for the R-Car SoC. This article focuses on Accurate Simulator, which produces results equivalent to those of the actual hardware, and explains how to use it to analyze and improve the accuracy of neural networks.

Hiroshi Ota

Principal Software Engineer

Background

R-Car SoC is a high-performance, low-power SoC from Renesas. To run models that customers have trained with mainstream deep learning frameworks (such as PyTorch or TensorFlow) on R-Car SoC, the models must be compressed with non-equivalent approximation methods such as pruning ^(*1) and quantization ^(*2). The R-Car CNN tool we provide applies these approximations so that the customer's trained model can run on the R-Car SoC. It also provides simulators with different trade-offs between accuracy and speed to suit the customer's application scenario, so you can verify operation and estimate performance even without R-Car SoC hardware. ^(*3)

Of these simulators, Accurate Simulator produces the output that most closely matches the actual R-Car SoC. This article describes how to use Accurate Simulator to debug, analyze and improve the accuracy of a model. By tracing, step by step, the intermediate outputs that cannot be observed on the actual R-Car SoC, you can determine the cause of unexpected results and improve accuracy.

Usage scenario

To convert a customer-trained deep learning model into a format that can be executed on the R-Car SoC, non-equivalent approximate model compression, such as pruning and quantization, is required. Quantization approximates a model that uses floating-point operations with a model that uses integer operations. In this process, the maximum and minimum values of each layer's output tensor are estimated from multiple input images, the maximum and minimum values of each layer's weight parameters are obtained, and the quantization parameters (scale and zero point) are determined from them (calibration). When the quantized model is validated on an actual R-Car SoC or a simulator ^(*4), some input images may produce unexpected results compared with the original trained model. In this case, it is very useful to analyze the model with Accurate Simulator, which allows direct observation of intermediate outputs that are not available on the actual R-Car SoC.
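As a rough illustration of the calibration described above, the following Python sketch derives a scale and zero point from the minimum and maximum values seen in a set of calibration tensors, then quantizes and dequantizes with them. It assumes a plain unsigned affine (asymmetric) scheme; the function names `calibrate`, `quantize` and `dequantize` are hypothetical and are not part of the R-Car CNN tool, whose actual calibration method is not described in this article.

```python
from typing import Iterable, List, Tuple


def calibrate(samples: Iterable[List[float]], num_bits: int = 8) -> Tuple[float, int]:
    """Derive (scale, zero_point) from the min/max observed over all samples.

    Assumption: unsigned affine quantization, real = (q - zero_point) * scale.
    """
    lo = float("inf")
    hi = float("-inf")
    for tensor in samples:
        lo = min(lo, min(tensor))
        hi = max(hi, max(tensor))
    # Make sure 0.0 is inside the range so that it maps to an exact integer.
    lo = min(lo, 0.0)
    hi = max(hi, 0.0)
    qmax = (1 << num_bits) - 1
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = int(round(-lo / scale))
    return scale, zero_point


def quantize(values: List[float], scale: float, zero_point: int,
             num_bits: int = 8) -> List[int]:
    """Map real values to integers, clamping to the representable range."""
    qmax = (1 << num_bits) - 1
    return [max(0, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(q_values: List[int], scale: float, zero_point: int) -> List[float]:
    """Map integers back to approximate real values."""
    return [(q - zero_point) * scale for q in q_values]


# Hypothetical calibration data: outputs of one layer for two input images.
calibration_outputs = [[-1.0, 0.5, 2.0], [0.2, 3.0, -0.4]]
scale, zero_point = calibrate(calibration_outputs)
restored = dequantize(quantize([-1.0, 0.3, 3.0], scale, zero_point), scale, zero_point)
```

Values outside the calibrated range are clamped, which is one way that calibration data of insufficient quality or quantity can lead to unexpected results for other inputs.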

Using Accurate Simulator

The process of model analysis

In the case above, insufficient quality or quantity of the input image data used for calibration may result in (a) suboptimal calibration or (b) quantization failure caused by intermediate layers whose outputs fluctuate widely. First determine whether the cause is (a) or (b). For (a), add to or replace the input image data and calibrate again. For (b), identify the layer where the problem occurs and increase the bit width of that layer. These and similar measures are effective ways to improve the accuracy of the quantized model.
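To see why case (b) responds to a wider bit width, the sketch below quantizes a made-up layer output that consists of many small activations plus one large outlier, and compares the root-mean-square error at 8 and 16 bits. The data and the helper `quantization_rmse` are hypothetical, and the unsigned affine scheme is an assumption for illustration only, not the R-Car CNN tool's implementation.

```python
from typing import List


def quantization_rmse(values: List[float], num_bits: int) -> float:
    """RMS error after an affine quantize/dequantize round trip over min/max."""
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    qmax = (1 << num_bits) - 1
    scale = (hi - lo) / qmax
    zero_point = round(-lo / scale)
    squared_error = 0.0
    for v in values:
        q = max(0, min(qmax, round(v / scale) + zero_point))
        squared_error += ((q - zero_point) * scale - v) ** 2
    return (squared_error / len(values)) ** 0.5


# Hypothetical layer output: values in [-0.5, 0.5] plus a single outlier at 100.
layer_output = [0.01 * i for i in range(-50, 51)] + [100.0]

rmse_8bit = quantization_rmse(layer_output, 8)
rmse_16bit = quantization_rmse(layer_output, 16)
print(f"8-bit RMSE: {rmse_8bit:.5f}, 16-bit RMSE: {rmse_16bit:.5f}")
```

With 8 bits, the outlier stretches the range so far that the small activations collapse onto only a few integer levels. With 16 bits the step size is 257 times smaller, so the same layer is represented far more faithfully.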

Accurate Simulator is designed so that its output exactly matches that of the actual R-Car SoC. Unlike the R-Car SoC itself, Accurate Simulator lets users extract the intermediate output of each layer in the model. Specifically, users can extract the intermediate outputs layer by layer, starting from the input side of the model, and compare each one with the corresponding intermediate output of the original trained model to check the error.

Demonstration example

Customers who use our R-Car SoC convert their trained model into the R-Car SoC's execution format with our R-Car CNN tool and then execute it. The following procedure shows how to find and fix the cause when the output of the original trained model (for example, in TensorFlow) and the output of the R-Car SoC do not match at runtime. It uses Accurate Simulator to estimate the quantization error by directly comparing the intermediate outputs of the original TensorFlow model with those of the model in the R-Car executable format.

① Convert the customer-trained TensorFlow model to ONNX, then use our R-Car CNN tool to convert it into an Accurate Simulator executable format, supplying the quantization conditions and a sufficient amount of image data for calibration.

② Run the customer's TensorFlow model and extract the intermediate outputs of the layers you want to compare.

③ Use the R-Car SDK runtime to run the Accurate Simulator execution format model generated in ①, and extract the intermediate outputs of the layers to be compared.

④ Compare the components of the intermediate outputs obtained in ② and ③. The output of Accurate Simulator is expressed as integers according to the model's quantization, and we also provide tools for inverse quantization (dequantization). The graph in the figure shows a direct comparison of the intermediate output tensor components generated by TensorFlow and by Accurate Simulator. In this example, the two are almost identical, so there is no issue with this layer.

⑤ Repeat steps ① to ④ to determine the layer in which the approximation breaks down. The output accuracy of the quantized model can be improved by increasing the bit width of the quantization parameters of that layer (for example, from 8 bits to 16 bits).

Figure 1: Intermediate output comparison process between TensorFlow and Accurate Simulator
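The comparison and search over layers described above could be organized along the following lines. This is a minimal sketch in plain Python that assumes the intermediate outputs have already been extracted and flattened into lists. It does not call the R-Car CNN tool, the R-Car SDK runtime or TensorFlow, and every name in it (`first_diverging_layer`, the layer names, the 5% threshold) is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (integer tensor, scale, zero_point) as assumed to be available per layer.
QuantizedOutput = Tuple[List[int], float, int]


def dequantize(q_values: List[int], scale: float, zero_point: int) -> List[float]:
    """Inverse quantization, assuming real = (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in q_values]


def relative_rms_error(reference: List[float], candidate: List[float]) -> float:
    """RMS difference normalized by the RMS magnitude of the reference."""
    num = sum((r - c) ** 2 for r, c in zip(reference, candidate))
    den = sum(r * r for r in reference)
    return (num / den) ** 0.5 if den > 0 else num ** 0.5


def first_diverging_layer(
    layer_order: List[str],
    reference_outputs: Dict[str, List[float]],
    simulator_outputs: Dict[str, QuantizedOutput],
    threshold: float = 0.05,
) -> Optional[Tuple[str, float]]:
    """Walk layers from the input side and return the first one whose
    dequantized simulator output deviates from the reference by more
    than the threshold, or None if every layer is within tolerance."""
    for name in layer_order:
        q_values, scale, zero_point = simulator_outputs[name]
        restored = dequantize(q_values, scale, zero_point)
        error = relative_rms_error(reference_outputs[name], restored)
        if error > threshold:
            return name, error
    return None


# Hypothetical data for two layers; "conv2" has lost information.
reference = {"conv1": [0.1, 0.2, 0.3], "conv2": [1.0, 2.0, 3.0]}
simulated = {"conv1": ([10, 20, 30], 0.01, 0), "conv2": ([10, 20, 20], 0.1, 0)}
result = first_diverging_layer(["conv1", "conv2"], reference, simulated)
print(result)
```

Walking from the input side matters because quantization error propagates downstream: the first layer that exceeds the threshold is the most likely candidate for a wider bit width or for recalibration.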

Conclusion

This article introduced a method for cases where a customer-trained model runs on our R-Car SoC but its output is not satisfactory: use Accurate Simulator to find the cause and improve the accuracy of the model. Accurate Simulator is designed to produce calculation results that match those of an actual R-Car SoC, and it can be used to investigate intermediate outputs that cannot be checked on real hardware. We hope that customers will use it to debug and evaluate their models and improve their accuracy. Renesas will continue to develop the R-Car CNN tools for customers to use in model evaluation and verification.

Notes

(*1) Pruning sets weights that contribute little to the recognition results to zero and skips the calculations for those weights, thereby reducing the amount of computation and memory usage.

(*2) Quantization converts floating-point calculations into approximate integer operations (for example, 8-bit) for inference. The quantization discussed here is PTQ (post-training quantization), which optimizes the quantization parameters (scale and zero point) by calibrating with multiple input images.

(*3) See the previous article: Introduction to R-Car DNN simulator

(*4) In addition to Accurate Simulator, Renesas also provides Instruction Set Simulator (ISS), which aims to achieve calculation accuracy equivalent to that of actual hardware. Not only that, ISS also simulates the calculation process itself of actual hardware, allowing users to test models in an environment very close to actual hardware.

Renesas Electronics (TSE: 6723)

Renesas Electronics is committed to making life easier through technology and to creating a safer, smarter and more sustainable future. As a global microcontroller supplier, Renesas Electronics combines expertise in embedded processing, analog, power and connectivity to provide complete semiconductor solutions. Its product portfolio accelerates the launch of automotive, industrial, infrastructure and IoT applications, enabling billions of connected smart devices to improve the way people work and live. For more information, please visit renesas.com