Engineers Say | Introduction to the R-Car DNN Simulator (2)
Background
The R-Car SoC is a high-performance, low-power SoC offered by Renesas. To run models that customers have trained with mainstream deep learning frameworks (such as PyTorch or TensorFlow) on the R-Car SoC, the models must first be compressed with non-equivalent approximation methods such as pruning (*1) and quantization (*2). The R-Car CNN tool we provide not only applies these approximations so that a customer-trained deep learning model can run on the R-Car SoC, but also provides simulators with different trade-offs between accuracy and speed to suit the customer's application scenario. This makes it possible to verify operation and estimate performance even without R-Car SoC hardware. (*3)
Among these simulators, Accurate Simulator produces the output that most closely matches the actual R-Car SoC. This article presents a method for using Accurate Simulator to debug and analyze a model and improve its accuracy. By tracing, step by step, the intermediate outputs of the model, which cannot be observed on the actual R-Car SoC, we show how to determine the cause of unexpected results and improve accuracy.
Usage scenarios
To convert a customer-trained deep learning model into a format that can be executed on the R-Car SoC, non-equivalent approximate model compression, such as pruning and quantization, is required. Quantization approximates a model that uses floating-point operations with a model that uses integer operations. In this process, the maximum and minimum values of each layer's output tensor are estimated from multiple input images, together with the maximum and minimum values of each layer's weight parameters, and the quantization parameters (scale and zero point) are determined from them (calibration). When this quantized model is validated on an actual R-Car SoC or a simulator (*4), some input images may produce unexpected results that differ from those of the original trained model. In this case, it is very useful to analyze the model with Accurate Simulator, which allows direct observation of intermediate outputs that are not available on the actual R-Car SoC.
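To make the calibration step concrete, the following is a minimal sketch of min/max calibration for one layer, assuming a standard unsigned affine quantization scheme. The function names and sample values are hypothetical and are not taken from the R-Car CNN tool, whose actual scheme may differ. It also shows why an input that falls outside the calibrated range can give unexpected results: the value saturates at the largest integer code.

```python
def calibrate(samples, num_bits=8):
    """Derive an affine (scale, zero_point) pair from calibration samples."""
    lo = min(min(s) for s in samples)
    hi = max(max(s) for s in samples)
    # The range must contain 0.0 so that zero maps to an exact integer code.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero data
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point, qmin, qmax


def quantize(x, scale, zero_point, qmin, qmax):
    """Map a float to an integer code, clipping to the representable range."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to an approximate float value."""
    return (q - zero_point) * scale


# Hypothetical outputs of one layer for three calibration images.
calibration_outputs = [[-1.0, 0.2, 0.9], [-0.5, 0.0, 1.5], [-0.8, 0.4, 3.0]]
scale, zp, qmin, qmax = calibrate(calibration_outputs)

# A value inside the calibrated range survives the round trip closely.
in_range = dequantize(quantize(1.5, scale, zp, qmin, qmax), scale, zp)
# A value outside the calibrated range is clipped to the top of the range.
out_of_range = dequantize(quantize(6.0, scale, zp, qmin, qmax), scale, zp)

print("scale:", scale, "zero point:", zp)
print("1.5 ->", in_range)
print("6.0 ->", out_of_range)
```

In this sketch the calibration images never produced a value above 3.0, so an input that drives the layer to 6.0 comes back as roughly 3.0 after quantization.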
Using Accurate Simulator
The process of model analysis
In the case described above, calibration with input image data of insufficient quality or quantity may result in (a) suboptimal calibration or (b) quantization failure caused by intermediate layers whose outputs fluctuate widely. First determine whether the cause is (a) or (b). For (a), add to or update the input image data and calibrate again. For (b), identify the layer where the problem occurs and increase the bit width of that layer. These and similar measures are effective in improving the accuracy of the quantized model.
Accurate Simulator is designed so that its output exactly matches that of the actual R-Car SoC. Unlike the R-Car SoC, Accurate Simulator allows users to extract the intermediate output of each layer in the model. Specifically, users can extract the intermediate output of each layer one by one, starting from the layer nearest the input image data, and compare it with the corresponding intermediate output of the original trained model to check the error.
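The effect of the bit width in case (b) can be illustrated with a small sketch. It assumes a plain min/max affine quantizer and a hypothetical layer output made of many small values plus one large outlier; neither the function name nor the data comes from the R-Car CNN tool.

```python
def max_roundtrip_error(values, num_bits):
    """Largest absolute error after quantizing and restoring `values`."""
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    levels = 2 ** num_bits - 1
    scale = (hi - lo) / levels or 1.0
    worst = 0.0
    for v in values:
        q = round((v - lo) / scale)   # integer code in [0, levels]
        restored = q * scale + lo     # value recovered from the code
        worst = max(worst, abs(restored - v))
    return worst


# Hypothetical layer output: 100 small values and a single large outlier.
layer_output = [0.01 * i for i in range(100)] + [250.0]

err8 = max_roundtrip_error(layer_output, 8)
err16 = max_roundtrip_error(layer_output, 16)
print("max error at  8 bits:", err8)
print("max error at 16 bits:", err16)
```

Because the outlier stretches the range, one 8-bit step is close to 1.0 and the small values lose almost all of their detail. With 16 bits the step shrinks by a factor of about 257, which is why widening only the affected layer can restore accuracy.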
Demonstration example
When using our R-Car SoC, customers use our R-Car CNN tool to convert the trained model into a format executable on the R-Car SoC and then run it. The following describes how to find and fix the cause when the output of the original trained model (e.g. in TensorFlow) and the output of the R-Car SoC do not match at runtime. We illustrate how to use Accurate Simulator to estimate the quantization error by directly comparing the intermediate outputs of the original TensorFlow model with those of the model in the R-Car executable format.
1
Convert the customer-trained TensorFlow model to ONNX, then use our R-Car CNN tool to convert it into a format executable by Accurate Simulator, supplying the quantization conditions and a sufficient amount of image data for calibration.
2
Run the customer's TensorFlow model and extract the intermediate outputs of the layers you want to compare.
3
Use the R-Car SDK runtime to run the Accurate Simulator executable model generated in step 1, and extract the intermediate outputs of the layers to be compared.
4
Compare the elements of the intermediate outputs obtained in steps 2 and 3. Accurate Simulator outputs are expressed as integers according to the model's quantization, so we also provide tools for dequantization. The graph in the figure shows a direct comparison of the elements of the intermediate output tensors generated by TensorFlow and Accurate Simulator. In this example, the two results are almost identical, so there is no problem with this layer.
5
Repeat steps 1 to 4 to determine the layer at which the approximation breaks down. The output accuracy of the quantized model can then be improved by increasing the representation bit width of that layer's quantization parameters (for example, from 8 bits to 16 bits).
Figure 1: Intermediate output comparison process between TensorFlow and Accurate Simulator
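The comparison loop in the steps above can be sketched as follows. This is an illustration only: the layer names, tensors, quantization parameters and tolerance are hypothetical, and the plain dictionaries stand in for intermediate outputs that would in practice be extracted from TensorFlow and from Accurate Simulator. The dequantization here assumes a simple affine scheme and is not the tool provided with the R-Car CNN tool.

```python
def dequantize(codes, scale, zero_point):
    """Convert integer simulator outputs back to approximate float values."""
    return [(q - zero_point) * scale for q in codes]


def max_abs_error(a, b):
    """Largest element-wise absolute difference between two tensors."""
    return max(abs(x - y) for x, y in zip(a, b))


def first_divergent_layer(reference, simulated, quant_params, tolerance_steps=2.0):
    """Walk the layers from the input side and return (name, error) for the
    first layer whose error exceeds `tolerance_steps` quantization steps,
    or None if every layer is within tolerance."""
    for name, ref in reference.items():  # dicts keep insertion order
        scale, zero_point = quant_params[name]
        restored = dequantize(simulated[name], scale, zero_point)
        err = max_abs_error(ref, restored)
        if err > tolerance_steps * scale:
            return name, err
    return None


# Hypothetical float outputs of the original model, listed from the input side.
reference = {
    "conv1": [0.10, 0.52, 0.98],
    "conv2": [1.00, 2.05, 7.90],
    "fc": [0.30, 0.70, 0.10],
}
# Hypothetical (scale, zero_point) per layer and integer simulator outputs.
quant_params = {"conv1": (0.01, 0), "conv2": (0.02, 0), "fc": (0.01, 0)}
simulated = {
    "conv1": [10, 52, 98],
    "conv2": [50, 102, 255],  # the last element has saturated at the 8-bit maximum
    "fc": [28, 65, 14],
}

result = first_divergent_layer(reference, simulated, quant_params)
print(result)
```

Expressing the tolerance as a number of quantization steps is a design choice made for this sketch. It keeps the threshold meaningful across layers with very different output ranges, but any error metric that suits the model can be substituted.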
Summary
This article introduced a method that uses Accurate Simulator to find the cause and improve the accuracy of a customer-trained model when its output on our R-Car SoC is not satisfactory. Accurate Simulator is designed to produce calculation results comparable to those of an actual R-Car SoC, and it can be used to investigate intermediate outputs of a model that cannot be checked on real hardware. We hope that customers will use it to debug and evaluate their models and improve their accuracy. Renesas will continue to develop the R-Car CNN tools that customers use for model evaluation and verification.