【STM32H7S78-DK】Evaluation + DSP calculation speed evaluation
This post was last edited by dql2016 on 2024-10-10 22:33
In the previous post, the DSP library was successfully added for mathematical calculations. This post adds perf_counter, the open-source performance measurement library by "Silly Kid", and uses it to time the DSP routines and so gauge DSP performance.
perf_counter is a C-based module that provides the following functions:
- Accurately measure system performance
- Accurately measure function execution time
- Accurately measure interrupt response latency
- Provide blocking or non-blocking delay services accurate to the microsecond level
- Improve the randomness of pseudo-random numbers
- Provide a system timestamp
- ……
It uses SysTick without taking exclusive ownership of it, so SysTick remains available to the application. It supports all Cortex-M processors in both bare-metal and RTOS environments, and toolchains such as Keil, IAR, and GCC. After downloading the source package, you will get the following files:
For bare metal, just add the 4 files shown in the figure to the project:
First, include the header file in main.c:
#include "perf_counter.h"
Then initialize the perf_counter library:
init_cycle_counter(false);
or
init_cycle_counter(true);
If the application does not initialize SysTick itself, pass false to init_cycle_counter().
If the application initializes SysTick itself, pass true to init_cycle_counter().
The project used here already initializes SysTick, so the argument is true.
Call perfc_port_insert_to_system_timer_insert_ovf_handler() from the SysTick interrupt handler:
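As background on why this hook belongs in the SysTick interrupt: SysTick is a 24-bit down-counter, so any measurement longer than one reload period needs the wraps to be counted somewhere. The following is a host-side sketch of that idea only. The type tick_model_t and its functions are hypothetical names made up for illustration, and this is not perf_counter's actual source.

```c
#include <stdint.h>

/* Hypothetical model of a 24-bit down-counting timer such as SysTick.
 * Illustration only; not taken from perf_counter. */
typedef struct {
    uint32_t reload;   /* value reloaded after the counter reaches zero */
    uint32_t current;  /* current down-counter value */
    int64_t  elapsed;  /* cycles accumulated from completed periods */
} tick_model_t;

/* Called from the timer interrupt each time the counter wraps:
 * one full period of (reload + 1) cycles has elapsed. */
void tick_model_on_overflow(tick_model_t *t)
{
    t->elapsed += (int64_t)t->reload + 1;
}

/* Total cycles = completed periods + progress inside the current period. */
int64_t tick_model_cycles(const tick_model_t *t)
{
    return t->elapsed + (int64_t)(t->reload - t->current);
}
```

With a 1 ms tick at 600 MHz the reload value would be 599999, which fits comfortably in 24 bits. If the overflow hook is never called, the count in this model can never exceed one period, which is why forgetting the call in the interrupt handler leads to wrong timings.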
Comment out the printf calls in the RMS test from the previous post, so that only the calculation is timed:
// DSP library RMS (root mean square) test
static void DSP_RMS(void)
{
    float32_t pSrc[10] = {0.7060f, 0.0318f, 0.2769f, 0.0462f, 0.0971f,
                          0.8235f, 0.6948f, 0.3171f, 0.9502f, 0.0344f};
    float32_t pResult;
    uint32_t pIndex;
    q31_t pSrc1[10];
    q31_t pResult1;
    q15_t pSrc2[10];
    q15_t pResult2;

    //printf("******** stm32h7s78-dk eeworld dsp test ***********\r\n");
    arm_rms_f32(pSrc, 10, &pResult);
    //printf("arm_rms_f32 : pResult = %f\r\n", pResult);
    /*****************************************************************/
    // Fill the Q31 buffer with raw random values, then compute its RMS
    for (pIndex = 0; pIndex < 10; pIndex++)
    {
        pSrc1[pIndex] = rand();
    }
    arm_rms_q31(pSrc1, 10, &pResult1);
    //printf("arm_rms_q31 : pResult = %d\r\n", pResult1);
    /*****************************************************************/
    // Fill the Q15 buffer with random values in 0..32767, then compute its RMS
    for (pIndex = 0; pIndex < 10; pIndex++)
    {
        pSrc2[pIndex] = rand() % 32768;
    }
    arm_rms_q15(pSrc2, 10, &pResult2);
    //printf("arm_rms_q15 : pResult = %d\r\n", pResult2);
    //printf("******************************************************************\r\n");
}
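The Q31 and Q15 buffers above are filled with raw rand() values, which is fine for timing purposes. If comparable results across the three formats were wanted, the same float samples could be converted to fixed point first. The sketch below shows the scaling involved, assuming truncation toward zero and saturation at the range limits. The function names are made up for illustration. CMSIS-DSP ships its own array versions, arm_float_to_q15() and arm_float_to_q31(), which would normally be used on the target.

```c
#include <stdint.h>

/* Convert a float in [-1.0, 1.0) to Q15: scale by 2^15 and saturate.
 * Illustrative helper, not part of CMSIS-DSP. */
int16_t float_to_q15_sketch(float x)
{
    float scaled = x * 32768.0f;
    if (scaled >= 32767.0f)  return INT16_MAX;
    if (scaled <= -32768.0f) return INT16_MIN;
    return (int16_t)scaled;  /* truncates toward zero */
}

/* Convert a float in [-1.0, 1.0) to Q31: scale by 2^31 and saturate.
 * Illustrative helper, not part of CMSIS-DSP. */
int32_t float_to_q31_sketch(float x)
{
    float scaled = x * 2147483648.0f;
    if (scaled >= 2147483648.0f)  return INT32_MAX;
    if (scaled <= -2147483648.0f) return INT32_MIN;
    return (int32_t)scaled;  /* truncates toward zero */
}
```

All ten float samples in pSrc lie in [0, 1), so each of them has a valid Q15 and Q31 representation under this scaling.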
Measure the time it takes to call this function:
start_cycle_counter();
DSP_RMS();
int64_t lCycleUsed = stop_cycle_counter();
printf("cycle counter = %lld\n",lCycleUsed);
Printing 64-bit values (int64_t or uint64_t) with printf requires the full standard C library rather than the reduced one:
With the default settings, compiling and downloading the program prints 0 as the result. The explanation was found on the official ST community forum:
https://community.st.com/t5/stm32cubeide-mcus/wrong-result-when-printing-a-int64-t-value-using-stm32cubeide/td-p/148913
STM32CubeIDE uses a reduced library by default in order to reduce code size, as it's usually preferred for embedded development. Part of this tradeoff is dropping support for long long ints in printf-type functions.
> What can I do to print a 64-bit int value?
Change the runtime library from reduced to standard.
The project had previously been configured to use the reduced C library, which is why the output was wrong.
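If switching the runtime library is not an option (for example because of code size), one possible workaround is to format the 64-bit value by hand and print it with %s. The helper below is a sketch under that assumption. The name i64_to_dec is made up for illustration and is not part of perf_counter or the C library.

```c
#include <stdint.h>

/* Format a signed 64-bit value as decimal text into buf and return buf.
 * buf must hold at least 21 bytes (sign + 19 digits + terminator).
 * Does not rely on printf's %lld support. */
char *i64_to_dec(int64_t value, char *buf)
{
    char tmp[20];
    int n = 0;
    /* Work on the magnitude as unsigned so INT64_MIN does not overflow. */
    uint64_t mag = (value < 0) ? (uint64_t)0 - (uint64_t)value
                               : (uint64_t)value;
    /* Collect digits least significant first. */
    do {
        tmp[n++] = (char)('0' + (int)(mag % 10u));
        mag /= 10u;
    } while (mag != 0u);

    char *p = buf;
    if (value < 0) {
        *p++ = '-';
    }
    /* Reverse the digits into the output buffer. */
    while (n > 0) {
        *p++ = tmp[--n];
    }
    *p = '\0';
    return buf;
}
```

With a char buf[21] in scope, the measurement could then be printed as printf("cycle counter = %s\n", i64_to_dec(lCycleUsed, buf));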
After switching the runtime library from reduced to standard, the value prints correctly:
The core clock of the STM32H7S7 is 600 MHz, so 35630 cycles × (1 / 600 000 000) s = 5.938 × 10^-5 s ≈ 59.38 µs.
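The same conversion can be done in code instead of by hand. This is a minimal sketch, and cycles_to_us is a made-up helper name. On the target, the clock argument would typically come from the CMSIS variable SystemCoreClock, assuming it reflects the actual core frequency.

```c
#include <stdint.h>

/* Convert a cycle count to microseconds for a given core clock in Hz. */
double cycles_to_us(int64_t cycles, uint32_t core_hz)
{
    return (double)cycles * 1e6 / (double)core_hz;
}
```

For the measurement above, cycles_to_us(35630, 600000000u) gives roughly 59.38.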
So the calculation is quite fast: this time covers arm_rms_f32, arm_rms_q31, and arm_rms_q15 on 10 samples each, plus the 20 rand() calls that fill the two fixed-point buffers.