In the previous review, we set up the development environment, created a new project from scratch, and successfully printed Hello World.
Next, I will benchmark the microcontroller. Many consumers of electronic products make side-by-side comparisons of specifications and performance before buying, especially for mobile phones and computers. Vendors advertise benchmark scores when promoting performance, such as AnTuTu scores for phones and CPU-Z or AIDA64 scores for computers. Although a score cannot fully represent the real performance of the hardware, it is still a useful reference. Similarly, when choosing an MCU, it must of course meet the requirements of the project, but beyond that we prefer the MCU with the better cost performance.
This is where performance benchmarks for microcontrollers come in. The best-known ones are CoreMark and Dhrystone. Dhrystone can serve as a reference, but its results are more easily affected by other factors, such as the compiler and its libraries.
CoreMark better reflects real working capability. It was developed by the Embedded Microprocessor Benchmark Consortium (EEMBC) to replace the outdated Dhrystone benchmark, and ARM officially recommends using CoreMark instead of Dhrystone for benchmarking.
CoreMark is written in C and is free and easy to port. It has become the industry-standard benchmark for measuring and comparing the performance of various processors. The higher the CoreMark score, the higher the performance. The following figure compares CoreMark and Dhrystone.
(Figure: comparison of CoreMark and Dhrystone. Image source: the Internet.)
CoreMark's simulated workload mainly includes several commonly used algorithms:
Matrix operations: simulate commonly used arithmetic operations;
Linked list operations: simulate various uses of pointers;
State machine operations: simulate program branching;
Cyclic redundancy check (CRC): a common function in embedded systems.
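To give a feel for the last item, here is a minimal sketch of a bitwise CRC-16. It is not CoreMark's own routine: the function name crc16_arc and the choice of the common CRC-16/ARC parameters (reflected polynomial 0xA001, initial value 0) are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bitwise CRC-16/ARC (reflected polynomial 0xA001, init 0).
 * Hypothetical helper, not taken from the CoreMark sources. */
uint16_t crc16_arc(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];                       /* fold the next byte into the low bits */
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (uint16_t)((crc >> 1) ^ 0xA001u);   /* shift and apply the polynomial */
            else
                crc = (uint16_t)(crc >> 1);               /* shift only */
        }
    }
    return crc;
}
```

The shift, test, and XOR in the inner loop are the kind of bit manipulation and branching that this part of the workload exercises on the MCU.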
Next, let's port CoreMark.
The first step is to download the source code from the official website, www.eembc.org. The downloaded files are shown in the figure below.
The second step is to open the project "AT32_Demo" (https://en.eeworld.com/bbs/thread-1163406-1-1.html) that I created in the previous review. Create a subfolder named CoreMark and put core_list_join.c, core_main.c, core_matrix.c, core_state.c, core_util.c, and coremark.h into it. Create another subfolder named CoreMark_Test and put core_portme.c and core_portme.h from the "simple" folder into it. Then open the project, add the .c files above to it, and don't forget to add the include paths. The final result is shown below. (Note that in this project the serial port configuration and usage code lives separately in USART.c, which is in the HARDWARE folder.)
The third step: because core_main.c already contains a main() function, the main function in the original project must be excluded or deleted. I chose to exclude it: select main.c, right-click, choose the first menu item, and uncheck "Include in Target Build", as shown below.
The most important porting work is to adapt core_portme.c.
First, add:
#define SysTick_Counter_Disable ((uint32_t)0xFFFFFFFE)
#define SysTick_Counter_Enable ((uint32_t)0x00000001)
#define SysTick_Counter_Clear ((uint32_t)0x00000000)
__IO uint32_t Ticks;
#define ITERATIONS 2500
The value of ITERATIONS depends on the situation. If the message "ERROR! Must execute for at least 10 secs for a valid result!" appears, increase this value until the program runs for at least 10 seconds.
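One way to pick the value without repeated trial and error is to scale from a single trial run. The sketch below assumes you already know how many milliseconds a given number of iterations took (for example, 1500 iterations in 17303 ms, as in one of the logs later in this post). The helper name min_iterations is hypothetical and not part of CoreMark.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest iteration count expected to last at least min_ms milliseconds,
 * scaled from a trial run of trial_iters iterations that took trial_ms ms.
 * Hypothetical helper for choosing ITERATIONS; not part of CoreMark. */
uint32_t min_iterations(uint32_t trial_iters, uint32_t trial_ms, uint32_t min_ms)
{
    assert(trial_ms > 0);
    uint64_t scaled = (uint64_t)trial_iters * min_ms;
    return (uint32_t)((scaled + trial_ms - 1) / trial_ms);   /* integer division, rounded up */
}
```

For the trial above, min_iterations(1500, 17303, 10000) gives 867. Since this is only an estimate, it is sensible to add some margin on top of the computed value.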
At the same time, comment out the following code:
#define NSECS_PER_SEC CLOCKS_PER_SEC
#define CORETIMETYPE clock_t
#define GETMYTIME(_t) (*_t = clock())
#define MYTIMEDIFF(fin, ini) ((fin) - (ini))
#define TIMER_RES_DIVIDER 1
#define SAMPLE_TIME_IMPLEMENTATION 1
#define EE_TICKS_PER_SEC (NSECS_PER_SEC / TIMER_RES_DIVIDER)
static CORETIMETYPE start_time_val, stop_time_val;
Then add:
#define EE_TICKS_PER_SEC 1000.0
Modify the original three timing functions as follows:
void start_time(void)
{
    Ticks = 0;                              /* reset the millisecond counter before timing starts */
    SysTick_Config(SystemCoreClock / 1000); /* 1 ms interrupt */
}
void stop_time(void)
{
    /* Stop the timer so that Ticks no longer advances */
    SysTick->CTRL &= SysTick_Counter_Disable;
    /* Clear the SysTick counter */
    SysTick->VAL = SysTick_Counter_Clear;
}
CORE_TICKS get_time(void)
{
    /* Elapsed time in ticks (1 tick = 1 ms); replaces MYTIMEDIFF(stop_time_val, start_time_val) */
    CORE_TICKS elapsed = (CORE_TICKS)Ticks;
    return elapsed;
}
At the same time, because main() in core_main.c first calls portable_init() in core_portme.c, the RCC_Configuration, GPIO_Configuration, NVIC_Configuration, and assert_failed functions from the original main.c must be moved into core_portme.c, and the configuration functions must be called from portable_init(), as shown in the figure below. (USART_Configuration is in USART.c; remember to include USART.h.)
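As a rough sketch of what such a portable_init could look like: the four configuration functions below are empty stand-ins that only count calls so the example compiles on a PC, the core_portable type is a simplified stand-in for the one in core_portme.h, and the call order is my assumption rather than something taken from the project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the type declared in core_portme.h. */
typedef struct {
    uint8_t portable_id;
} core_portable;

/* Stand-ins for the board setup functions moved over from main.c.
 * On the real board they configure clocks, pins, interrupts, and the UART. */
static int init_calls;
static void RCC_Configuration(void)   { init_calls++; }
static void GPIO_Configuration(void)  { init_calls++; }
static void NVIC_Configuration(void)  { init_calls++; }
static void USART_Configuration(void) { init_calls++; }

/* Called once by main() in core_main.c before the benchmark starts. */
void portable_init(core_portable *p, int *argc, char *argv[])
{
    (void)argc;                 /* no command line on a bare-metal target */
    (void)argv;
    assert(p != NULL);

    RCC_Configuration();        /* clocks first (assumed order) */
    GPIO_Configuration();
    NVIC_Configuration();
    USART_Configuration();      /* serial port must be ready before the first ee_printf */

    p->portable_id = 1;         /* mark the port as initialised */
}
```

The only hard ordering requirement in this sketch is that the serial port is configured before CoreMark prints anything.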
Step 4: modify core_portme.h and coremark.h. First, adapt the ee_printf print function. Since printf is already implemented for our board, keep the following code block in coremark.h unchanged.
#if HAS_PRINTF
#define ee_printf printf
#endif
This way, every ee_printf call is replaced with printf at compile time, which takes care of the printing.
If printf is not implemented for your board, you have to implement the print function yourself and substitute it accordingly.
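For the case without a working printf, one possible approach is to format into a buffer with vsnprintf and push the bytes out through the UART. The following is a minimal sketch under that assumption. uart_send_byte is a hypothetical stand-in that writes into a capture buffer so the example runs on a PC; on a real board it would wait for the transmitter and write the byte to the USART data register. The 128-byte line buffer is also an arbitrary choice.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the UART driver: appends to a buffer instead of touching hardware. */
static char   uart_capture[256];
static size_t uart_len;

static void uart_send_byte(char c)
{
    if (uart_len < sizeof uart_capture - 1) {
        uart_capture[uart_len++] = c;
        uart_capture[uart_len]   = '\0';
    }
}

/* Minimal ee_printf: format into a local buffer, then send it byte by byte. */
int ee_printf(const char *fmt, ...)
{
    char    buf[128];
    va_list args;
    int     n;

    assert(fmt != NULL);
    va_start(args, fmt);
    n = vsnprintf(buf, sizeof buf, fmt, args);
    va_end(args);

    if (n < 0)
        return n;                      /* formatting error */
    if ((size_t)n >= sizeof buf)
        n = (int)sizeof buf - 1;       /* output was truncated to the buffer size */
    for (int i = 0; i < n; i++)
        uart_send_byte(buf[i]);
    return n;                          /* number of bytes actually sent */
}
```

Note that vsnprintf still pulls in the C library's formatting code, so on very small parts a hand-written formatter may be the better choice.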
At the same time, change the following macros in core_portme.h to match your compiler version and optimization level:
#ifndef COMPILER_VERSION
#ifdef __GNUC__
#define COMPILER_VERSION "GCC"__VERSION__
#else
#define COMPILER_VERSION "ARM Compiler 5.06 update 7 (build 960)"//changed
#endif
#endif
#ifndef COMPILER_FLAGS
#define COMPILER_FLAGS "-g -O3 -Otime"//changed /* "Please put compiler flags here (eg -o3)" */
#endif
#ifndef MEM_LOCATION
#define MEM_LOCATION "STACK"
#endif
Finally, because the SysTick timer provides the time base used by start_time, stop_time, and get_time, modify SysTick_Handler in at32f4xx_it.c as follows:
extern __IO uint32_t Ticks;
void SysTick_Handler(void)
{
Ticks++;
}
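To see how the counter, the handler, and the three timing functions fit together, here is a small PC-side simulation. Everything prefixed with sim_ is hypothetical: the handler is called from a loop instead of a hardware interrupt, a plain flag plays the role of the SysTick enable bit, and the sketch resets the counter to zero when timing starts.

```c
#include <assert.h>
#include <stdint.h>

static volatile uint32_t sim_ticks;    /* plays the role of Ticks */
static int sim_timer_enabled;          /* plays the role of the SysTick enable bit */

/* Called once per simulated millisecond, like SysTick_Handler on the board. */
void sim_systick_handler(void)
{
    if (sim_timer_enabled)
        sim_ticks++;
}

void sim_start_time(void)   { sim_ticks = 0; sim_timer_enabled = 1; }
void sim_stop_time(void)    { sim_timer_enabled = 0; }
uint32_t sim_get_time(void) { return sim_ticks; }

/* Times running_ms milliseconds of "work", then lets idle_ms more pass
 * after the timer has been stopped, and returns the measured ticks. */
uint32_t sim_measure(uint32_t running_ms, uint32_t idle_ms)
{
    sim_start_time();
    for (uint32_t i = 0; i < running_ms; i++)
        sim_systick_handler();
    sim_stop_time();
    assert(sim_get_time() == running_ms);

    for (uint32_t i = 0; i < idle_ms; i++)
        sim_systick_handler();         /* ignored: the timer is stopped */
    return sim_get_time();
}
```

The point of the simulation is that only the time between start and stop is counted, which is what CoreMark expects from a port's timing functions.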
The last step is to set the optimization level to Level 3 (-O3) and the chip frequency to 120 MHz so that the score is higher. Note that running CoreMark needs a fairly large stack, so change Stack_Size EQU 0x00000400 in the startup file startup_at32f421c8t7.s to Stack_Size EQU 0x00001000 (that is, from 1 KB to 4 KB).
I should point out that I may have left out some porting details, such as the various header files, but the important steps are all covered above. For a working program, please see the files I uploaded.
After compiling and downloading, if everything works, the benchmark results appear in the serial port assistant.
The final score is 201 points. Note that different IDE versions, toolchains, or hardware, or details I failed to consider, may lead to different CoreMark scores, so this may not be the true performance of the AT32F421. I can only vouch for the program I ran and the software and hardware I used. For comparison, I also tested an STM32F103RCT6; it scored only 86 points, which is lower than the official score. Its serial output is as follows:
------------------------------------------------------------------
CoreMark Test Begin!
Transplant CoreMark programs By DMZ!
CoreMark Test is running, Plase Wait!
2K performance run parameters for coremark.
CoreMark Size: 666
Total ticks : 17303
Total time (secs): 17.303000
Iterations/Sec : 86.690169
Iterations : 1500
Compiler version: ARM Compiler 5.06 update 7 (build 960)
Compiler flags: -g -O3 -Otime
Memory location : STACK
seedcrc : 0xe9f5
[0]crclist : 0xe714
[0]crcmatrix : 0x1fd7
[0]crcstate : 0x8e3a
[0]crcfinal : 0x25b5
Correct operation validated. See readme.txt for run and reporting rules.
CoreMark 1.0: 86.690169 / ARM Compiler 5.06 update 7 (build 960) -g -O3 -Otime / STACK
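The score in the log follows directly from the tick count: the ticks are converted to seconds using EE_TICKS_PER_SEC, and the iteration count is divided by that time. The sketch below reproduces the arithmetic. The function name coremark_score is hypothetical and this is not the code from core_main.c.

```c
#include <assert.h>
#include <stdint.h>

/* CoreMark score = iterations per second.
 * ticks_per_sec corresponds to EE_TICKS_PER_SEC (1000.0 for a 1 ms tick). */
double coremark_score(uint32_t iterations, uint32_t total_ticks, double ticks_per_sec)
{
    assert(total_ticks > 0 && ticks_per_sec > 0.0);
    double total_secs = (double)total_ticks / ticks_per_sec;   /* 17303 ticks -> 17.303 s */
    return (double)iterations / total_secs;
}
```

With the values from the log, coremark_score(1500, 17303, 1000.0) evaluates to about 86.69, which matches the reported Iterations/Sec.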
One more note: when I checked related material later, I found that MDK 5 has an optimization option called "Optimize for Time", and checking it raises the CoreMark score. The explanation I found is that in Keil, when "Optimize for Time" is not selected, a local float variable occupies 8 bytes (the compiler promotes it to double by default), whereas with the option selected a local float variable occupies only 4 bytes. In other words, a lot of unnecessary overhead is optimized away, which can greatly improve execution speed. I do not know the details or understand the underlying principle; this is probably compiler-specific behavior. I still wanted to try it, though. With "Optimize for Time" checked, the score increased by more than 50%, as shown in the figure below. ITERATIONS has to be increased in this case; I chose 3500, otherwise the run would last less than 10 seconds.
I also modified the STM32F103 program and checked "Optimize for Time". Its score also increased by more than 50%. Although it still does not reach the officially claimed figure, it is closer than before, as shown below.
------------------------------------------------------------------
CoreMark Test Begin!
Transplant CoreMark programs By DMZ!
CoreMark Test is running, Plase Wait!
2K performance run parameters for coremark.
CoreMark Size: 666
Total ticks : 11323
Total time (secs): 11.323000
Iterations/Sec : 132.473726
Iterations : 1500
Compiler version: ARM Compiler 5.06 update 7 (build 960)
Compiler flags: -g -O3 -Otime
Memory location : STACK
seedcrc : 0xe9f5
[0]crclist : 0xe714
[0]crcmatrix : 0x1fd7
[0]crcstate : 0x8e3a
[0]crcfinal : 0x25b5
Correct operation validated. See readme.txt for run and reporting rules.
CoreMark 1.0: 132.473726 / ARM Compiler 5.06 update 7 (build 960) -g -O3 -Otime / STACK
------------------------------------------------------------------
| CoreMark (same test program, IDE, and related configuration) | Optimize for time unchecked | Optimize for time checked |
|---|---|---|
| AT32F421C8T7 (120 MHz) | 201.3 | 316.9 |
| STM32F103RCT6 (72 MHz) | 86.7 | 132.7 |
From the table above, I can conclude that, within the limits of my own hardware, software, and test boards, the CoreMark performance of the AT32F421 at 120 MHz is more than twice that of the STM32F103 at 72 MHz.
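The "more than twice" figure can be checked with simple arithmetic, and dividing each score by the clock frequency (commonly quoted as CoreMark/MHz) shows how much of the gap is due to the higher clock. The helper names below are hypothetical, and the inputs are the scores from the table above.

```c
#include <assert.h>

/* Score normalised by clock frequency, commonly quoted as CoreMark/MHz. */
double coremark_per_mhz(double score, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return score / clock_mhz;
}

/* How many times higher chip A scored than chip B. */
double score_ratio(double score_a, double score_b)
{
    assert(score_b > 0.0);
    return score_a / score_b;
}
```

With the table values, score_ratio(201.3, 86.7) is about 2.32 and score_ratio(316.9, 132.7) is about 2.39. Per MHz, the AT32F421 comes out at roughly 1.68 and 2.64 versus roughly 1.20 and 1.84 for the STM32F103, so in these runs the AT32F421 is ahead even after the clock difference is factored out.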
That concludes this review. Overall, the Artery AT32F421 offers very good cost performance.
References:
http://mcu.eetrend.com/content/2019/100046454.html
https://www.bilibili.com/read/cv7196931/
https://en.eeworld.com/bbs/thread-610349-1-1.html