How to write code that is good for compiler optimization

Publisher: EE小广播 | Latest update time: 2021-11-09 | Source: EEWORLD | Author: IAR Systems | Keywords: compiler

In embedded development, code size and execution efficiency are critical. Code size determines how much flash and RAM the chip needs, and execution efficiency determines how capable the processor must be. Experienced developers generally want to reduce code size and make code run faster, but how is that done in practice? This article uses the compiler from IAR Systems, an internationally known compiler vendor, as an example to address problems that developers often run into in their daily work. Readers can verify the points made here on the IAR compiler.


For embedded systems, the size and efficiency of the final code depend on the executable code generated by the compiler, not on the source code written by the developer. Well-written source code can, however, help the compiler generate better executable code. Developers should therefore consider not only the overall efficiency of the source code but also the capabilities of the compiler and how easy the code is for it to optimize.


Compilers with optimizing capabilities can generate executable code that is both small and fast. They do so by repeatedly transforming the source code. Most compiler optimizations rest on a well-established mathematical or logical foundation, but some are heuristic: experience has shown that certain code transformations tend to produce better code or open up opportunities for further optimization.


Only in a few cases does optimization depend on compiler magic alone. Most of the time, the way you write the source code determines whether the compiler can optimize the program. In some cases, even a small change to the source code can have a significant impact on the efficiency of the code the compiler generates.


This article covers things to watch for when writing code, but one point should be clear from the start: there is no need to minimize the amount of source code. Using ?: expressions, post-increments and comma expressions to squeeze many side effects into a single expression does not make the compiler generate more efficient code. It only makes the source code obscure and hard to maintain. For example, a post-increment or an assignment in the middle of a complex expression is easily overlooked when reading the code. Write code in an easy-to-read style.


Loops


Can anything go wrong with the following seemingly simple loop?


    for (i = 0; i != n; ++i)
    {
        a[i] = b[i];
    }


Although the loop is not wrong, several things about it affect the efficiency of the code the compiler generates.


For example, the type of an index variable should match that of a pointer.


An array expression such as a[i] is really *(&a[0] + i). At the machine level this means adding the byte offset of the i-th element, i*sizeof(a[0]), to the address of the first element of a. For pointer arithmetic, the index expression should preferably have the same size as the pointer (__far pointers are an exception, because there the pointer and the index expression differ in size). If the index expression does not match, it must be converted to the correct type before it is added to the pointer.
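The relationship between indexing and pointer arithmetic described above can be checked with a small sketch. The helper names index_matches_byte_offset and same_element are hypothetical and not part of the article or of any IAR library; the code only illustrates standard C semantics.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if &a[i] is the address of a[0]
   advanced by i * sizeof(a[0]) bytes, 0 otherwise. */
int index_matches_byte_offset(const int *a, size_t i)
{
    const unsigned char *base;

    assert(a != NULL);
    base = (const unsigned char *)a;    /* address of a[0], counted in bytes */
    return (const unsigned char *)&a[i] == base + i * sizeof a[0];
}

/* Hypothetical helper: a[i] and *(a + i) name the same element,
   because C scales the index by the element size implicitly. */
int same_element(const int *a, size_t i)
{
    assert(a != NULL);
    return &a[i] == a + i;
}
```

Both functions return 1 for any valid index, which is why the compiler has to bring the index to pointer width before it can form the address.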


If stack space (the stack usually resides in RAM) is more valuable in your application than code space (code usually resides in ROM or flash), you can choose a smaller type for the index variable to reduce stack usage. This usually costs code size and execution time, because the code becomes larger and slower. The conversion also hinders optimization of the loop.


The loop condition also deserves attention, because a loop can only be optimized if the number of iterations can be calculated before the loop is entered. This calculation is not as simple as subtracting the initial value from the final value and dividing by the increment. For example, what happens if i is an unsigned char, n is an int, and the value of n is 1000? The variable i wraps around to 0 after reaching 255 and never gets to 1000.


The programmer certainly does not want an infinite loop that repeatedly copies 256 elements from b to a, but the compiler cannot know the programmer's intent. It must assume the worst case and cannot apply optimizations that require the trip count to be known before the loop is entered. You should also avoid the relational operators <= and >= in the loop condition when the final value is a variable. If the loop condition is i <= n, n may be the highest value representable in the type, so the compiler must assume that the loop is potentially infinite.
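The following is a minimal sketch of a loop written with these points in mind, not a definitive recommendation for every target. It assumes that size_t has the same width as a data pointer, which holds on many targets but not necessarily with extended pointer types such as __far. The names copy_bytes and next_index are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies n bytes from b to a.
   The index has the same type as n, so the trip count is simply n
   and no conversion is needed inside the loop. */
void copy_bytes(unsigned char *a, const unsigned char *b, size_t n)
{
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i != n; ++i)
    {
        a[i] = b[i];
    }
}

/* Hypothetical helper: shows why an unsigned char index can never
   reach 1000. With 8-bit chars the increment of 255 wraps to 0. */
unsigned char next_index(unsigned char i)
{
    ++i;
    return i;
}
```

With an index of type unsigned char, next_index(255) yields 0 on a target with 8-bit chars, so a condition such as i != 1000 would never become false.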


Aliases


In general, we do not recommend using global variables. A global variable can be modified from anywhere in the program, and the behavior of the program changes with its value. This creates complex dependencies that make the program hard to understand and make it hard to determine how changing the value of a global variable will affect the program. From the optimizer's perspective the situation is even worse, because the value of any global variable can be changed by a store through a pointer. When a variable can be accessed in more than one way, this is called aliasing, and aliasing makes the code harder to optimize.


    char *buf;

    void clear_buf(void)
    {
        int i;
        for (i = 0; i < 128; ++i)
        {
            buf[i] = 0;
        }
    }


Although the programmer knows that writing to the buffer that buf points to will not change the variable buf itself, the compiler has to assume the worst and reload buf from memory on every iteration of the loop.


You can eliminate the aliasing if you pass the address of the buffer as an argument instead of using a global variable:


    void clear_buf(char *buf)
    {
        int i;
        for (i = 0; i < 128; ++i)
        {
            buf[i] = 0;
        }
    }


With this solution, the pointer buf is not affected by the store through the pointer. As a result, the pointer buf remains unchanged in the loop and its value only needs to be loaded once before the loop, rather than reloaded on each iteration.


If you need to pass information between pieces of code that do not share a caller/callee relationship, global variables are still a reasonable choice. For computationally intensive tasks, however, especially those involving pointer manipulation, it is better to use automatic variables.
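When the global pointer has to stay, one way to follow the advice about automatic variables is to copy the global into a local before the loop. This is a sketch of a common technique rather than something the article prescribes. The names storage and BUF_SIZE are hypothetical and exist only to make the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 128            /* hypothetical name for the 128 used above */

char storage[BUF_SIZE];         /* hypothetical backing storage */
char *buf = storage;            /* global pointer, as in the article */

void clear_buf(void)
{
    char *p = buf;              /* read the global once into an automatic variable */
    int i;

    assert(p != NULL);
    for (i = 0; i < BUF_SIZE; ++i)
    {
        p[i] = 0;               /* a store through p cannot modify p itself */
    }
}
```

Because the address of p is never taken, no store through a pointer can alias it, so the compiler is free to keep it in a register for the whole loop.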

Try not to use post-increment and post-decrement


In the following, everything said about post-increment also applies to post-decrement. The C standard describes the semantics of post-increment as follows: "The result of the postfix ++ operator is the value of its operand. After the result is obtained, the value of the operand is incremented". Microcontrollers commonly have addressing modes that increment a pointer after a load or store operation, but few of them can handle other kinds of post-increment with the same efficiency. To comply with the standard, the compiler may have to copy the operand to a temporary variable before performing the increment. In straight-line code, the increment can be taken out of the expression and placed after it.

For example, the following expression:


    foo = a[i++];

can be changed to

    foo = a[i];
    i = i + 1;


But what happens if the post-increment is part of the condition of a while loop? There is no place to insert the increment after the condition, so the increment must be performed before the test. Constructs like this are common and bear directly on the efficiency of the generated code, and tools such as IAR Systems' Embedded Workbench provide optimizations for them based on extensive practical experience.


For example, the following loop


    i = 0;
    while (a[i++] != 0)
    {
        ...
    }


has to be executed as


    loop:
        temp = i;          /* save the value of the operand */
        i = temp + 1;      /* increment the operand */
        if (a[temp] == 0)  /* use the saved value */
            goto no_loop;
        ...
        goto loop;
    no_loop:

or

    loop:
        temp = a[i];       /* use the value of the operand */
        i = i + 1;         /* increment the operand */
        if (temp == 0)
            goto no_loop;
        ...
        goto loop;
    no_loop:


If the value of i after the loop is not relevant, it is better to put the increment inside the loop. For example, the following almost identical loop


    i = 0;
    while (a[i] != 0)
    {
        ++i;
        ...
    }


can be executed without a temporary variable:


    loop:
        if (a[i] == 0)
            goto no_loop;
        i = i + 1;
        ...
        goto loop;
    no_loop:
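The two source-level loops are called "almost identical" because they leave i at different values when they finish. The sketch below makes that difference visible. The function names scan_postinc and scan_preinc are hypothetical, and the loop bodies are left empty for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Post-increment in the condition: i is incremented even when the
   test fails, so it ends one past the terminating zero. */
size_t scan_postinc(const int *a)
{
    size_t i = 0;

    assert(a != NULL);
    while (a[i++] != 0)
    {
        /* loop body would go here */
    }
    return i;
}

/* Increment inside the loop: i ends at the index of the terminating zero. */
size_t scan_preinc(const int *a)
{
    size_t i = 0;

    assert(a != NULL);
    while (a[i] != 0)
    {
        ++i;
        /* loop body would go here */
    }
    return i;
}
```

For the array {5, 7, 9, 0}, scan_postinc returns 4 and scan_preinc returns 3. The second form is therefore only a drop-in replacement when the value of i after the loop does not matter, or when the code that follows is adjusted accordingly.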


Developers of optimizing compilers are well aware that post-increments complicate code generation. Although we do our best to recognize these patterns and eliminate as many temporary variables as possible, there are always cases where we cannot produce efficient code, especially when the loop condition is more complicated than the ones above. It often helps to split a complex expression into several simpler expressions, just as the loop condition above was split into a test and an increment.


In C++, the choice between pre-increment and post-increment is even more important, because both operator++ and operator-- can be overloaded in prefix and postfix form. Operators overloaded for class objects are not required to mimic the behavior of the built-in operators, but they should stay as close to it as possible. Classes for which incrementing and decrementing objects is intuitive, such as iterators, therefore usually provide both the prefix forms (operator++() and operator--()) and the postfix forms (operator++(int) and operator--(int)).
