The function call stack: a must-have advanced skill for programmers

Latest update time:2022-05-19

Everyone knows that function calls are implemented with a stack, and that a function's local variables live on that stack. The implementation details, however, are often less clear. This article explains how the function stack is implemented on the Linux platform. Some readers may feel there is no need to dig this deep, but in our experience, understanding the underlying principles of the system is a great help when analyzing difficult problems.

Figure 0 Function stack

Just as familiarity with packet capture is a powerful weapon for solving network communication problems, familiarity with the function call stack is a powerful weapon for analyzing program memory problems. This article uses C development on 64-bit Linux as an example to explain how an application's call stack is implemented, and then uses a sample program and GDB to examine the call stack contents of a running program. Before turning to the call stack itself, we first cover some background knowledge that the rest of the article builds on.

X86 CPU registers

CPU registers are background knowledge you need, because on x86-64 systems function parameters are passed mainly through registers. Figure 1 lists the x86 CPU registers with a brief description of what each is used for.

Figure 1 Intel X86 CPU register usage

Intel's CPUs are designed to be backward compatible: programs compiled for an older generation of CPU can run on a newer one. To preserve this compatibility, newer CPUs keep the names of the older registers as aliases. Take the 16-bit register AX as an example: AL refers to its lower 8 bits and AH to its upper 8 bits. When 32-bit CPUs arrived, the 32-bit register was named EAX, and AX remained as the name of its lower 16 bits. In the same way, RAX names the 64-bit register.

Figure 2 Different register names

Application address space

Through virtual memory, the operating system gives every application the same view of its address space. As shown in Figure 3, from top to bottom are the user stack, the shared library memory, the runtime heap, and the code segment. This is a rough division, and the real layout is somewhat more complicated, but the overall pattern is the same.

Figure 3 Application address space

The figure shows that the user stack grows from top to bottom: it occupies high addresses first and then extends toward lower addresses. A general picture is enough for now; the details of the user stack are analyzed later.

Function calls and assembly instructions

To understand the details of the function call stack, we need to understand how function calls are implemented in assembly. A function call has two parts: the call and the return. In assembly language, the call is made with the call instruction and the return with the ret instruction.

The call instruction is equivalent to two steps: 1) push the return address, that is, the address of the instruction following the call (the current IP, or CS and IP for a far call), onto the stack; 2) jump to the target, as the jmp instruction does. Similarly, the ret instruction has two steps: 1) pop the address on top of the stack into the IP register; 2) continue execution from that address. This is the basic principle of function calling.

Besides transferring control, a function call usually needs to pass parameters, and there may be a return value when the function finishes. On 64-bit Linux, this data is passed mainly through registers. Before the call, the first six integer or pointer parameters are placed in RDI, RSI, RDX, RCX, R8 and R9 in that order, and any further parameters are passed on the stack; before the function returns, the result is placed in the RAX register (EAX on 32-bit systems).

Another important point is the pair of stack-related registers used during function calls, RSP and RBP. These two registers record positions in the stack. Their roles are as follows:

RSP: the stack pointer register. It always points to the top of the stack, that is, the top of the topmost (current) stack frame.

RBP: the base pointer register. It points to the bottom (base) of the topmost stack frame.

The register names depend on the architecture. This article uses a 64-bit system, so the registers are RSP and RBP. On a 32-bit system, the names are ESP and EBP.

Application call stack

Let's first look at the overall contents of the function call stack, shown in Figure 4. A function's stack frame mainly contains the function parameter table, the local variable table, the stack base address, and the function return address. The stack base address stored here is the base address of the previous (caller's) stack frame. Because the base pointer register is reused to access the contents of the current function's frame, the caller's base address must first be pushed onto the stack so that it can be restored later.

Figure 4 Overview of function call stack

To facilitate understanding, we take a specific program as an example. This program is very simple. It mainly simulates the function calling relationship and parameter transfer of multiple functions. In addition, two formal parameters are defined in function func_2 to simulate the process of multi-parameter transfer.

Figure 5 Function stack assembly analysis

In this example, the main function calls the func_1 function. Let's start the analysis from main. First look at the C code on the right. The first step is preparing the function parameters. When main calls func_1, the parameters passed in are 1, 2, 3 and 4+g, and the last one has to be computed. Following the dotted line from the red box, we can see the corresponding assembly. In the assembly, the last parameter is processed first, then the second to last, and so on (the order in which function parameters are processed is something to watch for in day-to-day development). We can also see that the registers holding the parameters match those introduced earlier.

After the parameters are prepared, func_1 is called. This is the call func_1 line in the assembly. Although it is a single instruction, it does several things internally, as described earlier in the introduction to the call instruction.

Execution then enters func_1. The very first instruction is pushq %rbp, which pushes RBP onto the function stack. This push, together with the update of RBP that follows it (movq %rsp, %rbp), builds the header of this function's stack frame. From then on, the contents of this stack frame are accessed relative to the frame header (RBP). Next, the parameters are stored on the stack and the local variables are initialized. For the exact layout, see the green and red boxes in Figure 5.

After the function body finishes, the result is placed in the EAX register, and then the leave and ret instructions are executed. The leave instruction deserves an explanation: it is equivalent to the two assembly instructions below. Compare them with the instructions at the function entry and you will see that the two are symmetrical. leave first copies this frame's base address into the stack pointer (step 2 in Figure 6), and then pops the value at the top of the stack into RBP (step 3 in Figure 6). After this, RBP points to the stack frame of the caller again; this is a restore step.

movq %rbp, %rsp
popq %rbp


Figure 6 Function return diagram

In this way, after the function returns, the registers RBP and RSP are switched from the callee's stack frame to the caller's stack frame.

Analyze function call stack through GDB

The above analyzed the function's call stack and stack frame through disassembly. We can also analyze the use of function stacks and stack frames dynamically with GDB. We again use main calling func_1 as the example. Here we set a breakpoint at the entry of func_1 and run the program, which stops at the breakpoint. As shown in Figure 7, we then step through the program and watch the function stack change. We will not go into the details here; it is worth trying yourself.

Figure 7 Function stack change process

The purpose of this article is to give an overall understanding of the function call stack, so that you have more options when troubleshooting programs. Stack-related problems are common in production environments, such as stack overflow caused by too many local variables, or stack corruption caused by memory errors. If you understand how the function stack works, you will have new leads to follow when you meet a so-called inexplicable problem. Often the problem itself is not inexplicable; it only seems so because our knowledge falls short.
