ARM Linux exception handling data abort 2

Publisher: PeacefulSoul | Latest update time: 2016-06-20 | Source: eefocus | Keywords: ARM
As described in the previous article, normal handling of a data abort eventually calls the do_DataAbort function. This article analyzes how that function works.

do_DataAbort

asmlinkage void __exception do_DataAbort(
      unsigned long addr,     // memory address that caused the exception
      unsigned int fsr,       // CP15 fault status register value when the exception occurred, see the previous article
      struct pt_regs *regs)   // register values saved before the exception occurred
{
      const struct fsr_info *inf = fsr_info + (fsr & 15) + ((fsr & (1 << 10)) >> 6);
      struct siginfo info;    // filled in below and passed to arm_notify_die

      // a handler that returns 0 has resolved the abort
      if (!inf->fn(addr, fsr, regs))
             return;

      info.si_signo = inf->sig;
      info.si_errno = 0;
      info.si_code  = inf->code;
      info.si_addr  = (void __user *)addr;
      arm_notify_die("", regs, &info, fsr, 0);
}

When handling a data abort, do_DataAbort first derives the cause of the abort from the value of fsr: the low four bits of fsr, plus bit 10 shifted down to bit 4, form an index from 0 to 31. It uses this index to select a struct fsr_info entry from the global array fsr_info, then calls the fn function in that entry. If fn returns 0, the abort has been handled and do_DataAbort returns. If fn returns a nonzero value, arm_notify_die is called.
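The index arithmetic can be checked in isolation. The following is a minimal user-space sketch, not kernel code; the helper name fsr_to_index is hypothetical and only repeats the expression used in do_DataAbort.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a kernel function): repeats the index
 * expression from do_DataAbort so it can be tried in user space.
 */
static unsigned int fsr_to_index(unsigned int fsr)
{
    unsigned int low  = fsr & 15;                 /* low four status bits */
    unsigned int high = (fsr & (1u << 10)) >> 6;  /* bit 10 contributes 16 */
    unsigned int idx  = low + high;

    assert(idx < 32);  /* fsr_info has 32 entries, indices 0 to 31 */
    return idx;
}
```

For example, fsr_to_index(0x005) gives 5 and fsr_to_index(0x40F) gives 31. Bits of fsr other than the low four and bit 10 do not affect the index.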

arm_notify_die

First, consider the simpler case, in which the fn in the fsr_info entry cannot resolve the abort (it returns a nonzero value) and arm_notify_die is called:

void arm_notify_die(const char *str, struct pt_regs *regs,
             struct siginfo *info, unsigned long err, unsigned long trap)
{
      if (user_mode(regs)) {
             // ...
             force_sig_info(info->si_signo, info, current);
      } else {
             die(str, regs, err);
      }
}

This function first uses user_mode to determine whether the abort occurred in user mode or kernel mode by examining the mode bits of the saved cpsr register. ARM encodes user mode as 0b10000 in the mode field, so user_mode tests whether the low four mode bits are all 0.
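The mode test can be sketched in user space as follows. The function name cpsr_is_user_mode is a hypothetical stand-in for the kernel's user_mode(regs) check, and the enum lists the standard ARM 32-bit mode encodings for comparison.

```c
#include <assert.h>

/* ARM CPSR mode-field encodings, M[4:0], for the 32-bit modes. */
enum {
    MODE_USR = 0x10, MODE_FIQ = 0x11, MODE_IRQ = 0x12, MODE_SVC = 0x13,
    MODE_ABT = 0x17, MODE_UND = 0x1b, MODE_SYS = 0x1f
};

static_assert((MODE_USR & 0xf) == 0, "user mode has its low four mode bits clear");

/*
 * Hypothetical stand-in for user_mode(regs): takes the saved CPSR value
 * directly instead of a struct pt_regs pointer.
 */
static int cpsr_is_user_mode(unsigned long cpsr)
{
    return (cpsr & 0xf) == 0;
}
```

A saved CPSR of 0x60000010 (user mode with some condition flags set) passes the test, while 0x00000013 (SVC mode) does not.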

If the abort occurred in user mode, a signal is forcibly sent to the task that caused it (note that the task here may be a thread). Which signal is sent is determined by the value defined in the struct fsr_info entry. Generally it is a signal that terminates the process, such as SIGSEGV. (A signal such as SIGSEGV terminates the entire process even when it is sent to a single thread. For details, see the get_signal_to_deliver function.)

If the abort occurred in kernel mode, the die function is called, which is the kernel's standard function for handling an oops.

 

fsr_info

The fsr_info array is defined in fault.c. For each possible reason for data abort, there is a corresponding fsr_info structure.

static struct fsr_info fsr_info[] = {
      { do_bad,               SIGSEGV, 0,           "vector exception"                },
      // ...
      { do_translation_fault, SIGSEGV, SEGV_MAPERR, "section translation fault"       },
      { do_bad,               SIGBUS,  0,           "external abort on linefetch"     },
      { do_page_fault,        SIGSEGV, SEGV_MAPERR, "page translation fault"          },
      { do_bad,               SIGBUS,  0,           "external abort on non-linefetch" },
      { do_bad,               SIGSEGV, SEGV_ACCERR, "section domain fault"            },
      { do_bad,               SIGBUS,  0,           "external abort on non-linefetch" },
      { do_bad,               SIGSEGV, SEGV_ACCERR, "page domain fault"               },
      { do_bad,               SIGBUS,  0,           "external abort on translation"   },
      { do_sect_fault,        SIGSEGV, SEGV_ACCERR, "section permission fault"        },
      { do_bad,               SIGBUS,  0,           "external abort on translation"   },
      { do_page_fault,        SIGSEGV, SEGV_ACCERR, "page permission fault"           },
      { do_bad,               SIGBUS,  0,           "unknown 16"                      },
      // ...
      { do_bad,               SIGBUS,  0,           "unknown 30"                      },
      { do_bad,               SIGBUS,  0,           "unknown 31"                      }
};

Most entries in fsr_info use the do_bad function. do_bad simply returns 1, so do_DataAbort goes on to call arm_notify_die as described above.

Four kinds of abort have dedicated handlers in fsr_info:

• "section translation fault": do_translation_fault
  The first-level (section) descriptor is invalid, that is, the second-level page table cannot be found.

• "page translation fault": do_page_fault
  The page table entry is invalid, that is, the virtual address has no corresponding physical address.

• "section permission fault": do_sect_fault
  The access permissions of a section mapping do not allow the access.

• "page permission fault": do_page_fault
  The access permissions of a page mapping do not allow the access.
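The table-driven dispatch described above can be modeled in user space. This is a simplified sketch under stated assumptions: every name prefixed with model_ is hypothetical, the table holds only three illustrative entries rather than the kernel's 32, and the handlers are stubs that only return a status.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical, reduced version of struct fsr_info. */
struct model_fsr_info {
    int (*fn)(unsigned long addr);  /* returns 0 when the abort is resolved */
    int sig;                        /* signal reported when it is not */
    const char *name;
};

/* Like do_bad: never resolves anything, always returns 1. */
static int model_do_bad(unsigned long addr) { (void)addr; return 1; }

/* Stub for a dedicated handler that resolves the abort. */
static int model_resolve(unsigned long addr) { (void)addr; return 0; }

static const struct model_fsr_info model_table[] = {
    { model_do_bad,  SIGSEGV, "vector exception" },
    { model_resolve, SIGSEGV, "page translation fault" },
    { model_do_bad,  SIGBUS,  "external abort on non-linefetch" },
};

#define MODEL_TABLE_LEN (sizeof(model_table) / sizeof(model_table[0]))

/* Returns 0 when the handler resolved the abort, else the signal to report. */
static int model_data_abort(size_t idx, unsigned long addr)
{
    assert(idx < MODEL_TABLE_LEN);
    const struct model_fsr_info *inf = &model_table[idx];

    if (!inf->fn(addr))
        return 0;
    return inf->sig;
}
```

In this model, entries backed by model_do_bad always fall through to the signal path, which mirrors how do_bad entries lead to arm_notify_die.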

 

Section permission fault do_sect_fault

The do_sect_fault function calls do_bad_area directly and returns 0, so arm_notify_die is not involved. do_bad_area checks whether the abort occurred in user mode. If so, it calls the __do_user_fault function; otherwise, it calls the __do_kernel_fault function.

void do_bad_area(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
{
      struct task_struct *tsk = current;
      struct mm_struct *mm = tsk->active_mm;

      if (user_mode(regs))
             __do_user_fault(tsk, addr, fsr, SIGSEGV, SEGV_MAPERR, regs);
      else
             __do_kernel_fault(mm, addr, fsr, regs);
}

__do_user_fault sends a signal to the current thread.

__do_kernel_fault is more involved:

• It calls fixup_exception to attempt a fixup. The details of fixup can be found in the kernel document exception.txt. Fixup handles cases such as an invalid address argument passed to a function like get_user.

• If the fault cannot be fixed up, it calls the die function to handle the oops.

• If there is no process context, the kernel panics inside the oops of the previous step. Execution continuing past die therefore means a process is associated with the fault, so do_exit(SIGKILL) is called to exit that process, and SIGKILL is recorded in the exit_code field of task_struct.
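The fixup step can be pictured as a lookup in a table that pairs the address of an instruction allowed to fault with the address of its recovery code. The following user-space sketch illustrates that idea only; the model_ names and all addresses are hypothetical, and the real kernel table is generated at build time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an exception table entry. */
struct model_ex_entry {
    unsigned long insn;   /* address of an instruction that may fault */
    unsigned long fixup;  /* address to resume at if it does */
};

/* Illustrative entries, sorted by insn so a binary search works. */
static const struct model_ex_entry model_ex_table[] = {
    { 0x1000UL, 0x9000UL },
    { 0x1040UL, 0x9010UL },
    { 0x2000UL, 0x9020UL },
};

static const struct model_ex_entry *model_search(unsigned long pc)
{
    size_t lo = 0;
    size_t hi = sizeof(model_ex_table) / sizeof(model_ex_table[0]);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (model_ex_table[mid].insn == pc)
            return &model_ex_table[mid];
        if (model_ex_table[mid].insn < pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}

/* Returns 1 and redirects *pc when an entry exists, else returns 0. */
static int model_fixup_exception(unsigned long *pc)
{
    assert(pc != NULL);
    const struct model_ex_entry *e = model_search(*pc);

    if (e == NULL)
        return 0;
    *pc = e->fixup;
    return 1;
}
```

When the lookup succeeds, execution resumes at the fixup address, where a function such as get_user can return an error code. When it fails, the model returns 0, which corresponds to the path that ends in die.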

Section translation fault do_translation_fault

The do_translation_fault function first checks whether the address that caused the abort is in user space.

If it is a user-space address, do_page_fault is called, and processing is the same as for a page translation fault or a page permission fault.

If it is a kernel-space address, the function checks whether init_mm has a valid first-level entry (the pointer to the second-level page table) for the address. If it does, that entry is copied into the first-level page table of the current process; otherwise, do_bad_area is called for processing (which may invoke fixup).
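The three outcomes can be modeled with two arrays that stand in for first-level page tables. This is a deliberately simplified sketch: the model_ names are hypothetical, the 0xC0000000 user/kernel split is an assumption (the real boundary depends on the kernel configuration), and each entry covers 1 MiB, which does not reproduce how Linux organizes its first-level tables.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_TASK_SIZE 0xC0000000u  /* assumed user/kernel boundary */
#define MODEL_ENTRIES   4096u        /* one entry per 1 MiB of a 4 GiB space */

enum model_result { MODEL_PAGE_FAULT, MODEL_COPIED, MODEL_BAD_AREA };

/* Stand-ins for the first-level tables; an entry of 0 means "not present". */
static uint32_t model_init_pgd[MODEL_ENTRIES];  /* master table (init_mm) */
static uint32_t model_task_pgd[MODEL_ENTRIES];  /* current process's table */

static enum model_result model_translation_fault(uint32_t addr)
{
    /* User-space address: handled like an ordinary page fault. */
    if (addr < MODEL_TASK_SIZE)
        return MODEL_PAGE_FAULT;

    uint32_t index = addr >> 20;
    assert(index < MODEL_ENTRIES);

    /* Kernel address with no mapping in the master table: bad area. */
    if (model_init_pgd[index] == 0)
        return MODEL_BAD_AREA;

    /* Kernel address known to the master table: copy the entry over. */
    model_task_pgd[index] = model_init_pgd[index];
    return MODEL_COPIED;
}
```

After the copy, retrying the faulting access in this model would find the entry in the process's own table, which is why the handler can simply return.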

My personal understanding of the logic for handling section translation faults is as follows (not guaranteed to be completely accurate):
Apart from the fixup case, Linux takes a section translation fault for two reasons: a fault on a virtual address mapped in user space, or a fault on a virtual address allocated in the kernel by vmalloc. The handling of user-space addresses is easy to understand. For kernel addresses, the vmalloc implementation shows that the mappings for the virtual range it allocates are stored in the global variable init_mm, so the second-level page table for any range produced by vmalloc should be found in init_mm. (init_mm is the kernel's mm_struct, which manages the memory mappings of the entire kernel.)

This also shows that an access to a vmalloc address may raise two exceptions: first a section translation fault, which installs the second-level page table, and then a page translation fault, which allocates real physical pages for the virtual range.

For details about init_mm, see http://my.chinaunix.net/space.php?uid=25471613&do=blog&id=323374. In short, when the kernel page tables change, only init_mm, the kernel page table of the init process, is changed, and other processes update their own copies of the kernel page tables from init_mm through page fault exceptions.

Page translation fault do_page_fault

Page permission fault do_page_fault

do_page_fault performs the actual physical page allocation. Stack expansion, mmap support, and similar work are also handled here. To allocate a physical page, the call chain do_anonymous_page->...->__rmqueue is followed, and __rmqueue implements the buddy algorithm for physical page allocation.

If there are not enough physical pages and the allocation fails:

• For an abort in kernel mode, __do_kernel_fault is called, as in the handling of a section permission fault.

• For an abort in user mode, do_group_exit is called to exit the process to which the task belongs.

When a user program requests memory and the library's own memory pool cannot satisfy the request, the library calls the brk system call to ask the kernel to expand the heap. At that point only the virtual address space is expanded. The kernel allocates physical pages, through data aborts, only when the new address range is actually accessed. A non-NULL return from malloc in user space therefore means only that virtual address space was obtained; physical memory may not yet be mapped to it. If the real physical allocation later fails, the process still exits directly because of insufficient resources.
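The gap between obtaining address space and obtaining physical pages can be seen from user space by separating the allocation from the first write to each page. The following is a minimal sketch, assuming a POSIX system; the helper name touch_each_page is hypothetical, and whether a given write actually triggers a fault depends on the allocator and the kernel.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocates len bytes, then writes one byte in each
 * page. Returns the number of pages written, or 0 if malloc fails.
 */
static size_t touch_each_page(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    /* A non-NULL result only means address space was obtained. */
    unsigned char *buf = malloc(len);
    if (buf == NULL)
        return 0;

    volatile unsigned char *p = buf;  /* keep the writes from being optimized out */
    size_t touched = 0;

    for (size_t off = 0; off < len; off += (size_t)page) {
        p[off] = 1;  /* the first write to a page may fault it in */
        touched++;
    }

    free(buf);
    return touched;
}
```

The writes inside the loop are where a physical allocation failure would surface, so the NULL check after malloc cannot guard against it.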
