ARM big.LITTLE architecture multi-core scheduling algorithm under Linux and Android kernel

Publisher: 清新微笑. Latest update time: 2016-07-14. Source: eefocus.
In 2013, the big.LITTLE family was expanded with new SoC implementations, including ARM's TC2 reference test chip with 2 Cortex-A15 + 3 Cortex-A7 cores and the Samsung LSI "Octa-core" chip with 4 Cortex-A15 + 4 Cortex-A7 cores used in the Samsung Galaxy S4. Linaro has made many optimizations to the Linux and Android kernels for ARM's big.LITTLE SoCs, covering load control, performance, and power management, to improve the energy efficiency of multi-core workloads and extend standby time. The most recent multi-core task scheduling methods are the In-Kernel Switcher (IKS, also called CPU Migration) and Global Task Scheduling (GTS, also called big.LITTLE MP).

Figure 1. Multi-core task scheduling algorithm for big.LITTLE SoCs

The early big.LITTLE software models used cluster migration or CPU migration: software switched work between big and LITTLE cores, but could not run all cores at the same time. The latest software model, Global Task Scheduling (GTS), can enable all cores at the same time and directly controls how threads are allocated to cores. In the migration models, switching between big and LITTLE cores is driven by dynamic voltage and frequency scaling (DVFS), which moves tasks between the high-voltage big cores and the low-voltage LITTLE cores to improve energy efficiency under varying load. The task switching time between cores is 30 microseconds, and the DVFS driver evaluates the OS and the cores every 50 microseconds. GTS instead balances load according to the load of each thread. All of these scheduling algorithms run at the kernel level, so user applications do not need to be modified.

CPU Migration: IKS (In-Kernel Switcher)

IKS is a switching scheme developed by Linaro for symmetric Cortex-A7 and Cortex-A15 clusters. Each Cortex-A7/Cortex-A15 pair is presented to the Linux kernel as one virtual symmetric core. Within a pair the two cores are mutually exclusive: a thread runs either on the high-performance Cortex-A15 or on the low-power Cortex-A7, so peak performance is limited to what the Cortex-A15 cores alone can deliver. IKS has already been implemented in the Linux kernel and is easy to test and productize.
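The pairing idea can be illustrated with a small Python sketch. It is only a model of the behaviour described above, not Linaro's implementation: the `VirtualCpu` class, the `set_target` method, and the two frequency tables are hypothetical names and values chosen for illustration. The sketch assumes that a frequency request from the governor selects an operating point, and that the operating point in turn decides which physical core of the pair is active.

```python
from dataclasses import dataclass

# Hypothetical operating points in MHz, for illustration only.
LITTLE_FREQS = [200, 400, 600, 800, 1000]   # served by the Cortex-A7
BIG_FREQS = [1200, 1400, 1600]              # served by the Cortex-A15


@dataclass
class VirtualCpu:
    """One A7/A15 pair, seen by the kernel as a single logical CPU."""
    pair_id: int
    active_core: str = "A7"          # only one core of the pair runs at a time
    freq_mhz: int = LITTLE_FREQS[0]

    def set_target(self, requested_mhz: int) -> None:
        """Pick the lowest operating point that satisfies the request."""
        table = LITTLE_FREQS + BIG_FREQS
        chosen = next((f for f in table if f >= requested_mhz), table[-1])
        new_core = "A7" if chosen in LITTLE_FREQS else "A15"
        if new_core != self.active_core:
            # A real switcher would save and restore CPU state here.
            self.active_core = new_core
        self.freq_mhz = chosen


cpu = VirtualCpu(pair_id=0)
cpu.set_target(300)     # light load: stays on the Cortex-A7
cpu.set_target(1300)    # heavy load: the pair switches to the Cortex-A15
```

Because the switch is hidden behind a frequency request, the rest of the kernel only ever sees one CPU per pair, which is consistent with the small kernel footprint attributed to IKS.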

Figure 2. IKS (4+4) and Cortex-A7 and Cortex-A15 core group architecture

Global Task Scheduling (big.LITTLE MP)

The GTS algorithm, developed by ARM, is called big.LITTLE MP at Linaro. With this algorithm, all big and LITTLE cores are visible to the Linux kernel for task scheduling. Recent Linaro builds include this scheduling algorithm.

Figure 3. GTS (4+4) and Cortex-A7 and Cortex-A15 core group architecture

Compared with the IKS algorithm, the GTS algorithm has the following advantages:

  • Finer-grained inter-core load control: the scheduler can move tasks between cores directly, which reduces extra overhead and therefore power consumption;
  • Faster decisions: an implementation inside the scheduler reacts more quickly than one based on the cpufreq framework, giving a performance improvement of about 10% over IKS;
  • Support for asymmetric configurations, such as 2 Cortex-A15 cores plus 4 Cortex-A7 cores;
  • All cores can contribute to peak processing power at the same time, for example the 4 Cortex-A15 cores plus the 4 Cortex-A7 cores in Figure 3.
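The difference in peak capacity can be made concrete with a little arithmetic. The Python sketch below is illustrative only: the function name `peak_capacity` and the relative per-core performance figures (a Cortex-A7 counted as half a Cortex-A15) are assumptions, not measured values. It also assumes that IKS can use only as many cores as there are complete A15/A7 pairs.

```python
def peak_capacity(n_big, n_little, big_perf=1.0, little_perf=0.5):
    """Return (iks_peak, gts_peak) in arbitrary units.

    big_perf and little_perf are hypothetical relative per-core figures.
    """
    # IKS: one active core per A15/A7 pair, so the peak is all pairs on A15.
    pairs = min(n_big, n_little)
    iks_peak = pairs * big_perf
    # GTS: every core can run at once.
    gts_peak = n_big * big_perf + n_little * little_perf
    return iks_peak, gts_peak


print(peak_capacity(4, 4))   # symmetric 4+4 configuration
print(peak_capacity(2, 4))   # asymmetric 2+4 configuration
```

Under these assumed figures the 4+4 configuration peaks at 4.0 units with IKS and 6.0 units with GTS, and the 2+4 configuration at 2.0 and 4.0 units respectively.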

The big.LITTLE MP kernel patch maintains a list of the Cortex-A15 and Cortex-A7 cores available to handle tasks, tracks each task using historical load statistics, and migrates tasks between cores accordingly. Tasks with high processing demand are moved to the Cortex-A15 cores, while tasks with low processing demand are moved to the low-power Cortex-A7 cores.
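A minimal sketch of this kind of load-based placement is shown below in Python. It is not the kernel patch itself: the `Task` class, the two thresholds, and the decay factor are hypothetical, and the historical load is modelled as a simple exponentially decayed average of how busy the task was in each sampling period.

```python
from dataclasses import dataclass

# Hypothetical tuning values, for illustration only.
UP_THRESHOLD = 0.7      # tracked load above this moves a task to an A15
DOWN_THRESHOLD = 0.3    # tracked load below this moves it back to an A7
DECAY = 0.5             # weight given to the previous history


@dataclass
class Task:
    name: str
    cluster: str = "A7"          # new tasks start on the low-power cluster
    tracked_load: float = 0.0    # decayed history of the task's load

    def update(self, runnable_fraction: float) -> None:
        """Fold one sampling period into the history, then re-place the task."""
        self.tracked_load = (DECAY * self.tracked_load
                             + (1 - DECAY) * runnable_fraction)
        if self.cluster == "A7" and self.tracked_load > UP_THRESHOLD:
            self.cluster = "A15"     # heavy task: migrate up
        elif self.cluster == "A15" and self.tracked_load < DOWN_THRESHOLD:
            self.cluster = "A7"      # light task: migrate down


task = Task("render")
for busy in (1.0, 1.0, 0.0, 0.0):
    task.update(busy)
    print(task.cluster, task.tracked_load)
```

Using two separate thresholds gives the placement some hysteresis, so a task whose load hovers around a single cut-off does not bounce between clusters on every sample.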

Table 1. Comparison of big.LITTLE IKS vs big.LITTLE MP (GTS) kernel scheduling algorithms

| Item                        | big.LITTLE IKS (CPU Migration)                           | big.LITTLE MP (GTS)                                                     |
|-----------------------------|----------------------------------------------------------|-------------------------------------------------------------------------|
| Core configuration          | Cortex-A15 + Cortex-A7 cores                             | Any number of Cortex-A15 + Cortex-A7 cores, able to run simultaneously  |
| Impact on the kernel        | Minimal modification; changes only apply to the governor | Many kernel changes, including the scheduler, process annotation, etc.  |
| Maximum processing capacity | All Cortex-A15 cores                                     | All Cortex-A15 cores + Cortex-A7 cores                                  |
| Task switching              | Based on the cpufreq framework                           | Uses the scheduler directly; 10% performance improvement                |
| Availability                | Available in Linaro's monthly builds                     | Available in Linaro's monthly builds                                    |
| Kernel.org                  | Expected in 3.11 or 3.12                                 | To be uploaded in the next few quarters                                 |

The scheduling algorithms above use SoCs with Cortex-A15 + Cortex-A7 cores as the example, but more big.LITTLE SoCs may adopt ARM's newer Cortex-A57 + Cortex-A53 architecture.

Summary

The big.LITTLE task scheduling algorithms described above are already available in Linaro builds, and some have been evaluated in real systems. For example, Samsung's latest Galaxy S4 uses an 8-core system with 4 Cortex-A15 cores and 4 Cortex-A7 cores, which has adopted a cluster-based migration algorithm. Even the least economical cluster migration algorithm has proven its superior energy efficiency in Qualcomm's multi-core Snapdragon system. In Exynos 5, Samsung has already achieved Cortex-A15-level performance at Cortex-A7-level energy consumption.
