introduction
μC/OSII's ready table setting, clearing, and searching algorithms are highly efficient and cross-platform programs. It uses two search arrays, OSMapTbl[8] and OSUnMapTbl[256], to increase the speed of searching the ready table and obtain the highest priority of the ready task as quickly as possible.
CortexM3 is a newer architecture version of ARM, mainly used in the field of single-chip microcomputers. More and more 32-bit chips are produced based on it; CortexM3 only supports Thumb2 instruction set, which can achieve better performance between performance and code density.
1 Reasons and pros and cons of changing the algorithm on ARM
Since the ready list operation is run in the interrupt-disabled state, its execution affects the interrupt response time of the system. Therefore, the efficiency of the ready list operation algorithm is one of the benchmarks for measuring the quality of the real-time operating system.
In the instruction set used by CortexM3, some instruction functions cannot be underestimated, such as leading zero count clz, word bit inversion rbit, bit clear bic. Among them, clz and bic point out another way for the high priority acquisition algorithm of μC/OS ready table.
(1) Advantages after the change
① Save storage space. The search arrays OSMapTbl[8] and OSUnMapTbl[256] are no longer used. The purpose of setting up these two arrays is to improve the efficiency of searching the ready table.
② Improve search efficiency. clz is a single-cycle instruction that uses addition instructions with shifts to significantly shorten the calculation time.
③ The number of tasks supported by μC/OSII has been increased from 64 to 1,024 (the number of tasks supported by version 2.84 has reached 256, but the efficiency has decreased).
(2) Existing deficiencies
① Realview MDK (version 3.20 and its instruction simulator are used here) does not yet support the use of Thumb2 instruction inline assembly in C language programs. When using inline assembly functions, function calls (jump returns) reduce execution efficiency.
② C language does not support the clz instruction enough, so the new algorithm is not cross-platform. However, given that ARM chips are widely used and the instruction is widely compatible with chips after ARM9, the application space is still broad.
2μC/OSII Ready Table Algorithm Introduction and Specific Changes
The μC/OSII ready table is an array. The value of a bit in the array element (1 or 0) corresponds to whether a task is ready or not. The position of the bit in the array indicates the priority of the task. When the highest priority task that is ready needs to be scheduled to run, the task is searched in the ready table.
2.1 Introduction to μC/OSII Ready List Algorithm
One solution is to determine whether each item in the array is 0 in turn: if > 0, enter the item and search for the 1-bit position with the smallest weight; if = 0, add a base to the priority and check the next item until the priority is found.
μC/OSII is one step ahead and sets a variable to determine whether each item in the ready table is 0, called the ready table group. A bit in the ready table group is 0 or 1, corresponding to whether the value of an item in the ready table is 0. By finding the position of the least weighted bit in the ready table group that is set to 1, the index corresponding to the first ready table item > 0 is determined, thereby avoiding loops and greatly improving efficiency.
2.2 Modification method and source code
The clz algorithm adopts the idea of μC/OSII and optimizes it by using the clz instruction. The difference is that clz searches from right to left, and the high-weight bit in the binary corresponds to the high priority, while the priority of μC/OSII is small.
Considering that sometimes not many tasks are used, it is a waste to use an array as a ready list. Therefore, when the total number of tasks is less than 32, a 32-bit unsigned integer variable is used as the ready list. Note that the ready list group variable OSRdyGrp is used as the ready list at this time.
The constant OS_LES_TSK indicates whether to use a smaller number of tasks. 0 indicates using a maximum of 32 tasks, and 1 indicates using a maximum of 1,024 tasks.
The constant RdySt sets the most significant bit of a 32-bit integer to 1 for shifting.
2.3 C language implementation
The following algorithm is written using a function with an embedded clz instruction to implement the setting and clearing of tasks with a specified priority in the ready list, and to search for the highest priority of the ready tasks in the ready list.
#defineOS_LES_TSK 0
#defineRdySt #0x80000000
intOSRdyGrp ; //Task ready list group, if the number of tasks is less than 32, it will be used as a ready list
intOSRdyTbl[32] ;//Task ready table
//Set the task with priority prio in the ready list
void OSRdySET(unsigned shortprio) {
prio=prio; //Prevent some compilers from reporting errors
//When the total number of tasks is <32, OSRdyTbl[] will not be compiled to improve efficiency
#ifOS_LES_TSK
OSRdyGrp|=(RdySt>>prio);
#else
//The number of tasks is between 1 and 1 024, so scheduling will take more time
//prio>>5: The position of the bit that needs to be set to 1 in the task group OSRdyGrp,
//That is, OSRdyTbl[] subscript
OSRdyGrp|=(RdySt>>(prio>>5));
//prio & 31: bitwise AND, take the lower 5 bits of prio, set the array element
OSRdyTbl[prio>>5]|=(RdySt>>(prio & 31));
#endif
}
// Clear the task with priority prio in the ready list
void OSRdyCLR(unsigned shortprio) {
prio=prio;
#ifOS_LES_TSK //When the total number of tasks < 32
OSRdyGrp&=~(RdySt>>prio);
#else
//The number of tasks is between 1 and 1 024, so scheduling will take more time
//prio>>5: The position of the bit that needs to be cleared in the ready table group OSRdyGrp,
//That is, the subscript of the ready table OSRdyTbl[]
//prio & 31: bitwise AND, take the lower 5 bits of prio, set the array element
OSRdyTbl[prio>>5] &= ~(RdySt>>(prio & 31));
//If an item in the ready table OSRdyTbl[] is zero, clear the corresponding
//One of the thread table groups
if (OSRdyTbl[prio>>5]==0 )
OSRdyGrp &= ~(RdySt>>(prio>>5));
#endif
}
//Leading zero count: embedded assembly function, may reduce execution efficiency
__asm unsigned shortClzGet(intOSRdyGrp) {
clz.w r0,r0
bx lr //Can return even if not written, see below for details
}
//Get the highest priority of the task ready group/table, with leading zero count
//Function calls reduce efficiency. You can write the following statement in another function.
unsignedshortOSPrioHighRdy() {
unsignedshorty;
#ifOS_LES_TSK
//If the number of tasks is less than 32, you can consider defining y as an 8-bit unsigned integer
//unsignedchar y;
y = ClzGet(OSRdyGrp);
#else
//The number of tasks is between 1 and 1 024, so scheduling will take more time
y = ClzGet(OSRdyGrp);
y = ClzGet (OSRdyTbl [y]) + (y << 5);
#endif
returny;
}
// Check if the execution result is correct, 33 is a suitable value for checking
intmain() {
unsigned shortprio=0;
while (prio < 1024) {
OSRdySET (prio); //Set the task ready table bit and group bit according to the priority
prio = OSPrioHighRdy(); //Get the highest priority of the task in the ready list
OSRdyCLR (prio); //Clear the corresponding
//Ready table position, group position
prio+=33;
if (prio>1023)
prio=0;
}
}
Some materials require that bx r14 in the program must be written, but after checking the disassembled code, the compiler has already added it. It seems that the compiler has been upgraded. Whether there will be an error depends on the compiler used. It is recommended to write it according to the specification. Since the return of the embedded function call is time-consuming, the search algorithm cannot be fully utilized. It is necessary to improve the compiled assembly code to achieve higher efficiency, or rewrite this part of the program using assembly code.
2.4THUMB2 assembly instruction implementation
Tips for writing programs in assembly language: In the setup and clearing functions of the highest priority task, the C language operator "|= " is equivalent to the assembly instruction "orr", and "&= ~" is equivalent to the assembly instruction "bic". Both instructions can perform pre-shift operations, greatly improving execution efficiency. You can check the disassembled source code to see if the C compiler takes advantage of this convenience.
In the search function, the inline assembly call in the C language program can be omitted to reduce redundant instructions. The pseudo code is as follows:
ldr r0,=OSRdyGrp; load ready table group variable OSRdyGrp address
ldr r1, [r0]; Get the value of the ready table group variable OSRdyGrp
#ifOS_LES_TSK
clz.w r0, r1; take the highest priority
#else
clz..w r0, r1; Get the index of the ready list entry with the highest priority
ldr r2,=OSRdyTbl; load ready table first address
add.wr2, r2, r0, #2; Calculate the address of the ready list entry with the highest priority
ldr r3, [r2]; get the value of this item
Clz.w r1,r3; take the highest priority in this item
add.w r0,r1,r0 ,lsl #5; calculate the highest priority
#endif
bxr14
It can be seen that, in addition to the data loading instructions, the core algorithm of the search has only 3 instructions (only 1 instruction when using <32 tasks). However, when actually designing the algorithm, it is also necessary to consider the instruction pipeline stall to achieve the best effect.
2.5μC/OSII 2.84 source code introduction
The following is the translated and sorted μC/OSII priority search algorithm source code (version 2.84). The longer comments are added algorithm descriptions.
/************************************
* Find the highest priority task in the ready list to run
*Description: This function is called by other μC/OSII services to ensure that the ready list
*The highest priority task is executed first, and the global variable "OSPrioHighRdy" is also
*Changes
* Tips: ① This is an internal function of μC/OSII, user programs should not call it
Previous article:Design of network fingerprint access control system based on ARM and POE
Next article:A Brief Analysis of Embedded Operating Systems
- Popular Resources
- Popular amplifiers
- Learn ARM development(16)
- Learn ARM development(17)
- Learn ARM development(18)
- Embedded system debugging simulation tool
- A small question that has been bothering me recently has finally been solved~~
- Learn ARM development (1)
- Learn ARM development (2)
- Learn ARM development (4)
- Learn ARM development (6)
Professor at Beihang University, dedicated to promoting microcontrollers and embedded systems for over 20 years.
- LED chemical incompatibility test to see which chemicals LEDs can be used with
- Application of ARM9 hardware coprocessor on WinCE embedded motherboard
- What are the key points for selecting rotor flowmeter?
- LM317 high power charger circuit
- A brief analysis of Embest's application and development of embedded medical devices
- Single-phase RC protection circuit
- stm32 PVD programmable voltage monitor
- Introduction and measurement of edge trigger and level trigger of 51 single chip microcomputer
- Improved design of Linux system software shell protection technology
- What to do if the ABB robot protection device stops
- Microchip Accelerates Real-Time Edge AI Deployment with NVIDIA Holoscan Platform
- Microchip Accelerates Real-Time Edge AI Deployment with NVIDIA Holoscan Platform
- Melexis launches ultra-low power automotive contactless micro-power switch chip
- Melexis launches ultra-low power automotive contactless micro-power switch chip
- Molex leverages SAP solutions to drive smart supply chain collaboration
- Pickering Launches New Future-Proof PXIe Single-Slot Controller for High-Performance Test and Measurement Applications
- Apple faces class action lawsuit from 40 million UK iCloud users, faces $27.6 billion in claims
- Apple faces class action lawsuit from 40 million UK iCloud users, faces $27.6 billion in claims
- The US asked TSMC to restrict the export of high-end chips, and the Ministry of Commerce responded
- The US asked TSMC to restrict the export of high-end chips, and the Ministry of Commerce responded
- vscode for web released
- Mir MYC-YT507 development board review: Performance test 3: Storage performance test
- What kind of chip is my fake stm32f103c8t6
- How are components connected in a PCB?
- [NXP Rapid IoT Review] Is your kit still working properly?
- On Double 11, are there any SSDs with SATA interface that I can buy?
- [LPC] [Help] I have a problem now. I want to use the LPC1768 MCU hardware I2C interrupt to drive the OLED
- Looking for Samsung DDR3eMMC Datasheet
- How to log in to a website account using ESP32?
- Taking the core-G1 [STM32F103C8T6] core board marquee experiment introductory series 1 project as an example, this paper introduces STM3...