System efficiency
Phenomenon 1: This 100 MHz CPU can only handle 70% of the load; switching to a 200 MHz part will solve the problem.
Comment: System throughput depends on many factors. In communication applications the bottleneck is usually memory: however fast the CPU is, it gains nothing if external memory accesses are slow.
Phenomenon 2: A larger cache should make the CPU faster.
Comment: A larger cache does not necessarily improve system performance. In some cases, running with the cache disabled is faster than running with it enabled, because data brought into the cache only pays off if it is reused many times. For this reason, communication systems generally enable only the instruction cache; if the data cache is enabled at all, it is limited to part of the address space, such as the stack. The program design must also take the cache capacity and block (line) size into account, which constrains the length of critical loop bodies and the range of their jumps. A loop that is only slightly larger than the cache and runs repeatedly is the worst case, because each pass can evict code that the next pass needs.
Phenomenon 3: With this many tasks, should we use interrupts or polling? Interrupts must be faster.
Comment: Interrupts give good real-time response, but they are not necessarily fast. If there are a great many interrupt tasks, a new one arrives before the previous one has exited, and the system soon collapses. If there are many tasks and they arrive very frequently, much of the CPU's time goes to interrupt entry and exit overhead, and system efficiency becomes extremely low. Polling instead can improve efficiency greatly, but polling sometimes cannot meet real-time requirements. The best approach is therefore to poll inside the interrupt: once an interrupt is entered, process all accumulated tasks before exiting.
Phenomenon 4: The memory interface timings are the vendor's default configuration and do not need to be changed.
Comment: The BSP's default memory interface settings use the most conservative parameters. In a real application they should be tuned according to parameters such as the bus operating frequency and the number of wait cycles.
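The "poll inside the interrupt" approach described under Phenomenon 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: the event queue, `post_event`, `handle_event`, and `device_isr` are hypothetical names that model a device in software, not any real BSP or driver API.

```c
#include <assert.h>
#include <stdint.h>

#define EVENT_QUEUE_SIZE 8u   /* hypothetical depth; must be a power of two */

/* Hypothetical software model of a device event queue. On real hardware
 * the producer side would be a FIFO or a status register. */
static volatile uint32_t q_head;            /* next slot to write */
static volatile uint32_t q_tail;            /* next slot to read  */
static int q_events[EVENT_QUEUE_SIZE];

static uint32_t isr_entries;                /* times the handler was entered */
static uint32_t events_handled;             /* events processed in total     */

/* Stand-in for the hardware posting an event. Returns 0 if the queue is full. */
static int post_event(int ev)
{
    if (q_head - q_tail == EVENT_QUEUE_SIZE)
        return 0;
    q_events[q_head % EVENT_QUEUE_SIZE] = ev;
    q_head++;
    return 1;
}

/* Placeholder for the real per-event work. */
static void handle_event(int ev)
{
    (void)ev;
    events_handled++;
}

/* Interrupt handler: keep polling until nothing is pending, then exit.
 * One entry/exit overhead is paid for the whole batch of events. */
static void device_isr(void)
{
    isr_entries++;
    while (q_tail != q_head) {
        handle_event(q_events[q_tail % EVENT_QUEUE_SIZE]);
        q_tail++;
    }
}
```

If five events are posted before `device_isr()` runs, the handler is entered once and processes all five. A production handler would probably also cap the number of events handled per entry, so that a busy device cannot starve the rest of the system.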
With memory timing, lowering the bus frequency can sometimes improve efficiency. For example, if the RAM access time is 70 ns and the bus runs at 40 MHz, an access of 3 cycles (75 ns) is sufficient; if the bus runs at 50 MHz, the access must be set to 4 cycles, and the actual access time slows to 80 ns.
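The arithmetic in this example can be checked with a small sketch. The helper names `access_cycles` and `access_time_ns` are made up for illustration, and the model assumes that an access always takes a whole number of bus cycles.

```c
#include <assert.h>

/* Number of whole bus cycles needed to cover a RAM access time.
 * Computes ceil(access_ns * bus_mhz / 1000) with integer arithmetic. */
static unsigned access_cycles(unsigned access_ns, unsigned bus_mhz)
{
    return (access_ns * bus_mhz + 999u) / 1000u;
}

/* Actual access time in ns for a given cycle count. The integer division
 * truncates when the bus period is not a whole number of nanoseconds. */
static unsigned access_time_ns(unsigned cycles, unsigned bus_mhz)
{
    return cycles * 1000u / bus_mhz;
}
```

With a 70 ns RAM, `access_cycles(70, 40)` gives 3 cycles, or 75 ns per access, while `access_cycles(70, 50)` gives 4 cycles, or 80 ns per access. In this case the faster bus produces the slower memory access.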
Phenomenon 5: If one CPU cannot keep up, use two with distributed processing, and the processing capacity will double.
Comment: For moving bricks, two people should be twice as efficient as one; for painting a picture, an extra person only gets in the way. How many CPUs to use should be decided only after the workload is well understood, and the cost of coordinating the two CPUs should be kept as low as possible, so that 1+1 comes as close to 2 as possible and never drops below 1.
Phenomenon 6: This CPU has a DMA module, so using it to move data must be faster.
Comment: True DMA means the hardware takes over the bus and activates the devices at both ends simultaneously, reading from one side and writing to the other in a single cycle. Many DMA controllers embedded in CPUs, however, only emulate this. Each DMA transfer requires considerable setup first (programming the start address, length, and so on). During the transfer, data is often read into the chip for temporary storage and then written out, so each move takes two clock cycles. That is still faster than a software copy (no instruction fetches, no loop or branch overhead), but when only a few bytes are moved at a time, the setup work, which usually involves function calls, makes the transfer inefficient. This kind of DMA is therefore only suitable for large data blocks.
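The DMA trade-off can be made concrete with a simple cost model. Only the two cycles per move come from the text above. The setup cost, the software cost per word, and all function names are illustrative assumptions, not measurements of any particular CPU.

```c
#include <assert.h>

/* Cycles for a software copy: a fixed cost per word (assumed). */
static unsigned long sw_copy_cycles(unsigned long words, unsigned sw_cycles_per_word)
{
    return words * sw_cycles_per_word;
}

/* Cycles for an embedded DMA copy: one-off setup plus a per-word cost. */
static unsigned long dma_copy_cycles(unsigned long words, unsigned setup_cycles,
                                     unsigned dma_cycles_per_word)
{
    return setup_cycles + words * dma_cycles_per_word;
}

/* Smallest transfer size (in words) at which DMA is strictly cheaper than
 * the software copy. Returns 0 if DMA never wins under this model. */
static unsigned long dma_break_even_words(unsigned setup_cycles,
                                          unsigned dma_cycles_per_word,
                                          unsigned sw_cycles_per_word)
{
    if (sw_cycles_per_word <= dma_cycles_per_word)
        return 0;
    return setup_cycles / (sw_cycles_per_word - dma_cycles_per_word) + 1;
}
```

Assuming 200 setup cycles, 2 cycles per word for DMA, and 6 cycles per word in software, `dma_break_even_words(200, 2, 6)` returns 51. A 4-word transfer costs 208 cycles by DMA against 24 in software, which is why this kind of DMA pays off only for large blocks.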
Reference address: Some tips on hardware design: system efficiency