Cortex-M3 vs Cortex-M4

Publisher: Ziran520; Latest update time: 2013-10-11; Source: eefocus; Keywords: Cortex-M3, Cortex-M4

This article mainly explains the differences between M3 and M4 from four aspects: MPU, DSP capabilities, debug and power management. 

1. Memory Protection Unit (MPU)

The MPU is an optional memory protection component in the Cortex-M4, as it is in the Cortex-M3. The processor supports the standard ARMv7 Protected Memory System Architecture model. You can use the MPU to enforce privilege rules, separate processes, and enforce access rules. The MPU provides full support for:

Protection regions

Overlapping protection regions, with ascending region priority (7 = highest priority, 0 = lowest priority)

Access permissions

Exporting memory attributes to the system
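
The priority rule for overlapping regions can be illustrated with a small host-side model. This is only a sketch: the struct `mpu_region_model` and the functions `mpu_model_lookup` and `mpu_model_can_write` are hypothetical names introduced here, the single `writable` flag stands in for the real access-permission encoding, and an address that matches no region is simply treated as denied. It does not touch any hardware register.

```c
#include <stdbool.h>
#include <stdint.h>

#define MPU_MODEL_REGIONS 8  /* regions 0..7, as in the priority range above */

/* Hypothetical host-side description of one region (not a register layout). */
typedef struct {
    uint32_t base;     /* first address covered by the region */
    uint32_t size;     /* region length in bytes */
    bool     enabled;  /* only enabled regions take part in the lookup */
    bool     writable; /* simplified stand-in for the access permissions */
} mpu_region_model;

/* Return the number of the region that governs addr, or -1 if none matches.
   Scanning from 7 down to 0 makes the highest-numbered match win. */
static int mpu_model_lookup(const mpu_region_model r[MPU_MODEL_REGIONS],
                            uint32_t addr)
{
    for (int i = MPU_MODEL_REGIONS - 1; i >= 0; --i) {
        if (r[i].enabled && addr >= r[i].base &&
            (addr - r[i].base) < r[i].size) {
            return i;
        }
    }
    return -1;
}

/* A write is allowed only if the governing region is marked writable. */
static bool mpu_model_can_write(const mpu_region_model r[MPU_MODEL_REGIONS],
                                uint32_t addr)
{
    int i = mpu_model_lookup(r, addr);
    return i >= 0 && r[i].writable;
}
```

With a large writable region 0 and a small read-only region 7 placed inside it, addresses in the overlap are governed by region 7, which is how a read-only window can be carved out of a larger read-write area.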

 

2. DSP Capabilities

 

The following figures compare the relative digital signal processing performance of the Cortex-M3 and Cortex-M4 processors running at the same clock speed.

In the figures, the Y axis represents the relative number of cycles needed to perform a given calculation, so a lower cycle count means better performance. The Cortex-M3 is the reference (1.0), and the relative performance of the Cortex-M4 is approximately the inverse of its relative cycle count. For example, for the PID function the Cortex-M4 needs about 0.7 times the cycles of the Cortex-M3, so its relative performance is 1/0.7, or about 1.4 times.
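
The conversion from relative cycle count to relative performance is a single division. The helper below is a minimal sketch; the name `relative_performance` is introduced here for illustration only.

```c
/* Relative performance of a core whose cycle count is expressed relative
   to a reference core (reference = 1.0). Assumes both cores run at the
   same clock speed, as in the comparison above. */
static double relative_performance(double relative_cycles)
{
    return 1.0 / relative_cycles;
}
```

For the PID example, `relative_performance(0.7)` evaluates to roughly 1.43, which the article rounds to 1.4.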

 

Figure: Cortex-M series cycle counts, 16-bit functions

 

Figure: Cortex-M series cycle counts, 32-bit functions

 

 

This clearly shows that the Cortex-M4 has a significant advantage over the Cortex-M3 in digital signal processing, for both 16-bit and 32-bit operations.

Every Cortex-M4 DSP instruction completes in a single cycle, whereas the Cortex-M3 needs multiple instructions and multiple cycles to perform the equivalent function. Even for the PID algorithm, the most resource-intensive task in general DSP operation, the Cortex-M4 provides a 1.4 times performance improvement. As another example, MP3 decoding requires 20-25 MHz on the Cortex-M3 but only 10-12 MHz on the Cortex-M4.

 

      1). 32-bit multiply-accumulate (MAC)

The 32-bit multiply-accumulate (MAC) unit combines new instructions with an optimized hardware execution unit in the Cortex-M4. It can complete one 32 × 32 + 64 -> 64 operation or two 16 × 16 operations in a single cycle. The following table lists the computational capabilities of this unit.
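
In C, the 32 × 32 + 64 -> 64 operation corresponds to a widening multiply added to a 64-bit accumulator. The sketch below is portable C with names (`mac_32x32_64`, `dot_product_64`) introduced here for illustration. A compiler targeting the Cortex-M4 would typically map the inner expression to a long multiply-accumulate instruction such as SMLAL, but that is an assumption about code generation rather than a guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* 32 x 32 + 64 -> 64: widen both operands before multiplying so the
   full 64-bit product is added to the 64-bit accumulator. */
static int64_t mac_32x32_64(int64_t acc, int32_t a, int32_t b)
{
    return acc + (int64_t)a * (int64_t)b;
}

/* Dot product of two 32-bit vectors using the MAC step above.
   No saturation is applied; the caller must keep the sum within
   the range of int64_t. */
static int64_t dot_product_64(const int32_t *x, const int32_t *y, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; ++i) {
        acc = mac_32x32_64(acc, x[i], y[i]);
    }
    return acc;
}
```

Filter kernels and dot products are dominated by this accumulate step, which is why a single-cycle MAC has such a direct effect on the cycle counts discussed earlier.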

 

 

      2). SIMD

The Cortex-M4 supports a SIMD instruction set that was not available in the previous generation of the Cortex-M series. Some of the instructions in the table above are SIMD instructions. Working together with the hardware multiply-accumulate (MAC) unit, all of these instructions execute in a single cycle. With SIMD support, the Cortex-M4 can complete operations of up to 32 × 32 + 64 -> 64 in a single cycle, so multiply and add operations consume fewer cycles and leave processor bandwidth free for other tasks. Consider the following arithmetic operation, in which two 16 × 16 multiplications and a 32-bit addition are compiled into a single instruction: SUM = SUM + (A*C) + (B*D)
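
The operation SUM = SUM + (A*C) + (B*D) can be modelled in portable C to show what the single instruction computes. This is a sketch under stated assumptions: `pack16x2` and `dual_mac16` are hypothetical helper names, A and C are assumed to sit in the low halfwords and B and D in the high halfwords, and overflow of the 32-bit sum is not handled. On a Cortex-M4 with the CMSIS-Core headers, the `__SMLAD` intrinsic performs the equivalent dual multiply-accumulate on packed operands.

```c
#include <stdint.h>

/* Pack two signed 16-bit values into one 32-bit word:
   lo in bits [15:0], hi in bits [31:16]. */
static uint32_t pack16x2(int16_t lo, int16_t hi)
{
    return (uint32_t)(uint16_t)lo | ((uint32_t)(uint16_t)hi << 16);
}

/* Portable model of a dual 16 x 16 multiply-accumulate:
   sum + lo(x) * lo(y) + hi(x) * hi(y).
   The casts back to int16_t rely on the usual two's-complement
   conversion that mainstream compilers provide. */
static int32_t dual_mac16(uint32_t x, uint32_t y, int32_t sum)
{
    int16_t a = (int16_t)(x & 0xFFFFu); /* A: low half of x  */
    int16_t b = (int16_t)(x >> 16);     /* B: high half of x */
    int16_t c = (int16_t)(y & 0xFFFFu); /* C: low half of y  */
    int16_t d = (int16_t)(y >> 16);     /* D: high half of y */
    return sum + (int32_t)a * c + (int32_t)b * d;
}
```

For example, with A = 3, B = -2, C = 4, D = 5 and an initial SUM of 10, `dual_mac16(pack16x2(3, -2), pack16x2(4, 5), 10)` returns 12. Packing two 16-bit samples per word is what lets one instruction do the work of two multiplies and two additions.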

 

 

 

 

      3). FPU

The FPU is an optional unit in the Cortex-M4 dedicated to floating-point operations. It handles single-precision floating-point operations in hardware and is compliant with the IEEE 754 standard. It implements the single-precision floating-point extension of the ARMv7-M architecture. The FPU extends the register programming model with a register file containing 32 single-precision registers. These can be viewed as:

  • 16 64-bit double-word registers, D0 - D15
  • 32 32-bit single-word registers, S0 - S31

The FPU provides three modes of operation to suit various applications:

  • Full-compliance mode (In full-compliance mode, the FPU processes all operations in hardware according to the IEEE 754 standard.)
  • Flush-to-zero mode (Setting the FZ bit of the floating-point status and control register, FPSCR [24], enables flush-to-zero mode. In this mode, the FPU treats all subnormal input operands of arithmetic CDP operations as zeros. Exceptions that result from a zero operand are signaled appropriately. VABS, VNEG, and VMOV are not considered arithmetic CDP operations and are not affected by flush-to-zero mode. A result that is tiny, as described in the IEEE 754 standard, meaning that its magnitude in the destination precision is smaller than the minimum normal value before rounding, is replaced with zero. The IDC flag, FPSCR [7], indicates when an input flush occurs. The UFC flag, FPSCR [3], indicates when a result flush occurs.)
  • Default NaN mode (Setting the DN bit, FPSCR [25], enables default NaN mode. In this mode, the result of any arithmetic operation that involves an input NaN, or that produces a NaN result, is the default NaN. The fraction bits are propagated only by VABS, VNEG, and VMOV operations. All other CDP operations ignore any information in the fraction bits of an input NaN.)
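
The FPSCR bit positions named above can be captured as masks. The sketch below works on a plain 32-bit value so that it runs on any host; the macro and function names are introduced here for illustration. On a target with the CMSIS-Core headers, the value would typically be read with `__get_FPSCR()` and written back with `__set_FPSCR()`, which is an assumption about the toolchain in use.

```c
#include <stdbool.h>
#include <stdint.h>

#define FPSCR_DN  (1u << 25) /* default NaN mode enable       */
#define FPSCR_FZ  (1u << 24) /* flush-to-zero mode enable     */
#define FPSCR_IDC (1u << 7)  /* input flush has occurred      */
#define FPSCR_UFC (1u << 3)  /* result flush has occurred     */

/* Return fpscr with the FZ and DN mode bits set as requested.
   Both bits clear corresponds to full-compliance mode.
   All other bits are preserved. */
static uint32_t fpscr_set_mode(uint32_t fpscr, bool flush_to_zero,
                               bool default_nan)
{
    fpscr &= ~(FPSCR_FZ | FPSCR_DN);
    if (flush_to_zero) {
        fpscr |= FPSCR_FZ;
    }
    if (default_nan) {
        fpscr |= FPSCR_DN;
    }
    return fpscr;
}

/* True if a subnormal input operand was flushed to zero. */
static bool fpscr_input_was_flushed(uint32_t fpscr)
{
    return (fpscr & FPSCR_IDC) != 0u;
}

/* True if a tiny result was flushed to zero. */
static bool fpscr_result_was_flushed(uint32_t fpscr)
{
    return (fpscr & FPSCR_UFC) != 0u;
}
```

Enabling both flush-to-zero and default NaN trades strict IEEE 754 behaviour for simpler handling of subnormals and NaNs, which is often acceptable in signal processing code.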

The following table shows the FPU instruction set

 

3. Debug

As with the Cortex-M3, Cortex-M4 devices are debugged through a standard JTAG or Serial Wire Debug (SWD) connector. A simple, standardized external connector is needed to interface to the debug host.

 

4. Power supply

      1). Power management

 

      2). Power consumption comparison

The graph shows that the Cortex-M4 is considerably more power-efficient than the Cortex-M3.
