New Competitiveness: The ARM Cortex-A9 Processor
Category: Embedded Systems
Processor IP licensor ARM Holdings plc has developed two implementations of the dual-core Cortex-A9 processor design, known as Osprey. The Cortex-A9 processor is compatible with the rest of the Cortex family of processors and with the popular ARM MPCore technology, so it can draw on a rich ecosystem of operating systems and real-time operating systems (OS/RTOS), middleware and applications, which reduces the cost of adopting a new processor. By applying key microarchitectural advances for the first time, the Cortex-A9 processor offers a highly scalable and power-efficient solution. With a dynamic-length, eight-stage, multi-issue superscalar pipeline and speculative out-of-order execution, it can execute up to four instructions per cycle in devices running at more than 1 GHz, while lowering cost and improving efficiency relative to today's mainstream eight-stage processors.
Osprey will be a serious competitor to Atom, at least until Intel changes its manufacturing process. It takes the form of a hard macro, designed for and manufactured in TSMC's 40G 40nm process technology. The two Osprey hard macros are optimized for power consumption and for performance, respectively, and the performance-optimized version brings ARM processors fully into competition for high-performance applications. "Osprey's goal is performance and performance," said Schorn, ARM's vice president of marketing for the processor division. "We are exploring new markets, such as netbooks and smartbooks." Osprey itself is a dual-core processor, but there is nothing stopping licensees from putting more than one core on the die, Schorn pointed out. While ARM is still waiting for TSMC to produce fully tested chips, which are expected in the fourth quarter of this year, the two designs are already available for licensing, with the IP set to ship in the fourth quarter of 2009.
The speed-optimized implementation is suitable for enterprise servers, network equipment, printers and other applications that require peak performance at clock frequencies of 2 GHz and beyond. It occupies 6.7 square millimeters of silicon area and delivers 10,000 DMIPS of computing power at a 2 GHz clock rate while consuming approximately 1.9 watts. The power-optimized implementation is suitable for mobile computing devices, smart computers and other consumer electronics devices requiring clock rates from 800 MHz to 1 GHz and above. It occupies a die area of 4.9 square millimeters and provides 4,000 DMIPS of computing power at a clock frequency of 800 MHz, with a power consumption of 0.5 watts. Both implementations will use TSMC's 40G process with support for the low-leakage GL process option. Both designs include fixed-size L1 caches of 32 kB for instructions and 32 kB for data, and an L2 cache controller that supports L2 cache sizes from 128 kB to 8 MB.
Schorn claims that, on a like-for-like comparison, Osprey is between one-third and one-quarter the size of Intel's Atom processors, which are made using a similar 45nm process technology. ARM has also measured Osprey with the Embedded Microprocessor Benchmark Consortium (EEMBC) CoreMark benchmark. According to ARM, both implementations outperform an Atom N270 running at 1.6 GHz. The power-optimized implementation does this at 800 MHz, while the speed-optimized version outperforms the Atom by a factor of 2.5. Each core in this dual-core design includes a Neon SIMD engine and a floating-point unit that support image and multimedia processing. "Network processing is not really the strength of Neon or the floating-point units. But you have to make some hard choices when you use hard macros. But it has the advantage of being proven and implemented in silicon," Schorn said.
ARM has had such hard macros for some time, dating back to the ARM922 and ARM926. "The ARM926 has a configurable cache, and there is increasing use of the foundry business.
The foundries themselves offer multiple process nodes in low-power, general-purpose and high-performance variants, so the number of targets has increased," said Schorn. "But as we are seeing now, the node changes are decreasing and the number of targets for shortening the life of hard macros is increasing. We want to achieve multiple licenses with one project."
ARM's semiconductor partners, early adopters of the Cortex-A9, have already implemented this processor core in a low-power process, Schorn pointed out. "Many partners use low-power processes, so we are not going to repeat what our partners have already done. Low power is very relevant to wireless communications. This high-performance core is a different beast, with four to five times the power efficiency of Atom," Schorn said.
The Osprey processor does not include a graphics processor, but, interestingly, the test chips that will be shipped do. "The Mali-400 multimedia processor and the Mali-VE video engine are integrated on the dual Osprey test chip," Schorn revealed. Similarly, the Osprey core does not include Intrinsity's Fast14 technology, although Samsung uses this technology for its Cortex-A8 processors with clock speeds above 1 GHz. "This Intrinsity Fast14 technology is amazing and has been used in the Cortex-A8, but not in the Osprey implementation. It has certainly not been ruled out for the future."
Osprey does incorporate the clock gating and low-power design techniques used in other ARM low-power processor designs. The main processing unit consumes no dynamic power if there are no instructions in the pipeline. The design also uses six independent power islands to manage leakage power when performance is not required. The entire pipeline can be shut down while the SRAM contents are retained, so that execution can resume almost immediately. The snoop control unit and the L2 cache controller can also be controlled independently. "This is a radical departure from past work," Schorn concluded.
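The figures ARM quotes for the two implementations (10,000 DMIPS at 2 GHz in 6.7 square millimeters at about 1.9 watts, and 4,000 DMIPS at 800 MHz in 4.9 square millimeters at 0.5 watts) can be normalized with simple arithmetic to make the trade-off between the two hard macros easier to see. The following is a minimal sketch in Python, not anything published by ARM. The class and field names are hypothetical, and it assumes that each quoted figure refers to the whole dual-core macro, which the article does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Implementation:
    """Headline figures for one Osprey hard macro, as quoted in the article."""
    name: str
    dmips: float       # quoted Dhrystone MIPS
    clock_mhz: float   # clock rate at which the DMIPS figure is quoted
    power_w: float     # quoted power consumption in watts
    area_mm2: float    # quoted silicon area in square millimeters
    cores: int = 2     # assumption: figures cover the whole dual-core macro

    @property
    def dmips_per_mhz(self) -> float:
        return self.dmips / self.clock_mhz

    @property
    def dmips_per_mhz_per_core(self) -> float:
        return self.dmips_per_mhz / self.cores

    @property
    def dmips_per_watt(self) -> float:
        return self.dmips / self.power_w

    @property
    def dmips_per_mm2(self) -> float:
        return self.dmips / self.area_mm2


# Values copied from the article; nothing here is measured.
SPEED = Implementation("speed-optimized", 10_000, 2_000, 1.9, 6.7)
POWER = Implementation("power-optimized", 4_000, 800, 0.5, 4.9)


def summarize(impl: Implementation) -> str:
    """Format the normalized figures for one implementation."""
    return (
        f"{impl.name}: "
        f"{impl.dmips_per_mhz:.2f} DMIPS/MHz "
        f"({impl.dmips_per_mhz_per_core:.2f} per core), "
        f"{impl.dmips_per_watt:.0f} DMIPS/W, "
        f"{impl.dmips_per_mm2:.0f} DMIPS/mm^2"
    )


if __name__ == "__main__":
    for impl in (SPEED, POWER):
        print(summarize(impl))
```

Under these assumptions both macros work out to the same 5 DMIPS per MHz, so the difference lies in clock rate, power and area: the power-optimized macro delivers 8,000 DMIPS per watt against roughly 5,263 for the speed-optimized one, while the speed-optimized macro packs more performance into each square millimeter.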
"By working with partners to complement each other, we can further expand the application of the ARM architecture." About ARM ARM (ADVANCED RISC Machines) is a leading company in the microprocessor industry, designing a large number of high-performance, low-cost, low-energy RISC processors, related technologies and software. The technology has the characteristics of high performance, low cost and low energy consumption. It is applicable to a variety of fields, such as embedded control, consumer education multimedia, DSP and mobile applications.ARM is a joint venture of Apple, Acorn, VLSI, TECHNOLOGY and other companies. ARM licenses its technology to many well-known semiconductor, software and OEM manufacturers in the world. Each manufacturer gets a unique set of ARM related technologies and services. Leveraging this partnership, ARM quickly became the creator of many global RISC standards. Currently, a total of 30 semiconductor companies have signed hardware technology licensing agreements with ARM, including Intel, IBM, LG Semiconductor, NEC, SONY, Philips and National Semiconductor. As for the software system partners, they include a series of well-known companies such as Microsoft, Sun and MRI.