Intel XScale Embedded System Based on ARM Core
Abstract: This article briefly introduces the characteristics of Intel XScale and how it differs from Intel StrongARM, then focuses on the structure, functions and interface characteristics of the PXA250 processor and the PCM-7210 single-board computer.
Keywords: XScale; ARM core; PXA250; PCM-7210
1 Introduction

The Intel XScale microarchitecture provides a new, cost-effective, low-power solution based on the ARMv5TE architecture, supporting 16-bit Thumb instructions and DSP extensions. Microprocessors built on XScale technology can be used in mobile phones, portable terminals (PDAs), network storage devices, backbone routers and similar equipment. The Intel PXA250 is a highly integrated application processor that combines a 32-bit Intel XScale processor core with multiple communication channels, an LCD controller, an enhanced memory controller, a PCMCIA/CF controller and general-purpose I/O ports.

The Intel XScale processor runs twice as fast as Intel StrongARM, and its internal structure has changed accordingly:

The data cache has grown from 8KB to 32KB;
The instruction cache has grown from 16KB to 32KB;
The mini data cache has grown from 512B to 2KB;
The superpipeline has been deepened from 5 to 7 stages to raise instruction execution speed;
A multiply/accumulate (MAC) unit has been added as the DSP-oriented coprocessor CP0 to improve multimedia support;
Dynamic power management allows the XScale processor to reach a clock speed of 1GHz and 1200MIPS at a power consumption of 1.6W.

The XScale core is manufactured on Intel's 0.18μm process. It is designed for low power consumption, with an operating range from 0.1mW to 1.6W and clock frequencies approaching 1GHz. Compared with StrongARM, XScale delivers higher performance at a significantly lower operating voltage. Specifically, StrongARM reaches 133MHz at 1.55V and 206MHz at 2.0V, whereas XScale reaches 150MHz at 0.75V, 400MHz at 1.0V and 800MHz at 1.65V.
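To get a feel for why the lower operating voltage matters, the voltage and frequency figures quoted above can be put through the usual first-order CMOS rule of thumb, in which dynamic power scales with V² × f. This is only a rough sketch: it assumes the same switched capacitance for both processors and ignores leakage, neither of which comes from the article, so the results are relative indicators and not measured power figures. The function names are illustrative.

```python
# Operating points quoted in the text: (processor, volts, MHz).
OPERATING_POINTS = [
    ("StrongARM", 1.55, 133),
    ("StrongARM", 2.0, 206),
    ("XScale", 0.75, 150),
    ("XScale", 1.0, 400),
    ("XScale", 1.65, 800),
]


def relative_dynamic_power(volts, mhz):
    """First-order estimate: dynamic power is proportional to V^2 * f.

    Assumption (not from the article): equal switched capacitance,
    no leakage. The result has arbitrary units.
    """
    return volts ** 2 * mhz


def relative_energy_per_cycle(volts):
    """Energy per clock cycle is proportional to V^2 under the same model."""
    return volts ** 2


for name, volts, mhz in OPERATING_POINTS:
    power = relative_dynamic_power(volts, mhz)
    energy = relative_energy_per_cycle(volts)
    print(f"{name:9s} {mhz:4d}MHz @ {volts:.2f}V  "
          f"V^2*f = {power:8.1f}  V^2 = {energy:.3f}")
```

Under this simplified model, XScale at 400MHz and 1.0V scores about half the V² × f value of StrongARM at 206MHz and 2.0V while running at nearly twice the clock rate.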
The combination of ultra-low power and high performance makes Intel XScale suitable for a wide range of Internet access devices. From handheld Internet devices to Internet infrastructure products, Intel XScale delivers satisfactory processing performance.

2 Structure and features of PXA250

The block diagram of the Intel XScale PXA250 is shown in Figure 1.

2.1 Main features of the PXA250 processor
(1) High performance
Low-power, high-performance 32-bit Intel XScale processor core with an operating frequency of up to 400MHz;
Compatible with the ARMv5TE architecture;
7-stage superpipeline;
Multimedia processing support, using a 40-bit accumulator and a 16-bit multiplier to strengthen audio and video decoding;
High-performance burst and page mode interfaces supporting synchronous Intel StrataFlash memory.

(2) Low power consumption
Multiple power management modes;
32KB data cache and 32KB instruction cache;
2KB mini data cache;
Support for 2.5V and 3.3V memory.

(3) I/O expansion
100MHz memory bus with 6 static memory banks (16- or 32-bit ROM/SMROM, Flash, SRAM) and 4 dynamic memory partitions (16- or 32-bit SDRAM);
Support for 2 PCMCIA or Compact Flash slots.

(4) Peripheral control module
16-channel configurable DMA controller;
LCD controller with its own DMA, supporting fast-changing color screens;
921kbps Bluetooth interface;
Serial ports (IrDA, I2C, I2S, AC97, 3 UARTs, SPI and SSP);
USB interface;
MMC/SD card support.

(5) Clock control
Five clock sources: a 32.768kHz oscillator; a 3.6864MHz oscillator; a programmable core phase-locked loop; a 95.85MHz fixed-frequency peripheral phase-locked loop; a 147.46MHz fixed-frequency phase-locked loop.

(6) Power management
Run mode (normal processing), Turbo mode (running at 400MHz), Idle mode (CPU clock stopped) and Sleep mode (power off).

(7) Package
17mm × 17mm 256-pin PBGA package.

2.2 Intel XScale Core

The Intel XScale CPU core uses a superpipelined RISC processor architecture with an enhanced memory pipeline. This high-performance, low-power microarchitecture is compatible with the ARMv5TE instruction set architecture (the floating-point instruction set is not supported).
Around the ARM core, the microarchitecture provides instruction and data memory management units; instruction, data and mini data caches; write, fill and pend buffers; a branch target buffer; power management, performance monitoring, debug and JTAG units; and a coprocessor interface, a MAC coprocessor and a core memory bus.

The superpipeline consists of an integer pipeline, a memory pipeline and a MAC pipeline. The integer pipeline has 7 stages: instruction fetch 1 (branch target buffer) → instruction fetch 2 → decode → register file/shift → ALU execute → state execute → write-back. The memory pipeline has 8 stages: the first 5 stages of the integer pipeline followed by 3 cache stages (data cache 1, data cache 2 and data cache write-back). The MAC pipeline has 6 to 9 stages: the first 4 stages of the integer pipeline, up to 4 MAC stages (MAC1 to MAC4) and a write-back stage; whether stages MAC2 to MAC4 are used depends on the operand data. A deeper pipeline shortens the work done in each stage, which allows a higher clock frequency and therefore faster instruction execution.

The branch target buffer is used to predict the outcome of branch instructions and so avoid branch delays in the superpipeline. Each entry of the 128-entry branch target buffer holds the address of a branch instruction, the target address associated with that branch and the branch's execution history. The buffer is enabled through coprocessor 15.

The memory management units (IMMU and DMMU) of the PXA250 CPU each provide a 32-entry translation lookaside buffer (ITLB and DTLB), whose entries can map sections, large pages and small pages in memory. To let the core access instructions and data at core clock speed, the PXA250 includes a 32KB instruction cache and a 32KB data cache.
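As a rough illustration of what a 32-entry TLB can cover, the sketch below computes the "TLB reach" (entries × mapping size) for the three mapping granularities named above. The sizes used here (1MB, 64KB and 4KB) are the standard ARM architecture values and are an assumption on our part, since the article does not state them. The helper name `tlb_reach` is purely illustrative.

```python
KB = 1024
MB = 1024 * KB

TLB_ENTRIES = 32  # per TLB (ITLB or DTLB), as stated in the text

# Assumption: standard ARM mapping sizes, not given in the article.
MAPPING_SIZES = {
    "section (largest mapping unit)": 1 * MB,
    "large page": 64 * KB,
    "small page": 4 * KB,
}


def tlb_reach(entries, mapping_size):
    """Bytes of address space covered when every entry maps one unit."""
    return entries * mapping_size


for name, size in MAPPING_SIZES.items():
    reach = tlb_reach(TLB_ENTRIES, size)
    print(f"{name:32s} {size // KB:5d}KB each -> reach {reach // KB:6d}KB")
```

With small pages only, a full TLB covers 128KB, which is four times the size of the 32KB data cache; with sections it covers 32MB.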
To keep streaming data from repeatedly displacing lines in the data cache, a 2KB mini data cache is also provided. The instruction and data caches are each organized as 32 sets and are 32-way set associative. Each way contains a tag address, a 32-byte cache line and a valid bit, and lines are replaced in round-robin order. The mini data cache has 32 sets and is 2-way set associative, also with round-robin replacement. The PXA250 core also provides a 4-entry fill buffer and a pend buffer, which work with the data cache and mini data cache to improve core performance. In addition, an 8-entry write buffer, in which each entry holds 16 bytes, takes data from the core, the data cache or the mini data cache and holds it until the system bus is granted.

2.3 System control function

The system control module of the PXA250 provides a real-time clock, a watchdog and interval timer, a power management controller, an interrupt controller, a reset controller and two on-chip oscillators.

The OS timer unit is inherited from the SA-11x0 processor. It is clocked by the 3.6864MHz oscillator and includes four match registers (OSMR), one status register (OSSR) and one interrupt enable register (OIER). The watchdog function is enabled through the OS timer watchdog enable register (OWER).

Every interrupt source handled by the interrupt controller can be routed as one of two interrupt types: interrupt request (IRQ) or fast interrupt request (FIQ). Depending on the value of the mask register, the interrupt controller either passes an interrupt to the CPU or holds it pending. Each register in the interrupt controller is bit-mapped, with each bit pre-assigned to a different interrupt source.
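The OS timer arithmetic described above can be sketched as follows: a delay is converted into ticks of the 3.6864MHz clock and added to the current counter value to obtain the value to program into a match register. The 32-bit counter width and the helper names are assumptions made for illustration; they are not taken from the article, and the sketch does not touch any real register.

```python
OS_TIMER_HZ = 3_686_400   # 3.6864MHz oscillator, from the text
COUNTER_BITS = 32         # assumption: free-running 32-bit counter
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def ticks_for(seconds):
    """Number of OS timer ticks that elapse in the given time."""
    return round(seconds * OS_TIMER_HZ)


def next_match_value(current_count, seconds):
    """Value to write to a match register (OSMR) so that it matches
    `seconds` from now, allowing for counter wrap-around."""
    return (current_count + ticks_for(seconds)) & COUNTER_MASK


# A 10ms periodic tick, a common choice for an OS scheduler tick.
print(ticks_for(0.010))                      # ticks per 10ms
print(next_match_value(0xFFFFFFF0, 0.010))   # wraps past 2**32
print((COUNTER_MASK + 1) / OS_TIMER_HZ)      # seconds until the counter wraps
```

Masking with `COUNTER_MASK` keeps the computed match value correct when the counter is close to wrapping, which under the 32-bit assumption happens roughly every 19 minutes.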
2.4 Clock and Power Management

To optimize the balance between processing performance and energy consumption, the clock and power manager controls the clock frequency of the individual modules and handles transitions between the power management modes. It provides a fixed clock for each peripheral and programmable-frequency clocks for the LCD controller, memory controller and CPU, all derived from the internal phase-locked loop clock sources. The clock manager can also reduce power consumption by shutting off the clocks of unused devices.

Power management provides four working modes: Turbo mode, run mode, idle mode and sleep mode.

In Turbo mode, the CPU core runs at its peak frequency. This mode is intended for code that rarely accesses external memory, so that the core does not spend its time waiting for memory.
In run mode, the CPU core runs at its normal frequency. Here the core is assumed to access external memory continuously, and the lower speed gives the best balance between performance and power consumption.
In idle mode, the clock to the CPU is suspended, but the clocks to the peripherals remain enabled.
In sleep mode, the entire system is in its lowest power state, and the system must be restarted to wake from sleep.

2.5 Memory and PCMCIA/Compact Flash Control Module

The external memory bus interface of the PXA250 processor supports synchronous dynamic memory (SDRAM), synchronous and asynchronous burst-mode and page-mode flash memory, synchronous mask ROM (SMROM), page-mode ROM, SRAM, variable-latency I/O devices (VLIO) in the static memory banks, 16-bit PC Card expansion memory and Compact Flash. The memory type is selected through the memory interface configuration registers.

2.6 Peripheral Control Module

The PXA250 processor defines a 16-channel DMA controller.
The DMA controller (DMAC) responds to requests from internal and external devices and reads and writes main memory on their behalf, transferring data between peripheral devices and the memory system.

The LCD controller provides an interface for dual-scan passive matrix color displays (DSTN, commonly known as pseudo-color) and active matrix color displays (TFT, commonly known as true color), and supports monochrome displays and multiple pixel formats. It has its own independent dual-channel DMA controller, with the two channels used for single-panel and dual-panel displays respectively. The maximum supported display resolution is 1024×1024 pixels, and the recommended maximum is 800×600 pixels. In passive monochrome mode, up to 256 gray levels are supported. For color displays, whether active or passive, up to 65536 colors are supported. The LCD controller maps the encoded pixel values in the frame buffer through a 256-entry, 16-bit-wide palette RAM, and the number of colors is determined by the pixel data width.
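The relationship between pixel data width, color count and memory use mentioned above can be worked through with a few lines of arithmetic. The sketch assumes tightly packed pixels with no line padding or alignment, which is a simplification and not the controller's documented frame buffer layout; the function names are illustrative.

```python
def color_count(bits_per_pixel):
    """Number of distinct values a pixel of this width can encode."""
    return 2 ** bits_per_pixel


def frame_buffer_bytes(width, height, bits_per_pixel):
    """Frame buffer size, assuming packed pixels and no padding."""
    return width * height * bits_per_pixel // 8


# Palette RAM described in the text: 256 entries, 16 bits each.
PALETTE_BYTES = 256 * 16 // 8

print(color_count(8))                        # gray levels at 8 bits per pixel
print(color_count(16))                       # colors at 16 bits per pixel
print(frame_buffer_bytes(800, 600, 16))      # recommended maximum resolution
print(frame_buffer_bytes(1024, 1024, 16))    # absolute maximum resolution
print(PALETTE_BYTES)
```

At 16 bits per pixel, the recommended 800×600 resolution needs a little under 1MB of frame buffer, while 1024×1024 needs 2MB, all of which the LCD controller's DMA has to fetch from memory on every refresh.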
The serial ports supported by the PXA250 processor include:

A USB client module based on Universal Serial Bus version 1.1, which supports up to 16 endpoints and provides a 48MHz internal clock;
3 universal asynchronous receiver/transmitter ports (UARTs): a full-function UART (with complete handshake signals) with a maximum speed of 230kbps, and a Bluetooth UART and a standard UART with a maximum speed of 921kbps;
A half-duplex fast infrared communication port (FICP) running at 4Mbps and implementing the 4PPM standard;
An AC97 controller supporting AC97 revision 2.0 codecs, with separate 16-bit channels for stereo PCM input and output, modem input and output, and a single microphone input;
An I2S controller providing a serial link to I2S digital stereo codecs, multiplexed onto the AC97 controller pins;
An I2C bus interface providing a 2-pin general-purpose serial port, with one pin for data and address and one for the clock;
2 MMC card interfaces supporting the MMC or SPI protocol, with serial data rates of up to 20Mbps;
An SSP interface.

The SSP interface supports the National Semiconductor Microwire protocol, the Texas Instruments synchronous serial protocol (SSP) and the Motorola SPI protocol, all of which are used to connect A/D converters, audio and telecommunications codecs and other devices that use serial data transmission protocols.

3 Advantech's latest XScale single-board computer PCM-7210

The PCM-7210 is a single-board computer built around the Intel XScale low-power RISC processor PXA250. It consists of a support board and a CPU board. The CPU board carries the PXA250 processor, 64MB of SDRAM and 32MB of Flash memory.
The other peripherals are placed on the support board, including a 10Mbps Ethernet interface, 4 full-function RS-232 serial interfaces and 1 RS-485 serial interface, an AC97 audio interface, 2 USB host ports and 1 USB client port, digital I/O pins and CF/PCMCIA expansion slots. In addition, there are LCD/CRT display interfaces and an intelligent power interface. The functional block diagram of the PCM-7210 is shown in Figure 2.