Advance Information
MPC7455EC
Rev. 4, 9/2003
MPC7455
RISC Microprocessor
Hardware Specifications
The MPC7455 and MPC7445 are implementations of the PowerPC™ microprocessor family
of reduced instruction set computer (RISC) microprocessors. This document is primarily
concerned with the MPC7455; however, unless otherwise noted, all information here also
applies to the MPC7445. This document describes pertinent electrical and physical
characteristics of the MPC7455. For functional characteristics of the processor, refer to the
MPC7450 RISC Microprocessor Family User’s Manual.
This document contains the following topics:
Topic
Section 1.1, “Overview”
Section 1.2, “Features”
Section 1.3, “Comparison with the MPC7400, MPC7410, MPC7450,
MPC7451, and MPC7441”
Section 1.4, “General Parameters”
Section 1.5, “Electrical and Thermal Characteristics”
Section 1.6, “Pin Assignments”
Section 1.7, “Pinout Listings”
Section 1.8, “Package Description”
Section 1.9, “System Design Information”
Section 1.10, “Document Revision History”
Section 1.11, “Ordering Information”
To locate any published updates for this document, refer to the website at
http://www.motorola.com/semiconductors.
Page
1
2
7
10
10
33
35
41
47
60
62
1.1
Overview
The MPC7455 is the third implementation of the fourth generation (G4) microprocessors from
Motorola. The MPC7455 implements the full PowerPC 32-bit architecture and is targeted at
networking and computing systems applications. The MPC7455 consists of a processor core,
a 256-Kbyte L2, and an internal L3 tag and controller which support a glueless backside L3
cache through a dedicated high-bandwidth interface. The MPC7445 is identical to the
MPC7455 except it does not support the L3 cache interface.
Features
Figure 1 shows a block diagram of the MPC7455. The core is a high-performance superscalar design
supporting a double-precision floating-point unit and a SIMD multimedia unit. The memory storage
subsystem supports the MPX bus protocol and a subset of the 60x bus protocol to main memory and other
system resources. The L3 interface supports 1 or 2 Mbytes of external SRAM for L3 cache data.
Note that the MPC7455 is footprint-compatible with the MPC7450 and MPC7451, and the MPC7445 is
footprint-compatible with the MPC7441.
1.2
Features
This section summarizes features of the MPC7455 implementation of the PowerPC architecture.
Major features of the MPC7455 are as follows:
•
High-performance, superscalar microprocessor
— As many as four instructions can be fetched from the instruction cache at a time
— As many as three instructions can be dispatched to the issue queues at a time
— As many as 12 instructions can be in the instruction queue (IQ)
— As many as 16 instructions can be at some stage of execution simultaneously
— Single-cycle execution for most instructions
— One instruction per clock cycle throughput for most instructions
— Seven-stage pipeline control
•
Eleven independent execution units and three register files
— Branch processing unit (BPU) features static and dynamic branch prediction
– 128-entry (32-set, four-way set-associative) branch target instruction cache (BTIC), a
cache of branch instructions that have been encountered in branch/loop code sequences. If
a target instruction is in the BTIC, it is fetched into the instruction queue a cycle sooner
than it can be made available from the instruction cache. Typically, a fetch that hits the
BTIC provides the first four instructions in the target stream.
– 2048-entry branch history table (BHT) with two bits per entry for four levels of
prediction—not-taken, strongly not-taken, taken, and strongly taken
– Up to three outstanding speculative branches
– Branch instructions that do not update the count register (CTR) or link register (LR) are
often removed from the instruction stream.
– Eight-entry link register stack to predict the target address of Branch Conditional to Link
Register (bclr) instructions
— Four integer units (IUs) that share 32 GPRs for integer operands
– Three identical IUs (IU1a, IU1b, and IU1c) can execute all integer instructions except
multiply, divide, and move to/from special-purpose register instructions
– IU2 executes miscellaneous instructions including the CR logical operations, integer
multiplication and division instructions, and move to/from special-purpose register
instructions
2
MPC7455 RISC Microprocessor Hardware Specifications
MOTOROLA
Instruction Unit
Branch Processing Unit
Fetcher
Tags
IBAT Array
BHT (2048-Entry)
Dispatch
Unit
Data MMU
SRs
(Original)
VR Issue
(4-Entry/2-Issue)
DBAT Array
GPR Issue
(6-Entry/3-Issue)
FPR Issue
(2-Entry/1-Issue)
128-Entry
DTLB
Tags
LR
BTIC (128-Entry)
CTR
Instruction Queue
(12-Word)
SRs
(Shadow)
128-Entry
ITLB
Instruction MMU
128-Bit (4 Instructions)
MOTOROLA
32-Kbyte
I Cache
32-Kbyte
D Cache
Reservation
Stations (2-Entry)
EA
Load/Store Unit
Vector Touch Engine
+ (EA Calculation)
Finished
Stores
L1 Castout
PA
FPR File
16 Rename
Buffers
Reservation
Stations (2)
VR File
16 Rename
Buffers
Integer
Unit 2
Integer
Integer
Integer
Unit 122
Unit
Unit
(3)
+++
32-Bit
32-Bit
x÷
16 Rename
Buffers
Reservation
Stations (2)
GPR File
Reservation
Reservation
Reservation
Station
Station
Station
Vector
Touch
Queue
Vector
Integer
Unit 1
Vector
FPU
Floating-
Point Unit
L1 Push
Completed
Stores
+ x ÷
FPSCR
Load Miss
64-Bit
64-Bit
32-Bit
128-Bit
128-Bit
L3 Cache Controller
Line Block 0/1
Tags Status
L3CR
Bus Accumulator
Not in
MPC7445
18-Bit
64-Bit Data
Address (8-Bit Parity)
Memory Subsystem
System Bus Interface
L2 Prefetch (3)
Bus Store Queue
Push
Castout
Queue
(9)
External SRAM
(1 or 2 Mbytes)
Bus Accumulator
36-Bit Address Bus
64-Bit Data Bus
L2 Store Queue (L2SQ)
Snoop Push/
L1 Castouts
Interventions
(4)
256-Kbyte Unified L2 Cache/Cache Controller
Line
Block 0 (32-Byte)
Block 1 (32-Byte)
Tags Status
Status
L1 Service Queues
L1 Store Queue
(LSQ)
L1 Load Queue (LLQ)
L1 Load Miss (5)
Instruction Fetch (2)
Cacheable Store
Request (1)
Additional Features
• Time Base Counter/Decrementer
• Clock Multiplier
• JTAG/COP Interface
• Thermal/Power Management
• Performance Monitor
96-Bit (3 Instructions)
Reservation Reservation Reservation Reservation
Station
Station
Station
Station
Figure 1. MPC7455 Block Diagram
Vector
Permute
Unit
Vector
Integer
Unit 2
Completion Unit
MPC7455 RISC Microprocessor Hardware Specifications
Completion Queue
(16-Entry)
Features
Completes up to three instructions per clock
3
Features
— Five-stage FPU and a 32-entry FPR file
– Fully IEEE 754-1985-compliant FPU for both single- and double-precision operations
– Supports non-IEEE mode for time-critical operations
– Hardware support for denormalized numbers
– Thirty-two 64-bit FPRs for single- or double-precision operands
— Four vector units and 32-entry vector register file (VRs)
– Vector permute unit (VPU)
– Vector integer unit 1 (VIU1) handles short-latency AltiVec™ integer instructions, such as
vector add instructions (vaddsbs,
vaddshs,
and
vaddsws,
for example)
– Vector integer unit 2 (VIU2) handles longer-latency AltiVec integer instructions, such as
vector multiply add instructions (vmhaddshs,
vmhraddshs,
and
vmladduhm,
for
example)
– Vector floating-point unit (VFPU)
— Three-stage load/store unit (LSU)
– Supports integer, floating-point, and vector instruction load/store traffic
– Four-entry vector touch queue (VTQ) supports all four architected AltiVec data stream
operations
– Three-cycle GPR and AltiVec load latency (byte, half-word, word, vector) with one-cycle
throughput
– Four-cycle FPR load latency (single, double) with one-cycle throughput
– No additional delay for misaligned access within double-word boundary
– Dedicated adder calculates effective addresses (EAs)
– Supports store gathering
– Performs alignment, normalization, and precision conversion for floating-point data
– Executes cache control and TLB instructions
– Performs alignment, zero padding, and sign extension for integer data
– Supports hits under misses (multiple outstanding misses)
– Supports both big- and little-endian modes, including misaligned little-endian accesses
•
Three issue queues FIQ, VIQ, and GIQ can accept as many as one, two, and three instructions,
respectively, in a cycle. Instruction dispatch requires the following:
— Instructions can be dispatched only from the three lowest IQ entries—IQ0, IQ1, and IQ2
— A maximum of three instructions can be dispatched to the issue queues per clock cycle
— Space must be available in the CQ for an instruction to dispatch (this includes instructions that
are assigned a space in the CQ but not in an issue queue)
•
Rename buffers
— 16 GPR rename buffers
— 16 FPR rename buffers
— 16 VR rename buffers
•
Dispatch unit
— Decode/dispatch stage fully decodes each instruction
4
MPC7455 RISC Microprocessor Hardware Specifications
MOTOROLA
Features
•
Completion unit
— The completion unit retires an instruction from the 16-entry completion queue (CQ) when all
instructions ahead of it have been completed, the instruction has finished execution, and no
exceptions are pending.
— Guarantees sequential programming model (precise exception model)
— Monitors all dispatched instructions and retires them in order
— Tracks unresolved branches and flushes instructions after a mispredicted branch
— Retires as many as three instructions per clock cycle
•
Separate on-chip L1 instruction and data caches (Harvard architecture)
— 32-Kbyte, eight-way set-associative instruction and data caches
— Pseudo least-recently-used (PLRU) replacement algorithm
— 32-byte (eight-word) L1 cache block
— Physically indexed/physical tags
— Cache write-back or write-through operation programmable on a per-page or per-block basis
— Instruction cache can provide four instructions per clock cycle; data cache can provide four
words per clock cycle
— Caches can be disabled in software
— Caches can be locked in software
— MESI data cache coherency maintained in hardware
— Separate copy of data cache tags for efficient snooping
— Parity support on cache and tags
— No snooping of instruction cache except for
icbi
instruction
— Data cache supports AltiVec LRU and transient instructions
— Critical double- and/or quad-word forwarding is performed as needed. Critical quad-word
forwarding is used for AltiVec loads and instruction fetches. Other accesses use critical
double-word forwarding.
•
Level 2 (L2) cache interface
— On-chip, 256-Kbyte, eight-way set-associative unified instruction and data cache
— Fully pipelined to provide 32 bytes per clock cycle to the L1 caches
— A total nine-cycle load latency for an L1 data cache miss that hits in L2
— PLRU replacement algorithm
— Cache write-back or write-through operation programmable on a per-page or per-block basis
— 64-byte, two-sectored line size
— Parity support on cache
•
Level 3 (L3) cache interface (not implemented on MPC7445)
— Provides critical double-word forwarding to the requesting unit
— Internal L3 cache controller and tags
— External data SRAMs
— Support for 1- and 2-Mbyte L3 caches
MOTOROLA
MPC7455 RISC Microprocessor Hardware Specifications
5