Standard Products
ACT 7000SC
64-Bit Superscalar Microprocessor
June 16, 2004
FEATURES
■ Full militarized QED RM7000 microprocessor
■ Dual Issue symmetric superscalar microprocessor with
  instruction prefetch optimized for system level
  price/performance
  ● 150, 200, 210, 225 MHz operating frequency -
    Consult Factory for latest speeds
  ● MIPS IV Superset Instruction Set Architecture
■ High performance interface (RM52xx compatible)
  ● 600 MB per second peak throughput
  ● 75 MHz max. freq., multiplexed address/data
  ● Supports 1/2 clock multipliers (2, 2.5, 3, 3.5, 4,
    4.5, 5, 6, 7, 8, 9)
  ● IEEE 1149.1 JTAG (TAP) boundary scan
■ Integrated primary and secondary caches - all are
  4-way set associative with 32 byte line size
  ● 16KB instruction
  ● 16KB data: non-blocking and write-back or
    write-through
  ● 256KB on-chip secondary: unified, non-blocking,
    block writeback
■ MIPS IV instruction set
  ● Data PREFETCH instruction allows the processor to
    overlap cache miss latency and instruction execution
  ● Floating point combined multiply-add instruction
    increases performance in signal processing and
    graphics applications
  ● Conditional moves reduce branch frequency
  ● Index address modes (register + register)
■ Integrated memory management unit (ACT52xx compatible)
  ● Fully associative joint TLB (shared by I and D
    translations)
  ● 48 dual entries map 96 pages
  ● 4 entry DTLB and 4 entry ITLB
  ● Variable page size (4KB to 16MB in 4x increments)
■ Embedded application enhancements
  ● Specialized DSP integer Multiply-Accumulate
    instruction (MAD/MADU) and three-operand multiply
    instruction (MUL/U)
  ● Per line cache locking in primaries and secondary
  ● Bypass secondary cache option
  ● I&D Test/Break-point (Watch) registers for
    emulation & debug
  ● Performance counter for system and software
    tuning & debug
  ● Ten fully prioritized vectored interrupts -
    6 external, 2 internal, 2 software
  ● Fast Hit-Writeback-Invalidate and Hit-Invalidate
    cache operations for efficient cache management
■ High-performance floating point unit -
  600 M FLOPS maximum
  ● Single cycle repeat rate for common single-precision
    operations and some double-precision operations
  ● Single cycle repeat rate for single-precision
    combined multiply-add operations
  ● Two cycle repeat rate for double-precision multiply
    and double-precision combined multiply-add operations
■ Fully static CMOS design with dynamic power down logic
  ● Standby reduced power mode with WAIT instruction
  ● 4 watts typical @ 2.5V Int., 3.3V I/O, 200MHz
■ Embedded supply de-coupling capacitors and additional
  PLL filter components
■ 208-lead CQFP, cavity-up package (F17)
■ 208-lead CQFP, inverted footprint (F24), with the same
  pin rotation as the commercial QED RM5261
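Several of the figures in the feature list follow from simple arithmetic. The sketch below is illustrative only: it assumes the 600 MB per second peak corresponds to one 8-byte transfer per 75 MHz bus clock (the 8-byte width is taken from the 64-bit system interface mentioned in the description), and it counts a megabyte as 10^6 bytes. The function and constant names are hypothetical.

```python
# Illustrative arithmetic only; names are hypothetical, values come from
# the feature list. The 8-byte bus width is an assumption based on the
# 64-bit system interface.
BUS_CLOCK_HZ = 75_000_000
BUS_WIDTH_BYTES = 8
JTLB_DUAL_ENTRIES = 48


def peak_throughput_mb_per_s(clock_hz=BUS_CLOCK_HZ, width_bytes=BUS_WIDTH_BYTES):
    """Peak bus throughput in MB/s, assuming one transfer per bus clock."""
    return clock_hz * width_bytes // 1_000_000


def page_sizes(smallest=4 * 1024, largest=16 * 1024 * 1024, step=4):
    """Page sizes from smallest to largest in multiplicative steps."""
    sizes = []
    size = smallest
    while size <= largest:
        sizes.append(size)
        size *= step
    return sizes


def jtlb_pages(dual_entries=JTLB_DUAL_ENTRIES):
    """Each dual entry maps two pages."""
    return dual_entries * 2


if __name__ == "__main__":
    print(peak_throughput_mb_per_s(), "MB/s peak")
    print([s // 1024 for s in page_sizes()], "KB page sizes")
    print(jtlb_pages(), "pages mapped by the joint TLB")
```

Under these assumptions the sketch reproduces the 600 MB per second, seven page sizes from 4KB to 16MB, and 96 mapped pages quoted above.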
SCD7000 Rev C
Block Diagram (figure not reproduced). Blocks shown:
On-chip 256K byte secondary cache, 4-way set associative,
with secondary tag sets A through D; primary instruction
cache and primary data cache, each 4-way set associative,
with ITag/ITLB and DTag/DTLB; store, write, read, pad,
address, and prefetch buffers on the A/D bus and pad bus;
instruction dispatch unit with F and M pipe registers and
F-Pipe, M-Pipe, D, and FA buses; floating-point load/align,
register file, packer/unpacker, comparator, multiplier
array, and MultAdd/Add/Sub/Cvt/Div/Sqrt unit under
floating-point control; joint TLB, Coprocessor 0, and
system/memory control; program counter, PC incrementer,
branch PC adder, and ITLB/DTLB virtuals (IVA, DVA); integer
register file and load aligner; F pipe adder, shifter, and
logicals; M pipe adder, store align/shifter, and logicals;
integer Mult/Div/Madd unit under integer control;
PLL/clocks.
DESCRIPTION
The ACT 7000SC is a highly integrated symmetric
superscalar microprocessor capable of issuing two
instructions each processor cycle. It has two high
performance 64-bit integer units as well as a high
throughput, fully pipelined 64-bit floating point unit. To
keep its multiple execution units running efficiently, the
ACT 7000SC integrates not only 16KB 4-way set
associative instruction and data caches but backs them up
with an integrated 256KB 4-way set associative secondary
as well. For maximum efficiency, the data and secondary
caches are writeback and nonblocking. An RM52XX family
compatible, operating system friendly memory
management unit with a 64/48-entry fully associative TLB
and a high-performance 64-bit system interface supporting
hardware prioritized and vectored interrupts round out the
main features of the processor.
The ACT 7000SC is ideally suited for high-end
embedded control applications such as internetworking,
high performance image manipulation, high speed printing,
and 3-D visualization.
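The cache organization described above fixes the number of sets in each cache. As a rough illustration (not taken from the datasheet), the sketch below derives the set count and the number of offset and index bits from the stated sizes, the 4-way associativity, and the 32-byte line size. The function names are hypothetical.

```python
# Illustrative cache geometry arithmetic; function names are hypothetical.
# Sizes, associativity, and line size are the values stated in this datasheet.

def cache_sets(size_bytes, ways=4, line_bytes=32):
    """Number of sets = total lines / ways."""
    lines = size_bytes // line_bytes
    return lines // ways


def address_bits(size_bytes, ways=4, line_bytes=32):
    """Return (offset_bits, index_bits) for a set associative cache."""
    offset_bits = line_bytes.bit_length() - 1
    index_bits = cache_sets(size_bytes, ways, line_bytes).bit_length() - 1
    return offset_bits, index_bits


if __name__ == "__main__":
    for name, size in (("primary (16KB)", 16 * 1024),
                       ("secondary (256KB)", 256 * 1024)):
        print(name, cache_sets(size), "sets", address_bits(size))
```

With these parameters each 16KB primary cache has 128 sets and the 256KB secondary has 2048 sets.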
CPU Registers
Like all MIPS ISA processors, the ACT 7000SC CPU
has a simple, clean user visible state consisting of 32
general purpose registers, or GPRs, two special purpose
registers for integer multiplication and division, and a
program counter; there are no condition code bits. Figure 1
shows the user visible state.
Superscalar Dispatch
The ACT 7000SC has an efficient symmetric
superscalar dispatch unit which allows it to issue up to two
instructions per cycle. For purposes of instruction issue, the
ACT 7000SC defines four classes of instructions: integer,
load/store, branches, and floating-point. There are two
logical pipelines, the function, or F, pipeline and the
memory, or M, pipeline. Note, however, that the M pipe can
execute integer as well as memory type instructions.
Table 1 – Instruction Issue Rules
F Pipe                             M Pipe
one of:                            one of:
integer, branch, floating-point,   integer, load/store
integer mul, div
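The issue rules in Table 1 can be expressed as a small check. This is a minimal sketch that looks only at instruction class: it ignores data dependencies and resource conflicts, which in the real processor can still prevent a pair from issuing together. The class name strings and the function name are hypothetical.

```python
# Minimal sketch of the Table 1 issue rules. Class names and the function
# name are hypothetical; dependencies and resource conflicts are ignored.
F_PIPE_CLASSES = {"integer", "branch", "floating-point", "integer mul/div"}
M_PIPE_CLASSES = {"integer", "load/store"}


def can_dual_issue(first, second):
    """True if one instruction can go to the F pipe and the other to the M pipe."""
    return ((first in F_PIPE_CLASSES and second in M_PIPE_CLASSES) or
            (first in M_PIPE_CLASSES and second in F_PIPE_CLASSES))


if __name__ == "__main__":
    pairs = [("integer", "integer"),
             ("floating-point", "load/store"),
             ("load/store", "load/store"),
             ("branch", "floating-point")]
    for a, b in pairs:
        print(a, "+", b, "->", can_dual_issue(a, b))
```

Two integer instructions can pair because the M pipe also executes integer operations, while two load/store instructions or a branch with a floating-point instruction compete for the same pipe.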
HARDWARE OVERVIEW
The ACT 7000SC offers a high level of integration
targeted at high-performance embedded applications. The
key elements of the ACT 7000SC are briefly described
below.
Figure 2 is a simplification of the pipeline section and
illustrates the basics of the instruction issue mechanism.
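Figure 1 – CPU Registers (figure not reproduced)
General Purpose Registers: r0 (reads as 0), r1, r2, ..., r29,
r30, r31, each 64 bits wide (bits 63 to 0)
Multiply/Divide Registers: HI and LO, each 64 bits wide
(bits 63 to 0)
Program Counter: PC, 64 bits wide (bits 63 to 0)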
Figure 2 – Instruction Issue Paradigm (figure not reproduced).
Blocks shown: Instruction Cache, Dispatch Unit, F Pipe IBus,
M Pipe IBus, FP F Pipe, FP M Pipe, Integer F Pipe,
Integer M Pipe.
The figure illustrates that one F pipe instruction and one
M pipe instruction can be issued concurrently but that two
M pipe or two F pipe instructions cannot be issued. Table 2
specifies more completely the instructions within each
class.
Table 2 – Dual Issue Instruction Classes
integer            load/store         floating-point     branch
add, sub, or,      lw, sw, ld,        fadd, fsub,        beq, bne,
xor, shift, etc.   sd, ldc1,          fmult, fmadd,      bCzT, bCzF,
                   sdc1, mov,         fdiv, fcmp,        j, etc.
                   movc, fmov,        fsqrt, etc.
                   etc.
The symmetric superscalar capability of the ACT
7000SC, in combination with its low latency integer
execution units and high-throughput fully pipelined
floating-point execution unit, provides unparalleled
price/performance in computationally intensive embedded
applications.
Pipeline
The logical length of both the F and M pipelines is five
stages with state committing in the register write, or W,
pipe stage. The physical length of the floating-point
execution pipeline is actually seven stages but this is
completely transparent to the user.
Figure 3 shows instruction execution within the
ACT 7000SC when instructions are issuing
simultaneously down both pipelines. As illustrated in the
figure, up to ten instructions can be executing
simultaneously. This figure presents a somewhat simplistic
view of the processor's operation, however, since the
out-of-order completion of loads, stores, and long latency
floating-point operations can result in there being even
more instructions in process than what is shown.
Note that instruction dependencies, resource conflicts,
and branches result in some of the instruction slots being
occupied by NOPs.
I0  1I 2I 1R 2R 1A 2A 1D 2D 1W 2W
I1  1I 2I 1R 2R 1A 2A 1D 2D 1W 2W
I2        1I 2I 1R 2R 1A 2A 1D 2D 1W 2W
I3        1I 2I 1R 2R 1A 2A 1D 2D 1W 2W
I4              1I 2I 1R 2R 1A 2A 1D 2D 1W 2W
I5              1I 2I 1R 2R 1A 2A 1D 2D 1W 2W
I6                    1I 2I 1R 2R 1A 2A 1D 2D 1W 2W
I7                    1I 2I 1R 2R 1A 2A 1D 2D 1W 2W
I8                          1I 2I 1R 2R 1A 2A 1D 2D 1W 2W
I9                          1I 2I 1R 2R 1A 2A 1D 2D 1W 2W
(each 1x/2x pair of phases is one cycle)
1I-1R:  Instruction cache access
2I:     Instruction virtual to physical address translation
2R:     Register file read, Bypass calculation, Instruction
        decode, Branch address calculation
1A:     Issue or slip decision, Branch decision
1A:     Data virtual address calculation
1A-2A:  Integer add, logical, shift
2A:     Store Align
2A-2D:  Data cache access and load align
1D:     Data virtual to physical address translation
2W:     Register file write
Figure 3 – Pipeline
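The "up to ten instructions" figure follows from two instructions issuing per cycle into a five-stage logical pipeline. The sketch below is an idealized model only: it assumes a pair issues every cycle with no slips, stalls, or NOP slots, and it does not model the longer floating-point pipeline or out-of-order completion. The stage letters come from the pipeline description; the function names are hypothetical.

```python
# Idealized model of dual issue into a five-stage pipeline. Function names
# are hypothetical; slips, stalls, and NOP slots are not modeled.
STAGES = ["I", "R", "A", "D", "W"]


def stage_of(instr, cycle):
    """Stage occupied by instruction number `instr` in `cycle`, or None.

    Instructions issue in pairs: 0 and 1 in cycle 0, 2 and 3 in cycle 1, etc.
    """
    issue_cycle = instr // 2
    index = cycle - issue_cycle
    if 0 <= index < len(STAGES):
        return STAGES[index]
    return None


def in_flight(cycle, instruction_count):
    """Instruction numbers that occupy some pipeline stage in `cycle`."""
    return [i for i in range(instruction_count)
            if stage_of(i, cycle) is not None]


if __name__ == "__main__":
    for cycle in range(6):
        active = in_flight(cycle, 10)
        print("cycle", cycle, "in flight:", len(active))
```

In this model the pipeline fills over the first five cycles and holds ten instructions once the first pair reaches the W stage.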
Table 3 – ALU Operations
Unit      F Pipe                  M Pipe
Adder     add, sub                add, sub, data address add
Logic     logic, moves, zero      logic, moves, zero
          shifts (nop)            shifts (nop)
Shifter   non zero shift          non zero shift, store align
Integer Unit
Like the ACT 52xx family, the ACT 7000SC
implements the MIPS IV Instruction Set Architecture, and
is therefore fully upward compatible with applications that
run on processors such as the R4650 and R4700 that
implement the earlier generation MIPS III Instruction Set
Architecture. Additionally, the ACT 7000SC includes
two implementation specific instructions not found in the
baseline MIPS IV ISA, but that are useful in the embedded
market place. Described in detail in a later section of this
datasheet, these instructions are integer
multiply-accumulate and three-operand integer multiply.
The ACT 7000SC integer unit includes thirty-two
general purpose 64-bit registers, the HI/LO result registers
for the two-operand integer multiply/divide
operations, and the program counter, or PC. There are two
separate execution units, one of which can execute
function, or F, type instructions and one which can execute
memory, or M, type instructions. See above for a
description of the instruction types and the issue rules. As
a special case, integer multiply/divide instructions as well
as their corresponding MFHI and MFLO instructions can
only be executed in the F type execution unit. Within each
execution unit the operational characteristics are the same
as on previous QED designs with single cycle ALU
operations (add, sub, logical, shift), one cycle load delay,
and an autonomous multiply/divide unit.
Register File
The ACT 7000SC has thirty-two general purpose
registers, with register location 0 (r0) hardwired to a zero
value. These registers are used for scalar integer operations
and address calculation. In order to service the two integer
execution units, the register file has four read ports and two
write ports and is fully bypassed both within and between
the two execution units to minimize operation latency in
the pipeline.
ALU
The ACT 7000SC has two complete integer ALUs each
consisting of an integer adder/subtractor, a logic unit, and a
shifter. Table 3 shows the functions performed by the
ALUs for each execution unit. Each of these units is
optimized to perform all operations in a single processor
cycle.
Integer Multiply/Divide
The ACT 7000SC has a single dedicated integer
multiply/divide unit optimized for high-speed multiply and
multiply-accumulate operations. The multiply/divide unit
resides in the F type execution unit. Table 4 shows the
performance of the multiply/divide unit on each operation.
Table 4 – Integer Multiply / Divide Operations
Opcode           Operand   Latency   Repeat   Stall
                 Size                Rate     Cycles
MULT/U, MAD/U    16 bit    4         3        0
                 32 bit    5         4        0
MUL              16 bit    4         3        2
                 32 bit    5         4        3
DMULT, DMULTU    any       9         8        0
DIV, DIVU        any       36        36       0
DDIV, DDIVU      any       68        68       0
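The latency and repeat rate columns of Table 4 can be combined into a rough cycle estimate for a run of independent operations. The sketch below uses a common simplification that is not stated in the datasheet: the first result is available after the latency, and each further operation adds one repeat interval. It ignores the stall cycles column, issue constraints, and dependencies. The dictionary keys are shortened opcode names chosen for this example, with each entry also standing for the other opcodes on the same table row.

```python
# Simplified estimate from the Table 4 figures. The formula is an assumption
# for illustration and ignores stall cycles and dependencies. Keys are
# shortened, hypothetical labels for the table rows.
TABLE4 = {
    # (row, operand size): (latency, repeat rate, stall cycles)
    ("MULT", "16 bit"): (4, 3, 0),
    ("MULT", "32 bit"): (5, 4, 0),
    ("MUL", "16 bit"): (4, 3, 2),
    ("MUL", "32 bit"): (5, 4, 3),
    ("DMULT", "any"): (9, 8, 0),
    ("DIV", "any"): (36, 36, 0),
    ("DDIV", "any"): (68, 68, 0),
}


def estimate_cycles(row, operand_size, count):
    """Cycles until the last of `count` back-to-back results is ready."""
    if count < 1:
        return 0
    latency, repeat_rate, _stall = TABLE4[(row, operand_size)]
    return latency + (count - 1) * repeat_rate


if __name__ == "__main__":
    print(estimate_cycles("MULT", "32 bit", 8))
    print(estimate_cycles("DDIV", "any", 2))
```

Under this model, eight back-to-back 32-bit multiplies take 5 + 7 x 4 = 33 cycles, and two 64-bit divides take 68 + 68 = 136 cycles.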
The baseline MIPS IV ISA specifies that the results of a
multiply or divide operation be placed in the Hi and Lo
registers. These values can then be transferred to the
general purpose register file using the Move-from-Hi and
Move-from-Lo (MFHI/MFLO) instructions.
In addition to the baseline MIPS IV integer multiply
instructions, the ACT 7000SC also implements the
3-operand multiply instruction, MUL. This instruction
specifies that the multiply result go directly to the integer
register file rather than the Lo register. The portion of the
multiply that would have normally gone into the Hi register
is discarded. For applications where it is known that the
upper half of the multiply result is not required, using the
MUL instruction eliminates the necessity of executing an
explicit MFLO instruction.
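The difference between the two-operand multiply and the three-operand MUL can be modeled in a few lines. This is a behavioral sketch only: it assumes signed 32-bit operands, represents register values as unsigned 32-bit integers, and omits the sign extension to 64 bits that the hardware registers would carry. The function names are hypothetical and are not instruction encodings.

```python
# Behavioral sketch, assuming signed 32-bit operands. Function names are
# hypothetical; 64-bit sign extension of register values is not modeled.
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(value):
    """Interpret the low 32 bits of value as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def mult(rs, rt):
    """Model MULT: return the (HI, LO) halves of the 64-bit product."""
    product = (to_signed32(rs) * to_signed32(rt)) & MASK64
    return (product >> 32) & MASK32, product & MASK32


def mul(rs, rt):
    """Model MUL: the low half goes straight to the destination register.

    The half that MULT would place in HI is discarded.
    """
    _hi, lo = mult(rs, rt)
    return lo


if __name__ == "__main__":
    hi, lo = mult(0x10000, 0x10000)
    print(hex(hi), hex(lo))   # the upper half is needed for this product
    print(mul(6, 7))          # the product fits in the low half
```

When the product fits in the low half, `mul` returns the same value that MULT followed by MFLO would deliver, in one instruction instead of two.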
Also included in the ACT 7000SC are the multiply-add
instructions MAD/MADU. These instructions multiply two
operands and add the resulting product to the current
contents of the Hi and Lo registers. The
multiply-accumulate operation is the core primitive of
almost all signal processing algorithms, allowing the ACT
7000SC to eliminate the need for a separate DSP engine in
many embedded applications.
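A dot product shows why multiply-accumulate is the core primitive of signal processing code. The sketch below is a behavioral model, not the processor's implementation: it assumes signed 32-bit operands and treats Hi and Lo as the upper and lower 32 bits of a single 64-bit accumulator. The function names are hypothetical.

```python
# Behavioral sketch of a signed multiply-accumulate, assuming 32-bit
# operands and a 64-bit accumulator split across Hi and Lo. Function names
# are hypothetical.
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(value):
    """Interpret the low 32 bits of value as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def mad(hi_lo, rs, rt):
    """Add rs * rt to the accumulator and return the new (Hi, Lo) pair."""
    hi, lo = hi_lo
    total = ((hi << 32) | lo) + to_signed32(rs) * to_signed32(rt)
    total &= MASK64
    return (total >> 32) & MASK32, total & MASK32


def dot_product(xs, ys):
    """Accumulate x[i] * y[i] with one multiply-accumulate per element."""
    acc = (0, 0)
    for x, y in zip(xs, ys):
        acc = mad(acc, x, y)
    value = (acc[0] << 32) | acc[1]
    return value - (1 << 64) if value >> 63 else value


if __name__ == "__main__":
    print(dot_product([1, 2, 3], [4, 5, 6]))
    print(dot_product([-1], [5]))
```

Each loop iteration corresponds to a single multiply-accumulate, with no separate add or move out of Hi and Lo until the final result is read.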