SHARC
Digital Signal Processor
ADSP-21160M/ADSP-21160N
SUMMARY
High performance 32-bit DSP: applications in audio, medical, military, graphics, imaging, and communication
Super Harvard architecture: 4 independent buses for dual data fetch, instruction fetch, and nonintrusive, zero-overhead I/O
Backward compatible: assembly source level compatible with code for ADSP-2106x DSPs
Single-instruction, multiple-data (SIMD) computational architecture: two 32-bit IEEE floating-point computation units, each with a multiplier, ALU, shifter, and register file
Integrated peripherals: integrated I/O processor, 4M bits on-chip dual-ported SRAM, glueless multiprocessing features, and ports (serial, link, external bus, and JTAG)
FEATURES
100 MHz (10 ns) core instruction rate (ADSP-21160N)
Single-cycle instruction execution, including SIMD operations in both computational units
Dual data address generators (DAGs) with modulo and bit-reverse addressing
Zero-overhead looping and single-cycle loop setup, providing efficient program sequencing
IEEE 1149.1 JTAG standard Test Access Port and on-chip emulation
400-ball 27 mm × 27 mm PBGA package
Available in lead-free (RoHS compliant) package
200 million fixed-point MACs sustained performance (ADSP-21160N)
[Block diagram. Core processor: timer, 32 × 48-bit instruction cache, program sequencer, DAG1 and DAG2 (8 × 4 × 32 each), and two processing elements (PEX and PEY), each with a multiplier, ALU, barrel shifter, and 16 × 40-bit data register file. The PM and DM address and data buses, with the bus connect (PX), link the core to two independent dual-ported SRAM blocks (Block 0 and Block 1), each with a processor port and an I/O port. External port: address bus and data bus multiplexers, multiprocessor interface, and host port. I/O processor: memory-mapped IOP registers (control, status, and data buffers), DMA controller, serial ports (2), and link ports (6). JTAG test and emulation.]
Figure 1. Functional Block Diagram
SHARC and the SHARC logo are registered trademarks of Analog Devices, Inc.
Rev. D
Information furnished by Analog Devices is believed to be accurate and reliable.
However, no responsibility is assumed by Analog Devices for its use, nor for any
infringements of patents or other rights of third parties that may result from its use.
Specifications subject to change without notice. No license is granted by implication
or otherwise under any patent or patent rights of Analog Devices. Trademarks and
registered trademarks are the property of their respective owners.
One Technology Way, P.O. Box 9106, Norwood, MA 02062-9106 U.S.A.
Tel: 781.329.4700
©2015 Analog Devices, Inc. All rights reserved.
www.analog.com
Single-instruction, multiple-data (SIMD) architecture provides
  Two computational processing elements
  Concurrent execution: each processing element executes the same instruction, but operates on different data
  Code compatibility: at assembly level, uses the same instruction set as the ADSP-2106x SHARC DSPs
Parallelism in buses and computational units allows
  Single-cycle execution (with or without SIMD) of a multiply operation, an ALU operation, a dual memory read or write, and an instruction fetch
  Transfers between memory and core at up to four 32-bit floating- or fixed-point words per cycle
  Accelerated FFT butterfly computation through a multiply with add and subtract
Memory attributes
  4M bits on-chip dual-ported SRAM for independent access by core processor, host, and DMA
  4G word address range for off-chip memory
  Memory interface supports programmable wait state generation and page-mode for off-chip memory
DMA controller supports
  14 zero-overhead DMA channels for transfers between ADSP-21160x internal memory and external memory, external peripherals, host processor, serial ports, or link ports
  64-bit background DMA transfers at core clock speed, in parallel with full-speed processor execution
Host processor interface to 16- and 32-bit microprocessors
Multiprocessing support provides
  Glueless connection for scalable DSP multiprocessing architecture
  Distributed on-chip bus arbitration for parallel bus connect of up to 6 ADSP-21160x processors plus host
  6 link ports for point-to-point connectivity and array multiprocessing
Serial ports provide
  Two synchronous serial ports with companding hardware
  Independent transmit and receive functions
  TDM support for T1 and E1 interfaces
64-bit-wide synchronous external port provides
  Glueless connection to asynchronous and SBSRAM external memories
Rev. D | Page 2 of 58 | September 2015
TABLE OF CONTENTS
General Description
ADSP-21160x Family Core Architecture
Memory and I/O Interface Features
Development Tools
Additional Information
Related Signal Chains
Pin Function Descriptions
Specifications
Operating Conditions: ADSP-21160M
Electrical Characteristics: ADSP-21160M
Operating Conditions: ADSP-21160N
Electrical Characteristics: ADSP-21160N
Absolute Maximum Ratings
ESD Sensitivity
Package Information
Timing Specifications
Output Drive Currents: ADSP-21160M
Output Drive Currents: ADSP-21160N
Power Dissipation
Test Conditions
Environmental Conditions
400-Ball PBGA Pin Configurations
Outline Dimensions
Surface-Mount Design
Ordering Guide
REVISION HISTORY
9/15—Rev. C to Rev. D
Removed model ADSP-21160NKB-100 (no longer available) from Ordering Guide
GENERAL DESCRIPTION
The ADSP-21160x SHARC® DSP family has two members: ADSP-21160M and ADSP-21160N. The ADSP-21160M is fabricated in a 0.25 micron CMOS process. The ADSP-21160N is fabricated in a 0.18 micron CMOS process. The ADSP-21160N offers higher performance and lower power consumption than the ADSP-21160M. Easing portability, the ADSP-21160x is application source code compatible with first generation ADSP-2106x SHARC DSPs in SISD (single instruction, single data) mode. To take advantage of the processor’s SIMD (single-instruction, multiple-data) capability, some code changes are needed. Like other SHARC DSPs, the ADSP-21160x is a 32-bit processor that is optimized for high performance DSP applications. The ADSP-21160x includes a core running up to 100 MHz, a dual-ported on-chip SRAM, an integrated I/O processor with multiprocessing support, and multiple internal buses to eliminate I/O bottlenecks.
Table 1 shows major differences between the ADSP-21160M and ADSP-21160N processors.
Table 1. ADSP-21160x SHARC Processor Family Features
Feature                        ADSP-21160M              ADSP-21160N
SRAM                           4 Mbits                  4 Mbits
Operating Voltage              3.3 V I/O, 2.5 V Core    3.3 V I/O, 1.9 V Core
Instruction Rate               80 MHz                   100 MHz
Link Port Transfer Rate (6)    80 MBytes/s              100 MBytes/s
Serial Port Transfer Rate (2)  40 Mbits/s               50 Mbits/s
Table 2. ADSP-21160x Benchmarks
Benchmark Algorithm                               ADSP-21160M (80 MHz)   ADSP-21160N (100 MHz)
1024 Point Complex FFT (Radix 4, with reversal)   115 μs                 92 μs
FIR Filter (per tap)                              6.25 ns                5 ns
IIR Filter (per biquad)                           25 ns                  20 ns
Matrix Multiply (pipelined) [3×3] × [3×1]         56.25 ns               45 ns
Matrix Multiply (pipelined) [4×4] × [4×1]         100 ns                 80 ns
Divide (y/x)                                      37.5 ns                30 ns
Inverse Square Root                               56.25 ns               45 ns
DMA Transfer Rate                                 560M bytes/s           800M bytes/s
The functional block diagram (Figure 1 on Page 1) of the ADSP-21160x illustrates the following architectural features:
• Two processing elements, each made up of an ALU, multiplier, shifter, and data register file
• Data address generators (DAG1, DAG2)
• Program sequencer with instruction cache
• PM and DM buses capable of supporting four 32-bit data transfers between memory and the core every core processor cycle
• Interval timer
• On-chip SRAM (4M bits)
• External port that supports:
  • Interfacing to off-chip memory peripherals
  • Glueless multiprocessing support for six ADSP-21160x SHARC DSPs
  • Host port
• DMA controller
• Serial ports and link ports
• JTAG test access port
Figure 2 shows a typical single-processor system. A multiprocessing system appears in Figure 3 on Page 6.
The ADSP-21160x introduces single-instruction, multiple-data (SIMD) processing. Using two computational units (ADSP-2106x SHARC DSPs have one), the ADSP-21160x can double performance versus the ADSP-2106x on a range of DSP algorithms.
Fabricated in a state-of-the-art, high speed, low power CMOS process, the ADSP-21160N has a 10 ns instruction cycle time. With its SIMD computational hardware running at 100 MHz, the ADSP-21160N can perform 600 million math operations per second (480 million operations for ADSP-21160M at a 12.5 ns instruction cycle time).
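The peak figures quoted above can be reproduced with simple arithmetic. The sketch below assumes that each of the two processing elements completes three math operations per cycle (one multiply, one add, and one subtract). That per-cycle breakdown is an assumption made for illustration, chosen because it matches the quoted totals, and the function name is hypothetical.

```python
def peak_math_ops_per_second(core_clock_hz, processing_elements=2, ops_per_element=3):
    """Peak math operations per second under the assumptions stated above.

    ops_per_element=3 is an assumed breakdown (multiply + add + subtract),
    not a figure stated in the text.
    """
    return core_clock_hz * processing_elements * ops_per_element


n_peak = peak_math_ops_per_second(100_000_000)  # ADSP-21160N, 10 ns cycle
m_peak = peak_math_ops_per_second(80_000_000)   # ADSP-21160M, 12.5 ns cycle
print(n_peak, m_peak)
```

Both results scale linearly with the core clock, so the ratio between the two parts is the same 80:100 ratio seen in their instruction rates.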
Table 2 shows performance benchmarks for the ADSP-21160x. These benchmarks provide single-channel extrapolations of measured dual-channel (SIMD) processing performance. For more information on benchmarking and optimizing DSP code for single- and dual-channel processing, see the Analog Devices website (www.analog.com).
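As a quick cross-check of the Table 2 figures, the sketch below converts a few of the benchmark times into core clock cycles. It uses only the times and clock rates listed in the table; the helper name and the selection of rows are illustrative choices.

```python
# (benchmark, ns on ADSP-21160M at 80 MHz, ns on ADSP-21160N at 100 MHz)
BENCHMARKS_NS = [
    ("FIR filter (per tap)", 6.25, 5.0),
    ("IIR filter (per biquad)", 25.0, 20.0),
    ("Divide (y/x)", 37.5, 30.0),
    ("Inverse square root", 56.25, 45.0),
]


def cycles(time_ns, clock_mhz):
    """Convert an execution time in nanoseconds to core clock cycles."""
    return time_ns * clock_mhz / 1000.0


cycle_counts = {}
for name, t_m, t_n in BENCHMARKS_NS:
    cycle_counts[name] = (cycles(t_m, 80), cycles(t_n, 100))
    print(f"{name}: {cycle_counts[name][0]} cycles (M), {cycle_counts[name][1]} cycles (N)")
```

Each benchmark takes the same number of cycles on both parts, so the time differences in the table come from the clock rate alone. The half-cycle FIR tap is consistent with the statement that the figures are single-channel extrapolations of dual-channel (SIMD) measurements.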
The ADSP-21160x continues the SHARC family’s industry-leading standards of integration for DSPs, combining a high performance 32-bit DSP core with integrated, on-chip system features. These features include a 4M-bit dual-ported SRAM memory, host processor interface, I/O processor that supports 14 DMA channels, two serial ports, six link ports, external parallel bus, and glueless multiprocessing.
ADSP-21160X FAMILY CORE ARCHITECTURE
The ADSP-21160x processor includes the following architectural features of the ADSP-2116x family core. The ADSP-21160x is code compatible at the assembly level with the ADSP-2106x and ADSP-21161.
SIMD Computational Engine
The ADSP-21160x contains two computational processing elements that operate as a single-instruction multiple-data (SIMD) engine. The processing elements are referred to as PEX and PEY, and each contains an ALU, multiplier, shifter, and register file. PEX is always active, and PEY may be enabled by setting the PEYEN mode bit in the MODE1 register. When this mode is enabled, the same instruction is executed in both processing elements, but each processing element operates on different data. This architecture is efficient at executing math-intensive DSP algorithms.
Entering SIMD mode also has an effect on the way data is transferred between memory and the processing elements. In SIMD mode, twice the data bandwidth is required to sustain computational operation in the processing elements. Because of this requirement, entering SIMD mode also doubles the bandwidth between memory and the processing elements. When using the DAGs to transfer data in SIMD mode, two data values are transferred with each access of memory or the register file.
[System diagram: the ADSP-21160x connected to a clock (CLKIN), an optional boot EPROM, optional memory-mapped devices on the external address and data buses, optional link devices (6 maximum), an optional serial device on each of the two serial ports, an optional DMA device, an optional host processor interface, and JTAG.]
Figure 2. Single-Processor System
Data Register File
A general-purpose data register file is contained in each processing element. The register files transfer data between the computation units and the data buses, and store intermediate results. These 10-port, 32-register (16 primary, 16 secondary) register files, combined with the ADSP-2116x enhanced Harvard architecture, allow unconstrained data flow between computation units and internal memory. The registers in PEX are referred to as R0–R15 and in PEY as S0–S15.
Single-Cycle Fetch of Instruction and Four Operands
The processor features an enhanced Harvard architecture in which the data memory (DM) bus transfers data, and the program memory (PM) bus transfers both instructions and data (see the functional block diagram, Figure 1). With the ADSP-21160x DSP’s separate program and data memory buses and on-chip instruction cache, the processor can simultaneously fetch four operands and an instruction (from the cache), all in a single cycle.
Instruction Cache
The ADSP-21160x includes an on-chip instruction cache that enables three-bus operation for fetching an instruction and four data values. The cache is selective: only the instructions whose fetches conflict with PM bus data accesses are cached. This cache allows full-speed execution of core looped operations, such as digital filter multiply-accumulates and FFT butterfly processing.
Data Address Generators with Hardware Circular Buffers
The ADSP-21160x DSP’s two data address generators (DAGs) are used for indirect addressing and provide for implementing circular data buffers in hardware. Circular buffers allow efficient programming of delay lines and other data structures required in digital signal processing, and are commonly used in digital filters and Fourier transforms. The two DAGs of the ADSP-21160x contain sufficient registers to allow the creation of up to 32 circular buffers (16 primary register sets, 16 secondary). The DAGs automatically handle address pointer wraparound, reducing overhead, increasing performance, and simplifying implementation. Circular buffers can start and end at any memory location.
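The pointer wraparound described above can be pictured with a small software model. The following is a minimal sketch, assuming a post-modify pointer that is kept inside a buffer defined by a base address and a length. The class and attribute names are hypothetical and do not correspond to actual DAG registers, and the model ignores any hardware restrictions on the step size.

```python
class CircularAddressGenerator:
    """Software model of a post-modify circular address pointer (illustrative only)."""

    def __init__(self, base, length):
        if length <= 0:
            raise ValueError("length must be positive")
        self.base = base      # first address of the buffer (any location)
        self.length = length  # number of locations in the buffer
        self.index = base     # current pointer, starts at the base

    def post_modify(self, step):
        """Return the current address, then advance by step with wraparound."""
        address = self.index
        offset = (self.index - self.base + step) % self.length
        self.index = self.base + offset
        return address


# A 4-entry delay line placed at an arbitrary base address of 5.
memory = [0.0] * 16
delay_line = CircularAddressGenerator(base=5, length=4)
for sample in [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]:
    memory[delay_line.post_modify(1)] = sample
print(memory[5:9])
```

After six writes the two oldest samples have been overwritten, and no explicit bounds check appears in the loop. In hardware that check is what the DAGs perform automatically.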
Flexible Instruction Set
The 48-bit instruction word accommodates a variety of parallel operations for concise programming. For example, the processor can conditionally execute a multiply, an add, and a subtract in both processing elements while branching, all in a single instruction.
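As a rough software picture of such an instruction, the sketch below applies one conditional multiply, add, and subtract step to two sets of operands, one per processing element. It is a simplified model under stated assumptions: the function names and operand layout are hypothetical, a single condition is shared by both elements, and the branch, the instruction encoding, and any register constraints are not modeled.

```python
def multifunction_step(operands, condition=True):
    """Model one conditional multiply/add/subtract step for one processing element.

    operands is (mx, my, ax, ay): two multiplier inputs and two ALU inputs.
    Returns (product, total, difference), or None when the condition is false.
    """
    if not condition:
        return None
    mx, my, ax, ay = operands
    return (mx * my, ax + ay, ax - ay)


def simd_multifunction_step(pex_operands, pey_operands, condition=True):
    """Apply the same step to both processing elements, each with its own data."""
    return (
        multifunction_step(pex_operands, condition),
        multifunction_step(pey_operands, condition),
    )


pex_result, pey_result = simd_multifunction_step(
    (2.0, 3.0, 5.0, 1.0),  # operands for the first processing element
    (4.0, 0.5, 7.0, 2.0),  # operands for the second processing element
)
print(pex_result, pey_result)
```

The add and subtract on the same pair of inputs are the pattern an FFT butterfly needs, which is why this combination is singled out.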
Independent, Parallel Computation Units
Within each processing element is a set of computational units. The computational units consist of an arithmetic/logic unit (ALU), multiplier, and shifter. These units perform single-cycle instructions. The three units within each processing element are arranged in parallel, maximizing computational throughput. Single multifunction instructions execute parallel ALU and multiplier operations. In SIMD mode, the parallel ALU and multiplier operations occur in both processing elements. These computation units support IEEE 32-bit single-precision floating-point, 40-bit extended-precision floating-point, and 32-bit fixed-point data formats.
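One way to picture SIMD-mode operation is a dot product in which both processing elements run the same multiply-accumulate on different data. The sketch below is illustrative only: sending even-indexed elements to one lane and odd-indexed elements to the other is an assumption made for this example, not a description of how the hardware assigns data, and the function name is hypothetical.

```python
def simd_dot_product(coeffs, samples):
    """Dot product computed as two lanes running the same multiply-accumulate."""
    if len(coeffs) != len(samples) or len(coeffs) % 2:
        raise ValueError("inputs must have equal, even length")
    acc_pex = 0.0  # accumulator for the first processing element
    acc_pey = 0.0  # accumulator for the second processing element
    steps = 0
    for i in range(0, len(coeffs), 2):
        # One step: both lanes perform the same operation on different data.
        acc_pex += coeffs[i] * samples[i]
        acc_pey += coeffs[i + 1] * samples[i + 1]
        steps += 1
    # The two partial sums are combined once at the end.
    return acc_pex + acc_pey, steps


result, steps = simd_dot_product([1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 2.0, 1.0])
print(result, steps)
```

Four products are accumulated in two steps, which is the halving of work per step that two processing elements make possible.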