Datasheet

Rochester Electronics Manufactured Components

Rochester branded components are manufactured using either die/wafers
purchased from the original suppliers or Rochester wafers recreated from
the original IP. All recreations are done with the approval of the OCM.

Parts are tested using original factory test programs or Rochester
developed test solutions to guarantee product meets or exceeds the OCM
data sheet.

Quality Overview
• ISO-9001
• AS9120 certification
• Qualified Manufacturers List (QML) MIL-PRF-38535
• Class Q Military
• Class V Space Level
• Qualified Suppliers List of Distributors (QSLD)
• Rochester is a critical supplier to DLA and meets all industry and
  DLA standards.

Rochester Electronics, LLC is committed to supplying products that
satisfy customer expectations for quality and are equal to those
originally supplied by industry manufacturers.

The original manufacturer’s datasheet accompanying this document reflects
the performance and specifications of the Rochester manufactured version
of this device. Rochester Electronics guarantees the performance of its
semiconductor products to the original OEM specifications. ‘Typical’
values are for reference purposes only. Certain minimum or maximum
ratings may be based on product characterization, design, simulation, or
sample testing.

© 2013 Rochester Electronics, LLC. All Rights Reserved 07112013
To learn more, please visit www.rocelec.com
DSP Microcomputer
ADSP-21065L

SUMMARY
High Performance Signal Computer for Communications, Audio, Automotive,
  Instrumentation and Industrial Applications
Super Harvard Architecture Computer (SHARC®)
Four Independent Buses for Dual Data, Instruction, and I/O Fetch on a
  Single Cycle
32-Bit Fixed-Point Arithmetic; 32-Bit and 40-Bit Floating-Point Arithmetic
544 Kbits On-Chip SRAM Memory and Integrated I/O Peripheral
I²S Support, for Eight Simultaneous Receive and Transmit Channels

KEY FEATURES
66 MIPS, 198 MFLOPS Peak, 132 MFLOPS Sustained Performance
User-Configurable 544 Kbits On-Chip SRAM Memory
Two External Port, DMA Channels and Eight Serial Port, DMA Channels
SDRAM Controller for Glueless Interface to Low Cost External Memory
  (@ 66 MHz)
64M Words External Address Range
12 Programmable I/O Pins and Two Timers with Event Capture Options
Code-Compatible with ADSP-2106x Family
208-Lead MQFP or 196-Ball Mini-BGA Package
3.3 Volt Operation
Flexible Data Formats and 40-Bit Extended Precision
  32-Bit Single-Precision and 40-Bit Extended-Precision IEEE
    Floating-Point Data Formats
  32-Bit Fixed-Point Data Format, Integer and Fractional, with Dual
    80-Bit Accumulators
Parallel Computations
  Single-Cycle Multiply and ALU Operations in Parallel with Dual Memory
    Read/Writes and Instruction Fetch
  Multiply with Add and Subtract for Accelerated FFT Butterfly
    Computation
  1024-Point Complex FFT Benchmark: 0.274 ms (18,221 Cycles)
[Figure 1. Functional Block Diagram. Blocks shown: core processor
(32 × 48-bit instruction cache, DAG1, DAG2, program sequencer, data
register file, multiplier, barrel shifter, ALU, timer); dual-ported SRAM
as two independent dual-ported blocks (Block 0 and Block 1), each with a
processor port and an I/O port; PM and DM address and data buses with bus
connect (PX); external port (SDRAM interface, multiprocessor interface,
host port, address and data bus muxes); I/O processor (memory-mapped IOP
registers, DMA controller, SPORT 0 and SPORT 1 with I²S, each 2 Rx and
2 Tx); JTAG test and emulation.]
SHARC is a registered trademark of Analog Devices, Inc.
REV. C
Information furnished by Analog Devices is believed to be accurate and
reliable. However, no responsibility is assumed by Analog Devices for its
use, nor for any infringements of patents or other rights of third parties that
may result from its use. No license is granted by implication or otherwise
under any patent or patent rights of Analog Devices. Trademarks and
registered trademarks are the property of their respective companies.
One Technology Way, P.O. Box 9106, Norwood, MA 02062-9106, U.S.A.
Tel: 781/329-4700
www.analog.com
Fax: 781/326-8703
© 2003 Analog Devices, Inc. All rights reserved.
544 Kbits Configurable On-Chip SRAM
Dual-Ported for Independent Access by Core Processor
and DMA
Configurable in Combinations of 16-, 32-, 48-Bit Data and
Program Words in Block 0 and Block 1
DMA Controller
Ten DMA Channels—Two Dedicated to the External Port
and Eight Dedicated to the Serial Ports
Background DMA Transfers at up to 66 MHz, in Parallel
with Full Speed Processor Execution
Performs Transfers Between:
Internal RAM and Host
Internal RAM and Serial Ports
Internal RAM and Master or Slave SHARC
Internal RAM and External Memory or I/O Devices
External Memory and External Devices
Host Processor Interface
Efficient Interface to 8-, 16-, and 32-Bit Microprocessors
Host Can Directly Read/Write ADSP-21065L IOP Registers
Multiprocessing
Distributed On-Chip Bus Arbitration for Glueless, Parallel
Bus Connect Between Two ADSP-21065Ls Plus Host
132 Mbytes/s Transfer Rate Over Parallel Bus
Serial Ports
Independent Transmit and Receive Functions
Programmable 3-Bit to 32-Bit Serial Word Width
I²S Support Allowing Eight Transmit and Eight Receive Channels
Glueless Interface to Industry Standard Codecs
TDM Multichannel Mode with μ-Law/A-Law Hardware Companding
Multichannel Signaling Protocol
GENERAL DESCRIPTION
The ADSP-21065L is a powerful member of the SHARC family of 32-bit
processors optimized for cost sensitive applications. The SHARC (Super
Harvard Architecture) family offers the highest levels of performance
and memory integration of any 32-bit DSP in the industry. SHARC
processors are also the only DSPs in the industry that offer both
fixed-point and floating-point capabilities without compromising
precision or performance.
The ADSP-21065L is fabricated in a high speed, low power CMOS process,
0.35 µm technology. With its on-chip instruction cache, the processor
can execute every instruction in a single cycle. Table I lists the
performance benchmarks for the ADSP-21065L.
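
The timing column of Table I follows from multiplying each cycle count by the 15 ns cycle time. The short Python sketch below reproduces that arithmetic; the function and dictionary names are illustrative and not part of any Analog Devices tool. With these inputs the 1024-point FFT comes out at roughly 0.273 ms, close to the 0.274 ms quoted for that benchmark.

```python
# Cycle time taken from Table I (15.00 ns per processor cycle).
CYCLE_TIME_NS = 15.0

# Cycle counts as listed in Table I (names are illustrative labels).
BENCHMARK_CYCLES = {
    "1024-pt complex FFT": 18221,
    "matrix multiply [3x3]x[3x1]": 9,
    "matrix multiply [4x4]x[4x1]": 16,
    "FIR filter (per tap)": 1,
    "IIR filter (per biquad)": 4,
    "divide Y/X": 6,
    "inverse square root": 9,
}


def execution_time_ns(cycles, cycle_time_ns=CYCLE_TIME_NS):
    """Return the execution time in nanoseconds for a cycle count."""
    return cycles * cycle_time_ns


if __name__ == "__main__":
    for name, cycles in BENCHMARK_CYCLES.items():
        t_ns = execution_time_ns(cycles)
        print(f"{name}: {cycles} cycles -> {t_ns:.0f} ns")
```

The same helper can be applied to any hand-counted inner loop to estimate its run time at the rated cycle time.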
The ADSP-21065L SHARC combines a floating-point DSP
core with integrated, on-chip system features, including a
544 Kbit SRAM memory, host processor interface, DMA con-
troller, SDRAM controller, and enhanced serial ports.
Figure 1 shows a block diagram of the ADSP-21065L, illustrat-
ing the following architectural features:
Computation Units (ALU, Multiplier, and Shifter) with a
Shared Data Register File
Data Address Generators (DAG1, DAG2)
Program Sequencer with Instruction Cache
Timers with Event Capture Modes
On-Chip, dual-ported SRAM
External Port for Interfacing to Off-Chip Memory and
Peripherals
Host Port and SDRAM Interface
DMA Controller
Enhanced Serial Ports
JTAG Test Access Port
Table I. Performance Benchmarks

Benchmark                                            Timing           Cycles
Cycle Time                                           15.00 ns         1
1024-Pt. Complex FFT (Radix 4, with Digit Reverse)   0.274 ms         18221
Matrix Multiply (Pipelined)
  [3 × 3] × [3 × 1]                                  135 ns           9
  [4 × 4] × [4 × 1]                                  240 ns           16
FIR Filter (per Tap)                                 15 ns            1
IIR Filter (per Biquad)                              60 ns            4
Divide Y/X                                           90 ns            6
Inverse Square Root (1/√x)                           135 ns           9
DMA Transfers                                        264 Mbytes/sec.

[Figure 2. ADSP-21065L Single-Processor System. The diagram shows
ADSP-21065L #1 with CLKIN, RESET, and ID1-0 inputs, the SPORT0 pins
(TX0_A, TX0_B, RX0_A, RX0_B) and SPORT1 pins (TX1_A, TX1_B, RX1_A,
RX1_B), and the external port signals ADDR23-0, DATA31-0, RD, WR, ACK,
MS3-0, BMS, SBTS, SW, CS, HBR, HBG, REDY, RAS, CAS, DQM, SDWE, SDCLK1-0,
SDCKE, SDA10, CPA, BR2, and BR1, connected to an optional boot EPROM, an
optional host processor, and optional SDRAM.]

ADSP-21000 FAMILY CORE ARCHITECTURE
The ADSP-21065L is code and function compatible with the
ADSP-21060/ADSP-21061/ADSP-21062. The ADSP-21065L
includes the following architectural features of the SHARC
family core.

Independent, Parallel Computation Units
The arithmetic/logic unit (ALU), multiplier, and shifter all
perform single-cycle instructions. The three units are arranged
in parallel, maximizing computational throughput. Single
multifunction instructions execute parallel ALU and multiplier
operations. These computation units support IEEE 32-bit
single-precision floating-point, extended precision 40-bit
floating-point, and 32-bit fixed-point data formats.

Data Register File
A general-purpose data register file is used for transferring data
between the computation units and the data buses, and for
storing intermediate results. This 10-port, 32-register (16 primary,
16 secondary) register file, combined with the ADSP-21000
Harvard architecture, allows unconstrained data flow between
computation units and internal memory.

Single-Cycle Fetch of Instruction and Two Operands
The ADSP-21065L features an enhanced Super Harvard Architecture
in which the data memory (DM) bus transfers data and
the program memory (PM) bus transfers both instructions and
data (see Figure 1). With its separate program and data memory
buses, and on-chip instruction cache, the processor can
simultaneously fetch two operands and an instruction (from the
cache), all in a single cycle.

Instruction Cache
The ADSP-21065L includes an on-chip instruction cache that
enables three-bus operation for fetching an instruction and two
data values. The cache is selective: only the instructions whose
fetches conflict with PM bus data accesses are cached. This
allows full-speed execution of core, looped operations such as
digital filter multiply-accumulates and FFT butterfly processing.
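
To illustrate why a selective cache lets a loop reach full speed after its first pass, the following Python sketch counts cycles under a deliberately simplified model. The cost model (one cycle per instruction, one extra stall cycle when a conflicting fetch misses the cache) and the FIFO replacement policy are assumptions made for this example, not a description of the device's actual cache organization; the 32-entry size is taken from the block diagram.

```python
from collections import OrderedDict

CACHE_ENTRIES = 32  # the block diagram shows a 32 x 48-bit instruction cache


def run_loop(conflicts, iterations, cache_entries=CACHE_ENTRIES):
    """Count cycles for a loop under a simplified selective-cache model.

    conflicts[i] is True when fetching instruction i collides with a
    PM bus data access. Assumed costs: 1 cycle per instruction, plus
    1 stall cycle whenever a conflicting fetch misses the cache.
    """
    cache = OrderedDict()  # instruction address -> True (FIFO, an assumption)
    cycles = 0
    for _ in range(iterations):
        for addr, conflict in enumerate(conflicts):
            cycles += 1
            if not conflict:
                continue  # non-conflicting fetches are never cached
            if addr in cache:
                continue  # hit: the instruction is supplied by the cache
            cycles += 1  # miss: the fetch waits behind the PM data access
            if len(cache) >= cache_entries:
                cache.popitem(last=False)  # evict the oldest entry
            cache[addr] = True
    return cycles


if __name__ == "__main__":
    # A four-instruction loop in which two fetches conflict with PM data.
    loop = [False, True, True, False]
    print(run_loop(loop, iterations=100))
```

In this model only the first pass through the loop pays the stall penalty; every later pass runs at one cycle per instruction, which is the behavior the paragraph above describes for looped operations.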
Data Address Generators with Hardware Circular Buffers
The ADSP-21065L’s two data address generators (DAGs)
implement circular data buffers in hardware. Circular buffers
allow efficient programming of delay lines and other data
structures required in digital signal processing, and are com-
monly used in digital filters and Fourier transforms. The
ADSP-21065L’s two DAGs contain sufficient registers to allow
the creation of up to 32 circular buffers (16 primary register
sets, 16 secondary). The DAGs automatically handle address
pointer wraparound, reducing overhead, increasing perfor-
mance, and simplifying implementation. Circular buffers can
start and end at any memory location.
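
The address wraparound that the DAGs perform in hardware can be sketched in software as follows. This is a minimal Python model assuming post-modify addressing with a base address, a buffer length, and a signed modifier no larger than the buffer length; the class and method names are hypothetical and chosen only for this illustration.

```python
class CircularBuffer:
    """Software model of a hardware circular buffer pointer."""

    def __init__(self, base, length):
        self.base = base      # first address of the buffer
        self.length = length  # number of locations in the buffer
        self.index = base     # current address pointer

    def post_modify(self, modifier):
        """Return the current address, then step the pointer with wraparound."""
        if abs(modifier) > self.length:
            raise ValueError("modifier must not exceed the buffer length")
        addr = self.index
        nxt = self.index + modifier
        if nxt >= self.base + self.length:
            nxt -= self.length  # wrapped past the end of the buffer
        elif nxt < self.base:
            nxt += self.length  # wrapped past the start of the buffer
        self.index = nxt
        return addr


if __name__ == "__main__":
    delay_line = CircularBuffer(base=0x100, length=4)
    print([hex(delay_line.post_modify(1)) for _ in range(6)])
```

On the processor this comparison and correction happen as part of the address update itself, so a filter loop does not spend instructions on pointer bookkeeping.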
Flexible Instruction Set
The 48-bit instruction word accommodates a variety of parallel
operations, for concise programming. For example, the
ADSP-21065L can conditionally execute a multiply, an add, a subtract
and a branch, all in a single instruction.

ADSP-21065L FEATURES
The ADSP-21065L is designed to achieve the highest system
throughput to enable maximum system performance. It can be
clocked by either a crystal or a TTL-compatible clock signal.
The ADSP-21065L uses an input clock with a frequency equal
to half the instruction rate; a 33 MHz input clock yields a
15 ns processor cycle (which is equivalent to 66 MHz).
Interfaces on the ADSP-21065L operate as shown below. Hereafter
in this document, 1x = input clock frequency, and 2x = processor’s
instruction rate.

The following clock operation ratings are based on 1x = 33 MHz
(instruction rate/core = 66 MHz):

Interface               Clock Rate
SDRAM                   66 MHz
External SRAM           33 MHz
Serial Ports            33 MHz
Multiprocessing         33 MHz
Host (Asynchronous)     33 MHz

Augmenting the ADSP-21000 family core, the ADSP-21065L
adds the following architectural features:

Dual-Ported On-Chip Memory
The ADSP-21065L contains 544 Kbits of on-chip SRAM,
organized into two banks: Bank 0 has 288 Kbits, and Bank 1 has
256 Kbits. Bank 0 is configured with 9 columns of 2K × 16 bits,
and Bank 1 is configured with 8 columns of 2K × 16 bits. Each
memory block is dual-ported for single-cycle, independent accesses
by the core processor and I/O processor or DMA controller.
The dual-ported memory and separate on-chip buses allow two
data transfers from the core and one from I/O, all in a single
cycle (see Figure 4 for the ADSP-21065L Memory Map).

On the ADSP-21065L, the memory can be configured as a
maximum of 16K words of 32-bit data, 34K words of 16-bit
data, 10K words of 48-bit instructions (and 40-bit data) or
combinations of different word sizes up to 544 Kbits. All the
memory can be accessed as 16-bit, 32-bit or 48-bit.

While each memory block can store combinations of code and
data, accesses are most efficient when one block stores data,
using the DM bus for transfers, and the other block stores
instructions and data, using the PM bus for transfers. Using the
DM and PM buses in this way, with one dedicated to each
memory block, assures single-cycle execution with two data
transfers. In this case, the instruction must be available in the
cache. Single-cycle execution is also maintained when one of
the data operands is transferred to or from off-chip, via the
ADSP-21065L’s external port.

Off-Chip Memory and Peripherals Interface
The ADSP-21065L’s external port provides the processor’s
interface to off-chip memory and peripherals. The 64M-word
off-chip address space is included in the ADSP-21065L’s
unified address space. The separate on-chip buses (for program
memory, data memory, and I/O) are multiplexed at the external
port to create an external system bus with a single 24-bit
address bus, four memory selects, and a single 32-bit data bus.
The on-chip Super Harvard Architecture provides three-bus
performance, while the off-chip unified address space gives
flexibility to the designer.

SDRAM Interface
The SDRAM interface enables the ADSP-21065L to transfer
data to and from synchronous DRAM (SDRAM) at 2x clock
frequency. The synchronous approach coupled with 2x clock
frequency supports data transfer at a high throughput, up to
220 Mbytes/sec.

The SDRAM interface provides a glueless interface with standard
SDRAMs (16 Mb, 64 Mb, and 128 Mb) and includes
options to support additional buffers between the ADSP-21065L
and SDRAM. The SDRAM interface is extremely flexible and
provides capability for connecting SDRAMs to any one of the
ADSP-21065L’s four external memory banks.

Systems with several SDRAM devices connected in parallel may
require buffering to meet overall system timing requirements.
The ADSP-21065L supports pipelining of the address and
control signals to enable such buffering between itself and
multiple SDRAM devices.

Host Processor Interface
The ADSP-21065L’s host interface provides easy connection to
standard microprocessor buses (8-, 16-, and 32-bit), requiring
little additional hardware. Supporting asynchronous transfers at
speeds up to 1x clock frequency, the host interface is accessed
through the ADSP-21065L’s external port. Two channels of
DMA are available for the host interface; code and data
transfers are accomplished with low software overhead.

The host processor requests the ADSP-21065L’s external bus
with the host bus request (HBR), host bus grant (HBG), and
ready (REDY) signals. The host can directly read and write the
IOP registers of the ADSP-21065L and can access the DMA
channel setup and mailbox registers. Vector interrupt support
enables efficient execution of host commands.
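
The 1x and 2x clock relationships used throughout this section, including the 1x limit on host transfers, can be tabulated with a few lines of Python. The sketch below assumes one 32-bit transfer per interface clock and takes "Mbytes" to mean 10^6 bytes; both are assumptions for illustration, and all names are hypothetical. Under those assumptions a 1x interface peaks at 132 Mbytes/s and a 2x interface at 264 Mbytes/s, which match the parallel bus and DMA transfer figures quoted elsewhere in this data sheet. The 220 Mbytes/s quoted for SDRAM is lower than the raw 2x figure, so these results should be read as upper bounds.

```python
INPUT_CLOCK_HZ = 33_000_000  # 1x, the example value used in the text
DATA_BUS_BYTES = 4           # 32-bit external data bus

# Clock multiplier per interface, from the clock operation ratings.
INTERFACE_MULTIPLIER = {
    "SDRAM": 2,
    "External SRAM": 1,
    "Serial Ports": 1,
    "Multiprocessing": 1,
    "Host (Asynchronous)": 1,
}


def instruction_rate_hz(input_clock_hz=INPUT_CLOCK_HZ):
    """The instruction rate (2x) is twice the input clock (1x)."""
    return 2 * input_clock_hz


def interface_clock_hz(name, input_clock_hz=INPUT_CLOCK_HZ):
    """Return the rated clock of an interface for a given 1x clock."""
    return INTERFACE_MULTIPLIER[name] * input_clock_hz


def peak_bus_bytes_per_s(clock_hz, bus_bytes=DATA_BUS_BYTES):
    """Upper bound assuming one full-width transfer per clock (assumption)."""
    return clock_hz * bus_bytes


if __name__ == "__main__":
    for name in INTERFACE_MULTIPLIER:
        clock = interface_clock_hz(name)
        peak = peak_bus_bytes_per_s(clock) / 1e6
        print(f"{name}: {clock / 1e6:.0f} MHz, up to {peak:.0f} Mbytes/s")
```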
DMA Controller
The ADSP-21065L’s on-chip DMA controller allows zero-
overhead, nonintrusive data transfers without processor inter-
vention. The DMA controller operates independently and
invisibly to the processor core, allowing DMA operations to
occur while the core is simultaneously executing its program
instructions.
DMA transfers can occur between the ADSP-21065L’s internal
memory and either external memory, external peripherals, or a
host processor. DMA transfers can also occur between the
ADSP-21065L’s internal memory and its serial ports. DMA
transfers between external memory and external peripheral
devices are another option. External bus packing to 16-, 32-, or
48-bit internal words is performed during DMA transfers.
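
As an illustration of what packing means here, the Python sketch below assembles narrower external words into wider internal words. The word order (least significant word first by default) and the function name are assumptions made for this example; the actual ordering and the supported width combinations are set by the DMA packing mode and should be taken from the device documentation.

```python
def pack_words(external_words, external_bits, internal_bits, lsw_first=True):
    """Pack narrow external words into wider internal words.

    lsw_first selects whether the first external word of each group
    becomes the least significant part of the internal word. The
    default ordering is an assumption for illustration only.
    """
    if internal_bits % external_bits != 0:
        raise ValueError("internal width must be a multiple of external width")
    per_word = internal_bits // external_bits
    if len(external_words) % per_word != 0:
        raise ValueError("input does not fill a whole number of internal words")

    mask = (1 << external_bits) - 1
    packed = []
    for start in range(0, len(external_words), per_word):
        group = external_words[start:start + per_word]
        if not lsw_first:
            group = group[::-1]
        value = 0
        for position, word in enumerate(group):
            value |= (word & mask) << (position * external_bits)
        packed.append(value)
    return packed


if __name__ == "__main__":
    # Three 16-bit external words form one 48-bit internal word.
    print([hex(w) for w in pack_words([0xEEFF, 0xCCDD, 0xAABB], 16, 48)])
```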
Ten channels of DMA are available on the ADSP-21065L—
eight via the serial ports, and two via the processor’s external
port (for either host processor, other ADSP-21065L, memory or