SHARC Processors
ADSP-21362/ADSP-21363/ADSP-21364/ADSP-21365/ADSP-21366
SUMMARY
High performance 32-bit/40-bit floating point processor
optimized for high performance audio processing
Single-instruction, multiple-data (SIMD) computational
architecture
On-chip memory—3M bits of on-chip SRAM
Code compatible with all other members of the SHARC family
The ADSP-2136x processors are available with up to 333 MHz
core instruction rate with unique audiocentric peripherals
such as the digital applications interface, S/PDIF trans-
ceiver, DTCP (digital transmission content protection
protocol), serial ports, precision clock generators, and
more. For complete ordering information, see
Ordering
Guide on Page 56.
DEDICATED AUDIO COMPONENTS
S/PDIF-compatible digital audio receiver/transmitter
8 channels of asynchronous sample rate converters (SRC)
16 PWM outputs configured as four groups of four outputs
ROM-based security features include:
JTAG access to memory permitted with a 64-bit key
Protected memory regions that can be assigned to limit
access under program control to sensitive code
PLL has a wide variety of software and hardware multi-
plier/divider ratios
Available in 136-ball CSP_BGA and 144-lead LQFP_EP
packages
Internal Memory
SIMD Core
Instruction
Cache
5 stage
Sequencer
Block 0
RAM/ROM
Block 1
RAM/ROM
Block 2
RAM
Block 3
RAM
DAG1/2
Timer
DMD 64-BIT
S
DMD 64-BIT
PMD 64-BIT
B0D
64-BIT
B1D
64-BIT
B2D
64-BIT
B3D
64-BIT
PEx
PEy
Core Bus
Cross Bar
Internal Memory I/F
PMD 64-BIT
FLAGx/IRQx/
TMREXP
JTAG
PERIPHERAL BUS
32-BIT
IOD 32-BIT
MTM/
DTCP
PERIPHERAL BUS
IOD BUS
CORE TIMER
FLAGS 2-0
ASRC
3-0
S/PDIF
Tx/Rx
PCG
A-B
SPI B
PDAP/ SPORT
IDP7-0
5-0
SPI
Core
Flags
PWM
3-0
PP
DAI Routing/Pins
PP Pin MUX
DAI Peripherals
Peripherals
Figure 1. Functional Block Diagram
SHARC and the SHARC logo are registered trademarks of Analog Devices, Inc.
Rev. J
Document Feedback
Information furnished by Analog Devices is believed to be accurate and reliable.
However, no responsibility is assumed by Analog Devices for its use, nor for any
infringements of patents or other rights of third parties that may result from its use.
Specifications subject to change without notice. No license is granted by implication
or otherwise under any patent or patent rights of Analog Devices. Trademarks and
registered trademarks are the property of their respective owners.
One Technology Way, P.O. Box 9106, Norwood, MA 02062-9106 U.S.A.
Tel: 781.329.4700
©2013 Analog Devices, Inc. All rights reserved.
Technical Support
www.analog.com
ADSP-21362/ADSP-21363/ADSP-21364/ADSP-21365/ADSP-21366
TABLE OF CONTENTS
Summary ............................................................... 1
Dedicated Audio Components .................................... 1
General Description ................................................. 3
SHARC Family Core Architecture ............................ 4
Family Peripheral Architecture ................................ 6
I/O Processor Features ........................................... 8
System Design ...................................................... 8
Development Tools ............................................... 9
Additional Information ........................................ 10
Related Signal Chains .......................................... 10
Pin Function Descriptions ....................................... 11
Specifications ........................................................ 14
Operating Conditions .......................................... 14
Electrical Characteristics ....................................... 15
Package Information ........................................... 16
ESD Caution ...................................................... 16
Maximum Power Dissipation ................................. 16
Absolute Maximum Ratings ................................... 16
Timing Specifications ........................................... 16
Output Drive Currents ......................................... 46
Test Conditions .................................................. 46
Capacitive Loading .............................................. 46
Thermal Characteristics ........................................ 47
144-Lead LQFP_EP Pin Configurations ....................... 48
136-Ball BGA Pin Configurations ............................... 50
Package Dimensions ............................................... 53
Surface-Mount Design .......................................... 54
Automotive Products .............................................. 55
Ordering Guide ..................................................... 56
REVISION HISTORY
7/13—Revision I to Revision J
Updated
Development Tools .......................................9
Added Nominal Value column in
Operating Conditions .. 14
Changed Max values in
Table 30
in
Pulse-Width Modulation
Generators ............................................................ 35
Updated
Ordering Guide .......................................... 56
Rev. J |
Page 2 of 60 |
July 2013
ADSP-21362/ADSP-21363/ADSP-21364/ADSP-21365/ADSP-21366
GENERAL DESCRIPTION
The ADSP-2136x SHARC
®
processor is a member of the SIMD
SHARC family of DSPs that feature Analog Devices, Inc., Super
Harvard Architecture. The processor is source code-compatible
with the ADSP-2126x and ADSP-2116x DSPs, as well as with
first generation ADSP-2106x SHARC processors in SISD
(single-instruction, single-data) mode. The ADSP-2136x are
32-/40-bit floating-point processors optimized for high
performance automotive audio applications. They contain a
large on-chip SRAM and ROM, multiple internal buses to elim-
inate I/O bottlenecks, and an innovative digital audio interface
(DAI).
As shown in the functional block diagram
on Page 1,
the
ADSP-2136x uses two computational units to deliver a signifi-
cant performance increase over the previous SHARC processors
on a range of signal processing algorithms. With its SIMD com-
putational hardware, the ADSP-2136x can perform two
GFLOPS running at 333 MHz.
Table 2. ADSP-2136x Family Features
Feature
RAM
ROM
Audio Decoders in ROM
1
Pulse-Width Modulation
S/PDIF
DTCP
2
SRC SNR Performance
1
Table 1
shows performance benchmarks for these devices.
Table 2
shows the features of the individual product offerings.
Table 1. Benchmarks (at 333 MHz)
Benchmark Algorithm
1024 Point Complex FFT (Radix 4, with reversal)
FIR Filter (per tap)
1
IIR Filter (per biquad)
1
Matrix Multiply (pipelined)
[3×3] × [3×1]
[4×4] × [4×1]
Divide (y/x)
Inverse Square Root
1
Speed
(at 333 MHz)
27.9 μs
1.5 ns
6.0 ns
13.5 ns
23.9 ns
10.5 ns
16.3 ns
Assumes two files in multichannel SIMD mode.
ADSP-21362
3M bit
4M bit
No
Yes
Yes
Yes
–128 dB
ADSP-21363
3M bit
4M bit
No
Yes
No
No
No SRC
ADSP-21364
3M bit
4M bit
No
Yes
Yes
No
–140 dB
ADSP-21365
3M bit
4M bit
Yes
Yes
Yes
Yes
–128 dB
ADSP-21366
3M bit
4M bit
Yes
Yes
Yes
No
–128 dB
Audio decoding algorithms include PCM, Dolby Digital EX, Dolby Pro Logic IIx, DTS 96/24, Neo:6, DTS ES, MPEG-2 AAC, MP3, and functions like bass management, delay,
speaker equalization, graphic equalization, and more. Decoder/post-processor algorithm combination support varies depending upon the chip version and the system
configurations. Please visit www.analog.com for complete information.
2
The ADSP-21362 and ADSP-21365 processors provide the Digital Transmission Content Protection protocol, a proprietary security protocol. Contact your Analog Devices
sales office for more information.
The diagram
on Page 1
shows the two clock domains that make
up the ADSP-2136x processors. The core clock domain contains
the following features:
• Two processing elements, each of which comprises an
ALU, multiplier, shifter, and data register file
• Data address generators (DAG1, DAG2)
• Program sequencer with instruction cache
• PM and DM buses capable of supporting four 32-bit data
transfers between memory and the core at every core pro-
cessor cycle
• One periodic interval timer with pinout
• On-chip SRAM (3M bit)
• On-chip mask-programmable ROM (4M bit)
• JTAG test access port for emulation and boundary scan.
The JTAG provides software debug through user break-
points, which allow flexible exception handling.
The diagram
on Page 1
also shows the following architectural
features:
• I/O processor that handles 32-bit DMA for the peripherals
• Six full duplex serial ports
• Two SPI-compatible interface ports—primary on dedi-
cated pins, secondary on DAI pins
• 8-bit or 16-bit parallel port that supports interfaces to off-
chip memory peripherals
• Digital audio interface that includes two precision clock
generators (PCG), an input data port with eight serial inter-
faces (IDP), an S/PDIF receiver/transmitter, 8-channel
asynchronous sample rate converter (ASRC), DTCP
cipher, six serial ports, a 20-bit parallel input data port
(PDAP), 10 interrupts, six flag outputs, six flag inputs,
three timers, and a flexible signal routing unit (SRU)
Rev. J |
Page 3 of 60 |
July 2013
ADSP-21362/ADSP-21363/ADSP-21364/ADSP-21365/ADSP-21366
SHARC FAMILY CORE ARCHITECTURE
The ADSP-2136x is code-compatible at the assembly level with
the ADSP-2126x, ADSP-21160, and ADSP-21161, and with the
first generation ADSP-2106x SHARC processors. The
ADSP-2136x shares architectural features with the ADSP-2126x
and ADSP-2116x SIMD SHARC processors, as shown in
Figure 2
and detailed in the following sections.
Entering SIMD mode also has an effect on the way data is trans-
ferred between memory and the processing elements. When in
SIMD mode, twice the data bandwidth is required to sustain
computational operation in the processing elements. Because of
this requirement, entering SIMD mode also doubles the
bandwidth between memory and the processing elements.
When using the DAGs to transfer data in SIMD mode, two data
values are transferred with each access of memory or
the register file.
SIMD Computational Engine
The processor contains two computational processing elements
that operate as a single-instruction, multiple-data (SIMD)
engine. The processing elements are referred to as PEX and PEY
and each contains an ALU, multiplier, shifter, and register file.
PEX is always active, and PEY can be enabled by setting the
PEYEN mode bit in the MODE1 register. When this mode is
enabled, the same instruction is executed in both processing ele-
ments, but each processing element operates on different data.
This architecture is efficient at executing math intensive signal
processing algorithms.
Independent, Parallel Computation Units
Within each processing element is a set of computational units.
The computational units consist of an arithmetic/logic unit
(ALU), multiplier, and shifter. These units perform all opera-
tions in a single cycle. The three units within each processing
element are arranged in parallel, maximizing computational
throughput. Single multifunction instructions execute parallel
ALU and multiplier operations. In SIMD mode, the parallel
ALU and multiplier operations occur in both processing
elements. These computation units support IEEE 32-bit,
single-precision floating-point, 40-bit extended-precision
floating-point, and 32-bit fixed-point data formats.
S
JTAG
FLAG
TIMER INTERRUPT CACHE
SIMD Core
PM ADDRESS 24
DMD/PMD 64
5 STAGE
PROGRAM SEQUENCER
PM DATA 48
DAG1
16x32
DAG2
16x32
PM ADDRESS 32
DM ADDRESS 32
PM DATA 64
SYSTEM
I/F
USTAT
4x32-BIT
PX
64-BIT
DM DATA 64
MULTIPLIER
SHIFTER
ALU
RF
Rx/Fx
PEx
16x40-BIT
DATA
SWAP
RF
Sx/SFx
PEy
16x40-BIT
ALU
SHIFTER
MULTIPLIER
MRF
80-BIT
MRB
80-BIT
ASTATx
STYKx
ASTATy
STYKy
MSB
80-BIT
MSF
80-BIT
Figure 2. SHARC Core Block Diagram
Rev. J |
Page 4 of 60 |
July 2013
ADSP-21362/ADSP-21363/ADSP-21364/ADSP-21365/ADSP-21366
Data Register File
Each processing element contains a general-purpose data regis-
ter file. The register files transfer data between the computation
units and the data buses, and store intermediate results. These
10-port, 32-register (16 primary, 16 secondary) files, combined
with the ADSP-2136x enhanced Harvard architecture, allow
unconstrained data flow between computation units and inter-
nal memory. The registers in PEX are referred to as R0–R15 and
in PEY as S0–S15.
processing, and are commonly used in digital filters and Fourier
transforms. The two DAGs contain sufficient registers to allow
the creation of up to 32 circular buffers (16 primary register sets,
16 secondary). The DAGs automatically handle address pointer
wraparound, reduce overhead, increase performance, and sim-
plify implementation. Circular buffers can start and end at any
memory location.
Flexible Instruction Set
The 48-bit instruction word accommodates a variety of parallel
operations for concise programming. For example, the
processor can conditionally execute a multiply, an add, and a
subtract in both processing elements while branching and fetch-
ing up to four 32-bit values from memory—all in a single
instruction.
Context Switch
Many of the processor’s registers have secondary registers that
can be activated during interrupt servicing for a fast context
switch. The data registers in the register file, the DAG registers,
and the multiplier result register all have secondary registers.
The primary registers are active at reset, while the secondary
registers are activated by control bits in a mode control register.
On-Chip Memory
The processor contains 3M bits of internal SRAM and 4M bits
of internal ROM. Each block can be configured for different
combinations of code and data storage (see
Table 3).
Each
memory block supports single-cycle, independent accesses by
the core processor and I/O processor. The processor’s memory
architecture, in combination with its separate on-chip buses,
allows two data transfers from the core and one from the I/O
processor, in a single cycle.
The SRAM can be configured as a maximum of 96K words of
32-bit data, 192K words of 16-bit data, 64K words of 48-bit
instructions (or 40-bit data), or combinations of different word
sizes up to 3M bits. All of the memory can be accessed as 16-bit,
32-bit, 48-bit, or 64-bit words. A 16-bit floating-point storage
format is supported that effectively doubles the amount of data
that can be stored on-chip. Conversion between the 32-bit
floating-point and 16-bit floating-point formats is performed in
a single instruction. While each memory block can store combi-
nations of code and data, accesses are most efficient when one
block stores data using the DM bus for transfers, and the other
block stores instructions and data using the PM bus for
transfers.
Using the DM bus and PM buses, with one bus dedicated to
each memory block, assures single-cycle execution with two
data transfers. In this case, the instruction must be available in
the cache.
Universal Registers
The universal registers are general purpose registers. The
USTAT (4) registers allow easy bit manipulations (Set, Clear,
Toggle, Test, XOR) for all system registers (control/status) of
the core.
The data bus exchange register (PX) permits data to be passed
between the 64-bit PM data bus and the 64-bit DM data bus, or
between the 40-bit register file and the PM/DM data bus. These
registers contain hardware to handle the data width difference.
Timer
A core timer that can generate periodic software interrupts. The
core timer can be configured to use FLAG3 as a timer expired
signal.
Single-Cycle Fetch of Instruction and Four Operands
The processor features an enhanced Harvard architecture in
which the data memory (DM) bus transfers data and the pro-
gram memory (PM) bus transfers both instructions and data
(see
Figure 2).
With the its separate program and data memory
buses and on-chip instruction cache, the processor can simulta-
neously fetch four operands (two over each data bus) and one
instruction (from the cache), all in a single cycle.
Instruction Cache
The processor includes an on-chip instruction cache that
enables three-bus operation for fetching an instruction and four
data values. The cache is selective—only the instructions whose
fetches conflict with PM bus data accesses are cached. This
cache allows full-speed execution of core looped operations
such as digital filter multiply-accumulates, and FFT butterfly
processing.
On-Chip Memory Bandwidth
The internal memory architecture allows three accesses at the
same time to any of the four blocks, assuming no block con-
flicts. The total bandwidth is gained with DMD and PMD buses
(2 × 64-bits, core CLK) and the IOD bus (32-bit, PCLK).
ROM-Based Security
The processor has a ROM security feature that provides hard-
ware support for securing user software code by preventing
unauthorized reading from the internal code. When using this
feature, the processor does not boot-load any external code, exe-
cuting exclusively from internal ROM. Additionally, the
processor is not freely accessible via the JTAG port. Instead, a
unique 64-bit key, which must be scanned in through the JTAG
Data Address Generators with Zero-Overhead Hardware
Circular Buffer Support
The processor’s two data address generators (DAGs) are used
for indirect addressing and implementing circular data buffers
in hardware. Circular buffers allow efficient programming of
delay lines and other data structures required in digital signal
Rev. J |
Page 5 of 60 |
July 2013