RSC-300/364
Recognition • Synthesis • Control
Speech Recognition Microcontroller
GENERAL DESCRIPTION
The RSC-300/364, from the Interactive Speech™ family
of products, is an 8-bit microcontroller designed
specifically for speech applications in consumer
electronic products. The RSC-300/364 is a single chip
solution that combines the flexibility of a microcontroller
with advanced speech technology, including high-quality
speech recognition, speech and music synthesis, speaker
verification, and voice record and playback. Products
can use one or all of the RSC-300/364 features in a single
application.
The RSC-300/364 supports Sensory Speech™ 5.0, the
latest speech recognition technology from Sensory, which
includes a number of new techniques that significantly
improve recognition performance over previous versions.
Using a sophisticated neural network technology, on-chip
speech recognition algorithms reach an accuracy of
greater than 97% for speaker-independent recognition
and greater than 99% for speaker-dependent recognition.
In addition to the improved recognition performance, the
RSC-300/364 provides further on-chip integration of
features, including a preamplifier, multiplier, watchdog
timer, and 2.5 Kbytes of RAM. A complete system may
be built with few additional parts other than a battery,
speaker, microphone, and a few resistors and capacitors.
The RSC-300 is a ROM-less version designed for applications
that need more ROM space and consequently use off-chip
memory.
FEATURES
Full Range of Sensory Speech™ 5.0 Capabilities
• Speaker-independent speech recognition
• Speaker-dependent speech recognition
• High quality speech synthesis and sound effects
• Speaker verification
• Four-voice music synthesis
• Voice record & playback
Integrated Single-Chip Solution
• 4 MIPS 8-bit microcontroller
• On-chip A/D and D/A converters, and pre-amplifier
• 32 kHz clock for time keeping
• Internal 64 Kbytes ROM; 2.5 Kbytes RAM
• Internal 32 kHz watchdog timer
• External memory bus: 16-bit Address, 8-bit Data
• 24x24 Multiplier for rapid recognition processing
Low Power Requirements
• 2.4 – 5.25V operation for 2 or 3 battery applications
• ~10 mA operating current at 3V
• Power down mode; <5 µA standby current
RSC-300/364 Block Diagram
[Figure: block diagram. Blocks shown: Microphone, Preamp and Gain
Control, Multiplexer, ADC, AGC, Digital Logic, Microcontroller,
Multiplier, RAM, ROM (RSC-364 only), Oscillator, Watchdog Timer,
DAC, AMP, Speaker, General Purpose I/O, External Memory.]
From the
Interactive Speech™
Line of Products
RSC-300/364
DATA SHEET
RSC-300/364 OVERVIEW
The RSC-300/364 is a member of the Interactive
Speech™ line of products from Sensory. It features a
high-performance 8-bit microcontroller with on-chip
A/D, D/A, preamplifier, RAM and ROM (RSC-364
only). The RSC-300/364 is designed to bring a high
degree of integration and versatility into low-cost, power-
sensitive toy applications.
Various functional units have been integrated onto the
CPU core in order to reduce total system cost and
increase system reliability without degrading system
performance. The RSC-300/364 delivers 4 MIPS of
integer performance at 14.32 MHz providing maximum
performance at minimum cost.
The CPU core embedded in the RSC-300/364 is an 8-bit,
variable-length-instruction microcontroller. The
instruction set is somewhat similar to the Zilog Z8™, and
has a variety of addressing mode mov instructions. The
RSC-300/364 processor avoids the limitations of
dedicated A, B, and DPTR registers by having
completely symmetrical source and destinations for all
instructions. Of the 2.5 Kbytes of internal RAM, 2
Kbytes are organized as a Data Space, with 0.5K used for
Register Space.
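The figures quoted above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original data sheet; the function name is illustrative, and the result is only an average implied by the quoted 4 MIPS at 14.32 MHz, not a per-instruction timing.

```python
# Figures quoted in the overview above.
CLOCK_HZ = 14.32e6                 # processor clock
MIPS = 4.0                         # quoted integer throughput
RAM_BYTES = int(2.5 * 1024)        # 2.5 Kbytes of internal RAM
DATA_SPACE_BYTES = 2 * 1024        # 2 Kbytes organized as Data Space


def average_clocks_per_instruction(clock_hz: float, mips: float) -> float:
    """Average clock cycles per instruction implied by a MIPS rating."""
    return clock_hz / (mips * 1e6)


avg_clocks = average_clocks_per_instruction(CLOCK_HZ, MIPS)
register_space_bytes = RAM_BYTES - DATA_SPACE_BYTES

print(f"average clocks per instruction: {avg_clocks:.2f}")
print(f"RAM left for Register Space: {register_space_bytes} bytes")
```

The remainder of 512 bytes matches the 0.5K stated for Register Space.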
RECORD AND PLAYBACK
The RSC-300/364 can perform audio record and
playback at various compression levels depending on the
quantity and quality of playback desired. Data rates of
under 14,000 bits per second are achievable while
maintaining very high quality reproduction. The RSC-
300/364 also performs silence removal to improve sound
quality and reduce memory requirements.
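As a rough illustration of what the quoted data rate means for storage, the sketch below estimates recording time for a given memory size. It is an assumption-laden example: it takes 14,000 bits per second as an upper bound, assumes the whole 64 Kbyte space reachable through a 16-bit address bus holds audio, and ignores the extra savings from silence removal. The function name is hypothetical.

```python
def recording_seconds(memory_bytes: int, bits_per_second: float) -> float:
    """Seconds of audio that fit in memory at a constant data rate."""
    if bits_per_second <= 0:
        raise ValueError("bits_per_second must be positive")
    return memory_bytes * 8 / bits_per_second


# Assumption: one full 64 Kbyte memory space is dedicated to recorded audio.
seconds = recording_seconds(64 * 1024, 14_000)
print(f"about {seconds:.1f} s of audio in 64 Kbytes at 14,000 bits/s")
```

Lower data rates or silence removal would extend this duration.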
SPEAKER VERIFICATION
The RSC-300/364 can also perform text-dependent
speaker verification. After a speaker trains the chip on a
specific word, the chip is able to identify whether that
word is spoken by the original speaker, thus providing
biometric security.
POWER
Typical operating current is 10 mA at 14.32 MHz and 3V.
Lowering the clock frequency reduces power consumption,
although speech recognition requires a 14.32 MHz clock.
Standby current is <5 µA in power down mode.
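The operating and standby figures above can be combined into a duty-cycle estimate of average current. This is a minimal sketch under stated assumptions: the 1% active fraction and the 1000 mAh battery capacity are hypothetical values chosen for illustration, the 5 µA standby figure is used as an upper bound, and currents drawn by the speaker, microphone, and external memory are ignored.

```python
def average_current_ma(active_ma: float, standby_ua: float,
                       active_fraction: float) -> float:
    """Time-weighted average current for a part that is active only
    a fraction of the time and in power down mode otherwise."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    standby_ma = standby_ua / 1000.0
    return active_ma * active_fraction + standby_ma * (1.0 - active_fraction)


def battery_life_hours(capacity_mah: float, average_ma: float) -> float:
    """Idealized battery life, ignoring self-discharge and derating."""
    return capacity_mah / average_ma


# 10 mA operating and 5 uA standby come from the text above.
# The 1% duty cycle and 1000 mAh capacity are assumptions.
avg = average_current_ma(10.0, 5.0, 0.01)
life = battery_life_hours(1000.0, avg)
print(f"average current: {avg:.5f} mA, estimated life: {life:.0f} h")
```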
SPEECH RECOGNITION
The RSC-300/364 uses a neural network to perform
speaker-independent or speaker-dependent speech
recognition. Speaker-dependent recognition requires
external memory to store speech recognition information
(e.g., SRAM, optional Serial EEPROM, Flash Memory).
Speaker-independent recognition requires on-chip or off-
chip ROM to store the words to be recognized. The
RSC-300/364 has several additional speech recognition
features as described below.
Continuous listening allows the chip to listen
continuously for a specific word. With this feature, a
product can be used in a normal environment and
“activates” only when a specific word, preceded by quiet,
is spoken.
RSC-300/364 Architecture Diagram
[Figure: architecture diagram. Labels shown: AIFE1, AIFE2, AIN0,
AIN1, AOFE1, AOFE2, AOFE3, PRE-AMP, ADC, DAC, DACOUT, ANALOG
CONTROL, PULSE WIDTH MODULATOR, BUFOUT/PWM, SPEECH PROCESSING
UNIT, SRAM (2K), REGISTER SPACE (448 bytes), STACK SPACE
(8 levels), CPU, INTERRUPT LOGIC, TIMER1, TIMER2, OSC1 (XI1, XO1),
XI2, XO2, INTERNAL ROM (RSC-364; HIGH 32K x 8 and LOW 32K x 8),
EXTERNAL MEMORY INTERFACE (A[15:0], D[7:0], -RDC, -WRC, -RDD,
-WRD, -XMH, -XML).]
SPEECH AND MUSIC SYNTHESIS
The RSC-300/364 provides high-quality speech synthesis
by using a hybrid of a time-domain compression scheme
that improves on conventional ADPCM and a customized
reuse of sounds. Speech synthesis requires on-chip or
off-chip ROM to store audio sounds for synthesis.
The RSC-300/364 provides high-quality, low-cost four-
voice music synthesis which allows multiple,
simultaneous instruments for harmonizing. The RSC-
300/364 uses a MIDI-like system to generate music.
[Figure: architecture diagram, remaining labels: PORT 0 (P0.0-P0.7),
PORT 1 (P1.0-P1.7), OSC2, TIMING AND CONTROL, -RESET, -TE1/PWM,
BREAK POINT REGISTER.]
RSC-300/364 ARCHITECTURE
The RSC-300/364 is a highly integrated device that
combines:
• 8-bit microcontroller
• On-chip ROM (64 Kbytes, RSC-364 only) and RAM
(2.5 Kbytes), and the ability to address off-chip RAM
or ROM
• A/D converter and D/A converter
• Input amplifier and pulse width modulator
The RSC-300/364 has an external memory interface,
with a 16-bit address bus and an 8-bit data bus, for accessing
external memory. It also has an internal ROM (RSC-364
only) that can be enabled or disabled (partially or fully)
by pin inputs (signals -XMH, -XML).
Two bi-directional ports provide 16 general purpose I/O
pins to communicate with external devices. The RSC-
300/364 has a high frequency (14.32 MHz) oscillator as
well as a low frequency (32,768 Hz) oscillator suitable
for timekeeping applications. The processor clock can be
selected from either source, with a selectable divider
value. The device performs speech recognition when
running at 14.32 MHz. The RSC-300/364 also supports
programmable wait states to allow the use of slower
external devices. There are two programmable 8-bit
counters/timers, one derived from each oscillator.
An external microphone passes an audio signal to the
preamplifier and ADC (Analog-to-Digital Converter) to
convert the incoming speech signal into digital data. The
output audio signal of the RSC-300/364 is derived from a
DAC (Digital-to-Analog Converter) or PWM (Pulse
Width Modulator).
USING THE RSC-300/364
Creating applications using the RSC-300/364 requires
the development of electronic circuitry, software code,
and speech/music data files. Software code for the RSC-
300/364 can be developed by Sensory or by external
programmers using the RSC-300/364 Development Kit.
For more information about development tools and
services, please contact Sensory. A typical product will
require about $0.30 - $1.00 (in high volume) of
components in addition to the RSC-300/364.
The following sample circuit provides an example of how
the RSC-300/364 might be used in a consumer electronic
product.
Sample Application Circuit (Die)
[Figure: schematic. Recoverable labels: RSC364 (die pads 1-72),
AT27LV512A (TSOP) with A0-A15, D0-D7, CE and OE, U1, J1,
INPUT-MIC, LS1 (SPEAKER), VDD, GND.
Component values as labeled: R1 2.7K, R2 100, R3 2.7K (TBD),
R4 100K, R5 100K, R6 400, R7 47, C1 0.1uF, C2 220pF, C3 4700pF,
C4 100uF, C5 0.1uF, C6 100uF/16V, C7 0.022uF, C8 0.1uF, C9 68pF,
C10 0.1uF, C11 27pF, C12 27pF, C13 0.1uF, C14 0.01uF,
Y1 14.318MHz.]
RSC-300/364 INSTRUCTION SET
The instruction set for the RSC-300/364 has 54
instructions comprising 10 move, 7 rotate, 11 branch, 11
register arithmetic, 9 immediate arithmetic, and 6
miscellaneous instructions. All instructions are 3 bytes or
fewer, and no instruction requires more than 10 clock
cycles to execute.
TIMERS/COUNTERS
The two independent oscillators of the RSC-300/364
provide counts to two internal timers. Each of the two
timers consists of an 8-bit reload value register and an 8-
bit up-counter. The reload register is readable and
writeable by the processor.
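To make the reload-and-count arrangement concrete, the sketch below estimates the time between timer overflows. It rests on assumptions the text above does not confirm: that the up-counter restarts from the reload value after each overflow, that it advances once per oscillator cycle with no prescaler, and that overflow occurs on the step past 255. The function name is hypothetical.

```python
def timer_overflow_period_s(reload_value: int, count_hz: float) -> float:
    """Time between overflows of an 8-bit up-counter that restarts
    from reload_value (assumed behavior, see the note above)."""
    if not 0 <= reload_value <= 0xFF:
        raise ValueError("reload_value must fit in 8 bits")
    if count_hz <= 0:
        raise ValueError("count_hz must be positive")
    counts_per_overflow = 256 - reload_value
    return counts_per_overflow / count_hz


# Timer fed by the 32768 Hz oscillator, full 256-count range.
slow_period = timer_overflow_period_s(0, 32768)
# Same timer with a reload value of 128 (half the range).
half_period = timer_overflow_period_s(128, 32768)
print(f"{slow_period * 1e3:.4f} ms, {half_period * 1e3:.4f} ms")
```

Under these assumptions, a larger reload value shortens the period because fewer counts remain before overflow.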
GENERAL PURPOSE I/O
The RSC-300/364 has 16 general purpose I/O pins (P0.0-
P0.7, P1.0-P1.7). Each pin can be programmed as an
input with weak pull-up (~150 kΩ equivalent device),
an input with strong pull-up (~10 kΩ equivalent device),
an input without pull-up, or an output.
INTERRUPTS
The RSC-300/364 allows for five interrupt sources, as
selected by software. Each has its own mask bit and
request bit in the IMR and IRQ registers respectively.
The following events can generate interrupts:
• Positive edge on Port 0, bit 0
• Overflow of Timer 1
• Overflow of Timer 2
• Sensory reserved functions
• Completion of PWM sample period
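The relationship between the request bits and mask bits can be sketched as a bitwise AND. This is an illustrative model only: the bit positions below are hypothetical (the text above does not give the IMR or IRQ layout), and it assumes a set mask bit enables its source, which the actual device may define the other way around.

```python
from enum import IntFlag


class Irq(IntFlag):
    """The five interrupt sources listed above.
    Bit positions are hypothetical, chosen only for illustration."""
    PORT0_BIT0_EDGE = 0x01
    TIMER1_OVERFLOW = 0x02
    TIMER2_OVERFLOW = 0x04
    SENSORY_RESERVED = 0x08
    PWM_SAMPLE_DONE = 0x10


def pending_interrupts(irq: int, imr: int) -> Irq:
    """Sources that are both requested (IRQ) and enabled (IMR),
    assuming a set mask bit means 'enabled'."""
    return Irq(irq & imr)


# Both timers have requested service, but only Timer 1 is enabled.
requested = Irq.TIMER1_OVERFLOW | Irq.TIMER2_OVERFLOW
enabled = Irq.TIMER1_OVERFLOW | Irq.PORT0_BIT0_EDGE
active = pending_interrupts(requested, enabled)
print(active)
```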
EXTERNAL MEMORY
The RSC-300/364 includes an external memory interface
that allows connection with memory devices for speaker-
dependent speech recognition, audio record/playback,
and extended durations of speech and music synthesis.
Separate data and address buses allow use of standard
EPROMs, ROMs, SRAMs, and Flash memory with little
or no additional decoding. Support for separate read and
write signals for each external memory space further
simplifies interfacing. The RSC-300/364 includes 8 data
lines (D[7:0]) and 16 address lines (A[15:0]), and
associated control signals for memory interfacing.
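The address arithmetic implied by 16 address lines can be sketched as follows. The split into a low and a high 32 Kbyte half reflects the separate high-memory and low-memory enable signals; which address range each half covers is an assumption here (low half taken as 0x0000-0x7FFF), and the function name is hypothetical.

```python
ADDRESS_LINES = 16
SPACE_BYTES = 1 << ADDRESS_LINES      # bytes addressable by A[15:0]
HALF_BYTES = SPACE_BYTES // 2         # one 32 Kbyte half


def memory_half(address: int) -> str:
    """Return 'low' or 'high' for a 16-bit address.
    Assumes address bit 15 selects the half."""
    if not 0 <= address < SPACE_BYTES:
        raise ValueError("address must fit in 16 bits")
    return "high" if address & 0x8000 else "low"


print(SPACE_BYTES, HALF_BYTES)
print(memory_half(0x7FFF), memory_half(0x8000))
```

Because code and data each have their own read and write strobes, a design can present two such 64 Kbyte spaces on the same address and data buses.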
PREAMPLIFIER
The on-chip preamplifier circuit consists of three stages
with a maximum overall gain of about 500. The
amplifier includes a Vref input that is used to set the
amplifier center voltages and must be driven by a low
impedance voltage supplied by an external source. The
signal inputs of all stages have an 80 KΩ input
impedance to the Vref pad. In a typical design, AOFE1
would be directly coupled to AIFE2, and AOFE2 would
be capacitively coupled to AIN0 through an RC lowpass
filter to remove DC offset and digital noise. AOFE3
would be bypassed to Vref with a small (220pF) capacitor
for additional noise suppression.
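For a sense of scale, the maximum overall gain quoted above can be expressed in decibels and applied to a small input signal. This is a minimal sketch: the 1 mV input amplitude is a hypothetical example value, and the calculation ignores clipping, filtering, and how the gain is distributed across the three stages.

```python
import math

MAX_OVERALL_GAIN = 500.0   # maximum overall voltage gain quoted above


def gain_db(voltage_gain: float) -> float:
    """Voltage gain expressed in decibels."""
    if voltage_gain <= 0:
        raise ValueError("voltage_gain must be positive")
    return 20.0 * math.log10(voltage_gain)


def output_amplitude_v(input_v: float,
                       gain: float = MAX_OVERALL_GAIN) -> float:
    """Ideal output amplitude for a given input amplitude."""
    return input_v * gain


db = gain_db(MAX_OVERALL_GAIN)
out = output_amplitude_v(0.001)   # hypothetical 1 mV microphone signal
print(f"{db:.2f} dB, {out:.3f} V")
```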
OSCILLATORS
Two independent oscillators in the RSC-300/364 provide
a high-frequency clock and a 32kHz time-keeping clock.
Both oscillators work with an external crystal, a ceramic
resonator or LC. The oscillator characteristics are:
Oscillator #1: Pins XI1 and XO1, 14.32 MHz
Oscillator #2: Pins XI2 and XO2, 32768 Hz
ANALOG OUTPUT
The RSC-300/364 offers two separate options for analog
output. The DAC (Digital to Analog Converter) output
provides a general purpose 10-bit analog output that may
be used for speech output (with the inclusion of an audio
amplifier), or other purposes requiring an analog
waveform. For speech applications that require driving a
small speaker, the PWM (Pulse-Width Modulator) output
can be used instead of the DAC output. The PWM
output can directly drive a 32 ohm speaker.
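A 10-bit output has 1024 distinct levels. The sketch below shows one way a normalized audio sample could be quantized to a 10-bit code. It is an assumption, not the device's documented transfer function: the unsigned, mid-scale-centered (offset binary) coding and the function name are both hypothetical.

```python
DAC_BITS = 10
LEVELS = 1 << DAC_BITS    # 1024 output levels


def sample_to_code(sample: float) -> int:
    """Map a sample in [-1.0, 1.0] to an unsigned 10-bit code.
    Out-of-range samples are clipped; 0.0 maps to mid-scale."""
    clipped = max(-1.0, min(1.0, sample))
    code = int((clipped + 1.0) / 2.0 * LEVELS)
    return min(LEVELS - 1, code)


print(sample_to_code(-1.0), sample_to_code(0.0), sample_to_code(1.0))
```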
CLOCK
The RSC-300/364 uses a fully static core – the processor
can be stopped (by removing the clock source) and
restarted without causing a reset or losing contents of
internal registers. Static operation is guaranteed from DC
to 14.32 MHz.
Typically the processor clock runs from a 14.32 MHz
crystal with no divisor and one wait state. This creates
internal RAM cycles of 70 nsec duration and internal
ROM (RSC-364 only) or external cycles of 140 nsec
duration. Careful design may allow operation with
memories having access times as slow as 120 nsec.
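The 70 nsec and 140 nsec figures follow from the clock period. The sketch below reproduces them under two assumptions that are not stated outright in the text: that internal RAM accesses take one processor clock, and that each wait state adds one further processor clock to a ROM or external access. The function name is hypothetical.

```python
def cycle_time_ns(clock_hz: float, divisor: int = 1,
                  wait_states: int = 0) -> float:
    """Memory cycle time in nanoseconds, assuming one processor
    clock per access plus one clock per wait state."""
    if clock_hz <= 0 or divisor < 1 or wait_states < 0:
        raise ValueError("invalid clock configuration")
    processor_clock_hz = clock_hz / divisor
    return (1 + wait_states) / processor_clock_hz * 1e9


ram_ns = cycle_time_ns(14.32e6)                    # internal RAM
ext_ns = cycle_time_ns(14.32e6, wait_states=1)     # ROM or external
print(f"RAM: {ram_ns:.1f} ns, ROM/external: {ext_ns:.1f} ns")
```

Both results round to the values quoted for a 14.32 MHz crystal with no divisor.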
PACKAGING
The RSC-300/364 can be purchased as unpackaged die or
in a 64-pin TQFP package.
DIE BOND PAD AND QFP PIN DESCRIPTIONS
[Figure: RSC-364 top view of die (pads 1-72) and RSC-364 64-pin
QFP (pins 1-64).]
Name | Die Pad | QFP Pin | I/O | Description
A[15:0] | 20-27, 30-37 | 1-8, 11-18 | O | External Memory Address Bus
AIN0 | 5 | 52 | I | Analog In, low gain (range AGND to AVDD/2)
AIN1 | 4 | 51 | I | Analog In, hi gain (8X input amplitude of AIN0, same range)
AOFE1 | 72 | 49 | O | Output of 1st stage of preamplifier
AOFE2 | 6 | 53 | O | Output of 2nd stage of preamplifier
AOFE3 | 3 | 51 | O | Output of 3rd stage of preamplifier
AIFE1 | 71 | 48 | I | Input of 1st stage of preamplifier
AIFE2 | 1 | 49 | I | Input of 2nd stage of preamplifier
NC | 10, 11, 43, 44 | | - | Not Connected
PWM0 | 8 | 55 | O | Pulse Width Modulator Output0
DACOUT | 2 | 50 | O | Analog Output (unbuffered)
D[7:0] | 12-19 | 57-64 | I/O | External Data Bus
Vss | 7, 28, 62 | 9, 39, 54 | - | Vss
PDN | 67 | 44 | O | Power Down. Active high when powered down.
P1[7:0], P0[7:0] | 43-52, 53-60 | 22-29, 30-37 | I/O | General Purpose Port I/O. Pin P0.0 can act as an external interrupt input. All I/O pins can act as “wake up” inputs.
/RDC | 63 | 40 | O | External Code Read Strobe
/RDD | 65 | 42 | O | External Data Read Strobe
/RESET | 42 | 21 | I | Reset
/TE1 or PWM1 | 9 | 56 | I or O | Test Mode or Pulse Width Modulator Output1 (multiplexed)
VREF | 70 | 47 | - | Reference Voltage = Vdd/2 or Vdd/4. Depends on software.
VDD | 29, 61 | 10, 38 | - | Supply Voltage
/WRC | 64 | 41 | O | External Code Write Strobe
/WRD | 66 | 43 | O | External Data Write Strobe
/XMH | 68 | 45 | I | External Hi-memory enable (low active)
/XML | 69 | 46 | I | External Low-memory enable (low active)
XO1 | 40 | 19 | O | Oscillator 1 output (high frequency)
XI1 | 41 | 20 | I | Oscillator 1 input
XO2 | 38 | NA | O | Oscillator 2 output (32768 Hz)
XI2 | 39 | NA | I | Oscillator 2 input