RSC-300/364
Recognition • Synthesis • Control
Speech Recognition Microcontroller
GENERAL DESCRIPTION
The RSC-300/364, from the Interactive Speech™ family
of products, is an 8-bit microcontroller designed
specifically for speech applications in consumer
electronic products. The RSC-300/364 is a single chip
solution that combines the flexibility of a microcontroller
with advanced speech technology, including high-quality
speech recognition, speech and music synthesis, speaker
verification, and voice record and playback. Products
can use one or all of the RSC-300/364 features in a single
application.
The RSC-300/364 supports Sensory Speech™ 5.0, the
latest speech recognition technology from Sensory, which
includes a number of new techniques that significantly
improve recognition performance over previous versions.
Using a sophisticated neural network technology, on-chip
speech recognition algorithms reach an accuracy of
greater than 97% for speaker-independent recognition
and greater than 99% for speaker-dependent recognition.
In addition to the improved recognition performance, the
RSC-300/364 provides further on-chip integration of
features, including a preamplifier, multiplier, watchdog
timer, and 2.5 Kbytes of RAM. A complete system may
be built with few additional parts other than a battery,
speaker, microphone, and a few resistors and capacitors.
The RSC-300 is a ROM-less version designed for applications
that need more ROM space and consequently use off-chip
memory.
FEATURES
Full Range of Sensory Speech™ 5.0 Capabilities
• Speaker-independent speech recognition
• Speaker-dependent speech recognition
• High quality speech synthesis and sound effects
• Speaker verification
• Four-voice music synthesis
• Voice record & playback
Integrated Single-Chip Solution
• 4 MIPS 8-bit microcontroller
• On-chip A/D and D/A converters, and pre-amplifier
• 32 kHz clock for time keeping
• Internal 64 Kbytes ROM; 2.5 Kbytes RAM
• Internal 32 kHz watchdog timer
• External memory bus: 16-bit Address, 8-bit Data
• 24x24 Multiplier for rapid recognition processing
Low Power Requirements
• 2.4 – 5.25V operation for 2 or 3 battery applications
• ~10 mA operating current at 3V
• Power down mode; <5 µA standby current
RSC-300/364 Block Diagram
[Figure: block diagram of the RSC-300/364. Blocks shown: Microphone,
Preamp and Gain Control (AGC), Multiplexer, ADC, Microcontroller,
Multiplier, RAM, ROM (RSC-364 only), Digital Logic, Oscillator,
Watchdog Timer, DAC, AMP, Speaker, General Purpose I/O, External Memory.]
From the Interactive Speech™ Line of Products
RSC-300/364
DATA SHEET
RSC-300/364 OVERVIEW
The RSC-300/364 is a member of the Interactive
Speech™ line of products from Sensory. It features a
high-performance 8-bit microcontroller with on-chip
A/D, D/A, preamplifier, RAM and ROM (RSC-364
only). The RSC-300/364 is designed to bring a high
degree of integration and versatility into low-cost,
power-sensitive toy applications.
Various functional units have been integrated onto the
CPU core in order to reduce total system cost and
increase system reliability without degrading system
performance. The RSC-300/364 delivers 4 MIPS of
integer performance at 14.32 MHz, providing maximum
performance at minimum cost.
The CPU core embedded in the RSC-300/364 is an 8-bit,
variable-length-instruction microcontroller. The
instruction set is somewhat similar to the Zilog™ Z8, and
has a variety of addressing modes for mov instructions. The
RSC-300/364 processor avoids the limitations of
dedicated A, B, and DPTR registers by having
completely symmetrical sources and destinations for all
instructions. Of the 2.5 Kbytes of internal RAM, 2
Kbytes are organized as a Data Space, with 0.5 Kbytes used
for Register Space.
RECORD AND PLAYBACK
The RSC-300/364 can perform audio record and
playback at various compression levels depending on the
quantity and quality of playback desired. Data rates of
under 14,000 bits per second are achievable while
maintaining very high quality reproduction. The
RSC-300/364 also performs silence removal to improve sound
quality and reduce memory requirements.
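As a rough illustration of the record-and-playback data rate quoted above (under 14,000 bits per second), the sketch below estimates how much memory a recording needs. The 64 Kbyte memory size is an assumption chosen to match one full 16-bit external address space; the function names are hypothetical, and silence removal (which would reduce the figures further) is not modeled.

```python
# Storage estimate for voice record and playback.
# BITS_PER_SECOND is the upper bound quoted in the data sheet.
# EXTERNAL_MEMORY_BYTES is an assumption: one full 64 Kbyte address space.
BITS_PER_SECOND = 14_000
EXTERNAL_MEMORY_BYTES = 64 * 1024


def bytes_for_recording(seconds, bits_per_second=BITS_PER_SECOND):
    """Bytes needed to store `seconds` of audio at the given data rate."""
    return seconds * bits_per_second / 8


def seconds_in_memory(memory_bytes, bits_per_second=BITS_PER_SECOND):
    """Seconds of audio that fit in `memory_bytes` at the given data rate."""
    return memory_bytes * 8 / bits_per_second


print(bytes_for_recording(10))                              # 17500.0
print(round(seconds_in_memory(EXTERNAL_MEMORY_BYTES), 1))   # 37.4
```

At the quoted upper-bound rate, ten seconds of speech takes about 17.5 Kbytes, so a 64 Kbyte memory would hold a little over half a minute.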
SPEAKER VERIFICATION
The RSC-300/364 can also perform text-dependent
speaker verification. After a speaker trains the chip on a
specific word, the chip is able to identify whether that
word is spoken by the original speaker, thus providing
biometric security.
POWER
The typical operating current is 10 mA operating at
14.32 MHz and 3V. Lowering clock frequency reduces
power consumption, although speech recognition
requires a 14.32 MHz clock. Standby current is <5µA in
power down mode.
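The figures above can be combined into a simple duty-cycle estimate. The sketch below is only an approximation under stated assumptions: it covers the chip alone (not the speaker, external memory, or other parts), treats the <5 µA standby figure as exactly 5 µA, and uses a hypothetical battery capacity supplied by the caller.

```python
# Duty-cycle power estimate for the chip alone.
ACTIVE_CURRENT_MA = 10.0   # typical, at 14.32 MHz and 3 V (data sheet)
STANDBY_CURRENT_UA = 5.0   # upper bound in power down mode (data sheet)
SUPPLY_VOLTS = 3.0


def average_current_ma(active_fraction):
    """Average supply current when active for `active_fraction` of the time."""
    standby_ma = STANDBY_CURRENT_UA / 1000.0
    return active_fraction * ACTIVE_CURRENT_MA + (1.0 - active_fraction) * standby_ma


def battery_life_hours(capacity_mah, active_fraction):
    """Idealized runtime; capacity_mah is a hypothetical battery rating."""
    return capacity_mah / average_current_ma(active_fraction)


active_power_mw = ACTIVE_CURRENT_MA * SUPPLY_VOLTS
print(active_power_mw)   # 30.0
```

For example, `battery_life_hours(1000, 0.01)` gives the idealized runtime of a hypothetical 1000 mAh battery when the chip is active 1% of the time.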
SPEECH RECOGNITION
The RSC-300/364 uses a neural network to perform
speaker-independent or speaker-dependent speech
recognition. Speaker-dependent recognition requires
external memory to store speech recognition information
(e.g., SRAM, optional Serial EEPROM, Flash Memory).
Speaker-independent recognition requires on-chip or
off-chip ROM to store the words to be recognized. The
RSC-300/364 has several additional speech recognition
features as described below.
Continuous listening allows the chip to continuously
listen for a specific word. With this feature a product can
be used in a normal environment and only “activates”
when a specific word, preceded by quiet, is spoken.
SPEECH AND MUSIC SYNTHESIS
The RSC-300/364 provides high-quality speech synthesis
by using a hybrid of a time-domain compression scheme
that improves on conventional ADPCM and a customized
reuse of sounds. Speech synthesis requires on-chip or
off-chip ROM to store audio sounds for synthesis.
The RSC-300/364 provides high-quality, low-cost
four-voice music synthesis which allows multiple,
simultaneous instruments for harmonizing. The
RSC-300/364 uses a MIDI-like system to generate music.
RSC-300/364 Architecture Diagram
[Figure: architecture diagram of the RSC-300/364. Blocks and signals
shown: PRE-AMP (AIFE1, AOFE1, AIFE2, AOFE2, AOFE3), ADC (AIN0, AIN1),
DAC (DACOUT), PULSE WIDTH MODULATOR (BUFOUT/PWM), ANALOG CONTROL,
SPEECH PROCESSING UNIT, CPU, STACK SPACE (8 levels), REGISTER SPACE
(448 bytes), 2K SRAM, INTERNAL ROM (RSC-364; 32K x 8 HIGH enabled by
-XMH, 32K x 8 LOW enabled by -XML), EXTERNAL MEMORY INTERFACE (A[15:0],
D[7:0], -RDC, -WRC, -RDD, -WRD), PORT 0 (P0.0-P0.7), PORT 1 (P1.0-P1.7),
OSC1 (XI1, XO1), OSC2 (XI2, XO2), TIMER1, TIMER2, INTERRUPT LOGIC,
BREAK POINT REGISTER, TIMING AND CONTROL (-RESET, -TE1/PWM).]
RSC-300/364 ARCHITECTURE
The RSC-300/364 is a highly integrated device that
combines:
• 8-bit microcontroller
• On-chip ROM (64 Kbytes, RSC-364 only) and RAM
  (2.5 Kbytes), and the ability to address off-chip RAM
  or ROM
• A/D converter and D/A converter
• Input amplifier and pulse width modulator
The RSC-300/364 has an external memory interface,
with 16-bit address and 8-bit data buses, for accessing
external memory. It also has an internal ROM (RSC-364
only) that can be enabled or disabled (partially or fully)
by pin inputs (signals -XMH, -XML).
Two bi-directional ports provide 16 general purpose I/O
pins to communicate with external devices. The
RSC-300/364 has a high frequency (14.32 MHz) oscillator as
well as a low frequency (32,768 Hz) oscillator suitable
for timekeeping applications. The processor clock can be
selected from either source, with a selectable divider
value. The device performs speech recognition when
running at 14.32 MHz. The RSC-300/364 also supports
programmable wait states to allow the use of slower
external devices. There are two programmable 8-bit
counters/timers, one derived from each oscillator.
An external microphone passes an audio signal to the
preamplifier and ADC (Analog-to-Digital Converter) to
convert the incoming speech signal into digital data. The
output audio signal of the RSC-300/364 is derived from a
DAC (Digital-to-Analog Converter) or PWM (Pulse
Width Modulator).
USING THE RSC-300/364
Creating applications using the RSC-300/364 requires
the development of electronic circuitry, software code,
and speech/music data files. Software code for the
RSC-300/364 can be developed by Sensory or by external
programmers using the RSC-300/364 Development Kit.
For more information about development tools and
services, please contact Sensory. A typical product will
require about $0.30 - $1.00 (in high volume) of
additional components, in addition to the RSC-300/364.
The following sample circuit provides an example of how
the RSC-300/364 might be used in a consumer electronic
product.
Sample Application Circuit (Die)
[Figure: schematic of the RSC364 die (U1) connected to an AT27LV512A
(TSOP) external memory over A[15:0] and D[7:0], with a microphone input
(J1, INPUT-MIC), a speaker (LS1), and a 14.318 MHz crystal (Y1).]
Component values shown in the schematic:
R1 2.7K   R2 100   R3 2.7K (TBD)   R4 100K   R5 100K   R6 400   R7 47
C1 0.1uF   C2 220pF   C3 4700pF   C4 100uF   C5 0.1uF   C6 100uF/16V
C7 0.022uF   C8 0.1uF   C9 68pF   C10 0.1uF   C11 27pF   C12 27pF
C13 0.1uF   C14 0.01uF
RSC-300/364 INSTRUCTION SET
The instruction set for the RSC-300/364 has 54
instructions comprising 10 move, 7 rotate, 11 branch, 11
register arithmetic, 9 immediate arithmetic, and 6
miscellaneous instructions. All instructions are 3 bytes or
fewer, and no instruction requires more than 10 clock
cycles to execute.
GENERAL PURPOSE I/O
The RSC-300/364 has 16 general purpose I/O pins
(P0.0-P0.7, P1.0-P1.7). Each pin can be programmed as an
input with weak pull-up (~150kΩ equivalent device),
an input with strong pull-up (~10kΩ equivalent device),
an input without pull-up, or as an output.
EXTERNAL MEMORY
The RSC-300/364 includes an external memory interface
that allows connection with memory devices for
speaker-dependent speech recognition, audio record/playback,
and extended durations of speech and music synthesis.
Separate data and address buses allow use of standard
EPROMs, ROMs, SRAMs, and Flash memory with little
or no additional decoding. Support for separate read and
write signals for each external memory space further
simplifies interfacing. The RSC-300/364 includes 8 data
lines (D[7:0]) and 16 address lines (A[15:0]), and
associated control signals for memory interfacing.
OSCILLATORS
Two independent oscillators in the RSC-300/364 provide
a high-frequency clock and a 32kHz time-keeping clock.
Both oscillators work with an external crystal, a ceramic
resonator, or an LC circuit. The oscillator characteristics are:
Oscillator #1: pins XI1 and XO1, 14.32 MHz
Oscillator #2: pins XI2 and XO2, 32768 Hz
TIMERS/COUNTERS
The two independent oscillators of the RSC-300/364
provide counts to two internal timers. Each of the two
timers consists of an 8-bit reload value register and an
8-bit up-counter. The reload register is readable and
writeable by the processor.
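The relationship between a reload value and the timer overflow period can be sketched as follows. This is a model under assumptions the data sheet does not spell out: that the up-counter starts from the reload value, overflows when it rolls past 0xFF, and is reloaded at that point, and that the count rate passed in already accounts for any divider between the oscillator and the timer. The function names are hypothetical.

```python
# Model of an 8-bit up-counter with a reload register.
# Assumption: the counter runs from the reload value up to 0xFF and
# overflows on the next count, at which point it is reloaded.

def counts_per_overflow(reload_value):
    """Number of count events between two overflows."""
    if not 0 <= reload_value <= 0xFF:
        raise ValueError("reload value must fit in 8 bits")
    return 0x100 - reload_value


def overflow_period_seconds(reload_value, count_rate_hz):
    """Time between overflows for a given count rate (assumed, in Hz)."""
    return counts_per_overflow(reload_value) / count_rate_hz


# With a 32768 Hz count rate and a reload value of 0, the model
# overflows every 256 counts.
print(overflow_period_seconds(0, 32768))   # 0.0078125
```

Under this model a larger reload value shortens the period; a reload value of 0xFF overflows on every count.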
INTERRUPTS
The RSC-300/364 allows for five interrupt sources, as
selected by software. Each has its own mask bit and
request bit in the IMR and IRQ registers respectively.
The following events can generate interrupts:
• Positive edge on Port 0, bit 0
• Overflow of Timer 1
• Overflow of Timer 2
• Sensory reserved functions
• Completion of PWM sample period
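The mask and request registers described above are typically combined with a bitwise AND to find which sources should be serviced. The sketch below illustrates that idea only: the bit positions are hypothetical (the data sheet does not list them here), and it assumes a mask bit of 1 enables the corresponding source.

```python
from enum import IntFlag


class Irq(IntFlag):
    """The five interrupt sources. Bit positions are hypothetical."""
    PORT0_BIT0_EDGE = 0x01
    TIMER1_OVERFLOW = 0x02
    TIMER2_OVERFLOW = 0x04
    SENSORY_RESERVED = 0x08
    PWM_PERIOD_DONE = 0x10


def pending_sources(irq, imr):
    """Sources that are both requested (IRQ) and enabled (IMR).

    Assumes a 1 in the mask register enables the source.
    """
    return Irq(irq & imr)


# Both timers have requested service, but only Timer 1 is enabled.
requests = Irq.TIMER1_OVERFLOW | Irq.TIMER2_OVERFLOW
enabled = Irq.TIMER1_OVERFLOW | Irq.PWM_PERIOD_DONE
assert pending_sources(requests, enabled) == Irq.TIMER1_OVERFLOW
```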
PREAMPLIFIER
The on-chip preamplifier circuit consists of three stages
with a maximum overall gain of about 500. The
amplifier includes a Vref input that is used to set the
amplifier center voltages and must be driven by a low
impedance voltage supplied by an external source. The
signal inputs of all stages have an 80 KΩ input
impedance to the Vref pad. In a typical design, AOFE1
would be directly coupled to AIFE2, and AOFE2 would
be capacitively coupled to AIN0 through an RC lowpass
filter to remove DC offset and digital noise. AOFE3
would be bypassed to Vref with a small (220pF) capacitor
for additional noise suppression.
ANALOG OUTPUT
The RSC-300/364 offers two separate options for analog
output. The DAC (Digital to Analog Converter) output
provides a general purpose 10-bit analog output that may
be used for speech output (with the inclusion of an audio
amplifier), or other purposes requiring an analog
waveform. For speech applications that require driving a
small speaker, the PWM (Pulse-Width Modulator) output
can be used instead of the DAC output. The PWM
output can directly drive a 32 ohm speaker.
CLOCK
The RSC-300/364 uses a fully static core: the processor
can be stopped (by removing the clock source) and
restarted without causing a reset or losing contents of
internal registers. Static operation is guaranteed from DC
to 14.32 MHz.
Typically the processor clock runs from a 14.32 MHz
crystal with no divisor and one wait state. This creates
internal RAM cycles of 70 nsec duration and internal
ROM (RSC-364 only) or external cycles of 140 nsec
duration. Careful design may allow operation with
memories having access times as slow as 120 nsec.
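The 70 nsec and 140 nsec cycle times given in the CLOCK section follow from the oscillator frequency. The sketch below reproduces them, assuming that each wait state adds one processor clock period to a ROM or external memory cycle; that assumption is consistent with the quoted figures but is not stated explicitly, and the function name is hypothetical.

```python
# Cycle-time arithmetic for the processor clock.
OSC_HZ = 14.32e6  # high-frequency oscillator (data sheet)


def cycle_time_ns(osc_hz=OSC_HZ, divisor=1, wait_states=0):
    """Memory cycle time in nanoseconds.

    Assumes one processor clock period per cycle, plus one additional
    clock period per wait state.
    """
    clock_period_ns = 1e9 * divisor / osc_hz
    return clock_period_ns * (1 + wait_states)


print(round(cycle_time_ns()))                # 70  (internal RAM cycle)
print(round(cycle_time_ns(wait_states=1)))   # 140 (ROM or external cycle)
print(round(cycle_time_ns(divisor=2)))       # 140 (osc/2, no wait states)
```

The last line shows that halving the processor clock with no wait states yields the same external cycle length as the full-speed clock with one wait state.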
PACKAGING
The RSC-300/364 can be purchased as unpackaged die or
a 64 pin TQFP package.
DIE BOND PAD AND QFP PIN DESCRIPTIONS
[Figure: RSC-364 top view of die (bond pads 1-72) and RSC-364 64-pin QFP
outline (pins 1-64).]

Name              Die Pad         QFP Pin         I/O     Description
A[15:0]           20-27, 30-37    1-8, 11-18      O       External Memory Address Bus
AIN0              5               52              I       Analog In, low gain (range AGND to AVDD/2)
AIN1              4               51              I       Analog In, hi gain (8X input amplitude of AIN0, same range)
AOFE1             72              49              O       Output of 1st stage of preamplifier
AOFE2             6               53              O       Output of 2nd stage of preamplifier
AOFE3             3               51              O       Output of 3rd stage of preamplifier
AIFE1             71              48              I       Input of 1st stage of preamplifier
AIFE2             1               49              I       Input of 2nd stage of preamplifier
NC                10, 11, 43, 44  -               -       Not Connected
PWM0              8               55              O       Pulse Width Modulator Output0
DACOUT            2               50              O       Analog Output (unbuffered)
D[7:0]            12-19           57-64           I/O     External Data Bus
Vss               7, 28, 62       9, 39, 54       -       Vss
PDN               67              44              O       Power Down. Active high when powered down.
P1[7:0], P0[7:0]  43-52, 53-60    22-29, 30-37    I/O     General Purpose Port I/O. Pin P0.0 can act as an external
                                                          interrupt input. All I/O pins can act as “wake up” inputs.
/RDC              63              40              O       External Code Read Strobe
/RDD              65              42              O       External Data Read Strobe
/RESET            42              21              I       Reset
/TE1 or PWM1      9               56              I or O  Test Mode or Pulse Width Modulator Output1 (multiplexed)
VREF              70              47              -       Reference Voltage = Vdd/2 or Vdd/4. Depends on software
VDD               29, 61          10, 38          -       Supply Voltage
/WRC              64              41              O       External Code Write Strobe
/WRD              66              43              O       External Data Write Strobe
/XMH              68              45              I       External Hi-memory enable (low active)
/XML              69              46              I       External Low-memory enable (low active)
XO1               40              19              O       Oscillator 1 output (high frequency)
XI1               41              20              I       Oscillator 1 input
XO2               38              NA              O       Oscillator 2 output (32768 Hz)
XI2               39              NA              I       Oscillator 2 input
DC CHARACTERISTICS
(TO = 0°C to +70°C, VDD = 2.4V – 5.25V )
SYMBOL
VIL
PARAMETER
Input Low Voltage
MIN
-0.1
TYP
MAX
0.75
V
VIH(Vcc<3.6)
Input High Voltage
0.8*Vdd
Vdd+0.3
V
VIH(Vcc>3.6)
Input High Voltage
3.0
Vdd+0.3
V
VOL
Output Low Voltage
0.1*Vdd
V
IOL= 2 mA
VOH
Output High Voltage (I/O Pins)
V
IOL= -2 mA
IIL
Logical 0 Input Current
<1
10
uA
Vss<Vpin<Vdd
IDD1
Supply Current, Active
10
20
mA
Hi-Z Outputs
IDD3
Supply Current, Powerdown
1
10
uA
Hi-Z Outputs
Rpu
Pull-up resistance P0.0-P1.7
4.5,200,
Hi-Z
kΩ
Selected with software
200
kΩ
Fixed
0.3
0.8*Vdd
I/O Pins
5,80,
Hi-Z
/XML,/XMH
UNITS
0.9*Vdd
TEST CONDITIONS
A.C. CHARACTERISTICS (EXTERNAL MEMORY ACCESSES)
(TO = 0°C to +70°C, VDD = 5V; load capacitance for outputs = 80 pF; Osc=14.32 MHz)
SYMBOL
PARAMETER
CPU=osc/1, 1 WS
MIN MAX
CPU=osc/2, 0WS
MIN
MAX
UNITS
1/TCL1
Processor Clock frequency
14.32
7.16
MHz
TRLRH
-RDC (-RDD) Pulse Width
140
140
ns
TRLAV
-RDC (-RDD) Low to Address valid
5
5
ns
TALRAX
Address hold after -RDC (-RDD)
0
0
ns
TRAVDV
Address valid to Valid Data In
135
135
ns
TRHDX
Data Hold after -RDC (-RDD)
TWLWH
-WRC (-WRD) Pulse Width
TAVWL
Address Valid to -WRC (-WRD)
35
70
ns
TALWAX
Address Hold after -WRC (-WRD)
35
70
ns
TWDVAV
Write Data Valid to Address Valid
TWHQX
Data Hold after -WRC (-WRD)
0
0
140
140
5
35
ns
5
70
ns
ns
ns
TIMING DIAGRAMS
Note that the -RDC signal does not necessarily pulse for every read from code space, but may stay low for multiple cycles.
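[Figure: External Read Timing, showing -RDD (-RDC), ADDRESS and DATA
with the intervals TRLRH, TRLAV, TALRAX, TRAVDV and TRHDX.]
[Figure: External Write Timing, showing -WRC (-WRD), ADDRESS and DATA
with the intervals TWLWH, TAVWL, TALWAX, TWDVAV and TWHQX.]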
ABSOLUTE MAXIMUM RATINGS
Any pin to GND               -0.1V to +6.5V
Operating temperature (TO)   0°C to +70°C
Soldering temperature        260°C for 10 sec
Power dissipation            1W
Operating Conditions         0°C to +70°C; VDD=2.4 - 5.25V; VSS=0V
WARNING: Stressing the RSC-300/364 beyond the
“Absolute Maximum Ratings” may cause permanent
damage. These are stress ratings only. Operation
beyond the “Operating Conditions” is not
recommended and extended exposure beyond the
“Operating Conditions” may affect device
reliability.
ORDERING INFORMATION
Part            Marketing #   Description
RSC-364 DWF     C364XS1P      Tested die in wafer form
RSC-364 Die     C364XD1B      Tested, singulated die in waffle pack
RSC-364 QFP     C364XT1T      64 pin 10 x 10 x 1.4 mm TQFP
RSC-300 DWF     C300XS1P      Tested die in wafer form
RSC-300 Die     C300XD1B      Tested, singulated die in waffle pack
RSC-300 QFP     C300XT1T      64 pin 10 x 10 x 1.4 mm TQFP
THE INTERACTIVE SPEECH™ PRODUCT LINE
The Interactive Speech line of ICs and software was developed to “bring life to products” through advanced speech
recognition and audio technology. The Interactive Speech Product Line was designed for consumer telephony products and
cost-sensitive consumer electronic applications such as home electronics, personal security, and personal communication.
The product line includes award-winning RSC-series general purpose microcontrollers plus a line of easy-to-implement
chips which can be pin-configured or controlled by an external host microcontroller. Sensory’s software technologies run
on a variety of microcontrollers and DSPs.
RSC Microcontrollers
The RSC family of microcontrollers (RSC-164, RSC-200/264T, RSC-300/364) are low-cost 8-bit
microcontrollers designed for use in consumer electronics. All members of the RSC family are fully
integrated and include a speech processor, A/D, D/A, ROM (except RSC-200/300), and RAM circuitry on
chip. The RSC-200/264T and RSC-300/364 also include on chip pre-amplification. The RSC family of
microcontrollers can perform a full range of speech/audio functions including speech recognition, speaker
verification, speech and music synthesis, and voice record/playback.
Voice Direct™ TSSP
The Voice Direct TSSP provides cost-sensitive products with speaker-dependent speech recognition and speech. This
easy-to-use, pin-configurable chip requires no custom programming and can recognize up to 60 trained words in slave
mode, and 15 words in stand-alone mode. The Voice Direct TSSP is ideal for speaker-dependent command and control of
household consumer products, and is part of a complete product line that includes the IC, module, Development Kit, and Voice
Direct™ Speech Recognition Kit. For product developers with limited time and unlimited imagination!
Voice Dialer™ ASSP
The Voice Dialer ASSP delivers speech recognition technology that allows users to dial phone numbers by saying the
name of the person they wish to call. Voice dialing and phone directory management through speech recognition can be
easily integrated into existing products. This IC is designed for use as a slave chip controlled by an external host processor.
Voice Activation™ Software
Sensory’s Voice Activation™ software provides advanced speech technology on a variety of microcontroller and DSP
platforms. A complete speech API and flexible design allow manufacturers to easily integrate speech functionality into
telephony products.
IMPORTANT NOTICES
Sensory reserves the right to make changes to or to discontinue any product or service identified in this publication at any time without notice in order to improve design and supply the
best possible product. Sensory does not assume responsibility for use of any circuitry other than circuitry entirely embodied in a Sensory product. Information contained herein is provided
gratuitously and without liability to any user. Reasonable efforts have been made to verify the accuracy of this information but no guarantee whatsoever is given as to the accuracy or as to
its applicability to particular uses.
Applications described in this data sheet are for illustrative purposes only, and Sensory makes no warranties or representations that the RSC series of products will be suitable for such
applications. In every instance, it must be the responsibility of the user to determine the suitability of the products for each application. Sensory products are not authorized for use as
critical components in life support devices or systems.
Sensory conveys no license or title, either expressed or implied, under any patent, copyright, or mask work right to the RSC series of products, and Sensory makes no warranties or
representations that the RSC series of products are free from patent, copyright, or mask work right infringement, unless otherwise specified.
Nothing contained herein shall be construed as a recommendation to use any product in violation of existing patents or other rights of third parties. The sale of any Sensory product is
subject to all Sensory Terms and Conditions of Sales and Sales Policies.
521 East Weddell Drive
Sunnyvale, CA 94089
© 1999 SENSORY, INC.
ALL RIGHTS RESERVED
P/N 80-0111-6
Sensory is registered by the U.S. Patent and Trademark Office.
All other trademarks or registered trademarks are the property
of their respective owners.
TEL: (408) 744-9000
FAX: (408) 744-1299