RSC-164
Recognition • Synthesis • Control

General Purpose Microcontroller Featuring Speech Recognition, Speech & Music Synthesis, Speaker Verification and Audio Record/Playback

GENERAL DESCRIPTION

The RSC-164, from the Interactive Speech™ family of products, is a low-cost microcontroller designed for use in consumer electronics. The RSC-164 combines an 8-bit microcontroller with high-quality speaker-independent and speaker-dependent speech recognition, speech synthesis, speaker verification, four-voice music synthesis, and voice record and playback. Products can use one or all of the RSC-164 features in a single application.

The RSC-164 employs a sophisticated neural network that learns to classify sound data. On-chip speech recognition algorithms reach an accuracy of greater than 96% for speaker-independent recognition and greater than 99% for speaker-dependent recognition. Sensory’s neural network approach (patent pending) eliminates the need for expensive signal processing or extensive RAM storage.

The highly integrated nature of the chip reduces external parts count. A complete system may be built with few additional parts other than a battery, speaker, microphone, and audio input support circuitry. Low power requirements make the RSC-164 an ideal solution for battery-powered and hand-held devices.

FEATURES

Full Range of Speech Capabilities
• Speaker-independent speech recognition
• Speaker-dependent speech recognition
• High quality speech synthesis and sound effects
• Speaker verification
• Four-voice music synthesis
• Voice record & playback

Integrated Single-Chip Solution
• 4 MIPS 8-bit microcontroller
• On-chip A/D and D/A converters, digital filtering
• 32kHz clock for time keeping
• Internal 64 kbytes ROM; 384 bytes RAM
• 16 general purpose I/O lines
• External memory bus: 16-bit Address, 8-bit Data
• On-chip output amplifier for direct speaker drive

Low Power Requirements
• 3.5 - 5.0V supply
• ~10mA operating

[Figure: RSC-164 Block Diagram. Labeled blocks: Microphone, Preamp and gain control, AGC, Multiplexer, ADC, Microcontroller, Digital Logic, RAM, ROM, Oscillator, DAC, AMP, Speaker, General Purpose I/O.]

From the Interactive Speech™ Line of Products

RSC-164 DATA SHEET

RSC-164 OVERVIEW

The RSC-164 is a member of the Interactive Speech™ line of products from Sensory. It features a high-performance 8-bit microcontroller with on-chip A/D, D/A, RAM and ROM. The RSC-164 is designed to bring a high degree of integration and versatility into low-cost, power-sensitive consumer applications. Various functional units have been integrated onto the CPU core in order to reduce total system cost and increase system reliability without degrading system performance.

The RSC-164 delivers 4 MIPS of integer performance at 14.32 MHz, providing maximum performance at minimum cost. The CPU core embedded in the RSC-164 is an 8-bit, variable-length-instruction microcontroller. The instruction set is loosely based on Intel’s 8051 and includes mov instructions with a variety of addressing modes. The RSC-164 processor avoids the limitations of dedicated A, B, and DPTR registers by having completely symmetrical sources and destinations for all instructions. The 384 bytes of internal RAM are organized as a Register Space.

SPEECH RECOGNITION

The RSC-164 uses a neural network to perform speaker-independent or speaker-dependent speech recognition.
Speaker-dependent recognition requires external memory to store speech recognition information (e.g., SRAM, Flash Memory). Speaker-independent recognition requires on-chip or off-chip ROM to store the words to be recognized. The RSC-164 has several additional speech recognition features as described below.

Continuous listening allows the chip to continuously listen for a specific word. With this feature a product can be used in a normal environment and “activates” only when a specific word, preceded by quiet, is spoken.

Consecutive entry allows the chip to handle several voice inputs in succession as long as each input is surrounded by one-half second of quiet.

SPEECH AND MUSIC SYNTHESIS

The RSC-164 provides high-quality speech synthesis by using a hybrid of a time-domain compression scheme that improves on conventional ADPCM and a customized reuse of sounds. Speech synthesis requires on-chip or off-chip ROM to store audio sounds for synthesis.

The RSC-164 provides high-quality, low-cost four-voice music synthesis which allows multiple, simultaneous instruments for harmonizing. Music synthesis has low ROM requirements: a 2-3 minute song requires under 5 kbytes of incremental memory. The RSC-164 uses a MIDI-like system to generate music.

RECORD AND PLAYBACK

The RSC-164 can perform audio record and playback at various compression levels depending on the quantity and quality of playback desired. Data rates of under 14,000 bits per second are achievable while maintaining very high quality reproduction. The RSC-164 also performs silence removal to improve sound quality and reduce memory requirements.

SPEAKER VERIFICATION

The RSC-164 can also perform text-dependent speaker verification. After a speaker trains the chip on a specific word, the chip is able to identify whether that word is spoken by the original speaker, thus providing biometric security.
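As a rough illustration of the record-and-playback data rate quoted above, the following Python sketch estimates how much audio fits in a given amount of external memory. It is only a back-of-the-envelope aid: it assumes the full 14,000 bits-per-second upper bound, ignores silence removal (which would only reduce the storage needed), and the function names and the 64-kbyte example size are illustrative rather than anything specified by Sensory.

```python
import math

# Upper bound on the record/playback data rate quoted in the text.
RECORD_BITS_PER_SECOND = 14_000


def record_seconds(memory_bytes: int,
                   bits_per_second: int = RECORD_BITS_PER_SECOND) -> float:
    """Seconds of recorded audio that fit in memory_bytes at the given rate."""
    if memory_bytes < 0 or bits_per_second <= 0:
        raise ValueError("memory size must be >= 0 and data rate must be > 0")
    return memory_bytes * 8 / bits_per_second


def record_bytes(seconds: float,
                 bits_per_second: int = RECORD_BITS_PER_SECOND) -> int:
    """Bytes needed to hold `seconds` of audio, rounded up to a whole byte."""
    if seconds < 0 or bits_per_second <= 0:
        raise ValueError("duration must be >= 0 and data rate must be > 0")
    return math.ceil(seconds * bits_per_second / 8)


if __name__ == "__main__":
    # Assumption: one full 64-kbyte external data space is devoted to audio.
    full_space = 64 * 1024
    print(f"{record_seconds(full_space):.1f} s of audio in {full_space} bytes")
    print(f"{record_bytes(10)} bytes for 10 s of audio")
```

Under these assumptions a 64-kbyte memory holds roughly 37 seconds of audio, and ten seconds of audio needs 17,500 bytes. Lower data rates or effective silence removal would stretch these figures.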
POWER

The typical operating current is 10 mA at 14.32 MHz. Lowering the clock frequency reduces power consumption, although speech recognition requires a 14.32 MHz clock.

[Figure: RSC-164 Architecture Diagram. Labeled blocks: CPU, Internal ROM (32K x 8 high, 32K x 8 low), Register Space (384 bytes), Stack Space (8 bytes), External Memory Interface, ADC with ADC MUX, DAC, Analog Control, Pulse Width Modulator, Timer1, Timer2, OSC1, OSC2, Interrupt Logic, Timing and Control, Break Point Register, Port0, Port1. Labeled signals: A[15:0], D[7:0], -RDC, -WRC, -RDD, -WRD, AIN0, AIN1, SH, DACOUT, BUFOUT/PWM, -TE1/PWM, XI1, XO1, XI2, XO2, -XMH, -XML, -RESET, P0.0-P0.7, P1.0-P1.7.]

RSC-164 ARCHITECTURE

The RSC-164 is a highly integrated device that combines:
• 8-bit microcontroller
• On-chip ROM (64 kbytes) and RAM (384 bytes), and the ability to address off-chip RAM or ROM
• A/D converter and D/A converter

The RSC-164 has an external memory interface, with a 16-bit address bus and an 8-bit data bus, for accessing external memory. It also has an internal ROM that can be enabled or disabled (partially or fully) by pin inputs (signals -XMH, -XML). Two bi-directional ports provide 16 general purpose I/O pins to communicate with external devices.

The RSC-164 has a high frequency (14.32 MHz) oscillator as well as a low frequency (32,768 Hz) oscillator suitable for timekeeping applications. The processor clock can be selected from either source, with a selectable divider value. The device performs speech recognition when running at 14.32 MHz. The RSC-164 also supports programmable wait states to allow the use of slower external devices. There are two programmable 8-bit counters/timers, one derived from each oscillator.

A microphone with an external preamp converts sound into an audio signal that is fed to the RSC-164. The gain of the external preamp may be controlled by the RSC-164 by using two of the I/O lines.
The RSC-164 uses an ADC (Analog-to-Digital Converter) to convert the incoming analog speech signal into digital data. The output audio signal of the RSC-164 is derived from a DAC (Digital-to-Analog Converter) or PWM (Pulse Width Modulator).

USING THE RSC-164

Creating applications using the RSC-164 requires the development of electronic circuitry, software code, and speech/music data files. Software code for the RSC-164 can be developed by Sensory or by external programmers using the RSC Development Kit. For more information about development tools and services, please contact Sensory.

A typical product will require about $0.80 to $1.50 (in high volume) of components in addition to the RSC-164. The following sample circuit provides an example of how the RSC-164 might be used.

[Figure: RSC-164 Sample Application Circuit. The schematic shows an electret microphone feeding an input preamp built around an LM324 (U1A-U1D) with signals IGAIN0 and IGAIN1, the Sensory RSC-164 (U3) with a 470 pF capacitor on SH, a 27C512 EPROM on the external address and data buses, a 24LC65 serial EEPROM, a PNP transistor, a speaker (LS1), a power supply section with power switch and decoupling capacitors, a reset circuit, and a 14.3 MHz ceramic resonator or crystal oscillator (Y1) with two 27 pF capacitors.]

Notes on the sample circuit:
• PREAMP GND and AGND are to be connected at the power supply near the RSC-164 AGND (analog) input.
• To use the external EPROM (U4), use R26. To use the internal ROM in the RSC-164, use R25.
• Use an EPROM if a mask ROM is not used.
• The transistor should have low (0.1V) Vce(sat) at a forced beta of 100 at 100 mA Ic. V(br)ebo should be above 5V. NEC's 9012H is an acceptable example.

RSC-164 INSTRUCTION SET

The instruction set for the RSC-164 has 52 instructions comprising 8 move, 7 rotate, 11 branch, 11 register arithmetic, 9 immediate arithmetic, and 6 miscellaneous instructions. All instructions are 3 bytes or fewer, and no instruction requires more than 8 clock cycles to execute.

GENERAL PURPOSE I/O

The RSC-164 has 16 general purpose I/O pins (P0.0-P0.7, P1.0-P1.7). Each pin can be programmed as an input with weak pull-up (~200kΩ equivalent device), an input with strong pull-up (~10kΩ equivalent device), an input without pull-up, or an output. This is accomplished by having 32 bits of configuration registers for the I/O pins (Port Control Register A and Port Control Register B for ports 0 and 1).

EXTERNAL MEMORY

The RSC-164 includes an external memory interface that allows connection with memory devices for speaker-dependent speech recognition, audio record/playback, extended durations of speech and music synthesis, and enhanced product functionality.

Separate data and address buses allow use of standard EPROMs, ROMs, SRAMs, and flash memory with little or no additional decoding. Provision of separate read and write signals for each external memory space further simplifies interfacing. The RSC-164 includes 8 data lines (D[7:0]) and 16 address lines (A[15:0]), along with associated control signals for interfacing to external memory. Using flash memory and EEPROM will require custom code development. The RSC-164 can connect serially through two I/O lines to a serial EEPROM for applications with low data storage requirements.

OSCILLATORS

Two independent oscillators in the RSC-164 provide a high-frequency clock and a 32kHz time-keeping clock. The oscillator characteristics are as follows:

Oscillator #1    Pins XI1, XO1    14.32 MHz (3.5V-5.0V)
Oscillator #2    Pins XI2, XO2    32768 Hz (3.5V-5.0V)

Oscillator #1 works with an external crystal, a ceramic resonator or LC. Use of Oscillator #2 requires a crystal for precision timing.

PACKAGING

The RSC-164 can be purchased as bare unpackaged die or packaged in 68-pin PLCC, 64-pin QFP, or 68-pin COB packages.

CLOCK

The RSC-164 uses a fully static core: the processor can be stopped (by removing the clock source) and restarted without causing a reset or losing the contents of internal registers. Static operation is guaranteed from DC to 14.32 MHz.

Typically the processor clock runs from a 14.32 MHz crystal with no divisor and one wait state. This creates internal RAM cycles of 70 nsec duration and internal ROM or external cycles of 140 nsec duration. Careful design of external decoding logic and close analysis of gate delays may allow operation with memories having access times as slow as 120 nsec.
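The cycle times quoted in the CLOCK section follow from simple arithmetic on the oscillator frequency. The Python sketch below reproduces them under one stated assumption that the data sheet does not spell out: that each wait state lengthens a memory cycle by exactly one processor clock period. That model matches the 70 nsec and 140 nsec figures above, but it should be treated as an illustration, and the function names are hypothetical.

```python
OSC1_HZ = 14.32e6   # Oscillator #1 frequency given in the text
OSC2_HZ = 32768.0   # Oscillator #2 frequency given in the text


def clock_period_ns(osc_hz: float, divider: int = 1) -> float:
    """Processor clock period in nanoseconds for an oscillator and divider."""
    if osc_hz <= 0 or divider < 1:
        raise ValueError("oscillator frequency must be > 0 and divider >= 1")
    return divider * 1e9 / osc_hz


def memory_cycle_ns(osc_hz: float, divider: int = 1,
                    wait_states: int = 0) -> float:
    """Memory cycle length in nanoseconds.

    Assumption (not stated in the data sheet): each wait state adds one
    full processor clock period to the cycle.
    """
    if wait_states < 0:
        raise ValueError("wait_states must be >= 0")
    return clock_period_ns(osc_hz, divider) * (1 + wait_states)


if __name__ == "__main__":
    # Internal RAM cycle: no divisor, no wait state.
    print(f"RAM cycle:          {memory_cycle_ns(OSC1_HZ):.1f} ns")
    # Internal ROM or external cycle: no divisor, one wait state.
    print(f"ROM/external cycle: {memory_cycle_ns(OSC1_HZ, wait_states=1):.1f} ns")
    # Divide-by-2 clock with zero wait states gives the same cycle length.
    print(f"osc/2, 0 WS cycle:  {memory_cycle_ns(OSC1_HZ, divider=2):.1f} ns")
```

Both the osc/1 with one wait state and the osc/2 with zero wait states settings come out at about 140 nsec, which is why a memory needs similar access-time margin in either configuration.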
TIMERS/COUNTERS

The two independent oscillators of the RSC-164 provide counts to two internal timers. Each of the two timers consists of an 8-bit reload value register and an 8-bit up-counter. The reload register is readable and writeable by the processor.

INTERRUPTS

The RSC-164 allows for five interrupt sources, as selected by software. Each has its own mask bit and request bit in the IMR and IRQ registers respectively. The following events can generate interrupts:
• Positive edge on Port 0, bit 0
• Overflow of Timer 1
• Overflow of Timer 2
• Sensory reserved functions
• Completion of PWM sample period

ANALOG OUTPUT

The RSC-164 offers two separate options for analog output. The DAC (Digital to Analog Converter) output provides a general purpose 10-bit analog output that may be used for speech output (with the inclusion of an audio amplifier), or other purposes requiring an analog waveform. For speech applications that require driving a small speaker, the PWM (Pulse-Width Modulator) output can be used instead of the DAC output. The PWM output can directly drive a 32 ohm speaker.

DIE BOND PAD, PLCC AND QFP PIN DRAWINGS

[Figure: pin drawings, including the RSC-164 64-pin QFP outline.]

Name              PLCC Pin/Die Pad   QFP Pin        I/O      Description
AGND              64                 52             -        Analog Ground. For noise reasons, analog and digital grounds should connect together only at the RSC-164.
A[15:0]           10-17, 20-27       1-8, 11-18     O        External Memory Address Bus
AIN0              63                 51             I        Analog In, low gain (range AGND to AVDD/2)
AIN1              62                 50             I        Analog In, hi gain (8X input amplitude of AIN0, same range)
AVDD              67                 55             -        Analog Voltage. For noise reasons, keep this supply independent of digital circuitry.
PWM0              65                 53             O        Pulse Width Modulator Output0
DACOUT            60                 48             O        Analog Output (unbuffered)
D[7:0]            2-9                57-64          I/O      External Data Bus
GND               1, 18, 33, 52      9, 22, 41, 56  -        Digital Ground, CPU core (pins 1 and 33) and I/O (pins 18 and 52)
PDN               57                 NA             O        Power Down. Active high when powered down.
P1[7:0], P0[7:0]  35-42, 43-50       24-31, 32-39   I/O      General Purpose Port I/O. Pin P0.0 can act as an external interrupt input. All I/O pins can act as “wake up” inputs.
-RDC              53                 42             O        External Code Read Strobe
-RDD              55                 44             O        External Data Read Strobe
-RESET            32                 21             I        Reset
SH                61                 49             I        Sample and Hold. Connect a 470 pF capacitor from here to AGND.
-TE1/PWM1         66                 54             I or O   Test Mode or Pulse Width Modulator Output1 (multiplexed)
VDD               34, 68             23             -        Digital Supply Voltage (core)
VDDi              19, 51             10, 40         -        Digital Supply Voltage (I/O line)
-WRC              54                 43             O        External Code Write Strobe
-WRD              56                 45             O        External Data Write Strobe
-XMH              58                 46             I        External Hi-memory enable (low active)
-XML              59                 47             I        External Low-memory enable (low active)
XO1               30                 19             O        Oscillator 1 output (high frequency)
XI1               31                 20             I        Oscillator 1 input
XO2               29                 NA             O        Oscillator 2 output (32768 Hz)
XI2               28                 NA             I        Oscillator 2 input

DC CHARACTERISTICS
(TO = 0°C to +70°C, Vdd = 5V)

SYMBOL  PARAMETER                         MIN    TYP    MAX       UNITS  TEST CONDITIONS
VIL     Input Low Voltage                 -0.1          0.75      V
VIH     Input High Voltage                2.5           Vdd+0.5   V
VOL     Output Low Voltage                              0.5       V      IOL = 4 mA
VOH     Output High Voltage               4.3                     V      IOH = -4 mA
IIL     Logical 0 Input Current
ICC1    Digital Supply Current                   10               mA     Osc1 Freq=14.32 MHz, CPU clock divide by 1
ICC2    Analog Supply Current                    0.15             mA     Osc1 Freq=14.32 MHz, CPU clock divide by 1
ICC3    Digital Supply Current, Standby                                  Power-down mode
ICC4    Analog Supply Current, Standby                                   Power-down mode
Rpu     Pull-up resistance, P0.0-P1.7                             kΩ     selected with software

[The source table also lists the values 0.3, 4.0, 10, 400 and Hi-Z among the IIL, ICC3, ICC4 and Rpu rows; the row and column of each value cannot be recovered from this copy.]

A.C. CHARACTERISTICS (EXTERNAL MEMORY ACCESSES)
(TO = 0°C to +70°C, Vdd = 5V; load capacitance for outputs = 80 pF; Osc = 14.32 MHz)

SYMBOL   PARAMETER                           CPU=osc/1, 1 WS (MIN/MAX)   CPU=osc/2, 0 WS (MIN/MAX)   UNITS
1/TCL1   Processor Clock frequency           14.32                       7.16                        MHz
TRLRH    -RDC (-RDD) Pulse Width             140                         140                         ns
TRLAV    -RDC (-RDD) Low to Address valid    5                           5                           ns
TALRAX   Address hold after -RDC (-RDD)      0                           0                           ns
TRAVDV   Address valid to Valid Data In      135                         135                         ns
TRHDX    Data Hold after -RDC (-RDD)         0                           0                           ns
TWLWH    -WRC (-WRD) Pulse Width             140                         140                         ns
TAVWL    Address Valid to -WRC (-WRD)        35                          70                          ns
TALWAX   Address Hold after -WRC (-WRD)      35                          70                          ns
TWDVAV   Write Data Valid to Address Valid   5                           5                           ns
TWHQX    Data Hold after -WRC (-WRD)         35                          70                          ns

ABSOLUTE MAXIMUM RATINGS

Any pin to GND               -0.1V to +7.5V
Operating temperature (TO)   0°C to +70°C
Soldering temperature        260°C for 10 sec
Power dissipation            TBD
Operating Conditions         0°C to +70°C; VDD = 3.5 - 5.0V; VSS = 0V

WARNING: Stressing the RSC-164 beyond the “Absolute Maximum Ratings” may cause permanent damage. These are stress ratings only. Operation beyond the “Operating Conditions” is not recommended and extended exposure beyond the “Operating Conditions” may affect device reliability.

TIMING DIAGRAMS

Note that the -RDC signal does not necessarily pulse for every read from code space, but may stay low for multiple cycles.

[Figure: External Read Timing. Signals: -RDD (-RDC), ADDRESS, DATA. Intervals: TRLRH, TRLAV, TALRAX, TRAVDV, TRHDX.]

[Figure: External Write Timing. Signals: -WRC (-WRD), ADDRESS, DATA. Intervals: TWLWH, TAVWL, TALWAX, TWDVAV, TWHQX.]

ORDERING INFORMATION

Part      Suffix   Description
RSC-164   none     Unpackaged RSC-164 in die form
RSC-164   P        RSC-164 in 68-pin PLCC package
RSC-164   Q        RSC-164 in 64-pin QFP package
RSC-164   C        RSC-164 in 68-pin COB package

THE INTERACTIVE SPEECH™ PRODUCT LINE

The Interactive Speech line of ICs was developed to “bring life to products” through advanced speech and audio technology.
These chips allow products to think, talk, hear and play music. The Interactive Speech chips were designed for consumer telephony products and cost-sensitive consumer electronic applications such as home electronics, personal security, and personal communication. The product line includes general purpose microcontrollers (RSC-164, RSC-164i, RSC-132i) and application-specific standard speech ASSPs (Voice Password™ and Voice Direct™).

RSC-164i
The RSC-164i is very similar in functionality to the RSC-164 and can perform speech recognition, speaker verification, speech and music synthesis, and general product control. This chip requires a custom mask of the on-chip ROM for each customer application and is ideal for high volume applications. This chip has limited I/O pins and limited access to external memory.

RSC-132i
The RSC-132i, the lowest cost member of the RSC series, is a speech-enabled microcontroller designed specifically for the toy industry. The RSC-132i can support various combinations of speech technologies. The RSC-132i combines an 8-bit microcontroller with high quality speaker-independent and speaker-dependent speech recognition, speech synthesis, and speaker verification. This chip has 32 kbytes of ROM and limited I/O pins.

Voice Password™ ASSP
The Voice Password ASSP provides consumer products with low cost biometric security. The chip lets products “lock out” unauthorized access by verifying key words and/or voices. Using text-dependent speaker verification technology, Voice Password can secure products at a variety of security thresholds for many different applications.

Voice Direct™ ASSP
The Voice Direct ASSP provides cost-sensitive products with speaker-dependent speech recognition, speech synthesis and DTMF tone generation. This easy-to-use, pin-configurable chip requires no custom programming and can recognize up to 60 trained words. The Voice Direct ASSP is ideal for consumer telephony products which feature voice dialing.
IMPORTANT NOTICES

Sensory reserves the right to make changes to or to discontinue any product or service identified in this publication at any time without notice in order to improve design and supply the best possible product. Sensory does not assume responsibility for use of any circuitry other than circuitry entirely embodied in a Sensory product. Information contained herein is provided gratuitously and without liability to any user. Reasonable efforts have been made to verify the accuracy of this information but no guarantee whatsoever is given as to the accuracy or as to its applicability to particular uses.

Applications described in this data sheet are for illustrative purposes only, and Sensory makes no warranties or representations that the RSC series of products will be suitable for such applications. In every instance, it must be the responsibility of the user to determine the suitability of the products for each application. Sensory products are not authorized for use as critical components in life support devices or systems.

Sensory conveys no license or title, either expressed or implied, under any patent, copyright, or mask work right to the RSC series of products, and Sensory makes no warranties or representations that the RSC series of products are free from patent, copyright, or mask work right infringement, unless otherwise specified. Nothing contained herein shall be construed as a recommendation to use any product in violation of existing patents or other rights of third parties. The sale of any Sensory product is subject to all Sensory Terms and Conditions of Sales and Sales Policies.

521 East Weddell Drive
Sunnyvale, CA 94089

© 1996 SENSORY, INC. ALL RIGHTS RESERVED
P/N 80-0015-6

Sensory is registered by the U.S. Patent and Trademark Office. All other trademarks or registered trademarks are the property of their respective owners.
TEL: (408) 744-9000
FAX: (408) 744-1299