RSC-300/364
Recognition • Synthesis • Control
Speech Recognition Microcontroller

GENERAL DESCRIPTION

The RSC-300/364, from the Interactive Speech™ family of products, is an 8-bit microcontroller designed specifically for speech applications in consumer electronic products. The RSC-300/364 is a single-chip solution that combines the flexibility of a microcontroller with advanced speech technology, including high-quality speech recognition, speech and music synthesis, speaker verification, and voice record and playback. Products can use one or all of the RSC-300/364 features in a single application.

The RSC-300/364 supports Sensory Speech™ 5.0, the latest speech recognition technology from Sensory, which includes a number of new techniques that significantly improve recognition performance over previous versions. Using sophisticated neural network technology, the on-chip speech recognition algorithms reach an accuracy of greater than 97% for speaker-independent recognition and greater than 99% for speaker-dependent recognition.

In addition to the improved recognition performance, the RSC-300/364 provides further on-chip integration of features, including a preamplifier, multiplier, watchdog timer, and 2.5 Kbytes of RAM. A complete system may be built with few additional parts other than a battery, speaker, microphone, and a few resistors and capacitors. The RSC-300 is a ROM-less version designed for applications that need more ROM space and consequently use off-chip memory.

FEATURES

Full Range of Sensory Speech™ 5.0 Capabilities
• Speaker-independent speech recognition
• Speaker-dependent speech recognition
• High quality speech synthesis and sound effects
• Speaker verification
• Four-voice music synthesis
• Voice record & playback

Integrated Single-Chip Solution
• 4 MIPS 8-bit microcontroller
• On-chip A/D and D/A converters, and pre-amplifier
• 32 kHz clock for time keeping
• Internal 64 Kbytes ROM; 2.5 Kbytes RAM
• Internal 32 kHz watchdog timer
• External memory bus: 16-bit Address, 8-bit Data
• 24x24 Multiplier for rapid recognition processing

Low Power Requirements
• 2.4 – 5.25V operation for 2 or 3 battery applications
• ~10 mA operating current at 3V
• Power down mode; <5 µA standby current

[Figure: RSC-300/364 Block Diagram. Blocks: Oscillator, Preamp and Gain Control, AGC, Multiplexer, ADC, DAC, AMP, Microcontroller, RAM, ROM (RSC-364 only), Digital Logic, Multiplier, Watchdog Timer, General Purpose I/O, External Memory. External connections: Microphone, Speaker.]

From the Interactive Speech™ Line of Products

RSC-300/364 DATA SHEET

RECORD AND PLAYBACK

The RSC-300/364 can perform audio record and playback at various compression levels depending on the quantity and quality of playback desired. Data rates of under 14,000 bits per second are achievable while maintaining very high quality reproduction. The RSC-300/364 also performs silence removal to improve sound quality and reduce memory requirements.

RSC-300/364 OVERVIEW

The RSC-300/364 is a member of the Interactive Speech™ line of products from Sensory. It features a high-performance 8-bit microcontroller with on-chip A/D, D/A, preamplifier, RAM and ROM (RSC-364 only). The RSC-300/364 is designed to bring a high degree of integration and versatility into low-cost, power-sensitive toy applications.

Various functional units have been integrated onto the CPU core in order to reduce total system cost and increase system reliability without degrading system performance. The RSC-300/364 delivers 4 MIPS of integer performance at 14.32 MHz, providing maximum performance at minimum cost.
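The headline figures quoted above (4 MIPS at 14.32 MHz, and record/playback data rates of under 14,000 bits per second) translate into simple design budgets. The following is a minimal sketch of that arithmetic only. The function names and the 64 Kbyte memory size used in the example are assumptions chosen for illustration (64 Kbytes is what a 16-bit address bus can reach directly), and the estimate ignores any gain from silence removal.

```python
def average_clocks_per_instruction(clock_hz: float, instructions_per_second: float) -> float:
    """Average number of clock cycles spent per instruction."""
    return clock_hz / instructions_per_second


def recording_seconds(memory_bytes: int, bits_per_second: float) -> float:
    """Seconds of audio that fit in memory at a given compressed data rate."""
    return memory_bytes * 8 / bits_per_second


if __name__ == "__main__":
    # 14.32 MHz clock delivering 4 MIPS.
    print(round(average_clocks_per_instruction(14.32e6, 4e6), 2))
    # Hypothetical 64 Kbyte memory filled at 14,000 bits per second.
    print(round(recording_seconds(64 * 1024, 14_000), 1))
```

Under these assumptions the processor averages about 3.6 clocks per instruction, and a 64 Kbyte memory holds a little over half a minute of audio at the quoted data rate.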
The CPU core embedded in the RSC-300/364 is an 8-bit, variable-length-instruction microcontroller. The instruction set is somewhat similar to the Zilog™ Z8, and offers mov instructions with a variety of addressing modes. The RSC-300/364 processor avoids the limitations of dedicated A, B, and DPTR registers by having completely symmetrical sources and destinations for all instructions. Of the 2.5 Kbytes of internal RAM, 2 Kbytes are organized as a Data Space, with 0.5 Kbytes used for Register Space.

SPEAKER VERIFICATION

The RSC-300/364 can also perform text-dependent speaker verification. After a speaker trains the chip on a specific word, the chip is able to identify whether that word is spoken by the original speaker, thus providing biometric security.

POWER

The typical operating current is 10 mA at 14.32 MHz and 3V. Lowering the clock frequency reduces power consumption, although speech recognition requires a 14.32 MHz clock. Standby current is <5 µA in power down mode.

[Figure: RSC-300/364 Architecture Diagram. Labels include: PRE-AMP (AIFE1, AOFE1, AIFE2, AOFE2, AOFE3), AIN0, AIN1, ADC, DAC (DACOUT), SPEECH PROCESSING UNIT, ANALOG CONTROL, PULSE WIDTH MODULATOR (BUFOUT/PWM, -TE1/PWM), OSC1 (XI1, XO1), OSC2 (XI2, XO2), TIMER1, TIMER2, CPU, STACK SPACE (8 levels), 448 bytes, REGISTER SPACE, 2K SRAM, INTERNAL ROM (RSC-364; 32K x 8 HIGH, 32K x 8 LOW; -XMH, -XML), EXTERNAL MEMORY INTERFACE (A[15:0], D[7:0], -RDC, -WRC, -RDD, -WRD), TIMING AND CONTROL, BREAK POINT REGISTER, INTERRUPT LOGIC, PORT 0 (P0.0-P0.7), PORT 1 (P1.0-P1.7), -RESET.]

TECHNOLOGY

SPEECH RECOGNITION

The RSC-300/364 uses a neural network to perform speaker-independent or speaker-dependent speech recognition. Speaker-dependent recognition requires external memory to store speech recognition information (e.g., SRAM, optional Serial EEPROM, Flash Memory). Speaker-independent recognition requires on-chip or off-chip ROM to store the words to be recognized. The RSC-300/364 has several additional speech recognition features as described below.

Continuous listening allows the chip to continuously listen for a specific word. With this feature a product can be used in a normal environment and only “activates” when a specific word, preceded by quiet, is spoken.

SPEECH AND MUSIC SYNTHESIS

The RSC-300/364 provides high-quality speech synthesis by using a hybrid of a time-domain compression scheme that improves on conventional ADPCM and a customized reuse of sounds. Speech synthesis requires on-chip or off-chip ROM to store audio sounds for synthesis.

The RSC-300/364 provides high-quality, low-cost four-voice music synthesis which allows multiple, simultaneous instruments for harmonizing. The RSC-300/364 uses a MIDI-like system to generate music.

RSC-300/364 ARCHITECTURE

The RSC-300/364 is a highly integrated device that combines:
• 8-bit microcontroller
• On-chip ROM (64 Kbytes, RSC-364 only) and RAM (2.5 Kbytes), and the ability to address off-chip RAM or ROM
• A/D converter and D/A converter
• Input amplifier and pulse width modulator

An external microphone passes an audio signal to the preamplifier and ADC (Analog-to-Digital Converter), which convert the incoming speech signal into digital data. The output audio signal of the RSC-300/364 is derived from a DAC (Digital-to-Analog Converter) or PWM (Pulse Width Modulator).

The RSC-300/364 has an external memory interface, with a 16-bit address bus and an 8-bit data bus, for accessing external memory. It also has an internal ROM (RSC-364 only) that can be enabled or disabled (partially or fully) by pin inputs (signals -XMH and -XML). Two bi-directional ports provide 16 general purpose I/O pins to communicate with external devices. The RSC-300/364 has a high frequency (14.32 MHz) oscillator as well as a low frequency (32,768 Hz) oscillator suitable for timekeeping applications. The processor clock can be selected from either source, with a selectable divider value. The device performs speech recognition when running at 14.32 MHz. The RSC-300/364 also supports programmable wait states to allow the use of slower external devices. There are two programmable 8-bit counters/timers, one derived from each oscillator.

USING THE RSC-300/364

Creating applications using the RSC-300/364 requires the development of electronic circuitry, software code, and speech/music data files. Software code for the RSC-300/364 can be developed by Sensory or by external programmers using the RSC-300/364 Development Kit. For more information about development tools and services, please contact Sensory.

A typical product will require about $0.30 - $1.00 (in high volume) of components in addition to the RSC-300/364.

The following sample circuit provides an example of how the RSC-300/364 might be used in a consumer electronic product.

[Figure: Sample Application Circuit (Die). The schematic shows an RSC364 die (U1) connected over A0-A15 and D0-D7 to an AT27LV512A (TSOP) external memory, together with a microphone input (INPUT-MIC), a speaker (LS1), a 14.318 MHz crystal (Y1), and supporting resistors (R1-R7) and capacitors (C1-C14).]
RSC-300/364 INSTRUCTION SET

The instruction set for the RSC-300/364 has 54 instructions comprising 10 move, 7 rotate, 11 branch, 11 register arithmetic, 9 immediate arithmetic, and 6 miscellaneous instructions. All instructions are 3 bytes or fewer, and no instruction requires more than 10 clock cycles to execute.

GENERAL PURPOSE I/O

The RSC-300/364 has 16 general purpose I/O pins (P0.0-P0.7, P1.0-P1.7). Each pin can be programmed as an input with weak pull-up (~150 kΩ equivalent device), an input with strong pull-up (~10 kΩ equivalent device), an input without pull-up, or an output.

EXTERNAL MEMORY

The RSC-300/364 includes an external memory interface that allows connection with memory devices for speaker-dependent speech recognition, audio record/playback, and extended durations of speech and music synthesis. Separate data and address buses allow use of standard EPROMs, ROMs, SRAMs, and Flash memory with little or no additional decoding. Support for separate read and write signals for each external memory space further simplifies interfacing. The RSC-300/364 includes 8 data lines (D[7:0]), 16 address lines (A[15:0]), and associated control signals for memory interfacing.

OSCILLATORS

Two independent oscillators in the RSC-300/364 provide a high-frequency clock and a 32 kHz time-keeping clock. Both oscillators work with an external crystal, a ceramic resonator, or an LC circuit. The oscillator characteristics are:

Oscillator #1    Pins XI1 and XO1    14.32 MHz
Oscillator #2    Pins XI2 and XO2    32768 Hz

TIMERS/COUNTERS

The two independent oscillators of the RSC-300/364 provide counts to two internal timers. Each of the two timers consists of an 8-bit reload value register and an 8-bit up-counter. The reload register is readable and writeable by the processor.

INTERRUPTS

The RSC-300/364 allows for five interrupt sources, as selected by software.
Each has its own mask bit and request bit in the IMR and IRQ registers, respectively. The following events can generate interrupts:
• Positive edge on Port 0, bit 0
• Overflow of Timer 1
• Overflow of Timer 2
• Sensory reserved functions
• Completion of PWM sample period

PREAMPLIFIER

The on-chip preamplifier circuit consists of three stages with a maximum overall gain of about 500. The amplifier includes a Vref input that is used to set the amplifier center voltages and must be driven by a low impedance voltage supplied by an external source. The signal inputs of all stages have an 80 kΩ input impedance to the Vref pad. In a typical design, AOFE1 would be directly coupled to AIFE2, and AOFE2 would be capacitively coupled to AIN0 through an RC lowpass filter to remove DC offset and digital noise. AOFE3 would be bypassed to Vref with a small (220 pF) capacitor for additional noise suppression.

ANALOG OUTPUT

The RSC-300/364 offers two separate options for analog output. The DAC (Digital-to-Analog Converter) output provides a general purpose 10-bit analog output that may be used for speech output (with the inclusion of an audio amplifier), or for other purposes requiring an analog waveform. For speech applications that require driving a small speaker, the PWM (Pulse Width Modulator) output can be used instead of the DAC output. The PWM output can directly drive a 32 ohm speaker.

CLOCK

The RSC-300/364 uses a fully static core: the processor can be stopped (by removing the clock source) and restarted without causing a reset or losing the contents of internal registers. Static operation is guaranteed from DC to 14.32 MHz. Typically the processor clock runs from a 14.32 MHz crystal with no divisor and one wait state. This creates internal RAM cycles of 70 nsec duration and internal ROM (RSC-364 only) or external cycles of 140 nsec duration. Careful design may allow operation with memories having access times as slow as 120 nsec.
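The 70 nsec and 140 nsec cycle durations quoted in the CLOCK section can be reproduced from the oscillator frequency. The sketch below assumes a simple model inferred from those quoted figures rather than stated by the data sheet: an access with no wait states takes one processor clock period, and each wait state adds one more period. The function name and parameters are hypothetical.

```python
def cycle_time_ns(osc_hz: float, divider: int = 1, wait_states: int = 0) -> float:
    """Memory cycle duration in nanoseconds.

    Assumed model: one processor clock period per access, plus one
    additional period for each programmed wait state.
    """
    clock_period_ns = 1e9 * divider / osc_hz
    return (1 + wait_states) * clock_period_ns


if __name__ == "__main__":
    osc = 14.32e6
    print(round(cycle_time_ns(osc), 1))                 # internal RAM, no wait state
    print(round(cycle_time_ns(osc, wait_states=1), 1))  # ROM or external, one wait state
    print(round(cycle_time_ns(osc, divider=2), 1))      # osc/2, no wait state
```

With this model, the osc/1 with one wait state and osc/2 with no wait states configurations give the same external cycle length, which is consistent with the data sheet quoting 140 ns strobe widths for both.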
PACKAGING

The RSC-300/364 can be purchased as unpackaged die or in a 64-pin TQFP package.

DIE BOND PAD AND QFP PIN DESCRIPTIONS

[Figure: RSC-364 top view of die (pads 1-72) and RSC-364 64-pin QFP (pins 1-64).]

Name              Die Pad          QFP Pin          Description                                                      I/O
A[15:0]           20-27, 30-37     1-8, 11-18       External Memory Address Bus                                      O
AIN0              5                52               Analog In, low gain (range AGND to AVDD/2)                       I
AIN1              4                51               Analog In, hi gain (8X input amplitude of AIN0, same range)      I
AOFE1             72               49               Output of 1st stage of preamplifier                              O
AOFE2             6                53               Output of 2nd stage of preamplifier                              O
AOFE3             3                51               Output of 3rd stage of preamplifier                              O
AIFE1             71               48               Input of 1st stage of preamplifier                               I
AIFE2             1                49               Input of 2nd stage of preamplifier                               I
NC                10, 11, 43, 44   -                Not Connected                                                    -
PWM0              8                55               Pulse Width Modulator Output0                                    O
DACOUT            2                50               Analog Output (unbuffered)                                       O
D[7:0]            12-19            57-64            External Data Bus                                                I/O
Vss               7, 28, 62        9, 39, 54        Vss                                                              -
PDN               67               44               Power Down. Active high when powered down.                       O
P1[7:0], P0[7:0]  45-52, 53-60     22-29, 30-37     General Purpose Port I/O. Pin P0.0 can act as an external        I/O
                                                    interrupt input. All I/O pins can act as “wake up” inputs.
/RDC              63               40               External Code Read Strobe                                        O
/RDD              65               42               External Data Read Strobe                                        O
/RESET            42               21               Reset                                                            I
/TE1 or PWM1      9                56               Test Mode or Pulse Width Modulator Output1 (multiplexed)         I or O
VREF              70               47               Reference Voltage = Vdd/2 or Vdd/4. Depends on software.         -
VDD               29, 61           10, 38           Supply Voltage                                                   -
/WRC              64               41               External Code Write Strobe                                       O
/WRD              66               43               External Data Write Strobe                                       O
/XMH              68               45               External Hi-memory enable (low active)                           I
/XML              69               46               External Low-memory enable (low active)                          I
XO1               40               19               Oscillator 1 output (high frequency)                             O
XI1               41               20               Oscillator 1 input                                               I
XO2               38               NA               Oscillator 2 output (32768 Hz)                                   O
XI2               39               NA               Oscillator 2 input                                               I

DC CHARACTERISTICS
(TO = 0°C to +70°C, VDD = 2.4V – 5.25V)

SYMBOL          PARAMETER                         MIN        TYP    MAX        UNITS   TEST CONDITIONS
VIL             Input Low Voltage                 -0.1              0.75       V
VIH (Vcc<3.6)   Input High Voltage                0.8*Vdd           Vdd+0.3    V
VIH (Vcc>3.6)   Input High Voltage                3.0               Vdd+0.3    V
VOL             Output Low Voltage                                  0.1*Vdd    V       IOL = 2 mA
VOH             Output High Voltage (I/O Pins)                                 V       IOL = -2 mA
IIL             Logical 0 Input Current                      <1     10         uA      Vss<Vpin<Vdd
IDD1            Supply Current, Active                       10     20         mA      Hi-Z Outputs
IDD3            Supply Current, Powerdown                    1      10         uA      Hi-Z Outputs
Rpu             Pull-up resistance, P0.0-P1.7     4.5, 200, Hi-Z               kΩ      Selected with software
Rpu             Pull-up resistance, /XML, /XMH    200                          kΩ      Fixed

[Table residue: the source also lists 0.3, 0.8*Vdd, and 0.9*Vdd among the VOL/VOH entries and "5, 80, Hi-Z" among the I/O pin pull-up entries; their columns are not recoverable.]

A.C. CHARACTERISTICS (EXTERNAL MEMORY ACCESSES)
(TO = 0°C to +70°C, VDD = 5V; load capacitance for outputs = 80 pF; Osc = 14.32 MHz)

SYMBOL    PARAMETER                              CPU=osc/1, 1 WS   CPU=osc/2, 0 WS   UNITS
                                                 (MIN/MAX)         (MIN/MAX)
1/TCL1    Processor Clock frequency              14.32             7.16              MHz
TRLRH     -RDC (-RDD) Pulse Width                140               140               ns
TRLAV     -RDC (-RDD) Low to Address valid       5                 5                 ns
TALRAX    Address hold after -RDC (-RDD)         0                 0                 ns
TRAVDV    Address valid to Valid Data In         135               135               ns
TRHDX     Data Hold after -RDC (-RDD)            0                 0                 ns
TWLWH     -WRC (-WRD) Pulse Width                140               140               ns
TAVWL     Address Valid to -WRC (-WRD)           35                70                ns
TALWAX    Address Hold after -WRC (-WRD)         35                70                ns
TWDVAV    Write Data Valid to Address Valid      5                 5                 ns
TWHQX     Data Hold after -WRC (-WRD)            35                70                ns

TIMING DIAGRAMS

Note that the -RDC signal does not necessarily pulse for every read from code space, but may stay low for multiple cycles.
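As a rough illustration of how the A.C. characteristics might be used when choosing an external memory, the sketch below compares a memory's address access time against the 135 ns TRAVDV figure (address valid to valid data in). It assumes TRAVDV is the time available for the memory to return data; the function names and the optional guard band are hypothetical, and a real design would also need to check the strobe and hold timings.

```python
TRAVDV_NS = 135.0  # address valid to valid data in, from the A.C. table


def read_margin_ns(memory_access_ns: float, address_to_data_ns: float = TRAVDV_NS) -> float:
    """Time left over after the memory has delivered its data."""
    return address_to_data_ns - memory_access_ns


def memory_is_fast_enough(memory_access_ns: float,
                          address_to_data_ns: float = TRAVDV_NS,
                          guard_ns: float = 0.0) -> bool:
    """True if the memory meets the read window with the requested guard band."""
    return read_margin_ns(memory_access_ns, address_to_data_ns) >= guard_ns


if __name__ == "__main__":
    for access_ns in (90.0, 120.0, 150.0):
        print(access_ns, read_margin_ns(access_ns), memory_is_fast_enough(access_ns))
```

Under this assumption a 120 ns memory leaves 15 ns of margin, which matches the data sheet's remark that careful design may allow memories that slow, while a 150 ns memory would need a slower processor clock or additional wait states.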
[Figure: External Read Timing. Signals: -RDD (-RDC), ADDRESS, DATA. Intervals: TRLRH, TRLAV, TALRAX, TRAVDV, TRHDX.]

[Figure: External Write Timing. Signals: -WRC (-WRD), ADDRESS, DATA. Intervals: TWLWH, TAVWL, TALWAX, TWDVAV, TWHQX.]

ABSOLUTE MAXIMUM RATINGS

Any pin to GND                -0.1V to +6.5V
Operating temperature (TO)    0°C to +70°C
Soldering temperature         260°C for 10 sec
Power dissipation             1W
Operating Conditions          0°C to +70°C; VDD = 2.4 - 5.25V; VSS = 0V

WARNING: Stressing the RSC-300/364 beyond the “Absolute Maximum Ratings” may cause permanent damage. These are stress ratings only. Operation beyond the “Operating Conditions” is not recommended, and extended exposure beyond the “Operating Conditions” may affect device reliability.

ORDERING INFORMATION

Part           Marketing #   Description
RSC-364 DWF    C364XS1P      Tested die in wafer form
RSC-364 Die    C364XD1B      Tested, singulated die in waffle pack
RSC-364 QFP    C364XT1T      64 pin 10 x 10 x 1.4 mm TQFP
RSC-300 DWF    C300XS1P      Tested die in wafer form
RSC-300 Die    C300XD1B      Tested, singulated die in waffle pack
RSC-300 QFP    C300XT1T      64 pin 10 x 10 x 1.4 mm TQFP

THE INTERACTIVE SPEECH™ PRODUCT LINE

The Interactive Speech line of ICs and software was developed to “bring life to products” through advanced speech recognition and audio technology. The Interactive Speech Product Line was designed for consumer telephony products and cost-sensitive consumer electronic applications such as home electronics, personal security, and personal communication. The product line includes award-winning RSC-series general purpose microcontrollers plus a line of easy-to-implement chips which can be pin-configured or controlled by an external host microcontroller. Sensory’s software technologies run on a variety of microcontrollers and DSPs.

RSC Microcontrollers

The RSC family of microcontrollers (RSC-164, RSC-200/264T, RSC-300/364) are low-cost 8-bit microcontrollers designed for use in consumer electronics.
All members of the RSC family are fully integrated and include a speech processor, A/D, D/A, ROM (except RSC-200/300), and RAM circuitry on chip. The RSC-200/264T and RSC-300/364 also include on-chip pre-amplification. The RSC family of microcontrollers can perform a full range of speech/audio functions including speech recognition, speaker verification, speech and music synthesis, and voice record/playback.

Voice Direct™ TSSP

The Voice Direct TSSP provides cost-sensitive products with speaker-dependent speech recognition and speech synthesis. This easy-to-use, pin-configurable chip requires no custom programming and can recognize up to 60 trained words in slave mode, and 15 words in stand-alone mode. The Voice Direct TSSP is ideal for speaker-dependent command and control of household consumer products, and is part of a complete product line that includes the IC, module, Development Kit, and Voice Direct™ Speech Recognition Kit, for product developers with limited time and unlimited imagination!

Voice Dialer™ ASSP

The Voice Dialer ASSP delivers speech recognition technology that allows users to dial phone numbers by saying the name of the person they wish to call. Voice dialing and phone directory management through speech recognition can be easily integrated into existing products. This IC is designed for use as a slave chip controlled by an external host processor.

Voice Activation™ Software

Sensory’s Voice Activation™ software provides advanced speech technology on a variety of microcontroller and DSP platforms. A complete speech API and flexible design allow manufacturers to easily integrate speech functionality into telephony products.

IMPORTANT NOTICES

Sensory reserves the right to make changes to or to discontinue any product or service identified in this publication at any time without notice in order to improve design and supply the best possible product.
Sensory does not assume responsibility for use of any circuitry other than circuitry entirely embodied in a Sensory product. Information contained herein is provided gratuitously and without liability to any user. Reasonable efforts have been made to verify the accuracy of this information but no guarantee whatsoever is given as to the accuracy or as to its applicability to particular uses. Applications described in this data sheet are for illustrative purposes only, and Sensory makes no warranties or representations that the RSC series of products will be suitable for such applications. In every instance, it must be the responsibility of the user to determine the suitability of the products for each application. Sensory products are not authorized for use as critical components in life support devices or systems. Sensory conveys no license or title, either expressed or implied, under any patent, copyright, or mask work right to the RSC series of products, and Sensory makes no warranties or representations that the RSC series of products are free from patent, copyright, or mask work right infringement, unless otherwise specified. Nothing contained herein shall be construed as a recommendation to use any product in violation of existing patents or other rights of third parties. The sale of any Sensory product is subject to all Sensory Terms and Conditions of Sales and Sales Policies.

Sensory, Inc.
521 East Weddell Drive
Sunnyvale, CA 94089
TEL: (408) 744-9000
FAX: (408) 744-1299

© 1999 SENSORY, INC. ALL RIGHTS RESERVED
P/N 80-0111-6

Sensory is registered by the U.S. Patent and Trademark Office. All other trademarks or registered trademarks are the property of their respective owners.