RSC-164i
Recognition • Synthesis • Control

General Purpose Microcontroller Featuring Speech Recognition, Speaker Verification, and Speech Synthesis

GENERAL DESCRIPTION

The RSC-164i, from the Interactive Speech™ family of products, is a low-cost microcontroller designed for use in consumer electronics. The RSC-164i combines an 8-bit microcontroller with high-quality speaker-independent and speaker-dependent speech recognition, speaker verification, and speech synthesis. Products can use one or all of the RSC-164i features in a single application.

The RSC-164i employs a sophisticated neural network that learns to classify sound data. On-chip speech recognition algorithms reach an accuracy of greater than 96% for speaker-independent recognition and greater than 99% for speaker-dependent recognition. Sensory’s neural network approach (patent pending) eliminates the need for expensive signal processing or extensive RAM storage.

The highly-integrated nature of the chip reduces external parts count. A complete system may be built with few additional parts other than a battery, speaker, microphone, and audio input support circuitry.

The RSC-164i is similar to the RSC-164, but does not support a parallel memory interface. All program, synthesis, recognition, and verification code must fit in the on-chip ROM.

FEATURES

• 4 MIPS 8-bit microcontroller

Full Range of Speech Capabilities
• Speaker-independent speech recognition
• Speaker-dependent speech recognition
• High quality speech synthesis and sound effects
• Speaker verification

Integrated Single-Chip Solution
• On-chip A/D and D/A converters, digital filtering
• 32kHz clock for time keeping
• Internal 64kbytes ROM; 384 bytes RAM
• 12 general purpose I/O lines
• On-chip output PWM for direct speaker drive

Low Power Requirements
• 3.5-5.0V supply
• ~10mA operating

[Figure: RSC-164i Block Diagram. Blocks shown: Oscillator, External Preamp, Microphone, Analog Input, AGC, ADC, DAC, Microcontroller, Digital Logic, PWM, RAM, ROM, Speaker, General Purpose I/O]

From the Interactive Speech™ Line of Products
RSC-164i DATA SHEET

RSC-164i OVERVIEW

The RSC-164i features a high-performance 8-bit microcontroller with on-chip A/D, D/A, RAM and ROM. The RSC-164i is designed to bring a high degree of integration and versatility into low-cost, power-sensitive consumer applications.

The CPU core embedded in the RSC-164i is an 8-bit, variable-length-instruction microcontroller. The instruction set is loosely based on Intel’s 8051, with mov instructions in a variety of addressing modes. However, the RSC-164i processor avoids the limitations of dedicated A, B, and DPTR registers by having completely symmetrical sources and destinations for all instructions. The 384 bytes of internal RAM are organized as a Register Space.

[Architecture diagram labels: AIN0, AIN1, ADC MUX, SH, ADC, DACOUT, DAC, ANALOG CONTROL, BUFOUT /PWM, REGISTER SPACE (384 bytes), PULSE WIDTH MODULATOR, STACK SPACE (8 bytes), XI1, XO1, OSC1, TIMER1, TIMER2, XI2, XO2, INTERRUPT LOGIC]

Various functional units have been integrated onto the CPU core in order to reduce total system cost and increase system reliability without degrading system performance. The RSC-164i delivers 4 MIPS of integer performance at 14.32 MHz.

POWER

The typical operating current is 10 mA at 14.32 MHz.
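As a rough cross-check of the 4 MIPS at 14.32 MHz figure quoted in the overview, the following sketch derives the clock period and the average number of clock cycles per instruction that the two numbers imply. The function and constant names are illustrative only, and the result is an average implied by the quoted figures, not a timing for any particular instruction.

```python
# Back-of-the-envelope check of the quoted throughput figures.
# Only the 14.32 MHz and 4 MIPS values come from the data sheet;
# every name below is illustrative.

CLOCK_HZ = 14.32e6  # processor clock quoted in the data sheet
MIPS = 4.0          # quoted performance, millions of instructions per second


def clock_period_ns(clock_hz: float) -> float:
    """Duration of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


def avg_cycles_per_instruction(clock_hz: float, mips: float) -> float:
    """Average clock cycles consumed per instruction at the given throughput."""
    return clock_hz / (mips * 1e6)


if __name__ == "__main__":
    print(f"clock period: {clock_period_ns(CLOCK_HZ):.1f} ns")
    print(f"average cycles per instruction: "
          f"{avg_cycles_per_instruction(CLOCK_HZ, MIPS):.2f}")
```

Under these figures one clock period is just under 70 ns and an average instruction takes roughly three and a half clock cycles.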
[Figure: RSC-164i Architecture Diagram. Labels shown: CPU, INTERNAL ROM (32K x 8 HIGH, 32K x 8 LOW), OSC2, TIMING AND CONTROL, PORT0 (P0.0-P0.7), PORT1 (P1.0-P1.3), -RESET, -TE1/ PWM, BREAK POINT REGISTER]

SPEECH RECOGNITION

The RSC-164i uses a neural network to perform speaker-independent or speaker-dependent speech recognition. Speaker-dependent recognition requires external serial memory to store speech recognition templates. Speaker-independent recognition requires on-chip ROM to store the words to be recognized. The RSC-164i has several additional speech recognition features as described below.

Continuous listening allows the chip to continuously listen for a specific word. With this feature a product can be used in a normal environment and only “activates” when a specific word, preceded by quiet, is spoken.

SPEECH SYNTHESIS

The RSC-164i provides high-quality speech synthesis by using a hybrid of a time-domain compression scheme that improves on conventional ADPCM and a customized reuse of sounds. Speech synthesis uses on-chip ROM to store audio sounds for synthesis. The RSC-164i supports approximately 25 seconds of speech.

SPEAKER VERIFICATION

The RSC-164i can perform text-dependent speaker verification. After a speaker trains the chip on a specific word, the chip will be able to identify whether that word is spoken by the original speaker.

RSC-164i ARCHITECTURE

The RSC-164i is a highly integrated device that combines:
• 8-bit microcontroller
• On-chip ROM (64 kbytes) and RAM (384 bytes)
• A/D converter and D/A converter

Two bi-directional ports provide 12 general purpose I/O pins to communicate with external devices. The RSC-164i has a high frequency (14.32 MHz) oscillator as well as a low frequency (32,768 Hz) oscillator suitable for timekeeping applications. The processor clock can be selected from either source, with a selectable divider value. The device performs speech recognition when running at 14.32 MHz. There are two programmable 8-bit counters / timers, one derived from each oscillator.
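The GENERAL PURPOSE I/O section of this data sheet describes each of the 12 I/O pins as configurable in one of four modes through 24 bits of configuration registers, which works out to two bits per pin. A minimal sketch of that bookkeeping follows, assuming one bit per pin in each of Port Control Register A and Port Control Register B. The mode-to-bit encoding, the bit ordering, and all class and function names are hypothetical and are not taken from the RSC-164i documentation.

```python
from enum import Enum
from typing import Sequence, Tuple


class PinMode(Enum):
    # Value is (bit in control register A, bit in control register B).
    # HYPOTHETICAL encoding, chosen only for illustration.
    INPUT_NO_PULLUP = (0, 0)
    INPUT_WEAK_PULLUP = (0, 1)
    INPUT_STRONG_PULLUP = (1, 0)
    OUTPUT = (1, 1)


# Port widths from the data sheet: P0.0-P0.7 and P1.0-P1.3.
PORT_WIDTHS = {0: 8, 1: 4}


def total_config_bits() -> int:
    """Two configuration bits per pin, summed over both ports."""
    return sum(2 * width for width in PORT_WIDTHS.values())


def pack_port_config(port: int, modes: Sequence[PinMode]) -> Tuple[int, int]:
    """Return (control_a, control_b) values for one port.

    modes[i] is the desired mode of pin i of the port (assumed ordering).
    """
    width = PORT_WIDTHS[port]
    if len(modes) != width:
        raise ValueError(f"port {port} has {width} pins, got {len(modes)} modes")
    control_a = 0
    control_b = 0
    for pin, mode in enumerate(modes):
        a_bit, b_bit = mode.value
        control_a |= a_bit << pin
        control_b |= b_bit << pin
    return control_a, control_b


if __name__ == "__main__":
    port1 = [PinMode.OUTPUT, PinMode.OUTPUT,
             PinMode.INPUT_NO_PULLUP, PinMode.INPUT_WEAK_PULLUP]
    print(total_config_bits(), pack_port_config(1, port1))
```

The actual register layout should be taken from the RSC Development Kit documentation; the sketch only shows that two bits per pin are enough to select among the four modes.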
A microphone with an external preamp converts sound into an audio signal that is fed to the RSC-164i. The gain of the external preamp may be controlled by the RSC-164i by using two of the I/O lines. The RSC-164i uses an ADC (Analog-to-Digital Converter) to convert incoming analog speech signals into digital data. The output audio signal of the RSC-164i is derived from a DAC (Digital-to-Analog Converter) or PWM (Pulse Width Modulator).

USING THE RSC-164i

Creating applications using the RSC-164i requires the development of electronic circuitry, software code, and speech/music data files. Software code for the RSC-164i can be developed by Sensory or by external programmers using the RSC Development Kit. For more information about development tools and services, please contact Sensory. A typical product will require about $0.80-$1.50 (in high volume) of components in addition to the RSC-164i.

RSC-164i INSTRUCTION SET

The instruction set for the RSC-164i has 52 instructions comprising 8 move, 7 rotate, 11 branch, 11 register arithmetic, 9 immediate arithmetic, and 6 miscellaneous instructions. All instructions are 3 bytes or fewer, and no instruction requires more than 8 clock cycles to execute.

Oscillator #1: 14.32 MHz (3.5V-5.0V)
Oscillator #2: Pins XI2 and XO2, 32768 Hz (3.5V-5.0V)

Oscillator #1 works with an external crystal, a ceramic resonator or LC. Use of Oscillator #2 requires a crystal for precision timing.

CLOCK

The RSC-164i uses a fully static core: the processor can be stopped (by removing the clock source) and restarted without causing a reset or losing the contents of internal registers. Static operation is functional from DC to 14.32 MHz. Typically the processor clock runs from a 14.32 MHz crystal with no divisor and one wait state. This creates internal RAM cycles of 70 nsec duration and internal ROM cycles of 140 nsec duration.

TIMERS/COUNTERS
The two independent oscillators of the RSC-164i provide counts to two internal timers. Each of the two timers consists of an 8-bit reload value register and an 8-bit upcounter. The reload register is readable and writeable by the processor.

GENERAL PURPOSE I/O

The RSC-164i has 12 general purpose I/O pins (P0.0-P0.7, P1.0-P1.3). Each pin can be programmed as an input with weak pull-up (~400kΩ equivalent device), an input with strong pull-up (~10kΩ equivalent device), an input without pull-up, or an output. This is accomplished through 24 bits of configuration registers for the I/O pins (Port Control Register A and Port Control Register B for ports 0 and 1).

INTERRUPTS

The RSC-164i allows for five interrupt sources, as selected by software. Each has its own mask bit and request bit in the IMR and IRQ registers respectively. The following events can generate interrupts:
• Positive edge on Port 0, bit 0
• Overflow of Timer 1
• Overflow of Timer 2
• Sensory reserved functions
• Completion of PWM sample period

EXTERNAL MEMORY

The RSC-164i can connect serially through two I/O lines to a serial EEPROM for applications with low data storage and low access speed requirements. Speaker-dependent recognition and speaker verification can therefore be performed on the RSC-164i with such external memory.

ANALOG OUTPUT

The RSC-164i offers two separate options for analog output. The DAC (Digital to Analog Converter) output provides a general purpose 10-bit analog output that may be used for speech output (with the inclusion of an audio amplifier), or for other purposes requiring an analog waveform. For speech applications that require driving a small speaker, the PWM (Pulse-Width Modulator) output can be used instead of the DAC output. The PWM output can directly drive a 32 ohm speaker.

OSCILLATORS

Two independent oscillators in the RSC-164i provide a high-frequency clock and a 32kHz time-keeping clock.
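As a rough illustration of how the two oscillators relate to the 8-bit timers described earlier, the sketch below computes timer overflow periods. It assumes that each up-counter is incremented once per oscillator cycle with no prescaler, starts from its reload value, and overflows on the count after 255. The data sheet does not spell out these details, and all names are illustrative.

```python
# Timer overflow arithmetic under the ASSUMPTIONS stated above:
# one count per oscillator cycle, no prescaler, overflow after 255.

OSC1_HZ = 14_320_000  # high-frequency oscillator (14.32 MHz)
OSC2_HZ = 32_768      # time-keeping oscillator (32,768 Hz)


def counts_per_overflow(reload: int) -> int:
    """Counts from the reload value up to and including the overflow."""
    if not 0 <= reload <= 0xFF:
        raise ValueError("reload must fit in 8 bits")
    return 256 - reload


def overflow_period_s(osc_hz: int, reload: int) -> float:
    """Time between successive timer overflows, in seconds."""
    return counts_per_overflow(reload) / osc_hz


def reload_for_period(osc_hz: int, period_s: float) -> int:
    """Reload value whose overflow period is closest to period_s."""
    counts = round(period_s * osc_hz)
    if not 1 <= counts <= 256:
        raise ValueError("period not reachable with an 8-bit counter")
    return 256 - counts


if __name__ == "__main__":
    print(overflow_period_s(OSC2_HZ, 0))        # longest period from the 32,768 Hz source
    print(overflow_period_s(OSC1_HZ, 0))        # longest period from the 14.32 MHz source
    print(reload_for_period(OSC2_HZ, 1 / 256))  # reload for a 1/256 s tick
```

Under these assumptions a reload value of 0 on the 32,768 Hz oscillator overflows 128 times per second, a convenient base for a software time-of-day counter.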
The oscillator characteristics are as follows:

Oscillator #1: Pins XI1, XO1

ABSOLUTE MAXIMUM RATINGS

Any pin to GND:              -0.1V to +7.5V
Operating temperature (TO):  0°C to +70°C
Soldering temperature:       260°C for 10 sec
Power dissipation:           TBD
Operating Conditions:        0°C to +70°C; VDD=3.5-5.0V; VSS=0V

WARNING: Stressing the RSC-164i beyond the “Absolute Maximum Ratings” may cause permanent damage. These are stress ratings only. Operation beyond the “Operating Conditions” is not recommended and extended exposure beyond the “Operating Conditions” may affect device reliability.

THE INTERACTIVE SPEECH™ PRODUCT LINE

The Interactive Speech line of ICs was developed to “bring life to products” through advanced speech and audio technology. These chips allow products to think, talk, hear and play music. The Interactive Speech chips were designed for consumer telephony products and cost-sensitive consumer electronic applications such as home electronics, personal security, and personal communication. The product line includes general purpose microcontrollers (RSC-164, RSC-164i, RSC-132i) and application specific standard speech ASSPs (Voice Password™ and Voice Direct™).

RSC-164
The RSC-164 is a low-cost 8-bit microcontroller designed for use in consumer electronics. It is a fully integrated microcontroller and includes A/D, D/A, ROM, and RAM circuitry on chip. The RSC-164 can perform a full range of speech/audio functions including speech recognition, speaker verification, speech and music synthesis, and voice record/playback.

RSC-132i
The RSC-132i, the lowest cost member of the RSC series, is a speech enabled microcontroller designed specifically for the toy industry. The RSC-132i can support various combinations of speech technologies. The RSC-132i combines an 8-bit microcontroller with high quality speaker-independent and speaker-dependent speech recognition, speech synthesis, and speaker verification.
This chip has 32 kBytes of ROM and limited I/O pins.

Voice Password™ ASSP
The Voice Password ASSP provides consumer products with low cost biometric security. The chip lets products “lock out” unauthorized access by verifying key words and/or voices. Using text dependent speaker verification technology, Voice Password can secure products at a variety of security thresholds for many different applications.

Voice Direct™ ASSP
The Voice Direct ASSP provides cost-sensitive products with speaker-dependent speech recognition, speech synthesis and DTMF tone generation. This easy-to-use, pin-configurable chip requires no custom programming and can recognize up to 60 trained words. The Voice Direct ASSP is ideal for consumer telephony products which feature voice dialing.

IMPORTANT NOTICES

Sensory reserves the right to make changes to or to discontinue any product or service identified in this publication at any time without notice in order to improve design and supply the best possible product. Sensory does not assume responsibility for use of any circuitry other than circuitry entirely embodied in a Sensory product. Information contained herein is provided gratuitously and without liability to any user. Reasonable efforts have been made to verify the accuracy of this information but no guarantee whatsoever is given as to the accuracy or as to its applicability to particular uses. Applications described in this data sheet are for illustrative purposes only, and Sensory makes no warranties or representations that the RSC series of products will be suitable for such applications. In every instance, it must be the responsibility of the user to determine the suitability of the products for each application. Sensory products are not authorized for use as critical components in life support devices or systems.
Sensory conveys no license or title, either expressed or implied, under any patent, copyright, or mask work right to the RSC series of products, and Sensory makes no warranties or representations that the RSC series of products are free from patent, copyright, or mask work right infringement, unless otherwise specified. Nothing contained herein shall be construed as a recommendation to use any product in violation of existing patents or other rights of third parties. The sale of any Sensory product is subject to all Sensory Terms and Conditions of Sales and Sales Policies.

521 East Weddell Drive
Sunnyvale, CA 94089
TEL: (408) 744-9000 FAX: (408) 744-1299

© 1996 SENSORY, INC. ALL RIGHTS RESERVED
P/N 80-0024-3

Sensory is registered by the U.S. Patent and Trademark Office. All other trademarks or registered trademarks are the property of their respective owners.