A Programming Tool for Cirrus Logic 32-Bit Audio DSPs

Cirrus Logic 32-bit DSP Assembly Programmer’s Guide

Preliminary Product Information
This document contains information for a new product. Cirrus Logic reserves the right to modify this product without notice.

Copyright 2013 Cirrus Logic, Inc.
http://www.cirrus.com
SEP 2013
DS795UM11

Contacting Cirrus Logic Support
For all product questions and inquiries, contact a Cirrus Logic Sales Representative. To find the one nearest you, go to www.cirrus.com.

IMPORTANT NOTICE
“Preliminary” product information describes products that are in production, but for which full characterization data is not yet available. Cirrus Logic, Inc. and its subsidiaries (“Cirrus”) believe that the information contained in this document is accurate and reliable. However, the information is subject to change without notice and is provided “AS IS” without warranty of any kind (express or implied). Customers are advised to obtain the latest version of relevant information to verify, before placing orders, that information being relied on is current and complete. All products are sold subject to the terms and conditions of sale supplied at the time of order acknowledgment, including those pertaining to warranty, indemnification, and limitation of liability. No responsibility is assumed by Cirrus for the use of this information, including use of this information as the basis for manufacture or sale of any items, or for infringement of patents or other rights of third parties. This document is the property of Cirrus and by furnishing this information, Cirrus grants no license, express or implied under any patents, mask work rights, copyrights, trademarks, trade secrets or other intellectual property rights.
Cirrus owns the copyrights associated with the information contained herein and gives consent for copies to be made of the information only for use within your organization with respect to Cirrus integrated circuits or other products of Cirrus. This consent does not extend to other copying such as copying for general distribution, advertising or promotional purposes, or for creating any work for resale.

CERTAIN APPLICATIONS USING SEMICONDUCTOR PRODUCTS MAY INVOLVE POTENTIAL RISKS OF DEATH, PERSONAL INJURY, OR SEVERE PROPERTY OR ENVIRONMENTAL DAMAGE (“CRITICAL APPLICATIONS”). CIRRUS PRODUCTS ARE NOT DESIGNED, AUTHORIZED OR WARRANTED FOR USE IN AIRCRAFT SYSTEMS, MILITARY APPLICATIONS, PRODUCTS SURGICALLY IMPLANTED INTO THE BODY, LIFE SUPPORT PRODUCTS OR OTHER CRITICAL APPLICATIONS (INCLUDING MEDICAL DEVICES, AIRCRAFT SYSTEMS OR COMPONENTS AND PERSONAL OR AUTOMOTIVE SAFETY OR SECURITY DEVICES). INCLUSION OF CIRRUS PRODUCTS IN SUCH APPLICATIONS IS UNDERSTOOD TO BE FULLY AT THE CUSTOMER'S RISK AND CIRRUS DISCLAIMS AND MAKES NO WARRANTY, EXPRESS, STATUTORY OR IMPLIED, INCLUDING THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR PARTICULAR PURPOSE, WITH REGARD TO ANY CIRRUS PRODUCT THAT IS USED IN SUCH A MANNER. IF THE CUSTOMER OR CUSTOMER'S CUSTOMER USES OR PERMITS THE USE OF CIRRUS PRODUCTS IN CRITICAL APPLICATIONS, CUSTOMER AGREES, BY SUCH USE, TO FULLY INDEMNIFY CIRRUS, ITS OFFICERS, DIRECTORS, EMPLOYEES, DISTRIBUTORS AND OTHER AGENTS FROM ANY AND ALL LIABILITY, INCLUDING ATTORNEYS' FEES AND COSTS, THAT MAY RESULT FROM OR ARISE IN CONNECTION WITH THESE USES.

Cirrus Logic, Cirrus, and the Cirrus Logic logo designs are trademarks of Cirrus Logic, Inc. All other brand and product names in this document may be trademarks or service marks of their respective owners. Microsoft and Windows are registered trademarks of Microsoft Corporation. Microwire is a trademark of National Semiconductor Corp.
National Semiconductor is a registered trademark of National Semiconductor Corp. Texas Instruments is a registered trademark of Texas Instruments, Inc. Motorola is a registered trademark of Motorola, Inc. LINUX is a registered trademark of Linus Torvalds.

Contents

Chapter 1. Cirrus Logic Assembly Program (CASM)
  1.1 Welcome to CASM
  1.2 Accessing CASM Through the CLIDE GUI
  1.3 Accessing CASM Through the Assembler Command Line
    1.3.1 Command Line Format
    1.3.2 Command Line Options
    1.3.3 Command Line Examples
  1.4 Assembly Language Format
    1.4.1 Code Line Format
    1.4.2 Comment Character
    1.4.3 Case Sensitivity
    1.4.4 Symbol Definition
    1.4.5 Local Symbol Definition and Use
    1.4.6 Expressions
      1.4.6.1 Floating-point Expressions
      1.4.6.2 Address Expressions
    1.4.7 Constants
      1.4.7.1 Floating Point Literals
      1.4.7.2 Integer Literals
      1.4.7.3 String Literals
    1.4.8 Unary Operators
    1.4.9 Binary Operators
      1.4.9.1 Precedence of Operators
    1.4.10 Expression Examples
    1.4.11 Built-in Functions
    1.4.12 Mathematical Functions
    1.4.13 Conversion Functions
    1.4.14 String Functions
    1.4.15 Assembler Directives
      1.4.15.1 Code Modularity
      1.4.15.2 Memory Segments
      1.4.15.3 Symbol Assignment
      1.4.15.4 Data Memory Assignment
      1.4.15.5 Conditional Assembly
      1.4.15.6 Token Substitution
      1.4.15.7 Listing and Message Control
      1.4.15.8 Assembler Warning/Error Control
      1.4.15.9 Define .struct Type
      1.4.15.10 Sizeof Function
      1.4.15.11 Assert Directive
    1.4.16 Macro Definition and Calling
    1.4.17 Macro Replication
    1.4.18 Assembly Language Example

Chapter 2. 32-Bit DSP Internal Architecture and Programming Model
  2.1 Overview
  2.2 Data Path and Accumulators Unit
    2.2.1 Data Representation
    2.2.2 Accumulator Data Transfers
      2.2.2.1 Move to Accumulator
      2.2.2.2 Moving from Accumulator
      2.2.2.3 Saturation Examples
      2.2.2.4 Rounding Examples
      2.2.2.5 Shifting Examples
  2.3 Parallel Address Generation Unit
    2.3.1 Addressing Modes
      2.3.1.1 Modulo Addressing
      2.3.1.2 Reverse Binary Addressing
      2.3.1.3 Immediate Addressing
      2.3.1.4 Indexed Addressing
  2.4 Program Control Unit
    2.4.1 Program Counter
    2.4.2 Subroutine Stack
    2.4.3 Loop Stack
    2.4.4 Subroutine Stack and Loop Stack Common Implementations
    2.4.5 jsr_mode Register
    2.4.6 lst_mode Register
    2.4.7 stq_base Register
    2.4.8 mr_jsr_ptr Register
    2.4.9 jsr_data Register
    2.4.10 mr_lst_ptr Register
    2.4.11 lp_data1 Register
    2.4.12 lp_data2 Register
    2.4.13 lst_data1 Register
    2.4.14 lst_data2 Register
    2.4.15 jsr_ovf Register
    2.4.16 jsr_unf Register
    2.4.17 lst_ovf Register
    2.4.18 lst_unf Register
    2.4.19 Mode Register
    2.4.20 Condition Code Register
    2.4.21 Loop Stack Example
  2.5 Master State Registers (MSREGS)
    2.5.1 Search Registers
    2.5.2 Random Number Generator
  2.6 Interrupt Controller
    2.6.1 Fast Interrupts
    2.6.2 Long Interrupts
    2.6.3 Masking
      2.6.3.1 IMask
      2.6.3.2 IRMask
  2.7 Instruction Restrictions
    2.7.1 Code Example, Broken Code
    2.7.2 Code Example, Fixed Code
  2.8 LogExp

Chapter 3. Full Word Instructions
  3.1 Assembly Language Syntax
  3.2 Conventions
  3.3 Execution Control Instructions
    3.3.1 do - Start Hardware Loop
    3.3.2 enddo - End Current Do-Loop
    3.3.3 do_patch - Jump to Patch
    3.3.4 jmp - Jump
    3.3.5 if - Jump Conditionally
    3.3.6 call - Jump To Subroutine
    3.3.7 callint - Answer Interrupt
    3.3.8 callint_stq - Answer Stack Interrupt
    3.3.9 ret - Return From Subroutine
    3.3.10 retint - Return From Interrupt
    3.3.11 retint_stq - Return From Stack Interrupt
    3.3.12 inten - Enable Interrupts
    3.3.13 intdis - Disable Interrupts
    3.3.14 halt - Stop Further Execution
    3.3.15 nop - No Operation
    3.3.16 _breakpt - Breakpoint Instruction
  3.4 64-bit Peripheral Moves
    3.4.1 XY Register Pair = ext(16-bit Address)
    3.4.2 Accum = ext(16-bit Address)
    3.4.3 ext(16-bit Address) = XY Register Pair
    3.4.4 ext(16-bit Address) = Accum
    3.4.5 logexp = XY Register Pair
    3.4.6 XY Register Pair = logexp
  3.5 Memory Moves - Direct
    3.5.1 Any Reg = xmem[16-bit Address]
    3.5.2 xmem[16-bit Address] = Any Reg
    3.5.3 Any Reg = ymem[16-bit Address]
    3.5.4 ymem[16-bit Address] = Any Reg
    3.5.5 Any Reg = pmem[16-bit Address]
    3.5.6 pmem[16-bit Address] = Any Reg
    3.5.7 Any Reg = inp[16-bit Address]
    3.5.8 outp[16-bit Address] = Any Reg
    3.5.9 Any Reg = xmem[Index Register]
    3.5.10 xmem[Index Register] = Any Reg
    3.5.11 Any Reg = ymem[Index Register]
    3.5.12 ymem[Index Register] = Any Reg
    3.5.13 Any Reg = pmem[Index Register]
    3.5.14 pmem[Index Register] = Any Reg
    3.5.15 outp[Index Register] = Any Reg
    3.5.16 Any Reg = inp[Index Register]
  3.6 Immediate Register Moves
    3.6.1 fixed16(Destination) = (16-bit Data)
    3.6.2 ufixed16(Destination) = (16-bit Data)
    3.6.3 uhalfword(Destination) = (16-bit Data)
    3.6.4 Index Register = (16-bit Data)
    3.6.5 NM Register = (16-bit Data)
    3.6.6 Guard Register = (8-bit Data)
    3.6.7 halfword(Destination) = (16-bit Data)
    3.6.8 lo16(Destination) = (16-bit Data)
    3.6.9 MS Reg = (16-bit Data)
    3.6.10 AnyReg(Any Reg, Any Reg)
    3.6.11 Any Reg = MS Reg
    3.6.12 MS Reg = Any Reg
    3.6.13 AnyReg (Any Reg, Any Reg), (Any Reg, Any Reg)
    3.6.14 Accum = long(Accum)
    3.6.15 In = Im/(0) ± (16-bit Data)
  3.7 Bit Manipulation Instructions
    3.7.1 Bit Test
    3.7.2 Bit Set
    3.7.3 Bit Clear
    3.7.4 Bit Change

Chapter 4. Multifunction Moves
  4.1 Single Multifunction Moves
    4.1.1 DP Reg = xmem[Index Register]
          DP Reg = xmem[6-bit Address]
    4.1.2 xmem[Index Register] = DP Reg
          xmem[6-bit address] = DP Reg
    4.1.3 DP Reg = ymem[Index Register]
          DP Reg = ymem[6-bit address]
    4.1.4 ymem[Index Register] = DP Reg
          ymem[6-bit address] = DP Reg
    4.1.5 Data Path Register to or from Any Register
      4.1.5.1 DP Reg = Any Reg
      4.1.5.2 Any Reg = DP Reg
  4.2 Parallel Multifunction Move Instructions
    4.2.1 Xn = xmem[Index Register]
    4.2.2 xmem[Index Register] = An
    4.2.3 Ym = ymem[Index Register]
    4.2.4 ymem[Index Register] = Bm
  4.3 Data Path Register to Data Path Register Instructions
    4.3.1 DP Reg = DP Reg
  4.4 Parallel Register to/from Register Instructions
    4.4.1 Data Path Register to Data Path Register and Data Path Register to/from X or Y Memory Restrictions
  4.5 64-bit Multifunction Moves
    4.5.1 Data Path Register Pair to or from XY Memory
      4.5.1.1 Data Path Register Pair = xymem[Index Register]
              Data Path Register Pair = xymem[6-bit Address]
      4.5.1.2 xymem[Index Register] = Data Path Register Pair
              xymem[6-bit Address] = Data Path Register Pair
    4.5.2 Accumulator to or from XY Memory
      4.5.2.1 Accum = xymem[Index Register]
              Accum = xymem[6-bit Address]
      4.5.2.2 xymem[Index Register] = Accum
              xymem[6-bit Address] = Accum
  4.6 Index Register Updates
    4.6.1 In = Im ± (6-bit Data)
    4.6.2 In ±= 1/2/N

Chapter 5. Multifunction Operations
  5.1 Multifunction Arithmetic Instructions
    5.1.1 Parallel Multiply/Multiply-Accumulate I
    5.1.2 Parallel Multiply/Multiply-Accumulate II
    5.1.3 Real Multiply/Multiply-Accumulate
    5.1.4 Parallel Squares
    5.1.5 Parallel Multiply with Add
    5.1.6 Multiply by One with Optional Accumulate
    5.1.7 Parallel Multiply by One with Optional Accumulate
  5.2 Multifunction Accumulator Instructions
    5.2.1 Parallel Add with Shift
    5.2.2 Add with Shift
    5.2.3 Conditional Operation - Maximum
    5.2.4 Conditional Operation - Minimum
    5.2.5 Conditional Operation - Absolute Value Maximum
    5.2.6 Conditional Operation - Absolute Value Minimum
    5.2.7 Bitwise Accumulator Move
    5.2.8 Parallel Bitwise Accumulator Move
    5.2.9 Bitwise Complement
    5.2.10 Parallel Bitwise Complement
    5.2.11 Negative Accumulator Move
    5.2.12 Parallel Negative Accumulator Move
    5.2.13 Absolute Value Accumulator Move
    5.2.14 Parallel Absolute Value Accumulator Move
    5.2.15 Bitwise OR
    5.2.16 Parallel Bitwise OR
    5.2.17 Bitwise Exclusive OR
    5.2.18 Parallel Bitwise Exclusive OR
    5.2.19 Bitwise AND
    5.2.20 Parallel Bitwise AND
    5.2.21 Bitwise Zero
    5.2.22 Parallel Bitwise Zero
    5.2.23 Bitwise Shift Left by One
    5.2.24 Parallel Bitwise Shift Left by One
    5.2.25 Bitwise Shift Left by Four
    5.2.26 Parallel Bitwise Shift Left by Four
    5.2.27 Bitwise Shift Left by Eight
    5.2.28 Parallel Bitwise Shift Left by Eight
    5.2.29 Bitwise Shift Right by One
    5.2.30 Parallel Bitwise Shift Right by One
    5.2.31 Bitwise Test
    5.2.32 Parallel Bitwise Test
    5.2.33 Bitwise Compare
    5.2.34 Parallel Bitwise Compare
    5.2.35 Bitwise Absolute Value Compare
    5.2.36 Parallel Bitwise Absolute Value Compare

Chapter A. Glossary

Chapter B. List of Instructions by Category and Flag Reference
  Table B-1. Revision History

Figures

Figure 2-1. Cirrus Logic 32-Bit Architecture
Figure 2-2. Data Flow within Data Path and Accumulators Unit
Figure 2-3. Data Path Registers
Figure 2-4. 32-bit Fractional Representation
Figure 2-5. 64-bit Fractional Representation
Figure 2-6. 72-bit Fractional Representation
Figure 2-7. Integer vs. Fractional Multiplication
Figure 2-8. Positive 32-bit Value
Figure 2-9. Negative 32-bit Value
Figure 2-10. Positive Saturation: x0=a0
Figure 2-11. Rounding Example: Negative Saturation: x0=a0
Figure 2-12. No Saturation: x0=a0
Figure 2-13. Data Flow for the Parallel Address Generation Unit
Figure 2-14. Execute Phase vs. Decode Phase Assignments
Figure 2-15. Loop Stack Overflow Example
Figure 2-16. Loop Stack Underflow Example
Figure 3-1. 
Assembler Example: 32-bit Instruction Word ..............................................................................3-1 Tables Table 1-1 Command Line Options ................................................................................................................1-3 Table 1-2 Unary Operators............................................................................................................................1-9 Table 1-3 Binary Operators ...........................................................................................................................1-9 Table 1-4 Precedence of Operators ............................................................................................................1-10 Table 1-5 Expression Examples..................................................................................................................1-10 Table 1-6 Built-in Functions.........................................................................................................................1-10 Table 1-7 Mathematical Functions ..............................................................................................................1-11 Table 1-8 Conversion Functions .................................................................................................................1-13 Table 1-9 String Functions ..........................................................................................................................1-14 Table 1-10 Macros ......................................................................................................................................1-15 Table 1-11 Symbol Assignment ..................................................................................................................1-16 Table 1-12 Data Memory Assignment.........................................................................................................1-16 Table 1-13 Conditional Assembly 
Directives...............................................................................................1-18 Table 1-14 Listing Control Switches............................................................................................................1-20 Table 1-15 Special Characters Used in Macros..........................................................................................1-26 Table 2-1. Result of x0=a0 for a Given Rounding Mode (Shifting Off) ..........................................................2-9 Table 2-2. Result of x0=a0 for a Given Shifting Mode with Rounding Set to Truncate (off)........................2-10 viii Copyright 2013 Cirrus Logic, Inc. DS795UM11 32-bit DSP Assembly Programmer’s Guide Table 2-3. Result of x0=a0 for a Given Shifting Mode with Rounding Set to Add ½ then Truncate............2-10 Table 2-4. Result of x0=a0 for a Given Shifting Mode with Rounding Set to Round to Zero ......................2-10 Table 2-5. Index Registers ..........................................................................................................................2-12 Table 2-6. Increment-Modulo Registers ......................................................................................................2-12 Table 2-7. Addressing Modes, Defined by the NM Registers .....................................................................2-13 Table 2-8. jsr_mode Bit Definitions .............................................................................................................2-19 Table 2-9. lst_mode Bit Definitions..............................................................................................................2-20 Table 2-10. stq_base Bit Definitions............................................................................................................2-21 Table 2-11. mr_jsr_ptr Bit Definitions ..........................................................................................................2-21 Table 2-12. 
jsr_data Bit Definitions .............................................................................................................2-21 Table 2-14. lp_data1 Bit Definitions ............................................................................................................2-22 Table 2-15. lp_data2 Bit Definitions ............................................................................................................2-22 Table 2-13. mr_lst_ptr Bit Definitions ..........................................................................................................2-22 Table 2-16. lst_data1 Bit Definitions............................................................................................................2-23 Table 2-17. lst_data2 Bit Definitions............................................................................................................2-23 Table 2-18. jsr_ovf Bit Definitions................................................................................................................2-23 Table 2-19. jsr_unf Bit Definitions ...............................................................................................................2-24 Table 2-20. lst_ovf Bit Definitions................................................................................................................2-24 Table 2-21. lst_unf Bit Definitions................................................................................................................2-24 Table 2-23. Condition Code Register Bit Definitions ...................................................................................2-25 Table 2-22. Mode Register Bit Definitions...................................................................................................2-25 Table 2-24. T1, T0 with Various Accum + Shift Values ...............................................................................2-26 Table 2-25. 
Master State Registers.............................................................................................................2-33 Table 2-26. Writing to the LogExp Peripheral .............................................................................................2-38 Table 2-27. Command Operations ..............................................................................................................2-38 Table 2-28. X Input Mux ..............................................................................................................................2-38 Table 2-29. Y Input Mux ..............................................................................................................................2-38 Table 3-1. Syntax Terms Used in this Manual ..............................................................................................3-2 Table 3-2. 72-bit Accumulators ...................................................................................................................3-33 Table 3-3. 32-bit Data Registers .................................................................................................................3-33 Table A-1. Glossary Terms .......................................................................................................................... A-1 Table B-1. Instruction / Flag Reference Table.............................................................................................. B-1 DS795UM11 Copyright 2013 Cirrus Logic, Inc. ix Cirrus Logic Assembly Program (CASM) 32-bit DSP Assembly Programmer’s Guide Chapter 1 1Cirrus Logic Assembly Program (CASM) 1.1 Welcome to CASM The Cirrus Logic Cross-assembler (CASM) application was originally used by software developers at Cirrus Logic for over 10 years to implement custom DSP audio algorithms on the Cirrus Logic 32-bit DSP core-based platforms such as the CS4953xx, CS4970x4, CS485xx, CS470xx, and the CS498xx multicore DSPs. 
Cirrus Logic offers CASM as part of the Cirrus Logic Integrated Development Environment (CLIDE), which is available to Cirrus Logic customers for developing their own custom audio algorithms to run on Cirrus Logic DSPs.

Note: The Cirrus Device Manager (CDM) must be running to use the CLIDE tool set. After the SDK for the Cirrus DSP used in the customer's design is launched, CDM should launch automatically. CDM provides communication between CLIDE and the board and simulator.

The CLIDE tool set includes:
• CLIDE's graphical user interface (GUI), which is described in the CLIDE User's Manual and allows the user to access the following applications from CLIDE:
    • CASM: described in this manual.
    • Cirrus Logic C-Compiler: described in the Cirrus Logic C-Compiler User's Manual.
    • CLIDE debugger: described in the CLIDE User's Manual; debugs both Assembly and C language source files, and replaces the Hydra debugger.
    • Cirrus Linker (CLINK): described in the CLINK User's Manual; takes one or more object files as input and creates binary file(s) suitable for loading.
    • Primitive Elements Wizard: described in the CLIDE User's Manual; an XML file wizard used to create custom primitives that are debugged within CLIDE.
    • Simulator: described in the CLIDE User's Manual.
    • Source editor: described in the CLIDE User's Manual.

1.2 Accessing CASM Through the CLIDE GUI

Most users access CASM through the CLIDE GUI, which is described fully in the CLIDE User's Manual, available from the main CLIDE window in Help > Help Contents. To access CLIDE, follow these steps:

1. Install the SDK for the Cirrus DSP used in your system design.
2. From the Windows Start menu, select Cirrus Logic DSP > Programming Tools > CLIDE.
3. After CLIDE has opened, select File > New Project.
4. Select one of the wizard-based templates and develop your DSP software project by following the instructions contained in the CLIDE User's Manual.

Users can also use CLIDE's on-line Help system for assistance. CLIDE has an Assembler tab in Project properties that can be used to set some of the options when assembling within CLIDE.

1.3 Accessing CASM Through the Assembler Command Line

Users can also access the assembler software by launching the casm.exe application. The casm.exe application can be run from a console window opened from within the Cirrus Device Manager (CDM) application, from a batch file, or from a "make" utility. Open the Cirrus Device Manager by clicking the CDM icon in the system tray, and then open a console window by selecting the menu option File > Open build console. The command line specifies the source file and control directives for processing the file.

1.3.1 Command Line Format

The format of the assembler command line is as follows:

    casm <source file> <options>

Note: These can be in any order.

<source file> is a single valid file path representing the file containing assembler code. The <source file> parameter must be specified. Otherwise, CASM exits with an error message. If <source file> has no extension, CASM appends the default extension '.A' to <source file> prior to searching for the file. If the assembler does not find <source file>, it exits with an error message.

If the assembler finds <source file> and the environment variable CASMSPEC is included in the command, the environment string of the CASMSPEC variable is used as the default assembler options. The format of the options defined in CASMSPEC must follow the format described in Section 1.3.2.

1.3.2 Command Line Options

<options> contains zero or more control directives to the assembler.
Each control directive starts with the character '-' (dash or minus) and ends with unquoted white space. A single dash '-' is used for single-character options and a double dash '--' is used for multiple-character options. The text after the initial dash indicates the specific control directive for the assembler to employ. The formats of the various options are listed in Table 1-1.

Command lines are case-insensitive: all alphabetic characters are transformed to upper case prior to passing the options to the assembler. See Section 1.4.3 for details.

Table 1-1 Command Line Options

Option
    Description

-a<file_name>
    Macro preprocessed source code output.

-c
    Enable case sensitivity; CASM is case insensitive by default.

--casmspec
    CASMSPEC is an environment variable that the user can set to a default value for CASM to use when given the --casmspec option. For example, to set up CASMSPEC, the user could select the following options:
    • The user should begin with the CASM requirement that -i<macro include file> be included in every invocation.
    • The user might want CASM to be case sensitive with the -c directive.
    • The user might want a listing file to always be generated with the -l directive.
    To accomplish this, set the CASMSPEC environment variable in this manner:
        CASMSPEC=-iC:\CirrusDSP\bin\athena.h -c -l
    After CASMSPEC is configured, using the --casmspec option ensures that the preset options are used whenever CASM is called.
    Note: Environment strings do not allow the '=' character, so ':' must be used for definitions in the environment variable CASMSPEC. See Section 1.4.4 for more information on defining and referencing symbols in the Cirrus Logic 32-bit DSP assembler.

--cdl
    Emit dependency file .adf

--help
    Display all valid command line options and switches.

-d<symbol>[:<value>]
    The -D or -d directive instructs the assembler to define a symbol with label <symbol> prior to assembling the source code. A symbol defined in this manner can be referenced in the assembler source as if it were defined in the source. A replacement string <value> can optionally be associated with the symbol by using either the '=' or ':' characters.

--debug
    The --debug directive instructs the assembler to add symbol debugging information used by CLIDE to the object file and to create line-level debug information for the CLIDE debugger.

-e
    The -e directive instructs the assembler to produce error output with alternative formatting. Example:
    With the -e directive: "<sourcefile> (<line #>) <macro line #>:Error"
    Without the -e directive: "Error in <sourcefile>:<line #>.<macro line #>"

-f<file name>
    Used for compatibility with the old CLINK.

-I<include folder>
    Set include folder. If you include a header file in your source file such as .include "i.h", then adding -Ic:\example\inc\ tells CASM to search for i.h in the c:\example\inc folder.

-i<macro include file>
    Platform dependent macros.

-l<lst file>
    The -l directive instructs the assembler to create a listing file. If -l<lst file> is given in the option, the listing file is given this path; otherwise the source code root path is used with the default extension .LST. Similarly, if <lst file> has no extension, the default extension is appended to the path. If a file at the listing file path exists prior to the assembler run, the old file contents are lost.

-o<object file>
    The -o directive specifies the location of the object file to the assembler. If this option is not specified in the command line, the assembler makes an object file using the root of the source file path and the default extension .O. Similarly, if <object file> has no extension, the default extension is appended to the path. If a file at the object file path exists prior to the assembler run, the old file contents are lost. The object file is used exclusively by the CLINK linker.

-s
    Add local symbols in the object file.

1.3.3 Command Line Examples

Example 1:

    casm maketab.a -l --debug -dTABSIZE=128 -i%tools%\CS498xx\common\inc\base.h

This example does the following:
• Assembles the file maketab.a in the working directory.
• Employs the definition file base.h that is appropriate for the CS498xx DSP.
• Defines the symbol TABSIZE in the assembler and sets the symbol to 128.
• Makes the listing file maketab.lst.
• Includes debug information in the output file.
• Implicitly makes the object file maketab.o.

Example 2:

    casm c:\mycode\hiworld -ibase.h -dBYTEALIGN -ogreet.obj

This example does the following:
• Assembles the file c:\mycode\hiworld.a.
• Employs the definition file base.h.
• Defines the symbol BYTEALIGN in the assembler with no associated value.
• Makes the object file greet.obj in the working directory.

1.4 Assembly Language Format

An assembly language file is a text file parsed as a series of lines. Each line is terminated with a new line character. Each line contains at most one instruction or assembler directive. A line may contain only white space characters or consist entirely of comments.

1.4.1 Code Line Format

A line containing an instruction or assembler directive must adhere to the following format:

    <symbol or white space> <instruction or directive>

The first character of a code line is significant. If the initial character of <symbol or white space> is not white space, a symbol is defined and its value is set to either the current value of the address counter or the value of the assignment assembler directive. A symbol definition ends with either a white space character or the special character ':'. See Section 1.4.4 for more on symbol definitions.
If the initial character of <symbol or white space> is white space, no symbol is defined for the current address.

<instruction or directive> follows the valid syntax of an operation or assembler directive. Executable operations are covered in Chapter 2, Chapter 3, Chapter 4, and Chapter 5 of this manual. Assembler directives are covered in detail in Section 1.4.16 of this chapter.

1.4.2 Comment Character

The comment character in this assembler is '#'. The assembler ignores the text from the comment character to the end of the line. The comment string appears in the listing file.

1.4.3 Case Sensitivity

The assembler is not case sensitive unless the -c directive is used to enable case sensitivity. The single exception to this rule is text between string delimiters, which always keeps its case. Case sensitivity should be used whenever C-language source files are involved in a project. Even if there is only a single C-language file in a project, it is a good idea to enable case sensitivity for every assembler file, too. See Section 1.4.7 for more information on string definition.

1.4.4 Symbol Definition

A valid assembler symbol must start with either an alphabetic character or the special character '_'. Each symbol character after the first is either alphabetic (['a'...'z'] or ['A'...'Z']), numeric (['0'...'9']), or '_'.

Placing the symbol string at the beginning of a line registers the string in the assembler symbol table and associates the symbol with either the address counter value or the value of the assignment assembler directive on the code line. Placing the symbol string anywhere else in a line causes the assembler to search for the symbol in its symbol table. If the assembler finds a match in the table, the value of the symbol is evaluated in the context of the code line. If the assembler does not find a match, an assembler error is generated.

There are a few assembler keywords that cannot be used as symbols.
All register names and memory reference labels are reserved (such as a0 and xmem).

1.4.5 Local Symbol Definition and Use

Symbols beginning with '%', '>', or '<' characters are called local symbols. These are used in situations where the value of such a symbol is valid only for a brief period. Local symbols are not recorded in the assembler symbol table. These symbols are not visible to the linker or debugger.

To reference a defined local symbol, the local symbol name must be prefixed with either '<' or '>', depending on where the local symbol is defined. A local symbol reference starting with the special character '>' tells the assembler that the value of the local symbol will be defined later on in the code (a forward reference). A local symbol reference starting with the special character '<' tells the assembler that the value of the local symbol has already been defined (a backward reference).

The following code example employs local symbols to implement a C-like while loop without having to define any permanent symbols.

    # while (clause) {
    #     body
    # }
    %whiletop:
        # calculate value of clause (zero or non-zero)
        if (a==0) jmp>whileend
        #loop body
        jmp<whiletop
    %whileend:

If there are multiple (non-nested) instances of this while structure in the same assembler file, the local symbols can be reused without causing a symbol redefinition error.

References to local symbols are different inside of macro bodies. See Section 1.4.17 for more information on local symbols in macro definitions.

1.4.6 Expressions

Quantities to be evaluated at assembly time, such as addresses, conditionals, and numerical values, are cast as expressions. Expressions in the assembler are composed using infix notation, with binary operators between their operands. Therefore, a hierarchy of operator precedence is necessary to establish an order of operator evaluation.
Parentheses '(' and ')' can be used to force a particular order of evaluation. See Table 1-5 for examples.

Externally defined symbols (.extern) can be used in expressions.

    .extern FOO
    .xdata
    BAR .dw FOO * 5

1.4.6.1 Floating-point Expressions

In an expression, any operation involving a floating point operand is promoted to float type. Use of a floating point value in an address expression or a context requiring an integer expression is disallowed.

Assembly time intermediate floating point computations are performed in double precision float format or higher. Float values are implicitly emitted as 32-bit fixed point. An error is posted if the final value of a floating point expression is not in the [-1.0, 1.0] range.

1.4.6.2 Address Expressions

Any expression involving a relocatable symbol, such as a label, is termed an address expression. The value of an address expression must be less than 2^16. Address expressions are limited in a fashion similar to C pointer arithmetic. For example:

• Legal address expressions:
    <address> + <integer>
    <address> - <integer>
    <address> - <address>
• Illegal address expressions:
    <address> + <address>
    <address> <any-op> <float>

1.4.7 Constants

There are three types of literal values in the assembler: floating point, integer, and string.

1.4.7.1 Floating Point Literals

A floating point literal is expressed as:

    <mantissa>['E'|'e'<exponent>]

where

    <mantissa> ::= <digits>['.'<digits>]
    <exponent> ::= [[sign]<digits>]

Examples:

    1.0
    0.2e-10
    99e20

1.4.7.2 Integer Literals

Integer literals may be specified in any of the four commonly-used radices: hexadecimal, decimal, octal, and binary. Hexadecimal radix integer literals are composed of the digits 0-9 and the characters A-F or a-f.
Decimal literals may use the digits 0-9. Octal literals may use the digits 0-7. Binary literals may use only the digits 0 and 1.

There are both prefix and postfix methods of radix specification for integer literals. A literal with no radix specification defaults to decimal.

1.4.7.2.1 Prefix Radix Specification

    hexadecimal: 0x or 0X followed by <hex-digits>
    binary:      0b or 0B followed by <binary-digits>

Examples:

    0xffff
    0b010101010
    12345

1.4.7.2.2 Postfix Radix Specification

Hexadecimal, decimal, octal, and binary numbers can be specified with a postfix radix specification character as follows:

    hexadecimal: <hex-digits>('H'|'h'|'X'|'x')
    decimal:     <digits>('D'|'d'|nil)
    octal:       <octal-digits>('Q'|'q'|'O'|'o')
    binary:      <binary-digits>('B'|'b')

Examples:

    0ffffH
    ffffX
    777Q
    101010B

1.4.7.3 String Literals

Strings are consecutive text characters between string delimiter characters, which can be either single quotes or double quotes. The ending delimiter character must match the starting delimiter character. Two quotes in a row within a string are treated as a single quote character to be added to the contents of the string.

1.4.8 Unary Operators

Unary operators apply to only one operand. The target operand is the value or expression to the immediate right of the operator. The unary operator characters for this assembler are as follows.

Table 1-2 Unary Operators

    Operator   Description
    +          Unary plus, operand unchanged
    -          Unary minus, operand arithmetically negated
    ~          Complement, operand logically negated
    !          Not, operand boolean-wise negated (zero for false, non-zero for true)

1.4.9 Binary Operators

Binary operators apply to two operands, the values or expressions to the immediate left and right of the operator. The binary operators for this assembler are as follows.
Table 1-3 Binary Operators

    Operator Type   Operator   Description
    Arithmetic      +          Add
                    -          Subtract
                    *          Multiply
                    /          Divide
    Logical         &          And
                    |          Or
                    ^          Exclusive or
    Comparison      =          Is equal to
                    !=         Is not equal to
                    <>         Is not equal to
                    >          Is greater than
                    >=         Is greater than or equal to
                    <          Is less than
                    <=         Is less than or equal to

Comparison operators evaluate to a boolean (zero for false, -1 for true).

NOTE: String operands that do not evaluate to numbers can only have the comparison operators applied to them.

1.4.9.1 Precedence of Operators

An expression is evaluated by first evaluating any parenthetical sub-expressions encountered. Then all operators are evaluated in the order of precedence, the highest precedence operators performed first, and the lowest precedence operators performed last. The assembler precedence of operators is summarized in Table 1-4.

Table 1-4 Precedence of Operators

    Precedence    Description
    6 (highest)   unary +, unary -, ~, !
    5             *, /
    4             binary +, binary -
    3             =, !=, <>, >, >=, <, <=
    2             &
    1 (lowest)    |, ^

1.4.10 Expression Examples

Table 1-5 shows examples of expressions, the precedence of operators, and what the expressions evaluate to.

Table 1-5 Expression Examples

    Expression: 32+3*(20-4)
    Evaluation: 32+(3*16) → 32+48 → 80

    Expression: 77/6*6+mod(77,6)
    Evaluation: (77/6)*6+mod(77,6) → (12*6)+mod(77,6) → 72+5 → 77

    Expression: strcat("moo",'cow')="moocow"
    Evaluation: "moocow"="moocow" → -1 (true)

    Expression: 14>=0&14<10
    Evaluation: (14>=0)&14<10 → -1&(14<10) → -1&0 → 0 (false)

    Expression: 240&0x3f | 240&0x7f00
    Evaluation: (240&0x3f) | 240&0x7f00 → 48 | (240&0x7f00) → 48 | 0 → 48

    Expression: !0&9-12<>0^-64
    Evaluation: (!0)&9-12<>0^(-64) → -1&(9-12)<>0^-64 → -1&(-3<>0)^-64 → (-1&-1)^-64 → -1^-64 → 63

1.4.11 Built-in Functions

There are a number of built-in functions in the assembler that help the programmer make code configurable. These functions are presented in Table 1-6.
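The precedence levels of Table 1-4 and the worked results of Table 1-5 can be cross-checked with a small evaluator. The following Python sketch is not part of CASM and makes no claim about how the assembler is implemented; it is a model written from the tables alone. It handles integer operands only, and it assumes that '/' truncates toward zero and that mod() matches Python's % operator for positive operands. The names evaluate, tokenize, and LEVELS are hypothetical.

```python
import operator
import re

# One token: hex or binary prefix literal, decimal literal, operator, or name.
TOKEN = re.compile(
    r"\s*(0[xX][0-9a-fA-F]+|0[bB][01]+|\d+|!=|<>|>=|<=|[-+*/&|^=<>~!(),]|[A-Za-z_]\w*)"
)
TRUE, FALSE = -1, 0  # comparison results as stated in Table 1-3


def trunc_div(a, b):
    # Assumption: integer division truncates toward zero.
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def compare(pred):
    return lambda a, b: TRUE if pred(a, b) else FALSE


# Binary operators grouped by Table 1-4 precedence, lowest level first.
LEVELS = [
    {"|": operator.or_, "^": operator.xor},                      # level 1
    {"&": operator.and_},                                        # level 2
    {"=": compare(operator.eq), "!=": compare(operator.ne),
     "<>": compare(operator.ne), ">": compare(operator.gt),
     ">=": compare(operator.ge), "<": compare(operator.lt),
     "<=": compare(operator.le)},                                # level 3
    {"+": operator.add, "-": operator.sub},                      # level 4
    {"*": operator.mul, "/": trunc_div},                         # level 5
]


def tokenize(text):
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character {text[pos]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text):
    tokens, pos = tokenize(text), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def expect(tok):
        if take() != tok:
            raise ValueError(f"expected {tok!r}")

    def binary(level):
        if level == len(LEVELS):
            return unary()
        value = binary(level + 1)
        while peek() in LEVELS[level]:          # left-associative within a level
            op = LEVELS[level][take()]
            value = op(value, binary(level + 1))
        return value

    def unary():                                # level 6: + - ~ !
        tok = take()
        if tok == "+":
            return unary()
        if tok == "-":
            return -unary()
        if tok == "~":
            return ~unary()
        if tok == "!":
            return TRUE if unary() == 0 else FALSE
        if tok == "(":
            value = binary(0)
            expect(")")
            return value
        if tok == "mod":                        # as used in Table 1-5
            expect("(")
            a = binary(0)
            expect(",")
            b = binary(0)
            expect(")")
            return a % b
        low = tok.lower()
        if low.startswith("0x"):
            return int(low[2:], 16)
        if low.startswith("0b"):
            return int(low[2:], 2)
        return int(tok, 10)

    result = binary(0)
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result


if __name__ == "__main__":
    for row in ("32+3*(20-4)", "14>=0&14<10", "!0&9-12<>0^-64"):
        print(row, "->", evaluate(row))
```

In this model, | and ^ share the lowest level and a true comparison yields -1, which is why the last row of Table 1-5 reduces to -1^-64 and evaluates to 63.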
Table 1-6 Built-in Functions

Function
    Description

.defined(<expression>)
    Returns zero (false) if the argument expression contains an undefined symbol, non-zero (true) otherwise.

.isabsolute(<expression>)
    Returns non-zero (true) if the expression evaluates to a numeric value or an absolute address, zero (false) otherwise.

.isfloat(<expression>)
    Returns non-zero (true) if the expression is a floating point quantity, zero (false) otherwise.

.isint(<expression>)
    Returns non-zero (true) if the expression is an integer quantity, zero (false) otherwise.

.isstring(<expression>)
    Returns non-zero (true) if the argument expression evaluates to a character string, zero (false) otherwise.

.classname(<address expression>)
    Returns a string representing the memory type in which the input address expression resides. The possible outputs are "X" for X memory, "Y" for Y memory, "L" for XY memory, and "CODE" for code memory.

.typename(<expression>)
    Returns the string representing the type of the expression. The returned string is one of "FLOAT", "NUMBER", "STRING", "ADDRESS", "UNDEFINED", "ERROR", "EXTERNAL", or an enumerated type name.

.segname(<address expression>)
    Returns the name of the segment in which the input address expression resides. Segment names are defined when segments are declared. See Section 1.4.16.2, "Memory Segments".

.segaddr(<address expression>)
    Returns the base address of the memory segment in which the input address expression resides.
    Note: .segaddr(<address expression>) + .offset(<address expression>) = <address expression>.

.offset(<address expression>)
    Returns the offset from the segment base address at which the input address expression resides, zero if undefined.
.filename()
    Returns a character string representing the full path of the file being assembled.

.linenumber()
    Returns the line number in the assembler file on which this function call resides.

.timestamp()
    Returns a character string representing the time this assembler run began, in the format "MM-DD-YY HH:MM:SS".

.linecount()
    Returns the total number of source lines read.

1.4.12 Mathematical Functions

The assembler provides a number of mathematical functions. These functions are presented in Table 1-7.

Table 1-7 Mathematical Functions

Function
    Description

.mod(<expression1>, <expression2>)
    Returns <expression1> modulo <expression2>, that is, the remainder after division of <expression1> by <expression2>.

.shl(<expression1>, <expression2>)
    Returns <expression1> left-shifted by <expression2>, that is, <expression1> multiplied by 2 to the power of <expression2>. <expression2> must be in the range [0...31].

.shr(<expression1>, <expression2>)
    Returns <expression1> arithmetically right-shifted by <expression2>, that is, <expression1> multiplied by 2 to the power of minus <expression2>. <expression2> must be in the range [0...31]. An arithmetic right shift preserves the arithmetic sign of <expression1>.

.abs(<expression>)
    Returns the absolute value of <expression>. The return datatype is the same as the datatype of <expression>.

.acos(<expression>)
    Returns the arc cosine of <expression> in radians as a floating point value in the range [0...pi]. <expression> must be in the range [-1...1].

.asin(<expression>)
    Returns the arc sine of <expression> in radians as a floating point value in the range [-pi/2...pi/2]. <expression> must be in the range [-1...1].

.atan(<expression>)
    Returns the arc tangent of <expression> in radians as a floating point value in the range [-pi/2...pi/2].

.cos(<expression>)
    Returns the cosine of <expression> (given in radians) as a floating point value.
.exp(<expression>)
    Returns the natural exponential (base e raised to the power of <expression>) as a floating point value.
    Example:
        EXP1 .EQU .EXP(1.0)                # EXP1 = 2.718282

.log(<expression>)
    Returns the natural logarithm of <expression> as a floating point value. <expression> must evaluate to an integer or floating point value greater than zero.
    Example:
        LOG .set .LOG(100.0)               # LOG = 4.605170

.log10(<expression>)
    Returns the base 10 logarithm of <expression> as a floating point value. <expression> must evaluate to an integer or floating point value greater than zero.
    Example:
        LOG .EQU .LOG10(100.0)             # LOG = 2.0

.log2(<expression>)
    Returns the base 2 logarithm of <expression> as a floating point value. <expression> must evaluate to an integer or floating point value greater than zero.
    Example:
        LOG .EQU .LOG2(8.0)                # LOG = 3

.max(<expr1>,<expr2>[,...,<exprN>])
    Returns the greatest of <expr1>,...,<exprN>. Expressions must be numerical. If all expressions are of integral type, the return value is an integer. Otherwise a floating point value is returned.
    Example:
        MAX .set .MAX(1.0,5,-3)            # MAX = 5.0 (floating point)

.min(<expr1>,<expr2>[,...,<exprN>])
    Returns the least of <expr1>,...,<exprN>. Expressions must be numerical. If all expressions are of integral type, the return value is an integer. Otherwise a floating point value is returned.
    Example:
        MIN .set .MIN(1.0,5,-3.25)         # MIN = -3.25

.pow(<expr1>,<expr2>)
    Returns <expr1> raised to the power <expr2> as a floating point value.
    Example:
        BUF .EQU .CVI(.POW(2.0,3.0))       # BUF = 8

.sign(<expression>)
    Returns the sign of <expression> as an integer: -1 if <expression> is negative, 0 if zero, 1 if positive.
    Example:
        .if .SIGN(INPUT)>=0                # is INPUT nonnegative?

.sin(<expression>)
    Returns the sine of <expression> (given in radians) as a floating point value.

.sqrt(<expression>)
    Returns the square root of <expression> as a floating point value. <expression> must be positive.
    Example:
        SQRT .EQU .SQRT(3.5)               # SQRT = 1.870829

.tan(<expression>)
    Returns the tangent of <expression> (given in radians) as a floating point value.
    Example:
        TANGENT .set .TAN(1.0)             # TANGENT = 1.557408

1.4.13 Conversion Functions

The assembler provides a number of conversion functions. These functions are presented in Table 1-8.

Table 1-8 Conversion Functions

Function
    Description

.b2f(<expression>)
    Converts <expression> to a floating point value. <expression> should represent a binary fraction.
    Example:
        FRC .EQU .B2F(0x40000000)          # FRC = 0.5

.f2b(<expression>)
    Returns the fractional representation of the floating point <expression> as a 32-bit integer.
    Example:
        FRAC1 .set .F2B(0.5)               # FRAC1 = 0x40000000
        FRAC2 .set .F2B(1.5)               # error!
    Example:
        THREE_PT_98_IN_9_23 .equ .F2B(3.98*.pow(2,-8.0))
                                           # THREE_PT_98_IN_9_23 = 0x01fd70a4

.f2i(<expression>)
    Converts <expression> to an integer value. Any fractional portion of <expression> is removed (truncated).
    Example:
        INT .set .F2I(-1.05)               # INT = -1

.i2f(<expression>)
    Converts <expression> to a floating point value.
    Example:
        FLOAT .set .I2F(5)                 # FLOAT = 5.0

.ceil(<expression>)
    Returns an integer value that represents the smallest integer greater than or equal to <expression>.

.floor(<expression>)
    Returns an integer value that represents the largest integer less than or equal to <expression>.
    Example:
        FLOOR .SET .FLOOR(2.5)             # FLOOR = 2

.round(<expression>)
    Adds 0.5 to <expression>, then converts to an integer as per .F2I or .FLOOR.
    Example for .round(<expression>):
        round1 .set .ROUND(1.5)            # round1 = 2
        round2 .set .ROUND(1.48)           # round2 = 1

1.4.14 Loading Immediate Values to Registers

The Cirrus Logic Assembly Program (CASM) allows the developer to load 16 bits at a time into a register. Loading more than 16 bits into a register therefore takes multiple lines of code. Below are two examples of loading more than 16 bits into a register. The first example uses two processing cycles and no data memory; the second uses one processing cycle and one word of data memory. It is often more desirable to conserve processing cycles than to conserve memory, so the second example is generally, but not always, recommended over the first.

Example 1: Load 3.98 as a Q9.23 fixed-point number into the accumulator a1.

1. Define the symbol "THREE_PT_98_IN_9_23" as shown below:

    THREE_PT_98_IN_9_23 .equ .f2b(3.98*.pow(2,-8.0))

2. Load 16 bits at a time into the high part of the accumulator:

    a1 = (THREE_PT_98_IN_9_23>>16)                   # 3.98 in Q9.23
                                                     # a1 = 00 01fd0000 00000000
    lo16(a1) = (THREE_PT_98_IN_9_23 & 0x0000FFFF)    # a1 = 00 01fd70a4 00000000

Note: Example 1 uses two processing cycles and no data memory.

The following macro could be used to implement the first example.
.macro
    FLOAT_2_FIXEDQNM_REG_LOAD %intval, %fracval, %n, %m, %reg
    # inputs:
    #   intval  = integer part of float value
    #   fracval = fractional part of float value
    #   n       = number of bits for (signed) integer part
    #   m       = number of bits for fractional part
    #   reg     = destination register (32 bit data register or accumulator)
    %reg = (.shr(.f2b(%intval.%fracval*.pow(2,-(%n.0-1.0))),16))
    lo16(%reg) = (.f2b(%intval.%fracval*.pow(2,-(%n.0-1.0)))&0xffff)
.endm

.code_ovly
    FLOAT_2_FIXEDQNM_REG_LOAD 3, 98, 9, 23, a1
    #### The above macro expands to
    # a1 = (.SHR(.F2B(3.98*.POW(2,-(9.0-1.0))),16))
    # LO16(a1) = (.F2B(3.98*.POW(2,-(9.0-1.0)))&0XFFFF)

Example 2: Load 3.98 as a Q9.23 fixed-point number into the accumulator a0.

.xdata_ovly
    THREE_PT_98_IN_9_23 .equ .f2b(3.98*.pow(2,-8.0))
    X_THREE_PT_98_IN_9_23 .dw (THREE_PT_98_IN_9_23)
.code_ovly
    a0 = xmem[X_THREE_PT_98_IN_9_23]                 # a0 = 00 01fd70a4 00000000

Note: Example 2 uses one processing cycle and one word of data memory.

1.4.15 String Functions

The assembler provides a number of string functions. These functions are presented in Table 1-9.

Table 1-9 String Functions

Function
    Description

.strpos(<str1>,<str2>[,<start>])
    Returns the 0-based position (0 = first character of <str1>) of string <str2> in <str1> as an integer, starting the search at position <start>. If <start> is not given, the search begins at the first character of <str1>. If the <start> argument is specified, it must be a positive integer. If <str2> is not found in <str1>, the length of <str1> is returned.
    Example:
        ID .EQU .STRPOS('CS18101','18')    # ID = 2

.strcmp(<str1>,<str2>)
    Lexicographical string comparison (case sensitive). Returns the integer 0 if the two strings are the same, 1 if <str1> is greater than <str2>, and -1 if <str1> is less than <str2>.
    Example:
        .IF .STRCMP(STR,'MAIN')            # does STR differ from "MAIN"?
.strcat(<string1>, <string2>[, … <stringN>])
    Returns the string arguments concatenated into a single string.

.strlen(<string>)
    Returns the length of the argument character string (<string>).

.streval(<string>)
    Evaluates the character string argument <string> as an expression and returns the value of the expression.

.substr(<string>, <start-expression>[,<length-expression>])
    Returns the substring that starts at <start-expression> and runs to the end of the string, or to <length-expression>+<start-expression> when <length-expression> is given. <start-expression> is a 1-based index into the string. If <length-expression> is omitted, the substring from <start-expression> to the end of the string is returned.

.numtostr(<expression>) or .str(<expression>)
    Evaluates <expression> and returns its numerical value as a decimal character string.

.numtostrx(<expression>) or .strx(<expression>)
    Evaluates <expression> and returns its numerical value as a hexadecimal character string.

1.4.16 Assembler Directives

Assembler directives have a format similar to instructions, but they are instructions to the assembler and are not intended for the target DSP. Instructions or data memory locations may be generated as a side effect, but not necessarily. The assembler directives assist the programmer in controlling and configuring code. Some directives allow separate assembler files to share objects. Others allocate and initialize data memory. Conditional assembly directives allow for optional features without writing multiple versions of the same code. Finally, there are macros, which help the programmer encapsulate often-used or iterated blocks of code.
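Before moving on to the individual directives, note that the string functions in Table 1-9 mix index conventions: .strpos is 0-based and returns the length of <str1> on a miss, while .substr takes a 1-based start. The Python sketch below models those conventions so they can be checked side by side. It is an illustration, not CASM code; the Python function names are hypothetical, and treating <length-expression> as a character count is an assumption about how the end position is counted.

```python
def strpos(str1, str2, start=0):
    """Model of .strpos: 0-based position of str2 in str1, or len(str1) if absent."""
    index = str1.find(str2, start)
    return len(str1) if index < 0 else index


def strcmp(str1, str2):
    """Model of .strcmp: case-sensitive comparison returning -1, 0, or 1."""
    return (str1 > str2) - (str1 < str2)


def substr(string, start, length=None):
    """Model of .substr: start is 1-based; length (assumed to be a character
    count) is optional and defaults to the rest of the string."""
    begin = start - 1
    if length is None:
        return string[begin:]
    return string[begin:begin + length]


print(strpos("CS18101", "18"))    # 0-based position of "18"
print(strpos("CS18101", "99"))    # not found: length of the first string
print(strcmp("MAIN", "MAIN"))     # equal strings compare as 0
print(substr("moocow", 4))        # 1-based start, runs to the end
```

One practical consequence of the "not found returns the length" rule is that a result equal to .strlen(<str1>) is the value to test for when checking whether a substring is absent.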
1.4.16.1 Code Modularity

Macros that enable programmers to build modularity into their code are presented in Table 1-10.

Table 1-10 Macros

Macro
    Description

.include <file string>
    Opens the file with path <file string>, then inserts and assembles the file contents at the point where the directive resides. There is no default extension for an include file. If <file string> has no extension, the assembler attempts to open the file path with no extension. An include file can itself contain .include directives, with no nesting limit. An include file can be shared among several assembler files to define common constants, expressions, and macros.

.public <symbols>
    Allows the specified list of symbols to be referenced by other assembler files. <symbols> contains one or more symbols, separated by commas, that are defined in the course of assembly of this file.

.extern <symbols>
    Tells the assembler that the specified list of symbols is defined with .public directives in other assembler files. <symbols> contains one or more symbols, separated by commas, defined in other assembler files. If an .extern symbol is defined in the course of assembly of this file and declared to be .public, the .public declaration takes precedence. If an .extern symbol is not defined in any object file, the linker produces an unresolved external error at link time. An .extern symbol cannot be used in an expression.

.end
    Tells the assembler that there are no more lines to assemble in this file. Further assembly of this file is halted and the file is closed. Use of this directive is optional, because it is equivalent to a physical end of file. The .end directive may be used to hide text below the directive from the assembler.
.export <symbols>
    Allows the specified list of symbols to be exported to a *.xo file at link time and to be referenced by other assembler files. <symbols> contains one or more symbols, separated by commas, that are defined in the same assembly file. CASM does not report an error if both the .public and .export macros are used for the same symbol. External symbols cannot be exported, so an error is reported in this case.

1.4.16.2 Memory Segments

The segment directive instructs the assembler to put subsequent instructions or data declarations in the memory specified by <segment-class>.

    [<name>] segment <class_name> [at <address> | align <modulo>]

OR

    [<name>] .<segment-class> [at <address> | align <modulo>]

Use .xdata_ovly, .ydata_ovly, .data_ovly, and .code_ovly. For example:

    MAIN_XDATA_ALIGN16 segment "X_OVLY" align 16
    MAIN_XDATA_ALIGN32 .xdata_ovly align 32
    SAMPLE_MCV .ydata_ovly
    MAINCODE .code_ovly
    MY_64_BIT_XY_DATA_SEGMENT .data_ovly

The segment definition ends with either the next segment directive or the end of assembly. All consecutive declarations within a segment appear contiguously in the final memory map produced by the linker.

<name> is an optional parameter. <name> may be the same as that used on other segment declarations, in which case the new declaration is concatenated with the previous one(s) with like <name> parameters. The ordering of these concatenations is indeterminate.

If the at form of the directive is used, the first memory location following the directive is placed at the absolute <address> specified. Only one declaration with the at form is allowed for a given segment <name>. If such a declaration exists, it determines the starting point for the segment. All segments using the same <name> specification are concatenated and follow the segment generated with the at specification.
Use of the align form of the segment declaration instructs the linker to align the first memory location following the directive to the given address <modulo>. Only one declaration with the align form is allowed for a given segment <name>. All segments using the same <name> specification are concatenated and follow the segment generated with the align specification.

The at and align forms of the segment declaration are mutually exclusive and may not be used together.

1.4.16.3 Symbol Assignment

The following directives are used to assign values to symbols.

Table 1-11 Symbol Assignment

Directive
    Description

<symbol> .equ <expression>
    Assigns the value of <expression> to <symbol>. The .equ directive can be applied to a given symbol only once in an assembly: the symbol must not have been defined previously and cannot be redefined later in the assembly. There may be no forward references to symbols in <expression>.
    Example:
        uyequ .equ (0x1)

<symbol> .set <expression>
    Assigns the value of <expression> to the symbol defined in this code line. The symbol can be redefined later in the assembly with other instances of the .set directive. <expression> in this directive may contain forward references.

1.4.16.4 Data Memory Assignment

Data memory directives are described in Table 1-12.

Table 1-12 Data Memory Assignment

Directive
    Description

.bss <expression>
    No allocation is performed. This directive instructs the linker that <expression> words are reserved.
    Example:
        .xdata
        .bss (0xff)

.bsc <count>[,<value>]
    Instructs the assembler to allocate <count> words in the current memory segment. All words allocated are initialized to the value of <value>. If <value> is not used, no allocation is performed and the memory locations are only reserved.
    Example:
        .data
        .bsc (0xf),0x8

.dcb <arg>[, <arg>, …]
    Define constant byte(s). <Arg>(s) may be strings or integers.
    Strings are not implicitly null terminated. Data is stored in big endian order and zero padded to the next 32-bit word boundary.
    Syntax: .dcb <arg>[, <arg>, …]
    Example:
        .dcb 0x30,0x31,0x32,0x33,0x34
        .dcb 0x35,0x36,0x37,0x38,0x40
        .dcb 'R'
        .dcb 'CirrusLogic',0        # explicit null termination
        .dcb 'four'                 # no null termination
        # Generates the following memory image:
        X 0000 30313233 34000000 35363738 40000000
        X 0004 52000000 43697272 75734C6F 67696300
        X 0008 666F7572

.dh <arg>[,<arg>, …]
    Define constant 16-bit halfword(s). <Arg>(s) must be in the range 0…65535 to fit in 16-bit storage. Data is stored in big endian order and zero padded to the next 32-bit word boundary. This construct is useful for packing a 16-bit address expression and a 16-bit integer expression in the same 32-bit word.
    Example:
        .dh 0x2d,0x123,0x2d,0x1111
        # Generates the following memory image:
        0x002d0123 0x002d1111

.dw <expression>
.dw <expression1>,<expression2>
    These directives instruct the assembler to allocate one word in the current memory segment. If the memory is X or Y, the first form must be used, and the word allocated is initialized to the value of <expression>. If the memory is X-Y, the second form must be used. The X word allocated is initialized to the value of <expression1>, and the Y word allocated is initialized to the value of <expression2>.
    Example 1:
        .xdata
        .dw (0x12345678)
    Example 2:
        .data
        .dw (0x1),(0x2)

.dd <expression>
    Instructs the assembler to allocate one word in the current memory segment, which must be in XY. The X word allocated is initialized to the most significant bits of <expression>, and the Y word allocated is initialized to the least significant bits of <expression>.
    Example:
        .data
        .dd (0x12345678ABCDEF00)

1.4.16.5 Conditional Assembly

Conditional assembly directives are described in Table 1-13.
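As a footnote to the .dcb and .dh entries of Table 1-12, the packing rule they describe (big endian order, zero padding to the next 32-bit word boundary, applied per directive line) can be reproduced with a short Python model. This is a sketch for checking memory images by hand, not CASM code. The names `pack_words`, `dcb`, and `dh` are hypothetical, and the restriction of integer .dcb arguments to 0…255 is an assumption the guide does not state.

```python
def pack_words(data):
    """Zero-pad bytes to a 32-bit boundary and group them big-endian into words."""
    padded = bytes(data) + bytes(-len(data) % 4)
    return [int.from_bytes(padded[i:i + 4], "big")
            for i in range(0, len(padded), 4)]


def dcb(*args):
    """Model of one .dcb line: strings and byte-sized integers, no implicit null."""
    data = bytearray()
    for arg in args:
        if isinstance(arg, str):
            data += arg.encode("ascii")
        elif 0 <= arg <= 0xFF:        # assumption: integers must fit in a byte
            data.append(arg)
        else:
            raise ValueError(f"byte out of range: {arg}")
    return pack_words(data)


def dh(*args):
    """Model of one .dh line: each argument is a 16-bit halfword."""
    data = bytearray()
    for arg in args:
        if not 0 <= arg <= 0xFFFF:
            raise ValueError(f"halfword out of range: {arg}")
        data += arg.to_bytes(2, "big")
    return pack_words(data)


# Print each modeled line as 8-digit hexadecimal words.
for words in (dcb(0x30, 0x31, 0x32, 0x33, 0x34),
              dcb("R"),
              dcb("CirrusLogic", 0),
              dcb("four"),
              dh(0x2D, 0x123, 0x2D, 0x1111)):
    print(" ".join(f"{word:08X}" for word in words))
```

The printed words can be compared against the memory images shown in the .dcb and .dh examples; each call models a single directive line, which is why the five-byte line occupies two words.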
Table 1-13 Conditional Assembly Directives

Directive
    Description

.if <expression>
    Evaluates <expression>, which must not contain a forward reference. If the value is true (non-zero), the lines just below this directive are assembled up to the next .elseif, .else, or .endif directive. If that directive is .elseif or .else, the assembler skips to the following .endif directive. Normal assembly continues after the .endif directive. If the value is false (zero), the assembler skips to the corresponding .elseif, .else, or .endif directive. The directive encountered is then executed. An .if directive begins an .if block. A corresponding .endif directive must follow the .if directive. An .if block does not require either an .elseif directive or an .else directive.

.elseif <expression>
    Functions in the same manner as the .if directive above. However, this directive must be contained within an .if block, that is, after an .if directive and before its corresponding .endif directive. In addition, the .elseif directives in an .if block must all occur before a corresponding .else directive.

.else
    If encountered when the .if directive and all .elseif directives in the .if block have evaluated to false (zero), instructs the assembler to assemble the lines just below this directive until an .endif directive is encountered.

.endif
    Indicates the termination of an .if block. Normal assembly continues after this line.

.if (block configurations)
    There are four basic configurations of an .if block:

    • if Configuration I
          .if <exp>
              <code>
          .endif

    • if Configuration II
          .if <exp>
              <code0>
          .else
              <code1>
          .endif

    • if Configuration III
          .if <exp0>
              <code0>
          .elseif <exp1>
              <code1>
          # optional additional .elseif clauses
          .endif
    • if Configuration IV
          .if <exp0>
              <code0>
          .elseif <exp1>
              <code1>
          # optional additional .elseif clauses
          .else
              <codeN>
          .endif

    The conditional expressions are evaluated until one is found to be true. The code segment corresponding to this conditional is assembled. If no conditional expression evaluates to true, and an .else directive exists in this block, the code segment corresponding to the .else directive is assembled.

    It is possible to nest .if blocks. The entire nested .if block, from the .if directive to the .endif directive, must reside in one <code> section, meaning that the .if directive cannot reside in <codeN> while its corresponding .endif directive resides in <codeN+1>.

1.4.16.6 Token Substitution

.def <token> <substitution>
    Defines a token substitution. All occurrences of <token> are replaced with <substitution> for the remainder of the source file or until a matching .undef statement is encountered. <token> may be any text that is not already known as a symbol to the assembler. <substitution> may be any arbitrary text delimited by end of line or start of comment.
    Example:
        .def STACK_PTR i7
        x0 = xmem [ STACK_PTR ]

.undef <token>
    Undefines a previously defined token (defined with .def). The token remains undefined until the end of the source code or until it is redefined with .def.
    Example:
        .def STACK_PTR i7
        .if .defined(STACK_PTR)
            x0 = xmem[STACK_PTR]
        .endif
        .undef STACK_PTR
        x0 = xmem[STACK_PTR]        # <- CASM reports an error!

1.4.16.7 Listing and Message Control

The following directives have no effect if the -l switch does not appear in the command line. Omitting the -l switch specifies that no listing file will be created.

• .list <switches>
  This directive controls the format of the listing file. <switches> contains one or more listing control switches, separated by spaces.
The formats of the switches are as follows:

Table 1-14 Listing Control Switches

Directive
    Description

off
    Inhibits listing output.

on
    Activates listing output, or reverses the action of the off switch.

cond, +cond, -cond
    Control the listing of code blocks skipped over during .if block processing. '+' or no prefix lists the skipped blocks (the default behavior), and the '-' prefix does not list the skipped blocks.

mac, +mac, -mac
    Control the listing of macro expansion. '+' or no prefix lists the expansion lines (the default behavior), and the '-' prefix does not list the expansion lines.

inc, +inc, -inc
    Control the listing of include files. '+' or no prefix lists the contents of include files (the default behavior), and the '-' prefix does not list the include files.

sym, +sym, -sym
    Control the listing of the symbol table. '+' or no prefix lists the symbol table (the default behavior), and the '-' prefix does not list the symbol table.

gensym, +gensym, -gensym
    Control the listing of internally generated symbols in the symbol table. '+' or no prefix lists internal symbols in the symbol table, and the '-' prefix does not list internal symbols in the symbol table (the default behavior).

allsym, +allsym, -allsym
    Control the listing of internally reserved symbols in the symbol table. '+' or no prefix lists reserved symbols in the symbol table, and the '-' prefix does not list reserved symbols in the symbol table (the default behavior).

.page
.page <expression>
    Controls pagination of the listing file and the page size. The first form, without an argument, causes a page advance in the listing. The second form establishes <expression> as the number of lines per page. The default number of lines per page is 60.

.title <string>
    Instructs the assembler to print <string> at the top of every listing page.
    This directive must be employed only once per assembly.

.subtitle <string>
    Instructs the assembler to print <string> under the title line of subsequent listing pages. <string> is printed on the subtitle line of the current listing page if the directive is encountered within the first four lines of the page. This directive may be employed more than once or not at all.

.error <strings>
    Instructs the assembler to print an error message on both the console output and the listing file. <strings> consists of one or more string expressions, separated by commas, that are concatenated together to create the message.

.message <strings>
    Instructs the assembler to print a message on the console output but not the listing file. <strings> consists of one or more string expressions, separated by commas, that are concatenated together to create the message.

1.4.16.8 Assembler Warning/Error Control

The assembler emits warnings and errors for certain combinations of target DSP instructions. The .pragma directive allows some control over which conditions are considered errors and which are considered warnings. The syntax of the .pragma directive is:

    .pragma enable:<condition> [,<condition>...]
    .pragma disable:<condition> [,<condition>...]

The possible condition values are:

• OS
  If enabled, the use of registers i8-i11, nm8-nm11, iic_mask, and iic_addr is allowed. Otherwise, such use produces an error because these registers are reserved for use by the DSP operating system.

The errors or warnings produced are:

• LOAD_DELAY_AS_WARNING
  If enabled, the use of an index register immediately after it is loaded with a constant is treated as a warning condition. Otherwise, such an instruction sequence produces an error.
• GLOBAL_MEM
  If enabled, the standard memory location directives (.data, .code, .ydata, .xdata) are allowed. Otherwise use of those directives is an error, and code should use the application-specific memory segment directives (such as .xdata_ppm).

• CODE_IN_DATA
  If enabled, executable instructions found in data segments are allowed. Otherwise, such instructions cause an error.

• DATA_IN_CODE
  If enabled, data definition directives (such as .dw) found in code segments are allowed. Otherwise, such directives cause an error.

1.4.16.9 Data Structure Types

CASM provides a way of grouping variables into structures, similar to structures in the C programming language. Elements of structures are referred to as "members". Structure members can be initialized at structure type definition time or at structure instantiation time. Values specified for structure members at structure type definition time propagate to each instance of the type unless overridden by the instance definition. Not all structure members need to be initialized, but a member initialized at instantiation time may not be preceded by an uninitialized member. Any of the data memory directives described in Section 1.4.16.4 may be used to define structure type members.

Structure type definition syntax:

    <struct type name>  .struct
    <element1>          <data memory directive>
    <element2>          <data memory directive>
    <element3>          <data memory directive>
    ...
    <elementn>          <data memory directive>
    .endstruct

Structure instantiation syntax:

    <label> <struct type name> (<initial_value1>,...,<initial valuen>)

Accessing the structure members from code:

• Direct structure member access:

    a0 = xmem[<struct symbol>.<element>]

• Indirect structure member access may be performed by adding the offset of the member inside the structure to the structure's address:

    i0 = (<struct symbol 1>)
    i1 = (<struct symbol 2>)
    a0 - a1
    if (a > 0) jmp >
    anyreg (i0, i1)
    % i0 = i0 + (<struct type>.<element>)
    x0 = xmem[i0]

• Accessing members of an externally-defined structure:

    <struct type> .struct
    ...
    ...
    .endstruct
    .extern <symbol> (<struct type>)
    a0 = xmem[<symbol>.<element>]

Example 1:

    # define a structure type labeled Y_OS_GLOBAL_VARS_T
    Y_OS_GLOBAL_VARS_T              .struct
    _X_BY_IOBUFFER_PTRS             .bsc NUM_IOBUFFER_CHANNELS,0
    _X_VY_HOST_SAMPLERATE           .dw 0
    _X_VY_Host_Output_Mode_Control  .dw 0
    _X_VY_IO_Free                   .dw 0
    _X_VY_Autodet_config            .dw 0
    _X_VY_IO_Processed              .dw 0
    _X_VY_ODT_PTR                   .bsc NUM_OVLY_SLOTS,0
    .endstruct

    .ydata_dec
    # define an instance of the Y_OS_GLOBAL_VARS_T structure type:
    Y_OS_GLOBAL Y_OS_GLOBAL_VARS_T

    .code_dec
    # read the value of the _X_VY_HOST_SAMPLERATE member of the Y_OS_GLOBAL structure:
    y0=ymem[Y_OS_GLOBAL._X_VY_HOST_SAMPLERATE]

Example 2:

    # define a structure type labeled EXAMPLE_STRUCT_T:
    EXAMPLE_STRUCT_T        .struct
    INITIALIZED_ARRAY_1     .bsc 2,0
    UNINITIALIZED_WORD_2    .bss 1
    INITIALIZED_ARRAY_3     .bsc 3, 1
    INITIALIZED_WORD_4      .dw 0x7FFFFFFF
    UNINITIALIZED_ARRAY_5   .bss 5
    .endstruct

    .ydata_dec
    # define an instance of the EXAMPLE_STRUCT_T structure type:
    EXAMPLE_STRUCT EXAMPLE_STRUCT_T

    .code_dec
    # read the value of the UNINITIALIZED_WORD_2 member of the EXAMPLE_STRUCT structure:
    y0 = ymem[EXAMPLE_STRUCT.UNINITIALIZED_WORD_2]

Arbitrary structure nesting is
supported. All the rules regarding initialization that apply to structures without nested structures apply here as well.

Structure nesting syntax:

    <struct type name>  .struct
    <element1>          <data memory directive> or <struct type name>
    <element2>          <data memory directive> or <struct type name>
    <element3>          <data memory directive> or <struct type name>
    ...
    <elementn>          <data memory directive> or <struct type name>
    .endstruct

Nested structure initialization syntax:

    <label> <super struct type name> (<initial_value1>,...,<initial valuen>)

Accessing the nested structure members from code:

• Direct nested structure member access:

    a0=xmem[<superstruct symbol>.<substruct symbol>.<element>]

• Indirect nested structure member access may be performed by adding the offset of the member inside the superstructure to the structure's address:

    i0 = (<superstruct symbol 1>)
    i1 = (<superstruct symbol 2>)
    a0 - a1
    if (a > 0) jmp >
    anyreg (i0, i1)
    % i0 = i0 + (<superstruct type>.<substruct type>.<element>)
    x0 = xmem[i0]

Example 1:

    Y_OS_GLOBAL_VARS_T              .struct
    _X_BY_IOBUFFER_PTRS             .bsc NUM_IOBUFFER_CHANNELS,0
    _X_VY_HOST_SAMPLERATE           .dw 0
    _X_VY_Host_Output_Mode_Control  .dw 0
    _X_VY_IO_Free                   .dw 0
    _X_VY_Autodet_config            .dw 0
    _X_VY_IO_Processed              .dw 0
    _X_VY_ODT_PTR                   .bsc NUM_OVLY_SLOTS,0
    .endstruct

    DEMO_STRUCT_TYPE_T  .struct
    _ELEMENT_0          .dw 0
    _ELEMENT_1          .dw 0
    _ELEMENT_2          Y_OS_GLOBAL_VARS_T
    _ELEMENT_3          .dw 0
    .endstruct

    .ydata_dec
    Y_OS_GLOBAL Y_OS_GLOBAL_VARS_T
    Y_DEMO_STRUCT DEMO_STRUCT_TYPE_T (0x1,0x2,0x3,0x4,0x5,0x6,0x7,0x8,0x9,0xA,0xB,0xC)

CASM supports struct initialization with brackets.
For example, Y_DEMO_STRUCT can be initialized the following way:

    Y_DEMO_STRUCT DEMO_STRUCT_TYPE_T (0x1,0x2,(0x3,0x4,0x5,0x6,0x7,0x8,0x9,0xA,0xB),0xC)

or

    Y_DEMO_STRUCT DEMO_STRUCT_TYPE_T (0x1,0x2,(0x3,0x4,0x5,0x6,0x7,0x8),0xC)

In the second example, _X_VY_IO_Processed and _X_VY_ODT_PTR keep the pre-defined values from the type definition (0 and NUM_OVLY_SLOTS,0).

Example 2:

    # define various nested structures:
    STRUCT_1_T  .struct
    member_1    .dw 0x101
    member_2    .dw 0x102
    member_3    .dw 0x103
    .endstruct

    STRUCT_2_T  .struct
    member_1    .dw 0x201
    member_2    .dw 0x202
    member_3    .dw 0x203
    .endstruct

    STRUCT_3_T  .struct
    member_1    .dw 0x301
    member_2    .dw 0x302
    member_3    .dw 0x303
    .endstruct

    STRUCT_23_T .struct
    STRUCT_2    STRUCT_2_T
    STRUCT_3    STRUCT_3_T
    .endstruct

    STRUCT_4_T  .struct
    STRUCT_1    STRUCT_1_T
    STRUCT_23   STRUCT_23_T
    .endstruct

    .ydata_dec
    # define an instance of the STRUCT_4_T structure type:
    STRUCT_4 STRUCT_4_T (0x401, 0x402, 0x403)

    .code_dec
    # read the value of the member_1 member of the STRUCT_2 nested structure:
    y0 = ymem[STRUCT_4.STRUCT_23.STRUCT_2.member_1]

1.4.16.10 Sizeof Function

This function returns the number of words allocated for a structure or a symbol.
    sizeof(<struct type name>)
    sizeof(<struct symbol>)
    sizeof(<symbol>)

Examples:

    Y_OS_GLOBAL_VARS_T              .struct
    _X_BY_IOBUFFER_PTRS             .bsc NUM_IOBUFFER_CHANNELS,0
    _X_VY_HOST_SAMPLERATE           .dw 0
    _X_VY_Host_Output_Mode_Control  .dw 0
    _X_VY_IO_Free                   .dw 0
    _X_VY_Autodet_config            .dw 0
    _X_VY_IO_Processed              .dw 0
    _X_VY_ODT_PTR                   .bsc NUM_OVLY_SLOTS,0
                                    .endstruct

    SYMBOL 1                        .equ 10 + sizeof(Y_OS_GLOBAL_VARS_T)

                                    .ydata_dec
    Y_OS_GLOBAL                     Y_OS_GLOBAL_VARS_T
    Y SYMBOL_0                      .dw 12
    SYMBOL_1                        .bss 10

                                    .code_dec
    i0=(sizeof(Y_OS_GLOBAL_VARS_T))
    i0=(sizeof(Y_OS_GLOBAL))
    i0=(sizeof(SYMBOL_0))
    i0=(sizeof(SYMBOL_1))

1.4.16.11 Assert Directive

This directive is used for debugging. The expression in the assert is evaluated as true or false. If the value of the expression is false, CASM reports an error.

Usage:

    .assert <expression>

Example:

    a_t              .struct
    channel_address  .dw 0
    channel_stride   .dw 0
    channel_buffer   .bss 16
                     .endstruct

                     .xdata_ovly
    my_data          a_t

                     .code_ovly
    ...
    i4 = (0)+(my_data)   # use a single index register to access the members of the structure
    # make sure the type declaration doesn’t violate the coding assumptions
    i0 = xmem[i4]; i4 += 1    # i0 = channel_address
    .assert (a_t.channel_stride = (a_t.channel_address+1))
    nm4 = xmem[i4]; i4 += 1   # nm4 = channel stride
    .assert (a_t.channel_buffer = (a_t.channel_stride+1))
    # i4 = buffer
    ...

1.4.17 Macro Definition and Calling

The following directive instructs the assembler to begin defining a macro:

• .macro
  .macro nolist

Unlike other assemblers, the macro defined is not associated with a symbol defined at the beginning of this line, so there should be no symbol defined on this line.
The second form of this directive inhibits the lines of the macro definition from appearing in the listing file.

The line immediately following the .macro directive contains the calling prototype. The format of the prototype line is as follows:

    [<%symbolarg>] <name> <%args>

<symbolarg>, if defined, must begin in the first column of the line. This local symbol allows the macro call to pass an argument as a symbol definition. The symbol in the macro call is not defined or re-defined, but passed as an argument to the macro.

<name> is the macro name. <name> cannot begin at the first column of the line.

<%args> is an optional list of local symbols that serve as arguments to the macro. These arguments should be separated by commas. There are two ways to pass a comma as part of an argument rather than as a delimiter:

• ROMCMD {"ABC",7}, ABCROUTINE
• ROMCMD "ABC"%,7, ABCROUTINE

In the first example, CASM understands that the curly braces enclose an entire parameter. In the second example, the percent sign “escapes” the comma and causes CASM to accept it as text rather than a parameter delimiter. A percent sign can also be used to “escape” a curly brace or another percent sign so that these characters are accepted as text rather than special parameter syntax.

The lines after the calling prototype, up to but not including the closing .endm directive, constitute the macro body. It is possible to define symbols within the macro body, but like the arguments, all symbols in a macro must be local symbols. All references to both arguments and body symbols must start with the ‘%’ character, employing the same format as their definition. No ‘>’ or ‘<’ references are allowed in a macro body. A local symbol outside of the macro body cannot be referenced.

The ‘%’ character has other special functions within a macro body. The special functions are described in Table 1-15.
Table 1-15 Special Characters Used in Macros

    Characters      Description
    %%:             Replaced with a single ‘%’ character in the macro expansion.
    %#:             Delineates a comment that will not appear in the macro expansion.
    %&:             Replaced with no characters; used as a concatenation operator in
                    conjunction with a symbol or argument reference.
    %(<string>):    Replaced with the contents of <string>.

The following directives are used in defining macros:

• .endm
  This directive marks the end of a .rept block or .macro body. Normal assembly resumes after this line.

• .exitm
  This directive, if encountered during macro expansion, terminates macro expansion at this point, essentially skipping to the corresponding .endm directive.

An example of a macro is given in the code example in Section 1.4.19.

1.4.18 Macro Replication

The following directives are used to replicate a block of source lines:

    .rept <expression>
    .rept %<variable>=<start>,<stop>[,<step>]
    .endm

These directives create duplicates of the enclosed block of source lines. In the first form, <expression> is the number of copies and must be a non-negative integer. In the second form, <variable> is assigned a value of <start> for the first iteration and is incremented by <step> (or by 1 if <step> is not supplied) on each successive iteration. <variable> does not need to be previously defined. On the last iteration, <variable> is equal to or less than <stop>.
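The iteration rule for the second form of .rept can be modelled in a few lines of Python. This is only an illustrative sketch, not CASM itself: the function names rept_values and expand are hypothetical, and the sketch assumes a positive <step>, since the text above does not say how a zero or negative step is handled.

```python
def rept_values(start, stop, step=1):
    """Values taken by the .rept loop variable (assumes a positive step)."""
    if step <= 0:
        raise ValueError("only positive steps are modelled in this sketch")
    values = []
    value = start
    while value <= stop:          # the last value is equal to or less than stop
        values.append(value)
        value += step
    return values


def expand(body, start, stop, step=1):
    """Return one copy of the body per iteration; `body` formats one line."""
    return [body(v) for v in rept_values(start, stop, step)]


# Model of: .rept %v=1,5,2 / .dw %v*5+3 / .endm
lines = expand(lambda v: f".dw {v * 5 + 3}", 1, 5, 2)
print("\n".join(lines))
```

With <start>=1, <stop>=5, and <step>=2 the variable takes the values 1, 3, and 5, so the body is emitted three times.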
Example 1:

    count   .set 3
            .rept count
            .dw 0x20
            .endm

Generates:

    count   .set 3
            .dw 0x20
            .dw 0x20
            .dw 0x20

Example 2:

            .rept %v=1,5,2
            .dw %v*5+3
            .endm

Generates:

            .dw 8
            .dw 18
            .dw 28

Example 3:

            .macro
            MapDef %index,%offset
    DEF_AUDIOMAP%&%(.numtostr(%index)) .set %(.numtostr(%offset))
            .endm

    AUDIO_INDEX .set 1
            .rept %m=1,4
            .rept %%n=1,3
            MapDef AUDIO_INDEX,%m*4+%%n
    AUDIO_INDEX .set AUDIO_INDEX+1
            % .endm
            .endm

Generates:

    DEF_AUDIOMAP1   .set 5
    DEF_AUDIOMAP2   .set 6
    DEF_AUDIOMAP3   .set 7
    DEF_AUDIOMAP4   .set 9
    ...
    DEF_AUDIOMAP12  .set 19

1.4.19 Assembly Language Example

The following code example employs many of the assembler directives in this section, concentrating on macros and conditional assembly, to create a generic FIR filter macro.

            .list-cond
            .macro
    %label  fir %tapptr, %tapbuf, %tapcir, %wtbuf, %order
    # tapptr: pointer into tap buffer, string reference
    # tapbuf: tap buffer
    # tapcir: tap buffer circular length
    # wtbuf: tap weight buffer
    # order: FIR order

    # make sure buffer memory is configured properly
            .if classname(%tapbuf) = "XYMEM" | classname(%wtbuf) = "XYMEM"
            .error "fir: tap and/or weight buffer cannot be in x-y memory"
            .exitm
            .elseif classname(%tapbuf) = "XMEM" & classname(%wtbuf) = "XMEM"
            .error "fir: tap and weight buffer cannot both be in x memory"
            .exitm
            .elseif classname(%tapbuf) = "YMEM" & classname(%wtbuf) = "YMEM"
            .error "fir: tap and weight buffer cannot both be in y memory"
            .exitm
            .endif

    # load address registers and perform convolution depending on where circular taps and weights are
            .if classname(%tapbuf) = "YMEM"
            i0 = (%wtbuf)
            i4 = %(%tapptr)     # recall the %(<string>) operator
            nm4 = (%tapcir)
            a0 = 0; x0 = xmem[i0]; i0 += 1; y0 = ymem[i4]; i4 += 1
            do (%order), %loop
    %loop:  a0 += x0*y0; x0 = xmem[i0]; i0 += 1; y0 = ymem[i4]; i4 += 1
            nm4 = (0)
            .else
            i4 = (%wtbuf)
            i0 = %(%tapptr)
            nm0 = (%tapcir)
            a0 = 0; x0 = xmem[i0]; i0 += 1; y0 = ymem[i4]; i4 += 1
            do (%order), %loop
    %loop:  a0 += x0*y0; x0 = xmem[i0]; i0 += 1; y0 = ymem[i4]; i4 += 1
            nm0 = (0)
            .endif
    # new value of index register is not put back into tapptr
            .endm

The above macro performs a simple FIR once and checks the memory locations of the taps and tap weights to ensure that a fast FIR can be performed. This macro can be called as follows:

    Mylabel fir "xmem[inptr]", inbuffer, 128, lpfilter, 65

Chapter 2

32-Bit DSP Internal Architecture and Programming Model

2.1 Overview

The Cirrus Logic 32-bit DSP core is a fixed-point, fully programmable digital signal processor which achieves high performance through an efficient instruction set and highly parallel architecture. This device uses two's complement fractional number representation. The block diagram of the internal architecture is shown in Figure 2-1. The device has busses for two data memory spaces and one program memory space.

[Figure 2-1: block diagram showing the Program Control Unit (interrupts, call stack, loop stack, instruction decoder, interrupt control); the dual Address Generation Units (X AGU with 8 index/modulo register pairs, Y AGU with 4 index/modulo register pairs); the program, X, and Y address and data buses; the peripheral address/data MUX; the eight 32-bit data registers; the eight 72-bit accumulators; and the dual MAC/ALU data paths A and B.]

Figure 2-1. Cirrus Logic 32-Bit Architecture

The Cirrus Logic 32-bit DSP core consists of the following modules:

• Program control unit
• Parallel Data Paths (A and B).
• Parallel Address Generation Units (AGUs)

The AGUs contain:

• Twelve 16-bit index registers (i0-i11) for address generation
• Twelve 16-bit modulo-offset registers (nm0-nm11) that work in conjunction with the index registers to provide different addressing modes.

2.2 Data Path and Accumulators Unit

Figure 2-2 shows the data flow within the Data Path and Accumulator Unit. Each data path has four 32-bit general-purpose registers and four 72-bit accumulators (eight each, total). Each 72-bit accumulator is the concatenation of three registers: Guard, High, and Low.

[Figure 2-2: block diagram of the A and B data paths (identical bus widths). The 32-bit X and Y data buses connect to the 32-bit data registers (X0-X3, Y0-Y3); MAC A, ALU A, MAC B, and ALU B operate on the 72-bit accumulators (A0-A3, B0-B3); SRS A and SRS B return 32-bit values to the data buses.]

Figure 2-2. Data Flow within Data Path and Accumulators Unit

The Guard registers are 8 bits, the High and Low registers are 32 bits, and all parts can be addressed independently. See Figure 2-3. Each data path also has one Multiply-Accumulate unit (MAC), one Shifter/Rounder/Saturator (SRS), and one Arithmetic Logic Unit (ALU). The ALU is responsible for all the logical operations performed on the accumulators. The way the SRS handles data in the accumulator and transfers it to the data bus is explained later in this chapter.

    X Data Registers (bits 31:0):  x0, x1, x2, x3
    Y Data Registers (bits 31:0):  y0, y1, y2, y3

    A Accumulators    Guard (bits 71:64)   High (bits 63:32)   Low (bits 31:0)
                      a0g                  a0h                 a0l
                      a1g                  a1h                 a1l
                      a2g                  a2h                 a2l
                      a3g                  a3h                 a3l

    B Accumulators    Guard (bits 71:64)   High (bits 63:32)   Low (bits 31:0)
                      b0g                  b0h                 b0l
                      b1g                  b1h                 b1l
                      b2g                  b2h                 b2l
                      b3g                  b3h                 b3l

Figure 2-3.
Data Path Registers

When values are added successively to a particular accumulator, the running sum can exceed the maximum value that can be represented by 32 fixed-point bits. The guard bits temporarily accommodate this overflow. This is useful when adding a series of numbers whose final sum is within range, but whose intermediate sums can overflow before all the additions have completed.

Example 2-1

Consider this sum:

    0.5 + 0.75 - 0.3 - 0.8 - 0.6 - 0.4 - 0.25 + 0.5 = -0.6

The intermediate sums are:

    0.5 + 0.75 = 1.25 (overflow occurs; the value cannot be represented by 32 fixed-point bits, one more bit is needed)
    0.5 + 0.75 - 0.3 = 0.95
    0.5 + 0.75 - 0.3 - 0.8 = 0.15
    0.5 + 0.75 - 0.3 - 0.8 - 0.6 = -0.45
    0.5 + 0.75 - 0.3 - 0.8 - 0.6 - 0.4 = -0.85
    0.5 + 0.75 - 0.3 - 0.8 - 0.6 - 0.4 - 0.25 = -1.1 (overflow occurs; the value cannot be represented by 32 fixed-point bits, one more bit is needed)
    0.5 + 0.75 - 0.3 - 0.8 - 0.6 - 0.4 - 0.25 + 0.5 = -0.6

With these 8 guard bits, the numeric range of the accumulator is extended by a factor of 256, so that it can hold values from -256 up to (but not including) +256.

2.2.1 Data Representation

The data representation used in this processor is two's complement fractional notation. The 32-bit, 64-bit, and 72-bit fractional representations are shown in Figure 2-4, Figure 2-5, and Figure 2-6. The S bit is the sign bit. The X and Y data registers contain 32-bit operands, and the accumulators contain 72-bit operands which may be read out through the SRS as 32-bit or 64-bit operands. All internal ALU operations in the data path are 72 bits.

The 32-bit operand is in two's complement form: the leftmost bit is the sign bit, followed by the radix point and the 31-bit fractional part. The largest positive number that can be represented is 0x7fffffff (1 - 2^-31 in decimal), and the most negative number is 0x80000000 (-1.0 in decimal).
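The 32-bit fractional format described above can be checked with a short Python model. This is a sketch for illustration only; the names Q31_SCALE, to_q31, and from_q31 are hypothetical and are not part of CASM or any Cirrus tool. Values that would round up to +1.0 are clamped to 0x7fffffff here, which is a choice made for the sketch.

```python
Q31_SCALE = 1 << 31   # 2^31: scaling factor for 31 fractional bits


def to_q31(value):
    """Encode a real number in [-1, 1) as a 32-bit two's complement word."""
    if not -1.0 <= value < 1.0:
        raise ValueError("value is outside the 32-bit fractional range [-1, 1)")
    scaled = int(round(value * Q31_SCALE))
    scaled = min(scaled, Q31_SCALE - 1)    # clamp values that round up to +1.0
    return scaled & 0xFFFFFFFF             # keep the low 32 bits


def from_q31(word):
    """Decode a 32-bit two's complement word into a real number."""
    word &= 0xFFFFFFFF
    if word & 0x80000000:                  # sign bit set: value is negative
        word -= 1 << 32
    return word / Q31_SCALE


print(hex(to_q31(0.5)), hex(to_q31(-1.0)), from_q31(0x7FFFFFFF))

# The first intermediate sum of Example 2-1 no longer fits in 32 bits:
print(hex(to_q31(0.5) + to_q31(0.75)))     # larger than 0x7fffffff
```

The last line shows why the guard bits are needed: the raw sum of the encodings of 0.5 and 0.75 exceeds the largest positive 32-bit fractional value.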
[Figure 2-4: a0h, bits 31:0. Bit 31 is the sign bit S, followed by the radix point and the fractional part with bit weights 2^-1, 2^-2, 2^-3, ...]

Figure 2-4. 32-bit Fractional Representation

The 64-bit accumulator has the sign bit as the leftmost bit, followed by the radix point and the 63-bit fractional part.

[Figure 2-5: a0h (bits 63:32) and a0l (bits 31:0). Bit 63 is the sign bit S, followed by the radix point and the fractional part with bit weights 2^-1, 2^-2, 2^-3, ...]

Figure 2-5. 64-bit Fractional Representation

A 72-bit accumulator has the sign bit as the leftmost bit, followed by 8 integer bits, the radix point, and the 63-bit fractional part. The largest positive number that can be represented in an accumulator is 0x7f ffff ffff ffff ffff (256 - 2^-63 in decimal) and the most negative number is 0x80 0000 0000 0000 0000 (-256.0 in decimal).

[Figure 2-6: a0g (bits 71:64) holds the sign bit and the integer bits weighted 2^7 down to 2^1; a0h (bits 63:32) and a0l (bits 31:0) follow. The figure marks the signed integer part, the radix point, and the fractional part.]

Figure 2-6. 72-bit Fractional Representation

A comparison can be made between integer and fractional number representation. The range for integer representation is +/-2^(N-1), and for fractional representation is +/-1. To convert from an integer to a fraction, the integer is multiplied by a scaling factor. The representation of a result from an addition or subtraction is the same for both integer and fractional numbers. This is not true when the arithmetic operation is a multiplication or a division. The difference is that the extra bit obtained in integer multiplication acts as a duplicate sign bit in fractional multiplication. See Figure 2-7.

Integer Multiplication:

Multiplying two n-bit numbers results in a 2n-bit product. For example, 8 x 3 = 24, which is 0001 1000 as an 8-bit result; no shifting is needed.

Fractional Multiplication:

If we have a P.Q number, we have (P-1) integer bits and 1 sign bit. Q is the number of fractional bits, as shown in Figure 2-4.

    1.3 numbers:    1.000   (-1)
                    0.011   (3/8)

When doing fractional multiplication, extend the sign bits to the length of the product register:

        11111.000   (-1)
    x   00000.011   (3/8)
    ---------------
        11.101000   (8-bit result, which is a 2.6 number)

To format the answer back into a 1.7 number, shift it left by one bit, since there is an extra sign bit in the integer portion of the answer:

    11.101000 (2.6)  becomes  1.1010000 (1.7) = -1 + 2^-1 + 2^-3 = -3/8, the correct answer.

In general, when multiplying a P.Q number by a W.Z number, the result is a (P+W).(Q+Z) number. For 1.31 numbers, 1.31 x 1.31 = 2.62, and the resultant left shift formats the number back to 1.63.

Figure 2-7. Integer vs. Fractional Multiplication

2.2.2 Accumulator Data Transfers

A 32-bit value may be transferred from the X data bus or the Y data bus to an accumulator. The 32-bit value is loaded into the high portion of the accumulator and sign extended into the guard. The low portion of the accumulator is zeroed.

A 64-bit value may be transferred from the X data bus and Y data bus to an accumulator. The 32-bit value from the X data bus is loaded into the high portion of the accumulator and sign extended into the guard. The 32-bit value from the Y data bus is loaded into the low portion of the accumulator.

2.2.2.1 Move to Accumulator

Move from data register into 72-bit accumulator (a0=x0). See Figure 2-8 and Figure 2-9.

                        x0 (31:0)    a0g (71:64)   a0h (63:32)   a0l (31:0)
    Before execution    12345678     xx            xxxxxxxx      xxxxxxxx
    After execution     12345678     00            12345678      00000000

Figure 2-8.
Positive 32-bit Value

                        x0 (31:0)    a0g (71:64)   a0h (63:32)   a0l (31:0)
    Before execution    82345678     xx            xxxxxxxx      xxxxxxxx
    After execution     82345678     ff            82345678      00000000

Figure 2-9. Negative 32-bit Value

2.2.2.2 Moving from Accumulator

Each data path (A and B) has its own independent SRS unit as shown in Figure 2-2. The SRSs are the only interface to move a value from an accumulator in a data path to the internal X and Y data busses.

The SRS units are named for the order in which they process data: Shift first, Round second, Saturate last. When an accumulator is transferred to a data bus and the value does not fit in the destination, data saturation occurs and the limit bit in the Condition Code register is set. Examples of saturation are shown in Section 2.2.2.3.

The data shifters can shift the data coming from an accumulator one or two bits to the right, one bit to the left, or pass the data without shifting. The data in the accumulator remains unchanged. The shifts are controlled by the shift bits in the Mode register. Shifting facilitates the scaling of fixed-point data, which is useful in implementing block floating point algorithms. Examples of shifting are shown in Section 2.2.2.5.

The Rounder has four modes affecting how the data in the low register of the accumulator (i.e. a0l) is handled when an accumulator is moved onto the X or Y data bus:

• Truncate - The data in the low register is ignored.
• Add ½ then truncate - One-half of the least significant bit of the high register of the accumulator (0x00.00000000.80000000) is added to the data in the low register before truncation.
• Round to zero - Positive accumulators are simply truncated, but if the value of the accumulator is negative the high register is incremented by 1 before truncation.
  This exists for removing limit-cycle operations in IIR filters.
• Add dither then truncate - If the top 4 bits of the low register are larger (unsigned comparison) than a 4-bit random number, the high register is incremented by 1 before truncation. The 4-bit random number is actually bits 15, 13, 12, and 10 of a 16-bit random number that is seeded at reset. The A and B SRS units have individual 16-bit random numbers that are seeded differently at reset. The 16-bit random numbers are post-updated after each use by the individual SRS. In other words, moving data out of an A accumulator onto the X or Y data bus with rounding in this mode updates only the A SRS 16-bit random number after it has been used for that comparison.

Examples of rounding are shown in Section 2.2.2.4.

Any move from a full 72-bit accumulator to a 32-bit destination (X or Y data register, X or Y memory, peripheral space, etc.) is appropriately shifted, rounded, and saturated. Moves from any portion of an accumulator (for example, a0h, a3l, b2g, etc.) are not affected by the SRS unit. Additionally, the Bitwise Accumulator Move instruction (a0=+b3) does not utilize the SRS.

2.2.2.3 Saturation Examples

Figure 2-10, Figure 2-11, and Figure 2-12 are examples of saturation:

    a0g (71:64)   a0h (63:32)   a0l (31:0)     x0 (31:0)
    00            c0000000      00000000       7fffffff

Figure 2-10. Positive Saturation: x0=a0

Note: 0x00c000000000000000 (1.5) is limited to 0x7fffffff (.99999999953).

    a0g (71:64)   a0h (63:32)   a0l (31:0)     x0 (31:0)
    80            c0000000      00000000       80000000

Figure 2-11. Negative Saturation: x0=a0

Note: 0x80c000000000000000 (-254.5) is limited to 0x80000000 (-1).

    a0g (71:64)   a0h (63:32)   a0l (31:0)     x0 (31:0)
    ff            ffffffff      00000000       ffffffff

Figure 2-12.
No Saturation: x0=a0

Note: 0xffffffffff00000000 (-2^-31, approximately -.00000000047) remains unchanged as 0xffffffff (-2^-31).

2.2.2.4 Rounding Examples

Table 2-1 is an example of rounding.

Table 2-1. Result of x0=a0 for a Given Rounding Mode (Shifting Off)

    a0 Contents                  x0 Result Given Rounding Mode
    a0g  a0h       a0l           Truncate   Add .5     Round to 0   Dither
    00   00000001  80000000      00000001   00000002   00000001     00000001 or 00000002
    00   00000001  00000001      00000001   00000001   00000001     00000001 or 00000002
    ff   80000000  00000001      80000000   80000000   80000001     80000000 or 80000001
    ff   ffffffff  80000000      ffffffff   00000000   00000000     ffffffff or 00000000

2.2.2.5 Shifting Examples

Table 2-2, Table 2-3, and Table 2-4 are examples of shifting.

Table 2-2. Result of x0=a0 for a Given Shifting Mode with Rounding Set to Truncate (off)

    a0 Contents                  x0 Result Given Shifting Mode
    a0g  a0h       a0l           No shift   Right shift 1   Right shift 2   Left shift 1
    00   7fffffff  00000000      7fffffff   3fffffff        1fffffff        7fffffff
    01   80000001  80000000      7fffffff   7fffffff        60000000        7fffffff
    ff   00000000  00000000      80000000   80000000        c0000000        80000000
    40   00000000  40000000      7fffffff   7fffffff        7fffffff        80000000

Table 2-3. Result of x0=a0 for a Given Shifting Mode with Rounding Set to Add ½ then Truncate

    a0 Contents                  x0 Result Given Shifting Mode
    a0g  a0h       a0l           No shift   Right shift 1   Right shift 2   Left shift 1
    00   7fffffff  00000000      7fffffff   40000000        20000000        7fffffff
    01   80000001  80000000      7fffffff   7fffffff        60000000        7fffffff
    ff   00000000  00000000      80000000   80000000        c0000000        80000000
    40   00000000  40000000      7fffffff   7fffffff        7fffffff        80000000

Table 2-4.
Result of x0=a0 for a Given Shifting Mode with Rounding Set to Round to Zero

    a0 Contents                  x0 Result Given Shifting Mode
    a0g  a0h       a0l           No shift   Right shift 1   Right shift 2   Left shift 1
    00   7fffffff  00000000      7fffffff   3fffffff        1fffffff        7fffffff
    01   80000001  80000000      7fffffff   7fffffff        60000000        7fffffff
    ff   00000000  00000000      80000000   80000001        c0000001        80000000
    40   00000000  40000000      7fffffff   7fffffff        7fffffff        80000000

2.3 Parallel Address Generation Unit

This unit consists of two sets of 12 registers: the 16-bit Index (I) registers i0-i11 and the 16-bit modulo-offset registers nm0-nm11. The data flow for the Address Generation Unit (AGU) is shown in Figure 2-13. A modulo-offset register consists of a modulo portion, bits [15:12], and an offset portion, bits [11:0]. See Table 2-5 and Table 2-6. The offset portion is used to update the index register, and the modulo portion specifies the type of addressing:

• Linear
• Reverse binary
• Modulo

The offset portion is treated as a signed 12-bit number, and as such can update the address in the corresponding index register with any value from -2048 to 2047 (0x800-0x7ff).

[Figure 2-13: block diagram of the parallel AGU. One address ALU (post-increment, post-decrement) uses index registers i0-i3 and i8-i11 with modulo-offset registers nm0-nm3 and nm8-nm11 to drive the 16-bit X address; a second address ALU uses i4-i7 with nm4-nm7 to drive the 16-bit Y address. Multiplexers select among the X data bus, the Y data bus, and a 16-bit immediate from the opcode.]

Figure 2-13. Data Flow for the Parallel Address Generation Unit

Table 2-5. Index Registers

    Register Name    Bits
    i0               15:0
    i1               15:0
    i2               15:0
    i3               15:0
    i4               15:0
    i5               15:0
    i6               15:0
    i7               15:0
    i8               15:0
    i9               15:0
    i10              15:0
    i11              15:0

Table 2-6.
Increment-Modulo Registers

    Register Name   Modulo Field Name   Bits    Increment Field Name   Bits
    nm0             m0                  15:12   n0                     11:0
    nm1             m1                  15:12   n1                     11:0
    nm2             m2                  15:12   n2                     11:0
    nm3             m3                  15:12   n3                     11:0
    nm4             m4                  15:12   n4                     11:0
    nm5             m5                  15:12   n5                     11:0
    nm6             m6                  15:12   n6                     11:0
    nm7             m7                  15:12   n7                     11:0
    nm8             m8                  15:12   n8                     11:0
    nm9             m9                  15:12   n9                     11:0
    nm10            m10                 15:12   n10                    11:0
    nm11            m11                 15:12   n11                    11:0

2.3.1 Addressing Modes

2.3.1.1 Modulo Addressing

Modulo addressing can be used to implement circular buffers whose size is a power of 2, ranging from 4 to 32768. When an index register is incremented with the corresponding NM register set for modulo addressing, the index register wraps around to the beginning of the buffer when the end of the buffer is reached.

The most significant 4 bits of the NM register control whether and how modulo addressing is used. If set to a value between 0x1 and 0xe, modulo addressing is used with an address boundary of 2^(m+1). If set to 0x0, then linear addressing is used. If set to 0xf, reverse binary addressing is used. See Table 2-7.

Table 2-7. Addressing Modes, Defined by the NM Registers

    MS 4 bits of NM    Addressing Mode
    0x0                Linear Addressing
    0x1                Modulo 4
    0x2                Modulo 8
    0x3                Modulo 16
    0x4                Modulo 32
    0x5                Modulo 64
    0x6                Modulo 128
    0x7                Modulo 256
    0x8                Modulo 512
    0x9                Modulo 1024
    0xa                Modulo 2048
    0xb                Modulo 4096
    0xc                Modulo 8192
    0xd                Modulo 16384
    0xe                Modulo 32768
    0xf                Reverse Binary Addressing

To use modulo addressing, circular buffers must be placed in memory such that their base address is a multiple of their size. For example, to use modulo addressing on a 1024-sample (0x400) circular buffer the base address of the buffer must be 0x0000, 0x0400, 0x0800, 0x0c00, 0x1000, etc.
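The wrap-around behaviour follows from the alignment rule: because the base address is a multiple of the buffer size, only the low bits of the index register change. The Python sketch below models this under stated assumptions. The function names split_nm and post_update are hypothetical, the update is assumed to be smaller than the buffer size, and linear updates are assumed to wrap at 16 bits; reverse binary addressing is not modelled.

```python
def split_nm(nm):
    """Split a 16-bit NM value into (m, n): mode bits [15:12], signed offset [11:0]."""
    m = (nm >> 12) & 0xF
    n = nm & 0xFFF
    if n & 0x800:              # the offset is a signed 12-bit number
        n -= 0x1000
    return m, n


def post_update(index, m, step):
    """Model one index register update for the modulo field m and a given step."""
    if m == 0x0:                               # linear addressing
        return (index + step) & 0xFFFF         # assumption: 16-bit wrap
    if m == 0xF:
        raise NotImplementedError("reverse binary addressing is not modelled")
    size = 1 << (m + 1)                        # buffer size 2^(m+1)
    if abs(step) >= size:
        raise ValueError("step must be smaller than the buffer size in this sketch")
    base = index & ~(size - 1) & 0xFFFF        # aligned start of the buffer
    return base | ((index + step) & (size - 1))


# 1024-sample buffer at 0x0400 (m = 0x9): stepping past the end wraps to the base
print(hex(post_update(0x07FF, 0x9, +1)))       # 0x400
print(hex(post_update(0x0400, 0x9, -1)))       # 0x7ff
```

In this model an index register loaded with any address inside the buffer stays inside it for every update smaller than the buffer size.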
In modulo addressing mode, all index register updates (+/-1, +/-2, +/-n) will result in an address that is within the boundaries of the buffer, except for +/-n when n is greater than or equal to the buffer size, in which case the index register will jump out of the circular buffer.

2.3.1.2 Reverse Binary Addressing

Reverse binary addressing is useful for implementing Fast Fourier Transform (FFT) and Inverse Fast Fourier Transform (IFFT) algorithms, which convert signals from the time domain to the frequency domain and back. In writing the code for an FFT, it is necessary either to get the input data in reverse binary (bit-reversed) order or to extract the output data in reverse binary order. The number of data points in a block of data that is reverse binary addressed is always a power of 2.

Reverse binary addressing is selected by setting the value in the M register to 0xf. Suppose the data block is 2^k locations. The N register should be initialized to the value 2^(k-1). The index register i is initialized to any address between the lower and upper boundary. The lower boundary is k*2^t, where t is any integer. The upper boundary is (k*2^t) + (2^t - 1). The mode of addressing must be i1+=n.

2.3.1.3 Immediate Addressing

This addressing mode has both long word instruction (32 bits) and short word instruction (16 bits) versions. In the long word instruction, the address field is 16 bits, which allows access to up to 64k locations of X and Y memory. This addressing is used to transfer data from memory to registers.

For example:

    a0 = xmem[0x6540]

In the short word instruction, the address field is 6 bits, which allows access to the first 64 locations in X, Y, and XY memory. XY memory is the concatenation of X and Y memory with the same address as indicated by the 6-bit address field.
When XY memory is used as the source or destination of a data transfer, the destination/source should be either a pair of data path registers or an accumulator.

For example:

    x0,y0 = xymem[12]

or

    a0 = xymem[12]

The short word instruction can be used in conjunction with an arithmetic or logic instruction.

2.3.1.4 Indexed Addressing

This addressing mode uses long (32 bits), short (16 bits), and 8-bit instructions. Two instructions using 8 bits can be used simultaneously along with an arithmetic or logic instruction, but one move must use the X memory field and the other the Y memory field. One instruction using 16 bits of the program word can be used along with an arithmetic or logical instruction.

In the long word instruction (see Section 4.5), X, Y, and P memory can be addressed using indexed addressing. Index registers i0-i11 are used and they can be post-incremented or post-decremented. The updates available are +/-1, +/-2 and +/-n. The value of n is stored in the corresponding NM Register.

In the short word instruction (see Section 4.1), X, Y, and XY memory can be addressed. XY memory is the concatenation of the X and Y memory. XY memory is used for complex and double moves. When XY memory is used as the source or destination of a data transfer, the destination/source should be either a pair of data path registers or an accumulator. The index registers used here are i0-i11, and the updates available are +/-1, +/-2 and +/-n.

When performing parallel moves (see Section 4.2), use X memory with X data registers, A accumulators and i0 or i1 index registers. Use Y memory with Y data registers, B accumulators and i4 or i5 index registers. Index register updates are limited to +/-1 and +n.
An example of a valid instruction is:

    a0=a0+b0; x1=xmem[i0]; i0+=n; ymem[i4]=b1; i4-=1

2.3.1.4.1 Index Register Updates

All the standard index register updates can be used without an associated move. For example:

    i0+=n

These are parallel instructions that can be paired with a MAC/ALU instruction:

    a0 = x0*y0; i4-=2

Additional index register update instructions exist that are not available for use with moves. These instructions add an immediate value to an index register and place the result in a second index register:

    i0 = i3 + (0x1234)

The target index register can be the same as the first argument:

    i7 = i7 + (0x1234)

There are two forms of these instructions. One is a full word instruction that cannot be used with any parallel instruction; this form uses a full 16-bit operand for the immediate value. The second is a 16-bit instruction that can be paired with a MAC/ALU instruction, and is limited to a 6-bit immediate operand and a source index register of i8-i11:

    a0 = x0*y0; i0 = i9 + (0x3f)

These instructions are affected by the NM register associated with the source index register:

    i2 = (0x3f)        # last address of a modulo 16 buffer
    nm2 = (0x3000)     # set to modulo 16
    nop                # see “Index Register Loading Restrictions”
    i3 = i2 + (0x1)    # after execution, i3==0x30
                       # (nm3 is ignored)

Example 2-2 Syntax for Index Register Updates

    <index register> += 1 or 2 or n
    <index register> -= 1 or 2 or n

2.3.1.4.2 Parallel Index Register Updates

Index updates for parallel instructions can be executed in parallel with other parallel instructions. References in syntax statements to “update” almost always have the same update options:

    +1   -1   +2   -2   +n   -n

When an instruction has fewer options than noted here, the available update options are noted under the Restrictions subsection for each instruction. Index updates are always optional.
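The choice between the two forms of the immediate-add index update described in Section 2.3.1.4.1 can be expressed as a small check. The Python sketch below is illustrative only: the function name immediate_add_form is hypothetical, and it assumes the 6-bit immediate is unsigned (0 to 0x3f), which the text does not state explicitly.

```python
def immediate_add_form(src_index, immediate):
    """Return "short" if iX = iY + (imm) fits the 16-bit form, else "long".

    src_index: number of the source index register (0-11)
    immediate: value added to the source register (assumed unsigned)
    """
    if not 0 <= src_index <= 11:
        raise ValueError("index registers are i0-i11")
    if not 0 <= immediate <= 0xFFFF:
        raise ValueError("immediate does not fit in 16 bits")
    # 16-bit form: 6-bit immediate and a source register of i8-i11;
    # this form can be paired with a MAC/ALU instruction.
    if 8 <= src_index <= 11 and immediate <= 0x3F:
        return "short"
    # Full-word form: 16-bit immediate, no parallel instruction allowed.
    return "long"


print(immediate_add_form(9, 0x3F))     # i0 = i9 + (0x3f)
print(immediate_add_form(3, 0x1234))   # i0 = i3 + (0x1234)
```

Only the source register and the immediate value matter in this check; the target register is not restricted in the text above.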
2.3.1.4.3 Index Register Loading

Index registers can be loaded by using register-to-register moves:

    i0 = x0

or immediate data loads:

    i5 = (0x1234)

Additionally, a specialized version of the full-word immediate value index update instruction (see previous section) can be used to load data into an index register:

    i0 = (0) + (0x1234)

As stated, this is a full-word instruction that takes a full 16-bit immediate value, and as such it cannot be paired with any parallel instructions.

2.3.1.4.4 Index Register Loading Restrictions

Due to the pipelined nature of the AGU, instructions that utilize the AGU update index registers during the decode phase of the pipeline, which is the second of the three phases (Fetch - Decode - Execute). This implies that any modification to an index register that occurs during the execute phase will be undefined for any AGU operations in the subsequent instruction.

The main impact on programming is that an index register that is modified through a register-to-register move or an immediate load is unavailable for use or update by the AGU in the next instruction. In this example:

    i0 = (0x40)
    nop              # this is necessary
    x0 = xmem[i0]

A one-instruction buffer is required between loading and using i0. A nop was used here, but any instruction that does not require i0 would have sufficed (and is usually preferable to avoid wasting cycles). If an index register is used before it is ready, the assembler will warn the user.

Instructions that do not use the AGU are unaffected by this pipeline effect:

    i0 = x0
    x2 = i0          # no problem here...
Note that the immediate value index update instructions use the AGU to load/add the immediate value into the index register, so the result can be used immediately:

    i0 = (0) + (0x40)
    x0 = xmem[i0]      # no waiting necessary

Operations performed by an instruction during the Decode phase of the instruction pipeline can be lost if another instruction performs the same operation but in the Execute phase of the same cycle. In the example below, the second i0 assignment is not performed because the previous instruction performs an i0 assignment during its Execute phase. See Figure 2-14.

    BitSet (i0), (0xEEEE)
    i0 = (0) + (0xDDDD)

[Figure 2-14. Execute Phase vs. Decode Phase Assignments. Pipeline diagram (F, D, E phases) for Instruction 1 and Instruction 2: the i0 assignment marked 1 is performed successfully; the i0 assignment marked 2 is lost.]

2.4 Program Control Unit

The program control unit consists of a Program Counter (PC), two system stacks, and two control registers: the Mode register (MR) and the Condition Code register (CCR).

2.4.1 Program Counter

The PC is a 16-bit pointer used to indicate the location of the next instruction to fetch from program memory.

2.4.2 Subroutine Stack

The subroutine stack is used to store the return PC for subroutine calls. It is 16 bits wide and implemented as a 16-entry circular buffer with overflow and underflow interrupts. Each time there is a call instruction, the current PC is stored on the top of the stack and the call stack pointer is auto-incremented or auto-decremented, depending on the configuration of the jsr_mode register. Conversely, for a return instruction, the entry at the top of the stack is popped and used as the next PC value.
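The call and return behavior of the subroutine stack can be pictured with a small behavioral model. The Python sketch below is an illustration only, not DSP code: the class name CallStackModel, its method names, and the grow_up flag (which stands in for the jsr_mode direction setting) are hypothetical, and overflow and underflow interrupts are deliberately left out.

```python
class CallStackModel:
    """Behavioral model of a 16-entry circular return-address stack."""

    DEPTH = 16

    def __init__(self, grow_up=True):
        self.entries = [0] * self.DEPTH
        self.ptr = 0                       # index of the next entry to write
        self.step = 1 if grow_up else -1   # stands in for the jsr_mode setting

    def call(self, return_pc):
        # Store the 16-bit return PC, then move the stack pointer.
        self.entries[self.ptr] = return_pc & 0xFFFF
        self.ptr = (self.ptr + self.step) % self.DEPTH

    def ret(self):
        # Move the pointer back and use that entry as the next PC.
        self.ptr = (self.ptr - self.step) % self.DEPTH
        return self.entries[self.ptr]


stack = CallStackModel()
stack.call(0x0100)    # outer call
stack.call(0x0200)    # nested call
print(hex(stack.ret()), hex(stack.ret()))
```

Because the buffer is circular, a seventeenth nested call in this model silently overwrites the oldest return address, which is why the hardware provides overflow and underflow thresholds.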
2.4.3 Loop Stack

The loop stack is used to store the current do-loop state (last address, first address, and count) or do_patch state (patch length, last address, return address) when a new do-loop or do_patch is encountered prior to completing any preceding do-loop or do_patch. It is 49 bits wide and is implemented as an 8-entry circular buffer.

When a do-loop or do_patch is encountered, the state required to manage software flow control is kept in a 49-bit register that appears to software as two registers, lp_data1 (Table 2-14) and lp_data2 (Table 2-15). If software is executing a do-loop or do_patch and encounters another do-loop or do_patch, the state of the current do-loop or do_patch is pushed from lp_data to the loop stack, and the loop stack pointer is auto-incremented or auto-decremented depending on the configuration of the lst_mode register. The state of the new do-loop or do_patch is then placed in lp_data. Conversely, when a do-loop or do_patch has completed, the entry at the top of the stack is popped to lp_data, which restores the previous do-loop or do_patch state.

2.4.4 Subroutine Stack and Loop Stack Common Implementations

The two stacks operate independently of each other and consist of circular buffers. Each stack has programmable thresholds for both overflow and underflow, which can be set up to cause interrupts. Each stack can grow either up or down. Most configuration is via the registers jsr_mode, jsr_ovf, and jsr_unf for the subroutine stack or lst_mode, lst_ovf, and lst_unf for the loop stack. Software may directly read or write subroutine stack data via jsr_data and loop stack data via lst_data1 and lst_data2. Note that if the auto-update bits in jsr_mode or lst_mode are not cleared, then software reads and writes of the stacks will modify the respective stack pointers mr_jsr_ptr and mr_lst_ptr.
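The exchange between lp_data and the loop stack described in Section 2.4.3 can be sketched as a behavioral model. The Python code below is an illustration under stated assumptions, not a description of the hardware: the names LoopState, LoopStateModel, enter, and leave are hypothetical, the stack is assumed to grow upward, and the model pushes the current lp_data on every loop entry, including the outermost one (an assumption that is consistent with the note in Section 2.4.21 that the first data pushed after reset is meaningless).

```python
from collections import namedtuple

# Fields named in the text for a do-loop: last address, first address, count.
LoopState = namedtuple("LoopState", "last_addr first_addr count")


class LoopStateModel:
    """Model of lp_data backed by an 8-entry circular loop stack."""

    DEPTH = 8

    def __init__(self):
        self.lp_data = LoopState(0, 0, 0)   # state of the active loop
        self.stack = [None] * self.DEPTH
        self.ptr = 0                        # index of the next entry to write

    def enter(self, state):
        # Save the current state on the stack, then make the new loop current.
        self.stack[self.ptr] = self.lp_data
        self.ptr = (self.ptr + 1) % self.DEPTH
        self.lp_data = state

    def leave(self):
        # Pop the saved state back into lp_data.
        self.ptr = (self.ptr - 1) % self.DEPTH
        self.lp_data = self.stack[self.ptr]


model = LoopStateModel()
model.enter(LoopState(0x20, 0x10, 4))   # outer loop
model.enter(LoopState(0x18, 0x14, 2))   # nested loop
model.leave()                           # nested loop done, outer restored
print(model.lp_data)
```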
There are a total of five interrupt masks that are important to the proper operation of the stack interrupts. Each stack has two maskable interrupts, one each for overflow and underflow. All stack interrupts can be globally disabled by a global interrupt enable bit. Each individual stack interrupt has a mask which prevents the interrupt from getting queued or recorded. Clearing the global stack interrupt enable bit, on the other hand, only prevents the core from taking an interrupt request: if the individual interrupts are not masked, they are still queued and will be serviced once the global stack interrupt enable bit is set.

Each of the stacks can be modified by software. This is done through reads and writes to the jsr_mode (Table 2-8), lst_mode (Table 2-9), mr_jsr_ptr (Table 2-11), jsr_data (Table 2-12), mr_lst_ptr (Table 2-13), lst_data1 (Table 2-16), and lst_data2 (Table 2-17) registers. Note that the stack pointer auto-increment and auto-decrement bits should be disabled (in jsr_mode or lst_mode) before attempting to modify the contents of the stack, unless that behavior is desired. Though the pointers for each stack, mr_jsr_ptr and mr_lst_ptr, are fields in the Mode register, never modify the Mode register directly. The following Mode register fields should be accessed through separate registers: mr_jsr_ptr, mr_lst_ptr, mr_r, mr_s, or mr_sr.

Several additional registers are useful for handling overflow and underflow of the stacks: jsr_ovf (Table 2-18), jsr_unf (Table 2-19), lst_ovf (Table 2-20), and lst_unf (Table 2-21). When an underflow or an overflow condition occurs and the appropriate interrupt mask is set, interrupts are queued but held for execution. As long as the stack interrupt enable mask is set, the core will fetch an ISI (interrupt service instruction) from the stack ISR (interrupt service routine) table and execute it. By default, the stack ISR table is located immediately after the PIC (peripheral interrupt controller) ISR table.
Typically, the 32 ISIs for the PIC ISR table will be located at 0x0000-0x001f and the 4 ISIs for the stack ISR table will be located at 0x0020-0x0023. If the stack ISR table needs to be relocated, modify bits [15:2] of the stq_base register (Table 2-10) with the desired address. Bits [1:0] of the stq_base register always read as 0.

2.4.5 jsr_mode Register

    Bit:    15  14  13  12  11  10   9   8
    Field:  Reserved
    Value:   x   x   x   x   x   x   x   x

    Bit:     7   6   5   4   3   2   1   0
    Field:  (see Table 2-8)
    Value:   1   0   0   0   0   0   1   0

Table 2-8. jsr_mode Bit Definitions

    Bits   Field/Flag Name   Description
    15:8   Reserved          Reserved.
    7      jsr_wr_inc_dec    Auto-increment (1) / auto-decrement (0) on write.
    6      jsr_wr_ptr_en     Enable pointer auto-update on write.
    5      jsr_rd_inc_dec    Auto-increment (1) / auto-decrement (0) on read.
    4      jsr_rd_ptr_en     Enable pointer auto-update on read.
    3      jsr_ovf_imask     Overflow interrupt mask. (Disable overflow interrupt.)
    2      jsr_unf_imask     Underflow interrupt mask. (Disable underflow interrupt.)
    1      jsr_int_en        Call-stack interrupt enable.
    0      jsr_auto_stq      Auto-stack mode enable. (Reserved)

jsr_wr_inc_dec
    This feature is only available when the “auto-update on write” bit, jsr_wr_ptr_en, is set. If so, auto-increment or auto-decrement the respective stack pointer when the top of the stack is written.

jsr_wr_ptr_en
    When set, a write to the respective stack will update (auto-increment or auto-decrement) the stack pointer.

jsr_rd_inc_dec
    This feature is only available when the “auto-update on read” bit, jsr_rd_ptr_en, is set. If so, auto-increment or auto-decrement the respective stack pointer when the top of the stack is read.

jsr_rd_ptr_en
    When set, a read of the respective stack will update (auto-increment or auto-decrement) the stack pointer.

jsr_ovf_imask
    When zero, this bit disables interrupts that would otherwise be generated by an overflow condition.
jsr_unf_imask
    When zero, this bit disables interrupts that would otherwise be generated by an underflow condition.

jsr_int_en
    This bit is equivalent in function to the MR[7] bit, except it is used for stack interrupts. Clearing this bit prevents the DSP from taking requests for either underflow or overflow interrupts. However, interrupts are still queued, assuming the corresponding mask bits are set.

jsr_auto_stq
    This bit enables an unsupported stack mode. It should always be kept clear (0).

2.4.6 lst_mode Register

    Bit:    15  14  13  12  11  10   9   8
    Field:  Reserved
    Value:   x   x   x   x   x   x   x   x

    Bit:     7   6   5   4   3   2   1   0
    Field:  (see Table 2-9)
    Value:   1   0   0   0   0   0   1   0

Table 2-9. lst_mode Bit Definitions

    Bits   Field/Flag Name   Description
    15:8   Reserved          Reserved.
    7      lst_wr_inc_dec    Auto-increment (1) / auto-decrement (0) on write.
    6      lst_wr_ptr_en     Enable pointer auto-update on write.
    5      lst_rd_inc_dec    Auto-increment (1) / auto-decrement (0) on read.
    4      lst_rd_ptr_en     Enable pointer auto-update on read.
    3      lst_ovf_imask     Overflow interrupt mask. (Disable overflow interrupt.)
    2      lst_unf_imask     Underflow interrupt mask. (Disable underflow interrupt.)
    1      lst_int_en        Loop-stack interrupt enable.
    0      lst_auto_stq      Auto-stack mode enable. (Reserved)

lst_wr_inc_dec
    This feature is only available when the “auto-update on write” bit, lst_wr_ptr_en, is set. If so, auto-increment or auto-decrement the respective stack pointer when the top of the stack is written.

lst_wr_ptr_en
    When set, a write to the respective stack will update (auto-increment or auto-decrement) the stack pointer.

lst_rd_inc_dec
    This feature is only available when the “auto-update on read” bit, lst_rd_ptr_en, is set. If so, auto-increment or auto-decrement the respective stack pointer when the top of the stack is read.

lst_rd_ptr_en
    This bit controls whether a read of the respective stack will update (auto-increment or auto-decrement) the stack pointer.
lst_ovf_imask
    When zero, this bit disables interrupts that would otherwise be generated by an overflow condition.

lst_unf_imask
    When zero, this bit disables interrupts that would otherwise be generated by an underflow condition.

lst_int_en
    This bit is equivalent in function to the MR[7] bit, except it is used for stack interrupts. Clearing this bit prevents the DSP from taking requests for either underflow or overflow interrupts. However, interrupts are still queued, assuming the corresponding mask bits are set.

lst_auto_stq
    This bit enables an unsupported stack mode. It should always be kept clear (0).

2.4.7 stq_base Register

    Bit:    31 ... 16
    Field:  Reserved

    Bit:    15  14  13  12  11  10   9   8   7   6   5   4   3   2     1   0
    Field:  stq_isr_base_addr                                          Rsvd.
    Value:   0   0   0   0   0   0   0   0   0   0   1   0   0   0     x   x

Table 2-10. stq_base Bit Definitions

    Bits    Field/Flag Name      Description
    31:16   Reserved             Reserved.
    15:2    stq_isr_base_addr    ISR base address.
    1:0     Reserved             Reserved.

2.4.8 mr_jsr_ptr Register

This is the index of the next entry to which data will be pushed and/or the index of the last entry popped. It appears as a field in the Mode register but should be modified here.

    Bits:   15:4        3:0
    Field:  Reserved    mr_jsr_ptr
    Value:  all 0       0 0 0 0

Table 2-11. mr_jsr_ptr Bit Definitions

    Bits   Field/Flag Name   Description
    3:0    mr_jsr_ptr        Call stack pointer

2.4.9 jsr_data Register

The top of the call stack is (mr_jsr_ptr - 1) mod 16.

    Bits:   15:0
    Field:  jsr_data
    Value:  all 0

Table 2-12. jsr_data Bit Definitions

    Bits   Field/Flag Name   Description
    15:0   jsr_data          PC value at top of call stack.
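The bit assignments in Table 2-8 and Table 2-9 can be applied mechanically to any value written to jsr_mode or lst_mode. The Python sketch below is a reading aid rather than DSP code; the names FIELDS and decode_stack_mode are hypothetical, and the field names are given without the jsr_ or lst_ prefix because both registers share the same layout. The value 0xfa is the lst_mode setting used in the loop stack example later in this chapter.

```python
# (bit position, field name without the jsr_/lst_ prefix), per Tables 2-8 and 2-9
FIELDS = (
    (7, "wr_inc_dec"),
    (6, "wr_ptr_en"),
    (5, "rd_inc_dec"),
    (4, "rd_ptr_en"),
    (3, "ovf_imask"),
    (2, "unf_imask"),
    (1, "int_en"),
    (0, "auto_stq"),
)


def decode_stack_mode(value):
    """Split the low byte of a jsr_mode/lst_mode value into its bit fields."""
    return {name: (value >> bit) & 1 for bit, name in FIELDS}


for name, bit in decode_stack_mode(0xFA).items():
    print(f"{name:12s} {bit}")
```

Decoding 0xfa shows auto-increment enabled for both reads and writes, ovf_imask set, unf_imask clear, and int_en set, which agrees with the comment attached to that value in the example ("Enable auto inc on rd/wr, ovf exception").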
2.4.10 mr_lst_ptr Register

This is the index of the next entry to which data will be pushed and/or the index of the last entry popped. It appears as a field in the Mode register but should be modified here.

    Bits:   15:3        2:0
    Field:  Reserved    mr_lst_ptr
    Value:  all 0       0 0 0

Table 2-13. mr_lst_ptr Bit Definitions

    Bits   Field/Flag Name   Description
    2:0    mr_lst_ptr        Loop stack pointer

2.4.11 lp_data1 Register

    Bits:   31:16     15:0
    Field:  lp_lad    lp_fad
    Value:  all 0     all 0

Table 2-14. lp_data1 Bit Definitions

    Bits    Field/Flag Name   Description (do loop)               Description (do_patch)
    31:16   lp_lad            Top of loop stack last address      Top of loop stack last address
    15:0    lp_fad            Top of loop stack first address     Return address: just after do_patch instruction

2.4.12 lp_data2 Register

    Bits:   31:17       16      15:0
    Field:  Reserved    type    lp_cnt

Table 2-15. lp_data2 Bit Definitions

    Bits   Field/Flag Name   Description (do loop)       Description (do_patch)
    16     type              0 for do loop               1 for do_patch
    15:0   lp_cnt            Top of loop stack count     Length of patch

2.4.13 lst_data1 Register

The top of the loop stack is (mr_lst_ptr - 1) mod 8.

    Bits:   31:16     15:0
    Field:  lp_lad    lp_fad
    Value:  all 0     all 0

Table 2-16. lst_data1 Bit Definitions

    Bits    Field/Flag Name   Description (do loop)               Description (do_patch)
    31:16   lp_lad            Top of loop stack last address      Top of loop stack last address
    15:0    lp_fad            Top of loop stack first address     Return address: just after do_patch instruction

2.4.14 lst_data2 Register

The top of the loop stack is (mr_lst_ptr - 1) mod 8.

    Bits:   31:17       16      15:0
    Field:  Reserved    type    lp_cnt

Table 2-17. lst_data2 Bit Definitions

    Bits   Field/Flag Name   Description (do loop)       Description (do_patch)
    16     type              0 for do loop               1 for do_patch
    15:0   lp_cnt            Top of loop stack count     Length of patch

2.4.15 jsr_ovf Register

An exception occurs when mr_jsr_ptr is incremented past jsr_ovf and the exception is enabled in the jsr_mode register.

    Bits:   15:4        3:0
    Field:  Reserved    jsr_ovf
    Value:  all 0       1 1 1 1

Table 2-18. jsr_ovf Bit Definitions

    Bits   Field/Flag Name   Description
    3:0    jsr_ovf           Subroutine stack overflow threshold

2.4.16 jsr_unf Register

An exception occurs when mr_jsr_ptr is decremented to jsr_unf and the exception is enabled in the jsr_mode register.

    Bits:   15:4        3:0
    Field:  Reserved    jsr_unf
    Value:  all 0       0 0 0 0

Table 2-19. jsr_unf Bit Definitions

    Bits   Field/Flag Name   Description
    3:0    jsr_unf           Subroutine stack underflow threshold

2.4.17 lst_ovf Register

An exception occurs when mr_lst_ptr is incremented past lst_ovf and the exception is enabled in the lst_mode register.

    Bits:   15:3        2:0
    Field:  Reserved    lst_ovf
    Value:  all 0       1 1 1

Table 2-20. lst_ovf Bit Definitions

    Bits   Field/Flag Name   Description
    2:0    lst_ovf           Loop stack overflow threshold

2.4.18 lst_unf Register

An exception occurs when mr_lst_ptr is decremented to lst_unf and the exception is enabled in the lst_mode register.
    Bits:   15:3        2:0
    Field:  Reserved    lst_unf
    Value:  all 0       0 0 0

Table 2-21. lst_unf Bit Definitions

    Bits   Field/Flag Name   Description
    2:0    lst_unf           Loop stack underflow threshold

2.4.19 Mode Register

The Mode register is a 16-bit register defined as follows. Specific bits of the Mode register can be accessed for reading and writing through designated bitfield registers, shown in Table 2-22.

    Bits:   15:12        11         10:8         7        6:5        4    3    2    1    0
    Field:  mr_jsr_ptr   mr_int_p   mr_lst_ptr   mr_int   reserved   Ls   R1   R0   S1   S0

Table 2-22. Mode Register Bit Definitions

    Bits    Field/Flag Name    Description
    19      mr_jsr_ovf         Set whenever a call stack overflow occurs.
                               Note: Applicable only for CS48L20.
    18      mr_jsr_unf         Set whenever a call stack underflow occurs.
                               Note: Applicable only for CS48L20.
    17      mr_lst_ovf         Set whenever a loop stack overflow occurs.
                               Note: Applicable only for CS48L20.
    16      mr_lst_unf         Set whenever a loop stack underflow occurs.
                               Note: Applicable only for CS48L20.
    15:12   mr_jsr_ptr         Call stack pointer.
    11      mr_int_p           Set to previous value of mr_int whenever an interrupt
                               of any kind occurs; can be written by firmware but
                               need not be (CS48L20 only).
    10:8    mr_lst_ptr         Loop stack pointer.
    7       mr_int             Interrupt enable/disable bit.
    6:5     Reserved           Reserved.
    4       Ls                 Least significant bit - If set to one, data moved from
                               the low part of an accumulator (such as a0l) will be
                               logically shifted right one bit.
    3:2     mr_r (R1, R0)      Round mode bits. Defined as:
                                   R1 R0   Round Mode
                                   0  0    No round
                                   0  1    Add 0.5 then truncate
                                   1  0    Round to 0
                                   1  1    Add dither then truncate
                               Note: When setting these bits using the mr_r register,
                               bits [3:2] of the mr_r register must be set to affect
                               R1, R0.
    1:0     mr_s (S1, S0)      Shift mode bits. Defined as:
                                   S1 S0   Shift Mode
                                   0  0    No shift
                                   0  1    Shift right
                                   1  0    Shift right twice
                                   1  1    Shift left
                               Note: Register mr_sr can be used to set bits [3:0] of
                               the mode register with a single constant.
[19:16] mr_stq_queue_sticky is used to know whether two stack exceptions occurred at the same time so that after one is serviced, the other may also be. Otherwise, under certain conditions, a stack interrupt may be lost. Note: These are all sticky and must be cleared by firmware.

[11] mr_int_p is used to know whether to enable interrupts or not when returning from a stack ISR.

2.4.20 Condition Code Register

The Condition Code register contains flags that are affected by various instructions in the DSP. The bits of the Condition Code register are defined in Table 2-23.

    Bits:   15:8        7   6   5    4    3    2    1    0
    Field:  Reserved    Z   L   BS   AS   B0   A0   T1   T0

Table 2-23. Condition Code Register Bit Definitions

    Bits   Field/Flag Name   Description
    15:8   Reserved          Reserved.
    7      Z                 Zero bit - Set by the bit manipulation instructions.
    6      L                 Limit bit - Set when saturation occurs: after it is set,
                             it must be cleared by software.
    5      BS                B sign bit - Set when the B accumulator result is negative.
    4      AS                A sign bit - Set when the A accumulator result is negative.
    3      B0                B zero bit - Set when the B accumulator result is zero.
    2      A0                A zero bit - Set when the A accumulator result is zero.
    1:0    T1, T0            Shift mode status bits. T1 and T0 are set depending on
                             the [63:59] bits of the accumulator and the value of s1
                             and s0 in the MR. See example in Table 2-24.

Example of how T1 and T0 are affected by various accum + shift values:

[Table 2-24. T1, T0 with Various Accum + Shift Values. Columns: Accum values [63:59], T1, T0, Shift. Accumulator patterns listed: 00000 or 11111, 00001 or 11110, 00010 or 11101, 00011 or 11100, 00100 or 11011, 00101 or 11010, 0011x or 1100x, 01xxx or 10xxx. Shift outcomes listed: No shift, Shift, Shift twice, Not used.]

2.4.21 Loop Stack Example

The operation of the loop stack is best illustrated by working through an example.
Though the loop stack contains only eight entries, it is possible to extend it by using software to initialize and manage a much larger software stack. In this way, the loop stack appears larger to the software that uses it. In this example, the software consists of some initialization code and a stack overflow and stack underflow exception handler. The code that follows represents one possible way to implement a software loop stack. Better and more complex approaches may be used. In particular, more error handling might be added, or some hysteresis put into the underflow/overflow interaction to minimize stack exceptions in certain cases. Note that the subroutine stack operates similarly to the loop stack, except that it has 16 entries and its entries are each 16 bits.

For the loop stack, several factors do not vary:

• mr_lst_ptr always points to the next stack entry to be pushed and/or the last entry popped.
• lp_state is the next data to be pushed and/or the last entry popped.
• lst_data is always the data at entry (mr_lst_ptr - 1) mod 8.
• An overflow exception occurs when data is pushed onto the stack such that mr_lst_ptr is incremented past the overflow threshold and the exception is enabled, that is, all of the following are true:
    • lst_mode.lst_ovf_imask = 1
    • lst_mode.lst_int_en = 1
    • (mr_lst_ptr - 1) mod 8 = lst_ovf
    • previous clock cycle mr_lst_ptr = lst_ovf
• An underflow exception occurs when data is popped from the stack such that mr_lst_ptr is decremented to the underflow threshold, and the exception is enabled, that is, all of the following are true:
    • lst_mode.lst_unf_imask = 1
    • lst_mode.lst_int_en = 1
    • mr_lst_ptr = lst_unf
    • previous clock cycle (mr_lst_ptr - 1) mod 8 = lst_unf
• At hardware reset, lp_state is 0x0000000.
This means that the first data pushed onto the loop stack is always meaningless and should not be used.

Consider Figure 2-15 and Figure 2-16, which show the hardware related to the loop stack as it changes over time. Each figure shows:

• The loop stack, an eight-entry circular buffer for holding 49-bit data. The entries are numbered 0 through 7.
• The lp_state register for holding 49-bit data.
• The lst_data register, which is just one of the entries in the loop stack.
• The lst_unf, lst_ovf, and mr_lst_ptr values, represented by arrows. mr_lst_ptr is the unlabeled one.

The 49-bit data is flow control management information for either a do loop or do_patch instruction. The particular values are not important here. Instead, this data is represented by unique capital letters in the figures. To make use of the loop stack, it is assumed in this example that, during normal operation, lst_unf = (lst_ovf + 1) mod 8. This setting allows overflow and underflow to be detected properly. The left half of each diagram represents states before an overflow or underflow ISR, while the right half shows states after an overflow or underflow ISR.

[Figure 2-15. Loop Stack Overflow Example. Diagram of loop stack entries 0 through 7, the lp_state and lst_data registers, and the lst_unf, lst_ovf, and mr_lst_ptr arrows over time, across a sequence of pushes, the lst_ovf exception, and the lst_ovf ISR.]

Figure 2-15 illustrates how the hardware loop stack could overflow, causing seven entries to be pushed to the software stack. The loop stack initially contains four valid entries, A through D. Over time, four more entries, E through H, are pushed onto the stack. At this point, an overflow exception occurs. The overflow exception handler pushes seven entries from the hardware stack, A through G, onto the software stack.
It also updates the overflow and underflow thresholds. Upon return from the ISR, the loop stack has seven free spaces instead of none, and lp_state and lst_data are unchanged. Then, over time, another entry, I, is pushed onto the stack.

[Figure 2-16. Loop Stack Underflow Example. Diagram of loop stack entries 0 through 7, the lp_state and lst_data registers, and the lst_unf, lst_ovf, and mr_lst_ptr arrows over time, across a sequence of pushes and pops, the lst_unf exception, and the lst_unf ISR.]

Figure 2-16 shows how the hardware loop stack could then underflow, causing the same seven entries to be removed from the software stack and restored to the hardware stack. The loop stack initially contains two valid entries, H and I. Over time, two more entries, J and K, are pushed onto the stack. Then, four entries, H through K, are popped. At this point, an underflow exception occurs. The underflow exception handler pops seven entries from the software stack, A through G, and places them back in the hardware loop stack. It also updates the overflow and underflow thresholds. Upon return from the ISR, the loop stack has one free space instead of eight, lp_state is unchanged, and lst_data contains the next entry to be popped. Over time, another entry, G, is popped from the stack.

The examples below provide example code to set up and control the hardware and software loop stack. First consider the set-up code. The code sets the stack ISR base, and initializes the list stack hardware configuration such that:

• The mr_lst_ptr auto-increments on read and write.
• Overflow interrupt is enabled.
• Underflow interrupt is not enabled for now, since the stack is assumed to be empty.
• The overflow threshold is set such that on the eighth push, an exception occurs.
• The underflow threshold is set such that if a pop empties the stack, an exception would occur if enabled.

If the underflow interrupt is enabled when the stack is empty, every time the stack becomes empty again, an underflow interrupt would occur, though underflow has actually not occurred.

The regs_stack is a small X-memory area for storing at least DSP register i11 in all stack ISRs. The isr_stack is a small X-memory area for storing more DSP registers within all stack ISRs. The soft_lst_ptr is an X-memory address whose value points to the top of the software list stack, initially at address 0x100 in both X and Y memory. The software stack will grow upward from address 0x100 in both X and Y memory as the hardware stack overflows, and shrink downward as the hardware stack underflows.

Example 2-3

    # Initialization
    regs_stack   .bss (8)           # Stores i11 throughout stack ISRs
    isr_stack    .bss (8)           # Stores other registers during all stack ISRs
    soft_lst_ptr .bsc 1, (0x100)    # Stores list stack entries
    stq_base = (0x0024)             # Location of ISR table
    lst_mode = (0xfa)               # Enable auto inc on rd/wr, ovf exception
    lst_unf = (0x00)                #
    lst_ovf = (0x07)                #

The code starting at address 0x0024 should be something like the following:

Example 2-4

    # ISR table 0x0024 through 0x0025
    callint_stq ISR_lst_unf
    callint_stq ISR_lst_ovf

This set-up ensures that the list stack underflow and overflow ISRs will be called properly.

The overflow ISR takes the following actions:

• Saves some registers so they can be reused locally.
• Updates the overflow and underflow thresholds by decrementing each by one.
• Saves seven stack entries by copying them from the hardware stack to the software stack.
• Updates the software list stack pointer in memory by incrementing by seven.
• Enables underflow interrupts.
• Restores saved registers.
• Determines how to finish the ISR.
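The spill and restore behavior walked through with Figure 2-15 and Figure 2-16 can be reproduced with a short behavioral model. The Python sketch below is a deliberate simplification and not a translation of the ISRs: the class name ExtendedLoopStack and its methods are hypothetical, the hardware stack is modeled as a plain list rather than a circular buffer with thresholds, and the lp_state register is ignored.

```python
class ExtendedLoopStack:
    """Eight-entry hardware stack extended by an unbounded software stack."""

    HW_DEPTH = 8   # entries in the hardware loop stack
    SPILL = 7      # entries moved by each overflow or underflow handler

    def __init__(self):
        self.hw = []   # hardware stack contents, oldest first
        self.sw = []   # software stack contents, oldest first

    def push(self, entry):
        self.hw.append(entry)
        if len(self.hw) == self.HW_DEPTH:
            # Overflow handler: move the seven oldest entries to software.
            self.sw.extend(self.hw[:self.SPILL])
            del self.hw[:self.SPILL]

    def pop(self):
        entry = self.hw.pop()
        if not self.hw and self.sw:
            # Underflow handler: bring seven entries back from software.
            self.hw = self.sw[-self.SPILL:]
            del self.sw[-self.SPILL:]
        return entry


stack = ExtendedLoopStack()
for name in "ABCDEFGH":          # the eighth push triggers the overflow handler
    stack.push(name)
print(stack.hw, stack.sw)
```

Pushing I, J, and K and then popping four entries in this model empties the hardware stack and brings A through G back, after which the next pop returns G, matching the sequence described for Figure 2-16.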
After the ISR, the lp_state, mr_lst_ptr, and lst_data registers are unchanged, but seven entries are now available for pushing onto the hardware list stack. Looking at Figure 2-15, the value of N as used in the code example is 5.

Example 2-5

    #
    # loop stack overflow ISR
    #
    ISR_lst_ovf:
        xmem[regs_stack] = i11          # Push i11
        i11 = xmem[isr_stack]           # i11 is ISR stack pointer
        xmem[i11] = ccr; i11+=1         # Push registers to isr_stack
        xmem[i11] = x0; i11+=1
    ISR_lst_ovf_jmp:                    # Entry point from call stack ISRs
        xmem[i11] = i0; i11+=1          # Push register to isr_stack
        i0 = xmem[soft_lst_ptr]         # i0 is software list stack pointer

        # Save seven entries from hardware stack to software stack.
        # Update lst_unf and lst_ovf by decrementing each by 1.
        # Assume lst_ovf = N, lst_unf = mr_lst_ptr = (N+1) mod 8
        x0 = lst_data2                  # increment stack pointer to (N+2) mod 8
        xmem[i0] = lst_data1            # save stack entry (N+2) mod 8
        ymem[i0] = lst_data2; i0+=1
        xmem[i0] = lst_data1            # save stack entry (N+3) mod 8
        ymem[i0] = lst_data2; i0+=1
        xmem[i0] = lst_data1            # save stack entry (N+4) mod 8
        ymem[i0] = lst_data2; i0+=1
        xmem[i0] = lst_data1            # save stack entry (N+5) mod 8
        ymem[i0] = lst_data2; i0+=1
        xmem[i0] = lst_data1            # save stack entry (N+6) mod 8
        ymem[i0] = lst_data2; i0+=1
        x0 = mr_lst_ptr                 # x0 is (N+7) mod 8
        lst_ovf = x0                    # decrement overflow threshold to (N+7) mod 8
        xmem[i0] = lst_data1            # save stack entry (N+7) mod 8
        ymem[i0] = lst_data2; i0+=1
        x0 = mr_lst_ptr                 # x0 is N
        lst_unf = x0                    # decrement underflow threshold to N
        xmem[i0] = lst_data1            # save stack entry N
        ymem[i0] = lst_data2; i0+=1     # stack pointer becomes (N+1) mod 8
        xmem[soft_lst_ptr] = i0         # update software list stack pointer by
                                        # incrementing by 7
        i11-=1
        i0 = xmem[i11]                  # Restore register
        lst_mode = (0x00fc)             # Enable list stack underflow interrupt
        x0 = mr
        bitclr hi(x0), (0x0002)         # Clear list stack overflow request
        mr = x0
        jmp ISR_lst_finish              # Finishing is same for both list ISRs

The underflow ISR takes these actions:

• Saves some registers so they can be reused locally.
• Updates the software list stack pointer in memory by decrementing by seven.
• Updates the overflow and underflow thresholds by incrementing each by one.
• Restores seven stack entries by copying them from the software stack to the hardware stack.
• Checks if the software stack is empty; if so, disables underflow interrupts.
• Restores saved registers.
• Determines how to finish the ISR.

After the ISR, the lp_state register and mr_lst_ptr register are unchanged, and the value apparent in lst_data is what is expected, as seven entries have just been restored to the hardware list stack. Looking at Figure 2-16, the value of N as used in the code example is 5.

Example 2-6

    #
    # loop stack underflow ISR
    #
    ISR_lst_unf:
        xmem[regs_stack] = i11          # Push i11
        i11 = xmem[isr_stack]           # i11 is ISR stack pointer
        xmem[i11] = ccr; i11+=1         # Push registers to isr_stack
        xmem[i11] = x0; i11+=1
    ISR_lst_unf_jmp:                    # Entry point from call stack ISRs
        xmem[i11] = x1; i11+=1          # Push registers to isr_stack
        xmem[i11] = i0; i11+=1
        xmem[i11] = b0; i11+=1
        xmem[i11] = b1; i11+=1
        i0 = xmem[soft_lst_ptr]         # i0 is software list stack pointer
        nop
        i0 = i0-(7)
        xmem[soft_lst_ptr] = i0

        # Restore seven entries from software stack to hardware stack.
        # Update lst_unf and lst_ovf by incrementing each by 1.
        # Assume lst_ovf = (N-1) mod 8, lst_unf = mr_lst_ptr = N.
        x0 = mr_lst_ptr                 # x0 is entry just popped
        lst_ovf = x0                    # increment overflow threshold to N
        x1 = lst_data2                  # increment stack pointer to (N+1) mod 8
        x1 = mr_lst_ptr                 # x1 is (N+1) mod 8
        lst_unf = x1                    # increment underflow threshold to (N+1) mod 8
        lst_data1 = xmem[i0]            # restore stack entry (N+1) mod 8
        lst_data2 = ymem[i0]; i0+=1
        lst_data1 = xmem[i0]            # restore stack entry (N+2) mod 8
        lst_data2 = ymem[i0]; i0+=1
        lst_data1 = xmem[i0]            # restore stack entry (N+3) mod 8
        lst_data2 = ymem[i0]; i0+=1
        lst_data1 = xmem[i0]            # restore stack entry (N+4) mod 8
        lst_data2 = ymem[i0]; i0+=1
        lst_data1 = xmem[i0]            # restore stack entry (N+5) mod 8
        lst_data2 = ymem[i0]; i0+=1
        lst_data1 = xmem[i0]            # restore stack entry (N+6) mod 8
        lst_data2 = ymem[i0]; i0+=1
        lst_data1 = xmem[i0]            # restore stack entry (N+7) mod 8
        lst_data2 = ymem[i0]; i0+=1     # stack pointer becomes N

        b0 = 0                          # Disable underflow interrupts
        lo16(b0) = (soft_lst1 + 7)      # if software loop stack is empty
        b1 = i0
        b0 - b1
        if (b != 0) jmp ISR_lst_unf_finish
        lst_mode = (0x00f8)
    ISR_lst_unf_finish:
        i11-=1
        b1 = xmem[i11]; i11-=1          # Restore some saved registers
        b0 = xmem[i11]; i11-=1
        i0 = xmem[i11]; i11-=1
        x1 = xmem[i11]
        x0 = mr
        bitclr hi(x0), (0x0001)         # Clear list stack underflow request
        mr = x0
        jmp ISR_lst_finish              # Finishing is same for both list ISRs

Each list stack ISR ends the same way: a decision must be made between several options:

• jmp directly to a call stack interrupt handler:
    • overflow
    • underflow
• Return to interrupted code with stack interrupts enabled and:
    • Interrupts enabled
    • Interrupts disabled

To determine if it is necessary to jump directly to a call stack interrupt, it is only necessary to check the state of the mr_jsr_ovf and mr_jsr_unf bits. Otherwise, to determine whether to re-enable interrupts when returning to interrupted code, it is only necessary to check the mr_int_p bit.
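The decision described above can be summarized as a small decision function over the Mode register value. The Python sketch below is illustrative only: the function name finish_action and the returned strings are hypothetical, the bit positions are taken from Table 2-22, and two points are assumptions rather than statements from the guide, namely that a pending call stack overflow is handled before a pending underflow, and that mr_int_p = 1 means interrupts are re-enabled on return.

```python
MR_JSR_OVF = 1 << 19   # mr_jsr_ovf, Table 2-22
MR_JSR_UNF = 1 << 18   # mr_jsr_unf, Table 2-22
MR_INT_P = 1 << 11     # mr_int_p, Table 2-22


def finish_action(mr):
    """Choose how a list stack ISR finishes, given the Mode register value."""
    if mr & MR_JSR_OVF:
        # Assumption: overflow takes priority when both requests are pending.
        return "jump to call stack overflow handler"
    if mr & MR_JSR_UNF:
        return "jump to call stack underflow handler"
    if mr & MR_INT_P:
        # Assumption: mr_int_p holds the interrupt enable state to restore.
        return "return with interrupts enabled"
    return "return with interrupts disabled"


print(finish_action(MR_JSR_UNF | MR_INT_P))
```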
Example 2-7 illustrates these decisions as well as restoring registers as needed. 2-34 DS795UM11 Copyright 2013 Cirrus Logic 32-Bit DSP Internal Architecture and Programming Model 32-bit DSP Assembly Programmer’s Guide Example 2-7 ISR_lst_finish: bitchg hi(x0), (0x000f) # If call overflow/underflow, handle ... nop bittst hi(x0), (0x000c) if (z==0) jmp ISR_lst_pending_stq x0 = mr bittst lo(x0), (0x0800) if (z==0) jmp ISR_lst_ret 2 # No other stack interrupt. # If called with interrupts enabled, handle ... ISR_lst_retint_stq: i11-=1 x0 = xmem[i11]; i11-=1 ccr = xmem[i11] xmem[isr_stack] = i11 i11 = xmem[regs_stack] retint_stq # Return with interrupts enabled. # Restore more registers ISR_lst_ret: x0 = lst_mode bitset lo(x0), 0x0002 lst_mode = x0 x0 = jsr_mode bitset lo(x0), 0x0002 jsr_mode = x0 i11-=1 x0 = xmem[i11]; i11-=1 ccr = xmem[i11] xmem[isr_stack] = i11 i11 = xmem[regs_stack] ret # Return with interrupts disabled # Enable list stack interrupts. # Remember isr_stack pointer # Restore i11 # Enable call stack interrupts # Restore more registers # Remember isr_stack pointer # Restore i11 ISR_lst_pending_stq: # Call stack interrupt pending. bittst hi(x0), (0x0008) # Which one? if (z==0) jmp ISR_jsr_ovf_jmp # If call overflow, do it. jmp ISR_jsr_unf_jmp # else do call underflow. Note that both list stack ISRs share the same startup code. They save i11, ccr, and x0, and update i11 to point to the isr_stack. Similarly, on return from either handler, these registers are restored. The call stack ISRs should be very similar to the list stack ISRs, differing in the size of the hardware stack and the location of relevant status and control bits. Hence, it should be assumed in this example that i11, ccr, and x0 are still saved when jumping directly from a call stack ISR to a jump stack ISR or vice versa. In the interrupted code, only one list stack interrupt and one call stack interrupt should ever occur at the same time. 
The stack ISRs themselves do not utilize the list stack or call stack directly; that is, there are no calls, do_patch, or do loops within these ISRs. Hence, stack ISRs never trigger more stack ISRs.

2.5 Master State Registers (MSREGS)

The Master State Registers are registers that exist within the core, but are separate from the AGU and ALU/MAC registers. These registers control internal configuration and provide visibility into the current state. Specific full-word instructions exist for reading and writing the Master State Registers to and from memory, peripheral space, and other registers. Immediate data loads and the Bit Manipulation instructions also work with the Master State Registers. In all instructions, the Master State Registers are referred to in the syntax by their name as specified in Table 2-25:

    x0 = page_p
    bittst (mr), (0x0002)
    search_latch = xmem[0x1234]

Table 2-25. Master State Registers

Register                                                      Syntactical Name
Shift bits of Mode Register (S1 S0)                           mr_s
Round bits and right shift bits (Ls R1 R0)                    mr_r
Round bits and shift bits (Ls R1 R0 S1 S0)                    mr_sr
Condition Code Register                                       ccr
Stack base address                                            stq_base
Call stack mode register                                      jsr_mode
Loop stack mode register                                      lst_mode
Search Count                                                  search_cnt
Call stack pointer from the Mode Register                     mr_jsr_ptr
Call stack overflow value                                     jsr_ovf
Call stack underflow value                                    jsr_unf
Search Latch register                                         search_latch
Loop stack pointer from the Mode Register                     mr_lst_ptr
Loop stack overflow value                                     lst_ovf
Loop stack underflow value                                    lst_unf
Reserved                                                      N/A
P Page for external memory                                    page_p
X Page for external memory                                    page_x
Y Page for external memory                                    page_y
Random Number Register                                        rand
See Section 2.5.2.                                            rand_reset
Dither register A                                             rand_a
Dither register B                                             rand_b
Top of loop stack; 31:16 last address, 15:0 first address     lst_data1
Top of loop stack; 15:0 cnt                                   lst_data2
Current loop value; 31:16 last address, 15:0 first address    lp_data1
Current loop value; 15:0 cnt                                  lp_data2
Program Counter                                               pc
Program Counter for Breakpoints                               pc_bp
PC value at top of call stack                                 jsr_data

2.5.1 Search Registers

When finding the maximum or minimum value of a buffer in memory, it is often desirable to know not only what that value is, but also where it was located in the buffer. The Search Count Register, search_cnt, and Search Latch Register, search_latch, can be used to accomplish this task. Whenever a “Conditional Operation” is performed, the Search Count Register is incremented, and whenever a MAX or MIN instruction results in an accumulator move, the Search Count Register is copied into the Search Latch Register. Consider the following code fragment:

Example 2-8

    i0 = (X_BX_data1)        # i0 set to the beginning of a buffer
    search_cnt = i0          # Search Count Register and search
    search_latch = i0        # Latch Register set to i0
    a0 = xmem[i0]; i0+=1     # find minimum of buffer, leave in b0
    b0 = a0
    do (64),>
%:  if (a0<b0) b0=a0; a0 = xmem[i0]; i0+=1

At the end of the loop, the Search Latch Register contains the address of the minimum value, such that if you then execute:

Example 2-9

    i1 = search_latch
    nop
    x0 = xmem[i1]            # x0 now equals b0.

2.5.2 Random Number Generator

The DSP core has three hardware-based random number generators. The first one, called the PSR, generates 32-bit random data from a 16-bit seed which is updated each time a random number is generated.
The PSR register is the only one of the three that is readable (from a programmer's perspective). The other two 4-bit random number generators are called Dither A and Dither B, and each is generated independently from its own 16-bit seed. They are only used when the Mode Register bits MR[3:2], also known as MR[R1:R0], are both set and data is moved through the respective SRS A/B (Shift Round Saturate). The purpose of setting MR[3:2] is to select the “add dither and truncate” mode so dither can be added to the lower-order bits of the accumulator as it passes through the SRS.

New random numbers from the PSR are generated by reading the MSREG rand. By default, the seed for the PSR is 0x0000 unless the MSREG rand_reset is written. The MSREG rand_reset is limited to 16 bits; any higher-order bits written to it are ignored. The current PSR seed can be obtained by reading the MSREG rand_reset. Each time the PSR is read and a new random number is generated, a new PSR seed is written to the MSREG rand_reset.

The 16-bit Dither A and Dither B registers are updated with new values when data is moved through the respective SRS in the “add dither and truncate” mode. For Dither A, the move must use an An accumulator. For Dither B, the move must use a Bn accumulator. It is not possible to use SRS A or Dither A on a Bn accumulator. Likewise, it is not possible to use SRS B or Dither B on an An accumulator. By default, the seeds for Dither A and Dither B are 0x0010 and 0x0030, respectively. Each dither seed is limited to 16 bits; any higher-order bits that are written are ignored. The current Dither A and B registers can be read from MSREG rand_a and MSREG rand_b, respectively. The values are not updated when read by the programmer.
2.6 Interrupt Controller

The Interrupt Controller prioritizes 32 peripheral interrupt requests. The interrupt priority is fixed; 0 is highest and 31 is lowest. For a given interrupt service routine (ISR), a single instruction corresponding to that interrupt is inserted directly into the instruction pipeline. This is called an Interrupt Service Instruction (ISI). All ISIs reside in the first 32 locations of program memory (addresses 0x0000-0x001F).

The core has three categories of interrupts. Interrupts can be generated from the DBC (debug controller), the subroutine and do-loop stacks, and the PIC (peripheral interrupt controller). The DBC has the highest priority. Second in priority are the stack interrupts. Last in priority are the standard interrupts from the PIC.

2.6.1 Fast Interrupts

Interrupts that consist solely of a single instruction are referred to as fast interrupts.

2.6.2 Long Interrupts

If an interrupt needs to execute more than one instruction, the callint instruction is used for the ISI. This is referred to as a long interrupt. The callint instruction disables interrupts, pushes the program counter (PC) onto the subroutine stack, and starts executing the specified ISR. The final instruction of the ISR should be retint, which pops the PC and enables interrupts. The call or jmp instructions can also be used as ISIs, but they do not disable interrupts, which allows the possibility of code re-entrance.

2.6.3 Masking

There are two 32-bit registers that govern interrupt operation: IMask and IRMask. They are accessible from the peripheral space as imask and irmask.

2.6.3.1 IMask

IMask is the interrupt enable/disable mask. Each bit corresponds to one of the 32 possible interrupts, numbered 31-0 and corresponding to mask bits [31:0]. If a bit is 1, then that interrupt is enabled. The default value after reset is 0x0000.
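The combination of fixed priority and IMask described above can be sketched as a small behavioral model. This is only an illustration in Python, not a description of the hardware: the function name next_interrupt and the pending argument (a 32-bit word with one request bit per interrupt) are hypothetical, and only the peripheral (PIC) interrupts are modeled.

```python
from typing import Optional


def next_interrupt(pending: int, imask: int) -> Optional[int]:
    """Return the number of the interrupt that would be serviced next.

    pending -- hypothetical 32-bit word, bit n set if interrupt n is requested
    imask   -- IMask value, bit n set if interrupt n is enabled
    Interrupt 0 has the highest priority, so the lowest set bit wins.
    """
    enabled = pending & imask & 0xFFFFFFFF
    if enabled == 0:
        return None  # nothing requested, or every request is masked
    # enabled & -enabled isolates the lowest set bit
    return (enabled & -enabled).bit_length() - 1
```

With IMask at its reset value of 0x0000 the model never selects an interrupt, which matches the statement that a bit must be 1 for the interrupt to be enabled.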
2.6.3.2 IRMask

IRMask is the interrupt “run” mask, which affects how the DSP handles interrupts while in the Halt state. If the IRMask bit (31-0) corresponding to a particular interrupt is 1, then execution of that ISI brings the processor out of Halt. Otherwise, the ISI is executed but the processor remains in Halt and no further instructions are executed, even if the interrupt instruction is a callint, call, or jmp. For this reason, long interrupts that might be triggered while the processor is in Halt should have the corresponding bit in the IRMask set. The default value after reset is 0x0000.

2.7 Instruction Restrictions

There are some cases where certain combinations of instructions that affect MSREGs can produce an undesired result. These cases are limited to the modification of any MSREG by two different but overlapping operations. To guarantee that this problem does not occur, avoid modifying an MSREG in the cycle immediately before or after another operation that could affect bits in the same register. Adding a NOP before or after the MSREG access avoids the problem.

For example, a conditional jump could be taken incorrectly if, first, the Condition Code Register bits are set by a compare (for example, An - Am) and, second, the Condition Code Register is modified by an MSREG write (for example, bitset (ccr), (1<<6)). After both instructions have completed, the CCR may not contain the intended sign and zero flag values, because the bitset instruction performs a read-modify-write operation on the CCR. The read occurs before the result of the compare is stored in the CCR. After the CCR is modified by the ALU operation, the modify-write part of the bitset instruction completes and overwrites the condition flags that the following conditional jump depends on. Examples 2-10 and 2-11 illustrate one scenario with both bad and good coding styles.
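The ordering problem described above can be modeled outside the DSP. The following Python sketch is a behavioral illustration only, not a description of the actual pipeline. The flag positions in ALU_FLAGS_MASK and both function names are hypothetical; only the limit bit position (bit 6) is taken from the bitset operand used in this section.

```python
LIMIT_BIT = 1 << 6      # operand of bitset (ccr), (1<<6)
ALU_FLAGS_MASK = 0x0F   # hypothetical positions of the sign/zero flags


def overlapped(ccr: int, alu_flags: int, set_mask: int) -> int:
    """Compare immediately followed by bitset (no NOP in between)."""
    stale = ccr                                # bitset reads the CCR too early
    ccr = (ccr & ~ALU_FLAGS_MASK) | alu_flags  # compare result lands in the CCR
    ccr = stale | set_mask                     # bitset writes back its stale copy
    return ccr


def separated(ccr: int, alu_flags: int, set_mask: int) -> int:
    """Compare, NOP, then bitset: the read sees the updated flags."""
    ccr = (ccr & ~ALU_FLAGS_MASK) | alu_flags  # compare result lands in the CCR
    ccr = ccr | set_mask                       # bitset reads and writes fresh data
    return ccr
```

In the overlapped case the flags produced by the compare are lost, so a following conditional jump would test the old flag values; in the separated case both the flags and the limit bit survive.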
2.7.1 Code Example, Broken Code

Example 2-10

    # sample broken code
    A0 - A1
    bitset (ccr), (1<<6)          # Supposedly we only want to modify
                                  # the limit bit but the whole
                                  # register must be read then
                                  # written back
    if (condition) jmp success    # this may or may not work
                                  # depending on the previous state
                                  # of the CCR and the result from
                                  # the bitwise compare. Either way
                                  # the real result in the CCR was
                                  # overwritten by the bitset.

2.7.2 Code Example, Fixed Code

Example 2-11

    # sample fixed code
    A0 - A1
    nop                           # Added one NOP before a direct MSREG
                                  # modification
    bitset (ccr), (1<<6)
    if (condition) jmp success

2.7.3 Successive but Orthogonal Operations that Affect the CCR

It is possible to have successive but orthogonal operations that affect the Condition Code Register. For example, performing an addition or subtraction operation with the "A" accumulators affects the AS and A0 bits but not the BS or B0 bits, and vice versa. The following code illustrates this behavior:

    ## Starting state: The CCR = 0
    uhalfword(a0)=(0)
    uhalfword(a1)=(1)
    a2=a0-a1                      # only the AS and A0 bits are affected
    uhalfword(b0)=(0)
    uhalfword(b1)=(1)
    b2=b0-b1                      # only the BS and B0 bits are affected
    if (b<0) jmp>do_something

2.7.4 If Statements and the CCR

If statements, such as "if (a<0)" and "if (b==0)", do not alter the contents of the CCR. Therefore, it is possible to have consecutive if statements, as shown in the following example:

    uhalfword(a0)=(0)
    uhalfword(a1)=(1)
    a2=a0-a1
    if (a < 0) jmp > process_channel_0     ### if statements do not alter the CCR.
    if (a == 0) jmp > process_channel_1    ### so a second if statement can be placed here
    #process channel 2                     ### and the "else" operations can begin here
    ..
    ..
    ret
%process_channel_0:
    ..
    ..
    ret
%process_channel_1:
    ..
    ..

Chapter 3
Full Word Instructions

3.1 Assembly Language Syntax

The length of the DSP program word is 32-bits, and the assembler allows one 32-bit program word per line. Some instructions use 32 bits of the program word (long word instructions) while others use 16 bits of the program word (short word instructions). When there are two parallel data moves in an instruction, each parallel move uses 8 bits of the 16-bit short word instruction. Any parallel move(s) (16 bits) can be combined with any arithmetic or logic instruction (16 bits) to form a complete 32-bit instruction word. See Figure 3-1. Only labels can occupy the first column of the line; the instruction may be located anywhere else within the line.

label: if (a!=0) jmp >    # comment
    Optional Label: label:
    Full Word Instruction (32-bits): if (a!=0) jmp >
    Optional Comment: # comment

label: a0+=x0*x2; b0+=x0*y2; x0,y0=xymem[i0];i0-=n    # comment
    Optional Label: label:
    Arithmetic, Accumulator or Logic Instruction (16-bits): a0+=x0*x2; b0+=x0*y2;
    Optional Parallel Move (16-bits): x0,y0=xymem[i0];i0-=n
    Optional Comment: # comment

label: b3=0; x3=xmem[i0]; ymem[i4]=b0;i4+=1    # comment
    Optional Label: label:
    Arithmetic, Accumulator or Logic Instruction (16-bits): b3=0;
    X Memory Data Move (8 bits of parallel move): x3=xmem[i0];
    Y Memory Data Move (8 bits of parallel move): ymem[i4]=b0;i4+=1
    Optional Comment: # comment

Figure 3-1. Assembler Example: 32-bit Instruction Word

Some arithmetic instructions allow dual accumulator destinations. For example, the instruction:

    a3=a1+=x2*x2

translates to, “Square x2, add result of multiplication to a1, store final result in a1 and a3.” In this case the previous value of a3 is irrelevant.
The valid accumulator destination value pairs are:
• 1 and 0
• 3 and 1
• 0 and 2
• 2 and 3

3.2 Conventions

The conventions for certain syntax terms used in this manual are explained in Table 3-1.

Table 3-1. Syntax Terms Used in this Manual

Terms      Definitions
Accum      Any Accumulator (a0-a3 or b0-b3)
Any Reg    Any Register (x0-x3, y0-y3, a0-a3, a0h-a3h, a0l-a3l, b0-b3, b0h-b3h, b0l-b3l, i0-i11, or nm0-nm11)
DP Reg     Any Data Path Register (x0-x3, y0-y3, a0-a3, or b0-b3)
MS Reg     ccr, dbc_cmd, dbc_d1, dbc_d2, dbc_io, dbc_status, iic_addr, iic_mask, jsr_data, jsr_mode, jsr_ovf, jsr_unf, lp_data1, lp_data2, lst_data1, lst_data2, lst_mode, lst_ovf, lst_unf, mr, mr_jsr_ptr, mr_lst_ptr, mr_r, mr_s, mr_sr, page_p, page_x, page_y, pc, pc_bp, rand, rand_a, rand_b, rand_reset, rx_in, search_cnt, search_latch, stq_base

Those parts of instructions that appear in the format: a0 = x0*y0 are optional, which means that the instruction can take any of the following forms:

    a0 = x0*y0
    a0 += x0*y0
    a0 -= x0*y0

3.3 Execution Control Instructions

3.3.1 do - Start Hardware Loop

Repeat a set of instructions count times, from the instruction following DO (first address) through the instruction at label (last address). Count is either a 10-bit immediate value or the 16-bit value in an index register. A count of zero is not allowed. Valid values for the 10-bit immediate number are 1-1024, where 1024 is encoded in the instruction word as zero. Upon finishing the last instruction of the last iteration of the loop, the PC is set to the first instruction following the last address of the loop as specified in the DO instruction. This means that nested do-loops cannot share a last address.
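The encoding rule for the 10-bit immediate count (1-1024, with 1024 stored as zero) can be sketched as follows. This Python model is illustrative only; the function names are hypothetical, and nothing here describes where the field sits in the instruction word.

```python
def encode_do_count(count: int) -> int:
    """Map a loop count of 1..1024 to the 10-bit field value."""
    if not 1 <= count <= 1024:
        raise ValueError("immediate count must be in the range 1-1024")
    return count & 0x3FF  # 1024 wraps to 0, every other value is unchanged


def decode_do_count(field: int) -> int:
    """Map a 10-bit field value back to the loop count."""
    if not 0 <= field <= 0x3FF:
        raise ValueError("field must fit in 10 bits")
    return field if field != 0 else 1024
```

Because zero iterations is not a valid request, the otherwise unused field value 0 is free to represent 1024.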
Assembler Syntax 1: do (Index Register), label# i = 0 to 11 do (Index Register), > where: Index Register = i0, i1,...,i7 Examples: #example using a label do (i0), label label: nop #example using a local symbol do (i2), > %: nop 3-2 Copyright 2013 Cirrus Logic, Inc. DS795UM11 Full Word Instructions 32-bit DSP Assembly Programmer’s Guide Flags Affected: None Assembler Syntax 2: do (10-bit count), label do (10-bit count), > Examples: #example using a label do 1024, label label: nop #example using a local symbol do 1024, > %: nop Flags Affected: None 3.3.2 enddo - End Current Do-Loop Pops the do-loop stack pointer. Assembler Syntax: enddo Example: enddo Flags Affected: None DS795UM11 Copyright 2013 Cirrus Logic, Inc 3-3 Full Word Instructions 32-bit DSP Assembly Programmer’s Guide 3.3.3 do_patch - Jump to Patch Jump to a set of instructions, then start at a first address and execute through a last address or for a specific number of cycles. Upon finishing the last instruction, the PC is set to the first instruction following the do_patch instruction. Nested do-loops or do-patches cannot share a last address. The do_patch instruction allows a programmer to point to and run a piece of patch code in another location in the code or in ROM. There are two forms of the do_patch instruction. • Form one uses a 10-bit immediate value, count, that specifies the number of instructions in the patch. That is, the last address of the patch is calculated as: (first address + count - 1) modulo 0x10000. Valid values for count are 1-1024. • Form two uses a 16-bit value in an index register to specify the last address of the patch. Note that this is not an instruction count but an absolute address. The do_patch instruction utilizes the same loop stack as the do instruction. CAUTION: Do not execute an enddo instruction with a do_patch without also being within a do-loop in the patch, as the resulting behavior is unpredictable. 
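The last-address calculation for form one of do_patch can be checked with a short Python sketch. The function name is hypothetical; the formula is the one given in the first bullet above, including the wrap at 0x10000.

```python
def patch_last_address(first_address: int, count: int) -> int:
    """Last address executed by do_patch form one (10-bit count)."""
    if not 0 <= first_address <= 0xFFFF:
        raise ValueError("first address must be a 16-bit program address")
    if not 1 <= count <= 1024:
        raise ValueError("count must be in the range 1-1024")
    return (first_address + count - 1) % 0x10000
```

For a two-instruction patch starting at 0x0100 the last address is 0x0101; a patch that starts at 0xFFFF with a count of 2 wraps around to 0x0000.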
Assembler Syntax 1: do_patch label, (10-bit count) Example 1: start: end: nop nop do_patch Start, (2) Flags Affected: None Note: Valid values are 1-1024. 1024 is encoded as 0. Assembler Syntax 2: do_patch label, (Index Register) where: Index Register = i0, i1,...,i7 Example 2: start: nop end: nop i0 = end nop do_patch Start, (i0) 3-4 Copyright 2013 Cirrus Logic, Inc. DS795UM11 Full Word Instructions 32-bit DSP Assembly Programmer’s Guide Flags Affected: None 3.3.4 jmp - Jump Jump to 16 bit immediate address or to the address specified in the index register. The index register can be updated. Assembler Syntax 1: jmp label Example 1: jmp label jmp < Flags Affected: None Assembler Syntax 2: jmp [;In <register update.] Example 2: jmp (i0) jmp (i2); i2+=2 Flags Affected: None Restrictions: Register Update: no update +1 -1 +n +2 -2 -n Note: Where n is the offset value in the Modulo-Offset register corresponding to the specified I register (that is, i0 implies nm0). DS795UM11 Copyright 2013 Cirrus Logic, Inc 3-5 Full Word Instructions 32-bit DSP Assembly Programmer’s Guide 3.3.5 if - Jump Conditionally Jump conditionally to “label.” The PC will be updated with the address in “label” when the condition is true. See Section 5.2. It may be helpful to review the instructions contained in this section as they are often used to set the condition required for jumping. The instructions that do not modify the contents of the accumulators being compared can be particularly useful. Assembler Syntax: if (condition) jmp label Example: if (a==0) jmp label if (!limit) jmp > label: nop %: nop Flags Affected: None Restrictions: a a a a a a z z b b b b b b == != < >= <= > != == == != < >= <= > 0 0 0 0 0 0 0 0 0 0 0 0 0 0 limit (limit bit set in Modulo-Offset register) !limit (limit bit not set in Modulo-Offset register) 3-6 Copyright 2013 Cirrus Logic, Inc. 
DS795UM11 Full Word Instructions 32-bit DSP Assembly Programmer’s Guide 3.3.6 call - Jump To Subroutine Jump unconditionally to subroutine in 16 bit address label or the address in the index register. The index register can be updated. Only the PC is saved on the stack. Assembler Syntax 1: call label Example 1: call label call > Flags Affected: None Assembler Syntax 2: call (Index Register); index register update where: Index Register = i0, i1,...,i7 Example 2: call (i7); i7-=2 Flags Affected: None Restrictions: Register Update: no update +1 -1 +n +2 -2 -n DS795UM11 Copyright 2013 Cirrus Logic, Inc 3-7 Full Word Instructions 32-bit DSP Assembly Programmer’s Guide 3.3.7 callint - Answer Interrupt Identical to the CALL instruction, except that interrupts are disabled. It can only be used with a 16-bit address – there is no index register mode. Uses the subroutine stack. Assembler Syntax: callint label Example: callint label Flags Affected: None 3.3.8 callint_stq - Answer Stack Interrupt Identical to the CALL instruction, except that standard interrupts and stack interrupts are disabled. It can only be used with a 16-bit address – there is no index register mode. Uses the subroutine stack. Assembler Syntax: callint_stq label Example: callint_stq label Flags Affected: None 3.3.9 ret - Return From Subroutine Return from subroutine. Pops the return address from subroutine stack and assigns it to the PC. Assembler Syntax: ret Example: retFlags Affected: None 3-8 Copyright 2013 Cirrus Logic, Inc. DS795UM11 Full Word Instructions 32-bit DSP Assembly Programmer’s Guide 3.3.10 retint - Return From Interrupt Return from interrupt. Pops the return address from subroutine stack, assigns it to the PC, and enables interrupts. Assembler Syntax: retint Example: retint Flags Affected: None 3.3.11 retint_stq - Return From Stack Interrupt Return from stack interrupt. Pops the return address from subroutine stack, assigns it to the PC, enables interrupts, and enables stack interrupts. 
Assembler Syntax: retint_stq Example: retint_stq Flags Affected: None 3.3.12 inten - Enable Interrupts Enables interrupts. Assembler Syntax: inten Example: inten Flags Affected: None DS795UM11 Copyright 2013 Cirrus Logic, Inc 3-9 Full Word Instructions 32-bit DSP Assembly Programmer’s Guide 3.3.13 intdis - Disable Interrupts Disables interrupts. Assembler Syntax: intdis Example: intdis Flags Affected: None 3.3.14 halt - Stop Further Execution Stop execution and enter low-power wait state. Assembler Syntax: halt Example: halt Flags Affected: None 3.3.15 nop - No Operation Perform no operation. Assembler Syntax: nop Example: nop Flags Affected: None 3-10 Copyright 2013 Cirrus Logic, Inc. DS795UM11 Full Word Instructions 32-bit DSP Assembly Programmer’s Guide 3.3.16 _breakpt - Breakpoint Instruction Stop execution and enter low-power wait state. Assembler Syntax: _breakpt Example: _breakpt Flags Affected: None DS795UM11 Copyright 2013 Cirrus Logic, Inc 3-11 Full Word Instructions 32-bit DSP Assembly Programmer’s Guide 3.4 64-bit Peripheral Moves 3.4.1 XY Register Pair = ext(16-bit Address) 64-bit data transfer from peripheral space to an XY register pair. Assembler Syntax: XnYn Register Pair = ext(16-bit Address) where: x=0,1,...,3, y=0,1,...,3 Example: x0,y0 = ext(0x0010) Flags Affected: None 3.4.2 Accum = ext(16-bit Address) 64-bit, sign-extended data transfer from peripheral space to an accumulator. Assembler Syntax: Accum = ext(16-bit Address) where: Accum = a0,a1,...a3, b0,b1,...ba3 Example: a0 = ext(0x1234) Flags Affected: None Restrictions: Register: x0,y0- x3,y3 a0 - a3 b0 - b3 3-12 Copyright 2013 Cirrus Logic, Inc. DS795UM11 Full Word Instructions 32-bit DSP Assembly Programmer’s Guide 3.4.3 ext(16-bit Address) = XY Register Pair 64-bit data transfer from an XY register pair to peripheral space. Assembler Syntax: ext(16-bit Address) = XY Register Pair Example: ext(0x0010) = x1,y1 Flags Affected: None. 
3.4.4 ext(16-bit Address) = Accum 64-bit data transfer from an accumulator to peripheral space. Data from an accumulator does pass through the SRS unit and is affected accordingly. Assembler Syntax: ext(16-bit Address) = Accum Example: ext(0x0010) = x1,y1 ext(0x1234) = b3 Flags Affected: L limit T1, T0 Shift bits Note: After the L, T1, and T0 flags are set, they must be cleared manually by the user. T0 and T1 will only have values of 10b, 01b, or 00b. Restrictions: Register: x0,y0- x3,y3 a0 - a3 b0 - b3 DS795UM11 Copyright 2013 Cirrus Logic, Inc 3-13 Full Word Instructions 32-bit DSP Assembly Programmer’s Guide 3.4.5 logexp = XY Register Pair Perform primitive operations that may be used to approximate divide, power, and square root functions. Data can be sourced from the XY registers or from the X or Y input mux. Operations are pipelined in two clock cycles so data can not be read until after one cycle. Since the operation is pipelined, a second operation can be started before the first one completes. 
Assembler Syntax: logexp logexp logexp logexp logexp logexp logexp logexp logexp logexp logexp logexp logexp logexp logexp logexp X=Cmd_X(Mux_X(Xn)) Y=Cmd_Y(Mux_Y(Yn)) X=Cmd_X(Mux_X(Xn)) Y=Cmd_Y(X-Y) X=Cmd_X(Mux_X(Xn)) Y=Cmd_Y(X>>1) X=Cmd_X(Mux_X(Xn)) Y=Cmd_Y(Xn) X=Cmd_X(X-Y) Y=Cmd_Y(Mux_Y(Yn)) X=Cmd_X(X-Y) Y=Cmd_Y(X-Y) X=Cmd_X(X-Y) Y=Cmd_Y(X>>1) X=Cmd_X(X-Y) Y=Cmd_Y(Xn) X=Cmd_X(X>>1) Y=Cmd_Y(Mux_Y(Yn)) X=Cmd_X(X>>1) Y=Cmd_Y(X-Y) X=Cmd_X(X>>1) Y=Cmd_Y(X>>1) X=Cmd_X(X>>1) Y=Cmd_Y(Xn) X=Cmd_X(Xn) Y=Cmd_Y(Mux_Y(Yn)) X=Cmd_X(Xn) Y=Cmd_Y(X-Y) X=Cmd_X(Xn) Y=Cmd_Y(X>>1) X=Cmd_X(Xn) Y=Cmd_X(Xn) Example: logexp X=log(norm64(x0)) logexp X=exp(X-Y) nop x0,y0 = logexp Y=log(norm32(y0)) Y=sm(X-Y) # pipeline delay before reading Flags Affected: None Normalization: X is normalized to 16 bit float point data in the following format: 15 14 13 12 7 6 0 2 sign bit + 1bit 0 + 6 bit exponent + 7 bit significand Bit[15] Bit[14] Bit[13] Bit[12] = S = S & norm64 = 0 is 1 if a 64-bit normalization is performed. Otherwise, it is 0. The data represented by above format is: 0.[1,significant]*2^exponent Example 1: uhalfword(x0) = (0x1000) # x0 = 0x00001000 3-14 Copyright 2013 Cirrus Logic, Inc. DS795UM11 Full Word Instructions 32-bit DSP Assembly Programmer’s Guide uhalfword(y0) = (0x100) # y0 = 0x00000100 logexpX = nop(norm32(x0)) Y = nop(norm32(y0)) nop x1, y1 = logexp x1 will have a value of 0x06800000 and y1 will have a value of 0x04800000. The least significant 16-bits can be ignored. Expanding the most significant 16-bits in binary: Bit Register X1 Register Y1 15 (Sign) 14 (Sign & norm64) 7–12 0–6 0 0 001101 (13) 0000000 0 0 001001 (9) 0000000 Note: As is typical in floating point representation, after normalization, the first bit of a non-zero input is always 1, and hence, is not stored. The significant after normalization in the above example would be 0x8000–since the most significant 1 is implicit, it is not stored and hence bits 0–6 in the output are 0. 
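The normalized format can be reproduced for the values in Example 1 with a small Python model. This is a sketch under stated assumptions, not the hardware algorithm: the significand is assumed to be truncated rather than rounded, a zero input is assumed to produce exponent 0, negative inputs are not modeled, and the function names simply mirror the mux names.

```python
def norm32(x: int) -> int:
    """16-bit normalized word for a non-negative 32-bit input.

    Layout used: bits 12:7 hold the exponent, bits 6:0 hold the
    significand without its implicit leading 1.
    """
    if not 0 <= x < (1 << 31):
        raise ValueError("only non-negative 32-bit inputs are modeled")
    if x == 0:
        return 0  # assumption: zero maps to exponent 0, significand 0
    exponent = x.bit_length()  # x = 0.1xxxxxxx (binary) * 2**exponent
    significand = ((x << 8) >> exponent) & 0x7F  # 7 bits after the leading 1
    return (exponent << 7) | significand


def norm64(x: int) -> int:
    """Same as norm32, but the input is the upper half of a 64-bit value."""
    return norm32(x) + (32 << 7)  # exponent grows by 32, which sets bit 12


def to_register(word16: int) -> int:
    """Place the 16-bit result in the most significant half of a register."""
    return (word16 & 0xFFFF) << 16
```

For the inputs of Example 1, to_register(norm32(0x1000)) gives 0x06800000 and to_register(norm32(0x100)) gives 0x04800000, matching the x1 and y1 values quoted above.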
Example 2:

    uhalfword(x0) = (0x1000)      # x0 = 0x00001000
    uhalfword(y0) = (0xfe0)       # y0 = 0x00000FE0
    logexp X = nop(norm64(x0)) Y = nop(norm32(y0))
    nop
    x1, y1 = logexp

Norm64 operates on a 64-bit number where the most significant 32 bits come from the input register, and the least significant 32 bits are implicitly assumed to be 0.

Log Operation

It takes normalized floating-point format data N as input and calculates log2(2*N). The result is a 9.23 number with the least significant 16 bits always being 0 (the meaningful accuracy is 9.7).

Example 1:

    uhalfword(x0) = (0x1000)      # x0 = 0x00001000 = 4096 in decimal
    uhalfword(y0) = (0x1234)      # y0 = 0x00001234 = 4660 in decimal
    logexp X = log(norm32(x0)) Y = log(norm32(y0))
    nop
    x1, y1 = logexp

x1 will have a value of 0x06800000 and y1 will have a value of 0x06970000. Since the output is in 9.23 format, the decimal values of x1 and y1 are 13 and 13.1796875, which match the expected outputs of log2(2*4096) and log2(2*4660).

Exp Operation

It takes a 9.7 log number L as input (where the 9.7 number is in the most significant 16 bits and the least significant 16 bits are ignored), and uses L's fractional part to compute (2^(0.fractional))/2. The range of the fractional part is between 0 and 127/128 (since the fractional part has 7 bits), and the output range will be [(2^0)/2 = 0.5, (2^(127/128))/2 = 0.9946]. The output will be in 1.31 format with the least significant 23 bits always 0 (the meaningful accuracy is 1.8). The result is:

    Bit 15       0 (integer part)
    Bits 14-7    2^(L[6:0])/2
    Bits 6-0     7-bit 0

Bit 14 will always be 1 since the lowest output will be (2^0)/2 = 0.5. If the 9.7 log number is greater than 31.0, it is an overflow condition, and the result will be 0x7fff.

Note: Negative inputs will be interpreted as positive values and produce overflow.
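The Exp operation described above can be approximated with a short Python model. It is a sketch under stated assumptions rather than the hardware algorithm: the 8 fractional output bits are assumed to be truncated, the overflow test treats the input as an unsigned 16-bit value (which is one way to read the note about negative inputs), and the function name exp_op is hypothetical.

```python
def exp_op(reg: int) -> int:
    """Model of exp() on a 32-bit register holding a 9.7 number in bits 31:16."""
    log_9_7 = (reg >> 16) & 0xFFFF       # least significant 16 bits are ignored
    if log_9_7 > (31 << 7):              # greater than 31.0: overflow per the text
        return 0x7FFF << 16
    fraction = (log_9_7 & 0x7F) / 128.0  # 7-bit fractional part of L
    value = (2.0 ** fraction) / 2.0      # always in the range [0.5, 1.0)
    mantissa = int(value * 256)          # 1.8 format, truncation assumed
    return mantissa << 23                # 1.31 format with the low 23 bits zero
```

For an input of 0x06c00000 (13.5 in 9.23 format) the fractional part is 0.5, so the model returns 0x5a800000, which is about 0.707 in 1.31 format.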
Example 1: fixed16(x0) = (0x06c0) # x0 = 0x06c00000 = 13.5 in 9.23 format logexp X = exp(x0) Y = nop(x0) nop x1, y1 = logexp x1 will have a value of 0x5a800000 which corresponds to 0.707 in 1.31 format. This matches (2^(0.5))/2 = 1.414/2 = 0.707. Shift-multiply Operation It takes a 9.7 log number L as input (where the 9.7 number is in the most significant 16bits and the least significant 16 bits are ignored) and uses L's integer part to compute (2^(integer_part)). The final output is a 16-bit number with the output present in the most significant 16-bits and the least significant 16-bits being set to zero. The shift-factor would be saturated if the integer part is greater than 32 and is set to 0 if the integer part is less than 16; that is, the output will be in the range [-2^16 to -2^32]. The result of SM is: if(L[12]) // overflow, data > 2^32 SM = 0x8000; else if(L[11]==0) // underflow, data < 2^16 SM = 0x0000; else SM = -1<<L[[10:7]; if(L[15]) // data is a negative number, negate result SM = ~SM; Example 1: fixed16(x0) = (0x0ec0) # x0 = 0x0ec00000 = 29.5 in 9.23 format logexp X = exp(x0) Y = sm(x0) nop x1, y1 = logexp y1 will have a value of 0xe0000000 which corresponds to –(2^29). Data Output When outputting the 16 bit result from logexp block, it is concatenated with 16 bit 0 at the 3-16 Copyright 2013 Cirrus Logic, Inc. 
DS795UM11 Full Word Instructions 32-bit DSP Assembly Programmer’s Guide end to form a 32 bit number: {16 bit data, 16-bit 0} Sample Usage Here are some basic functions: # compute log2(x0) using HW assist y0 = (0x0000) logexp X=log(norm32(x0)) Y=nop(norm64(y0)) # X = log2(x0); Y = # nop(norm64)0x0000)) == # 0x10000000 == 32 in 9.23 logexp X=nop(X-Y) Y=nop(x0) # subtract/compensate for bias # of 32 (Y ignored hereafter) nop x0,y0 = logexp # x0 = log2(x0) in 9.23 format #(y0 == MS 16 bits of x0) # convert from 20*log10 to linear # a0 = 2^(x0*log2(10)/20) # x0 (input) in 8.24 format .ydata I_VY_log10_to_log2 .dw .f2b(.log2(10)/(20*2)) .code I_S_20log10_to_Linear # convert from 20*log10 to log2 # and move from 8.24 to 9.23 (extra shift in log10_to_log2 constant) a0 = (0x1000) # bias == 32 y0 = ymem[[I_VY_log10_to_log2] a0 += x0*y0 x0 = a0 # convert back to linear (2^x) logexp X=exp(x0) Y=sm(x0) nop x0,y0 = logexp a0 = -x0*y0 # b0 = sqrt(x0) logexp X=log(norm64(x0)) Y=nop(x0) logexp X=exp(X>>1) Y=sm(X>>1) nop x1,y1 = logexp b0 = -x1*y1 # Note, it appears that output is Q5.26 format # cheap divide (a0 = x0/y0) logexp X=log(norm64(x0)) Y=log(norm32(y0)) logexp X=exp(X-Y) Y=sm(X-Y) nop x0,y0 = logexp a0 = -x0*y0 # normalize x2 DS795UM11 Copyright 2013 Cirrus Logic, Inc 3-17 Full Word Instructions 32-bit DSP Assembly Programmer’s Guide x1 = x2 # compute log2 using HW assist logexp X=log(norm32(x1)) Y=nop(x1) uhalfword(y0) = (0x0100) x1,y1 = logexp bitclr hi(x1), (0x807f) b1 = x1*y0 uhalfword(b0) = (0x001f) b0 = b0-b1; b1 = x2 AnyReg(i7, b0h) if (b==0) jmp >noshift do (i7), > %: b1 = b1 << 1 %noshift # X = log2(x1) #Y = x1 = b # used to shift 9.23 down to 32.0 # clear sign bit and truncate any fractions # shift down # remove bias # b1 = b # i7 = shifts # b1 = b' Restrictions: Register: x0, x1, x2, x3, y0 y1 y2 y3 Cmd_X[1:0]: nop sm log exp Cmd_Y[1:0]: nop sm log exp Mux_X[2:0]: norm32 (x reg) norm64 (x reg) x-y x>>1 (x reg) Mux_Y[2:0]: norm32 (y reg) norm64 (y reg) x-y x>>1 
(x reg)

3.4.6 XY Register Pair = logexp

Transfer a 64-bit value into the XY register pair from the LogExp peripheral.

Assembler Syntax:
    DP Pair = logexp

Flags Affected: None

Restrictions:
    Register: x0, x1, x2, x3, y0, y1, y2, y3

3.5 Memory Moves - Direct

Note: During Direct Memory Moves, if the size of the destination is less than 32 bits, the excess upper bits of the source are ignored. If the size of the source is less than 32 bits, the excess upper destination bits are zero-filled, except when reading guard registers (for example, a0g, b3g), which are sign extended.

3.5.1 Any Reg = xmem[16-bit Address]

Data transfer from X memory to any register. Direct addressing (16-bit) is used.

Assembler Syntax:
    Any Reg = xmem[16-bit Address]

Example:
    a0 = xmem[0x9980]

Flags Affected: None

Restrictions:
    Destination: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.5.2 xmem[16-bit Address] = Any Reg

Data transfer from any register to X memory. Direct addressing (16-bit) is used.

Assembler Syntax:
    xmem[16-bit Address] = Any Reg

Example:
    xmem[0x0870] = b3

Flags Affected:
    L       limit
    T1, T0  shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 flags are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.

Restrictions:
    Source: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)
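The width rule in the note at the start of Section 3.5 can be illustrated with a small behavioral model. This is only a sketch in Python, not vendor code: the function names (to_bus, from_bus, direct_move) are hypothetical, and it assumes that every direct move passes through a 32-bit word and that only 8-, 16-, and 32-bit registers are involved.

```python
BUS_BITS = 32  # assumption: the note's "32 bits" is the width of the transfer path


def to_bus(value, src_bits, src_is_guard=False):
    """Widen a source register value to a 32-bit word.

    Sources narrower than 32 bits are zero-filled, except guard
    registers, which are sign extended (per the Section 3.5 note).
    """
    value &= (1 << src_bits) - 1
    if src_is_guard and value & (1 << (src_bits - 1)):
        # set every bus bit above the source width
        value |= ((1 << BUS_BITS) - 1) & ~((1 << src_bits) - 1)
    return value


def from_bus(word, dst_bits):
    """Narrow a 32-bit word to the destination; excess upper bits are ignored."""
    return word & ((1 << dst_bits) - 1)


def direct_move(value, src_bits, dst_bits, src_is_guard=False):
    """Model one direct move (hypothetical helper, not an assembler feature)."""
    return from_bus(to_bus(value, src_bits, src_is_guard), dst_bits)


if __name__ == "__main__":
    print(hex(direct_move(0xFACE, 16, 32)))                   # 16-bit index register into a 32-bit register
    print(hex(direct_move(0xFE, 8, 32, src_is_guard=True)))   # guard register read
    print(hex(direct_move(0x12345678, 32, 16)))               # 32-bit word into a 16-bit register
```

Splitting the model into a widening step and a narrowing step mirrors the two sentences of the note; whether the hardware is actually organized this way is not stated in this guide.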
3.5.3 Any Reg = ymem[16-bit Address]

Data transfer from Y memory to any register. Direct addressing (16-bit) is used.

Assembler Syntax:
    Any Reg = ymem[16-bit Address]

Example:
    a0 = ymem[0x9980]

Flags Affected: None

Restrictions:
    Destination: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.5.4 ymem[16-bit Address] = Any Reg

Data transfer from any register to Y memory. Direct addressing (16-bit) is used.

Assembler Syntax:
    ymem[16-bit Address] = Any Reg

Example:
    ymem[0x0870] = b3

Flags Affected:
    L       limit
    T1, T0  shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 flags are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.

Restrictions:
    Source: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.5.5 Any Reg = pmem[16-bit Address]

Data transfer from program memory to any register. Direct addressing (16-bit) is used.

Assembler Syntax:
    Any Reg = pmem[16-bit Address]

Example:
    a0 = pmem[0x9980]

Flags Affected: None

Restrictions:
    Destination: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.5.6 pmem[16-bit Address] = Any Reg

Data transfer from any register to program memory. Direct addressing (16-bit) is used.
Assembler Syntax:
    pmem[16-bit Address] = Any Reg

Example:
    pmem[0x0870] = b3

Flags Affected:
    L       limit
    T1, T0  shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 flags are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.

Restrictions:
    Source: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.5.7 Any Reg = inp[16-bit Address]

Data transfer from peripheral space to any register. Direct addressing (16-bit) is used.

Assembler Syntax:
    Any Reg = inp[16-bit Address]

Example:
    a0 = inp[0x9980]

Flags Affected: None

Restrictions:
    Destination: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.5.8 outp[16-bit Address] = Any Reg

Data transfer from any register to peripheral space. Direct addressing (16-bit) is used.

Assembler Syntax:
    outp[16-bit Address] = Any Reg

Example:
    outp[0x0870] = b3

Flags Affected:
    L       limit
    T1, T0  shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 flags are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.

Restrictions:
    Source: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.5.9 Any Reg = xmem[Index Register]

Data transfer from X memory to any register. Indexed addressing is used.
Assembler Syntax:
    Any Reg = xmem[Index Register]

Example:
    a0 = xmem[i7]

Flags Affected: None

Restrictions:
    Register Update: no update, +1, -1, +n, +2, -2, -n
    Destination: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.5.10 xmem[Index Register] = Any Reg

Data transfer from any register to X memory. Indexed addressing is used.

Assembler Syntax:
    xmem[Index Register] = Any Reg

Example:
    xmem[i9] = b3

Flags Affected:
    L       limit
    T1, T0  shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 flags are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.

Restrictions:
    Register Update: no update, +1, -1, +n, +2, -2, -n
    Source: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.5.11 Any Reg = ymem[Index Register]

Data transfer from Y memory to any register. Indexed addressing is used.

Assembler Syntax:
    Any Reg = ymem[Index Register]

Example:
    x0 = ymem[i0]

Flags Affected: None

Restrictions:
    Register Update: no update, +1, -1, +n, +2, -2, -n
    Destination: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.5.12 ymem[Index Register] = Any Reg

Data transfer from any register to Y memory. Indexed addressing is used.
Assembler Syntax:
    ymem[Index Register] = Any Reg

Example:
    ymem[i7] = i3

Flags Affected:
    L       limit
    T1, T0  shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 flags are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.

Restrictions:
    Register Update: no update, +1, -1, +n, +2, -2, -n
    Source: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.5.13 Any Reg = pmem[Index Register]

Data transfer from program memory to any register. Indexed addressing is used.

Assembler Syntax:
    Any Reg = pmem[Index Register]

Example:
    x0 = pmem[i0]

Flags Affected: None

Restrictions:
    Register Update: no update, +1, -1, +n, +2, -2, -n
    Destination: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.5.14 pmem[Index Register] = Any Reg

Data transfer from any register to program memory. Indexed addressing is used.

Assembler Syntax:
    pmem[Index Register] = Any Reg

Example:
    pmem[i7] = i3

Flags Affected:
    L       limit
    T1, T0  shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 flags are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.

Restrictions:
    Register Update: no update, +1, -1, +n, +2, -2, -n
    Source: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)
3.5.15 outp[Index Register] = Any Reg

Data transfer from any register to peripheral space. Indexed addressing is used.

Assembler Syntax:
    outp[Index Register] = Any Reg

Example:
    outp[i7] = i3

Flags Affected:
    L       limit
    T1, T0  shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 flags are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.

Restrictions:
    Register Update: no update, +1, -1, +n, +2, -2, -n
    Source: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.5.16 Any Reg = inp[Index Register]

Data transfer from peripheral space to any register. Indexed addressing is used.

Assembler Syntax:
    Any Reg = inp[Index Register]

Example:
    x0 = inp[i0]

Flags Affected: None

Restrictions:
    Register Update: no update, +1, -1, +n, +2, -2, -n
    Destination: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.6 Immediate Register Moves

There are five types of immediate register loads, designed to cover the useful cases of moving 16-bit immediate data into an 8-bit (guard), 16-bit (index or nm), 32-bit (data), or 72-bit (accumulator) register. The type of move is designated by a prefix on the destination register. Table 3-2 describes how the five modes work with 72-bit accumulators and Table 3-3 describes the 32-bit registers:

Table 3-2.
72-bit Accumulators (a0)

                  a0g          a0h                       a0l
Instruction       [71:64]      [63:48]      [47:32]      [31:16]      [15:0]
fixed16(a0)       sign extend  16-bit data  zero         zero         zero
fixed16(a0h)      no change    16-bit data  zero         no change    no change
fixed16(a0l)      no change    no change    no change    16-bit data  zero
ufixed16(a0)      zero         16-bit data  zero         zero         zero
ufixed16(a0h)     no change    16-bit data  zero         no change    no change
ufixed16(a0l)     no change    no change    no change    16-bit data  zero
halfword(a0)      sign extend  sign extend  16-bit data  zero         zero
halfword(a0h)     no change    sign extend  16-bit data  no change    no change
halfword(a0l)     no change    no change    no change    sign extend  16-bit data
uhalfword(a0)     zero         zero         16-bit data  zero         zero
uhalfword(a0h)    no change    zero         16-bit data  no change    no change
uhalfword(a0l)    no change    no change    no change    zero         16-bit data
lo16(a0)          no change    no change    16-bit data  no change    no change
lo16(a0h)         no change    no change    16-bit data  no change    no change
lo16(a0l)         no change    no change    no change    no change    16-bit data

Table 3-2 Legend
• zero - all bits zero
• sign extend - sign extended from 16-bit immediate value
• no change - no bits affected
• 16-bit data - bits set to 16-bit immediate value

Table 3-3. 32-bit Data Registers (x0)

Instruction       [31:16]      [15:0]
fixed16(x0)       16-bit data  zero
ufixed16(x0)      16-bit data  zero
halfword(x0)      sign extend  16-bit data
uhalfword(x0)     zero         16-bit data
lo16(x0)          no change    16-bit data

The ‘fixed16’ move is the default prefix for accumulators and 32-bit registers. If no prefix is specified for these registers, ‘fixed16’ is used. For 8-bit guard registers, 16-bit index registers, and 16-bit nm registers, no prefix should be specified. If the destination register is a guard register, the least significant 8 bits of the 16-bit immediate data value are loaded.

3.6.1 fixed16(Destination) = (16-bit Data)

Load 16-bit data into a register as a fixed point fractional value. The “fixed16” prefix is optional.
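The five prefixes in Table 3-3 can be summarized with a small behavioral model. The following Python sketch is illustrative only: load_immediate is a hypothetical helper name, and it covers just the 32-bit data register case of Table 3-3, not the accumulator rows of Table 3-2.

```python
def load_immediate(prefix, old_value, imm16):
    """Return the new 32-bit register value after an immediate load.

    prefix    -- one of the five prefixes listed in Table 3-3
    old_value -- register contents before the load (only lo16 uses it)
    imm16     -- the 16-bit immediate data
    """
    imm16 &= 0xFFFF
    old_value &= 0xFFFFFFFF
    sign = 0xFFFF if imm16 & 0x8000 else 0x0000

    if prefix in ("fixed16", "ufixed16"):
        # data in bits [31:16], bits [15:0] cleared
        return imm16 << 16
    if prefix == "halfword":
        # data in bits [15:0], sign extended into bits [31:16]
        return (sign << 16) | imm16
    if prefix == "uhalfword":
        # data in bits [15:0], bits [31:16] cleared
        return imm16
    if prefix == "lo16":
        # data in bits [15:0], bits [31:16] unchanged
        return (old_value & 0xFFFF0000) | imm16
    raise ValueError("unknown prefix: " + prefix)


if __name__ == "__main__":
    for p in ("fixed16", "ufixed16", "halfword", "uhalfword", "lo16"):
        print(p, hex(load_immediate(p, 0xDEADBEEF, 0x8234)))
```

For a 32-bit data register, fixed16 and ufixed16 produce the same bit pattern according to Table 3-3; per Table 3-2, the two differ only in how an accumulator's guard bits are filled.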
If the destination is a 32-bit data register, data is loaded into the most significant 16 bits, and the least significant 16 bits are cleared. If the destination is an accumulator, the data is placed in the most significant 16 bits of the high segment (that is, a0h) and the least significant 16 bits of the high segment are cleared, the low segment (that is, a0l) is cleared, and the data is sign extended into the guard segment (that is, a0g).

Assembler Syntax:
    fixed16(Destination) = (16-bit Data)
    Destination = (16-bit Data)

Example:
    fixed16(x0) = (0x1234)
    x0 = (0x1234)

Flags Affected: None

Restrictions:
    Destination: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, MS Reg (see Table 2-25)

3.6.2 ufixed16(Destination) = (16-bit Data)

Load 16-bit data into a register as an unsigned fixed point fractional value. If the destination is a 32-bit data register, data is loaded into the most significant 16 bits, and the least significant 16 bits are cleared. If the destination is an accumulator, the data is placed in the most significant 16 bits of the high segment (that is, a0h) and the least significant 16 bits of the high segment are cleared, and both the low segment (that is, a0l) and the guard segment (that is, a0g) are cleared.

Assembler Syntax:
    ufixed16(Destination) = (16-bit Data)

Example:
    ufixed16(x0) = (0x1234)

Flags Affected: None

Restrictions:
    Destination: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, MS Reg (see Table 2-25)

3.6.3 uhalfword(Destination) = (16-bit Data)

Load 16-bit data into a register as an unsigned integer. If the destination is a 32-bit data register, data is loaded into the least significant 16 bits, and the most significant 16 bits are cleared. If the destination is an accumulator, the data is placed in the least significant 16 bits of the high segment and the most significant 16 bits of the high segment are cleared (i.e.
a0h), the low segment (i.e. a0l) is cleared, and the guard segment (i.e. a0g) is cleared, as shown for uhalfword(a0) in Table 3-2.

Assembler Syntax:
    uhalfword(Destination) = (16-bit Data)

Example:
    uhalfword(a0) = (0x1234)

Flags Affected: None

Restrictions:
    Destination: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, MS Reg (see Table 2-25)

3.6.4 Index Register = (16-bit Data)

Load 16-bit data into an index register as an unsigned integer. Data is loaded into the least significant 16 bits, and the most significant 16 bits are cleared.

Assembler Syntax:
    Index Register = (16-bit Data)

Example:
    i3 = (0xface)

Flags Affected: None

Restrictions:
    Destination: i0 - i11

3.6.5 NM Register = (16-bit Data)

Load 16-bit data into an NM register as an unsigned integer.

Assembler Syntax:
    NM Register = (16-bit Data)

Example:
    nm8 = (0xface)

Flags Affected: None

Restrictions:
    Destination: nm0 - nm11

3.6.6 Guard Register = (8-bit Data)

Load immediate data into a guard register as an unsigned integer. As stated in Section 3.6, when the destination is a guard register, the least significant 8 bits of the immediate data value are loaded. Since this is the only instruction for loading the Guard, Index, and NM registers directly, the prefix is optional for those destinations.

Assembler Syntax:
    Guard Register = (8-bit Data)

Example:
    b2g = (0xfe)

Flags Affected: None

Restrictions:
    Destination: a0g - a3g, b0g - b3g

3.6.7 halfword(Destination) = (16-bit Data)

Load 16-bit data into a register as a signed integer.
If the destination is a 32-bit data register, data is loaded into the least significant 16 bits, and the data is sign extended into the most significant 16 bits. If the destination is an accumulator, the data is placed in the least significant 16 bits of the high segment (that is, a0h) and sign extended into the most significant 16 bits of the high segment, the low segment (that is, a0l) is cleared, and the data is sign extended into the guard segment (that is, a0g).

Assembler Syntax:
    halfword(Destination) = (16-bit Data)

Example:
    halfword(a0) = (0x1234)
    halfword(dbc_d1) = (0xffff)

Flags Affected: None

Restrictions:
    Destination: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, MS Reg (see Table 2-25)

3.6.8 lo16(Destination) = (16-bit Data)

Load 16-bit data into the least significant 16 bits of a register. No other bits in the register are affected. If a full accumulator (that is, a0) is specified, the operation is performed on just the high part of the accumulator (that is, a0h).

Assembler Syntax:
    lo16(Destination) = (16-bit Data)

Example:
    lo16(x0) = (0x1234)
    lo16(dbc_d1) = (0xabcd)

Flags Affected: None

Restrictions:
    Destination: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, MS Reg (see Table 2-25)

3.6.9 MS Reg = (16-bit Data)

Load 16-bit data into an MS register as a signed integer. If the destination is a 32-bit MS register, data is loaded into the least significant 16 bits, and the data is sign extended into the most significant 16 bits.

Assembler Syntax:
    MS Reg = (16-bit Data)

Example:
    dbc_d1 = (0x1234)
    jsr_data = (0xabcd)

Flags Affected: None

Restrictions:
    Destination: MS Reg (see Table 2-25)

3.6.10 AnyReg(Any Reg, Any Reg)

Transfer data from any register to any register.
Restrictions: None.

Assembler Syntax:
    AnyReg(Destination, Source)

Example:
    AnyReg(nm4, b0h)

Flags Affected:
    L       limit
    T1, T0  shift bits

Note: If the source is an accumulator, then the L, T1, and T0 flags are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.

Restrictions:
    Destination/Source: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.6.11 Any Reg = MS Reg

Transfer data from a Master State Register to any register.

Assembler Syntax:
    Any Reg = MS Reg

Example:
    b0l = jsr_mode
    a0l = search_latch

Restrictions:
    Destination/Source: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.6.12 MS Reg = Any Reg

Transfer data from any register to a Master State Register.

Assembler Syntax:
    MS Reg = Any Reg

Example:
    jsr_mode = b0l
    search_latch = a0l

Flags Affected:
    L       limit
    T1, T0  shift bits

Note: If Any Reg is an accumulator, then the L, T1, and T0 flags are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.

Restrictions:
    Destination/Source: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.6.13 AnyReg (Any Reg, Any Reg), (Any Reg, Any Reg)

Performs dual data transfers from any register to any register.
There are some limitations on source and destination due to the current implementation of the Cirrus Logic DSP 32-bit architecture.

Restrictions:
The restrictions for AnyReg transfers include the following:
• The first pair of registers is processed first, and the second pair of registers is processed second.
• Both sources cannot be Index registers.
• Both sources cannot be NM registers.
• If both destinations are Index registers, destination 1 must be i0-i3 or i8-i11, and destination 2 must be i4-i7.
• If both destinations are NM registers, destination 1 must be nm0-nm3 or nm8-nm11, and destination 2 must be nm4-nm7.
• Dual accumulator destination indices must be equal.
• “B” accumulator must be in the second pair of arguments.
• “A” accumulator must be in the first pair of arguments.

Assembler Syntax:
    AnyReg(Destination1, Source1),(Destination2, Source2)

Example:
    AnyReg(i0,nm5),(nm5,b0)
    AnyReg(i0,nm4),(i7,x0)
    AnyReg(x0,nm4),(i2,i8)
    AnyReg(i0,i11),(i4,b0h)

Flags Affected:
    L       limit
    T1, T0  shift bits

Note: If Source 1 or Source 2 is an accumulator, then the L, T1, and T0 flags are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.

Restrictions:
    Destination1/Source1, Destination2/Source2: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7

3.6.14 Accum = long(Accum)

Transfers 64 bits of the source through the SRS unit to the destination. This instruction differs from the move instruction “Accum = Accum” in that it transfers 64 bits instead of 32. It differs from the math instruction “Accum =+ Accum” in that it transfers 64 bits instead of 72 and goes through the SRS unit, performing any necessary shifting or saturating.
Assembler Syntax:
    Accum = long(Accum)

Example:
    a0 = long(b2)

Flags Affected:
    L       limit
    T1, T0  shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 flags are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.

Note: The Source and Destination are both encoded twice in this instruction.

Restrictions:
    Destination, Source: a0 - a3, b0 - b3

3.6.15 In = Im/(0) ± (16-bit Data)

Add 16-bit immediate data to an optional source index register and place the result in the destination index register. If an optional source Index register is used, addition is governed by the current state of the NM register associated with the source Index register (Im).

Important Note: When the In = (0) ± (16-bit Data) form of this instruction is used, the addition mode is governed by the current state of the NM0 register, regardless of the value of n.

This instruction uses the AGU and hence does not require a subsequent dead cycle before the index register is used. For example, the following code is valid:
    i0 = (0) + (0x1234)
    x0 = xmem[i0]

Assembler Syntax:
    In = Im + (16-bit Data)
    In = Im - (16-bit Data)
    In = (0) + (16-bit Data)
    In = (0) - (16-bit Data)

Example:
    i0 = i4 + (0x1234)
    i4 = i4 - (0x1234)

Flags Affected: None

Restrictions:
    Destination: i0, i1, i2, i3, i4, i5, i6, i7, i8, i9, i10, i11
    Source:      i0, i1, i2, i3, i4, i5, i6, i7, i8, i9, i10, i11
    Instruction: In = Im + (16-bit)
                 In = Im - (16-bit)
                 In = (0) + (16-bit)
                 In = (0) - (16-bit)

3.7 Bit Manipulation Instructions

3.7.1 Bit Test

Test the bits of the register specified by the immediate mask value. If all bits of the masked 16-bit result are ones, then the z bit in the CCR is set to one. Otherwise, the z bit is set to zero.
The pseudo-code for this operation is:
    if ((reg AND mask) XOR mask) == 0x0000
        z = 1
    else
        z = 0

Either the least significant or most significant 16 bits of the register can be used, as selected by the “lo” or “hi” prefix. The register is unaffected after execution of this instruction. Not allowed on accumulators.

Assembler Syntax:
    BitTst lo(32-bit Reg),(16-bit Mask)
    BitTst hi(32-bit Reg),(16-bit Mask)
    BitTst (16-bit Reg),(16-bit Mask)

Example:
    BitTst lo(x0),0x0400
    BitTst hi(imask),0x0002
    BitTst (i4),0xaaaa

Flags Affected:
    Z       zero

Restrictions:
    Destination: x0 - x3, y0 - y3, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.7.2 Bit Set

Perform a bitwise test as in BitTst, then perform a bitwise OR on 16 bits of the specified register with the immediate mask value and place the result back into the register. Either the least significant or most significant 16 bits of the register can be used, as selected by the “lo” or “hi” prefix. Not allowed on accumulators.

Assembler Syntax:
    BitSet lo(32-bit Reg),(16-bit Mask)
    BitSet hi(32-bit Reg),(16-bit Mask)
    BitSet (16-bit Reg),(16-bit Mask)

Example:
    BitSet lo(y2),0x0402
    BitSet hi(x3),0x0001
    BitSet (i1),0x0002

Flags Affected:
    Z       zero

Restrictions:
    Destination: x0 - x3, y0 - y3, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.7.3 Bit Clear

Perform a bitwise test as in BitTst, then perform a bitwise AND on 16 bits of the specified register with the bitwise NOT of the immediate mask value and place the result back into the register. Either the least significant or most significant 16 bits of the register can be used, as selected by the “lo” or “hi” prefix. Not allowed on accumulators.
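The test-then-modify behavior described for BitTst, BitSet, and BitClr (and for Bit Change in Section 3.7.4, which uses XOR) can be sketched as a small Python model. The names bit_op and apply_half are hypothetical and exist only for this illustration. The sketch assumes that the z bit always reflects the test on the value before modification, which is how "perform a bitwise test as in BitTst, then ..." reads.

```python
def bit_op(op, reg16, mask):
    """Model one bit-manipulation instruction on a 16-bit value.

    op is "tst", "set", "clr", or "chg".
    Returns (new_value, z), where z follows the BitTst pseudo-code
    evaluated on the value before modification (assumption).
    """
    reg16 &= 0xFFFF
    mask &= 0xFFFF
    z = 1 if ((reg16 & mask) ^ mask) == 0 else 0
    if op == "tst":
        new = reg16                      # register is unaffected
    elif op == "set":
        new = reg16 | mask               # bitwise OR with the mask
    elif op == "clr":
        new = reg16 & ~mask & 0xFFFF     # bitwise AND with NOT mask
    elif op == "chg":
        new = reg16 ^ mask               # bitwise XOR with the mask
    else:
        raise ValueError("unknown op: " + op)
    return new, z


def apply_half(reg32, half, op, mask):
    """Apply bit_op to the "lo" or "hi" 16 bits of a 32-bit register."""
    shift = 16 if half == "hi" else 0
    new16, z = bit_op(op, (reg32 >> shift) & 0xFFFF, mask)
    keep = reg32 & ~(0xFFFF << shift) & 0xFFFFFFFF
    return keep | (new16 << shift), z


if __name__ == "__main__":
    print(bit_op("tst", 0x0400, 0x0400))
    print(apply_half(0x12340000, "hi", "set", 0x0001))
```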
Assembler Syntax:
    BitClr lo(32-bit Reg),(16-bit Mask)
    BitClr hi(32-bit Reg),(16-bit Mask)
    BitClr (16-bit Reg),(16-bit Mask)

Example:
    BitClr lo(x2),0x0100
    BitClr hi(y2),0x8002
    BitClr (nm7),0x0400

Flags Affected:
    Z       zero

Restrictions:
    Destination: x0 - x3, y0 - y3, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

3.7.4 Bit Change

Perform a bitwise test as in BitTst, then perform a bitwise XOR on 16 bits of the specified register with the immediate mask value and place the result back into the register. Either the least significant or most significant 16 bits of the register can be used, as selected by the “lo” or “hi” prefix. Not allowed on accumulators.

Assembler Syntax:
    BitChg lo(32-bit Reg),(16-bit Mask)
    BitChg hi(32-bit Reg),(16-bit Mask)
    BitChg (16-bit Reg),(16-bit Mask)

Example:
    BitChg lo(x0),0xffff
    BitChg hi(y3),0x0180
    BitChg (i0),0x5000

Flags Affected:
    Z       zero

Restrictions:
    Destination: x0 - x3, y0 - y3, i8 - i11, nm8 - nm11, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7, MS Reg (see Table 2-25)

Chapter 4
Multifunction Moves

Multifunction instructions occupy the most significant 16 bits of the instruction word. In general, they affect the shifting/limiting bits of the CCR.

4.1 Single Multifunction Moves

Single multifunction moves transfer data between a data path register and X or Y memory. Either indexed or direct addressing (6 bit) can be used. Index registers can be updated. Data moves can be done by themselves or included with a corresponding arithmetic instruction. Performing two data moves is considered a parallel move. There are restrictions for parallel moves, but any multifunction move can be done in conjunction with an arithmetic instruction.
4.1.1 DP Reg = xmem[Index Register]
      DP Reg = xmem[6-bit Address]

Assembler Syntax:
    DP Reg = xmem[Index Register]; Index Register;= update
    DP Reg = xmem[6-bit Address]

Example:
    x2 = xmem[i7]
    b0 = xmem[0x38]

Flags Affected: None

Restrictions:
    Destination: x0, x1, x2, x3, y0, y1, y2, y3, a0, a1, a2, a3, b0, b1, b2, b3

4.1.2 xmem[Index Register] = DP Reg
      xmem[6-bit Address] = DP Reg

Assembler Syntax:
    xmem[Index Register] = DP Reg;= update
    xmem[6-bit Address] = DP Reg

Example:
    xmem[i7] = x2
    xmem[0x2f] = a2

Flags Affected:
    L       limit
    T1, T0  shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 flags are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.

Restrictions:
    Source: x0, x1, x2, x3, y0, y1, y2, y3, a0, a1, a2, a3, b0, b1, b2, b3

4.1.3 DP Reg = ymem[Index Register]
      DP Reg = ymem[6-bit Address]

Assembler Syntax:
    DP Reg = ymem[Index Register];= update
    DP Reg = ymem[6-bit Address]

Example:
    b2 = ymem[i7]
    y3 = ymem[0x1f]

Flags Affected: None

Restrictions:
    Destination: x0, x1, x2, x3, y0, y1, y2, y3, a0, a1, a2, a3, b0, b1, b2, b3

4.1.4 ymem[Index Register] = DP Reg
      ymem[6-bit Address] = DP Reg

Assembler Syntax:
    ymem[Index Register] = DP Reg;= update
    ymem[6-bit Address] = DP Reg

Example:
    ymem[i7] = b3
    ymem[0x30] = i3

Flags Affected:
    L       limit
    T1, T0  shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 flags are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.
Restrictions:
    Source: x0, x1, x2, x3, y0, y1, y2, y3, a0, a1, a2, a3, b0, b1, b2, b3

4.1.5 Data Path Register to or from Any Register

4.1.5.1 DP Reg = Any Reg

Data transfer from any register to a data path register.

Assembler Syntax:
    DP Reg = Any Reg

Example:
    y0 = b3g

Flags Affected: None

Restrictions:
    Destination: x0, x1, x2, x3, y0, y1, y2, y3, a0, a1, a2, a3, b0, b1, b2, b3
    Source: x0 - x3, y0 - y3, a0 - a3, b0 - b3, a0l - a3l, b0l - b3l, a0h - a3h, b0h - b3h, a0g - a3g, b0g - b3g, i0 - i3, nm0 - nm3, i4 - i7, nm4 - nm7

4.1.5.2 Any Reg = DP Reg

There are two different instructions that can be used to load the high portion of an accumulator with the contents of a data path register. The first takes the form:
    b3 = x2
The second takes the form:
    b3 = +x2

In terms of effective functionality, the results of executing either of these two instructions are identical: the contents of the destination accumulator are always the same, regardless of which instruction is executed. Neither is affected by the SRS block, so no shifting, rounding, or saturation takes place. Nor is either subject to the built-in one-bit left shift that normally occurs with MAC multiply instructions (such as b3 = x2 * y2). Both sign-extend into the accumulator's guard register during the transfer.

The only difference is in the type of instruction. In the form b3 = x2, you are executing a 16-bit Multifunction Move operation. This is the instruction described in this section. In the form b3 = +x2, you are executing a 16-bit Multifunction Arithmetic MAC operation. Specifically, this is the “Multiply by One with Optional Accumulate” instruction as seen in Section 5.1.6 and Section 5.1.7.
It is important to note that these are different instructions because, as with any pair of one Multifunction Move instruction and one Multifunction Arithmetic/Accumulator instruction, the two can be packed into a single 32-bit instruction. For example, if you would like to load four accumulators with four different values using a single instruction, you may do so with the following code:

    a0 = +x2; b0 = +y2; a3 = x3; b3 = y3

A parallel multiply by one instruction as described in Section 5.1.7 allows two destination registers per assignment, so this instruction could be extended to assign six registers in the form:

    a1 = a0 = +x2; b1 = b0 = +y2; a3 = x3; b3 = y3

where a1 and a0 are both assigned the value in x2 and b1 and b0 are both assigned the value in y2.

In general, when choosing which form of the move instruction to use in your code, consider what else you would like to accomplish in addition to the move within that cycle of CPU execution. If you want to perform another move, as shown in the above example, or perhaps a move from memory into an accumulator, you must use the form of move that utilizes the MAC:

    b3 = +x2; a0 = xmem[i0]

If you attempted the following code:

    b3 = x2; a0 = xmem[i0]

you would receive the following assembler error:

    "Instructions cannot fit into one word."

On the other hand, if you wanted to perform some other arithmetic operation with this instruction, such as a bitwise OR (see Section 5.2.15), then you must use the form of move that only operates on the data path:

    b3 = x2; a0 = a0 | b1

Again, if you attempted the following code:

    b3 = +x2; a0 = a0 | b1

you would receive the following assembler error:

    "Instructions cannot fit into one word."

Finally, one should be careful not to confuse the "=+" notation with the "+=" operator.
The instructions b3 =+ x2 and b3 += x2 accomplish two very different things. The latter is the accumulation operator that can be translated into:

    b3 = b3 + x2

where the contents of x2 are added to the high portion of b3 and stored back into b3.

Assembler Syntax:
    Any Reg = DP Reg

Example:
    b3 = x2
    nm0 = a2

Flags Affected:
    L        limit
    T1,T0    Shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.

Restrictions:
    Destination:
        x0 - x3      y0 - y3
        a0 - a3      b0 - b3
        a0l - a3l    b0l - b3l
        a0h - a3h    b0h - b3h
        a0g - a3g    b0g - b3g
        i0 - i3      nm0 - nm3
        i4 - i7      nm4 - nm7
    Source: x0 x1 x2 x3 y0 y1 y2 y3 a0 a1 a2 a3 b0 b1 b2 b3

4.2 Parallel Multifunction Move Instructions

Parallel multifunction moves allow one X memory move and one Y memory move in a single instruction. Parallel multifunction moves are a subset of multifunction data moves. They perform data transfer from memory to a data register, or from an accumulator to memory.

Restrictions:
• X memory can only be addressed using index registers i0 and i1.
• Y memory can only be addressed using index registers i4 and i5.
• X memory moves can only be used with X data registers and A accumulators.
• Y memory moves can only be used with Y data registers and B accumulators.
• Accumulators (a0-a3, b0-b3) can only be a source.
• Data registers (x0-x3, y0-y3) can only be a destination.

Note: “Parallel Pairing: Right” and “Parallel Pairing: Left,” which appear below Assembler Syntax statements in this Section, refer to whether the defined register in a set of paired registers is to the left or right of a comma.
A register that is “Parallel Pairing: Left” must be to the left of the comma, while “Parallel Pairing: Right” must be to the right of a comma when reading an instruction from left to right.

4.2.1 Xn = xmem[Index Register]

Assembler Syntax:
    Xn = xmem[Index Register];= update

Parallel Pairing: Left (see Restrictions in Section 4.2)

Example:
    x0 = xmem[i0]; i0+=n; y0 = ymem[i4]; i4+=n;

Flags Affected: None

Restrictions:
    Register Update: no update  +1  -1  +n
    Destination: x0 x1 x2 x3

4.2.2 xmem[Index Register] = An

Assembler Syntax:
    xmem[Index Register] = An;= update

Parallel Pairing: Left (see Restrictions in Section 4.2)

Example:
    xmem[i1] = a3; i1+=n; y3 = ymem[i4]

Flags Affected:
    L        limit
    T1,T0    Shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 will only have values of 10b, 01b, or 00b.
Restrictions:
    Register Update: no update  +=1  -=1  +=n
    Destination: a0 a1 a2 a3

4.2.3 Ym = ymem[Index Register]

Assembler Syntax:
    Ym = ymem[Index Register];= update

Parallel Pairing: Right (see Restrictions in Section 4.2)

Example:
    xmem[i1] = a3; i1+=n; y3 = ymem[i4]

Flags Affected: None

Restrictions:
    Register Update: no update  +=1  -=1  +=n
    Destination: y0 y1 y2 y3

4.2.4 ymem[Index Register] = Bm

Assembler Syntax:
    ymem[Index Register] = Bm;= update

Parallel Pairing: Right (see Restrictions in Section 4.2)

Example:
Examples of parallel multifunction moves: X memory moves are placed before Y memory moves in the syntax:

    x0 = xmem[i0]; i0+=n; ymem[i4]=b0

Other Examples:

    ;ymem[i4] = b0

When used in conjunction with an arithmetic (least significant 16 bits) instruction:

    a0=x2*y0; x0=xmem[i0]; i0+=n;
    xmem[i1]=a0; ymem[i4]=b0
    ymem[i4]=b3; i4+=1
    x0=xmem[i0]; i0+=1; y0=ymem[i5]; i5+=n
    x3=xmem[i0]; i0+=1; ymem[i5]=b2; i5-=1

Flags Affected:
    L        limit
    T1,T0    Shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 will only have values of 10b, 01b, or 00b.

Restrictions:
    Register Update: no update  +1  -1  +n
    Source: b0 b1 b2 b3

4.3 Data Path Register to Data Path Register Instructions

Perform data transfer from a data register to a data register.

Restrictions:
• Accumulators (a0-a3, b0-b3) can only be a source
• Data registers (x0-x3, y0-y3) can only be a destination

4.3.1 DP Reg = DP Reg

Data path register to data path register data move. The only restriction is that the source and destination must have the same index (0-3).
Assembler Syntax:
    DP Reg = DP Reg

Example:
    y0 = b3g

Flags Affected: None

Restrictions:
    Destination: x y a b
    Source: x y a b

Note: See other Restrictions in Section 4.3.

4.4 Parallel Register to/from Register Instructions

Perform data transfer from a data register to a data register, or from an accumulator to a data register.

Restrictions:
• Accumulators (a0-a3, b0-b3) can only be a source
• Data registers (x0-x3, y0-y3) can only be a destination

Examples:
    x0=y0; b3=a3
    a2=b2; y0=a0

4.4.1 Data Path Register to Data Path Register and Data Path Register to/from X or Y Memory

Perform a parallel multifunction move using a data register move and a memory move. One instruction from each of the previous two groups can be combined into one parallel move.

Restrictions:
• For this combination move, the sources cannot both be A or B accumulators. For example:

    ymem[i4]=b2; a0=b0    #Bad: Sources are b2 and b0 (both B accumulators)
    ymem[i4]=b2; b2=a2    #Good: Sources are b2 and a2

The exception to this is when the source accumulators are from the same accumulator group. For example, this is illegal:

    x0=a0; xmem[i0]=a1

The following instruction is also illegal:

    x0=a0; xmem[i0]=a0

But it is legal, when rewritten as:

    x0=xmem[i0]=a0

The restrictions for this case are:
• An accumulator must be the source.
• The memory space and accumulator must “match.” For example, this is illegal:

    x0=ymem[i4]=a1

  as it uses Y memory and an A accumulator. Switching to a B accumulator makes the instruction legal, as follows:

    x0=ymem[i4]=b1

• There is no restriction on the data register (x0-x3 or y0-y3) used.
• If both destinations are data registers (rather than one data register and one memory location), one must be an X data register and the other a Y data register.
• If the accumulator source is an A accumulator, the accumulator index is encoded in the most significant 8 bits of the opcode, and the accumulator index in the least significant 8 bits is ignored. Conversely, if the accumulator source is a B accumulator, the accumulator index is encoded in the least significant 8 bits of the opcode, and the accumulator index in the most significant 8 bits is ignored.

4.5 64-bit Multifunction Moves

A 64-bit multifunction move will move X and Y memory into a pair of registers.

Restrictions:
• The data register pair must share the same index.

4.5.1 Data Path Register Pair to or from XY Memory

4.5.1.1 Data Path Register Pair = xymem[Index Register]
        Data Path Register Pair = xymem[6-bit Address]

Long data transfer between XY or AB registers and XY memory. Either indexed or direct addressing (6 bit) can be used. Index registers can be updated.

Assembler Syntax:
    Xn,Yn = xymem[Index Register];= update
    Xn,Yn = xymem[6-bit Address]
    An,Bn = xymem[Index Register]
    An,Bn = xymem[6-bit Address]

Example:
    x0,y0 = xymem[i0]
    x1,y1 = xymem[0x23]
    a0,b0 = xymem[i7]
    a3,b3 = xymem[0x03]

Flags Affected: None

Restrictions:
    Destination: x0,y0 x1,y1 x2,y2 x3,y3 a0,b0 a1,b1 a2,b2 a3,b3

4.5.1.2 xymem[Index Register] = Data Path Register Pair
        xymem[6-bit Address] = Data Path Register Pair

Assembler Syntax:
    xymem[Index Register] = Xn,Yn;= update
    xymem[6-bit Address] = Xn,Yn
    xymem[Index Register] = An,Bn
    xymem[6-bit Address] = An,Bn

Example:
    xymem[i0] = x0,y0
    xymem[0x23] = x2,y2
    xymem[i7] = a3,b3
    xymem[0x03] = a2,b2

Flags Affected:
    L        limit
    T1,T0    Shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 will only have values of 10b, 01b, or 00b.
Restrictions:
    Source: x0,y0 x1,y1 x2,y2 x3,y3 a0,b0 a1,b1 a2,b2 a3,b3

4.5.2 Accumulator to or from XY Memory

4.5.2.1 Accum = xymem[Index Register]
        Accum = xymem[6-bit Address]

Long data transfer between XY memory and an accumulator. Either indexed or direct addressing (6 bit) can be used. Index registers can be updated.

Assembler Syntax:
    Accum = xymem[Index Register];= update
    Accum = xymem[6-bit Address]

Example:
    b3 = xymem[i7]
    a0 = xymem[0x30]

Flags Affected: None

Restrictions:
    Destination: a0 a1 a2 a3 b0 b1 b2 b3

4.5.2.2 xymem[Index Register] = Accum
        xymem[6-bit Address] = Accum

Assembler Syntax:
    xymem[Index Register] = Accum;= update
    xymem[6-bit Address] = Accum

Example:
    xymem[i7] = b0
    xymem[0x24] = a3

Flags Affected:
    L        limit
    T1,T0    Shift bits

Note: If Reg is an accumulator, then the L, T1, and T0 are affected. After these flags are set, they must be cleared manually by the user. T0 and T1 only have values of 10b, 01b, or 00b.

Restrictions:
    Source: a0 a1 a2 a3 b0 b1 b2 b3

4.6 Index Register Updates

4.6.1 In = Im ± (6-bit Data)

Add 6-bit immediate data to source index register and place result in destination index register. Source register (Im) is limited to i8-i11. Addition is governed by the current state of the NM register associated with the source Index register (Im). This instruction uses the AGU and hence does not require a subsequent dead-cycle before the index register is used.
For example, the following code is valid:

    i0 = i8 + (0x12)
    x0 = xmem[i0]

Assembler Syntax:
    In = Im + (6-bit Data)
    In = Im - (6-bit Data)

Examples:
    i0 = i9 + (0x12)
    i4 = i8 - (0x34)

Flags Affected: None

Restrictions:
    Destination: i0 i1 i2 i3 i4 i5 i6 i7 i8 i9 i10 i11
    Source: i8 i9 i10 i11

4.6.2 In ±= 1/2/N

Normal index register update without associated move. Operation occurs in the decode state.

Assembler Syntax:
    In ±= 1/2/n

Examples:
    i0 += 1
    i4 -= n

Flags Affected: None

Restrictions:
    Destination: i0 i1 i2 i3 i4 i5 i6 i7 i8 i9 i10 i11

Chapter 5
Multifunction Operations

5.1 Multifunction Arithmetic Instructions

Single or parallel arithmetic instructions can be done by themselves or with multifunction moves.

5.1.1 Parallel Multiply/Multiply-Accumulate I

Parallel multiply or multiply accumulate, result in one or two accumulators.
Assembler Syntax:
    Aq=Ar ±= Xn*Xm;Bq=Br ±= Yn*Xm
    Aq=Ar ±= Xn*Xm;Bq=Br ±= -Yn*Xm
    Aq=Ar ±= Xn*Ym;Bq=Br ±= Yn*Ym
    Aq=Ar ±= Xn*Ym;Bq=Br ±= -Yn*Ym
    Aq=Ar ±= Yn*Xm;Bq=Br ±= Xn*Xm
    Aq=Ar ±= Yn*Xm;Bq=Br ±= -Xn*Xm
    Aq=Ar ±= Yn*Ym;Bq=Br ±= Xn*Ym
    Aq=Ar ±= Yn*Ym;Bq=Br ±= -Xn*Ym
    Aq=Ar ±= -Xn*Xm;Bq=Br ±= Yn*Xm
    Aq=Ar ±= -Xn*Xm;Bq=Br ±= -Yn*Xm
    Aq=Ar ±= -Xn*Ym;Bq=Br ±= Yn*Ym
    Aq=Ar ±= -Xn*Ym;Bq=Br ±= -Yn*Ym
    Aq=Ar ±= -Yn*Xm;Bq=Br ±= Xn*Xm
    Aq=Ar ±= -Yn*Xm;Bq=Br ±= -Xn*Xm
    Aq=Ar ±= -Yn*Ym;Bq=Br ±= Xn*Ym
    Aq=Ar ±= -Yn*Ym;Bq=Br ±= -Xn*Ym

Example:
    a0=x2*x3;b0=y2*x3
    a0=x2*x3;b0=-y2*x3
    a1=a0+=x0*y2;b1=b0+=y0*y2
    a1=a0+=x0*y2;b1=b0-=y0*y2
    a2=a3=-y1*x1;b2=b3=-x1*x1
    a2=a3=-y1*x1;b2=b3=x1*x1
    a1-=y3*y0; b1-=x3*y0
    a1-=y3*y0; b1+=x3*y0
    a0=-x2*x3;b0=y2*x3
    a0=-x2*x3;b0=-y2*x3
    a1=a0-=x0*y2;b1=b0+=y0*y2
    a1=a0-=x0*y2;b1=b0-=y0*y2
    a2=a3=-y1*x1;b2=b3=-x1*x1
    a2=a3=-y1*x1;b2=b3=x1*x1
    a1+=y3*y0; b1-=x3*y0
    a1+=y3*y0; b1+=x3*y0

Flags Affected: None.

5.1.2 Parallel Multiply/Multiply-Accumulate II

Parallel Multiply/Multiply-Accumulate II (Double FIR) allows one register (X or Y) times two different registers. The arithmetic operators preceding the source register must be the same for both instructions, which together make up this parallel instruction.
Example:
    Aq=Ap += Xn*Yn; Bq=Bp += Xn*Xm
    Aq=Ap -= Xn*Yn; Bq=Bp -= Xn*Xm

Assembler Syntax:
    Aq=Ap ±= Xn*Yn;Bq=Bp ±= Xn*Xm
    Aq=Ap ±= Xn*Yn;Bq=Bp ±= Xn*Ym
    Aq=Ap ±= Xn*Xm;Bq=Bp ±= Xn*Yn
    Aq=Ap ±= Xn*Ym;Bq=Bp ±= Xn*Yn
    Aq=Ap ±= Yn*Xn;Bq=Bp ±= Yn*Xm
    Aq=Ap ±= Yn*Xn;Bq=Bp ±= Yn*Ym
    Aq=Ap ±= Yn*Xm;Bq=Bp ±= Yn*Xn
    Aq=Ap ±= Yn*Ym;Bq=Bp ±= Yn*Xn
    Aq=Ap = -XY;Bq=Bp = -XX
    Aq=Ap = -XY;Bq=Bp = -XY
    Aq=Ap = -XX;Bq=Bp = -XY
    Aq=Ap = -YX;Bq=Bp = -YX
    Aq=Ap = -YX;Bq=Bp = -YY
    Aq=Ap = -YY;Bq=Bp = -YX

Example:
    a3-= x0*y0;b3-= x0*x2
    a1=a0= x0*y0;b1=b0= x0*y3
    a1= -x1*x1;b1= -x1*y1
    a2=a3+= x1*y2;b2=b3+= x1*y1
    a3=a1= -y3*x3;b3=b1= -y3*x2
    a0=a2-= y2*x2;b0=b2-= y2*y0
    a2= y0*x2;b2= y0*x0
    a0+= y3*y0;b0+= y3*x3

Flags Affected: None

Restrictions:
    Destination: a0,b0 a1,b1 a2,b2 a3,b3 a1a0,b1b0 a3a1,b3b1 a0a2,b0b2 a2a3,b2b3

5.1.3 Real Multiply/Multiply-Accumulate

Multiply or multiply accumulate, result in an accumulator. Special mode allows one multiplicative operator to be treated as an unsigned value, range (0 to 1.99999) instead of (-1 to .99994). Unsigned by unsigned multiples are only valid for results (1.99999).

Assembler Syntax:
    Accum ?= -Xn*Xm
    Accum ?= -Xn*(unsigned)Ym
    Accum ?= -Xm*Yn
    Accum ?= -Yn*Xm
    Accum ?= -Yn*Ym
    Accum ?= -(unsigned)Xn*(unsigned)Ym
    Accum ?= -Xn*(unsigned)Ym

Example:
    a1 = x0*x3
    b3 = x3*(unsigned)y3
    b0 += x1*y2
    a1 -= y2*x1
    b2 = y0*y0
    a0 = (unsigned)x0*(unsigned)y0

Flags Affected: None

Restrictions:
    Destination: a0 a1 a2 a3 b0 b1 b2 b3

5.1.4 Parallel Squares

Square data registers, store or accumulate, result in one or two accumulators.

Assembler Syntax:
    Aq = Ar ±= -Xn*Xn;Bq = Br ±= -Yn*Ym

Example:
    a0 = x2*x2;b0 = y2*y1
    a3=a1+=x2*x2;b3=b1+=y2*y1

Flags Affected: None.
Restrictions:
    Destination: a0,b0 a1,b1 a2,b2 a3,b3 a1a0,b1b0 a3a1,b3b1 a0a2,b0b2 a2a3,b2b3

5.1.5 Parallel Multiply with Add

Multiply two data registers, add accumulator (A0 or B0 only), store result in one or two accumulators.

Assembler Syntax:
    Aq = Ar = A0±Xn*Xm;Bq = Br = B0±Yn*Xm
    Aq = Ar = A0±Xn*Ym;Bq = Br = B0±Yn*Ym

Example:
    a1=a0-x2*x1; b1=b0-y2*x1
    a2=a3=a0-x3*y0; b2=b3=b0-y3*y0
    a1=a0=a0+x1*y1; b1=b0=b0+y1*y1

Flags Affected: None

Restrictions:
    Destination: a0,b0 a1,b1 a2,b2 a3,b3 a1a0,b1b0 a3a1,b3b1 a0a2,b0b2 a2a3,b2b3

5.1.6 Multiply by One with Optional Accumulate

Move or accumulate X or Y register into A or B accumulator.

Note: The syntax ‘b3 = +x2’ is used to differentiate between this instruction and the move ‘b3 = x2’. See Section 4.1.5.2 for a discussion of the differences.

Assembler Syntax:
    Accum ±= Xn
    Accum ±= Yn
    Accum = ±Xn
    Accum = ±Yn

Example:
    b3 = +x2
    a2 -= y0

Flags Affected: None.

Restrictions:
    Destination: a0 a1 a2 a3 b0 b1 b2 b3

5.1.7 Parallel Multiply by One with Optional Accumulate

Move or accumulate X or Y registers into A or B accumulators.

Note: The syntax ‘b3 = +x2’ is used to differentiate between this instruction and the move ‘b3 = x2’. See Section 4.1.5.2 for a discussion of the differences.

Assembler Syntax:
    Aq = Ap ±= ±Xn;Bq = Bp ±= ±Yn
    Aq = Ap ±= ±Yn;Bq = Bp ±= ±Xn

Example:
    a3 = +x0;b3 = +y0
    a3 = a1 -= y2;b3 = b1 -= x2

Flags Affected: None.

Restrictions:
    Destination: a0,b0 a1,b1 a2,b2 a3,b3 a1a0,b1b0 a3a1,b3b1 a0a2,b0b2 a2a3,b2b3

5.2 Multifunction Accumulator Instructions

Least significant 16 bits of instruction. Affects the zero and negative bits in the CCR.
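The zero and sign bits that these instructions update can be pictured with a small host-side model. The sketch below is Python, not DSP assembly, and it is only an illustration: it assumes a 72-bit two's complement accumulator (the width quoted for accumulator moves in Section 5.2.3), that the zero bit reflects all 72 bits, and that the sign bit mirrors bit 71. The names ACC_BITS, ACC_MASK and zero_and_sign are hypothetical and do not come from this guide.

```python
ACC_BITS = 72                       # assumed accumulator width
ACC_MASK = (1 << ACC_BITS) - 1      # keeps a value inside 72 bits


def zero_and_sign(acc: int) -> tuple[int, int]:
    """Return (zero, sign) for a raw 72-bit accumulator pattern.

    Assumption: zero is 1 when every bit is clear, and sign is a copy
    of the most significant bit (bit 71).
    """
    acc &= ACC_MASK
    zero = 1 if acc == 0 else 0
    sign = (acc >> (ACC_BITS - 1)) & 1
    return zero, sign


if __name__ == "__main__":
    print(zero_and_sign(0))          # all bits clear
    print(zero_and_sign(ACC_MASK))   # all bits set
```

Under these assumptions, an instruction that writes an A accumulator would place the pair in A0/AS, and one that writes a B accumulator would place it in B0/BS.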
5.2.1 Parallel Add with Shift

Parallel add or subtract two accumulators, result in an accumulator. One of the operands can be shifted.

Assembler Syntax:
    Ap=An ± Am;Bp=Bn ± Bm
    Ap=An ± Bm;Bp=Bn ± Am
    Ap=(An * 2) ± Am;Bp=(Bn * 2) ± Bm
    Ap=(An * 2) ± Bm;Bp=(Bn * 2) ± Am

Example:
    a1 = a2+a3;b1 = b2+b3
    a3 = a0-b0;b3 = b0-a0
    a3 = (a2*2)+a3;b3 = (b2*2)-b3
    a1 = (a1*2)-b1;b1 = (b1*2)+a1

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.2 Add with Shift

Add or subtract two accumulators, result in an accumulator. One of the operands can be shifted.

Assembler Syntax:
    Ap=An ± Am
    Bp=Bn ± Bm
    Ap=An ± Bm
    Bp=Bn ± Am
    Ap=(An * 2) ± Am
    Bp=(Bn * 2) ± Bm
    Ap=(An * 2) ± Bm
    Bp=(Bn * 2) ± Am

Example:
    a3 = a2-a1

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

Restrictions:
    Destination: a0 a1 a2 a3 b0 b1 b2 b3

5.2.3 Conditional Operation - Maximum

Two accumulators are compared. If comparison is true, an accumulator to accumulator move is performed. This accumulator move is a full 72-bit move and does not pass through the SRS. See Section 2.5.1 for an example.

Assembler Syntax:
    if (Bn>Bm) An=Am
    if (An>Am) Bn=Bm
    if (Bn>Am) An=Bm
    if (An>Bm) Bn=Am

Example:
    if (b0>b3) a0=a3
    if (a1>a2) b1=b2
    if (b1>a1) a1=b1
    if (a2>b2) b2=a2

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.4 Conditional Operation - Minimum

Two accumulators are compared. If comparison is true, an accumulator to accumulator move is performed. This accumulator move is a full 72-bit move and does not pass through the SRS.
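Read as pseudocode, a conditional operation of this kind is a guarded register copy. The Python sketch below models the form if (b1<a1) a1=b1 on a register file held in a dictionary. The dictionary, the function name cond_move_min and the use of signed Python integers for accumulator contents are assumptions made for illustration; nothing here describes how the hardware performs the comparison.

```python
def cond_move_min(regs, cmp_lhs, cmp_rhs, dst, src):
    """Model 'if (cmp_lhs < cmp_rhs) dst = src' on a dict of register values.

    Hypothetical helper: registers are looked up by name, and the move
    copies the whole value with no shifting, rounding or saturation.
    """
    if regs[cmp_lhs] < regs[cmp_rhs]:
        regs[dst] = regs[src]
    return regs


# Models: if (b1<a1) a1=b1, which leaves the smaller value in a1.
regs = {"a1": 7, "b1": -3}
cond_move_min(regs, "b1", "a1", "a1", "b1")
print(regs["a1"])
```

When the comparison is false the destination is left untouched, which is why the same instruction can be issued repeatedly to track a running minimum.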
Assembler Syntax:
    if (Bn<Bm) An=Am
    if (An<Am) Bn=Bm
    if (Bn<Am) An=Bm
    if (An<Bm) Bn=Am

Example:
    if (b0<b3) a0=a3
    if (a1<a2) b1=b2
    if (b1<a1) a1=b1
    if (a2<b2) b2=a2

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.5 Conditional Operation - Absolute Value Maximum

The absolute values of two accumulators are compared. If comparison is true, an accumulator to accumulator move is performed. This accumulator move is a full 72-bit move and does not pass through the SRS.

Assembler Syntax:
    if (|Bn|>|Bm|) An=Am
    if (|An|>|Am|) Bn=Bm
    if (|Bn|>|Am|) An=Bm
    if (|An|>|Bm|) Bn=Am

Example:
    if (|b0|>|b3|) a0=a3
    if (|a1|>|a2|) b1=b2
    if (|b1|>|a1|) a1=b1
    if (|a2|>|b2|) b2=a2

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.6 Conditional Operation - Absolute Value Minimum

The absolute values of two accumulators are compared. If comparison is true, an accumulator to accumulator move is performed. This accumulator move is a full 72-bit move and does not pass through the SRS.

Assembler Syntax:
    if (|Bn|<|Bm|) An=Am
    if (|An|<|Am|) Bn=Bm
    if (|Bn|<|Am|) An=Bm
    if (|An|<|Bm|) Bn=Am

Example:
    if (|b0|<|b3|) a0=a3
    if (|a1|<|a2|) b1=b2
    if (|b1|<|a1|) a1=b1
    if (|a2|<|b2|) b2=a2

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.7 Bitwise Accumulator Move

Bitwise accumulator move. This accumulator move is a full 72-bit move and does not pass through the SRS.

Assembler Syntax:
    An =+ Am
    Bn =+ Bm
    An =+ Bm
    Bn =+ Am

Example:
    a0 =+ b3

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.8 Parallel Bitwise Accumulator Move

This is a dual bitwise accumulator move. All 72 bits of the accumulators are transferred. The move does not pass through the SRS.
An accumulator can be both a destination and a source, so this instruction can successfully be used to swap the entire contents of two accumulators. For example, if two accumulators have the following values:

    a0 = 0x1234567890
    b0 = 0x0987654321

and the following instruction is executed:

    a0 =+ b0; b0 =+ a0

after execution the accumulators have been swapped:

    a0 = 0x0987654321
    b0 = 0x1234567890

Assembler Syntax:
    An =+ Am; Bn =+ Bm
    An =+ Bm; Bn =+ Am

Example:
    a0 =+ b3;b0 =+ a3
    a0 =+ a1;b0 =+ b1
    a0 =+ b1;b0 =+ a1

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.9 Bitwise Complement

The one’s complement of an accumulator is stored in an accumulator.

Assembler Syntax:
    Accum =~ Accum
    An =~ Am
    Bn =~ Bm
    An =~ Bm
    Bn =~ Am

Example:
    a1 =~ b0

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

Note: Either A0,AS or B0,BS can be affected based on which accumulator is used.

5.2.10 Parallel Bitwise Complement

The one’s complement of an accumulator is stored in an accumulator.

Assembler Syntax:
    An =~ Am; Bn =~ Bm
    An =~ Bm; Bn =~ Am

Example:
    a0 =~ a1;b0 =~ b1
    a0 =~ b1;b0 =~ a1

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.11 Negative Accumulator Move

Computes the two’s complement negative of the value in an accumulator and stores the result in an accumulator.

Assembler Syntax:
    Accum =- Accum
    An =- Am
    Bn =- Bm
    An =- Bm
    Bn =- Am

Example:
    b2 =- b1

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

Note: Either A0,AS or B0,BS can be affected based on which accumulator is used.

5.2.12 Parallel Negative Accumulator Move

Computes the two’s complement negative of the value in an accumulator and stores the result in an accumulator.
Assembler Syntax:
    An =- Am; Bn =- Bm
    An =- Bm; Bn =- Am

Example:
    a0 =- a1;b0 =- b1
    a2 =- b1;b2 =- a1

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.13 Absolute Value Accumulator Move

Absolute value of an accumulator is stored in an accumulator.

Assembler Syntax:
    Accum = |Accum|
    An = |Am|
    Bn = |Bm|
    An = |Bm|
    Bn = |Am|

Example:
    a0 = |a1|
    a3 = |b2|
    b3 = |b3|
    b0 = |a3|

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

Note: Either A0,AS or B0,BS can be affected based on which accumulator is used.

5.2.14 Parallel Absolute Value Accumulator Move

Absolute values of two accumulators are stored in two accumulators.

Assembler Syntax:
    An = |Am|; Bn = |Bm|
    An = |Bm|; Bn = |Am|

Example:
    a0 = |a1|; b0 = |b1|
    a3 = |b2|; b3 = |a2|

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.15 Bitwise OR

The bitwise OR of two accumulators is stored in an accumulator.

Assembler Syntax:
    An = An | Am
    Bn = Bn | Bm
    An = An | Bm
    Bn = Bn | Am

Example:
    a0 = a0 | a3
    b3 = b3 | b3
    a1 = a1 | b2
    b2 = b2 | a1

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.16 Parallel Bitwise OR

The bitwise OR of two accumulators is stored in an accumulator.

Assembler Syntax:
    An = An | Am;Bn = Bn | Bm
    An = An | Bm;Bn = Bn | Am

Example:
    a0 = a0 | a3;b0 = b0 | b3
    a0 = a0 | b3;b0 = b0 | a3

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.17 Bitwise Exclusive OR

The Bitwise Exclusive OR of two accumulators is stored in an accumulator.
Assembler Syntax:
    An = An ^ Am
    Bn = Bn ^ Bm
    An = An ^ Bm
    Bn = Bn ^ Am

Example:
    a0 = a0 ^ a3
    b3 = b3 ^ b3
    a1 = a1 ^ b2
    b2 = b2 ^ a1

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.18 Parallel Bitwise Exclusive OR

The Bitwise Exclusive OR of two accumulators is stored in an accumulator.

Assembler Syntax:
    An = An ^ Am;Bn = Bn ^ Bm
    An = An ^ Bm;Bn = Bn ^ Am

Example:
    a0 = a0 ^ a3;b0 = b0 ^ b3
    a1 = a1 ^ b2;b1 = b1 ^ a2

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.19 Bitwise AND

The Bitwise AND of two accumulators is stored in an accumulator.

Assembler Syntax:
    An = An & Am
    Bn = Bn & Bm
    An = An & Bm
    Bn = Bn & Am

Example:
    a0 = a0 & a3
    b3 = b3 & b3
    a1 = a1 & b2
    b2 = b2 & a1

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.20 Parallel Bitwise AND

Parallel bitwise accumulator ANDs.

Assembler Syntax:
    An = An & Am;Bn = Bn & Bm
    An = An & Bm;Bn = Bn & Am

Example:
    a1 = a1 & a3;b1 = b1 & b3
    a2 = a2 & b2;b2 = b2 & a2

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.21 Bitwise Zero

Zero all 72 bits of the designated accumulator.

Assembler Syntax:
    Accum = 0

Example:
    a2 = 0

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

Note: Either A0,AS or B0,BS can be affected based on which accumulator is used.

5.2.22 Parallel Bitwise Zero

Zero all 72 bits of the designated accumulator.

Assembler Syntax:
    An = 0; Bn = 0

Example:
    a2 = 0;b2 = 0

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.23 Bitwise Shift Left by One

Accumulator is shifted left by one. A zero is placed in the least significant bit, the most significant bit is lost.
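As a host-side illustration of the description above, the Python sketch below shifts a 72-bit pattern left by one place, filling bit 0 with zero and discarding the bit shifted out of the top. The 72-bit width is taken from the accumulator descriptions in this chapter; the function name shift_left_one and the choice to hold the accumulator as a non-negative Python integer are assumptions made for illustration.

```python
ACC_BITS = 72                       # accumulator width used in this chapter
ACC_MASK = (1 << ACC_BITS) - 1      # keeps a value inside 72 bits


def shift_left_one(acc: int) -> int:
    """Shift a raw 72-bit pattern left by one.

    Bit 0 becomes zero and the old bit 71 falls off the top, because
    Python integers are unbounded and must be masked back to 72 bits.
    """
    return (acc << 1) & ACC_MASK


if __name__ == "__main__":
    print(hex(shift_left_one(0x1)))
    print(hex(shift_left_one(1 << 71)))
```

The same mask-after-shift pattern would cover the shift-by-four and shift-by-eight forms by changing the shift count.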
Assembler Syntax:
    An = An << 1
    Bn = Bn << 1

Example:
    a0 = a0 << 1
    b2 = b2 << 1

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.24 Parallel Bitwise Shift Left by One

Accumulator is shifted left by one. A zero is placed in the least significant bit, the most significant bit is lost.

Assembler Syntax:
    An = An << 1; Bn = Bn << 1

Example:
    a0 = a0 << 1; b0 = b0 << 1

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.25 Bitwise Shift Left by Four

Accumulator is shifted left by four. Four zeros are placed in the least significant bits. The most significant 4 bits are lost.

Assembler Syntax:
    An = An << 4
    Bn = Bn << 4

Example:
    a0 = a0 << 4
    b2 = b2 << 4

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.26 Parallel Bitwise Shift Left by Four

Accumulator is shifted left by four. Four zeros are placed in the least significant bits. The most significant 4 bits are lost.

Assembler Syntax:
    An = An << 4;Bn = Bn << 4

Example:
    a0 = a0 << 4;b0 = b0 << 4
    a3 = a3 << 4;b3 = b3 << 4

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.27 Bitwise Shift Left by Eight

Accumulator is shifted left by eight. Eight zeros are placed in the least significant bits. The most significant 8 bits are lost.

Assembler Syntax:
    An = An << 8
    Bn = Bn << 8

Example:
    a0 = a0 << 8
    b2 = b2 << 8

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.28 Parallel Bitwise Shift Left by Eight

Accumulator is shifted left by eight. Eight zeros are placed in the least significant bits, the most significant 8 bits are lost.

Assembler Syntax:
    An = An << 8;Bn = Bn << 8

Example:
    a0 = a0 << 8;b0 = b0 << 8
    a3 = a3 << 8;b3 = b3 << 8

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.29 Bitwise Shift Right by One

Accumulator is shifted right by one.
The most significant bit is sign extended from the current value, the least significant bit is lost.

Assembler Syntax:
    An = An >> 1
    Bn = Bn >> 1

Example:
    a2 = a2 >> 1
    b3 = b3 >> 1

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.30 Parallel Bitwise Shift Right by One

Accumulator is shifted right by one. The most significant bit is sign extended from the current value, the least significant bit is lost.

Assembler Syntax:
    An = An >> 1;Bn = Bn >> 1

Example:
    a2 = a2 >> 1;b2 = b2 >> 1

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.31 Bitwise Test

The bitwise AND of the accumulators is performed, and the appropriate bits in the CCR are set according to the first accumulator. Neither accumulator is altered.

Assembler Syntax:
    An & Am
    Bn & Bm
    An & Bm
    Bn & Am

Example:
    a0 & a1
    b2 & b2
    a1 & b1
    b2 & a3

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.32 Parallel Bitwise Test

The bitwise AND of the accumulators is performed, and the appropriate bits in the CCR are set according to the first accumulator. Neither accumulator is altered.

Assembler Syntax:
    An & Am;Bn & Bm
    An & Bm;Bn & Am

Example:
    a0 & a1;b0 & b1
    a1 & b1;b1 & a1

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.33 Bitwise Compare

A bitwise comparison of the accumulators is performed, and the appropriate bits in the CCR are set according to the first accumulator. Neither accumulator is altered.
Assembler Syntax:
    An - Am
    Bn - Bm
    An - Bm
    Bn - Am

Example:
    a0 - a1
    b2 - b2
    a1 - b1
    b2 - a3

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.34 Parallel Bitwise Compare

A bitwise comparison of the accumulators is performed, and the appropriate bits in the CCR are set according to the first accumulator. Neither accumulator is altered.

Assembler Syntax:
    An - Am;Bn - Bm
    An - Bm;Bn - Am

Example:
    a0 - a1;b0 - b1
    a1 - b1;b1 - a1

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.35 Bitwise Absolute Value Compare

A bitwise comparison of the absolute values of the accumulators is performed, and the appropriate bits in the CCR are set according to the first accumulator. Neither accumulator is altered.

Assembler Syntax:
    |An| - |Am|
    |Bn| - |Bm|
    |An| - |Bm|
    |Bn| - |Am|

Example:
    |a0|-|a1|
    |b2|-|b2|
    |a1|-|b1|
    |b2|-|a3|

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

5.2.36 Parallel Bitwise Absolute Value Compare

A bitwise comparison of the absolute values of the accumulators is performed, and the appropriate bits in the CCR are set according to the first accumulator. Neither accumulator is altered.

Assembler Syntax:
    |An| - |Am|;|Bn| - |Bm|
    |An| - |Bm|;|Bn| - |Am|

Example:
    |a0|-|a1|; |b0|-|b1|
    |a1|-|b1|; |b1|-|a1|

Flags Affected:
    A0  A zero      AS  A sign
    B0  B zero      BS  B sign

Appendix A
Glossary

Table A-1. Glossary Terms

Term
    Definition
AGU
    Address Generation Unit.
ALU
    Arithmetic Logic Unit.
API
    Application Programmers Interface.
BIBO
    Bounded Input Bounded Output.
CASMSPEC
    Environment variable that is the default assembler option.
CCR
    Condition Code Register.
Complex Memory: Treating the same address across both X and Y Data memory as one complex value with the real part in X and the imaginary part in Y.
DSPAB: DSPA and DSPB.
DSPC: The third DSP in CS495xx.
FFT: Fast Fourier Transform.
Fast Interrupts, Short Interrupts, One-Instruction Interrupts: Interrupts that consist solely of a single instruction.
IFFT: Inverse Fast Fourier Transform.
IMG file: The image (RAM) information for an application.
ISI: Interrupt Service Instruction.
ISR: Interrupt Service Routine.
Limiting, Saturating: These terms are used interchangeably.
Long Interrupts, Slow Interrupts, Multi-Instruction Interrupts: If an interrupt needs to execute more than one instruction, the callint instruction is used for the ISI. This is referred to as a Long Interrupt. The callint instruction disables interrupts, pushes the PC onto the Subroutine Stack, and starts executing the specified ISR. The final instruction of the ISR should be retcc, which pops the PC and enables interrupts. The call or jmp instructions can also be used as ISIs, but they do not disable interrupts, which allows code reentrance. This is especially dangerous with call, because the Subroutine Stack can overflow and leave the processor in an unknown state.
Long Memory: Treating the same address across both X and Y Data memory as one double-precision 64-bit value.
MAC: Multiply-Accumulator.
MR: Mode Register.
Multifunction Instruction: Combining a 16-bit encoded move in the most significant 16 bits of a 32-bit opcode with a 16-bit encoded MAC/ALU operation in the least significant 16 bits results in a 32-bit multifunction instruction. A "NOP" is a valid 16-bit instruction for either half.
O file: The assembled form of a portion of a module.
    Output from: casm.exe
    Input to: clib.exe (or clink.exe)
Parallel or Dual Instructions: Encoding two moves or two MAC/ALU operations in one 16-bit opcode results in a Parallel Instruction. Parallel Instructions can be used in Multifunction Instructions.
SRS: Shifter/Rounder/Saturator.
ULD file: The image information for an application, possibly encrypted and properly formatted for booting.

Appendix B
List of Instructions by Category and Flag Reference

Table B-1. Instruction / Flag Reference Table

Instructions with Links to Instructions / Flags (Red)

Execution Control Instructions (Section 3.3 on page 3-2)
    do - Start Hardware Loop
    enddo - End Current Do-Loop
    do_patch - Jump to Patch
    jmp - Jump
    if - Jump Conditionally
    call - Jump To Subroutine
    callint - Answer Interrupt
    callint_stq - Answer Stack Interrupt
    ret - Return From Subroutine
    retint - Return From Interrupt
    retint_stq - Return From Stack Interrupt
    inten - Enable Interrupts
    intdis - Disable Interrupts
    halt - Stop Further Execution
    nop - No Operation

64-bit Peripheral Moves (Section 3.4 on page 3-12)
    XY Register Pair = ext(16-bit Address)
    Accum = ext(16-bit Address)
    ext(16-bit Address) = XY Register Pair
    ext(16-bit Address) = Accum (L, T1, T0)
    logexp = XY Register Pair
    XY Register Pair = logexp

Memory Moves - Direct (Section 3.5 on page 3-19)
    Any Reg = xmem[16-bit Address]
    xmem[16-bit Address] = Any Reg (L, T1, T0)
    Any Reg = ymem[16-bit Address]
    ymem[16-bit Address] = Any Reg (L, T1, T0)

MS Reg - See Table 2-25.
    pmem[16-bit Address] = Any Reg (L, T1, T0)
    Any Reg = inp[16-bit Address]
    outp[16-bit Address] = Any Reg (L, T1, T0)
    Any Reg = xmem[Index Register]
    xmem[Index Register] = Any Reg (L, T1, T0)
    Any Reg = ymem[Index Register]
    ymem[Index Register] = Any Reg (L, T1, T0)
    Any Reg = pmem[Index Register]
    pmem[Index Register] = Any Reg (L, T1, T0)
    outp[Index Register] = Any Reg (L, T1, T0)
    Any Reg = inp[Index Register]

Immediate Register Loads (Section 3.6 on page 3-36)
    fixed16(Destination) = (16-bit Data)
    ufixed16(Destination) = (16-bit Data)
    uhalfword(Destination) = (16-bit Data)
    Index Register = (16-bit Data)
    NM Register = (16-bit Data)
    Guard Register = (8-bit Data)
    halfword(Destination) = (16-bit Data)
    lo16(Destination) = (16-bit Data)
    MS Reg = (16-bit Data)
    AnyReg(Any Reg, Any Reg) (L, T1, T0)
    Any Reg = MS Reg (L, T1, T0)
    MS Reg = Any Reg (L, T1, T0)
    AnyReg (Any Reg, Any Reg), (Any Reg, Any Reg) (L, T1, T0)
    nm4 - nm7 (L, T1, T0)
    In = Im/(0) ± (16-bit Data)

Bit Manipulation Instructions (Section 3.7.1 on page 3-48)
    Bit Test (Z or Zero)
    Bit Set (Z or Zero)
    Bit Clear (Z or Zero)

MS Reg - See Table 2-25.
    (Z or Zero)

Multifunction Moves (Section 4.1 on page 4-1)
    DP Reg = xmem[Index Register]
    DP Reg = xmem[6-bit Address]
    xmem[Index Register] = DP Reg
    xmem[6-bit Address] = DP Reg (L, T1, T0)
    DP Reg = ymem[Index Register]
    DP Reg = ymem[6-bit Address]
    ymem[Index Register] = DP Reg
    ymem[6-bit Address] = DP Reg (L, T1, T0)
  Data Path Register to or from Any Register
    DP Reg = Any Reg
    Any Reg = DP Reg (L, T1, T0)

Parallel Multifunction Moves (Section 4.2 on page 4-10)
  Data Path Register to/from X or Y Memory
    Xn = xmem[Index Register]
    xmem[Index Register] = An (L, T1, T0)
    Ym = ymem[Index Register]
    ymem[Index Register] = Bm (L, T1, T0)

Data Path Register to Data Path Register (Section 4.3 on page 4-13)
    DP Reg = DP Reg

64-bit Multifunction Moves
  Data Path Register Pair to or from XY Memory (Section 4.5.1 on page 4-16)
    Data Path Register Pair = xymem[Index Register]
    Data Path Register Pair = xymem[6-bit Address]
    xymem[Index Register] = Data Path Register Pair
    xymem[6-bit Address] = Data Path Register Pair (L, T1, T0)
  Accumulator to or from XY Memory (Section 4.5.2 on page 4-18)
    Accum = xymem[Index Register]
    Accum = xymem[6-bit Address]
    xymem[Index Register] = Accum
    xymem[6-bit Address] = Accum (L, T1, T0)

Index Register Updates (Section 4.6 on page 4-19)
    In = Im ± (6-bit Data)
    In ±= 1/2/N

Multifunction Operations

Multifunction Arithmetic Instructions (Section 5.1.1 on page 5-1)
    Parallel Multiply/Multiply-Accumulate I
    Parallel Multiply/Multiply-Accumulate II
    Real Multiply/Multiply-Accumulate
    Parallel Squares
    Parallel Multiply with Add
    Multiply by One with Optional Accumulate
    Parallel Multiply by One with Optional Accumulate

Multifunction Accumulator Instructions (Section 5.2.1 on page 5-7)
    Parallel Add with Shift (A0, AS, B0, BS)
    Add with Shift (A0, AS, B0, BS)
    Conditional Operation - Maximum (A0, AS, B0, BS)
    Conditional Operation - Minimum (A0, AS, B0, BS)
    Conditional Operation - Absolute Value Maximum (A0, AS, B0, BS)
    Conditional Operation - Absolute Value Minimum (A0, AS, B0, BS)
    Bitwise Accumulator Move (A0, AS, B0, BS)
    Parallel Bitwise Accumulator Move (A0, AS, B0, BS)
    Bitwise Complement (A0, AS, B0, BS)
    Parallel Bitwise Complement (A0, AS, B0, BS)
    Negative Accumulator Move (A0, AS, B0, BS)
    Parallel Negative Accumulator Move (A0, AS, B0, BS)
    Absolute Value Accumulator Move (A0, AS, B0, BS)
    Parallel Absolute Value Accumulator Move (A0, AS, B0, BS)
    Bitwise OR (A0, AS, B0, BS)
    Parallel Bitwise OR (A0, AS, B0, BS)
    Bitwise Exclusive OR (A0, AS, B0, BS)
    Parallel Bitwise Exclusive OR (A0, AS, B0, BS)
    Bitwise AND (A0, AS, B0, BS)
    Parallel Bitwise AND (A0, AS, B0, BS)
    Bitwise Zero (A0, AS, B0, BS)
    Parallel Bitwise Zero (A0, AS, B0, BS)
    Bitwise Shift Left by One (A0, AS, B0, BS)
    Parallel Bitwise Shift Left by One (A0, AS, B0, BS)
    Bitwise Shift Left by Four (A0, AS, B0, BS)
    Parallel Bitwise Shift Left by Four (A0, AS, B0, BS)
    Bitwise Shift Left by Eight (A0, AS, B0, BS)
    Parallel Bitwise Shift Left by Eight (A0, AS, B0, BS)
    Bitwise Shift Right by One (A0, AS, B0, BS)
    Parallel Bitwise Shift Right by One (A0, AS, B0, BS)
    Bitwise Test (A0, AS, B0, BS)
    Parallel Bitwise Test (A0, AS, B0, BS)
    Bitwise Compare (A0, AS, B0, BS)
    Parallel Bitwise Compare (A0, AS, B0, BS)
    Bitwise Absolute Value Compare (A0, AS, B0, BS)
    Parallel Bitwise Absolute Value Compare (A0, AS, B0, BS)

Revision History

Revision  Date             Changes
UM7       December, 2011
UM8       March, 2012      Added .undef token to Section 1.4.16.6. Updated Section 1.4.16.9 to explain struct inside struct.
UM9       June, 2012       Added example to Table 1-8. Added Section 1.4.14, Section 2.7.3, and Section 2.7.4. Updated Section 3.4.5.
UM10      April, 2013      Updated description of jsr_data Register in Section 2.4.9. Added new status bits to MR register in Section 2.4.19. Added examples to Section 2.4.21.
UM11      September, 2013  Updated Section 1.4.16.3 to document .data_ovly, .xdata_ovly, .ydata_ovly, and .code_ovly segment macros. Added Section 3.3.16. Updated .strpos example in Table 1-9. Updated description of .extern <symbols> and added .export <symbols> to Macros Table 1-10.