Code Generation Tools Tips & Tricks
For All TI Code Generation Tools
April 2011

Online Resources
• Compiler Wiki
  – http://processors.wiki.ti.com/index.php?title=Category:Compiler
  – Good chance your question is answered here
  – Constantly evolving
  – All the manuals!
  – Downloads
  – FAQ
• Compiler Forum
  – http://e2e.ti.com/support/development_tools/compiler/f/343.aspx
  – Questions and discussion
  – Searchable

Actually Read the Readme
• This talk only covers the highlights
• Many more features covered in the readme files
• Critical detail as well
• Well worth the 1 hour it takes to read it
• Available with the compiler download
  – https://wwwa.ti.com/downloads/sds_support/TICodegenerationTools/download.htm
  – ARM compiler not there
• Also in root directory of compiler install
  – CCSv4 typical path: C:\Program Files\Texas Instruments\ccsv4\tools\compiler\target
  – target: c2000, c5400, c5500, c6000, msp430, tms470

Agenda
• Recommended Development Flow
• Linker Tips
• EABI
• Intrinsics vs. Assembly
• Types and Alignment
• 16 x 16 → 32 Multiply
• Memory Models
• Diagnostics
• Useful Utilities
• Remaining Tips

Recommended Development Flow
[Flowchart: Start, Edit, Compile -g, Debug, "Works?" (Yes/No), Compile -o, Profile, "Goals Met?" (Yes/No), Done]
• Separate use of -g and -o

Optimization vs. Debug
• As one improves the other degrades
• If you combine them together …
• Effects on debugging
  – Operations reordered
  – Variables eliminated
• Effects on performance
  – Instruction scheduling is impaired
• Compare performance -o vs. -o -g
  – C6x MP3 decoder
  – -o runs 16% faster than -o -g

Optimization Levels
  Option       Scope
  Default      None
  -o0          Statement
  -o1          Block
  -o2 or -o    Function
  -o3          File
  (optimization goes from low at the top to high at the bottom)
• Optimization is critical!
• Not on by default
• Option is -olevel where level is 0-3

If You Must Build with Debug …
• Then also use -mn
• Restores some of the lost performance
• Degrades debug experience, but not too much
• Wiki article: http://processors.wiki.ti.com/index.php/Debug_versus_Optimization_Tradeoff

Quick Help on Compiler Options
• For command line users
• Run compiler shell with no options
    C:\dir>cl6x | more
    TMS320C6x C/C++ Compiler vX.Y.Z
    …
    -@=filename        Read options from specified file
    -D=NAME[=value]    Pre-define NAME
    -I=dir             Add dir to #include search path
    …
• For more detail use -h -option
  – More detail not always present
• To search options use -h text
    C:\dir>cl6x -h debug
    -mn                Optimize fully in the presence of …
    --symdebug:coff    Enable full symbolic COFF debugging …
                       object or out file (DEPRECATED).
    -g                 Enable full symbolic DWARF debugging …

Options: Long and Short Forms
• All compiler options have a long form
  – Start with two dashes
  – Example: --symdebug:dwarf
• Many have a shorter alias
  – Example: -g
• Documentation and CCS build dialog emphasize long form
• Table of long and short forms for options used in this presentation:
  Short    Long
  -g       --symdebug:dwarf
  -o       --opt_level
  -mn      --optimize_with_debug
  -w       --warn_sections
  -pdr     --issue_remarks
  -pden    --display_error_number
  -pdsr    --diag_remark
  -pdsw    --diag_warning
  -pdse    --diag_error
  -pds     --diag_suppress

Recommended Build Options
• For C6000
  – http://processors.wiki.ti.com/index.php/C6000_Compiler:_Recommended_Compiler_Options
• For C5500
  – http://processors.wiki.ti.com/index.php/C5500_Compiler:_Recommended_Compiler_Options
• Many points made, however, apply equally well to all TI compilers

C6000 Optimization Hints
• Much can be gained from informing the compiler about additional properties of your code
  – Pointers don't access the same memory locations
  – How many times a loop iterates
  – Pointers are aligned
• http://processors.wiki.ti.com/index.php/C6000_CGT_Optimization_Lab_-_1
  – Shows techniques for giving the compiler that information
  – A working example is modified, in steps, to run faster and faster

Linker Tips
• Global variables are not initialized to 0!
  – Except under EABI
• The linker does not know your system memory layout
  – Linker command file does that
    • MEMORY and SECTIONS directives
• Use -w
  – Warns you when certain suspicious things occur
  – Creating a section without a specification
    • Never intentionally occurs in a production build
  – Stack not created
  – Heap not created

Introduction to EABI
• ABI: Application Binary Interface
  – Conventions which allow separately compiled object files and libraries to link into a cohesive executable
• Default for all TI compilers is COFF ABI
• TMS470 v4.4.x and C6000 v7.2.x introduce EABI
  – Build Option: --abi=eabi
• Impossible to mix COFF ABI and EABI
• Before using EABI yourself, obtain EABI versions of all your libraries
  – Availability is very good, but not yet 100%
• Details on the Wiki
  – http://processors.wiki.ti.com/index.php/EABI_Support_in_C6000_Compiler
  – http://processors.wiki.ti.com/index.php/EABI_Support_in_ARM_Compiler

Intrinsics vs. Assembly
• TI supports limited asm() statements
  – Not like GCC asm() statements
  – Very little interaction with C environment
  – No interaction with registers or local variables
• Intrinsics are preferred
  – Act like function calls
  – Implemented in one instruction
    • With a few exceptions
• Compiler knows very little about instructions within asm() statements
• Knows everything possible about intrinsics
• Thus, optimization with intrinsics is far more effective

Type Sizes by CPU
  Type          C28x™   C55x™   C6000™   MSP430   470
  char          16      16      8        8        8
  short         16      16      16       16       16
  int           16      16      32       16       32
  long          32      32      40       32       32
  long long     64      40      64       NA       64
  float         32      32      32       32       32
  double        32      32      64       32       64
  long double   64      32      64       32       64
• Shaded sizes → different from hosted systems
• C6000 EABI long is 32 bits. More on that later.

Type Differences
• Because char is 16 bits on C55x™ and C28x™
  – sizeof(int) == sizeof(char) == 1
  – Redefines the term byte
  – 8-bit wide external streams must be handled carefully
  – App note: http://www-s.ti.com/sc/techlit/spra757
• On 470 plain char is unsigned
  – Use plain char only for ASCII chars
  – Otherwise, use signed char or unsigned char
• Floating point is very slow on CPUs without hardware support
• MSP430: Will add 64-bit long long and 64-bit long double types

Standard Integer Typedefs
• stdint.h defines standard names for exact width integer types
    #if defined(__TMS320C2000__) || defined(_TMS320C5XX) \
     || defined(__TMS320C55X__)
        typedef int             int16_t;
        typedef unsigned int    uint16_t;
        typedef long            int32_t;
        typedef unsigned long   uint32_t;
    #elif defined(_TMS320C6X) || defined(__TMS470__)
        typedef signed char     int8_t;
        typedef unsigned char   uint8_t;
        typedef short           int16_t;
        typedef unsigned short  uint16_t;
        typedef int             int32_t;
        typedef unsigned int    uint32_t;
    #elif defined(__MSP430__)
    …
• Other typedefs: minimum width, fastest, etc.
• No need to define your own

C6000 EABI and Size of long
• COFF ABI: long is 40 bits
• EABI: long is 32 bits
  – Eases porting of general purpose code to C6000
• Porting from COFF ABI to EABI?
  – In COFF change from long to int40_t
    • Unless you are 100% certain 32 bits is enough
  – Makes port easy

Alignment & Structures
• Alignment is different from hosted systems
• Misaligned access
  – x86: Works but imposes cycle penalty
  – TI CPUs: Fails silently
• TI CPUs alignment == size of type (few exceptions)
• Structures are laid out differently between CPUs
  – Order of members is guaranteed
  – Any member, or the whole struct, may be aligned
  – Affects exchanging data with external sources

Packed Structures
• Support starts in these compilers …
  CPU Family   Version   Notes
  ARM          4.8.0     Cortex only
  C6000        7.2.0     Not C62x, C67x, C67x+
• Syntax is same as GCC
• Details here
  – http://processors.wiki.ti.com/index.php/GCC_Extensions_in_TI_Compilers
• Underlying HW must support unaligned access
  – Will be relaxed over time
• MSP430 support coming soon

16 x 16 → 32 Multiply
• Do not write …
    long_var = short_var1 * short_var2;
• Instead write …
    long_var = (long) short_var1 * (long) short_var2;
• Very important when sizeof(int) != sizeof(long)
• Accurately represents operation
• Implemented efficiently
• App note: http://www-s.ti.com/sc/techlit/spra683

Understanding Memory Models
                 Large      Small
  Memory Range   Full       Partial
  Code Size      Larger     Smaller
  Speed          Slower     Faster
  To enable      See docs   Default

  CPU      Extends   Because
  C6000™   Code      PC-relative branches
  C6000™   Data      DP-offset global variables
  C55x™    Data      2 ways to access globals; 2 sizes of address registers
  C28x™    Data      2 sizes of address registers
  MSP430   Code      Wider ALU
  MSP430   Data      Wider ALU
• Details on C6000 Data Memory Models
  – http://processors.wiki.ti.com/index.php/C6000_Memory_models

Compiler Diagnostics
                 Remark   Warning   Error
  Severity       Low      Medium    High
  Build fails?   No       No        Yes
  To enable      -pdr     Default   Default
• Is it okay to ignore remarks?

Automatic Error Detection
• Remarks expose common bugs
• This complete program compiles fine
    void main()
    {
       printf("hello, world\n");
    }
• But doesn't work
• Compile with -pdr to see remarks
    "hello.c", line 1: remark: function declared implicitly
• Always build with -pdr!
• Take remarks seriously
• Always include required RTS header files
    #include <stdio.h>
• Always prototype user functions

Elevate Remark to Error
• Force remark to cause build to fail
• Use 2 separate compilation steps
• First, use -pden -pdr to see remark number
    "hello.c", line 1: remark #225-D: function declared implicitly
• Second, use -pdsenum to elevate the remark, where num is the remark number
• In this case -pdse225 gives
    "hello.c", line 1: error: function declared implicitly
    1 error detected in the compilation of "hello.c".
    >> Compilation failure
• Can use -pdse225, without -pdr, on all builds
• Projects originated with CCSv4 have -pdsw225 by default
  – Warning, not an error

Control Diagnostic Levels
• First identify diagnostic id with -pden
  Set level to:   Option    #pragma
  Remark          -pdsrid   #pragma diag_remark id
  Warning         -pdswid   #pragma diag_warning id
  Error           -pdseid   #pragma diag_error id
  Default         none      #pragma diag_default id
  Suppress        -pdsid    #pragma diag_suppress id
• Diagnostics with "-D" appended to id can be suppressed or changed
  – All warnings and remarks
  – A few errors
• #pragma is alternative to -pdsXXX
• #pragma provides line by line control

Useful Utilities
• ofdXX: Dumps out contents of object files and libraries
  – Use -x option for XML format
• nmXX: Lists symbol table in .out or .obj file
• stripXX: Strips symbol and debug information from .out file
• demXX: Demangles symbols into their C++ form
• disXX: Disassembler
  – Not on some ISAs
• Replace XX with the abbreviation for your ISA, e.g. ofd6x, nm55, etc.

Even Cooler Utilities
• Turn the ofdXX XML or linker map file XML into useful information
• Call graph, stack usage, section sizes by type, compare libraries, and much more
• Called the cg_xml package
• Target independent
• Released separately from compiler tools
• Command line executables
  – http://processors.wiki.ti.com/index.php/Code_Generation_Tools_XML_Processing_Scripts
• Invoke from within CCSv4
  – http://processors.wiki.ti.com/index.php/Code_Generation_Tools_XML_Processing_Scripts_Plug-in_for_CCS

Remaining Tips
• C++ code is usually very efficient
• #pragma is compiler specific and does not port
  – http://processors.wiki.ti.com/index.php/Pragmas_You_Can_Understand
• C FAQ
  – http://www.eskimo.com/~scs/C-faq/faq.html
• C++ FAQ
  – http://www.parashift.com/c++-faq-lite

Questions

Backup Slides

Backup Agenda
• More Optimization Hints
• Standardize Handling of Types
• More on Diagnostics
• C Headers in Assembly
• Interrupt Intrinsics
• Predefined Symbols for Version

More Optimization Hints
• Optimization may uncover user errors like:
  – Uninitialized variables
  – Loose adherence to ANSI standard
  – Failure to use volatile
• Use volatile on variables modified by:
  – Interrupts
  – Peripherals
  – Other processors
• volatile controls order of access, not timing

inttypes.h
• Another header file for standardizing types
• Includes stdint.h
• Also defines printf format strings
    #include <inttypes.h>
    …
    printf("%" PRId32 "\n", (int32_t) 1);   // portable
    printf("%d\n", (int32_t) 1);            // not portable
  – Note careful use of commas in first printf

Diagnostic Control Example
    int ex(int i)
    {
       switch (i)
       {
          case 10 :

             return val();            /* line 7 */

             break;                   /* line 9 */
          …
       }
    }

    C:\dir> cl55 -pden -pdr ex.c
    "ex.c", line 7: remark #225-D: function declared implicitly
    "ex.c", line 9: warning #112-D: statement is unreachable

Diagnostic Control Example
    #pragma diag_error 225            /* Require explicit function decls */
    int ex(int i)
    {
       switch (i)
       {
          case 10 :
             return val();            /* line 7 */
    #pragma diag_suppress 112         /* suppress msg on break */
             break;                   /* line 9 */
    #pragma diag_default 112          /* restore msg level */
          …
       }
    }

    C:\dir> cl55 -pden -pdr ex.c
    "ex.c", line 7: error #225-D: function declared implicitly
    1 error detected in the compilation of "ex.c".
    >> Compilation failure
• #pragma is alternative to -pdsXXX
• #pragma provides precise control

Verbose Diagnostics -pdv
• For this source line
    extern struct example;
• The diagnostic is
    "ex.c", line 1: warning: a storage class may not be specified here
• To avoid confusion add -pdv
    "ex.c", line 1: warning: a storage class may not be specified here
      extern struct example;
      ^

C Headers in Assembly
    .cdecls optional parameters
    %{
       /* C/C++ code here, usually #include's */
    %}
• Converted constructs (usually found in .h files)
  – function/variable declarations (prototypes)
  – structs, unions, enumerations
  – Non-function-like macros
• NOT converted
  – function/variable definitions
  – function-like macros
• Each .cdecls region is separate context
• Conversion after pre-processing
• Option to warn on each construct NOT converted
• Includes generated assembly code in listing file

How C/C++ is Transformed
  hd.h:
    #define WANT_ID 1
    #define NAME "Jari\n"
    extern int var;
    extern float cvt(int src);
    struct duo { int ifld; float ffld; };
    enum status { OK = 1, FAIL = 256 };

  Assembly source and the result of the conversion:
            .cdecls C, LIST, "hd.h"
    -----------------------------
            .define "1", WANT_ID
            .define """Jari\n""", NAME
            .global _var
            .global _cvt
    duo     .struct 0, 2              ; pretty and
    ifld    .field 16                 ; informative
            .field 16                 ; comments
    ffld    .field 32
            .endstruct
    status  .enum
    OK      .emember 1
    FAIL    .emember 256
            .endenum

Typical Usage
  hd.h: same file as on the previous slide

            .cdecls C, LIST, "hd.h"
            .data
            .if $defined(WANT_ID)
    id:     .cstring NAME
            .endif
            .bss data, $sizeof(duo)
    data:   .tag duo
            ...
    hope:   AR1 = #data
            AC0 = *AR1(#(duo.ifld)) << 16
            dbl(*AR1(#(duo.ffld))) = AC0
            AR0 = #id
            T0 = #(status.OK)
            return

Interrupt Intrinsics
• For enabling/disabling interrupts
    unsigned int _disable_interrupts();        // 3 cycles on C6000
    unsigned int _enable_interrupts();         // 3 cycles on C6000
    void _restore_interrupts(unsigned int);    // 1 cycle on C6000
• Disable and enable return interrupt state before change; use that value when restoring state
• Barriers to optimization
• Usage example …
    unsigned int local;
    local = _disable_interrupts();
    if (sem) sem--;   /* atomic test/update of semaphore */
    _restore_interrupts(local);
• Replacement for HWI_disable(), HWI_restore()
  – Faster. HWI_disable on C64x takes 16 cycles.
• Cycle counts given are "best case" numbers
  – Ignores cache effects, memory latency, etc.

Predefined Symbols for Version
• __TI_COMPILER_VERSION__ and __TI_ASSEMBLER_VERSION__
• Each expands to an integer corresponding to the tool version
• Version number breaks down:
  – Major number (1-2 digits)
  – Minor version (3 digits)
  – Patch version (3 digits)
  – Example: v5.1.0 → 5 001 000 → 5001000
• Work around compiler bugs
    #if defined(_TMS320C6X) && __TI_COMPILER_VERSION__ == 5001000
       workaround C6x compiler bug
    #endif
• Target independent test for TI compiler
    #if defined(__TI_COMPILER_VERSION__)
       does not work with older compilers!
    #endif