This older slide presentation (PDF)

Code Generation Tools Tips & Tricks
For All TI Code Generation Tools
April 2011
Online Resources
• Compiler Wiki
– http://processors.wiki.ti.com/index.php?title=Category:Compiler (link)
– Good chance your question is answered here
– Constantly evolving
– All the manuals!
– Downloads
– FAQ
• Compiler Forum
– http://e2e.ti.com/support/development_tools/compiler/f/343.aspx (link)
– Questions and discussion
– Searchable
Actually Read the Readme
• This talk only covers the highlights
• Many more features covered in the readme files
• Critical detail as well
• Well worth the 1 hour it takes to read it
• Available with the compiler download
– https://wwwa.ti.com/downloads/sds_support/TICodegenerationTools/download.htm (link)
– ARM compiler not there
• Also in root directory of compiler install
– CCSv4 typical path: C:\Program Files\Texas Instruments\ccsv4\tools\compiler\target
– target: c2000, c5400, c5500, c6000, msp430, tms470
Agenda
• Recommended Development Flow
• Linker Tips
• EABI
• Intrinsics vs. Assembly
• Types and Alignment
• 16 x 16 → 32 Multiply
• Memory Models
• Diagnostics
• Useful Utilities
• Remaining Tips
Recommended Development Flow
[Flowchart: Start → Edit → Compile -g → Debug → Works? If yes: Compile -o → Profile → Goals Met? If yes: Done. Each "No" branch loops back.]
• Separate use of -g and -o
Optimization vs. Debug
• As one improves the other degrades
• If you combine them together …
• Effects on debugging
– Operations reordered
– Variables eliminated
• Effects on performance
– Instruction scheduling is impaired
• Compare performance -o vs. -o -g
– C6x MP3 decoder
– -o runs 16% faster than -o -g
Optimization Levels
Option      Scope
Default     None
-o0         Statement
-o1         Block
-o2 or -o   Function
-o3         File
(optimization increases from low at -o0 to high at -o3)
• Optimization is critical!
• Not on by default
• Option is -olevel where level is 0-3
If You Must Build with Debug …
• Then also use -mn
• Restores some of the lost performance
• Degrades debug experience, but not too much
• Wiki article: http://processors.wiki.ti.com/index.php/Debug_versus_Optimization_Tradeoff (link)
Quick Help on Compiler Options
• For command line users
• Run compiler shell with no options
C:\dir>cl6x | more
TMS320C6x C/C++ Compiler vX.Y.Z
…
-@=filename        Read options from specified file
-D=NAME[=value]    Pre-define NAME
-I=dir             Add dir to #include search path
…
• For more detail use -h -option
– More detail not always present
• To search options use -h text
C:\dir>cl6x -h debug
-mn                Optimize fully in the presence of …
--symdebug:coff    Enable full symbolic COFF debugging … object or out file (DEPRECATED).
-g                 Enable full symbolic DWARF debugging …
Options: Long and Short Forms
• All compiler options have a long form
– Start with two dashes
– Example: --symdebug:dwarf
• Many have a shorter alias
– Example: -g
• Documentation and CCS build dialog emphasize
long form
• Table of long and short forms for options used in
this presentation is next
Short form   Long form
-g           --symdebug:dwarf
-o           --opt_level
-mn          --optimize_with_debug
-w           --warn_sections
-pdr         --issue_remarks
-pden        --display_error_number
-pdsr        --diag_remark
-pdsw        --diag_warning
-pdse        --diag_error
-pds         --diag_suppress
Recommended Build Options
• For C6000
– http://processors.wiki.ti.com/index.php/C6000_Compiler:_Recommended_Compiler_Options (link)
• For C5500
– http://processors.wiki.ti.com/index.php/C5500_Compiler:_Recommended_Compiler_Options (link)
• Many of the points made, however, apply equally well to all TI compilers
C6000 Optimization Hints
• Much can be gained from informing the compiler
about additional properties of your code
– Pointers don’t access the same memory locations
– How many times a loop iterates
– Pointers are aligned
• http://processors.wiki.ti.com/index.php/C6000_CGT_Optimization_Lab_-_1 (link)
– Shows techniques for giving the compiler that
information
– A working example is modified, in steps, to run faster
and faster
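The three kinds of information listed above can be sketched in C as follows. This is only an illustration, not taken from the lab: the function name scale_q15, the Q15 scaling, and the stated limits (n a multiple of 4, between 4 and 1024, buffers 8-byte aligned) are assumptions. The restrict keyword is standard C99; _nassert and #pragma MUST_ITERATE are TI compiler features and are guarded so the code still builds elsewhere.

```c
/* Scale n 16-bit samples by a Q15 gain, writing the result to out.
 * Assumptions made for this sketch: in and out never overlap, n is a
 * multiple of 4 between 4 and 1024, and both buffers are 8-byte aligned. */
void scale_q15(const short *restrict in, short *restrict out, int n, short gain)
{
    int i;

#if defined(__TI_COMPILER_VERSION__) && defined(_TMS320C6X)
    /* Tell the TI C6000 compiler that both pointers are 8-byte aligned. */
    _nassert(((int)in & 0x7) == 0);
    _nassert(((int)out & 0x7) == 0);
    /* Loop runs at least 4 times, at most 1024 times, in multiples of 4. */
    #pragma MUST_ITERATE(4, 1024, 4)
#endif
    for (i = 0; i < n; i++) {
        /* Widen before multiplying, then drop the 15 fractional bits. */
        out[i] = (short)(((long)in[i] * (long)gain) >> 15);
    }
}
```

On a host compiler the guarded lines disappear and only restrict remains, so the function behaves the same everywhere; the hints only affect how aggressively the TI compiler can schedule the loop.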
Linker Tips
• Global variables are not initialized to 0!
– Except under EABI
• The linker does not know your system memory
layout
– Linker command file does that
• MEMORY and SECTIONS directives
• Use -w
– Warns you when certain suspicious things occur
– Creating a section without a specification
• Never intentionally occurs in a production build
– Stack not created
– Heap not created
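Because uninitialized globals are not guaranteed to start at 0 outside EABI, one defensive approach is to initialize state explicitly. The sketch below is a minimal illustration; the names app_state_init, app_record, app_history, and HISTORY_LEN are hypothetical.

```c
#include <string.h>

#define HISTORY_LEN 8   /* hypothetical buffer length */

/* Hypothetical application state. Without an initializer or an explicit
 * clear, these globals may hold arbitrary values at startup. */
static unsigned int sample_count = 0;   /* explicit initializer */
static long history[HISTORY_LEN];       /* cleared by app_state_init() */

/* Call once at startup, before any other use of the state above. */
void app_state_init(void)
{
    sample_count = 0;
    memset(history, 0, sizeof history);
}

/* Record one sample; returns how many samples have been recorded. */
unsigned int app_record(long sample)
{
    history[sample_count % HISTORY_LEN] = sample;
    return ++sample_count;
}

/* Read back a stored sample (index is taken modulo HISTORY_LEN). */
long app_history(unsigned int index)
{
    return history[index % HISTORY_LEN];
}
```

Calling app_state_init() first thing in main() makes the program independent of whether the startup code zeroes uninitialized data.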
Introduction to EABI
• ABI: Application Binary Interface
– Conventions which allow separately compiled object
files and libraries to link into a cohesive executable
• Default for all TI compilers is COFF ABI
• TMS470 v4.4.x and C6000 v7.2.x introduce EABI
– Build Option: --abi=eabi
• Impossible to mix COFF ABI and EABI
• Before using EABI yourself, obtain EABI
versions of all your libraries
– Availability is very good, but not yet 100%
• Details on the Wiki
– http://processors.wiki.ti.com/index.php/EABI_Support_in_C6000_Compiler (link)
– http://processors.wiki.ti.com/index.php/EABI_Support_in_ARM_Compiler (link)
Intrinsics vs. Assembly
• TI supports limited asm() statements
– Not like GCC asm() statements
– Very little interaction with C environment
– No interaction with registers or local variables
• Intrinsics are preferred
– Act like function calls
– Implemented in one instruction
• With a few exceptions
• Compiler knows very little about instructions
within asm() statements
• Knows everything possible about intrinsics
• Thus, optimization with intrinsics is far more
effective
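As an illustration of an intrinsic that acts like a function call, the sketch below wraps the C6000 _sadd intrinsic (saturating 32-bit add). The wrapper name sat_add32 is hypothetical, and the plain C fallback exists only so the example builds on other compilers; it assumes int is 32 bits there.

```c
#include <limits.h>

/* Saturating 32-bit add. On TI C6000, _sadd() is a compiler intrinsic, so
 * the optimizer knows exactly what it does. Elsewhere a plain C fallback is
 * used so that the sketch still builds. */
static int sat_add32(int a, int b)
{
#if defined(_TMS320C6X)
    return _sadd(a, b);
#else
    long long sum = (long long)a + (long long)b;
    if (sum > INT_MAX) return INT_MAX;   /* clamp on positive overflow */
    if (sum < INT_MIN) return INT_MIN;   /* clamp on negative overflow */
    return (int)sum;
#endif
}
```

Because the call site looks like ordinary C, the compiler can keep operands in registers and schedule around it, which is not possible for the contents of an asm() statement.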
Type Sizes by CPU
Type         C28x™  C55x™  C6000™  MSP430  470
char         16     16     8       8       8
short        16     16     16      16      16
int          16     16     32      16      32
long         32     32     40      32      32
long long    64     40     64      NA      64
float        32     32     32      32      32
double       32     32     64      32      64
long double  64     32     64      32      64
• Shaded sizes → different from hosted systems
• C6000 EABI long is 32-bits. More on that later.
Type Differences
• Because char is 16-bits on C55x™ and C28x™
– sizeof(int) == sizeof(char) == 1
– Redefines the term byte
– 8-bit wide external streams must be handled carefully
– App note: http://www-s.ti.com/sc/techlit/spra757 (link)
• On 470 plain char is unsigned
– Use plain char only for ASCII chars
– Otherwise, use signed char or unsigned char
• Floating point is very slow on CPUs without
hardware support
• MSP430: Will add 64-bit long long and 64-bit long
double types
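One way to handle 8-bit external streams carefully is to mask every element to its low 8 bits and to use CHAR_BIT rather than assuming a byte is 8 bits. The following is a minimal sketch; BITS_OF and read_be16 are hypothetical names, and the stream is assumed to carry one octet per array element.

```c
#include <limits.h>

/* Width of a type in bits; sizeof alone counts chars, not octets. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Read a 16-bit big-endian value from an external octet stream in which
 * each array element carries one octet in its low 8 bits. Masking with
 * 0xFF keeps the code correct whether char is 8 or 16 bits wide. */
unsigned int read_be16(const unsigned char *octets)
{
    unsigned int hi = octets[0] & 0xFFu;
    unsigned int lo = octets[1] & 0xFFu;
    return (hi << 8) | lo;
}
```

On a target with 16-bit char, BITS_OF(char) evaluates to 16 even though sizeof(char) is still 1.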
Standard Integer Typedefs
• stdint.h defines standard names for exact width
integer types
#if defined(__TMS320C2000__) || defined(_TMS320C5XX) \
 || defined(__TMS320C55X__)
typedef          int   int16_t;
typedef unsigned int   uint16_t;
typedef          long  int32_t;
typedef unsigned long  uint32_t;
#elif defined(_TMS320C6X) || defined(__TMS470__)
typedef   signed char  int8_t;
typedef unsigned char  uint8_t;
typedef          short int16_t;
typedef unsigned short uint16_t;
typedef          int   int32_t;
typedef unsigned int   uint32_t;
#elif defined(__MSP430__)
…
• Other typedefs: minimum width, fastest, etc.
• No need to define your own
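A short sketch of using the standard typedefs instead of home-grown ones follows. The struct sample_record_t, its fields, and join_halves are hypothetical. Note the use of int_least8_t for the small field: the listing above shows no int8_t on targets where char is 16 bits, while the minimum-width types are always available.

```c
#include <stdint.h>

/* Hypothetical record built only from stdint.h names. */
typedef struct {
    int_least8_t  gain_db;     /* at least 8 bits, present on every target */
    uint16_t      sample;      /* exactly 16 bits                          */
    uint32_t      timestamp;   /* exactly 32 bits                          */
    uint_fast16_t loop_count;  /* fastest type with at least 16 bits       */
} sample_record_t;

/* Combine two 16-bit halves into one 32-bit value without relying on the
 * width of int or long. */
uint32_t join_halves(uint16_t high, uint16_t low)
{
    return ((uint32_t)high << 16) | (uint32_t)low;
}
```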
C6000 EABI and Size of long
• COFF ABI: long is 40-bits
• EABI: long is 32-bits
– Eases porting of general purpose code to C6000
• Porting from COFF ABI to EABI?
– In COFF change from long to int40_t
• Unless you are 100% certain 32-bits is enough
– Makes port easy
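The porting step can be sketched as below. It assumes, as the slide implies, that the C6000 stdint.h provides int40_t; the alias acc40_t and the function sum_samples are hypothetical, and int64_t is used as a stand-in only so the example builds on a host.

```c
#include <stdint.h>

#if defined(_TMS320C6X)
typedef int40_t acc40_t;    /* 40-bit type, assumed to come from TI stdint.h */
#else
typedef int64_t acc40_t;    /* stand-in so the sketch builds on a host */
#endif

/* Sum n 32-bit samples. A 32-bit accumulator could overflow here, which is
 * why code written for a 40-bit long should keep a 40-bit type after the
 * move to EABI, where long shrinks to 32 bits. */
acc40_t sum_samples(const int32_t *samples, int n)
{
    acc40_t acc = 0;
    int i;
    for (i = 0; i < n; i++) {
        acc += samples[i];
    }
    return acc;
}
```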
Alignment & Structures
• Alignment is different from hosted systems
• Misaligned access
– x86: Works but imposes cycle penalty
– TI CPUs: Fails silently
• TI CPUs alignment == size of type (few
exceptions)
• Structures are laid out differently between CPUs
– Order of members is guaranteed
– Any member, or the whole struct, may be aligned
– Affects exchanging data with external sources
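When exchanging data with external sources, a common way to avoid both misaligned access and layout differences is to assemble values one element at a time, or to copy them into an aligned local. The sketch below shows both; load_le32 and load_native32 are hypothetical helper names, and the external data is assumed to be little-endian with one octet per array element.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value from an arbitrary offset in a buffer of
 * octets. Casting (buf + offset) to uint32_t* could produce a misaligned
 * access; assembling the value element by element avoids that. */
uint32_t load_le32(const unsigned char *buf, size_t offset)
{
    const unsigned char *p = buf + offset;
    return  (uint32_t)(p[0] & 0xFFu)
         | ((uint32_t)(p[1] & 0xFFu) << 8)
         | ((uint32_t)(p[2] & 0xFFu) << 16)
         | ((uint32_t)(p[3] & 0xFFu) << 24);
}

/* Copy a value out of a possibly misaligned location into an aligned local. */
uint32_t load_native32(const void *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

load_le32 gives the same result on every CPU regardless of its byte order, while load_native32 preserves whatever byte order the current CPU uses.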
Packed Structures
• Support starts in these compilers …
CPU Family   Version   Notes
ARM          4.8.0     Cortex only
C6000        7.2.0     Not C62x, C67x, C67x+
• Syntax is same as GCC
• Details here
– http://processors.wiki.ti.com/index.php/GCC_Extensions_in_TI_Compilers (link)
• Underlying HW must support unaligned access
– Will be relaxed over time
• MSP430 support coming soon
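Since the syntax is the same as GCC, a packed structure might look like the sketch below. The struct names and the 5-octet header layout are hypothetical; the example assumes a compiler that accepts the GCC packed attribute and a target with 8-bit char.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 5-octet wire header. The packed attribute removes the
 * padding that would normally precede the 32-bit member. */
struct __attribute__((packed)) wire_header {
    uint8_t  type;      /* offset 0 */
    uint32_t length;    /* offset 1, unaligned */
};

/* The same members without the attribute, for comparison. */
struct plain_header {
    uint8_t  type;
    uint32_t length;    /* usually preceded by padding */
};
```

Every access to the length member of wire_header is an unaligned access, which is why the underlying hardware has to support it.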
16 x 16 → 32 Multiply
• Do not write …
long_var = short_var1 * short_var2;
• Instead write …
long_var = (long) short_var1 * (long) short_var2;
• Very important when sizeof(int) != sizeof(long)
• Accurately represents operation
• Implemented efficiently
• App note: http://www-s.ti.com/sc/techlit/spra683 (link)
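Wrapped in functions, the recommended form looks like the sketch below. The names mul16x16 and dot16 are hypothetical; the point is only that both operands are cast to long before the multiply.

```c
/* 16 x 16 -> 32 multiply. Casting each operand to long makes the product a
 * long on every target, including those where int is only 16 bits. */
long mul16x16(short a, short b)
{
    return (long)a * (long)b;
}

/* Multiply-accumulate over two arrays of 16-bit samples (a dot product). */
long dot16(const short *x, const short *y, int n)
{
    long sum = 0;
    int i;
    for (i = 0; i < n; i++) {
        sum += (long)x[i] * (long)y[i];
    }
    return sum;
}
```

Without the casts, on a target with 16-bit int the product is computed in 16 bits and overflows before it is ever widened to long.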
Understanding Memory Models
              Large      Small
Memory Range  Full       Partial
Code Size     Larger     Smaller
Speed         Slower     Faster
To enable     See docs   Default
CPU      Extends  Because
C6000™   Code     PC-relative branches
C6000™   Data     DP-offset global variables
C55x™    Data     2 ways to access globals; 2 sizes of address registers
C28x™    Data     2 sizes of address registers
MSP430   Code     Wider ALU
MSP430   Data     Wider ALU
• Details on C6000 Data Memory Models
– http://processors.wiki.ti.com/index.php/C6000_Memory_models (link)
Compiler Diagnostics
              Remark  Warning  Error
Severity      Low     Medium   High
Build fails?  No      No       Yes
To enable     -pdr    Default  Default
• Is it okay to ignore remarks?
Automatic Error Detection
• Remarks expose common bugs
• This complete program compiles fine
void main() { printf("hello, world\n"); }
• But doesn’t work
• Compile with -pdr to see remarks
"hello.c", line 1: remark: function declared implicitly
• Always build with -pdr!
• Take remarks seriously
• Always include required RTS header files
#include <stdio.h>
• Always prototype user functions
Elevate Remark to Error
• Force remark to cause build to fail
• Use 2 separate compilation steps
• First, use -pden -pdr to see remark number
"hello.c", line 1: remark #225-D: function declared implicitly
• Second, use -pdsenum to elevate the remark
• In this case -pdse225 gives
"hello.c", line 1: error: function declared implicitly
1 error detected in the compilation of "hello.c".
>> Compilation failure
• Can use -pdse225, without -pdr, on all builds
• Projects originated with CCSv4 have -pdsw225 by default
– Warning, not an error
Control Diagnostic Levels
• First identify diagnostic id with -pden
Set level to:  Option    #pragma
Remark         -pdsrid   #pragma diag_remark id
Warning        -pdswid   #pragma diag_warning id
Error          -pdseid   #pragma diag_error id
Default        none      #pragma diag_default id
Suppress       -pdsid    #pragma diag_suppress id
• Diagnostics with “-D” appended to id can be
suppressed or changed
– All warnings and remarks
– A few errors
• #pragma is alternative to -pdsXXX
• #pragma provides line by line control
Useful Utilities
• ofdXX: Dumps out contents of object files and libraries
– Use -x option for XML format
• nmXX: Lists symbol table in .out or .obj file
• stripXX: Strips symbol and debug information from .out file
• demXX: Demangles symbols into their C++ form
• disXX: Disassembler
– Not on some ISAs
• Replace XX with the abbreviation for your ISA, e.g. ofd6x, nm55, etc.
Even Cooler Utilities
• Turn the ofdXX XML or linker map file XML into
useful information
• Call graph, stack usage, section sizes by type,
compare libraries, and much more
• Called the cg_xml package
• Target independent
• Released separately from compiler tools
• Command line executables
– http://processors.wiki.ti.com/index.php/Code_Generation_Tools_XML_Processing_Scripts (link)
• Invoke from within CCSv4
– http://processors.wiki.ti.com/index.php/Code_Generation_Tools_XML_Processing_Scripts_Plug-in_for_CCS (link)
Remaining Tips
• C++ code is usually very efficient
• #pragma is compiler specific and does not port
– http://processors.wiki.ti.com/index.php/Pragmas_You_Can_Understand (link)
• C FAQ
– http://www.eskimo.com/~scs/C-faq/faq.html (link)
• C++ FAQ
– http://www.parashift.com/c++-faq-lite (link)
Questions
Backup Slides
Backup Agenda
• More Optimization Hints
• Standardize Handling of Types
• More on Diagnostics
• C Headers in Assembly
• Interrupt Intrinsics
• Predefined Symbols for Version
More Optimization Hints
• Optimization may uncover user errors like:
– Uninitialized variables
– Loose adherence to ANSI standard
– Failure to use volatile
• Use volatile on variables modified by:
– Interrupts
– Peripherals
– Other processors
• volatile controls order of access, not timing
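A minimal sketch of the interrupt case follows. The names sample_isr and poll_sample are hypothetical, and how the handler is registered with the hardware is target specific and not shown.

```c
/* Set by the interrupt handler, polled by the main loop. Without volatile
 * an optimizing compiler may read the flag once and never reload it. */
static volatile int data_ready = 0;
static volatile int latest_sample = 0;

/* Hypothetical interrupt service routine. */
void sample_isr(int sample)
{
    latest_sample = sample;   /* write the data first ... */
    data_ready = 1;           /* ... then publish the flag */
}

/* Returns 1 and stores the sample if one is pending, otherwise returns 0. */
int poll_sample(int *out)
{
    if (!data_ready) {
        return 0;
    }
    *out = latest_sample;
    data_ready = 0;
    return 1;
}
```

The two volatile writes in sample_isr are kept in program order by the compiler, which matches the point above that volatile controls order of access rather than timing.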
inttypes.h
• Another header file for standardizing types
• Includes stdint.h
• Also defines printf format strings
#include <inttypes.h>
…
printf("%" PRId32 "\n", (int32_t) 1); // portable
printf("%d\n", (int32_t) 1);          // not portable
– Note careful use of commas in first printf
Diagnostic Control Example
int ex(int i)
{
   switch (i)
   {
      case 10 :
         return val();   /* line 7 */
         break;          /* line 9 */
      …
   }
}
C:\dir> cl55 -pden -pdr ex.c
"ex.c", line 7: remark #225-D: function declared implicitly
"ex.c", line 9: warning #112-D: statement is unreachable
Diagnostic Control Example
#pragma diag_error 225           /* Require explicit function decls */
int ex(int i)
{
   switch (i)
   {
      case 10 :
         return val();           /* line 7 */
#pragma diag_suppress 112        /* suppress msg on break */
         break;                  /* line 9 */
#pragma diag_default 112         /* restore msg level */
      …
   }
}
C:\dir> cl55 -pden -pdr ex.c
"ex.c", line 7: error #225-D: function declared implicitly
1 error detected in the compilation of "ex.c".
>> Compilation failure
• #pragma is alternative to -pdsXXX
• #pragma provides precise control
Verbose Diagnostics -pdv
• For this source line
extern struct example;
• The diagnostic is
"ex.c", line 1: warning: a storage class may not be specified here
• To avoid confusion add -pdv
"ex.c", line 1: warning: a storage class may not be specified here
  extern struct example;
  ^
C Headers in Assembly
.cdecls optional parameters
%{
/* C/C++ code here, usually #include’s */
%}
• Converted constructs (usually found in .h files)
– function/variable declarations (prototypes)
– structs, unions, enumerations
– Non-function-like macros
• NOT converted
– function/variable definitions
– function-like macros
• Each .cdecls region is separate context
• Conversion after pre-processing
• Option to warn on each construct NOT converted
• Includes generated assembly code in listing file
How C/C++ is Transformed
hd.h
#define WANT_ID 1
#define NAME "Jari\n"
extern int var;
extern float cvt(int src);
struct duo {
    int ifld;
    float ffld;
};
enum status {
    OK = 1,
    FAIL = 256
};

.cdecls C, LIST, "hd.h"
----------------------------
       .define "1", WANT_ID
       .define """Jari\n""", NAME
       .global _var
       .global _cvt
duo    .struct 0, 2      ; pretty and
ifld   .field 16         ; informative
       .field 16         ; comments
ffld   .field 32
       .endstruct
status .enum
OK     .emember 1
FAIL   .emember 256
       .endenum
Typical Usage
hd.h
#define WANT_ID 1
#define NAME "Jari\n"
extern int var;
extern float cvt(int src);
struct duo {
    int ifld;
    float ffld;
};
enum status {
    OK = 1,
    FAIL = 256
};

        .cdecls C, LIST, "hd.h"
        .data
        .if $defined(WANT_ID)
id:     .cstring NAME
        .endif
        .bss data, $sizeof(duo)
data:   .tag duo
        ...
hope:
        AR1 = #data
        AC0 = *AR1(#(duo.ifld)) << 16
        dbl(*AR1(#(duo.ffld))) = AC0
        AR0 = #id
        T0 = #(status.OK)
        return
Interrupt Intrinsics
• For enabling/disabling interrupts
unsigned int _disable_interrupts();        // 3 cycles on C6000
unsigned int _enable_interrupts();         // 3 cycles on C6000
void _restore_interrupts(unsigned int);    // 1 cycle on C6000
• Disable and enable return interrupt state before change; use that value when restoring state
• Barriers to optimization
• Usage example …
unsigned int local;
local = _disable_interrupts();
if (sem) sem--;        /* atomic test/update of semaphore */
_restore_interrupts(local);
• Replacement for HWI_disable(), HWI_restore()
– Faster. HWI_disable on C64x takes 16 cycles.
• Cycle counts given are “best case” numbers
– Ignores cache effects, memory latency, etc.
Predefined Symbols for Version
• __TI_COMPILER_VERSION__ and __TI_ASSEMBLER_VERSION__
• Returns int corresponding to compiler version
• Version number breaks down:
– Major number (1-2 digits)
– Minor version (3 digits)
– Patch version (3 digits)
– Example: v5.1.0 → 5 001 000 → 5001000
• Workaround compiler bugs
#if defined(_TMS320C6X) && __TI_COMPILER_VERSION__ == 5001000
workaround C6x compiler bug
#endif
• Target independent test for TI compiler
#if defined(__TI_COMPILER_VERSION__)
does not work with older compilers!
#endif
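The version layout described above (major, then three digits each for minor and patch) can be decoded with a few macros. This is a sketch only: the TI_VER_* macro names and the function built_with_ti_at_least are hypothetical and are not provided by the compiler.

```c
/* Decode a version number laid out as major * 1000000 + minor * 1000 + patch. */
#define TI_VER_MAJOR(v) ((v) / 1000000L)
#define TI_VER_MINOR(v) (((v) / 1000L) % 1000L)
#define TI_VER_PATCH(v) ((v) % 1000L)

/* Build an encoded value for comparisons, for example TI_VER(7, 2, 0). */
#define TI_VER(major, minor, patch) \
    ((major) * 1000000L + (minor) * 1000L + (patch))

/* Returns 1 when built by a TI compiler at or above the given version,
 * and 0 for older TI compilers or any other compiler. */
int built_with_ti_at_least(long major, long minor, long patch)
{
#if defined(__TI_COMPILER_VERSION__)
    return __TI_COMPILER_VERSION__ >= TI_VER(major, minor, patch);
#else
    (void)major; (void)minor; (void)patch;
    return 0;
#endif
}
```

Comparing against TI_VER(5, 1, 0) with >= rather than == makes a workaround apply to a range of releases instead of a single one, if that is what the bug calls for.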