ATMEL AVR32UC

Feature Summary
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
Small area, high clock frequency.
32-bit load/store AVR32A RISC architecture.
15 general-purpose 32-bit registers.
32-bit Stack Pointer, Program Counter and Link Register reside in register file.
Fully orthogonal instruction set.
Pipelined architecture allows one instruction per clock cycle for most instructions.
Byte, half-word, word and double word memory access.
Fast interrupts and multiple interrupt priority levels.
Privileged and unprivileged modes enabling efficient and secure Operating Systems.
Optional MPU allows for operating systems with memory protection.
Innovative instruction set together with variable instruction length ensuring industry
leading code density.
DSP extention with saturating arithmetic, and a wide variety of multiply instructions.
Memory Read-Modify-Write instructions.
Optional advanced On-Chip Debug system.
FlashVault™ support through Secure state for executing trusted code alongside
nontrusted code on the same CPU.
Optional floating-point hardware.
AVR32UC
Technical
Reference
Manual
32002F–03/2010
AVR32
1. Introduction
AVR32 is a new high-performance 32-bit RISC microprocessor core, designed for cost-sensitive
embedded applications, with particular emphasis on low power consumption and high code density. In addition, the instruction set architecture has been tuned to allow for a variety of
microarchitectures, enabling the AVR32 to be implemented as low-, mid- or high-performance
processors. AVR32 extends the AVR family into the world of 32- and 64-bit applications.
1.1
The AVR family
The AVR family was launched by Atmel in 1996 and has had remarkable success in the 8-and
16-bit flash microcontroller market. AVR32 is complements the current AVR microcontrollers.
Through the AVR32 family, the AVR is extended into a new range of higher performance applications that is currently served by 32- and 64-bit processors
To truly exploit the power of a 32-bit architecture, the new AVR32 architecture is not binary compatible with earlier AVR architectures. In order to achieve high code density, the instruction
format is flexible providing both compact instructions with 16 bits length and extended 32-bit
instructions. While the instruction length is only 16 bits for most instructions, powerful 32-bit
instructions are implemented to further increase performance. Compact and extended instructions can be freely mixed in the instruction stream.
1.2
The AVR32 Microprocessor Architecture
The AVR32 is a new innovative microprocessor architecture. It is a fully synchronous synthesisable RTL design with industry standard interfaces, ensuring easy integration into SoC designs
with legacy intellectual property (IP). Through a quantitative approach, a large set of industry
recognized benchmarks has been compiled and analyzed to achieve the best code density in its
class of microprocessor architectures. In addition to lowering the memory requirements, a compact code size also contributes to the core’s low power characteristics. The processor supports
byte and half-word data types without penalty in code size and performance.
Memory load and store operations are provided for byte, half-word, word and double word data
with automatic sign- or zero extension of half-word and byte data. The C-compiler is closely
linked to the architecture and is able to exploit code optimization features, both for size and
speed.
In order to reduce code size to a minimum, some instructions have multiple addressing modes.
As an example, instructions with immediates often have a compact format with a smaller immediate, and an extended format with a larger immediate. In this way, the compiler is able to use
the format giving the smallest code size.
Another feature of the instruction set is that frequently used instructions, like add, have a compact format with two operands as well as an extended format with three operands. The larger
format increases performance, allowing an addition and a data move in the same instruction in a
single cycle.
Load and store instructions have several different formats in order to reduce code size and
speed up execution:
• Load/store to an address specified by a pointer register
• Load/store to an address specified by a pointer register with postincrement
• Load/store to an address specified by a pointer register with predecrement
• Load/store to an address specified by a pointer register with displacement
2
32002F–03/2010
AVR32
• Load/store to an address specified by a small immediate (direct addressing within a small
page)
• Load/store to an address specified by a pointer register and an index register.
The register file is organized as 16 32-bit registers and includes the Program Counter, the Link
Register, and the Stack Pointer. In addition, one register is designed to hold return values from
function calls and is used implicitly by some instructions.
The AVR32 core defines several micro architectures in order to capture the entire range of applications. The microarchitectures are named AVR32A, AVR32B and so on. Different
microarchitectures are suited to different end applications, allowing the designer to select a
microarchitecture with the optimum set of parameters for a specific application.
1.3
Exceptions and Interrupts
The AVR32 incorporates a powerful exception handling scheme. The different exception
sources, like Illegal Op-code and external interrupt requests, have different priority levels, ensuring a well-defined behavior when multiple exceptions are received simultaneously. Additionally,
pending exceptions of a higher priority class may preempt handling of ongoing exceptions of a
lower priority class. Each priority class has dedicated registers to keep the return address and
status register thereby removing the need to perform time-consuming memory operations to
save this information.
There are four levels of external interrupt requests, all executing in their own context. An interrupt controller does the priority handling of the external interrupts and provides the prioritized
interrupt vector to the processor core.
1.4
Java Support
Some AVR32 implementations provide Java hardware acceleration. To reduce gate count,
AVR32UC does not implement any such hardware.
1.5
FlashVault
Revision 3 of the AVR32 architecture introduced a new CPU state called Secure State. This
state is instrumental in the new security technology named FlashVault. This innovation allows
the on-chip flash and other memories to be partially programmed and locked, creating a safe onchip storage for secret code and valuable software intellectual property. Code stored in the
FlashVault will execute as normal, but reading, copying or debugging the code is not possible.
This allows a device with FlashVault code protection to carry a piece of valuable software such
as a math library or an encryption algorithm from a trusted location to a potentially untrustworthy
partner where the rest of the source code can be developed, debugged and programmed.
1.6
Microarchitectures
The AVR32 architecture defines different microarchitectures, AVR32A and AVR32B. This
enables implementations that are tailored to specific needs and applications. The microarchitectures provide different performance levels at the expense of area and power consumption.
The AVR32A microarchitecture is targeted at cost-sensitive, lower-end applications like smaller
microcontrollers. This microarchitecture does not provide dedicated hardware registers for shadowing of register file registers in interrupt contexts. Additionally, it does not provide hardware
registers for the return address registers and return status registers. Instead, all this information
is stored on the system stack. This saves chip area at the expense of slower interrupt handling.
3
32002F–03/2010
AVR32
Upon interrupt initiation, registers R8-R12 are automatically pushed to the system stack. These
registers are pushed regardless of the priority level of the pending interrupt. The return address
and status register are also automatically pushed to stack. The interrupt handler can therefore
use R8-R12 freely. Upon interrupt completion, the old R8-R12 registers and status register are
restored, and execution continues at the return address stored popped from stack.
The stack is also used to store the status register and return address for exceptions and scall.
Executing the rete or rets instruction at the completion of an exception or system call will pop
this status register and continue execution at the popped return address.
1.7
The AVR32UC architecture
The first implementation of the AVR32A architecture is called AVR32UC. This implementation
targets low- and medium-performance applications, and provides an optional, advanced OCD
system, no data or instruction caches, and an optional Memory Protection Unit (MPU). Java
acceleration is not implemented.
AVR32UC provides three memory interfaces, one High Speed Bus (HSB) master for instruction
fetch, one HSB bus master for data access, and one HSB slave interface allowing other bus
masters to access data RAMs internal to the CPU. Keeping data RAMs internal to the CPU
allows fast access to the RAMs, reduces latency and guarantees deterministic timing. Also,
power consumption is reduced by not needing a full HSB bus access for memory accesses. A
dedicated data RAM interface is provided for communicating with the internal data RAMs.
If an optional MPU is present, all memory accesses are checked for privilege violations. If an
access is attempted to an illegal memory address, the access is aborted and an exception is
taken.
The following figure displays the contents of AVR32UC:
4
32002F–03/2010
AVR32
OCD interface
Reset interface
Overview of AVR32UC.
Interrupt controller interface
Figure 1-1.
OCD
system
Power/
Reset
control
AVR32UC CPU pipeline
MPU
1.8
CPU Local
Bus
master
Data RAM interface
High
Speed
Bus slave
CPU Local Bus
High
Speed
Bus
master
High Speed Bus
High Speed Bus
High Speed Bus master
Data memory controller
High Speed Bus
Instruction memory controller
AVR32UC CPU revisions
Three revisions of the AVR32UC CPU currently exist:
• Revision 1 implementing revision 1 of the AVR32 architecture.
• Revision 2 implementing revision 2 of the AVR32 architecture, and with a faster divider.
• Revision 3 implementing revision 3 of the AVR32 architecture, and with optional floating-point
hardware.
Revision 2 of the AVR32UC CPU added the following instructions:
• movh Rd, imm
• {add, sub, and, or, eor}{cond4}, Rd, Rx, Ry
• ld.{sb, ub, sh, uh, w}{cond4} Rd, Rp[disp]
• st.{b, h, w}{cond4} Rp[disp], Rs
• rsub{cond4} Rd, imm
Revision 3 of the AVR32UC CPU added the following instructions:
5
32002F–03/2010
AVR32
• sscall
• retss
• Floating-point instructions as described in Section 4. on page 40.
Revision 3 of the AVR32UC CPU added the following system registers:
• SS_STATUS
• SS_ADRF, SS_ADRR, SS_ADR0, SS_ADR1
• SS_SP_SYS, SS_SP_APP
• SS_RAR, SS_RSR
Revision 3 of the AVR32UC CPU added the following bit in the status register:
• SS
AVR32UC CPU revision 2 is fully backward-compatible with revision 1, ie. code compiled for
revision 1 is binary-compatible with revision 2 CPUs.
AVR32UC CPU revision 3 is fully backward-compatible with revision 1 and 2, ie. code compiled
for revision 1 and 2 is binary-compatible with revision 3 CPUs.
The Architecture Revision field in the CONFIG0 system register identifies which architecture
revision is implemented in a specific device. The “Processor and Architecture”-chapter of the
device datasheet identifies the CPU revision used.
6
32002F–03/2010
AVR32
2. Programming Model
This chapter describes the programming model and the set of registers accessible to the user. It
also describes the implementation options in AVR32UC.
2.1
Architectural compatibility
AVR32UC is fully compatible with the Atmel AVR32A architecture. AVR32UC devices implementing both revision 2 and revision 3 of the AVR32 Architecture exist. Refer to the device
datasheet or the device’s CONFIG0 register to determine which architecture revision the device
implements.
Architecture revision 3 is fully backwards compatible with revision 2, and additionally
implements:
• Secure state with associated programming model
• The automatic clearing of COUNT on COMPARE match is now optional and disabled by
setting the NOCOMPRES bit in CPUCR.
2.2
Implementation options
2.2.1
Memory protection
AVR32UC optionally supports an MPU as specified by the AVR32 architecture.
2.2.2
Java support
AVR32UC does not implement Java hardware acceleration.
2.2.3
2.3
Floating-Point Hardware
AVR32UC optionally supports Floating-Point Hardware implemented as coprocessor
instructions.
Register file configuration
The AVR32A architecture dictates a specific register file implementation, reproduced below.
Secure state context and secure state system registers are only available in devices implementing revision 3 of the AVR32 architecture.
7
32002F–03/2010
AVR32
Figure 2-1.
Register File in AVR32A
A p p lic a tio n
S u p e r v is o r
IN T 0
B it 3 1
B it 3 1
B it 3 1
B it 0
B it 0
IN T 1
B it 0
IN T 2
B it 3 1
B it 0
IN T 3
B it 3 1
B it 0
B it 3 1
B it 0
E x c e p tio n
NMI
B it 3 1
B it 3 1
B it 0
S e c u re
B it 0
B it 3 1
B it 0
PC
LR
SP_APP
R 12
R 11
R 10
R9
R8
IN TR07P C
IN TR16P C
F INRT5P C
SM
RP
4C
R3
R2
R1
R0
PC
LR
SP_SYS
R 12
R 11
R 10
R9
R8
IN TR07P C
IN TR16P C
F INRT5P C
SM
RP
4C
R3
R2
R1
R0
PC
LR
SP_SYS
R 12
R 11
R 10
R9
R8
IN TR07P C
IN TR16P C
F INRT5P C
SM
RP
4C
R3
R2
R1
R0
PC
LR
SP_SYS
R12
R11
R10
R9
R8
IN TR07P C
IN TR16P C
F INRT5P C
SM
RP
4C
R3
R2
R1
R0
PC
LR
SP_SYS
R12
R11
R10
R9
R8
IN TR07P C
IN TR16P C
F INRT5P C
SM
RP
4C
R3
R2
R1
R0
PC
LR
SP_SYS
R12
R11
R10
R9
R8
IN TR07P C
IN TR16P C
F INRT5P C
SM
RP
4C
R3
R2
R1
R0
PC
LR
SP_SYS
R 12
R 11
R 10
R9
R8
IN TR07P C
IN TR16P C
F INRT5P C
SM
RP
4C
R3
R2
R1
R0
PC
LR
SP_SYS
R 12
R 11
R 10
R9
R8
IN TR07P C
IN TR16P C
F INRT5P C
SM
RP
4C
R3
R2
R1
R0
PC
LR
SP_SEC
R 12
R 11
R 10
R9
R8
IN TR07P C
IN TR16P C
F INRT5P C
SM
RP
4C
R3
R2
R1
R0
SR
SR
SR
SR
SR
SR
SR
SR
SR
SS_STATU S
SS_A DR F
SS_A DR R
SS_A DR 0
SS_A DR 1
SS_SP_SYS
SS_SP_APP
SS_RAR
SS_R SR
2.4
The Status Register
The Status Register (SR) consists of two halfwords, one upper and one lower, see Figure 2-2 on
page 8 and Figure 2-3 on page 9. The lower halfword contains the C, Z, N, V and Q flags, as
well as the L and T bits, while the upper halfword contains information about the mode and state
the processor executes in. The upper halfword can only be accessed from a privileged mode.
Figure 2-2.
The Status Register high halfword
Bit 31
Bit 16
SS
LC
1
-
-
DM
D
-
M2
M1
M0
EM
I3M
I2M
FE
I1M
I0M
GM
0
0
0
0
0
0
0
0
0
1
1
0
0
0
0
1
Bit name
Initial value
Global Interrupt Mask
Interrupt Level 0 Mask
Interrupt Level 1 Mask
Interrupt Level 2 Mask
Interrupt Level 3 Mask
Exception Mask
Mode Bit 0
Mode Bit 1
Mode Bit 2
Reserved
Debug State
Debug State Mask
Reserved
Secure State
8
32002F–03/2010
AVR32
Figure 2-3.
The Status Register low halfword
Bit 15
Bit 0
-
T
-
-
-
-
-
-
-
-
L
Q
V
N
Z
C
Bit name
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
Initial value
Carry
Zero
Sign
Overflow
Saturation
Lock
Reserved
Scratch
Reserved
SS - Secure State
This bit is indicates if the processor is executing in the secure state. Only implemented in
devices implementing revision 3 of the AVR32 architecture, set to 0 in older revisions. The bit is
initialized in an IMPLEMENTATION DEFINED way at reset. Refer to Section 5. ”Secure State”
on page 59 for more information.
DM - Debug State Mask
If this bit is set, the Debug State is masked and cannot be entered. The bit is cleared at reset,
and can both be read and written by software.
D - Debug State
The processor is in debug state when this bit is set. The bit is cleared at reset and should only be
modified by debug hardware, the breakpoint instruction or the retd instruction. Undefined behaviour may result if the user tries to modify this bit using other mechanisms.
M2, M1, M0 - Execution Mode
These bits show the active execution mode. The settings for the different modes are shown in
Table 2-1 on page 10. M2 and M1 are cleared by reset while M0 is set so that the processor is in
supervisor mode after reset. These bits are modified by hardware when initiating interrupt or
exception processing. Execution of the scall, rets or rete instructions will also change these bits.
Undefined behaviour may result if the user tries to modify these bits using the mtsr, ssrf or csrf
instructions. If software needs to change these bits, scall, rets or rete should be used, possibly
with prior modifications of the stack, to achieve the desired changes in a safe way. Refer to the
AVR32 Architecture Manual for the behaviour of these instructions, note especially how the
stack is modified after their execution.
9
32002F–03/2010
AVR32
Table 2-1.
Mode bit settings
M2
M1
M0
Mode
1
1
1
Non Maskable Interrupt
1
1
0
Exception
1
0
1
Interrupt level 3
1
0
0
Interrupt level 2
0
1
1
Interrupt level 1
0
1
0
Interrupt level 0
0
0
1
Supervisor
0
0
0
Application
EM - Exception mask
When this bit is set, exceptions are masked. Exceptions are enabled otherwise. The bit is automatically set when exception processing is initiated or Debug Mode is entered. Software may
clear this bit after performing the necessary measures if nested exceptions should be supported.
This bit is set at reset.
I3M - Interrupt level 3 mask
When this bit is set, level 3 interrupts are masked. If I3M and GM are cleared, INT3 interrupts
are enabled. The bit is automatically set when INT3 processing is initiated. Software may clear
this bit after performing the necessary measures if nested INT3s should be supported. This bit is
cleared at reset.
I2M - Interrupt level 2 mask
When this bit is set, level 2 interrupts are masked. If I2M and GM are cleared, INT2 interrupts
are enabled. The bit is automatically set when INT3 or INT2 processing is initiated. Software
may clear this bit after performing the necessary measures if nested INT2s should be supported.
This bit is cleared at reset.
I1M - Interrupt level 1 mask
When this bit is set, level 1 interrupts are masked. If I1M and GM are cleared, INT1 interrupts
are enabled. The bit is automatically set when INT3, INT2 or INT1 processing is initiated. Software may clear this bit after performing the necessary measures if nested INT1s should be
supported. This bit is cleared at reset.
I0M - Interrupt level 0 mask
When this bit is set, level 0 interrupts are masked. If I0M and GM are cleared, INT0 interrupts
are enabled. The bit is automatically set when INT3, INT2, INT1 or INT0 processing is initiated.
Software may clear this bit after performing the necessary measures if nested INT0s should be
supported. This bit is cleared at reset.
GM - Global Interrupt Mask
When this bit is set, all interrupts are disabled. This bit overrides I0M, I1M, I2M and I3M. The bit
is automatically set when exception processing is initiated, Debug Mode is entered, or a Java
trap is taken. This bit is automatically cleared when returning from a Java trap. This bit is set
after reset.
10
32002F–03/2010
AVR32
T - Scratch bit
This bit is not set or cleared implicit by any instruction and the programmer can therefore use
this bit as a custom flag to for example signal events in the program. This bit is cleared at reset.
L - Lock flag
Used by the conditional store instruction. Used to support atomical memory access. Automatically cleared by rete. This bit is cleared after reset.
Q - Saturation flag
The saturation flag indicates that a saturating arithmetic operation overflowed. The flag is sticky
and once set it has to be manually cleared by a csrf instruction after the desired action has been
taken. See the Instruction set description for details.
V - Overflow flag
The overflow flag indicates that an arithmetic operation overflowed. See the Instruction set
description for details.
N - Negative flag
The negative flag is modified by arithmetical and logical operations. See the Instruction set
description for details.
Z - Zero flag
The zero flag indicates a zero result after an arithmetic or logic operation. See the Instruction set
description for details.
C - Carry flag
The carry flag indicates a carry after an arithmetic or logic operation. See the Instruction set
description for details.
2.5
System registers
The system registers are placed outside of the virtual memory space, and are only accessible
using the privileged mfsr and mtsr instructions. Some of the System Registers can be altered
automatically by hardware. The table below lists the system registers specified in AVR32UC.
The programmer is responsible for maintaining correct sequencing of any instructions following
a mtsr instruction.
Table 2-2.
System Registers
Reg #
Address
Name
Function
0
0
SR
Status Register
1
4
EVBA
Exception Vector Base Address
2
8
ACBA
Application Call Base Address
3
12
CPUCR
CPU Control Register
4
16
ECR
Exception Cause Register
5
20
RSR_SUP
Unused in AVR32UC
6
24
RSR_INT0
Unused in AVR32UC
7
28
RSR_INT1
Unused in AVR32UC
11
32002F–03/2010
AVR32
Table 2-2.
System Registers (Continued)
Reg #
Address
Name
Function
8
32
RSR_INT2
Unused in AVR32UC
9
36
RSR_INT3
Unused in AVR32UC
10
40
RSR_EX
Unused in AVR32UC
11
44
RSR_NMI
Unused in AVR32UC
12
48
RSR_DBG
Return Status Register for Debug Mode
13
52
RAR_SUP
Unused in AVR32UC
14
56
RAR_INT0
Unused in AVR32UC
15
60
RAR_INT1
Unused in AVR32UC
16
64
RAR_INT2
Unused in AVR32UC
17
68
RAR_INT3
Unused in AVR32UC
18
72
RAR_EX
Unused in AVR32UC
19
76
RAR_NMI
Unused in AVR32UC
20
80
RAR_DBG
Return Address Register for Debug Mode
21
84
JECR
Unused in AVR32UC
22
88
JOSP
Unused in AVR32UC
23
92
JAVA_LV0
Unused in AVR32UC
24
96
JAVA_LV1
Unused in AVR32UC
25
100
JAVA_LV2
Unused in AVR32UC
26
104
JAVA_LV3
Unused in AVR32UC
27
108
JAVA_LV4
Unused in AVR32UC
28
112
JAVA_LV5
Unused in AVR32UC
29
116
JAVA_LV6
Unused in AVR32UC
30
120
JAVA_LV7
Unused in AVR32UC
31
124
JTBA
Unused in AVR32UC
32
128
JBCR
Unused in AVR32UC
33-63
132-252
Reserved
Reserved for future use
64
256
CONFIG0
Configuration register 0
65
260
CONFIG1
Configuration register 1
66
264
COUNT
Cycle Counter register
67
268
COMPARE
Compare register
68
272
TLBEHI
Unused in AVR32UC
69
276
TLBELO
Unused in AVR32UC
70
280
PTBR
Unused in AVR32UC
71
284
TLBEAR
Unused in AVR32UC
72
288
MMUCR
Unused in AVR32UC
73
292
TLBARLO
Unused in AVR32UC
12
32002F–03/2010
AVR32
Table 2-2.
System Registers (Continued)
Reg #
Address
Name
Function
74
296
TLBARHI
Unused in AVR32UC
75
300
PCCNT
Unused in AVR32UC
76
304
PCNT0
Unused in AVR32UC
77
308
PCNT1
Unused in AVR32UC
78
312
PCCR
Unused in AVR32UC
79
316
BEAR
Bus Error Address Register
80
320
MPUAR0
MPU Address Register region 0
81
324
MPUAR1
MPU Address Register region 1
82
328
MPUAR2
MPU Address Register region 2
83
332
MPUAR3
MPU Address Register region 3
84
336
MPUAR4
MPU Address Register region 4
85
340
MPUAR5
MPU Address Register region 5
86
344
MPUAR6
MPU Address Register region 6
87
348
MPUAR7
MPU Address Register region 7
88
352
MPUPSR0
MPU Privilege Select Register region 0
89
356
MPUPSR1
MPU Privilege Select Register region 1
90
360
MPUPSR2
MPU Privilege Select Register region 2
91
364
MPUPSR3
MPU Privilege Select Register region 3
92
368
MPUPSR4
MPU Privilege Select Register region 4
93
372
MPUPSR5
MPU Privilege Select Register region 5
94
376
MPUPSR6
MPU Privilege Select Register region 6
95
380
MPUPSR7
MPU Privilege Select Register region 7
96
384
MPUCRA
MPU Cacheable Register A
97
388
MPUCRB
MPU Cacheable Register B
98
392
MPUBRA
MPU Bufferable Register A
99
396
MPUBRB
MPU Bufferable Register B
100
400
MPUAPRA
MPU Access Permission Register A
101
404
MPUAPRB
MPU Access Permission Register B
102
408
MPUCR
MPU Control Register
103
412
SS_STATUS
Secure State Status Register
104
416
SS_ADRF
Secure State Address Flash Register
105
420
SS_ADRR
Secure State Address RAM Register
106
424
SS_ADR0
Secure State Address 0 Register
107
428
SS_ADR1
Secure State Address 1 Register
108
432
SS_SP_SYS
Secure State Stack Pointer System Register
109
436
SS_SP_APP
Secure State Stack Pointer Application Register
13
32002F–03/2010
AVR32
Table 2-2.
System Registers (Continued)
Reg #
Address
Name
Function
110
440
SS_RAR
Secure State Return Address Register
111
444
SS_RSR
Secure State Return Status Register
112-191
448-764
Reserved
Reserved for future use
192-255
768-988
IMPL
IMPLEMENTATION DEFINED
248
992
MSU_ADDRHI
Memory Service Unit Address High Register
249
996
MSU_ADDRLO
Memory Service Unit Address Low Register
250
1000
MSU_LENGTH
Memory Service Unit Length Register
251
1004
MSU_CTRL
Memory Service Unit Control Register
252
1008
MSU_STATUS
Memory Service Unit Status Register
253
1012
MSU_DATA
Memory Service Unit Data Register
254
1016
MSU_TAIL
Memory Service Unit TailRegister
255
1020
Reserved
Reserved for future use
SR- Status Register
The Status Register is mapped into the system register space. This allows it to be loaded into
the register file to be modified, or to be stored to memory. The Status Register is described in
detail in Section 2.4 ”The Status Register” on page 8.
EVBA - Exception Vector Base Address
This register contains a pointer to the exception routines. All exception routines start at this
address, or at a defined offset relative to the address. Special alignment requirements may
apply for EVBA, depending on the implementation of the interrupt controller. Exceptions are
described in detail in the AVR32 Architecture Manual.
ACBA - Application Call Base Address
Pointer to the start of a table of function pointers. Subroutines can thereby be called by the compact acall instruction. This facilitates efficient reuse of code. Keeping this pointer as a register
facilitates multiple function pointer tables. ACBA is a full 32 bit register, but the lowest two bits
should be written to zero, making ACBA word aligned. Failing to do so may result in erroneous
behaviour.
CPUCR - CPU Control Register
Register controlling the configuration and behaviour of the CPU. The following fields are defined:
Table 2-3.
CPU control register
Name
Bit
Reset
Description
-
Other
-
Unused. Read as 0. Should be written as 0.
NOCOMP
RES
17
0
If set, COUNT is not set on COMPARE match. If cleared, COUNT is
cleared on COMPARE match.
LOCEN
16
0
Local Bus Enable. Must be written to 1 to enable the local bus. Any
access attempted to the LOCAL section when this bit is cleared will
result in a BUS ERROR.
14
32002F–03/2010
AVR32
Table 2-3.
Name
SPL
CPU control register
Bit
15:11
Reset
Description
16
Slave Pending Limit. The maximum number of clock cycles the slave
interface can have a request pending due to the CPU owning the RAMs.
After this period, the CPU will lose arbitrartion for the RAM, and the
slave access can proceed.
CPL
10:6
16
CPU Pending Limit. The maximum number of clock cycles the CPU can
have a request pending due to the slave interface owning the RAMs.
After this period, the slave interface will lose arbitrartion for the RAM,
and the CPU access can proceed.
COP
5:1
8
CPU Ownership Period. The number of cycles the CPU is guaranteed
to own the RAM after it has won the arbitration for the RAM. No
arbitration will be performed during this period.
SIE
0
1
Slave Interface Enable. If this bit is set, the slave interface is enabled.
Otherwise, the slave interface is disabled and any slave access will be
stalled.
ECR - Exception Cause Register
This register identifies the cause of the most recently executed exception. This information may
be used to handle exceptions more efficiently in certain operating systems. The register is
updated with a value equal to the EVBA offset of the exception, shifted 2 bit positions to the
right. Only the 9 lowest bits of the EVBA offset are considered. As an example, an ITLB miss
jumps to EVBA+0x50. The ECR will then be loaded with 0x50>>2 == 0x14. The ECR register is
not loaded when an scall, Breakpoint or OCD Stop CPU exception is taken. Note that for interrupts, the offset is given by the autovector provided by the interrupt controller. The resulting ECR
value may therefore overlap with an ECR value used by a regular exception. This can be
avoided by choosing the autovector offsets so that no such overlaps occur.
RSR_DBG - Return Status Register for Debug Mode
When Debug mode is entered, the status register contents of the original mode is automatically
saved in this register. When the debug routine is finished, the retd instruction copies the contents of RSR_DBG into SR.
RAR_DBG - Return Address Register for Debug Mode
When Debug mode is entered, the Program Counter contents of the original mode is automatically saved in this register. When the debug routine is finished, the retd instruction copies the
contents of RAR_DBG into PC.
CONFIG0 / 1 - Configuration Register 0 / 1
Used to describe the processor, its configuration and capabilities. The contents and functionality
of these registers is described in detail in Section 2.7 ”Configuration Registers” on page 17.
COUNT - Cycle Counter Register
Can be used as a general counter to time for example execution time. Can also be used
together with COMPARE to implement a periodic interrupt for example for an OS timer. The contents and functionality of this register is described in detail in Section 2.6 ”COMPARE and
COUNT registers” on page 17.
15
32002F–03/2010
AVR32
COMPARE - Cycle Counter Compare Register
Used together with COUNT to implement a periodic interrupt for example for an OS timer. The
contents and functionality of this register is described in detail in Section 2.6 ”COMPARE and
COUNT registers” on page 17.
BEAR - Bus Error Address Register
Physical address that caused a Data Bus Error. This register is Read Only. Writes are allowed,
but are ignored.
MPUARn - MPU Address Register n
Registers that define the base address and size of the protection regions. Refer to the AVR32
Architecture Manual for details.
MPUPSRn - MPU Privilege Select Register n
Registers that define which privilege register set to use for the different subregions in each protection region. Refer to the AVR32 Architecture Manual for details.
MPUCRA / MPUCRB - MPU Cacheable Register A / B
Registers that define if the different protection regions are cacheable. Refer to the AVR32 Architecture Manual for details.
MPUBRA / MPUBRB - MPU Bufferable Register A / B
Registers that define if the different protection regions are bufferable. Refer to the AVR32 Architecture Manual for details.
MPUAPRA / MPUAPRB - MPU Access Permission Register A / B
Registers that define the access permissions for the different protection regions. Refer to the
AVR32 Architecture Manual for details.
MPUCR - MPU Control Register
Register that control the operation of the MPU. Refer to the AVR32 Architecture Manual for
details.
SS_STATUS - Secure State Status Register
Register that can be used to pass status or other information from the secure state to the nonsecure state. Refer to Section 5. ”Secure State” on page 59 for details.
SS_ADRF, SS_ADRR, SS_ADR0, SS_ADR1 - Secure State Address Registers
Registers used to partition memories into a secure and a nonsecure section. The 10 LSBs must
always be written to zero. Refer to Section 5. ”Secure State” on page 59 for details.
SS_SP_SYS, SS_SP_APP - Secure State SP_SYS and SP_APP Registers
Read-only registers containing the SP_SYS and SP_APP values. Refer to Section 5. ”Secure
State” on page 59 for details.
SS_RAR, SS_RSR - Secure State Return Address and Return Status Registers
Contains the address and status register of the sscall instruction that called secure state. Also
used when returning to nonsecure state with the retss instruction. Refer to Section 5. ”Secure
State” on page 59 for details.
16
32002F–03/2010
AVR32
MSU_ADDRHI, MSU_ADDRLO, MSU_LENGTH, MSU_CTRL, MSU_STATUS, MSU_DATA, MSU_TAIL Memory Service Unit Registers
These registers are system register mappings of the Memory Service Unit Registers. Refer to
Section 9.8 ”Memory Service Unit” on page 138 for details.
2.6
COMPARE and COUNT registers
The COUNT register increments once every clock cycle, regardless of pipeline stalls and
flushes. The COUNT register can both be read and written. The COUNT register can be used
together with the COMPARE register to create a timer with periodic interrupt. The COUNT register is written to zero upon reset and compare match if the CPUCR[NOCOMPRES] bit is cleared,
otherwise COUNT is not reset on compare match. Incrementation of the COUNT register can
not be disabled. The COUNT register will increment even though a compare interrupt is pending.
The COMPARE register holds a value that the COUNT register is compared against. The COMPARE register can both be read and written. When the COMPARE and COUNT registers match,
a compare interrupt request is generated and COUNT is reset to 0. COUNT will thereafter continue incrementing in the following clock cycle. The interrupt request is routed out to the interrupt
controller, which may forward the request back to the processor as a normal interrupt request at
a priority level determined by the interrupt controller. Writing a value to the COMPARE register
clears any pending compare interrupt requests. The compare and exception generation feature
is disabled if the COMPARE register contains the value zero. The COMPARE register is written
to zero upon reset.
COUNT and COMPARE are clocked by a dedicated clock with the same frequency as the CPU
clock. This allows them to operate in some of the sleep modes. They can therefore be used as
timers even when the system use sleep modes. Consult the clock system documentation for
information on which sleep modes COUNT and COMPARE are operational.
2.7
Configuration Registers
Configuration registers are used to inform applications and operating systems about the setup
and configuration of the processor on which it is running, see Figure 2-4 on page 17.
AVR32UC implements the following read-only configuration registers.
Figure 2-4.
Configuration Registers
CONFIG0
31
24 23
Processor ID
20 19
-
16 15
Processor
Revision
13 12
AT
10 9
AR
7 6 5 4 3 2 1 0
MMUT
F J P O S D R
CONFIG1
31
26 25
IMMU SZ
20 19
DMMU SZ
16 15
ISET
13 12
ILSZ
10 9
IASS
6 5
DSET
3 2
DLSZ
0
DASS
17
32002F–03/2010
AVR32
Table 2-4 on page 18 shows the CONFIG0 fields.
Table 2-4.
CONFIG0 Fields
Name
Bit
Description
Processor ID
31:24
Specifies the type of processor. This allows the application to
distinguish between different processor implementations.
RESERVED
23:20
Reserved for future use.
Processor revision
19:16
Specifies the revision of the processor implementation.
Architecture type
AT
15:13
Value
Semantic
0
AVR32A
1
Unused in AVR32UC
Other
Reserved
Architecture Revision
AR
12:10
Value
Semantic
0
Unused in AVR32UC
1
Revision 1
2
Revision 2
3
Revision 3
Other
Reserved
MMU type
MMUT
9:7
Value
Semantic
0
None, using direct mapping and no segmentation
1
Unused in AVR32UC
2
Unused in AVR32UC
3
Memory Protection Unit
Other
Reserved
Floating-point unit implemented
F
Value
Semantic
0
No FPU implemented
1
Floating-Point Unit implemented
6
Java extension implemented
J
Value
Semantic
0
No Java extension implemented
1
Unused in AVR32UC
5
Performance counters implemented
P
Value
Semantic
0
No Performance Counters implemented
1
Unused in AVR32UC
4
18
32002F–03/2010
AVR32
Table 2-4.
Name
CONFIG0 Fields (Continued)
Bit
Description
On-Chip Debug implemented
O
Value
Semantic
0
No OCD implemented
1
OCD implemented
3
SIMD instructions implemented
S
Value
Semantic
0
No SIMD instructions
1
Unused in AVR32UC
2
DSP instructions implemented
D
Value
Semantic
0
Unused in AVR32UC
1
DSP instructions implemented
1
Memory Read-Modify-Write instructions implemented
R
Value
Semantic
0
Unused in AVR32UC
1
RMW instructions implemented
0
Table 2-5 on page 19 shows the CONFIG1 fields.
Table 2-5.
CONFIG1 Fields
Name
Bit
Description
IMMU SZ
31:26
Unused in AVR32UC
DMMU SZ
25:20
Specifies the number of MPU entries.
ISET
19:16
Unused in AVR32UC
ILSZ
15:13
Unused in AVR32UC
IASS
12:10
Unused in AVR32UC
DSET
9:6
Unused in AVR32UC
DLSZ
5:3
Unused in AVR32UC
DASS
2:0
Unused in AVR32UC
19
32002F–03/2010
AVR32
3. Pipeline
3.1
Overview
AVR32UC is a pipelined processor with three pipeline stages: IF, ID and EX. All instructions are
issued and complete in order. Some instructions may require several iterations through the EX
stage in order to complete.
The following figure shows an overview of the AVR32UC pipeline stages.
Figure 3-1.
The AVR32UC pipeline stages.
MUL
IF
ID
P re fe tc h u n it
D e c o d e u n it
R e g file
R ead
ALU
LS
M u ltip ly u n it
R e g file
w rite
A L U u n it
L o a d -s to re
u n it
The follwing abbreviations are used in the figure:
• IF - Instruction Fetch
• ID - Instruction Decode
• EX - Instruction Execute
• MUL - Multiplier
• ALU - Arithmetic-Logic Unit
• LS - Load/Store Unit
3.2
Prefetch unit
The prefetch unit comprises the IF pipestage, and is responsible for feeding instructions to the
decode unit. The prefetch unit fetches 32 bits at a time from the instruction memory interface
and places them in a FIFO prefetch buffer. At the same time, one instruction, either RISC
extended or compact, is fed to the decode stage.
3.3
Decode unit
The decode unit generates the necessary signals in order for the instruction to execute correctly.
The ID stage accepts one instruction each clock cycle from the prefetch unit. This instruction is
then decoded, and control signals and register file addresses are generated. If the instruction
cannot be decoded, an illegal instruction or unimplemented instruction exception is issued. The
ID stage also contains a state machine required for controlling multicycle instructions.
The ID stage performs the remapping of register file addresses from logical to physical
addresses. This is used for remapping the stack pointer register into the SP_APP or SP_SYS
registers.
20
32002F–03/2010
AVR32
3.4
EX pipeline stage
The Execute (EX) pipeline stage performs register file reads, operations on registers and memory, and register file writes.
3.4.1
ALU section
The ALU pipeline performs most of the data manipulation instructions, like arithmetical and logical operations. The ALU stage performs the following tasks:
• Target address calculation and condition check for change-of-flow instructions.
• Condition code checking for conditional instructions.
• Address calculation for memory accesses
• Writeback address calculation for the LS pipeline.
• All flag setting for arithmetical and logical instructions.
• The saturation needed by satadd and satsub.
• The operation needed by satrnds, satrndu, sats and satu.
• Signed and unsigned division
3.4.2
Multiply section
All multiply instructions execute in the multiply section. This section implements a 32 by 32 multiplier array, and 16x16, 32x16 and 32x32 multiplications and multiply-accumulates therefore
have an issue latency of one cycle. Multiplication of 32 by 32 bits to a 64-bit result require two
iterations through the multiplier array, and therefore needs several cycles to complete. This will
stall the multiply pipeline until the instruction is complete.
A special accumulator cache is implemented in the MUL section. This cache saves the multiplyaccumulate result in dedicated registers in the MUL section, as well as writing them back to the
register file. This allows subsequent MAC instructions to read the accumulator value from the
cache, instead of from the register file. This will speed up MAC operations by one clock cycle. If
a MAC instruction targets a register not found in the cache, one clock cycle is added to the MAC
operation, loading the accumulator value from the register file into the cache. In the next cycle,
the MAC operation is restarted automatically by hardware. If an instruction, like an add, mul or
load, is executed with target address equal to that of a valid cached register, the instruction will
update the cache.
The accumulator cache can hold one doubleword accumulator value, or one word accumulator
value. Hardware ensures that the accumulator cache is kept consistent. If another pipeline section writes to one of the registers kept in the accumulator cache, the cache is updated. The
cache is automatically invalidated after reset.
3.4.3
Load-store section
The load-store (LS) pipeline is able to read or write one register per clock cycle. The address is
calculated by the ALU section. Thereafter the address is passed on to the LS section and output
to the memory interface, together with the data to write if the access is a write. If the access is a
read, the read data is returned from the memory interface in the same cycle. If the read data
requires typecasting or other manipulation like performed by ldins or ldswp, this manipulation is
performed in the same cycle.
Any load or store multiple registers are decoded by the ID stage and passed on to the EX stage
as a series of single load or store word operations.
21
32002F–03/2010
AVR32
The read-modify-write instructions memc, mems and memt are performed as a non-interruptable
sequence of read from and write to memory. The load-store section generates the control signals required to perform this sequence. This sequence takes several clock cycles, so any
following instructions requiring the use of the load-store section must stall until the sequence is
finished. Following instructions that do not use the load-store section will not have to stall even if
the sequence has not finished.
Some memory operations to slow memories, such as memories on the HSB bus, may require
several clock cycles to perform. If required, the CPU pipeline will stall as long as necessary in
order to perform the memory access.
3.5
Support for unaligned addresses
All memory accesses must be performed with the correct alignment according to the data size.
The only exception to this is doubleword accesses, which are performed as two word accesses,
and therefore can be word-aligned. Any other unaligned memory access will cause an Data
Address Exception.
Instruction fetches must be halfword aligned. Any other alignment will cause an Instruction
Address Exception.
3.6
Forwarding hardware and hazard detection
Since the register file is read and written in the same pipeline stage, no hazards can occur, and
no forwarding is necessary. The programmer does not need to take any special considerations
regarding data hazards when writing code.
3.7
Event handling
Due to various reasons, the CPU may be required to abort normal program execution in order to
handle special, high-priority events. When handling of these events is complete, normal program
execution can be resumed. Traditionally, events that are generated internally in the CPU are
called exceptions, while events generated by sources external to the CPU are called interrupts.
The possible sources of events are listed in Table 3-4 on page 28.
The AVR32 has a powerful event handling scheme. The different event sources, like Illegal
Opcode and external interrupt requests, have different priority levels, ensuring a well-defined
behaviour when multiple events are received simultaneously. Additionally, pending events of a
higher priority class may preempt handling of ongoing events of a lower priority class.
When an event occurs, the execution of the instruction stream is halted, and execution control is
passed to an event handler at an address specified in Table 3-4 on page 28. Most of the handlers are placed sequentially in the code space starting at the address specified by EVBA, with
four bytes between each handler. This gives ample space for a jump instruction to be placed
there, jumping to the event routine itself. A few critical handlers have larger spacing between
them, allowing the entire event routine to be placed directly at the address specified by the
EVBA-relative offset generated by hardware. All external interrupt sources have autovectored
interrupt service routine (ISR) addresses. This allows the interrupt controller to directly specify
the ISR address as an address relative to EVBA. The autovector offset has 14 address bits, giving an offset of maximum 16384 bytes. The target address of the event handler is calculated as
(EVBA | event_handler_offset), not (EVBA + event_handler_offset), so EVBA and exception
code segments must be set up appropriately.
22
32002F–03/2010
AVR32
The same mechanisms are used to service all different types of events, including external interrupt requests, yielding a uniform event handling scheme.
Each pipeline stage has a pipeline register that holds the exception requests associated with the
instruction in that pipeline stage. This allows the exception request to follow the contaminated
instruction through the pipeline. Exceptions are detected in two different pipeline stages. The EX
stage detects all data-address related exceptions (DTLB Protection and Data Address). All other
exceptions, including interrupts, are detected in the ID stage. When an exception is detected in
EX, the EX stage and all upstream stages are flushed.
Generally, all exceptions, including breakpoint, have the failing instruction as restart address.
This allows a fixup exception routine to correct the error and restart the instruction. Interrupts
(INT0-3, NMI) have the address of the first non-completed instruction as restart address.
3.7.1
Exceptions and interrupt requests
When an event other than scall or debug request is received by the core, the following actions
are performed atomically:
1. The pending event will not be accepted if it is masked. The I3M, I2M, I1M, I0M, EM and
GM bits in the Status Register are used to mask different events. Not all events can be
masked. A few critical events (NMI, Unrecoverable Exception, TLB Multiple Hit and Bus
Error) can not be masked. When an event is accepted, hardware automatically sets the
mask bits corresponding to all sources with equal or lower priority. This inhibits acceptance of other events of the same or lower priority, except for the critical events listed
above. Software may choose to clear some or all of these bits after saving the necessary state if other priority schemes are desired. It is the event source’s responsibility to
ensure that their events are left pending until accepted by the CPU.
2. When a request is accepted, the Status Register and Program Counter of the current
context is stored to the system stack. If the event is an INT0, INT1, INT2 or INT3, registers R8-R12 and LR are also automatically stored to stack. Storing the Status Register
ensures that the core is returned to the previous execution mode when the current
event handling is completed. When exceptions occur, both the EM and GM bits are set,
and the application may manually enable nested exceptions if desired by clearing the
appropriate bit. Each exception handler has a dedicated handler address, and this
address uniquely identifies the exception source.
3. The Mode bits are set to reflect the priority of the accepted event, and the correct register file bank is selected. The address of the event handler, as shown in Table 3-4, is
loaded into the Program Counter.
The execution of the event handler routine then continues from the effective address calculated.
The rete instruction signals the end of the event. When encountered, the Return Status Register
and Return Address Register are popped from the system stack and restored to the Status Register and Program Counter. If the rete instruction returns from INT0, INT1, INT2 or INT3,
registers R8-R12 and LR are also popped from the system stack. The restored Status Register
contains information allowing the core to resume operation in the previous execution mode. This
concludes the event handling.
Note that event priorities are only used to determine which event handler to call first when multiple events are received simultaneously. Once control is passed on to the event handler,
handling of pending and lower priority events may be initiated if not masked.
For instance, it is possible to make a supervisor call (SCALL) from an interrupt level 0 handler,
even though the priority of a supervisor call event is lower than the active interrupt level 0 event.
23
32002F–03/2010
AVR32
3.7.2
Supervisor calls
The AVR32 instruction set provides a supervisor mode call instruction. The scall instruction is
designed so that privileged routines can be called from any context. This facilitates sharing of
code between different execution modes. The scall mechanism is designed so that a minimal
execution cycle overhead is experienced when performing supervisor routine calls from timecritical event handlers.
The scall instruction behaves differently depending on which mode it is called from. The behaviour is detailed in the instruction set reference. In order to allow the scall routine to return to the
correct context, a return from supervisor call instruction, rets, is implemented. In the AVR32A
microarchitecture, scall and rets uses the system stack to store the return address and the status register.
3.7.3
Debug requests
The AVR32 architecture defines a dedicated debug mode. When a debug request is received by
the core, Debug mode is entered. Entry into Debug mode can be masked by the DM bit in the
status register. Upon entry into Debug mode, hardware sets the SR[D] bit and jumps to the
Debug Exception handler. By default, debug mode executes in the exception context, but with
dedicated Return Address Register and Return Status Register. These dedicated registers
remove the need for storing this data to the system stack, thereby improving debuggability.
Debug mode is exited by executing the retd instruction. This returns to the previous context.
3.8
3.8.1
Special concerns
System stack
Event handling in AVR32UC, like in all AVR32A architectures, uses the system stack pointed to
by the system stack pointer, SP_SYS, for pushing and popping R8-R12, LR, status register and
return address. Since exception code may be timing-critical, SP_SYS should point to memory
addresses in the IRAM section, since the timing of accesses to this memory section is both fast
and deterministic.
The user must also make sure that the system stack is large enough so that any event is able to
push the required registers to stack. If the system stack is full, and an event occurs, the system
will enter an UNDEFINED state.
3.8.2
Clearing of pending interrupt requests
When an interrupt request is accepted by the CPU, the interrupt handler will eventually be
called. The interrupt handler is responsible for performing the required actions so that the
requesting module disasserts the interrupt request before the interrupt routine is exited with rete.
Failing to do so will cause the interrupt handler to be re-entered after the rete instruction has
been executed, since the interrupt request is still active. Different interrupt sources have different ways of disasserting requests, for example reading an interrupt cause register or writing to
specific control registers. Refer to the module-specific documentation for information on how to
disassert interrupt requests.
Disasserting an interrupt request often requires that a bus access is performed to the requesting
module. An example of such an access is to read an interrupt cause register. There will be a
latency from the execution of the load or store instruction that is to disassert the interrupt request
and the actual disassertion of the request. This latency can be caused by the bus system and
internal latencies in the interrupting module. It is important that the programmer makes sure that
the interrupt request has actually been disasserted before returning from the interrupt with rete.
24
32002F–03/2010
AVR32
This can usually be ensured by scheduling the code sequence disasserting the interrupt request
in such a way that one can be certain that the interrupt request has actually been disasserted
before the rete instruction is executed.
Code 3-1.
Clearing IRQs using code scheduling
// Using scheduling of instructions in the IRQ handler to make sure that the
// request has been disasserted before returning from the handler.
// Assume that the IRQ is cleared by reading PERIPH_INTCAUSE, r0 points to
// this register.
irq_handler:
<some instructions>
ld.w r12, r0[0] // Clear the IRQ
<some other instructions, enough to make sure that the IRQ is cleared>
rete
The mechanisms and timing required for disasserting an interrupt request from a module is specific to different modules. Usually, the request is disasserted within a few clock cycles after the
load or store instruction has been received by the module. In this case, a simple way of making
sure that the request has actually been disasserted is to use a data memory barrier (“Data memory barriers” on page 64). The DMB will block the CPU pipeline until the interrupt request has
been disasserted. At this point, the rete instruction can safely be executed.
Code 3-2.
Clearing IRQs using data memory barriers
// Using data memory barriers in the IRQ handler to make sure that the
// request has been disasserted before returning from the handler
// Assume that the IRQ is cleared by writing a bitmask to PERIPH_INTCLEAR.
// r0 points to this register, r1 contains the correct bitmask.
irq_handler:
<some instructions>
st.w r0[0], r1
ld.w r12, r0[0] // data memory barrier
rete
The programmer should consult the data sheets for the different peripheral modules to check if
special timings or concerns related to disasserting of interrupt requests apply to the specific
module.
3.8.3
Masking interrupt requests in peripheral modules
Handling an interrupt request involves several operations like pushing of registers to stack and
takes several clock cycles. The required operations are controlled by sequencing logic in hardware. This sequencing hardware does not permit that an asserted interrupt request is
disasserted while it is in the process of handling this interrupt request.
Hardware makes sure that manipulation of the GM and IxM bits in SREG can be performed
safely at all times using the mtsr, csrf and ssrf instructions. The programmer does not need to
take any special concerns when issuing one of these instructions.
All hardware connected to the CPU is implemented in such a way that once an interrupt request
is asserted by the hardware, it can only be disasserted by explicit actions by the programmer.
25
32002F–03/2010
AVR32
Many peripheral modules that are able to assert interrupt requests have control registers or
other means of masking one or more of its interrupt requests. For example, a USART can contain an interrupt mask register with individual bits for masking “TX ready” and “RX ready”
interrupts. Writing to such a mask register may cause a pending interrupt request from that module to be disasserted.
The programmer must at all times make sure that an action that will disassert interrupts at the
interrupt source is not performed if it is possible that the interrupt sequencing hardware is in the
processing of handling the interrupt request that will be disasserted by the action. It is safe to
perform such an action if one of the following is true:
• The SREG GM or IxM bit corresponding to the priority of the interrupt request to be masked
is set before the action is performed.
• It can be guaranteed that the interrupt request being masked by the action is disasserted
when the action is initiated and being performed.
Code 3-3.
Masking IRQs in a peripheral module which may assert an IRQ at any time
// Masking TX_READY IRQ in a peripheral by setting the TXMASK bit in the
// IRQMASK register of the peripheral.
// Could alternatively mask the SREG IxM bit associated with the IRQ source
disassert_periph_tx_irq:
ssrf AVR32_SREG_GM
mems PERIPH_IRQMASK, PERIPH_TXMASK
csrf AVR32_SREG_GM
If the interrupt request is disasserted during the critical clock cycles where the sequencing hardware is active handling this interrupt request, the CPU may enter an UNPREDICTABLE state.
3.9
Entry points for events
Several different event handler entry points exists. In AVR32UC, the reset address is
0x8000_0000. This places the reset address in the boot flash memory area.
TLB miss exceptions and scall have a dedicated space relative to EVBA where their event handler can be placed. This speeds up execution by removing the need for a jump instruction placed
at the program address jumped to by the event hardware. All other exceptions have a dedicated
event routine entry point located relative to EVBA. The handler routine address identifies the
exception source directly.
AVR32UC uses the ITLB and DTLB protection exceptions to signal a MPU protection violation.
ITLB and DTLB miss exceptions are used to signal that an access address did not map to any of
the entries in the MPU. TLB multiple hit exception indicates that an access address did map to
multiple TLB entries, signalling an error.
All external interrupt requests have entry points located at an offset relative to EVBA. This
autovector offset is specified by an external Interrupt Controller. The programmer must make
sure that none of the autovector offsets interfere with the placement of other code. The autovector offset has 14 address bits, giving an offset of maximum 16384 bytes.
Special considerations should be made when loading EVBA with a pointer. Due to security considerations, the event handlers should be located in non-writeable flash memory, or optionally in
a privileged memory protection region if an MPU is present.
26
32002F–03/2010
AVR32
If several events occur on the same instruction, they are handled in a prioritized way. The priority
ordering is presented in Table 3-4. If events occur on several instructions at different locations in
the pipeline, the events on the oldest instruction are always handled before any events on any
younger instruction, even if the younger instruction has events of higher priority than the oldest
instruction. An instruction B is younger than an instruction A if it was sent down the pipeline later
than A.
The addresses and priority of simultaneous events are shown in Table 3-4 on page 28. Some of
the exceptions are unused in AVR32UC since it has no MMU or coprocessor interface.
The interrupt system requires that an interrupt controller is present outside the core in order to
prioritize requests and generate a correct offset if more than one interrupt source exists for each
priority level. An interrupt controller generating different offsets depending on interrupt request
source is referred to as autovectoring. Note that the interrupt controller should generate
autovector addresses that do not conflict with addresses in use by other events or regular program code.
The addresses of the interrupt routines are calculated by adding the address on the autovector
offset bus to the value of the Exception Vector Base Address (EVBA). The INT0, INT1, INT2,
INT3, and NMI signals indicate the priority of the pending interrupt. INT0 has the lowest priority,
and NMI the highest priority of the interrupts.
27
32002F–03/2010
AVR32
Table 3-4.
Priority and handler addresses for events
Priority
Handler Address
Name
Event source
Stored Return Address
1
0x8000_0000
Reset
External input
Undefined
2
Provided by OCD system
OCD Stop CPU
OCD system
First non-completed instruction
3
EVBA+0x00
Unrecoverable exception
Internal
PC of offending instruction
4
EVBA+0x04
TLB multiple hit
MPU
PC of offending instruction
5
EVBA+0x08
Bus error data fetch
Data bus
First non-completed instruction
6
EVBA+0x0C
Bus error instruction fetch
Data bus
First non-completed instruction
7
EVBA+0x10
NMI
External input
First non-completed instruction
8
Autovectored
Interrupt 3 request
External input
First non-completed instruction
9
Autovectored
Interrupt 2 request
External input
First non-completed instruction
10
Autovectored
Interrupt 1 request
External input
First non-completed instruction
11
Autovectored
Interrupt 0 request
External input
First non-completed instruction
12
EVBA+0x14
Instruction Address
CPU
PC of offending instruction
13
EVBA+0x50
ITLB Miss
MPU
PC of offending instruction
14
EVBA+0x18
ITLB Protection
MPU
PC of offending instruction
15
EVBA+0x1C
Breakpoint
OCD system
First non-completed instruction
16
EVBA+0x20
Illegal Opcode
Instruction
PC of offending instruction
17
EVBA+0x24
Unimplemented instruction
Instruction
PC of offending instruction
18
EVBA+0x28
Privilege violation
Instruction
PC of offending instruction
19
EVBA+0x2C
Floating-point
UNUSED
20
EVBA+0x30
Coprocessor absent
Coprocessor
PC of offending instruction
21
EVBA+0x100
Supervisor call
Instruction
PC(Supervisor Call) +2
22
EVBA+0x34
Data Address (Read)
CPU
PC of offending instruction
23
EVBA+0x38
Data Address (Write)
CPU
PC of offending instruction
24
EVBA+0x60
DTLB Miss (Read)
MPU
PC of offending instruction
25
EVBA+0x70
DTLB Miss (Write)
MPU
PC of offending instruction
26
EVBA+0x3C
DTLB Protection (Read)
MPU
PC of offending instruction
27
EVBA+0x40
DTLB Protection (Write)
MPU
PC of offending instruction
28
EVBA+0x44
DTLB Modified
UNUSED
3.9.1
Description of events
3.9.1.1
Reset Exception
The Reset exception is generated when the reset input line to the CPU is asserted. The Reset
exception can not be masked by any bit. The Reset exception resets all synchronous elements
and registers in the CPU pipeline to their default value, and starts execution of instructions at
address 0x8000_0000.
SR = reset_value_of_SREG;
28
32002F–03/2010
AVR32
PC = 0x8000_0000;
All other system registers are reset to their reset value, which may or may not be defined. Refer
to the Programming Model chapter for details.
3.9.1.2
OCD Stop CPU Exception
The OCD Stop CPU exception is generated when the OCD Stop CPU input line to the CPU is
asserted. The OCD Stop CPU exception can not be masked by any bit. This exception is identical to a non-maskable, high priority breakpoint. Any subsequent operation is controlled by the
OCD hardware. The OCD hardware will take control over the CPU and start to feed instructions
directly into the pipeline.
RSR_DBG = SR;
RAR_DBG = PC;
SR[M2:M0] = B’110;
SR[D] = 1;
SR[DM] = 1;
SR[EM] = 1;
SR[GM] = 1;
3.9.1.3
Unrecoverable Exception
The Unrecoverable Exception is generated when an exception request is issued when the
Exception Mask (EM) bit in the status register is asserted. The Unrecoverable Exception can not
be masked by any bit. The Unrecoverable Exception is generated when a condition has
occurred that the hardware cannot handle. The system will in most cases have to be restarted if
this condition occurs.
*(--SPSYS) = PC of offending instruction;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x00;
3.9.1.4
TLB Multiple Hit Exception
The TLB Multiple Hit Exception is generated when an access hits in multiple MPU regions. This
is usually caused by programming error. Used only if an MPU is present.
*(--SPSYS) = PC of offending instruction;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x04;
3.9.1.5
Bus Error Exception on Data Access
The Bus Error on Data Access exception is generated when the data bus detects an error condition. This exception is caused by events unrelated to the instruction stream, or by data written to
the cache write-buffers many cycles ago. Therefore, execution can not be resumed in a safe
way after this exception. The return address placed on stack is unrelated to the operation that
29
32002F–03/2010
AVR32
caused the exception. The exception handler is responsible for performing the appropriate
action.
*(--SPSYS) = PC of first non-issued instruction;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x08;
BEAR = failing address
3.9.1.6
Bus Error Exception on Instruction Fetch
The Bus Error on Instruction Fetch exception is generated when the data bus detects an error
condition. This exception is caused by events related to the instruction stream. Therefore, execution can be restarted in a safe way after this exception, assuming that the condition that
caused the bus error is dealt with.
*(--SPSYS) = PC of first non-issued instruction;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x0C;
3.9.1.7
NMI Exception
The NMI exception is generated when the NMI input line to the core is asserted. The NMI exception can not be masked by the SR[GM] bit. However, the core ignores the NMI input line when
processing an NMI Exception (the SR[M2:M0] bits are B’111). This guarantees serial execution
of NMI Exceptions, and simplifies the NMI hardware and software mechanisms.
Since the NMI exception is unrelated to the instruction stream, the instructions in the pipeline are
allowed to complete. After finishing the NMI exception routine, execution should continue at the
instruction following the last completed instruction in the instruction stream.
*(--SPSYS) = PC of first noncompleted instruction;
*(--SPSYS) = SR;
SR[M2:M0] = B’111;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x10;
3.9.1.8
INT3 Exception
The INT3 exception is generated when the INT3 input line to the core is asserted. The INT3
exception can be masked by the SR[GM] bit, and the SR[I3M] bit. Hardware automatically sets
the SR[I3M] bit when accepting an INT3 exception, inhibiting new INT3 requests when processing an INT3 request.
The INT3 Exception handler address is calculated by adding EVBA to an interrupt vector offset
specified by an interrupt controller outside the core. The interrupt controller is responsible for
providing the correct offset.
30
32002F–03/2010
AVR32
Since the INT3 exception is unrelated to the instruction stream, the instructions in the pipeline
are allowed to complete. After finishing the INT3 exception routine, execution should continue at
the instruction following the last completed instruction in the instruction stream.
*(--SPSYS) = R8;
*(--SPSYS) = R9;
*(--SPSYS) = R10;
*(--SPSYS) = R11;
*(--SPSYS) = R12;
*(--SPSYS) = LR;
*(--SPSYS) = PC of first noncompleted instruction;
*(--SPSYS) = SR;
SR[M2:M0] = B’101;
SR[I3M] = 1;
SR[I2M] = 1;
SR[I1M] = 1;
SR[I0M] = 1;
PC = EVBA | INTERRUPT_VECTOR_OFFSET;
3.9.1.9
INT2 Exception
The INT2 exception is generated when the INT2 input line to the core is asserted. The INT2
exception can be masked by the SR[GM] bit, and the SR[I2M] bit. Hardware automatically sets
the SR[I2M] bit when accepting an INT2 exception, inhibiting new INT2 requests when processing an INT2 request.
The INT2 Exception handler address is calculated by adding EVBA to an interrupt vector offset
specified by an interrupt controller outside the core. The interrupt controller is responsible for
providing the correct offset.
Since the INT2 exception is unrelated to the instruction stream, the instructions in the pipeline
are allowed to complete. After finishing the INT2 exception routine, execution should continue at
the instruction following the last completed instruction in the instruction stream.
*(--SPSYS) = R8;
*(--SPSYS) = R9;
*(--SPSYS) = R10;
*(--SPSYS) = R11;
*(--SPSYS) = R12;
*(--SPSYS) = LR;
*(--SPSYS) = PC of first noncompleted instruction;
*(--SPSYS) = SR;
SR[M2:M0] = B’100;
SR[I2M] = 1;
SR[I1M] = 1;
SR[I0M] = 1;
PC = EVBA | INTERRUPT_VECTOR_OFFSET;
3.9.1.10
INT1 Exception
The INT1 exception is generated when the INT1 input line to the core is asserted. The INT1
exception can be masked by the SR[GM] bit, and the SR[I1M] bit. Hardware automatically sets
31
32002F–03/2010
AVR32
the SR[I1M] bit when accepting an INT1 exception, inhibiting new INT1 requests when processing an INT1 request.
The INT1 Exception handler address is calculated by adding EVBA to an interrupt vector offset
specified by an interrupt controller outside the core. The interrupt controller is responsible for
providing the correct offset.
Since the INT1 exception is unrelated to the instruction stream, the instructions in the pipeline
are allowed to complete. After finishing the INT1 exception routine, execution should continue at
the instruction following the last completed instruction in the instruction stream.
*(--SPSYS) = R8;
*(--SPSYS) = R9;
*(--SPSYS) = R10;
*(--SPSYS) = R11;
*(--SPSYS) = R12;
*(--SPSYS) = LR;
*(--SPSYS) = PC of first noncompleted instruction;
*(--SPSYS) = SR;
SR[M2:M0] = B’011;
SR[I1M] = 1;
SR[I0M] = 1;
PC = EVBA | INTERRUPT_VECTOR_OFFSET;
3.9.1.11
INT0 Exception
The INT0 exception is generated when the INT0 input line to the core is asserted. The INT0
exception can be masked by the SR[GM] bit, and the SR[I0M] bit. Hardware automatically sets
the SR[I0M] bit when accepting an INT0 exception, inhibiting new INT0 requests when processing an INT0 request.
The INT0 Exception handler address is calculated by adding EVBA to an interrupt vector offset
specified by an interrupt controller outside the core. The interrupt controller is responsible for
providing the correct offset.
Since the INT0 exception is unrelated to the instruction stream, the instructions in the pipeline
are allowed to complete. After finishing the INT0 exception routine, execution should continue at
the instruction following the last completed instruction in the instruction stream.
*(--SPSYS) = R8;
*(--SPSYS) = R9;
*(--SPSYS) = R10;
*(--SPSYS) = R11;
*(--SPSYS) = R12;
*(--SPSYS) = LR;
*(--SPSYS) = PC of first noncompleted instruction;
*(--SPSYS) = SR;
SR[M2:M0] = B’010;
SR[I0M] = 1;
PC = EVBA | INTERRUPT_VECTOR_OFFSET;
32
32002F–03/2010
AVR32
3.9.1.12
Instruction Address Exception
The Instruction Address Error exception is generated if the generated instruction memory
address has an illegal alignment.
*(--SPSYS) = PC;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x14;
3.9.1.13
ITLB Miss Exception
The ITLB Miss exception is generated when the MPU is enabled and the instruction memory
access does not hit in any regions. Used only if an MPU is present.
*(--SPSYS) = PC;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x50;
3.9.1.14
ITLB Protection Exception
The ITLB Protection exception is generated when the instruction memory access violates the
access rights specified by the protection region in which the address lies. Used only if an MPU is
present.
*(--SPSYS) = PC;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x18;
3.9.1.15
Breakpoint Exception
The Breakpoint exception is issued when the OCD breakpoint input line to the CPU is aseerted,
and SREG[DM] is cleared.
When entering the exception routine, RAR_DBG points to the breakpoint instruction, and the
CPU will enter Debug mode. An external debugger can optionally assume control of the CPU
when the Breakpoint Exception is executed. The debugger can then issue individual instructions
to be executed in Debug mode. Debug mode is exited with the retd instruction. This passes control from the debugger back to the CPU, resuming normal execution.
RSR_DBG = SR;
RAR_DBG = PC;
SR[M2:M0] = B’110;
SR[D] = 1;
SR[DM] = 1;
SR[EM] = 1;
SR[GM] = 1;
33
32002F–03/2010
AVR32
PC = EVBA | 0x1C;
3.9.1.16
Illegal Opcode
This exception is issued when the core fetches an unknown instruction, or when a coprocessor
instruction is not acknowledged. When entering the exception routine, the return address on
stack points to the instruction that caused the exception.
*(--SPSYS) = PC;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x20;
3.9.1.17
Unimplemented Instruction
This exception is issued when the core fetches an instruction supported by the instruction set
but not by the current implementation. This allows software implementations of unimplemented
instructions. When entering the exception routine, the return address on stack points to the
instruction that caused the exception.
Table 3-5.
List of unimplemented instructions.
Privileged Instructions
Comment
All SIMD instructions
No SIMD implemented
Coprocessor instructions adressing
unimplemented coprocessors
cache - perform cache operation
No cache implemented
incjosp - increment Java stack pointer
No Java implemented
popjc - pop Java context
No Java implemented
pushjc - push Java context
No Java implemented
retj- return from Java mode
No Java implemented
tlbr - read addressed TLB entry into
TLBEHI and TLBELO
No MMU present
tlbw - write TLB entry registers into
TLB
No MMU present
tlbs - search TLB for entry matching
TLBEHI[VPN]
No MMU present
*(--SPSYS) = PC;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x24;
34
32002F–03/2010
AVR32
3.9.1.18
Data Read Address Exception
The Data Read Address Error exception is generated if the address of a data memory read has
an illegal alignment.
*(--SPSYS) = PC;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x34;
3.9.1.19
Data Write Address Exception
The Data Write Address Error exception is generated if the address of a data memory write has
an illegal alignment.
*(--SPSYS) = PC;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x38;
3.9.1.20
DTLB Read Miss Exception
The DTLB Read Miss exception is generated when the MPU is enabled and the data memory
read access does not hit in any regions. Used only if an MPU is present.
*(--SPSYS) = PC;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x60;
3.9.1.21
DTLB Write Miss Exception
The DTLB Write Miss exception is generated when the MPU is enabled and the data memory
write access does not hit in any regions. Used only if an MPU is present.
*(--SPSYS) = PC;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x70;
3.9.1.22
DTLB Read Protection Exception
The DTLB Protection exception is generated when the data memory read violates the access
rights specified by the protection region in which the address lies. Used only if an MPU is
present.
*(--SPSYS) = PC;
*(--SPSYS) = SR;
35
32002F–03/2010
AVR32
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x3C;
3.9.1.23
DTLB Write Protection Exception
The DTLB Protection exception is generated when the data memory write violates the access
rights specified by the protection region in which the address lies. Used only if an MPU is
present.
*(--SPSYS) = PC;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x40;
3.9.1.24
Privilege Violation Exception
If the application tries to execute privileged instructions, this exception is issued. The complete
list of priveleged instructions is shown in Table 3-6 on page 36. When entering the exception
routine, the address of the instruction that caused the exception is stacked as the return
address.
*(--SPSYS) = PC;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x28;
Table 3-6.
List of instructions which can only execute in privileged modes.
Privileged Instructions
Comment
csrf - clear status register flag
Privileged only when accessing upper half of status register
mtsr - move to system register
mfsr - move from system register
mtdr - move to debug register
mfdr - move from debug register
rete- return from exception
rets - return from supervisor call
retd - return from debug mode
sleep - sleep
ssrf - set status register flag
3.9.1.25
Privileged only when accessing upper half of status register
DTLB Modified Exception
Unused in AVR32UC, since it has no MMU.
36
32002F–03/2010
AVR32
3.9.1.26
Floating-point Exception
Unused in AVR32UC.
3.9.1.27
Coprocessor Absent Exception
The Coprocessor Absent exception is generated when a nonexisting coprocessor is addressed
by a coprocessor instruction. Used only if one or more coprocessors are present. Executing
coprocessor instructions in systems with no coprocessors results in an Unimplemented Instruction exception instead.
*(--SPSYS) = PC;
*(--SPSYS) = SR;
SR[M2:M0] = B’110;
SR[EM] = 1;
SR[GM] = 1;
PC = EVBA | 0x30;
3.9.1.28
Supervisor call
Supervisor calls are signalled by the application code executing a supervisor call (scall) instruction. The scall instruction behaves differently depending on which context it is called from. This
allows scall to be called from other contexts than Application.
When the exception routine is finished, execution continues at the instruction following scall. The
rets instruction is used to return from supervisor calls.
If ( SR[M2:M0] == {B’000 or B’001} )
*(--SPSYS) = PC;
*(--SPSYS) = SR;
PC ← EVBA | 0x100;
SR[M2:M0] ← B’001;
else
LR ← PC + 2;
PC ← EVBA | 0x100;
3.10
Interrupt latencies
The following features in AVR32UC ensure low and deterministic interrupt latency:
• Four different interrupt levels and an NMI ensures that the user can efficiently prioritize the
interrupt sources.
• Long-running instructions such as ldm, stm, pushm, popm, divs and divu will be aborted if an
interrupt request is received. The slowest instruction that can not be aborted by a pending
interrupt has a worst case issue latency of 5 cycles. This implies that an interrupt request will
need to wait at most 5 cycles for an instruction to complete. The fastest instructions need
only a single cycle to complete.
• Interrupts are autovectored, allowing the CPU to jump directly to the interrupt handler.
• When an interrupt of level m is received, the CPU will start stacking register file registers,
return address and status register. After this stacking is performed, the CPU will jump to the
autovector address of the interrupt of level m. If an interrupt of level n, where n > m, is
received during this stacking, the CPU will jump to the autovector address of the interrupt of
level n, NOT the autovector address of the original interrupt.
37
32002F–03/2010
AVR32
Note that the overall system latency from an interrupt request is signaled to the request is being
handled depends on a number of things in addition to the latency through the CPU. The latency
through the interrupt controller will affect interrupt latency for all peripheral interrupt requests and
the bus matrix, code and data memories will affect overall responsiveness.
3.10.1
Maximum interrupt latency
The maximum CPU interrupt latency can be calculated as follows:
Table 3-7.
3.10.2
Maximum interrupt latency
Source
Delay
Wait for the slowest instruction to complete
6
Stack register file registers, return address and status register, and jump to
autovector target
10
Wait for autovector target instruction to be fetched
1
TOTAL
17
Minimum interrupt latency
The minimum CPU interrupt latency of an interrupt request of level m will occur when the CPU is
in the process of stacking the registers and return address associated with an interrupt request
of level n, where n < m. If the level m interrupt request arrives just as the CPU is about to jump to
the autovector address for the interrupt of level n, the CPU will jump directly to the autovector
address of the latest arriving interrupt. In this case, the minimum interrupt latency is as follows:
Table 3-8.
Minimum interrupt latency - higher priority interrupt preempts lower priority
interrupt
Source
Delay
Jump to autovector target
1
Wait for autovector target instruction to be fetched
1
TOTAL
2
Assuming that the interrupt request arrives when the CPU is in the process of executing program
code, the minimum interrupt latency can be calculated as follows:
Table 3-9.
3.11
Minimum interrupt latency - interrupt received when executing program code
Source
Delay
Wait for the fastest instruction to complete
1
Stack register file registers, return address and status register, and jump to
autovector target
10
Wait for autovector target instruction to be fetched
1
TOTAL
12
NMI latency
Non-maskable interrupts (NMI) behave similarly to interrupts, except that they do not automatically push register file registers on the stack. NMI can, similar to interrupts, abort long-running
instructions.
38
32002F–03/2010
AVR32
The maximum NMI latency can be calculated as follows:
Table 3-10.
Maximum NMI latency
Source
Delay
Wait for the slowest instruction to complete
6
Stack return address and status register, and jump to autovector target
4
Wait for autovector target instruction to be fetched
1
TOTAL
11
39
32002F–03/2010
AVR32
4. Floating Point Hardware
Newer versions of UC3 CPU introduced optional floating-point hardware performing 32-bit floating-point operations. Instructions controlling this hardware are mapped into the coprocessor
instruction space, addressed as coprocessor 0. The CONFIG0 system register F bit indicates if
floating-point hardware is present on a specific AVR32 device.
The floating point hardware reads operands and places results in the same register file as the
traditional AVR32 instructions. Floating-point compare updates the flags in the AVR32 Status
Register, so that the regular AVR32 branch instructions can be used directly after a floatingpoint compare.
The floating-point hardware consists of a fused multiply-accumulate unit, performing ± A ± ( X × Y ) )
as a single operation with no intermediate rounding, thereby resulting in greater precision than if
separate multiplication and addition had been performed. Hardware is also provided to convert
between integer and floating-point, to compare floating-point values, and to provide initial
approximations for reciprocal and reciprocal square root.
4.1
Compliance
The floating point hardware conforms to the requirements of the C standard, which is based on
the IEEE 754 floating point standard.
The round-to-nearest, ties to even rounding mode is used for all instructions except float-to-integer conversions. Float-to-integer conversions use the round-to-zero mode.
The hardware supports denormal numbers.
Signalling NaN are not provided, all NaN are non-signalling (quiet). NaNs are not propagated,
the default quiet NaN is always returned (0x7FC00000).
No floating-point exceptions are generated.
4.2
Operations
The floating-point instructions are mapped into the coprocessor instruction space, but use the
ordinary integer register file. The ordinary integer instructions such as memory accesses and
logical operations can therefore be used on the same register data as the floating point hardware uses. Therefore, no special floating-point data transfer instructions are required. All floating
point instructions are mapped to coprocessor 0 cop instructions, i.e. they are aliases for cop
instructions.
Attempting to execute instructions on any other coprocessor than coprocessor 0 will return a
coprocessor absent exception.
Attempting to execute coprocessor 0 instructions other than cop on a device with floating point
hardware will result in an unimplemented instruction exception.
Attempting to execute coprocessor 0 cop instructions on a device without floating point hardware will result in an unimplemented instruction exception.
4.2.1
Floating point compare (fcp.s)
The floating point compare instruction, fcp.s, updates the status register flags. Ordinary AVR32
branch instructions such as breq and conditional instructions such as retge and movls can use
40
32002F–03/2010
AVR32
the condition flags set by fcmp.s directly. The following mapping from floating point compare
results to AVR32 status register flags is used:
Table 4-1.
Compare result
Status register flags
Less
SREG[C] = 1
SREG[N] = 1
SREG[V] = 0
SREG[Z] = 0
Greater
SREG[C] = 0
SREG[N] = 0
SREG[V] = 0
SREG[Z] = 0
Equal
SREG[C] = 0
SREG[N] = 0
SREG[V] = 0
SREG[Z] = 1
Unordered
SREG[C] = 0
SREG[N] = 0
SREG[V] = 1
SREG[Z] = 0
Table 4-2.
4.2.2
Floating point compare flag setting
Floating point branch conditions
Branch if:
AVR32 Branch condition mnemonic
Equal
eq
Not Equal
ne
Greater than or equal
ge
Greater than
gt
Less than
lo
Less than or equal
ls
Unordered
vs
Floating point check (fchk.s)
This instruction checks the operand for special values, such as Not-a-Number (NaN), infinity (inf)
and denormal. Status register flags are set according to the result of the fchk.s instruction. This
instruction is useful since some algorithms require special treatment of these special values. The
floating point approximation instructions updates the status register flags in the same way as
fchk.s, since iterative approximation algorithms require special handling of these special values.
Ordinary AVR32 branch instructions such as breq and conditional instructions such as retge and
movls can use the condition flags set by fchk.s directly.
41
32002F–03/2010
AVR32
4.3
Instruction set
The following instructions are provided:
Table 4-3.
Floating point arithmetical instructions
Issue
latency
Mnemonics
Operands
Description
fmac.s
Rd, Ra, Rx, Ry
Multiply accumulate.
(Rd ← Ra + Rx*Ry)
2
fnmac.s
Rd, Ra, Rx, Ry
Multiply accumulate.
(Rd ← −Ra + Rx*Ry)
2
fmsc.s
Rd, Ra, Rx, Ry
Multiply subtract.
(Rd ← Ra − Rx*Ry)
2
fnmsc.s
Rd, Ra, Rx, Ry
Multiply subtract.
(Rd ← −Ra − Rx*Ry)
2
fmul.s
Rd, Rx, Ry
Multiply.
(Rd ← Rx*Ry)
2
fnmul.s
Rd, Rx, Ry
Multiply.
(Rd ← −Rx*Ry)
2
fadd.s
Rd, Rx, Ry
Add.
(Rd ← Rx + Ry)
2
fsub.s
Rd, Rx, Ry
Subtract.
(Rd ← Rx − Ry)
2
:
Table 4-4.
Floating point conversion instructions
Issue
latency
Mnemonics
Operands
Description
fcastrs.sw
Rd, Ry
Convert float to signed word, round-to-zero.
(Rd ← (signed int)Ry)
1
fcastrs.uw
Rd, Ry
Convert float to unsigned word, round-to-zero.
(Rd ← (unsigned int)Ry)
1
fcastsw.s
Rd, Ry
Convert signed word to float, round-to-nearest.
(Rd ← (float)Ry)
1
fcastuw.s
Rd, Ry
Convert unsigned word to float, round-tonearest.
(Rd ← (float)Ry)
1
42
32002F–03/2010
AVR32
:
Table 4-5.
Floating point compare instructions
Issue
latency
Mnemonics
Operands
Description
fcp.s
Rd, Rx
Compare floating point values in Rd and Rx,
and set status register flags accordingly.
1
fchk.s
Ry
Check floating point value in Rd for special
values such as Inf, NaN and Denormal, and set
status register flags accordingly.
1
:
Table 4-6.
4.4
Floating point approximation instructions
Mnemonics
Operands
Description
Issue
latency
frcpa.s
Rd, Ry
(Rd ← approx(1/Rx)), set status flags as fchk.s
1
frsqrta.s
Rd, Ry
(Rd ← approx(1/sqrt(Rx))), set status flags as
fchk.s
1
Detailed instruction description
43
32002F–03/2010
AVR32
FMAC.S – Floating Point Multiply-Accumulate
Description
Performs multiply-accumulate of the registers specified and stores the result in destination register.
Operation:
I.
Rd ← Ra + Rx*Ry;
Syntax:
I.
fmac.s Rd, Ra, Rx, Ry
Operands:
I.
{a, d, x, y} ∈ {0, 1, …, 15}
Status Flags:
Q:
V:
N:
Z:
C:
Not affected
Not affected
Not affected
Not affected
Not affected
Opcode:
31
1
1
29
28
1
0
25
0
0
0
24
1
20
1
0
1
0
19
16
Ra
15
0
0
0
0
0
Rd
Rx
Ry
44
32002F–03/2010
AVR32
FNMAC.S – Floating Point Negate-Multiply-Accumulate
Description
Performs negate-multiply-accumulate of the registers specified and stores the result in destination register.
Operation:
I.
Rd ← - Ra + Rx*Ry;
Syntax:
I.
fnmac.s Rd, Ra, Rx, Ry
Operands:
I.
{a, d, x, y} ∈ {0, 1, …, 15}
Status Flags:
Q:
V:
N:
Z:
C:
Not affected
Not affected
Not affected
Not affected
Not affected
Opcode:
31
1
1
29
28
1
0
25
0
0
0
24
1
20
1
0
1
0
19
16
Ra
15
0
0
0
0
1
Rd
Rx
Ry
45
32002F–03/2010
AVR32
FMSC.S – Floating Point Multiply-Subtract
Description
Performs multiply-subtract of the registers specified and stores the result in destination register.
Operation:
I.
Rd ← Ra - Rx*Ry;
Syntax:
I.
fmsc.s Rd, Ra, Rx, Ry
Operands:
I.
{a, d, x, y} ∈ {0, 1, …, 15}
Status Flags:
Q:
V:
N:
Z:
C:
Not affected
Not affected
Not affected
Not affected
Not affected
Opcode:
31
1
1
29
28
1
0
25
0
0
1
24
1
20
1
0
1
0
19
16
Ra
15
0
0
0
0
0
Rd
Rx
Ry
46
32002F–03/2010
AVR32
FNMSC.S – Floating Point Negate-Multiply-Subtract
Description
Performs negate-multiply-subtract of the registers specified and stores the result in destination
register.
Operation:
I.
Rd ← - Ra - Rx*Ry;
Syntax:
I.
fnmsc.s Rd, Ra, Rx, Ry
Operands:
I.
{a, d, x, y} ∈ {0, 1, …, 15}
Status Flags:
Q:
V:
N:
Z:
C:
Not affected
Not affected
Not affected
Not affected
Not affected
Opcode:
31
1
1
29
28
1
0
25
0
0
1
24
1
20
1
0
1
0
19
16
Ra
15
0
0
0
0
1
Rd
Rx
Ry
47
32002F–03/2010
AVR32
FADD.S – Floating Point Add
Description
Performs addition of the registers specified and stores the result in destination register.
Operation:
I.
Rd ← Rx + Ry;
Syntax:
I.
fadd.s Rd, Rx, Ry
Operands:
I.
{d, x, y} ∈ {0, 1, …, 15}
Status Flags:
Q:
V:
N:
Z:
C:
Not affected
Not affected
Not affected
Not affected
Not affected
Opcode:
31
1
1
29
28
1
0
25
0
1
0
24
1
1
0
1
20
19
0
0
16
0
0
15
0
0
0
0
0
0
Rd
Rx
Ry
48
32002F–03/2010
AVR32
FSUB.S – Floating Point Subtract
Description
Performs subtraction of the registers specified and stores the result in destination register.
Operation:
I.
Rd ← Rx - Ry;
Syntax:
I.
fsub.s Rd, Rx, Ry
Operands:
I.
{d, x, y} ∈ {0, 1, …, 15}
Status Flags:
Q:
V:
N:
Z:
C:
Not affected
Not affected
Not affected
Not affected
Not affected
Opcode:
31
1
1
29
28
1
0
25
0
1
0
24
1
1
0
1
20
19
0
0
16
0
0
15
0
1
0
0
0
0
Rd
Rx
Ry
49
32002F–03/2010
AVR32
FMUL.S – Floating Point Multiplication
Description
Performs multiplication of the registers specified and stores the result in destination register.
Operation:
I.
Rd ← Rx * Ry;
Syntax:
I.
fmul.s Rd, Rx, Ry
Operands:
I.
{d, x, y} ∈ {0, 1, …, 15}
Status Flags:
Q:
V:
N:
Z:
C:
Not affected
Not affected
Not affected
Not affected
Not affected
Opcode:
31
1
1
29
28
1
0
25
0
1
0
24
1
1
0
1
20
19
0
0
16
1
0
15
0
0
0
0
0
0
Rd
Rx
Ry
50
32002F–03/2010
AVR32
FNMUL.S – Floating Point Multiply-Negate
Description
Performs multiply-negate of the registers specified and stores the result in destination register.
Operation:
I.
Rd ← - Rx * Ry;
Syntax:
I.
fnmul.s Rd, Rx, Ry
Operands:
I.
{d, x, y} ∈ {0, 1, …, 15}
Status Flags:
Q:
V:
N:
Z:
C:
Not affected
Not affected
Not affected
Not affected
Not affected
Opcode:
31
1
1
29
28
1
0
25
0
1
0
24
1
1
0
1
20
19
0
0
16
1
0
15
0
1
0
0
0
0
Rd
Rx
Ry
51
32002F–03/2010
AVR32
FCAST{S,U}W.S – Convert from Integer to Floating Point
Description
Converts the signed or unsigned integer specified and stores the result in destination register.
The conversion used is rounds to nearest, ties to even.
Operation:
I.
Rd ← (float)Rx;
Rx is signed integer, round-to-nearest-even
II.
Rd ← (float)Rx;
Rx is unsigned integer, round-to-nearest-even
Syntax:
I.
fcastsw.s
II.
fcastuw.s
Rd, Ry
Rd, Ry
Operands:
I-IV.
{d, y} ∈ {0, 1, …, 15}
Status Flags:
Q:
V:
N:
Z:
C:
Not affected
Not affected
Not affected
Not affected
Not affected
Opcode:
S=0: Ry is an unsigned number, S=1: Ry is signed number
31
29 28
25 24
20
1
1
1
0
0
1
0
1
1
0
1
0
19
0
16
S
1
15
0
0
0
0
0
0
Rd
0
0
0
0
Ry
52
32002F–03/2010
AVR32
FCASTRS.{S,U}W – Convert from Floating Point to Integer
Description
Converts the floating-point number in the specified register to a signed or unsigned integer and
stores the result in destination register. Rounding used is towards zero.
Operation:
I.
Rd ← (signed int)Rx;
Round towards zero
II.
Rd ← (unsigned int)Rx;
Round towards zero
Syntax:
I.
fcastrs.sw
II.
fcastrs.uw
Rd, Ry
Rd, Ry
Operands:
I-IV.
{d, y} ∈ {0, 1, …, 15}
Status Flags:
Q:
V:
N:
Z:
C:
Not affected
Not affected
Not affected
Not affected
Not affected
Opcode:
S=0: Ry is an unsigned number, S=1: Ry is signed number
31
29 28
25 24
20
1
1
1
0
0
1
0
1
1
0
1
0
19
1
16
S
0
15
0
1
0
0
0
0
Rd
0
0
0
0
Ry
53
32002F–03/2010
AVR32
FCP.S – Floating Point Compare
Description
Performs a compare between the two floating point operands specified. The operation is implemented by doing a floating-point subtraction without writeback of the difference. The operation
sets the status flags according to the result of the subtraction, but does not affect the operand
registers. See Table 4-2, “Floating point branch conditions,” on page 41 for branch condition
mnemonics corresponding to different compare results.
Operation:
I.
Rx - Ry;
Syntax:
I.
fcmp.s Rx, Ry
Operands:
I.
{x, y} ∈ {0, 1, …, 15}
Status Flags:
Q:
Not affected
Compare result
Status register flags
Less
C←1
N←1
V←0
Z←0
Greater
C←0
N←0
V←0
Z←0
Equal
C←0
N←0
V←0
Z←1
Unordered
C←0
N←0
V←1
Z←0
Opcode:
31
1
1
29
28
1
0
25
0
1
0
24
1
1
0
1
20
19
0
1
16
0
1
15
0
0
0
0
0
0
0
0
0
0
Rx
Ry
54
32002F–03/2010
AVR32
FCHK.S – Floating Point Check for Special Values
Description
Checks the floating point operand specified for the special values Infinity, Not-a-Number and
Denormal. A check is also performed for values with the two biggest possible representable
exponents, i.e. 0xFD and 0xFE. This is useful for avoiding overflow in intermediate calculations
in certain iterative algorithms. The operation sets the status flags according to the result of the
check, but does not affect the operand register.
Operation:
I.
Set flags depending on the value in the specified register
Syntax:
I.
fchk.s Ry
Operands:
I.
y ∈ {0, 1, …, 15}
Status Flags:
Q:
Not affected
Status register flag
values if predicate
true
Condition for
branch if
predicate true
Condition for
branch if
predicate false
Operand == NaN
C←1
N←1
V←0
Z←0
lo
gt
Operand == Infinity
C←0
N←0
V←0
Z←0
gt
lo
Operand ==
(Denormal or
(Exponent==0xFD) or
(Exponent==0xFE))
C←0
N←0
V←0
Z←1
eq
ne
Operand == Normal
C←0
N←0
V←1
Z←0
vs
vc
Predicate
55
32002F–03/2010
AVR32
Opcode:
31
1
1
29
28
1
0
25
0
1
0
24
1
1
0
1
20
19
0
1
16
0
1
15
0
1
0
0
0
0
Rd
0
0
0
0
Ry
56
32002F–03/2010
AVR32
FRCPA.S – Floating Point Reciprocal Approximation
Description
Returns an approximation of the reciprocal of the operand. This can be used as a starting point
for iterative approximation algorithms. Also checks the operand for the special values Infinity,
Not-a-Number and Denormal. A check is also performed for values with the two biggest possible
representable exponents, i.e. 0xFD and 0xFE. This is useful for avoiding overflow in intermediate calculations in certain iterative algorithms. The operation sets the status flags according to
the result of this check.
Operation:
I.
Rd ← ApproximateReciprocal(Ry);
Set flags depending on the value in Ry
Syntax:
I.
frcpa.s Rd, Ry
Operands:
I.
{d, y} ∈ {0, 1, …, 15}
Status Flags:
Q:
Not affected
Status register flag
values if predicate
true
Condition for
branch if
predicate true
Condition for
branch if
predicate false
Operand == NaN
C←1
N←1
V←0
Z←0
lo
gt
Operand == Infinity
C←0
N←0
V←0
Z←0
gt
lo
Operand ==
(Denormal or
(Exponent==0xFD) or
(Exponent==0xFE))
C←0
N←0
V←0
Z←1
eq
ne
Predicate
Opcode:
31
1
1
29
28
1
0
25
0
1
0
24
1
1
0
1
20
19
0
1
16
1
1
15
0
0
0
0
0
0
Rd
0
0
0
0
Ry
57
32002F–03/2010
AVR32
FRSQRTA.S – Floating Point Reciprocal Square Root Approximation
Description
Returns an approximation of the reciprocal of the square root of the operand. This can be used
as a starting point for iterative approximation algorithms. Also checks the operand for the special
values Infinity, Not-a-Number and Denormal. A check is also performed for values with the two
biggest possible representable exponents, i.e. 0xFD and 0xFE. This is useful for avoiding overflow in intermediate calculations in certain iterative algorithms. The operation sets the status
flags according to the result of this check.
Operation:
I.
Rd ← ApproximateSquareRootReciprocal(Ry);
Set flags depending on the value in Ry
Syntax:
I.
frsqrta.s
Rd, Ry
Operands:
I.
{d, y} ∈ {0, 1, …, 15}
Status Flags:
Q:
Not affected
Status register flag
values if predicate
true
Condition for
branch if
predicate true
Condition for
branch if
predicate false
Operand == NaN
C←1
N←1
V←0
Z←0
lo
gt
Operand == Infinity
C←0
N←0
V←0
Z←0
gt
lo
Operand ==
(Denormal or
(Exponent==0xFD) or
(Exponent==0xFE))
C←0
N←0
V←0
Z←1
eq
ne
Predicate
Opcode:
31
1
1
29
28
1
0
25
0
1
0
24
1
1
0
1
20
19
0
1
16
1
1
15
0
1
0
0
0
0
Rd
0
0
0
0
Ry
58
32002F–03/2010
AVR32
5. Secure State
Revision 3 of the AVR32 architecture introduced a separate system state allowing execution of
secure or secret code alongside nonsecure code on the same processor. The secret code will
execute in the secure state, and therefore be protected from hacking or readout by the code
executing in the nonsecure state.
Customers not needing the secure state functionality can just leave the associated hardware
disabled, as it is by default, and the device will behave as previous versions of the AVR32UC.
5.1
Basic concept
The secure state architecture extension divides the memory space into two sections, a secure
section and a nonsecure section. The processor can be in one of two execution states, secure or
nonsecure. The SS bit in the Status Register indicates which mode the processor is in. If the
processor is in the secure state, it can access both secure and nonsecure memory spaces, but if
it is in the nonsecure state, only nonsecure memory sections can be accessed. The SS_ADRR
and SS_ADRF registers are used to configure the sizes of these secure sections. How the
SS_ADR registers map secure sections of the associated memories is determined by the individual memories, but usually SS_ADR is programmed with a secure memory size starting from
the first address in the associated memory, ie. if SS_ADRF is programmed with the value 0x800,
the secure section of the flash contains the addresses from 0x8000_0000 to 0x8000_07FF. Any
sections of the RAM and Flash that are not in a secure section are considered nonsecure.
The processor can pass between the secure and nonsecure state by using dedicated sscall and
retss instructions. If an access to secure memory is attempted from nonsecure space, a bus
error exception is asserted and the access is aborted.
5.2
Typical use scenario
The secure state hardware support allows our customers to program their proprietary IP code
such as telecom stacks, DSP libraries etc into the secure section of the memories. This secret
code must be placed in a special secure section of the flash program memory, and locate its
secret data structures in a special secure section of the RAM. Thereafter a dedicated fuse in
non-volatile memory, called the Secure State Enable (SSE) fuse, is programmed. When set, this
fuse blocks all external access to the secure memories, both from debuggers and programs running in the nonsecure sections of the processor. The SSE fuse can only be erased by a full chip
erase, which will also erase all data in the memory secure sections.
This partially programmed device can then be sold to customers who will program their software
application into the nonsecure section of the memories. This software can communicate with the
secret IP code through a secure API provided by the secret code. This allows the application to
call routines in the secret software IP, however this IP is protected from hacking or unauthorized
copying. After the application has been programmed into the partially programmed device, the
security fuse in the flash is set, protecting the entire application from unauthorized readout by
any end user.
59
32002F–03/2010
AVR32
Figure 5-1.
Typical secure state use scenario
Empty device
COMPANY A
(Secret IP)
ATMEL
5.3
Secure memories
programmed
SSE set
COMPANY B
(Application)
All memories
programmed
SSE + flash security
fuse set
END USER
Secure state boot sequence
At system boot time, hardware state machines preloads the secure state address registers with
an initial value programmed into a secure section in the flash. Also, the SS bit in the status register is preloaded with the value of the Secure State Enable (SSE) fuse from the flash. This
preloading is done before the system has completed the boot sequence, so the secure state
address registers and SR[SS] are initialized before code starts executing and before the debug
system has been enabled.
5.4
Secure state debugging
Normally, debugging when executing in secure state should be turned off to prevent compromising the secure code. However, it is useful to allow debugging of the secure state code during
development of this code. A fuse in flash, called Secure State Debug Enable (SSDE), can be
programmed to enable debugging of secure state code.
5.5
Events in secure state
Normal RISC state interrupt and exception handling has been described in Section 3.7 ”Event
handling” on page 22. This behavior is modified in the following way when interrupts and exceptions are received in secure state:
• A sscall instruction will set SR[GM]. In secure state, SR[GM] masks both INT0-INT3, and
NMI. Clearing SR[GM], INT0-INT3 and NMI will remove the mask of these event sources.
INT0-INT3 are still additionally masked by the I0M-I3M bits in the status register.
• sscall has handler address at 0x8000_0004.
• Exceptions have a handler address at 0x8000_0008.
• NMI has a handler address at 0x8000_000C.
• BREAKPOINT has a handler address at 0x8000_0010.
• INT0-INT3 are not autovectored, but have a common handler address at 0x8000_0014.
Note that in the secure state, all exception sources share the same handler address. It is therefore not possible to separate different exception causes when in the secure world. The secure
world system must be designed to support this, the most obvious solution is to design the secure
software so that exceptions will not arise when executing in the secure world.
60
32002F–03/2010
AVR32
6. Memory System
AVR32UC implements a 32-bit unsegmented memory space. Regions of this memory space
can be protected by an optional MPU. The memory map is as follows:
Figure 6-1.
The AVR32UC memory map.
H'FFFFFFFF
1GB High Speed Bus
space
HSB
1GB Boot Program
Memory
BOOT
H'C0000000
H'80000000
1GB CPU Local Bus
Memory
LOCAL
H'40000000
1 GB Internal Data RAM
IRAM
H'00000000
6.1
Memory sections
The memory map contains four sections, named IRAM, LOCAL, BOOT and HSB. The IRAM
section contains the internal EX stage memory, and this memory is mapped from address 0 and
upwards. The LOCAL section is mapped from address 0x4000_0000 and is designed for containing device-specific high-speed interfaces, such as floating-point units, encryption hardware
or high-speed GPIO ports. Access to the LOCAL space is performed using any ordinary load
and store instructions, and is performed in a single clock cycle. Mapping timing-critical devices in
the LOCAL section is beneficial as the interface operates with high clock frequency, and its timing is deterministic since it does not need to access a shared bus which may be heavily loaded.
The BOOT section starts at address 0x8000_0000, which is the reset address for AVR32UC.
This section will typically contain an internal program FLASH, mapped from address
0x8000_0000 and upwards. The HSB section contains the addresses of all modules mapped on
the HSB bus. This may include peripherals such as USARTs and external memory interfaces.
The memory space is uniform, so program code can execute from the IRAM, BOOT and HSB
sections, and data accesses can be performed to any of the these sections. Note that implementations of AVR32UC of may forbid certain accesses to certain memory sections, eg a write to
program FLASH mapped into the BOOT section may be forbidden. The LOCAL section is only
accessible by the Load-Store Unit in the CPU EX pipeline stage, therefore, code can not be executed from addresses in the LOCAL space.
61
32002F–03/2010
AVR32
6.2
Memory interfaces
The AVR32UC CPU has three memory interfaces:
• IF stage HSB master interface for instruction fetches
• EX stage HSB master interface for data accesses into BOOT or HSB sections
• EX stage HSB slave interface enabling other parts of the system to access addresses in the
IRAM section
6.3
IF stage interface
The single master interface in the IF stage performs instruction fetches. All fetches are performed with word alignment, except for the first fetch after a change-of-flow, which may use
halfword alignment. The IF stage can not perform writes, only reads are possible. Reads can be
perfomed from all addresses mapped on the HSB bus. Reads are performed as incrementing
bursts of unspecified length. The IF stage master interface will stall appropriately to support slow
slaves.
6.4
EX stage interfaces
The EX stage separates between CPU accesses to the IRAM section, and accesses to
BOOT/HSB. Any access to the IRAM section are performed to dedicated, high-speed RAMs
implemented inside the memory controller. These fast RAMs are able to read or write within the
cycle they are initiated. This means that a load instruction in EX will have the read-data ready at
the end of the clock cycle for writing into the register file.
6.4.1
EX stage HSB master interface
Any CPU access to the BOOT or HSB sections will use multiple clock cycles, as dictated by the
HSB semantics. Writes to the BOOT or HSB sections can be pipelined, and are performed as a
stream of nonsequential transfers, each taking one cycle unless stalled by the slave. If the slave
stalls the transfer, the CPU will stall until the slave releases the stall. CPU reads from the BOOT
or HSB sections are not pipelined, and transfer of a data therefore takes two clock cycles, one
cycle for the address phase, and one cycle for the data phase. The CPU will be stalled in the
data phase.
6.4.2
EX stage HSB slave interface
The AVR32UC CPU provides a slave interface into the high-speed RAMs that are implemented
inside the memory controller. This interface enables other parts of the system, like DMA controllers, to write or read data to or from the RAMs. The slave interface support bursts for both reads
and writes. If the high-speed RAMs for some reason cannot accept the transfer request, it will
reply by stalling the request until it can be serviced.
The arbitration priorities between the CPU and the slave interface for the RAMs can be controlled by programming the CPU Control Register (CPUCR). The CPUCR is described in Section
2.5 on page 11. Arbitration is performed according to the following rules:
Assuming the memory interface is idle, and no memory transfers have been performed. Whoever requests access to the RAMs will win the arbitration and get access. If both the CPU and
the slave interface requests access, the CPU will win.
The source that won the arbitration can use the RAMs for as long as they require. If the other
source also has a pending request for use of the RAM, this source will have to wait maximum
the number of cycles specified by the SPL or CPL fields of CPUCR. The pending source will gain
62
32002F–03/2010
AVR32
access to the RAMs when the current owner voluntarily releases the RAMs, or after the
SPL/CPL timeout period, whichever comes first.
If the CPU wins arbitration for the RAMs, the CPU is guaranteed to own the RAM for the period
specified by the COP field in CPUCR. Any slave request will be left pending during this period,
even if the CPU is not using the RAMs.
The following state diagram shows the states in arbitration for the RAM.
Figure 6-2.
Arbitration between CPU and slave interface for RAMs.
RAM is free
1
5
2
6
3
CPU owns the RAM
Slave I/F owns the RAM
4
The state transitions are as follows:
1: CPU_wants_to_perform_mem_access
2: CPU_access_complete && (been_in_state > CPUCR[COP])
3: (been_in_state > CPUCR[COP]) && slave_wants_to_perform_mem_access &&
(slave_been_pending > CPUCR[SPL])
4: CPU_wants_to_perform_mem_access && (CPU_been_pending > CPUCR[CPL])
5: slave_wants_to_perform_mem_access && !CPU_wants_to_perform_mem_access
6: slave_access_complete
6.4.3
EX stage local bus interface
Any CPU access to the the LOCAL section is completed in a single clock cycle, both for reads
and writes. Transfers on this bus can not be stalled. The CPU will never be stalled due to an
access to the LOCAL section. Accesses to this section is performed using regular load-store
instructions such as for example ldswp.w, ld.w, ld.ub, st.w, stswp.w, ldm or stm.
Which devices are mapped in the LOCAL section, and their memory maps, is device-specific.
The LOCAL interface must be enabled by the user by programming the LOCEN bit in CPUCR.
Accesses to LOCAL memory addresses without first enabling the section will result in a BUS
ERROR exception.
If the MPU is enabled, accesses to LOCAL will be subject to permission checking.
To ensure maximum transfer speed and cycle determinism, any slaves being addressed by the
CPU on the local bus must be able to receive and transmit data on the bus at CPU clock speeds.
The consequences of this may vary between different slave devices, but for some slave devices
it may imply that the slaves have to run at the CPU clock frequency when local bus transfers are
63
32002F–03/2010
AVR32
being performed. Refer to the device datasheet for information on any relationships between
CPU and device clock frequencies imposed by the local bus.
6.5
IRAM Write buffer
The EX stage has a write buffer used to hold data to be written to the IRAM section. The operation of this buffer is usually transparent to the programmer. The programmer should be aware of
the following:
• The IRAM has a single port, allowing either one read or one write per clock cycle.
• The write buffer is pipelined, allowing sequential writes to IRAM to be pipelined without any
pipeline stalls. The previous contents of the write buffer is written to the RAM in parallel with
the new store data being placed in the write buffer.
• Any read instruction to IRAM in EX will be performed immediately, even if a previous store
instruction has placed data to store in the write buffer. In this case, the previous store data
remains in the write buffer and will be written back to RAM in a later clock cycle.
• If a read instruction in EX accesses the same address as the data in the write buffer is to be
stored to, the pipeline is stalled for one clock cycle while the write buffer is emptied to RAM.
The read will be performed normally in the next clock cycle.
• The contents of the write buffer is written to the physical RAM as soon as the memory
interface is not used by any instructions.
• The state of the write buffer may affect the timing of RMW instructions, see “Read-modifywrite instructions” on page 84 for details
6.6
Memory barriers
Memory barriers are constructs used to enfore memory consitency. Caches and self-modifying
code may cause memory to become inconsistent. AVR32UC has a simple pipeline with no
caches, so there is usually no need for memory barriers. Mechanisms for memory barriers are
present to handle the cases where such barriers are needed.
6.6.1
Instruction memory barriers
An instruction memory barrier (IMB) is usually only needed when executing self-modifying code,
for example when self-programming program flash. In this case, one must ensure that all levels
in the memory hierarchy are consistent. Due to the simple non-cached memory system in
AVR32UC, this is usually trivial.
The programmer should make sure that an IMB is used if there is a possibility that an instruction
to be modified by self-modifying code has already been prefetched by the instruction prefetch
unit. In this case, an IMB should be inserted between the instruction modifying the code and the
execution of the modified instruction. To make sure that the modified version of the instruction is
executed, the prefetch buffer should be flushed between changing the program memory and
executing the new version of the program.
Any instruction performing a change-of flow, such as return from exception, conditional
branches, unconditional branches, subprogram call or return, or instructions writing to PC would
implement an IMB in AVR32UC.
6.6.2
Data memory barriers
A data memory barrier (DMB) is used to make sure that a data memory access, either a read or
write, is actually performed before the rest of the code is executed. Caches, write buffers and
64
32002F–03/2010
AVR32
bus latency may cause a memory access to be seen by a slave many cycles after it has been
executed by the pipeline. In some cases, this may lead to UNPREDICTABLE behavior in the
system.
One example of this is found in interrupt handlers. One usually wants to make sure that the interrupt request has been cleared before executing the rete instruction, otherwise the same interrupt
may be serviced immediately after executing the rete instruction. In this case a DMB must be
inserted between the code clearing the interrupt request and the rete.
All accesses to HSB space are strongly ordered. This is used to implement DMBs. A DMB after
a store to a HSB slave is implemented by performing a dummy read from the same slave. Any
critical code after the read will stall until the read has been performed.
Consider an interrupt request made by a peripheral. This peripheral will disassert the interrupt
request as soon as the interrupt handler has written a specific bitmask to its
PERIPH_INTCLEAR register. A read from the same peripheral performs a bus transfer that
implements the DMB. The rete instruction can be executed after the DMB.
Code 6-1.
Clearing IRQs using data memory barriers
// Using data memory barriers in the IRQ handler to make sure that the
// request has been disasserted before returning from the handler
// Assume that the IRQ is cleared by writing a bitmask to PERIPH_INTCLEAR.
// r0 points to this register, r1 contains the correct bitmask.
irq_handler:
<some instructions>
st.w r0[0], r1
ld.w r12, r0[0] // data memory barrier
rete
65
32002F–03/2010
AVR32
7. Memory Protection Unit
The AVR32 architecture defines an optional Memory Protection Unit (MPU). This is a simpler
alternative to a full MMU, while at the same time allowing memory protection. The MPU allows
the user to divide the memory space into different protection regions. These protection regions
have a user-defined size, and starts at a user-defined address. The different regions can have
different cacheability attributes and bufferability attributes. Each region is divided into 16 subregions, each of these subregions can have one of two possible sets of access permissions.
The MPU does not perform any address translation.
7.1
Memory map in systems with MPU
An AVR32 implemetation with a MPU has a flat, unsegmented memory space. Access permissions are given only by the different protection regions.
7.2
Understanding the MPU
The AVR32 Memory Protection Unit (MPU) is responsible for checking that memory transfers
have the correct permissions to complete. If a memory access with unsatisfactory privileges is
attempted, an exception is generated and the access is aborted. If an access to a memory
address that does not reside in any protection region is attempted, an exception is generated
and the access is aborted.
The user is able to allow different privilege levels to different blocks of memory by configuring a
set of registers. Each such block is called a protection region. Each region has a user-programmable start address and size. The MPU allows the user to program 8 different protection
regions. Each of these regions have 16 sub-regions, which can have different access permissions, cacheability and bufferability.
The “DMMU SZ” fields in the CONFIG1 system register identifies the number of implemented
protection regions, and therefore also the number of MPU registers. An AVR32UC system with
caches also have MPU cacheability and bufferability registers.
A protection region can be from 4 KB to 4 GB in size, and the size must be a power of two. All
regions must have a start address that is aligned to an address corresponding to the size of the
region. If the region has a size of 8 KB, the 13 lowest bits in the start address must be 0. Failing
to do so will result in UNDEFINED behaviour. Since each region is divided into 16 sub-regions,
each sub-region is 256 B to 256 MB in size.
When an access hits into a memory region set up by the MPU, hardware proceeds to determine
which subregion the access hits into. This information is used to determine whether the access
p e r m i s s io n s f o r t h e s u b r e g i o n a r e g iv e n i n M P U A P R A/ M PU B R A / M P U C R A o r i n
MPUAPRB/MPUBRB/MPUCRB.
If an access does not hit in any region, the transfer is aborted and an exception is generated.
The MPU is enabled by writing setting the E bit in the MPUCR register. The E bit is cleared after
reset. If the MPU is disabled, all accesses are treated as uncacheable, unbufferable and will not
generate any access violations. Before setting the E bit, at least one valid protection region must
be defined.
7.2.1
MPU interface registers
The following registers are used to control the MPU, and provide the interface between the MPU
and the operating system, see Figure 7-1 on page 67. All the registers are mapped into the Sys-
66
32002F–03/2010
AVR32
tem Register space, their addresses are presented in “System registers” on page 11. They are
accessed with the mtsr and mfsr instructions.
The MPU interface registers are shown below. The suffix n can have the range 0-7, indicating
which region the register is associated with.
Figure 7-1.
The MPU interface registers
MPUARn
12 11
31
Base Address
6 5
-
1 0
Size
V
MPUPSRn
16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
P15
P14
P13
P12
P11
P10
P9
P8
P7
P6
P5
P4
P3
P2
P1
P0
31
-
MPUCRA / MPUCRB
8 7 6 5 4 3 2 1 0
C7
C6
C5
C4
C3
C2
C1
C0
31
-
MPUBRA / MPUBRB
8 7 6 5 4 3 2 1 0
B7
B6
B5
B4
B3
B2
B1
B0
31
-
MPUAPRA / MPUAPRB
28 27
31
AP7
24 23
AP6
20 19
AP5
16 15
AP4
12 11
AP3
8 7
AP2
4 3
AP1
0
AP0
MPUCR
31
1 0
-
7.2.1.1
E
MPU Address Register - MPUARn
A MPU Address register is implemented for each of the 8 protection regions. The MPUAR registers specify the start address and size of the regions. The start address must be aligned so that
its alignment corresponds to the size of the region. The minimum allowable size of a region is 4
KB, so only bits 31:12 in the base address needs to be specified. The other bits are always 0.
Each MPUAR also has a valid bit that specifies if the protection region is valid. Only valid regions
are considered in the protection testing.
The MPUAR register consists of the following fields:
• Base address - The start address of the region. The minimum size of a region is 4KB, so only
the 20 most significant bits in the base address needs to be specified. The 12 lowermost
base address bits are implicitly set to 0. If protection regions larger than 4 KB is used, the
user must write the appropriate bits in Base address to 0, so that the base address is aligned
to the size of the region. Otherwise, the result is UNDEFINED.
67
32002F–03/2010
AVR32
• Size - Size of the protection region. The possible sizes are shown in Table 7-1 on page 68.
Table 7-1.
Protection region sizes implied by the Size field
Size
Region size
Constraints on Base address
B’00000 to B’01010
UNDEFINED
-
B’01011
4 KB
None
B’01100
8 KB
Bit [12] in Base Address must be 0
B’01101
16 KB
Bit [13:12] in Base Address must be 0
B’01110
32 KB
Bit [14:12] in Base Address must be 0
B’01111
64 KB
Bit [15:12] in Base Address must be 0
B’10000
128 KB
Bit [16:12] in Base Address must be 0
B’10001
256 KB
Bit [17:12] in Base Address must be 0
B’10010
512 KB
Bit [18:12] in Base Address must be 0
B’10011
1 Mb
Bit [19:12] in Base Address must be 0
B’10100
2 MB
Bit [20:12] in Base Address must be 0
B’10101
4 MB
Bit [21:12] in Base Address must be 0
B’10110
8 MB
Bit [22:12] in Base Address must be 0
B’10111
16 MB
Bit [23:12] in Base Address must be 0
B’11000
32 MB
Bit [24:12] in Base Address must be 0
B’11001
64 MB
Bit [25:12] in Base Address must be 0
B’11010
128 MB
Bit [26:12] in Base Address must be 0
B’11011
256 MB
Bit [27:12] in Base Address must be 0
B’11100
512 MB
Bit [28:12] in Base Address must be 0
B’11101
1 GB
Bit [29:12] in Base Address must be 0
B’11110
2 GB
Bit [30:12] in Base Address must be 0
B’11111
4 GB
Bit [31:12] in Base Address must be 0
• V - Valid. Set if the protection region is valid, cleared otherwise. This bit is written to 0 by a
reset. The region is not considered in the protection testing if the V bit is cleared.
7.2.1.2
MPU Permission Select Register - MPUPSRn
A MPU Permission Select register is implemented for each of the 8 protection regions. Each
MPUPSR register divides the protection region into 16 subregions. The bitfields in MPUPSR
specifies whether each subregion has access permissions as specified by the region entry in
either MPUAPRA or MPUAPRB.
Table 7-2.
Subregion access permission implied by MPUPSR bitfields
MPUPSRn[P]
Access permission
0
MPUAPRA[APn]
1
MPUAPRB[APn]
68
32002F–03/2010
AVR32
7.2.1.3
MPU Cacheable Register A / B- MPUCRA / MPUCRB
The MPUCR registers have one bit per region, indicating if the region is cacheable. If the corresponding bit is set, the region is cacheable. The register is written to 0 upon reset.
AVR32UC implementations may optionally choose not to implement the MPUCR registers.
7.2.1.4
MPU Bufferable Register A / B- MPUBRA / MPUBRB
The MPUBR registers have one bit per region, indicating if the region is bufferable. If the corresponding bit is set, the region is bufferable. The register is written to 0 upon reset.
AVR32UC implementations may optionally choose not to implement the MPUBR registers.
7.2.1.5
MPU Access Permission Register A / B - MPUAPRA / MPUAPRB
The MPUAPR registers indicate the access permissions for each region. The MPUAPR is written to 0 upon reset. The possible access permissions are shown in Table 7-3 on page 69.
Table 7-3.
7.2.1.6
Access permissions implied by the APn bits
AP
Privileged mode
Unprivileged mode
B’0000
Read
None
B’0001
Read / Execute
None
B’0010
Read / Write
None
B’0011
Read / Write / Execute
None
B’0100
Read
Read
B’0101
Read / Execute
Read / Execute
B’0110
Read / Write
Read / Write
B’0111
Read / Write / Execute
Read / Write / Execute
B’1000
Read / Write
Read
B’1001
Read / Write
Read / Execute
B’1010
None
None
Other
UNDEFINED
UNDEFINED
MPU Control Register - MPUCR
The MPUCR controls the operation of the MPU. The MPUCR has only one field:
• E - Enable. If set, the MPU address checking is enabled. If cleared, the MPU address
checking is disabled and no exceptions will be generated by the MPU.
7.2.2
MPU exception handling
This chapter describes the exceptions that can be signalled by the MPU.
7.2.2.1
ITLB Protection Violation
An ITLB protection violation is issued if an instruction fetch violates access permissions. The violating instruction is not executed. The address of the failing instruction is placed on the system
stack.
69
32002F–03/2010
AVR32
7.2.2.2
DTLB Protection Violation
An DTLB protection violation is issued if a data access violates access permissions. The violating access is not executed. The address of the failing instruction is placed on the system stack.
7.2.2.3
ITLB Miss Violation
An ITLB miss violation is issued if an instruction fetch does not hit in any region. The violating
instruction is not executed. The address of the failing instruction is placed on the system stack.
7.2.2.4
DTLB Miss Violation
An DTLB miss violation is issued if a data access does not hit in any region. The violating access
is not executed. The address of the failing instruction is placed on the system stack.
7.2.2.5
TLB Multiple Hit Violation
An access hit in multiple protection regions. The address of the failing instruction is placed on
the system stack. This is a critical system error that should not occur.
7.3
Example of MPU functionality
As an example, consider region 0. Let region 0 be of size 16 KB, thus each subregion is 1KB.
Subregion 0 has offset 0-1KB from the base address, subregion 1 has offset 1KB-2KB and so
on.
MPUAPRA and MPUAPRB each has one field per region. Each subregion in region 0 can get its
access permissions from either MPUAPRA[AP0] or MPUAPRB[AP0], this is selected by the subregion’s bitfield in MPUPSR0.
Let:
MPUPSR0 = {0b0000_0000_0000_0000, 0b1010_0000_1111_0101}
MPUAPRA = {A, B, C, D, E, F, G, H}
MPUAPRB = {a, b, c, d, e, f, g, h}
where {A-H, a-h} have legal values as defined in Table 7-3.
Thus for region 0:
Table 7-4.
Example of access rights for subregions
Subregion
Access
permission
Subregion
Access
permission
0
h
8
H
1
H
9
H
2
h
10
H
3
H
11
H
4
h
12
H
5
h
13
h
6
h
14
H
7
h
15
h
70
32002F–03/2010
AVR32
8. Instruction Cycle Summary
This chapter presents the instructions in AVR32UC CPU, and the number of clock cycle they
require to complete. All the instructions in each group behave similarly in the pipeline. The final
subchapter presents code examples to illustrate the clock cycle requirements of various code
constructs.
8.1
Definitions
The following definitions are presented in the tables below:
8.1.1
Issue
An instruction is issued when it leaves the ID stage and enters the EX stage.
8.1.2
Issue latency
The issue latency represents the number of clock cycles required between the issue of the
instruction and the issue of the following instruction. For some change-of-flow instructions, this
includes the cycle penalty caused by the pipeline flush. The issue latency assumes, unless
stated otherwise, that the instruction and data memories are able to return an instruction or data
in a single cycle, which may not be true for slow program memories or data memories mapped
on the HSB bus.
8.2
Special considerations
8.2.1
PC as destination register
Most instructions can take PC as destination register. This will result in a jump to the calculated
address. The jump is performed when the instruction writing to PC has completed, and all other
effects of the instruction, like updating of pointer registers for load instructions with PC as target
instruction, have been committed. Instructions writing to PC will have an additional issue latency
of 2 cycles due to the pipeline flush.
8.2.2
Alignment of change-of-flow targets
The cycle count number for change-of-flow instructions assumes that the target instruction is a
compact instruction or word-aligned extended instruction. An extra cycle will be required if the
target instruction is a halfword-aligned extended instruction, since both halves of the instruction
must be fetched before it can be issued.
8.2.3
Memory and bus timings
Performance of memory accesses and instruction fetching are affected by the performance of
system memories and system bus. The following are examples of factors that may affect the
cycle count of such operations:
• Accesses to the IRAM section in parallel with another bus master, for example a DMA
controller.
• Accesses to memories with wait states, for example flash or external memories.
• Using system buses with wait states or arbitration overhead.
• Accesses to memories that are simultaneously being accessed by other bus masters.
71
32002F–03/2010
AVR32
8.3
CPU revision
Revision 1, 2 and 3 of the AVR32UC CPU has the same instruction timings, except that the
divider in revision 2 and later is faster than in revision 1. Instructions only present in revision 2 or
3 of the CPU are explicitly noted.
8.4
ALU instructions
This group comprises simple single-cycle ALU instructions like add and sub. The conditional
subtract and move instructions are also in this group. All instructions in this group, except ssrf to
bits 15 to 31, take one cycle to execute, and the result is available for use by the following
instruction.
Table 8-1.
ALU instructions
Mnemonics
Operands
Description
Issue
latency
abs
C
Rd
Absolute value.
1
acr
C
Rd
Add carry to register.
1
adc
E
Rd, Rx, Ry
Add with carry.
1
add
C
Rd, Rs
Add.
1
E
Rd, Rx,
(Ry << sa)
Add shifted.
1
add{cond4}
E
Rd, Rx, Ry
Add if condition satisfied. CPU revision 2 and
higher only.
1
addhh.w
C
Rd, Rx<part>,
Ry<part>
Add signed halfwords
(32 ← 16 + 16)
1
addabs
E
Rd, Rx, Ry
Add with absolute value.
1
cp.b
E
Rd, Rs
Compare byte.
1
cp.h
E
Rd, Rs
Compare halfword.
1
C
Rd, Rs
C
Rd, imm
E
Rd, imm
C
Rd
cp.w
cpc
1
Compare.
1
1
1
Compare with carry.
E
Rd, Rs
max
E
Rd, Rx, Ry
Return signed maximum
1
min
E
Rd, Rx, Ry
Return signed minimum
1
neg
C
Rd
Two’s Complement.
1
C
Rd, Rs
rsub
1
1
Reverse subtract.
E
Rd, Rs, k8
1
rsub{cond4}
E
Rd, imm
Reverse subtract immediate if condition
satisfied. CPU revision 2 and higher only.
1
sbc
E
Rd, Rx, Ry
Subtract with carry.
1
scr
C
Rd
Subtract carry from register.
1
72
32002F–03/2010
AVR32
Table 8-1.
ALU instructions
C
Rd, Rs
1
E
Rd, Rx,
(Ry << sa)
1
C
Rd, imm
E
Rd, imm
1
E
Rd, Rs, imm
1
C
Rd, Rx<part>,
Ry<part>
Subtract signed halfwords
(32 ← 16 - 16)
1
E
Rd, imm
Subtract immediate if condition satisfied.
1
E
Rd, imm
Subtract if condition satisfied. CPU revision 2
and higher only.
1
C
Rd
Test no byte equal to zero.
1
C
Rd, Rs
E
Rd, Rx, Ry << sa
E
Rd, Rx, Ry >> sa
and{cond4}
E
Rd, Rx, Ry
Logical AND if condition satisfied. CPU revision
2 and higher only.
1
andn
C
Rd, Rs
Logical AND NOT.
1
E
Rd, imm
Logical AND High Halfword, low halfword is
unchanged.
1
E
Rd, imm, COH
Logical AND High Halfword, clear other
halfword.
1
E
Rd, imm
Logical AND Low Halfword, high halfword is
unchanged.
1
E
Rd, imm, COH
Logical AND Low Halfword, clear other
halfword.
1
C
Rd
One’s Complement (NOT).
1
C
Rd, Rs
E
Rd, Rx, Ry << sa
E
Rd, Rx, Ry >> sa
eor{cond4}
E
Rd, Rx, Ry
Logical EOR if condition satisfied. CPU revision
2 and higher only.
1
eorh
E
Rd, imm
Logical Exclusive OR
(High Halfword).
1
eorl
E
Rd, imm
Logical Exclusive OR
(Low Halfword).
1
C
Rd, Rs
E
Rd, Rx, Ry << sa
E
Rd, Rx, Ry >> sa
E
Rd, Rx, Ry
sub
subhh.w
sub{cond4}
tnbz
and
Subtract.
1
1
Logical AND.
1
1
andh
andl
com
eor
or
or{cond4}
1
Logical Exclusive OR.
1
1
1
Logical (Inclusive) OR.
1
1
Logical OR if condition satisfied. CPU revision
2 and higher only.
1
73
32002F–03/2010
AVR32
Table 8-1.
ALU instructions
orh
E
Rd, imm
Logical OR (High Halfword).
1
orl
E
Rd, imm
Logical OR (Low Halfword).
1
tst
C
Rd, Rs
Test register for zero.
1
bfins
E
Rd, Rs, o5, w5
Insert the lower w5 bits of Rs in Rd at bit-offset
o5.
1
bfexts
E
Rd, Rs, o5, w5
Extract and sign-extend the w5 bits in Rs
starting at bit-offset o5 to Rd.
1
bfextu
E
Rd, Rs, o5, w5
Extract and zero-extend the w5 bits in Rs
starting at bit-offset o5 to Rd.
1
bld
E
Rd, b5
Bit load.
1
brev
C
Rd
Bit reverse.
1
bst
E
Rd, b5
Bit store.
1
casts.b
C
Rd
Typecast byte to signed word.
1
casts.h
C
Rd
Typecast halfword to signed word.
1
castu.b
C
Rd
Typecast byte to unsigned word.
1
castu.h
C
Rd
Typecast halfword to unsigned word.
1
cbr
C
Rd, b5
Clear bit in register.
1
clz
E
Rd, Rs
Count leading zeros.
1
sbr
C
Rd, b5
Set bit in register.
1
swap.b
C
Rd
Swap bytes in register.
1
swap.bh
C
Rd
Swap bytes in each halfword.
1
swap.h
C
Rd
Swap halfwords in register.
1
E
Rd, Rx, Ry
E
Rd, Rs, sa
C
Rd, sa
1
E
Rd, Rx, Ry
1
E
Rd, Rs, sa
C
Rd, sa
1
E
Rd, Rx, Ry
1
E
Rd, Rs, sa
C
Rd, sa
rol
C
Rd
Rotate left through carry.
1
ror
C
Rd
Rotate right through carry.
1
C
Rd, imm
asr
lsl
lsr
1
Arithmetic shift right (signed).
Logical shift left.
Logical shift right.
1
1
1
1
1
Load immediate into register.
mov
E
Rd, imm
1
C
Rd, Rs
Copy register.
1
E
Rd, Rs
Copy register if condition is true
1
E
Rd, imm
Load immediate into register if condition is true
1
mov{cond4}
74
32002F–03/2010
AVR32
Table 8-1.
8.5
ALU instructions
movh
E
Rd, imm
Load immediate into high halfword of register.
CPU revision 2 and higher only.
1
csrf
C
b5
Clear status register flag.
1
csrfcz
C
b5
Copy status register flag to C and Z.
1
ssrf
C
b5
Set status register flag.
1/3
sr{cond4}
C
Rd
Conditionally set register to true or false
1
Multiply instructions
These instructions require one pass through the multiplier array and produce a 32- or 48-bit
result. For mulrndhh, a rounding value of 0x8000 is added to the product producing the final
result. This group does not set any flags, except for the mulsat instructions which set Q if saturation occurred.
Table 8-2.
Multiply instructions
Mnemonics
Issue
latency
Operands
Description
E
Rd, Rx, Ry
Multiply.
(32 ← 32 x 32)
1
E
Rd, Rs, imm
Multiply immediate.
1
mulhh.w
E
Rd, Rx<part>,
Ry<part>
Signed Multiply of halfwords.
(32 ← 16 x 16)
1
mulnhh.w
E
Rd, Rx<part>,
Ry<part>
Signed Multiply of halfwords.
(32 ← 16 x 16)
1
mulnwh.d
E
Rd, Rx, Ry<part>
Signed Multiply, word and halfword.
(48 ← 32 x 16)
1
mulwh.d
E
Rd, Rx, Ry<part>
Signed Multiply, word and halfword.
(48 ← 32 x 16)
1
mulsathh.h
E
Rd, Rx<part>,
Ry<part>
Fractional signed multiply with saturation.
Return halfword.
(16 ← 16 x 16)
1
mulsathh.w
E
Rd, Rx<part>,
Ry<part>
Fractional signed multiply with saturation.
Return word.
(32 ← 16 x 16)
1
mulsatwh.w
E
Rd, Rx, Ry<part>
Fractional signed multiply with saturation.
Return word.
(32 ← 32 x 16)
1
mulsatrndhh.h
E
Rd, Rx<part>,
Ry<part>
Signed multiply with rounding. Return
halfword.
(16 ← 16 x 16)
1
mulsatrndwh.
w
E
Rd, Rx, Ry<part>
Signed multiply with rounding. Return
halfword.
(32 ← 32 x 16)
1
mul
75
32002F–03/2010
AVR32
8.6
MAC instructions
These instructions require one pass through the multiplier array and produce a 32- or 48-bit
result. This result is added to an accumulator register. A valid copy of this accumulator may be
cached in the accumulator cache. Otherwise, an extra cycle is needed to read the accumulator
from the register file. Therefore, issue and result latencies depend on whether the accumulator
is cached in the AccCache.
This group does not set any flags, except for the macsathh.w instruction which set Q if saturation
occurred.
Table 8-3.
MAC instructions
Mnemonics
8.7
Operands
Description
Issue
latency
mac
E
Rd, Rx, Ry
Multiply accumulate.
(32 ← 32x32 + 32)
1/2
machh.w
E
Rd, Rx<part>,
Ry<part>
Multiply signed halfwords and accumulate.
(32 ← 16x16 + 32)
1/2
machh.d
E
Rd, Rx<part>,
Ry<part>
Multiply signed halfwords and accumulate.
(48 ← 16x16 + 48)
1/2
macwh.d
E
Rd, Rx, Ry<part>
Multiply signed word and halfword and
accumulate.
(48 ← 32 x 16 + 48)
1/2
macsathh.w
E
Rd, Rx<part>,
Ry<part>
Fractional signed multiply accumulate with
saturation. Return word.
(32 ← 16 x 16 + 32)
1/2
MulMac64 instructions
These instructions require two passes through the multiplier array to produce a 64-bit result. For
macs.d and macu.d, a valid copy of this accumulator may be cached in the accumulator cache.
Otherwise, an extra cycle is needed to read the accumulator from the register file. Therefore,
issue and result latencies depend on whether a valid entry is found in the accumulator cache.
Table 8-4.
MulMac64 instructions
Mnemonics
Operands
Description
Issue
latency
macs.d
E
Rd, Rx, Ry
Multiply signed accumulate.
(64 ← 32x32 + 64)
3/4
macu.d
E
Rd, Rx, Ry
Multiply unsigned accumulate.
(64 ← 32x32 + 64)
3/4
muls.d
E
Rd, Rx, Ry
Signed Multiply.
(64 ← 32 x 32)
2
mulu.d
E
Rd, Rx, Ry
Unsigned Multiply.
(64 ← 32 x 32)
2
76
32002F–03/2010
AVR32
8.8
Divide instructions
These instructions require several cycles in the EX stage to complete. The divs and divu instructions will be aborted immediately if any interrupts are pending, in order to limit the interrupt
latency. The divide instructions are faster in revision 2 than in revision 1 of the AVR32UC CPU.
Table 8-5.
Divide instructions
Mnemonics
divs
E
divu
E
Issue
latency
Operands
Description
Rd, Rx, Ry
Divide signed.
(32 ← 32/32)
(32 ← 32%32)
191
Rd, Rx, Ry
Divide unsigned.
(32 ← 32/32)
(32 ← 32%32)
191
1.) 35 cycles in revision 1 of the CPU
8.9
Saturate instructions
These instructions perform arithmetic operations with possible saturation.
Table 8-6.
Saturate instructions
Mnemonics
Operands
Description
Issue
latency
satadd.h
E
Rd, Rx, Ry
Saturated add halfwords.
1
satadd.w
E
Rd, Rx, Ry
Saturated add.
1
satsub.h
E
Rd, Rx, Ry
Saturated subtract halfwords.
1
E
Rd, Rx, Ry
satsub.w
8.10
1
Saturated subtract.
E
Rd, Rs, imm
1
satrnds
E
Rd >> sa, b5
Signed saturate from bit given by sa after a
right shift with rounding of b5 bit positions.
2
satrndu
E
Rd >> sa, b5
Unsigned saturate from bit given by sa after a
right shift with rounding of b5 bit positions.
2
sats
E
Rd >> sa, b5
Shift sa positions and do signed saturate from
bit given by b5.
1
satu
E
Rd >> sa, b5
Shift sa positions and do unsigned saturate
from bit given by b5.
1
Load and store instructions
This group includes all the load and store instructions. The address calculations are performed
by the adder in the EX stage. The EX adder also performs the writeback address calculation for
the autoincrement and autodecrement operation.
Loaded data are available at the end of the cycle in the EX stage. Byte and halfword data must
be extended and rotated before they are valid. This is performed in the EX stage. Ldins and
ldswp instructions also require modification in the EX stage before their results are valid. Stswp
instructions require modification before their data is output to the memory interface. This modification is performed in the EX stage.
77
32002F–03/2010
AVR32
The stcond instruction takes 2 cycles if the store is not performed, 3 cycles if the store is
performed.
All issue latencies are given for accesses to IRAM or LOCAL. These timings must be modified
as follows for accesses to BOOT or HSB sections:
• A byte, halfword or word load requires 1+w cycles in addition to the count listed in Table 8-7,
where w is the number of wait states from the slave and bus system. The pipeline will stall
during these cycles.
• A doubleword load performs two memory accesses, so 2(1+w) cycles are needed in addition
to the count listed in Table 8-7. The pipeline will stall during these cycles.
• A byte, halfword or word store requires (1+w) cycles in addition to the count listed in Table 87, where w is the number of wait states from the slave and bus system. Stores to BOOT or
HSB can be performed in the background, so the pipeline will only stall if another memory
access is attempted during these w cycles. However, multiple stores to addresses in BOOT
or HSB can be automatically combined by the memory interface to create bursts on the HSB
bus. This means that any consecutive stores to BOOT or HSB sections will not stall the
pipeline unless the bus itself inserts wait cycles, for example due to wait states or bus
contention. Instructions not performing memory accesses will never stall the pipeline when
executed after stores to BOOT or HSB.
• A doubleword store performs two memory accesses, but these will be pipelined. The last of
these accesses will stall if the instruction following the doubleword is a memory access
instruction other than a store to BOOT or HSB. Therefore, a non-memory instruction or
another store to BOOT or HSB should be scheduled after a doubleword store to BOOT or
HSB for maximum performance.
Table 8-7.
Load and store instructions
Operands
Description
Issue
latency
IRAM
C
Rd, Rp++
Load unsigned byte with post-increment.
2
C
Rd, --Rp
Load unsigned byte with pre-decrement.
2
C
Rd, Rp[disp]
Mnemonics
ld.ub
1
Load unsigned byte with displacement.
ld.ub{cond4}
E
Rd, Rp[disp]
1
E
Rd, Rb[Ri<<sa]
Indexed Load unsigned byte.
1
E
Rd, Rp[disp]
Load unsigned byte with displacement if
condition satisfied. CPU revision 2 and higher
only.
1
E
Rd, Rp[disp]
Load signed byte with displacement.
1
E
Rd, Rb[Ri<<sa]
Indexed Load signed byte.
1
E
Rd, Rp[disp]
Load signed byte with displacement if condition
satisfied. CPU revision 2 and higher only.
1
ld.sb
ld.sb{cond4}
78
32002F–03/2010
AVR32
Table 8-7.
ld.uh
Load and store instructions
C
Rd, Rp++
Load unsigned halfword with post-increment.
2
C
Rd, --Rp
Load unsigned halfword with pre-decrement.
2
C
Rd, Rp[disp]
1
Load unsigned halfword with displacement.
ld.uh{cond4}
ld.sh
E
Rd, Rp[disp]
1
E
Rd, Rb[Ri<<sa]
Indexed Load unsigned halfword.
1
E
Rd, Rp[disp]
Load unsigned halfword with displacement if
condition satisfied. CPU revision 2 and higher
only.
1
C
Rd, Rp++
Load signed halfword with post-increment.
2
C
Rd, --Rp
Load signed halfword with pre-decrement.
2
C
Rd, Rp[disp]
1
Load signed halfword with displacement.
ld.sh{cond4}
ld.w
E
Rd, Rp[disp]
1
E
Rd, Rb[Ri<<sa]
Indexed Load signed halfword.
1
E
Rd, Rp[disp]
Load signed halfword with displacement if
condition satisfied. CPU revision 2 and higher
only.
1
C
Rd, Rp++
Load word with post-increment.
2
C
Rd, --Rp
Load word with pre-decrement.
2
C
Rd, Rp[disp]
1
Load word with displacement.
E
Rd, Rp[disp]
E
Rd, Rb[Ri<<sa]
Indexed Load word.
1
E
Rd, Rp
[Ri<part> << 2]
Load word with extracted index.
1
E
Rd, Rp[disp]
Load word with displacement if condition
satisfied. CPU revision 2 and higher only.
1
C
Rd, Rp++
Load doubleword with post-increment.
3
C
Rd, --Rp
Load doubleword with pre-decrement.
3
C
Rd, Rp
Load doubleword.
2
E
Rd, Rp[disp]
Load double with displacement.
2
E
Rd, Rb[Ri<<sa]
Indexed Load double.
2
ldins.b
E
Rd<part>,
Rp[disp]
Load byte with displacement and insert at
specified byte location in Rd.
1
ldins.h
E
Rd<part>,
Rp[disp]
Load halfword with displacement and insert at
specified halfword location in Rd.
1
ldswp.sh
E
Load halfword with displacement, swap bytes
and sign-extend.
1
ldswp.uh
E
Load halfword with displacement, swap bytes
and zero-extend.
1
ldswp.w
E
Load word with displacement and swap bytes.
1
lddpc
C
Load with displacement from PC.
1
ld.w{cond4}
ld.d
Rd, Rp[disp]
Rd, PC[disp]
1
79
32002F–03/2010
AVR32
Table 8-7.
lddsp
st.b
Load and store instructions
C
Rd, SP[disp]
Load with displacement from SP.
1
C
Rp++, Rs
Store with post-increment.
1
C
--Rp, Rs
Store with pre-decrement.
1
C
Rp[disp], Rs
1
Store byte with displacement.
st.b{cond4}
st.d
st.h
E
Rp[disp], Rs
1
E
Rb[Ri<<sa], Rs
Indexed Store byte.
2
E
Rp[disp], Rs
Store byte with displacement if condition
satisfied. CPU revision 2 and higher only.
1
C
Rp++, Rs
Store with post-increment.
2
C
--Rp, Rs
Store with pre-decrement.
2
C
Rp, Rs
Store doubleword.
2
E
Rp[disp], Rs
Store double with displacement.
2
E
Rb[Ri<<sa], Rs
Indexed Store double.
3
C
Rp++, Rs
Store with post-increment.
1
C
--Rp, Rs
Store with pre-decrement.
1
C
Rp[disp], Rs
1
Store halfword with displacement.
st.h{cond4}
st.w
E
Rp[disp], Rs
1
E
Rb[Ri<<sa], Rs
Indexed Store halfword.
2
E
Rp[disp], Rs
Store halfword with displacement if condition
satisfied. CPU revision 2 and higher only.
1
C
Rp++, Rs
Store with post-increment.
1
C
--Rp, Rs
Store with pre-decrement.
1
C
Rp[disp], Rs
1
Store word with displacement.
E
Rp[disp], Rs
1
E
Rb[Ri<<sa], Rs
Indexed Store word.
2
st.w{cond4}
E
Rp[disp], Rs
Store word with displacement if condition
satisfied. CPU revision 2 and higher only.
1
stcond
E
Rp[disp], Rs
Conditional store with displacement.
2/3
stdsp
C
SP[disp], Rs
Store with displacement from SP.
1
E
Rp[disp<<2], Rx,
Ry
Combine halfwords to word and store with
displacement
2
E
Rb[Ri<<sa], Rx,
Ry
Combine halfwords to word and store indexed
2
Swap bytes and store halfword with
displacement.
1
Swap bytes and store word with displacement.
1
sthh.w
stswp.h
E
Rp[disp], Rs
stswp.w
E
80
32002F–03/2010
AVR32
8.11
Multiple data memory access instructions
These instructions perform multiple data accesses. The incremental accesses are performed as
word accesses. The number of cycles is dependent on the number of registers to load or store,
n. The issue latency must be modified as follows:
• LDM and POPM will use an additional cycle if testing of R12 is performed
• LDM and POPM that updates PC will cause a change-of-flow, which is performed in parallel
with the pointer writeback and therefore has a penalty of only one cycle.
• The issue latency for HSB accesses increases if the HSB bus is busy or the slave inserts wait
states.
The instructions in this group will be aborted immediately if any interrupts are pending, in order
to limit the interrupt latency.
Table 8-8.
Multiple data memory accesses
Mnemonics
Operands
Description
Issue
latency
IRAM
Issue
latency
HSB
ldm
E
Rp, Reglist16
Load multiple registers. R12 is
tested if PC is loaded. If PC is in the
register list, p=1, otherwise p=0.
1+n+p
1+2n+p
ldm
E
Rp++, Reglist16
Load multiple registers. R12 is
tested if PC is loaded.
2+n
1+2n
ldmts
E
Rp, Reglist16
Load multiple registers for task
switch.
1+n
1+2n
ldmts
E
Rp++, Reglist16
Load multiple registers for task
switch.
2+n
1+2n
popm
C
Reglist8
Pop multiple registers from stack.
R12 is tested if PC is popped.
2+n
1+2n
pushm
C
Reglist8
Push multiple registers to stack.
2+n
3+n
stm
E
Rp, Reglist16
Store multiple registers.
1+n
2+n
stm
E
--Rp, Reglist16
Store multiple registers.
2+n
3+n
stmts
E
Rp, Reglist16
Store multiple registers for task
switch.
1+n
2+n
stmts
E
--Rp, Reglist16
Store multiple registers for task
switch.
2+n
3+n
81
32002F–03/2010
AVR32
8.12
Branch instructions
The branch instructions cause a pipeline flush and change-of-flow if taken. Two cycles must be
added to the issue latency if the branch is taken.
Table 8-9.
Branch instructions
Mnemonics
br{cond3}
Operands
C
Description
disp
Issue
latency
1
Branch if condition satisfied.
8.13
br{cond4}
E
disp
1
rjmp
C
disp
Relative jump.
1
ret{cond4}
C
Rs
Conditional return from subroutine with move
and test of return value.
1
Call instructions
Call instructions behave similarly to branches, except that the link register (LR) must be
updated. The issue latency presented in the table includes the branch penalty.
The breakpoint instruction takes a single cycle if Debug mode is disabled, in this case it executes as a nop. The breakpoint instruction updates RAR_DBG instead of LR.
Table 8-10.
Call instructions
Mnemonics
Operands
Description
Issue
latency
acall
C
disp
Application call
4
icall
C
Rd
Register indirect call.
4
mcall
E
Rp[disp]
Memory call.
4
C
disp
rcall
E
8.14
4
Relative call.
disp
4
scall
C
Supervisor call
6
sscall
C
Secure State call. CPU revision 3 and higher
only.
5
breakpoint
C
Breakpoint.
3
Return from execution mode instructions
The rete and rets instruction may pop the status register and return address from the system
stack, and perform a branch to the return address. The retd instruction gets the return address
and return status registers from the RAR_DBG and RSR_DBG system registers. The issue
latency presented in the table includes the branch penalty.
82
32002F–03/2010
AVR32
The rete instruction has a latency of 12 cycles when returning from INT0-INT3 modes, 5 cycles
otherwise. The rete instruction can be aborted by a pending interrupt.
Table 8-11.
Return from execution mode instructions
Mnemonics
8.15
Operands
Description
Issue
latency
retd
C
Return from debug mode
3
rete
C
Return from exception
5 / 12
rets
C
Return from supervisor call
5
retss
C
Return from Secure State call. CPU revision 3
and higher only.
5
Swap instructions
The swap instruction performs two atomical memory accesses, first one read and then one
write.
Table 8-12.
Swap instructions
Mnemonics
xchg
8.16
E
Operands
Description
Issue
latency
Rd, Rx, Ry
Exchange register and memory.
2
System register instructions
This group moves data to and from the system registers. Accesses to system registers are performed in the EX stage, taking one cycle.
MTSR to SREG takes 3 cycles, MTSR to all other system registers takes 1 cycle.
Table 8-13.
System register instructions
Mnemonics
8.17
Operands
Description
Issue
latency
mfdr
E
Rd, SysRegNo
Move debug register to Rd.
1
mfsr
E
Rd, SysRegNo
Move system register to Rd.
1
mtdr
E
SysRegNo, Rs
Move Rs to debug register.
1
mtsr
E
SysRegNo, Rs
Move Rs to system register.
1/3
musfr
C
Rs
Move Rs to status register.
1
mustr
C
Rd
Move status register to Rd.
1
System control instructions
This group contains simple single-cycle instructions that control the behaviour of different parts
of the system. The frs, pref and sync instructions are executed as NOP in AVR32UC.
Table 8-14.
System control instructions
Mnemonics
frs
Operands
C
Description
Issue
latency
Invalidate the return address stack.
1
83
32002F–03/2010
AVR32
Table 8-14.
8.18
System control instructions
pref
E
Rp[disp]
Prefetch cache line.
1
sleep
E
Op8
Enter SLEEP mode.
1
sync
E
Op8
Flush write buffer.
1
Read-modify-write instructions
This group contains instructions that perform atomical bit-operations on memory addresses.
These instructions require multiple cycles inside the memory controller, but these can be performed in parallel with subsequent instructions if the following instructions are not memory
access instructions.
A RMW instruction performed on an address in the IRAM section executes in a single cycle if the
IRAM write buffer is empty. If the write buffer is not empty, two cycles are required. The programmer can make sure the buffer is empty by ensuring that the instruction immediately before
the RMW instruction is not a store or another RMW instruction.
If the RMW instruction is performed on an address in the HSB section, four cycles are needed
for the RMW instruction to be executed. Therefore, if another instruction attempts to access
memory within one of the three following clock cycles, up to three stall cycles will be inserted. If
a memory access instruction is scheduled less than 3 cycles after the RMW to HSB instruction,
3-n stall cycles are inserted. Here n is the number of cycles used by instructions between the
RMW instruction and the first memory access instruction. RMW operations to the HSB section
will take additional cycles if the HSB inserts wait states.
When using RMW instructions, try to schedule code so that stall cycles are avoided.
Table 8-15.
Read-modify-write instructions to IRAM section
Mnemonics
8.19
Code example
8.19.1
Assumptions
Operands
Description
Execution
cycles IRAM
Execution
cycles HSB
memc
E
imm, bp
Clear bit in memory.
1/2
4
mems
E
imm, bp
Set bit in memory.
1/2
4
memt
E
imm, bp
Toggle bit in memory.
1/2
4
In the example code given in this chapter, the following assumptions are made:
• r0 points to an address in the IRAM space. IRAM is an alias for r0.
• r1 points to an address in the HSB or BOOT space. HSB is an alias for r1.
• All memories and buses have 0 wait state access.
• The CPU is in a priviliged mode, so that no privilege violations occur
84
32002F–03/2010
AVR32
• All instructions are executed in the precise sequence shown below
Instruction
Cycles
Description
add r5, r0
1
sub r5, r5
1
ssrf AVR32_SREG_C
1
ssrf AVR32_SREG_GM
3
max r6, r1, r0
1
mul r5, r1
1
mac r5, r1
1
1 cycle since r5 is already in the accumulator cache
mac r3, r1
2
2 cycles since r3 is not in the accumulator cache
mac r3, r1
1
1 cycle since r3 is not in the accumulator cache
macs.d r2, r1, r2
4
4 cycles since register pair r3:r2 is not in the accumulator cache
mulwh.d r6, r5, r1:t
1
48 bit result calculated and written back in 1 cycle
st.w IRAM[0], r6
1
divs r4, r5, r6
35
ld.w r8, IRAM++
2
satadd.w r4, r8, r9
1
ld.w r4, IRAM[4]
1
add r4, r4
1
ld.w r5, IRAM[8]
1
ld.w r6, IRAM[12]
1
Loads from IRAM can be adjacent without any stalling
ldm IRAM++, r5, r6, r7, r8
5
ldm from IRAM takes 1+n= 5 cycles when loading 4 registers
mfsr r8, AVR32_SREG
1
cbr r8, 0
1
mtsr AVR32_SREG, r8
3
st.w IRAM[0], r5
1
st.w IRAM[4], r6
1
Stores to IRAM can be adjacent without any stalling
nop
1
nop takes 1 cycle
ld.w r5, HSB[0]
2
Reading from memories on the bus takes 2 cycles
ld.w r6, HSB[4]
2
HSB bus reads are not pipelined, each read takes 2 cycles
st.w HSB[8], r6
1
HSB store done in background if 2 next insn is not mem access
add r5, r6
1
Nonmem insn scheduled after HSB store to avoid stall
and r7, r8
1
Nonmem insn scheduled after HSB store to avoid stall
st.w HSB[8], r6
2
First of consecutive HSB stores requires extra cycle to start
st.w HSB[12], r7
1
Consecutive HSB stores are pipelined.
st.w HSB[16], r8
1
Consecutive HSB stores are pipelined.
add r5, r6
1
Consecutive HSB stores followed by 1 nonmem insn do not stall
SSRF to bits 31-16 takes 3 cycles
Load with postincrement takes two cycles
No data hazard after loads from IRAM
mtsr to SREG takes 3 cycles, 1 cycle required for other sysregs
85
32002F–03/2010
AVR32
Instruction
Cycles
Description
ldm HSB++, r5, r6, r7, r8
9
1+n=1+2r where r=#regs when reading from HSB addresses
stm HSB, r5, r6, r7, r8
6
1+n=1+(1+r) where r=#regs when writing to HSB addresses
add r6, r5
1
Instruction not using the write buffer
memc IRAM, 3
1
Requires only one cycle because write buffer was empty
st.w IRAM[4], r8
1
Instruction filling the write buffer
memc IRAM, 3
2
Requires 2 cycles because write buffer was not empty
mul r5, r9
1
memc HSB, 7
4
ld.w r4, IRAM[16]
1
memc HSB, 7
1
sub r7, r4
1
mul r6, r9
1
or r5, r8
1
Next instruction has a memory access, 3 stall cycles needed
Memory not accessed in the 3 following clock cycles, no stall
86
32002F–03/2010
AVR32
9. OCD system
9.1
Overview
The AVR32 CPU is targeted at a wide range of 32-bit applications. The CPU can be delivered in
very different implementations in various ASIC’s, ASSP’s, and standard parts to satisfy requirements for low-cost as well as high-speed markets. According to the cost sensitivity and
complexity of these applications, a similar span in debug complexity must be expected. While
some users expect very simple debug features, or none at all, others will demand full-speed
trace and RTOS debug support. This also applies to the debug tools: While the simplest development takes place on simulators and development boards, most will require basic on-chip
debug emulators, and a few will require complex emulators with full-speed trace.
To match these criteria, the AVR32 OCD system is designed in accordance with the Nexus 2.0
standard (IEEE-ISTO 5001™-2003), which is a highly flexible and powerful open on-chip debug
standard for 32-bit microcontrollers.
9.1.1
Features
• Nexus compliant debug solution
• OCD supports any CPU speed
• Execute debug specific CPU instructions (debug code) from program memory monitor or
external debugger
• Debug code can read and write all registers and data memory
• Debug code can communicate with debugger through the debug port
• Debug mode can be entered by external command, breakpoint instruction, or hardware
breakpoints
• Six program counter hardware breakpoints are supported
• Two data breakpoints are supported
• Breakpoints can be configured as watchpoints (flagged to the external debugger)
• Hardware breakpoints can be combined to give break on ranges
• Real-time program counter branch tracing
• Real-time data trace
• Real-time process trace
• Nexus Class 2+
9.1.2
OCD controller overview
The OCD system interfaces provides the external debugger with access to the on-chip debug
logic through the JTAG port and the Auxiliary (AUX) port, as shown in Figure 9-1. The operation
is described briefly below and in more detail in separate chapters.
9.1.2.1
Host, debugger, and emulator
At the host side, the user debugs his software using a source level debugger, which can read his
compiled and linked object code. The source level debugger accesses features in the emulator
and OCD system through an API (defined by the vendor or based on the Nexus recommendations), which constitutes the abstract interface between the source level debugger and the
emulator. The API translates high-level functions, such as setting breakpoints or reading memory areas, to sets of low level commands understood by the OCD controller. Certain operations
87
32002F–03/2010
AVR32
(such as reading the register file) may require running sections of debug code on the CPU,
which can also be handled in this level. The emulator translates the communication from the
host into commands transmitted to the target over the JTAG port. If trace is enabled, trace messages are transmitted from the device on the Nexus-defined auxiliary (AUX) port. The AUX port
can be scaled to the number of output pins needed to sustain the estimated bandwidth requirement. The Nexus protocol defines the format of the messages and signals, the pin count options
and pinout of the debug port, and the type of connector used.
Figure 9-1.
Block diagram of the OCD system (shaded) and its main connections.
TAP
JTAG Port
Host
Emulator
Service Access
Port (SAP)
AUX Port
OCD system
Transmit Queue
W atchpoint
msg
Service Access Bus (SAB)
Data
Trace msg
CPU observation units
Debug
Status msg
Data Trace
Trigger
Program
Trace
Flow
Control
Unit
Breakpoint
Trigger
OCD
Debug
control
inst
signals
PC
Comparators
Breakpoint Unit
Data
Comparators
Branch
Trace
Message
Ownership
Trace
Message
Ownership
Trace
Unit
CPU
observation
signals
CPU
9.1.2.2
Accessing the debug features
A number of blocks handle the various debug functions specified by the Nexus standard. The
emulator communicates with registers in these blocks by commands on the JTAG port, as specified by the Nexus standard. OCD registers are typically used for configuration, control, and
status information. Trace information and debug events can also generate messages to be
transmitted on the AUX port.
Registers are indexed and are accessed through Read Register and Write Register messages
from the emulator. Alternatively, they can be accessed by the CPU through mtdr and mfdr
instructions, which gives a debug monitor in the CPU access to most of the debug features in
the OCD system, as described in “OCD Register Access” on page 98.
88
32002F–03/2010
AVR32
9.1.2.3
Transmit Queue
Trace and watchpoint messages are inserted into the Transmit Queue (TXQ) before being transmitted on the AUX port. This provides some flexibility between the peak rate of trace message
generation and the average rate of message transmission on the AUX port.
9.1.2.4
Flow Control Unit
The Flow Control Unit (FCU) can bring the CPU into and out of Debug Mode, and control the
CPU operation in Debug Mode. The behavior is controlled by accessing OCD registers.
Debug Mode can be configured as OCD Mode or Monitor Mode. In OCD mode, The CPU
fetches instructions from the Debug Instruction Register. If the register is empty, the CPU is
halted. In Monitor Mode, the CPU fetches debug instructions from a monitor code in the program
memory, and the Debug Instruction Register is not used.
The FCU also handles single stepping by returning the CPU to normal mode, letting the CPU
fetch one instruction from the program memory, and then returning to Debug Mode on the following instruction.
9.1.2.5
Breakpoint modules
A number of instruction and data breakpoint modules can be configured for run-time monitoring
of the instruction fetches and data accesses by the CPU. The modules can report if the monitored operation matches a predefined address, alternatively, also a data value. The modules
operate on virtual addresses.
A breakpoint will bring the CPU into Debug Mode. Watchpoints are reported to the debugger,
but does not affect CPU operation. A watchpoint can also be configured to start or stop data and
program trace.
The breakpoint modules can be combined to produce a watchpoint or breakpoint. Complex
breakpoint/watchpoint conditions are supported, e.g. trigger when a specific procedure writes a
certain variable with a specific value.
9.1.2.6
Program and Data Trace
The Program Trace Unit sends Branch Trace Messages to the debugger, which allows the program flow to be reconstructed. To keep the amount of debug information low to save bandwidth,
only change of program flow are reported (such as unconditional branches, taken conditional
branches interrupts, exceptions, return operations, and load operations with PC as destination),
hence the term "branch tracing". Messages are typically relative to the previously transmitted
message, to be able to compress information as much as possible. Thus, the trace messages
are sent out in temporal order, and regularly, synchronization messages with uncompressed,
absolute addresses, are transmitted in case synchronization is lost.
The Data Trace Unit similarly traces data accesses, for read or write accesses, or both. Similar
relative address compression and synchronization schemes are used for Data Trace Messages.
Since new trace messages can be generated before the previous ones have been transmitted,
all trace messages are queued before being transmitted by the AUX interface. If the queue overflows, the CPU can be halted to avoid losing trace information, or an error message followed by
synchronization trace messages will be transmitted.
9.1.2.7
OS debug support
Applications developed on an OS platform places special requirements on the OCD controller
and the debug software. For high-level debugging, the user will want to see which process is
89
32002F–03/2010
AVR32
running at any time, without having to interrupt the CPU or trace the program flow. This is
accomplished through Ownership Trace Messaging, in which the process ID of the running process is reported at every process switch. The CPU writes the process ID to an OCD register in
the Ownership Trace Unit, which in turn generates an Ownership Trace Message.
9.1.2.8
Timestamps
The emulator can tag events with a timestamp when they are extracted from the OCD system
and transmitted to the emulator, to provide timing information for these events when they are
transmitted to the debug host. However, due to the delay of the transmit queue and transmit time
over the AUX port, this timing will have limited accuracy. To compensate for this, the EVTO pin
can be configured to toggle every time a message is inserted into the Transmit Queue, thus indicating very precisely when each event occurs. The emulator would then store a queue of
timestamp tags with each event, and associate each tag with the corresponding message, as
they are extracted on the AUX port.
9.2
CPU Development Support
The OCD system can bring CPU into and out of Debug Mode, and control the CPU operation in
Debug Mode. The behavior is controlled by OCD register configuration, stop commands from
the debugger, or breakpoints. The OCD registers can be accessed by Nexus messages or from
the CPU as memory-mapped registers.
9.2.1
Debug Mode
Debug Mode is an execution mode dedicated to application debugging and is not intended for
running application code. Debug Mode can execute a debug code either from an external
debugger through the OCD system (OCD Mode), or from a debug routine in program memory
(Monitor Mode). The debug code will typically read out system registers and information about
the various processes running in the system before restarting.
The Nexus class 2+ compliant OCD system contains breakpoint and trace modules, and other
features for debugging code on the CPU. These features are generally accessible both in OCD
Mode and Monitor Mode. In OCD Mode, the debugger accesses the features through messages
over the AUX debug port, and in Monitor Mode, the CPU accesses the features through mtdr
and mfdr instructions. The OCD system runs at system speed to stay synchronous with the CPU
at all times. If the CPU is in a low-power sleep mode, it is woken up before entering Debug
Mode.
9.2.1.1
Operations in Debug Mode
Debug Mode is characterized by the Debug (D) bit in the Status Register (SR) in the CPU.
Debug Mode is a privileged mode, and all legal instructions and memory operations are permitted Illegal opcodes or memory operations which would normally cause an exception will be
ignored in Debug Mode.
The Debug Mode has a dedicated Link and Return Status Register (RAR_DBG and RSR_DBG,
respectively) but no other masked registers. RAR_DBG and RSR_DBG are not observable as
part of the register file, only as system registers. The register file view is mapped according to
the mode bits in the Status Register (M[2:0]). These bits are set to the exception context when
entering Debug Mode, but can be changed freely within Debug Mode by writing to SR. In this
way, different register contexts can be observed and modified, while maintaining the execution
and access privileges of Debug Mode.
90
32002F–03/2010
AVR32
Debug Mode is exited by the retd instruction, both in Monitor Mode and OCD Mode. This
restores PC from RAR_DBG and SR from RSR_DBG.
9.2.1.2
A typical debug session flow
Figure 9-2 shows an example of a typical flow in Debug Mode. A software or hardware breakpoint aborts the execution of an instruction and causes Debug Mode to be entered. If the Monitor
Mode (MM) bit in the Development Control (DC) OCD register is set, Monitor Mode is entered,
and the CPU jumps to the software debug monitor starting at EVBA+0x01C. Otherwise, OCD
Mode is entered, and the CPU stalls while waiting for instructions to be entered by the external
debugger through the Debug Instruction (DINST) OCD register. In either case, the D bit in the
CPU Status Register is set during the whole debug session, giving access to all privileged operations. Any number of instructions can be executed before returning to the breakpointed
instruction by the retd instruction. RAR_DBG stores the address of the breakpointed instruction,
and manipulating RAR_DBG in Debug Mode is useful if a different return address is desired (for
instance, to avoid repeated hits on a breakpoint instruction).
Figure 9-2.
Example of flow in Debug Mode.
User code
Debug Mode
LR_DBG
Breakpointed instruction
DC:MM?
External
Debugger
0 = OCD Mode
1 = Monitor Mode
Write Register commands
EV BA +0x300
DINST
Inst
Instructions from
externaldebugger
SR:D = 1
Software debug monitor
SR:D = 1
retd
retd
9.2.2
Monitor Mode
If the Monitor Mode (MM) bit in the Development Control register (DC) is set, the CPU will enter
Debug Mode in Monitor Mode. Instructions are fetched from the monitor code located in the program memory at the Exception Vector Base Address (EVBA) + 0x01C. The monitor code
contains the necessary mechanisms to read and modify CPU and system registers, and memory
91
32002F–03/2010
AVR32
areas. All other exceptions and interrupts are masked by default when entering Monitor Mode,
but the monitor code can explicitly unmask interrupts to allow critical interrupts to be serviced
while the system is being debugged.
The monitor code will typically communicate with an external debug tool, or (in cases of
advanced systems like PDA’s) a debug tool running within the application (self-hosted debugger). Communication with the external tool may take place over any communication link present
in that device (e.g. USB, RS232), if such a communication line can be reserved for debug
purposes.
Alternatively, the Debug Communication Mechanism in the OCD system can be used to communicate between the CPU and emulator over the JTAG port. This is a set of OCD registers which
can be written by the CPU or emulator, allowing a communication protocol to be developed in
software. This mechanism can be used in any privileged CPU mode, including OCD Mode.
Monitor Mode is exited with the retd instruction.
9.2.2.1
Debugging a monitor code
Each execution mode has a mask bit in SR, which indicates if a request to enter that mode will
be taken or masked. The default priority of modes are reflected in these bits: When entering an
execution mode, modes of the same or lower priority are masked. Privileged modes can override the mask, to dynamically change priorities (e.g. to allow critical interrupts to be serviced).
By default, Debug Mode has priority above all other execution modes. This implies that any
supervisor or user code can be interrupted by Debug Mode. Other modes can be explicitly
unmasked by a monitor code to allow critical interrupts to be serviced. By default, Debug Mode
is masked by the Debug Mask (DM) bit in SR when executing in Monitor Mode. The Monitor
Mode can stack away the RAR_DBG and RSR_DBG and then explicitly clear the DM bit to
enable Debug Mode to be re-entered. If a debug exception occurs in Monitor Mode, the OCD
system will bring the CPU into OCD Mode, even if the MM bit is set. This allows Monitor Mode
programs to be debugged.
9.2.3
OCD Mode
If the Monitor Mode (MM) bit in the Development Control register (DC) is cleared, the CPU will
enter Debug Mode in OCD Mode. When the CPU is in OCD Mode, the Debug Status (DBS) bit
in the Development Status (DS) register is set, in addition to the D bit in SR in the CPU. OCD
Mode is similar to Monitor Mode, except that instructions are fetched from the OCD system.
OCD instructions are loaded by the debug tool by writing the opcode to the Debug Instruction
register (DINST). Once an instruction is written to DINST, the CPU will fetch it, and the Instruction Complete bit in DS (DS:INC) will be cleared until the CPU has completed the operation. The
CPU is then halted until DINST is written again.
The first instruction entered must be aligned to the MSB of DINST. A sequence of instructions
can be entered to DINST one word at a time, in the same sequence they would appear in program memory, i.e. they do not need to be word aligned. If the upper halfword of an extended
instruction is written to the lower halfword of DINST, the lower halfword of the instruction is written as the upper halfword of DINST in the next access. If the last instruction in a sequence is
written to the upper halfword of DINST, the lower halfword should be written with a nop opcode.
See Figure 9-3 for an illustration of a sequence of operations used to execute instructions in
OCD Mode.
92
32002F–03/2010
AVR32
Any instruction valid in Monitor Mode is also valid in OCD Mode. Memory operations can be conducted without any special synchronization with external hardware.
All OCD units can be configured while the CPU executes in OCD Mode, but the following debug
features are disabled:
• PC breakpoints
• Data breakpoints
• Watchpoints
• Program Trace
• Data Trace
OCD Mode is exited by writing the retd instruction to DINST.
Figure 9-3.
Executing instructions on the CPU in OCD Mode.
OCD
Instructions
Opcode
Written by
tool to DINST
mov r12,r7
0x0E9C
0x0E9C201C
INC→0→1
sub
0x201C
r12,0x01
mov r6,r12
0x1896
0x1896F807
INC→0→1
adc
0xF807 0046
0x0046D623
INC→0→1
r6,r12,r7
retd
9.2.4
Changes in DS
0xD623
DBS→0
Entry into Debug Mode
Debug Mode can only be entered when the OCD is enabled, and Debug Mode is not masked.
The following ways of entry are then possible:
• Debug request from the debugger
• Program counter breakpoint
• Data address or value breakpoint
• breakpoint instruction
• Trapping opcode 0x0000
• Single step
• Event on EVTI pin
• Abort command from the debugger
The debugger can identify the condition which caused entry into Debug Mode by examining the
status bits in the Development Status register (DS). Each cause of entry has a particular bit
associated with it. Several exceptions can trigger simultaneously, causing more than one bit to
be set.
Note that any privileged CPU mode may write the SR:D bit to one directly, but this will not cause
entry to Debug Mode.
93
32002F–03/2010
AVR32
9.2.4.1
Debug request
The debugger may want to stop CPU operation, unrelated to current instruction execution, e.g. if
the user presses a "STOP" button in the debug tool GUI. The debugger will then write the Debug
Request (DBR) bit in the Development Control Register (DC). This causes the CPU to enter
Debug Mode on the next instruction to be executed, before execution.
9.2.4.2
Program counter breakpoint
The Program Counter breakpoints can be configured to halt the CPU when executing code at a
specific address, or address range. This will cause the CPU to be halted before the breakpointed instruction is executed.
The Ignore First Match (IFM) bit in the Development Control (DC) register should be written to
one before exiting Debug Mode, to avoid re-triggering the program breakpoint. This bit only prevents program breakpoints from re-triggering. If the instruction causes a breakpoint for another
reason (e.g. a breakpoint instruction or a data breakpoint), Debug Mode will be re-entered.
9.2.4.3
Data address or value breakpoint
CPU memory accesses can be monitored by data breakpoint comparators in the OCD system. If
the access matches a set of predefined conditions (e.g. address, value, or access type), Debug
Mode is entered after the memory operation completes, but before the next instruction is
executed.
Data breakpoints are precise, halting on the instruction immediately after the memory operation
which caused the breakpoint. The CPU will return to the first non-executed instruction when a
retd is executed.
9.2.4.4
breakpoint instruction
The breakpoint instruction is programmed along with the object code into the program memory
or instruction cache, and is decoded by the CPU. When this instruction is scheduled for execution and Debug Mode is enabled, the CPU will enter Debug Mode. If Debug Mode is disabled
(e.g. masked by the DM bit in the Status Register, or DBE in DC is zero), the breakpoint instruction will execute as a nop (no operation).
For devices based on volatile program memory, the breakpoint instruction can be dynamically
inserted into the code by the debug tool, enabling an unlimited number of program breakpoints
in the code. This involves replacing an existing opcode with a breakpoint instruction. The
replaced opcode has to be re-inserted before exiting Debug Mode. Note that this is only possible
in OCD Mode.
For devices based on non-volatile program memory, the breakpoint instruction can be statically
compiled or linked into the code before downloading, marking all points the program can be
halted. Debug Mode will be entered for all breakpoints (if Debug Mode is enabled), and the
debugger would return immediately if it does not want to halt at a particular breakpoint location in
the code.
The breakpoint will be taken before the breakpoint instruction is actually executed. This has the
effect that the CPU will return from Debug Mode to the same breakpoint instruction, re-entering
Debug Mode immediately, unless the OCD system is configured to modify the return address or
replace the breakpoint instruction from the instruction flow. The IFM bit does not have an effect
when Debug Mode returns to a breakpoint instruction.
94
32002F–03/2010
AVR32
9.2.4.5
Trapping opcode 0x0000
In Flash-based microcontrollers, the opcode 0x0000 can overwrite any other opcode without
having to erase and reprogram the Flash. Therefore this instruction can enter Debug Mode, as
for the breakpoint instruction. However, the opcode 0x0000 is also a valid part of the instruction
set (ADD R0,R0 in AVR32) and can be part of the software to be debugged. Therefore, the user
must write the DC:TOZ (Trap Opcode Zero) bit to one to enable this feature.
The DS:BOZ bit will be set if Debug Mode is entered due to a trapped 0x0000 instruction. The
debugger must then identify whether this opcode belongs to the original object file or has been
inserted by the debugger as a software breakpoint. If it was part of the object file, the debugger
should use the Instruction Replacement to return to the program, and insert the 0x0000 opcode
in DINST. Executing 0x0000 during Instruction Replacement only performs an ADD R0,R0 operation without re-entering Debug Mode.
9.2.4.6
Single stepping
The debugger will typically allow the user to step through the application source or object code,
line by line. This single stepping can be either of step-into or step-over type. Step-into will execute exactly one instruction and halt the CPU at the start of the next instruction, regardless of
whether this instruction is part of the main program, subroutine, interrupt, or exception. Stepover will execute the current instruction and any lower-level events generated before the following instruction (including subroutines, interrupts, and exceptions).
Step-over in the object code and all single stepping in the source code are implemented by configuring a program breakpoint on the address of the next object code instruction where the
debugger expects to halt.s
Step-into is implemented in OCD hardware and is controlled by the Single Step (SS) bit in the
Development Control register. When Debug Mode is exited by retd, exactly one instruction from
the program memory will be executed before Debug Mode is re-entered. This mechanism works
identically for OCD and Monitor Mode.
9.2.4.7
Event on EVTI pin
If the Event In Control (EIC) bits in DC are written to 0b01, a high-to-low transition on the EVTI
pin will generate a breakpoint. EVTI must stay low for one CPU clock cycle to guarantee that the
breakpoint will trigger. The External Breakpoint (EXB) bit in DS will be set when a breakpoint is
entered due to an event on the EVTI pin.
9.2.4.8
Abort command
Some software errors could cause the CPU to get stuck in a state which does not allow Debug
Mode to be entered through the mechanisms described above. An example is if a privileged
mode writes SR:DM to one, without clearing the bit.
To prevent the debugger from hanging indefinitely, the debugger can write the DC:ABORT bit to
one after some timeout period, and force the CPU to enter Debug Mode. The abort command
behaves identical to a debug request, except that the DM bit and any pending exception will be
ignored, regardless of exception priority. The RAR_DBG and RSR_DBG will reflect the last nonexecuted instruction, which can aid in locating the error.
If Debug Mode is entered due to an abort command, DS:DBA will be set, as for debug requests.
95
32002F–03/2010
AVR32
9.2.5
Exceptions and Debug Mode
Debug Mode has priority over any execution mode, so that breakpoints can be set in exception
and interrupt routines. However, if a breakpoint is set on an instruction which triggers a critical
exception, the breakpoint is flushed. Critical exceptions are exception which are asynchronous
to the CPU (interrupts), exceptions which invalidate the currently fetched instruction (e.g.
instruction address exceptions), and exceptions which indicate that the system has become
unstable and should abort the program flow (e.g. bus error). The complete list of exceptions with
higher priority than Debug Mode are listed in the exception chapter in the AVR32 Architecture
Manual.
If a PC breakpoint, a breakpoint instruction, or a trapped 0x0000 opcode is flushed by an exception, Debug Mode will not be entered. If another type of breakpoint has triggered, Debug Mode
will be entered on the first instruction in the exception handler.
In the rare cases where the first instruction in a critical exception also triggers a critical exception
(e.g. if EVBA is set incorrectly, triggering an infinite loop of instruction address exceptions), the
debugger must write the DC:ABORT bit to one to halt the CPU and enter Debug Mode to identify
the error.
9.2.6
Instruction replacement
A convenient way of implementing an unlimited number of instruction breakpoints is letting the
debugger replace an instruction by a breakpoint instruction. This mechanism is only available in
OCD Mode on devices implemented with writeable program memory or writeable instruction
cache. If this instruction executes, Debug Mode will be entered, and the debugger identifies the
breakpointed location. When returning, the breakpoint instruction must be replaced by the original instruction. The debugger will write the Instruction Replace (IRP) bit in DC and the
appropriate instruction in the Debug Instruction Register and its corresponding PC value in the
Debug Program Counter (DPC). When retd is executed, PC and SR are restored, but one more
instruction is fetched from the OCD system before returning to fetching from program memory.
Note that instruction replacement operates on word boundaries. The debugger must store the
whole word containing the replaced opcode before inserting the breakpoint instruction. Also note
that DPC should always be written when performing an instruction replacement to ensure the
correct instruction is executed.
The debugger will then perform the following sequence when exiting OCD Mode. Note that
RAR_DBG is accessed through executing CPU instructions through the Debug Instruction register (DINST). The same sequence can be used both for compact and extended instructions,
regardless if the extended instruction is unaligned (in which case only the upper halfword of the
instruction is replaced).
1. Write RAR_DBG to the Debug Program Counter.
2. Increment RAR_DBG by 2 or 4, so the register points to the start of the next word in the
program memory.
3. Write 1 to Instruction Replace (IRP) in DC.
4. Write a retd instruction to DINST. The CPU will exit Debug Mode and stall while waiting
for new instructions.
5. Write the stored word to DINST. This instruction is fetched by the CPU, and the CPU
continues normal program execution.
96
32002F–03/2010
AVR32
9.2.6.1
Instruction replacement example
Table 9-1 shows an example of a code where the user wants to insert a breakpoint.
Table 9-1.
Example of a user code section
PC value
Opcode
Instruction
0x000010
0x0E9C
mov r12,r7
0x000012
0x201C
sub r12,0x01
0x000014
0xC0AC
rcall label1
0x000016
0xF8070046
adc r6,r12,r7
0x00001A
0x2027
sub r7,0x02
The tool wants to insert a software breakpoint on the instruction "adc r6,r12,r7" on
PC=0x000016. This is an extended instruction, and only the upper halfword needs to be
replaced by the breakpoint instruction.
1. The upper halfword is contained within the word located at 0x000014, and the debug
tool stores this value (0xC0ACF807).
2. The debugger writes a breakpoint instruction (opcode 0xD673) to location 0x000016 in
the CPU’s program memory to replace the most significant word of the breakpointed
instruction.
3. When the breakpoint instruction executes, the CPU will enter OCD Mode, and DS:DBS
and DS:SWB are set, indicating that OCD Mode is entered due to a software
breakpoint.
4. The tool performs a normal sequence of operation in OCD Mode.
5. When the tool is ready to return to normal CPU operation, it reads the RAR_DBG value
to find the return address.
6. The tool inserts CPU instructions to DINST to increment RAR_DBG by 2, so it is
aligned to the next word in the program memory.
7. The tool inserts a "retd" instruction to DINST. The tool will receive a Debug Status message, which indicates that the CPU has exited OCD Mode, and is now waiting for one
more instruction from the tool.
8. The tool writes the return address (0x000016) to the Debug Program Counter (DPC).
9. The tool looks up the stored instruction word (based on the return address) and writes
this value (0xC0ACF807) to the Debug Instruction Register (DINST). The CPU now
resumes normal operation.
9.2.7
Sleep Mode
If the CPU is in sleep mode, it will not receive clocks nor respond to an OCD request from the
debugger. Thus, if the Debug Request bit in DC is written to one while the CPU is in sleep mode,
the CPU will automatically return to active mode. The instruction following the sleep instruction
will be tagged with an OCD exception, and the CPU will jump directly to Debug Mode. The normal debug procedure can be followed while executing in Debug Mode. If Debug Mode is entered
from sleep mode, the Stop Status (STP) bit in the Development Status register will be set.
When returning from Debug Mode, the CPU will by default return to the instruction following the
sleep instruction. The debugger can handle this situation in two ways:
Ignore the problem, effectively waking the CPU from sleep mode on a debug request.
97
32002F–03/2010
AVR32
Decrement RAR_DBG in Debug Mode to return to the sleep instruction. This places the CPU
back into sleep mode after exiting Debug Mode.
9.2.8
OCD Register Access
The OCD registers control the OCD system. Their specification is based on the Nexus Recommended Registers as outlined in the Nexus Standard Specification [IEEE-ISTO 5001™-2003].
All registers can be accessed through the JTAG interface.
9.2.9
OCD features in Debug Mode
When the CPU executes in Debug Mode, certain OCD features will be disabled. The following
table indicates how the various OCD features will behave in Debug Mode. For more information
on the specific features, please see the indicated page.
Table 9-2.
OCD features in Debug Mode
Feature
Available in Debug Mode?
Program Breakpoints (HW)
Yes, in Monitor Mode when SR:DM is cleared
Software Breakpoints
Yes, in Monitor Mode when SR:DM is cleared
Data Breakpoints
Yes, in Monitor Mode when SR:DM is cleared
Watchpoints (program and data)
Yes, in Monitor Mode
Program Trace
No
Data Trace
No
Ownership Trace
Yes
Debug Communication Mechanism
Yes
9.2.10
OCD Registers Accessed by CPU
A monitor program running on the target can access the OCD registers through mtdr and mfdr
instructions. These instructions transfer data between a register in the register file and an OCD
register, according to the register index given in “OCD Register Summary” on page 153. These
instructions can also be used in OCD mode to transfer information from the register file and system registers to the debugger, through the Debug Communication Mechanism.
9.2.11
Runtime write access to OCD registers
The OCD registers can always be accessed by JTAG when the when the OCD system is not
enabled or the CPU is in OCD Mode. The OCD registers can also be read by JTAG at any time,
and by the CPU in any privileged mode.
When the CPU is in other modes - either running normal code, or executing in Monitor Mode the OCD registers can be written by JTAG as specified in Table 9-3. If the registers are
accessed in another way than specified, undefined operation may result.
The OCD Register Protect (ORP) bit in DC define the allowed write access to OCD registers in
privileged modes. If the ORP bit in DC does not allow CPU access to OCD registers in the cur-
98
32002F–03/2010
AVR32
rently executing mode, only PID and DCCPU can be written. Illegal access to the registers will
be ignored with no error reporting.
Table 9-3.
9.2.12
OCD Register access
Register
Can be written by JTAG
while CPU is running?
Can be written by
CPU in Monitor
Mode?
Development Control (DC)
Yes
Yes
Watchpoint Trigger (WT)
Yes
Yes
Data Trace Control (DTC)
Can be written to disable /
enable trace channels.
Yes
Data Trace Start Address (DTSA) Channel 1 to
2
Can only be written while
trace channel is disabled
Yes
Data Trace End Address (DTEA) Channel 1 to
2
Can only be written while
trace channel is disabled
Yes
PC Breakpoint/Watchpoint Control (BWC)
Can be written to disable /
enable watchpoints /
breakpoints.
Yes, if SR:DM is set.
Data Breakpoint/Watchpoint Control (BWC)
Can be written to disable /
enable watchpoints /
breakpoints.
Yes, if SR:DM is set.
PC Breakpoint/Watchpoint Address (BWA)
Can only be written while
breakpoint / watchpoint is
disabled
Yes, if SR:DM is set
or breakpoint
disabled.
Data Breakpoint/Watchpoint Address (BWA)
Can only be written while
breakpoint / watchpoint is
disabled
Breakpoint/Watchpoint Data (BWD)
Can only be written while
breakpoint / watchpoint is
disabled
Ownership Trace Process ID (PID)
Yes
Yes
Debug Instruction Register
No
No
Debug Program Counter
No
No
Debug Communication CPU (DCCPU)
Yes
Yes
Debug Communication Emulator (DCEMU)
Yes
Yes
Yes, if SR:DM is set
or breakpoint
disabled.
Yes, if SR:DM is set
or breakpoint
disabled.
OCD Interrupts
To support custom debug protocols running in software the OCD system support giving interrupts to the CPU when DCEMU is written or when DCCPU is read. A software protocol handler
can then be triggered by these interrupts instead of having to poll DCSR to see if the data in
DCCPU or DCEMU has been read or written.
To enable these interrupts the user must do the following:
• Program the interrupt controller with the correct priority and handler address for the interrupt.
• Enable the interrupts from the OCD by setting the corresponding bits in DCCR
99
32002F–03/2010
AVR32
• Turn off the interrupt masks in the CPU
When an interrupt occurd the CPU will jump to the interrupt handler routine and process the
interrupt. The interrupt handler must clear the interrupt before leaving this routing. This is done
by witing a zero to DCEMUDI or DCCPURI for DCEMU reads and DCCPU writes respectively:
9.2.13
Messages
9.2.13.1
Debug Status (DEBS)
This message is output when the CPU enters or exits Debug Mode or a low-power mode. The
message is output whenever the AUX port is enabled. The STATUS field of this message contains the information in the Development Status register. The field will contain these values:
• The CPU enters Debug Mode: STATUS bits indicate cause of entry to Debug Mode. DBS is
set if OCD Mode was entered.
• The CPU exits Debug Mode: STATUS = 0. This includes exiting Debug Mode by writing
DC:RES.
• The CPU enters a low-power mode: Only the STP bit is set, while the other bits are zero.
• The CPU exits a low-power mode: STATUS = 0
Table 9-4.
Debug Status
Debug Status Message
9.2.14
Packet
Size
Packet
Name
Packet
Type
Description
32
STATUS
Fixed
The contents of the Development Status register.
6
TCODE
Fixed
Value = 0
Registers
9.2.14.1
Device ID Register (DID)
The Device ID Register (DID) provides key attributes to the development tool concerning the
embedded processor. This is the same as the value returned by the JTAG ID instruction.
Table 9-5.
9.2.14.2
DID Register
R/W
Bit Number
Field Name
Init. Val.
Description
R
31:28
RN
Part
specific
RN - Revision Number
R
27:12
PN
Part
specific
PN - Product Number
R
11:1
MID
0x01F
Manufacturer ID
0x01F = ATMEL
R
0
Reserved
1
Reserved
This bit always reads as 1
Nexus Configuration Register (NXCFG)
The Nexus Configuration Register (NXCFG) provides key information about the specific implementation of the CPU and OCD architecture, and the configuration of the Nexus development
100
32002F–03/2010
AVR32
features on this device. This information is static, and may be used to develop generic Nexus
debuggers which will work across a family of AVR32 devices with different Nexus configurations.
Table 9-6.
R/W
Bit Number
Field Name
Init. Val.
R
31:29
Reserved
0
R
28
NXDMA
0
Direct Memory Access support
0 = Not supported
1 = Supported
R
27:25
NXDTC
0
Data Trace Channels
0 = Not supported
1 = Supported
Description
R
24
NXDRT
0
Data Read Trace Support
0 = Not supported
1 = Supported
R
23
NXDWT
0
Data Write Trace Support
0 = Not supported
1 = Supported
R
22
NXOT
0
Ownership Trace support
0 = Not supported
1 = Supported
0
Program Trace support
0 = Not supported
1 = Supported
R
21
NXPT
R
20:17
NXMDO
6
AUX MDO pins
0 = no MDO or MSEO pins
n = n MDO pins, NXMSEO MSEO pins
R
16
NXMSEO
1
AUX MSEO pins
0 = 1 MSEO pin
1 = 2 MSEO pins
R
15:12
NXDB
2
Number of Data breakpoints
R
11:8
NXPCB
6
Number of PC breakpoints
0
OCD Version
0000 = AVR32AP OCD
0001 = AVR32UC OCD
Other = Reserved
0
Architecture
0000 = AVR32B
0001 = AVR32A
Other = reserved
R
R
9.2.14.3
Nexus Configuration Register
7:4
3:0
NXOCD
NXARCH
Debug Communication CPU Register (DCCPU)
If the CPU wants to transmit data to the debugger tool, it writes data to the Debug Communication CPU Register using mtdr. By writing this register, a dirty bit is set in the Debug
101
32002F–03/2010
AVR32
Communication Status Register. The emulator should poll the status register and read DCCPU if
the dirty bit is set.
Table 9-7.
9.2.14.4
R/W
Bit Number
Field Name
Init. Val.
Description
R/W
31:0
DATA
0x0000_
0000
Data Value
Data written by CPU
Debug Communication Emulator Register (DCEMU)
When the emulator writes to this register, a dirty bit is set in the Debug Communication Status
register. The CPU can poll this bit to see if DCEMU contains unread data..
Table 9-8.
9.2.14.5
Debug Communication CPU Register
Debug Communication Emulator Register
R/W
Bit Number
Field Name
Init. Val.
Description
R/W
31:0
DATA
0x0000_
0000
Data Value
Data written by Emulator
Debug Communication Status Register (DCSR)
To avoid overruns the CPU must poll this register before writing a new value to DCCPU. Note
that the bits in this register are not automatically cleared in OCD mode. This allows a debugger
to update views and observe the system without accidentally modifying the DCSR register.
The OCD system can produce interrupts when the DCEMU register has been updated and when
the DCCPU register is read. The CPURI and EMUDI flags are set on the interrupt events, but
are cleared by software by writing the DCSR register. To enable the interrupts the corresponding
bits in the DCCR register has to be set and the Interrpt controller has to be programmed.
Table 9-9.
Debug Communication Status Register
R/W
Bit Number
Field Name
Init. Val.
Description
R
31:4
Reserved
0x0000_
0000
Reserved
These bits are reserved, and will always read as 0
0
Emulator Data Dirty Interrupt flag
0 = DCEMU has not been written to since the
clearing of this bit.
1 = DCEMU contains a new data value.
This bit is cleared by writing this bit to 0.
R/W
1
EMUDI
102
32002F–03/2010
AVR32
Table 9-9.
R/W
R/W
R/W
R/W
9.2.14.6
Debug Communication Status Register
Bit Number
1
Field Name
CPURI
1
EMUD
0
CPUD
Init. Val.
Description
0
CPU Data Read Interrupt flag
0 = DCCPU has not been read since the clearing
of this bit.
1 = DCEMU has been read.
This bit is cleared by writing this bit to 0.
0
Emulator Data Dirty
0 = DCEMU has not been written to since last read
from CPU.
1 = DCEMU contains a new data value.
This bit is cleared by reading DCEMU.
0
CPU Data Dirty
0 = DCCPU has not been written to since last read
from emulator.
1 = DCCPU contains a new data value.
This bit is cleared by reading DCCPU.
Debug Communication Control Register (DCCR)
To enable the DCCPU read and DCEMU dirty interrupts the corresponding enable bits must be
set in this register.
Table 9-10.
Debug Communication Control Register
R/W
Bit Number
Field Name
Init. Val.
Description
R
31:2
Reserved
0x0000_
0000
Reserved
These bits are reserved, and will always read
as 0
R/W
1
DCCPUIMASK
0
DCCPU Interrupt Mask
0 = DCCPU interrupts are disabled.
1 = DCCPU interrupts are enabled.
R/W
0
DCEMUIMASK
0
DCEMU Interrupt Mask
0 = DCEMU interrupts are disabled.
1 = DCEMU interrupts are enabled.
103
32002F–03/2010
AVR32
9.2.14.7
Development Control Register (DC)
DC is used for basic development control of the CPU.
Table 9-11.
R/W
R/W
S
R/W
R/W
Development Control Register
Bit Number
31
30
29
28
Field Name
ABORT
RES
MM
ORP
Init. Val.
Description
0
ABORT
Writing ABORT to one while DBE is asserted
causes the CPU to enter Debug Mode, regardless
of SR:DM and any pending exceptions. If the CPU
was in sleep mode, it will first be woken up before
entering Debug Mode. The ABORT bit is cleared
automatically when Debug Mode is entered.
0
RES - Application Reset
Writing this bit causes an application reset, which
will reset the CPU and other system modules. The
OCD state machines will be reset and the Transmit
Queue flushed, but the OCD control and
configuration registers will not be cleared.
0
MM - Monitor Mode
1 = The CPU will enter Debug Mode in Monitor
Mode
0 = The CPU will enter Debug Mode in OCD Mode
Changing this bit in Debug Mode does not take
effect until the CPU enters Debug Mode the next
time.
0
ORP - OCD Register Protect
0 = OCD registers can be written by any privileged
CPU mode
1= OCD registers can be written only in Debug
Mode
RID - Run In Debug
0: Peripherals are frozen in Debug Mode
1: Peripherals keep running in Debug Mode.
In addition the PDBG register must be configured
with individual masks for each module.
R/W
27
RID
0
R
26
Reserved
0
R/W
R/W
25
24
TOZ
IFM
0
TOZ - Trap Opcode Zero
0: The opcode 0x0000 is executed as a normal
CPU instruction
1: The opcode 0x0000 causes entry to Debug
Mode
0
IFM - Ignore First Match
When written to one, a PC breakpoint on the first
instruction after exiting Debug Mode with the retd
instruction will not trigger re-entry to Debug Mode.
Typically used when returning from a program
breakpoint. This bit stays one until written to zero.
104
32002F–03/2010
AVR32
Table 9-11.
R/W
Development Control Register
Bit Number
Field Name
Init. Val.
Description
R/W
23
IRP
0
IRP - Instruction Replace
If IRP is written to one before exiting OCD Mode
with the retd instruction, the first instruction after
exiting OCD Mode will be fetched from the Debug
Instruction Register. This bit is cleared
automatically after this fetch takes place. This bit
will not have any effect if written at the same time
as RES.
R/W
22
SQA
0
SQA - Software Quality Assurance
0: Regular program trace
1: SQA enhanced program trace
0
EOS - Event Out Select
00 = No operation
01 = Emit event out when the CPU enters Debug
Mode
10 = Emit event out for breakpoints/watchpoints
11 = Emit event out for message insertion into the
TXQ
0
DBE - Debug Enable
DBE enables Debug Mode and all debug features
in the CPU. DBE must be written to one to enable
breakpoints, debug requests, or single steps.
0
DBR - Debug Request
Writing DBR to one while DBE is asserted causes
the CPU to enter Debug Mode. If the CPU was in
sleep mode, it will first be woken up before
entering Debug Mode. The DBR bit is cleared
automatically when Debug Mode is entered.
0
SS - Single Step
If SS is written to one before exiting Debug Mode
with the retd instruction, exactly one instruction will
be executed before returning to Debug Mode. SS
stays one until written to zero by the debugger.
R/W
21:20
EOS
R
19:14
Reserved
R/W
R/W
R/W
13
DBE
12
DBR
11:9
Reserved
8
SS
105
32002F–03/2010
AVR32
Table 9-11.
R/W
R/W
R/W
R/W
9.2.14.8
Development Control Register
Bit Number
7:5
4:3
2:0
Field Name
OVC
EIC
TM
Init. Val.
Description
0
OVC[2:0] - Overrun Control
OVC controls the action taken if Branch, Data, or
Ownership trace messages are generated while
the Transmit Queue is full. Settings 111 though
100 are reserved.
000 = Generate overrun messages
001 = Delay CPU to avoid BTM and Ownership
Trace overruns
010 = Delay CPU to avoid DTM and Ownership
Trace overruns
011 = Delay CPU to avoid BTM, DTM, and
Ownership Trace overruns
111-100 = Reserved
0
EIC[1:0] - EVTI Control
The EIC bits control the action performed when
the EVTI pin on the Nexus debug port receives a
high-to-low transition. If trace is enabled, EVTI can
be configured to cause a trace synchronization
message. If Debug Mode is enabled, EVTI can be
configured to cause a breakpoint.
00 = EVTI for program and data trace
synchronization
01 = EVTI for breakpoint generation
10 = No operation
11 = Reserved
0
TM[2:0] - Trace Mode
The TM bits select which trace modes are
enabled.
000 = No Trace
XX1 = OTM Enabled
X1X = DTM Enabled
1XX = BTM Enabled
If Data or Branch tracing is triggered or stopped by
a watchpoint , the DTM and BTM bits are updated
accordingly.
Development Status (DS) register
This register is used to examine the debug state of the CPU and the cause for entering Debug
Mode. Note that multiple sources may trigger Debug Mode simultaneously, causing more than
one bit to be set. The register is read-only. All bits are dynamic and do not require clearing.
106
32002F–03/2010
AVR32
This register is undefined when the CPU is not in Debug Mode.
Table 9-12.
Development Status register
R/W
Bit Number
Field Name
Init. Val.
R
31:29
Reserved
0
R
R
R
R
28
27
26
25
NTBF
EXB
DBA
BOZ
0
NTBF -NanoTrace Buffer Full
This bit is set if Debug Mode is entered because
the Memory Service Unit has signalled that the
NanoTrace Buffer is full. This bit is cleared when
Debug Mode is exited.
0
EXB -External Breakpoint
This bit is set if Debug Mode was entered due to
an event on the EVTI pin. This bit is cleared when
Debug Mode is exited.
0
DBA - Debug Acknowledge
This bit is set if Debug Mode was entered due to
setting the Debug Request or ABORT bit in the
DC register. This bit is cleared when Debug Mode
is exited.
0
BOZ - Break on Opcode Zero
This bit is set if Debug Mode was entered due to
opcode 0x0000 being executed. This bit is cleared
when Debug Mode is exited.
INC - Instruction Complete
0: The CPU is executing one or more instructions,
or is not in OCD Mode.
1: The CPU is in OCD Mode and is not executing
any instructions.
R
24
INC
0
R
23:16
Reserved
0
R
15:8
BP[7:0]
0
R
7:6
Reserved
0
R
5
DBS
Description
0
BP - Breakpoint Status
The BP bits identify which hardware breakpoint
caused Debug Mode to be entered:
BP[0]: BP0A
BP[1]: BP0B
BP[2]: BP1A
BP[3]: BP1B
BP[4]: BP2A
BP[5]: BP2B
BP[6]: BP3A
BP[7]: BP3B
These bits are cleared when Debug Mode is
exited.
DBS - Debug Status
DBS is set when the CPU is in OCD Mode,
otherwise cleared. This bit stays cleared also
when the CPU operates in Monitor Mode.
107
32002F–03/2010
AVR32
Table 9-12.
R/W
Bit Number
Init. Val.
Description
STP - Stop Status
STP is set if OCD Mode is entered from sleep
mode. This bit can be used by the debugger to
determine the proper return sequence from OCD
Mode. This bit is cleared when OCD Mode is
exited.
4
STP
0
R
3
Reserved
0
R
R
2
HWB
1
SWB
0
SSS
0
HWB - Hardware Breakpoint Status
This bit is set if Debug Mode was entered due to a
hardware breakpoint. The BP[7:0] bits should be
examined to determine the breakpoint(s) which
triggered. This bit is cleared when Debug Mode is
exited.
0
SWB - Software Breakpoint Status
This bit is set if Debug Mode was entered due to a
breakpoint instruction being executed. Returning
from a software breakpoint may require special
handling by the debugger. This bit is cleared when
Debug Mode is exited.
0
SSS - Single Step Status
This bit is set when Debug Mode is entered due to
a single step. This bit is cleared when Debug
Mode is exited.
Debug Instruction Register (DINST)
The Debug Instruction Register contains the instruction to be executed in OCD Mode. The CPU
fetches and executes the instruction faster than they can be written by the Debug port. DINST is
also used to store the instruction to replace the breakpoint instruction.
Table 9-13.
9.2.14.10
Field Name
R
R
9.2.14.9
Development Status register
Debug Instruction register
R/W
Bit Number
Field Name
Init. Val.
Description
R/W
31:0
DINST
0
DINST - Debug Instruction
The instruction to be executed on the CPU.
Peripheral Debug Register (PDBG)
The Peripheral Debug Register controls the operation of modules in debug mode. If the DC.RID
bit is set, the CPU is in debug mode and the PDBG bit for a module is set this module is kept
running in debug mode. Otherwise the module is stopped. The mapping between the bits in this
register and modules are part specific and are described in the OCD module configuration section of the part datasheet.
Table 9-14.
Debug Instruction register
R/W
Bit Number
Field Name
Init. Val.
Description
R/W
31:0
PDBG
0
PDBG - Peripheral debug.
0 = The peripheral is running in debug mode.
1 = The peripheral is stopped in debug mode.
108
32002F–03/2010
AVR32
9.2.14.11
Debug Program Counter (DPC)
This register contains the PC value of the last executed instruction in any non-debug mode. This
allows a debugger to sample program execution addresses for statistical purposes without interrupting the CPU.
If this register is read in Debug Mode, it will reflect the last executed instruction before Debug
Mode was entered. Note that several types of breakpoints trigger before an instruction is executed, so this value is not necessarily identical to RAR_DBG.
When replacing the return instruction from Debug Mode, the CPU will see the DPC value as the
PC value for the executed instruction. The user only needs to write this register when replacing
the return instruction from OCD Mode.
Table 9-15.
9.3
Debug Port
9.3.1
Overview
Debug Program Counter
R/W
Bit Number
Field Name
Init. Val.
Description
R/W
31:0
DPC
0
DPC - Debug Program Counter
PC of the last executed instruction
The OCD debug port consists of the JTAG port and the AUX port. The low bandwidth JTAG port
handles all register access, while the high bandwidth AUX port transfers all Nexus messages
from the OCD system.
The Nexus standard defines the maximum clock frequency for JTAG to be 33 MHz, and for AUX
200 MHz.
9.3.2
JTAG
Access to OCD register is done through an IEEE1149.1 JTAG-port. The JTAG TAP controller is
shared with the rest of the system. In order to enable access to OCD register the emulator must
perform the following sequence.
1. Put the TAP controller in the state "test logic reset".
2. Insert the OCD Instruction to prepare the Debug Port to receive OCD register access.
The OCD instruction is inserted using the IR scan path.
3. Use the DR scan path to insert the OCD register address and operation (Read / Write).
4. Use the DR scan path to read / write the data to / from the register.
5. Repeat 3 through 4 for every register operation. The TAP controller will remain in OCD
mode until a test logic reset is detected.
109
32002F–03/2010
AVR32
To be able to use JTAG-based debug tools for AVR32 without adapters, it is recommended that
a circuit design using an AVR32 device should use a standard 10-pin 50-mil IDC connector with
the pinout shown in Table 9-16. The signals are described in Table 9-17.
Table 9-16.
AVR32 standard JTAG connector pinout. All directions relative to processor
Signal
Dir
Pin
Pin
TCK
In
1
2
TDO
Out
3
4
Out
VREF
TMS
In
5
6
In
RESET_N
7
8
N/C
9
10
N/C
N/C
TDI
Table 9-17.
In
Dir
Signal
GND
JTAG signals
Pin
Direction
Description
TRST_N
Input
Asynchronous reset for the TAP controller and JTAG registers
TCK
Input
Test Clock. Data is driven on falling edge, sampled on rising edge.
TMS
Input
Test Mode Select
TDI
Input
Test Data In
TDO
Output
Test Data Out
RESET_N
Input
Device reset
VREF
Output
Reference voltage from target. Signals should be driven relative to this
voltage level.
110
32002F–03/2010
AVR32
Figure 9-4.
JTAG TAP controller state diagram.
1
Test-LogicReset
0
0
Run-Test/
Idle
1
Select-DR
Scan
1
Select-IR
Scan
1
0
1
0
Capture-DR
1
0
Shift-DR
0
0
Shift-IR
1
1
Exit1-DR
Exit1-IR
0
0
Pause-DR
1
Exit2-DR
0
0
Pause-IR
1
1
0
1
1
Update-DR
0
9.3.3
Capture-IR
0
0
1
Exit2-IR
1
1
Update-IR
0
AUX port
The Auxiliary (AUX) port and messaging protocol follow the definitions of the Nexus standard.
This standard allows varying the number of signalling pins. The following configuration is
selected for AVR32UC.
• 6 data output pins (MDO)
• 2 message start/end output pins (MSEO)
• 1 EVTO pin
• 1 EVTI pin
The configuration is based on the presumed needs for bandwidth in a system being traced at
100+ MIPS, balanced against the desire to keep debug pincount low. This configuration can be
changed in future implementations to allow for greater or smaller bandwidth over the AUX port.
The AUX pins may be multiplexed with GPIO in a device. By default, the MCKO, MDO, and
MSEO pins are tristated or used as GPIO, and the Nexus functionality must be explicitly enabled
by the debugger. EVTO, EVTI, and the JTAG pins are always available to the debugger.
If the AUX pins are needed for Nexus functionality in an application, it is recommended not to
use these pins for GPIO purposes, as this can affect the signal integrity required for Nexus
operation.
111
32002F–03/2010
AVR32
The complete signal list of the AUX port is shown in Table 9-18.
Table 9-18.
Auxiliary pins
Auxiliary
pins
Width
Direct
ion
MCKO
1
O
Message Clockout (MCKO) is a free-running output clock to
development tools for timing of MDO and MSEO pin functions.
O
Message Data Out (MDO[5:0]) are output pins used for all messages
generated by the device. In single datarate mode, external latching of
MDO shall occur on rising edge of MCKO. In double datarate mode,
external latching of MDO shall occur on both edges of MCKO.
O
Message Start/End Out (MSEO[1:0]) pins indicate when a message
on the MDO pins has started, when a variable length packet has
ended, and when the message has ended. In single datarate mode,
external latching of MSEO shall occur on rising edge of MCKO. In
double datarate mode, external latching of MSEO shall occur on both
edges of MCKO.
O
Event Out (EVTO) is an output pin which can be configured to toggle
every time a message is inserted into the Transmit Queue, when the
CPU entered OCD Mode, or when a breakpoint or watchpoint hit
occured, as configured by the EOS bits in the Development Control
register .
MDO
MSEO
EVTO
6
2
1
Description
EVTI
1
I
Event In (EVTI) is an input which, when a high-to-low transition occurs,
a processor is halted (breakpoint) or program and data
synchronization messages are transmitted from the OCD controller, as
configured by the EIC bits in the Development Control register.
RESET_
N
1
I
System reset
112
32002F–03/2010
AVR32
To be able to use AUX-based debug tools for AVR32, a circuit design using an AVR32 device
should use a Mictor38 connector (AMP P/N 767054-1) as defined in the Nexus standard, with
the pinout shown in Table 9-19.
Table 9-19.
AVR32 standard Nexus connector pinout. All directions relative to processor
Signal
Dir
Pin
Pin
Dir
Signal
MSEO0
Out
38
37
N/C
MSEO1
Out
36
35
N/C
MCKO
Out
34
33
N/C
EVTO_N
Out
32
31
N/C
MDO0
Out
30
29
N/C
MDO1
Out
28
27
N/C
MDO2
Out
26
25
N/C
MDO3
Out
24
23
N/C
MDO4
Out
22
21
In
TRST_N
MDO5
Out
20
19
In
TDI
N/C
18
17
In
TMS
N/C
16
15
In
TCK
N/C
14
13
VREF
Out
12
11
Out
TDO
EVTI_N
In
10
9
In
RESET_N
N/C
8
7
N/C
N/C
6
5
N/C
N/C
4
3
N/C
N/C
2
1
N/C
N/C
9.3.3.1
Reset configuration
The Nexus standard specifies that the AUX port can be enabled by keeping EVTI low while pulsing TRST (or exiting Test-Logic-Reset). The OCD system in AVR32 has removed this feature. In
order to enable the AUX port, the debugger has to write the AXC:AXE (Auxiliary Enable) bit.
9.3.3.2
Message protocol
The OCD System implements the Auxiliary Port Message Protocol defined in the Nexus standard. The following section is merely a summary of this protocol. For details, please see the
Nexus standard.
Messages are composed of a Start-of-Message (SOM) token, followed by one or more packets
of information, each of fixed or variable length, and ended by an End-of-Message (EOM) token.
SOM/EOM and End-of-Variable-Length-Packets (EVLP) are signalled by MSEO for transmitted
messages. Packet information is carried by the MDO pins. The number of MDO pins available is
known as the port boundary. The information carried by the MDO and MSEO pins each cycle is
known as a frame.
113
32002F–03/2010
AVR32
9.3.3.3
Message rules
MDO is valid whenever MSEO does not indicate "idle".
Fixed length packets are implicitly recognized from the message format, and are not required to
end on a port boundary. Thus, packets may also start within a port boundary if following a fixed
length packet. The end of variable length packets is identified through the MSEO pins, and to
identify the end of the packet uniquely, these packets must end on a port boundary. If necessary, the packet must be stuffed with zeroes to align the end to a port boundary. Variable length
packets may be truncated by omitting leading zeroes so that the packet ends on the first possible port boundary.
• The MSEO pins behave the following way ("x" means "don’t care"):
• 0b11 followed by 0b00 indicates SOM
• 0b0x followed by 0b11 indicates EOM
• 0b00 followed by 0b01 indicates EVLP
• MSEO is 0b00 at all other clocks during transmission of a message
• MSEO is 0b11 at all clocks when idle.
9.3.3.4
Clock and frame rate
In single datarate mode (default), MDO and MSEO should be sampled by an external tool on the
rising edge of MCKO. In double datarate mode, the MCKO clock runs at half frequency, so MDO
and MSEO should be sampled on both edges of MCKO. This is configured by the Double Datarate bit in the AUX Port Control Register.
It is also possible to reduce the frequency of the AUX port compared to the CPU clock by writing
the AXC:LS and AXC:DIV bits. If LS=1, the DIV value selects the frame rate of the AUX port:
fAUX = fCPU/(DIV+1)
If LS=1 and DIV=0, fAUX = fCPU/2.
This can be combined with the single or dual datarate mode, as described above. In either case,
the sampling edge will be as close to the middle of the MDO data frame as possible. The duty
cycle of the MCKO clock will stay within the 40-60 duty cycle requirement of the Nexus standard
for all settings apart from DIV=2.
9.3.3.5
Example
Figure 9-5 shows an example of transmission of a Program Trace Indirect Branch message. The
TCODE is fixed at 6 bits (=4 for PTIB), followed by a fixed-length packet (EVT-ID = 2), and a
variable-length packet (I-CNT = 63). I-CNT is stuffed with zeroes to fit the port boundary. Finally,
the variable packet U-ADDR (=5) is transmitted. Since this leading zeroes of this packet can be
truncated, it fits within a single frame.
114
32002F–03/2010
AVR32
Figure 9-5.
Example of a Nexus message transmission with single and double datarate.
IDLE
SOM
NORMAL
EVLP
EOM
MCKO (DDR=1)
MCKO (DDR=0)
M S E O [1 . . 0 ]
M D O [ 5. . 0 ]
11
00
000100
111110
01
11
000011
000101
I-CNT = 63
TCODE = 4
9.3.3.6
EVT-ID = 2
Zero stuffing
U-ADDR = 5
Transmit queue and overruns
Messages from various sources are inserted in a Transmit Queue (TXQ), which stores a number
of frames. This queue acts as a FIFO which allows messages to be inserted more rapidly than
they can be retrieved by the emulator.
The queue holds 16 frames. If more messages are inserted than there is room for in the queue,
information will be lost, and an overrun situation occurs. The TXQ will block any more messages
from being inserted, and allow the queue to be emptied by the emulator before allowing any
more messages to be inserted. The first message to be inserted after the overrun is cleared, is
an Error message, which informs the emulator that an overrun has occurred and which types of
trace messages have been lost. After this, transmission continues as normal.
Alternatively, the user can configure the OCD to halt the CPU to prevent overruns. This can be
done selectively for different message types, and is controlled by writing to the Overrun Control
(OVC) bits in the DC register.
If any of the OVC bits are set, watchpoint trace messages will usually not generate TXQ overflow. However, triggering an program and data watchpoint on the same instruction may in some
rare cases cause an overrun independently of the OVC settings, since a large amount of trace
message data will be produced for this instruction.
9.3.3.7
Trace and reset
All pending trace messages in the Transmit Queue are flushed if: the OCD is reset by a system
reset; the OCD is disabled; or an application reset is triggered by writing to the DC:RES bit.
Thus, if the CPU is reset, but not the OCD, the program flow can be observed by program trace.
However, if the debugger resets the system, the remaining messages in the queue are of no
value, and expected to be flushed.
Note that if the OCD is disabled (by clearing DC:DBE or by a system reset), trace is suspended
until DC:DBE is written to one. The DC:TM bits must be written simultaneously, and define which
trace features should now be active.
Similarly, when an application reset is triggered by writing DC:RES, the DC:TM bits are written
simultaneously and define which trace features should now be active.
115
32002F–03/2010
AVR32
9.3.4
Messages
9.3.4.1
Error
The error message indicates various errors that can occur during trace or debugging. Table 9-21
lists the various errors that can be reported, along with the associated ECODE.
If trace messages are lost because of insufficient space in the Transmit Queue, an error message is transmitted, followed by a synchronization message, as soon as space is available in the
Transmit Queue.
Table 9-20.
Indirect Branch Message with Sync
Direction: From target
Packet
Size (bits)
Packet
Name
Packet
Type
Description
5
ECODE
Fixed
Error code. Refer to Table 9-21.
6
TCODE
Fixed
Value = 8
Table 9-21.
9.3.5
Error
Error codes
ECODE
Description
0b00000
Ownership trace overrun
0b00001
Program trace overrun
0b00010
Data trace overrun
0b00011 0b00101
Reserved
0b00110
Watchpoint overrun.
0b00111
Program and/or data and/or ownership trace overrun.
0b01000
Program trace and/or data and/or ownership trace and/or watchpoint overrun.
0b01001 0b11111
Reserved
Registers
9.3.5.1
Auxiliary Port Control Register (AXC)
Table 9-22 shows the description of the Auxiliary Port Control Register. This register allows
greater flexibility in controlling the operation of the AUX port than specified by the Nexus stan-
116
32002F–03/2010
AVR32
dard. This includes enabling the AUX port, and controlling the speed of the clock and data
compared to the CPU clock.
Table 9-22.
R/W
Bit Number
Field Name
Init. Val.
Description
R
31:16
Reserved
0
Reserved
These bits are reserved, and will always read as 0
R/W
15:14
AXS
0
AXS - Auxiliary Port Select
0: AUX port is mapped to pin configuration 0.
1: AUX port is mapped to pin configuration 1.
2: AUX port is mapped to pin configuration 2.
3: AUX port is mapped to pin configuration 3.
R
13
Reserved
0
Reserved
This bit is reserved, and will always read as 0
R
12
Reserved
0
Reserved
This bit is reserved, and will always read as 0.
0
LS - Low Speed
0:AUX port runs at the same speed as the CPU
1:AUX port runs at reduced speed compared to
the CPU.
0
DDR - Double Data Rate
Setting this bit halves the MCKO rate so that MDO
data must be sampled on both edges of MCKO.
1 = Double data rate mode
0 = Single datarate mode
0
AXO - Auxiliary Port Override
0: AUX port is mapped to the pins dictated by
AXS.
1: AUX port is overridden and mapped to pin
configuration 1.
R/W
R/W
R/W
9.4
Breakpoints
9.4.1
Overview
AUX Port Control Register
11
10
9
LS
DDR
AXO
R/W
8
AXE
0
AXE - Auxiliary Port Enable
0: AUX port is used for GPIO
1: AUX port is used for Nexus operation.
This bit does not need to be written in devices with
dedicated AUX pins
R
7:4
Reserved
0
Reserved
These bits are reserved, and will always read as 0
R/W
3:0
DIV
0
DIV - Division factor
If LS=1, the DIV value selects the frame rate of the
AUX port.
The Nexus Recommended Register map supports up to 8 universal breakpoints. However since
the AVR32UC hardware employs separate instruction and data memories, the OCD system
must also separate program and data breakpoints. Any breakpoint can also be programmed as
117
32002F–03/2010
AVR32
a watchpoint. The watchpoint will trigger a Watchpoint Hit message. The OCD system supports
up to six program breakpoints modules and two data breakpoint modules. In addition to this, the
data trace modules can also be used as data address watchpoints. The trace watchpoints result
in a vendor defined Trace Watchpoint Hit message.
Figure 9-6.
Breakpoint modules.
CPU
PC
Program
BP/WP
Data
A ddress
Data V alue
Data
BP/WP
Data
A ddress
Trace
BP/WP
118
32002F–03/2010
AVR32
Figure 9-7.
Breakpoint unit overview.
PC
Breakpoint
Unit
PC
Breakpoint
Module 0A
PC
Breakpoint
Module 0B
PC
Breakpoint
Module 1A
PC
Breakpoint
Module 1B
PC
Breakpoint
Module 2A
PC
Breakpoint
Module 2B
Data
Breakpoint
Unit
Data
Breakpoint
Module 3A
Data
Breakpoint
Module 3B
6 PC
Breakpoints
5 PC Watchpoints
2 Data Watchpoints
Trigger
Unit
Start/
Stop
Program
Trace
Unit
6 PC
Watchpoints
Start/
Stop
Data
Trace
Unit
2 Data
Watchpoints
2 Range Data
Watchpoints
2 Data
Breakpoints
Watchpoint
Message
Generator
Messages to
Transmit Queue
9.4.2
Breakpoint Unit description
The Breakpoint unit consists of the units shown in Figure 9-7. The PC Breakpoint Unit (PBU)
handles the program counter breakpoints. The PBU can have up to 6 PC breakpoint modules
that can match on a single PC. Two modules can be combined to give a match on a range of PC
values, thus up to three ranges can be defined. The PBU is configured with registers Breakpoint
/ Watchpoint Control (BWC) and Breakpoint / Watchpoint Address (BWA) 0A, 0B, 1A, 1B, 2A,
and 2B.
The Data Breakpoint Unit handles data breakpoints. The data breakpoints can be configured
with the BWC / BWA / BWD 3A and 3B registers.
119
32002F–03/2010
AVR32
The Watchpoint Message Generator (WMG) generates watchpoint messages for all breakpoint
modules and data trace watchpoints.
Optionally, a breakpoint or watchpoint can be signalled by a pulse on the EVTO pin. This
requires DC:EOS bits to be set to 1 and EOC in the corresponding Breakpoint/Watchpoint Control Register must be written to one.
9.4.2.1
Program Breakpoints
In order to enable a simple program breakpoint the Breakpoint / Watchpoint Address (BWA) and
Breakpoint / Watchpoint Control (BWC) registers for that breakpoint must be updated.
The BWA register must be written with the address of the instruction where the debugger wants
to halt.
The BWC must have the Breakpoint / Watchpoint Enable (BWE) field set to breakpoint.
Program breakpoints break on the instruction pointed to by BWA. The instruction will cause a
debug exception and the Debug Mode Link Register (RAR_DBG) and Debug Mode Return Status Register (RSR_DBG) will point to the instruction that caused the debug exception. The
Development Status register will also be updated to indicate which breakpoint caused the
exception. In OCD Mode the debug tool can then feed the CPU with debug code to ascertain the
state of the processor. In OCD Mode the breakpoint modules are disabled.
Upon return from Debug Mode, the PC and SR will be restored from the RAR_DBG and
RSR_DBG and the instruction that caused the debug exception will be fetched again. If the program breakpoint has not been disabled in Debug Mode, the Ignore First Match (IFM) bit in the
Development Control (DC) register must be written to one to avoid triggering another breakpoint
on the first instruction after exiting Debug Mode. The IFM bit prevents any Program Breakpoint
operation on the first instruction after exiting Debug Mode.
9.4.2.2
Watchpoints
When enabled in the BWC, a watchpoint message is sent when the instruction address matches
the address stored in BWA. If both a Trace watchpoint and a Watchpoint triggers at the same
time, the Trace watchpoint will be ignored and only a Watchpoint Hit message will be generated.
Note that Program, Data, and Trace watchpoints are generated at different pipeline stages and
will not be synchronized when the messages are generated. A Program Watchpoint on a load
store instruction will hit before a data watchpoint on the same instruction.
9.4.2.3
Data Breakpoints
Data Breakpoint modules listen on the data address and data value lines between the CPU and
the data cache and can halt the CPU, or send a watchpoint message, if the address and / or
value meets a stored compare value. Unlike program breakpoints, data breakpoints halt on the
next instruction after the load / store instruction that caused the breakpoint has completed.
The BWA register must be written with the address of the data the debugger wants to halt on.
9.4.3
Data Breakpoint interface
9.4.3.1
Data alignment
The AVR32 can read or write data in bytes, halfwords, or words. The same data location can be
accessed through either operation, e.g. a byte location can be accessed as part of a double
word. The data bus operations seen by the OCD system are always aligned, i.e. halfwords start
on halfword boundaries, word accesses start on word boundaries, as illustrated in Figure 9-8. If
120
32002F–03/2010
AVR32
the data bus operation is a double word load / store, the breakpoint module will see the word
data value which corresponds to the address in BWA.
One data breakpoint module can only compare 32 bits of data. The data to be matched can
therefore not cross a word boundary if the data breakpoint is to match correctly. When the
debugger wants to match on a byte or halfword, the BWD register must be written with the LSB
aligned, and the BWC:BME bits must be set to mask the upper bits of the BWD register.
For example, if the debugger wants to match against Byte 1 in Figure 9-8, the BWA must be set
to the byte address of Byte 1 and the BWD written with the value to match on aligned to LSB.
Also the BWC:BME must be set to mask the 24 most significant bits of the BWD register (BME =
0xE).
By default, the data breakpoint module will match on the data value regardless of the size of the
access. The data BWC can also be set to match on a specific access size if the SIZE bits are
set. The debugger can for example, set the breakpoint module to match only on byte writes to
byte 1 in Figure 9-8. The BWD register must still be aligned correctly, and the byte mask must be
set, but the data breakpoint will only trigger if a single byte is written to byte 1 and not if, for
example, a whole word is written to byte 0, 1, 2, and 3.
Figure 9-8.
Memory access data alignment.
Double w ord byte 4 to 7
0x800C
Word
0x8008
Half w ord 0
Byte 0
3
0x8010
Half w ord 1
0x8004
Byte 1
Byte 2
Byte 3
2
1
0
0x8000
Word Address
Double w ord byte 0 to 3
Byte A ddress
9.4.4
Triggering trace
A watchpoint from the program or data breakpoint modules can be used to start or stop program
or data trace. This is done using a trigger unit. The trigger unit can be configured using the
watchpoint trigger register. When the trigger unit is set to start trace upon a watchpoint, DC:TM
will be set accordingly, and trace will then be enabled. If a data watchpoint enables data trace,
the data event is not included in the data trace output, while an event which disables data trace
is included in the data trace output.
121
32002F–03/2010
AVR32
9.4.5
Messages
9.4.5.1
Watchpoint Hit (WH)
Table 9-23.
Watchpoint Hit
Watchpoint Message
Packet
Size
9.4.5.2
Packet
Name
Packet
Type
Description
8
WPHIT
Fixed
XXXXXXX1 = Watchpoint 0 matched
XXXXXX1X = Watchpoint 1 matched
...
X1XXXXXX = Watchpoint 6 matched
1XXXXXXX = Watchpoint 7 matched
6
TCODE
Fixed
Value = 15
Trace Watchpoint Hit (TWH)
Table 9-24.
9.4.6
Direction: From target
Trace Watchpoint Hit
Trace Watchpoint Message
Direction: From target
Packet
Size
Packet
Name
Packet
Type
Description
2
WPHIT
Fixed
X1 = Watchpoint 0 matched
1X = Watchpoint 1 matched
6
TCODE
Fixed
Value = 56
Registers
9.4.6.1
PC Breakpoint/Watchpoint Address registers (BWA0A, BWA0, BWA1A, BWA1B, BWA2A, BWA2B)
The 6 BWA registers contains one instruction address each. The address can be used for a single breakpoint match or used as bitwise mask to create a range.
Table 9-25.
PC BWAnx Register
R/W
Bit Number
Field Name
Init. Val.
Description
R/W
31:0
BWA
0
Breakpoint/Watchpoint Address
122
32002F–03/2010
AVR32
9.4.6.2
PC Breakpoint/Watchpoint Control registers - (BWC0A, BWC0B, BWC1A, BWC1B, BWC2A, BWC2B)
Table 9-26.
R/W
9.4.6.3
Bit Number
Field
Name
Init. Val.
Description
RW
31:30
BWE
00
BWE - Breakpoint / Watchpoint Enable
00 = Disabled
01 = Breakpoint enabled
10 = Reserved
11 = Watchpoint enabled
R
29:26
Reserved
0
Reserved
RW
25
AME
0
AME - Address Mask Enable
This bit is only present in BWCxA registers.
0 = Disabled.
1 = Enabled. BWAxB will be used to bitwise mask
the PC compare according to this function:
BP A: (PC & BWA_B) == (BWA_A & BWA_B)
BP B: Will never trigger
R
24:15
Reserved
0
Reserved
RW
14
EOC
0
EOC - EVTO Control
0 = Breakpoint/watchpoint status indication is not
output on EVTO
1 = Breakpoint/watchpoint status indication is
output on EVTO
R
13:0
Reserved
0
Reserved
Data Breakpoint / Watchpoint Address (BWA3A, BWA3B)
Table 9-27.
9.4.6.4
PC BWCnx Register
Data Breakpoint/Watchpoint address (BWA3x) register
R/W
Bit Number
Field Name
Init. Val.
Description
RW
31:0
BWA
0x00000000
Address of data for breakpoint or watchpoint
generation.
Data Breakpoint / Watchpoint Data (BWD3A, BWD3B)
Table 9-28.
Data Breakpoint/Watchpoint data (BWD3x) register
R/W
Bit Number
Field Name
Init. Val.
Description
RW
31:0
BWD
0x00000000
Data value for breakpoint or watchpoint
generation.
123
32002F–03/2010
AVR32
9.4.6.5
Data Breakpoint / Watchpoint Control (BWC3A, BWC3B)
Table 9-29.
R/W
RW
Data Breakpoint / Watchpoint Control (BWC3x)
Bit Number
31:30
Field Name
BWE
Init. Val.
Description
00
BWE - Breakpoint / Watchpoint Enable
00 = Disabled
01 = Breakpoint enabled
10 = Reserved
11 = Watchpoint enabled
RW
29:28
BRW
00
BRW - Breakpoint/Watchpoint Read/Write
Select
00 = Break on read access
01 = Break on write access
10 = Break on any access
11 = Reserved
R
27:24
Reserved
00
Reserved
RW
23:20
BME
0x0
BME - Breakpoint/Watchpoint Data Mask
1XXX = Mask bits 31:24 in BWD
X1XX = Mask bits 23:16 in BWD
XX1X = Mask bits 15:8 in BWD
XXX1 = Mask bits 7:0 in BWD
R
19:18
Reserved
00
Reserved
RW
17:16
BWO
000
BWO - Breakpoint/Watchpoint Operand
1X = Compare with BWA value
X1 = Compare with BWD value
R
15
Reserved
0
Reserved
RW
14
EOC
0
EOC - EVTO Control
0 = Breakpoint/watchpoint status indication
not output on EVTO
1 = Breakpoint/watchpoint status indication is
output on EVTO
R
13:12
Reserved
0
Reserved
R/W
11:9
SIZE
000
SIZE - Size bits to match
0xx = Disregard access size (Default)
100 = Byte access
101 = Halfword access
110 = Word access
111 = Reserved
R/W
8:0
Reserved
0
Reserved
124
32002F–03/2010
AVR32
9.4.6.6
Watchpoint Trigger
Table 9-30.
R/W
9.5
9.5.1
WT, Watchpoint Trigger Register
Bit Number
Field Name
Init. Val.
Description
R/W
31:29
PTS
000
PTS - Program Trace Start
000 = Trigger disabled
001 = Program watchpoint 0b
010 = Program watchpoint 1a
011 = Program watchpoint 1b
100 = Program watchpoint 2a
101 = Program watchpoint 2b
110 = Data watchpoint 3a
111 = Data watchpoint 3b
R/W
28:26
PTE
000
PTE - Program Trace End
000 = Trigger disabled
001 <-> 111 Watchpoint selected as for PTS
R/W
25:23
DTS
000
DTS - Data Trace Start
000 = Trigger disabled
001 <-> 111 Watchpoint selected as for PTS
R/W
22:20
DTE
000
DTE - Data Trace End
000 = Trigger disabled
001 <-> 111 Watchpoint selected as for PTS
R
19:0
Reserved
-
Reserved
Program trace
Program trace overview
The AVR32 OCD system provides program trace support via the debug port. The program trace
feature implements a Program Flow Change Model in which the program trace is synchronized
at each program flow discontinuity. This occurs at taken indirect branches and exceptions. A
record of taken / not taken direct branches is included so that the complete program flow can be
decoded.
The development tool can then interpolate what transpires between each program trace message by correlating information from branch target messaging and static source or object code
files. Self-modifying code cannot be traced with the Program Flow Change Model because the
source code is not static.
The TM[2] bit in the Development Control register must be set to enable program trace.
9.5.1.1
Branch message summary
Five types of branch messages can be generated:
1. Program Trace, Indirect Branch is transmitted on most subroutine calls, returns, interrupts, exceptions, and any situation where the target address of a branch cannot be
determined from the source code. This message contains the instruction count to identify the branch and the target PC to identify the branch target.
2. Program Trace Synchronization is transmitted to indicate the current PC after starting
trace or after trace synchronization is lost.
125
32002F–03/2010
AVR32
3. Program Trace, Indirect Branch messages with sync contain both instruction count and
PC, and are transmitted instead of a Program Trace Synchronization message if a synchronization condition occurs and the current instruction is a taken direct/indirect
branch.
4. Program Trace, Resource full messages is transmitted when an internal buffer overflows. ICNT is transmitted whenever it overflows with this message.
5. Program Trace Correlation. This message is transmitted to synchonize the program
trace with an event. Sent when trace is disabled, debug mode is entered or sleep mode
is entered.
The Nexus standard also specifies Program Trace Correction messages to correct for speculatively transmitted trace messages, but these are not implemented in the AVR32, since program
trace messages are only transmitted for actually executed instructions. Similarly, the Nexusspecified CANCEL packet of synchronized branch messages is not implemented in AVR32.
Entry into Debug Mode will generate an program trace correlation message, while no trace messages are generated while executing in Debug Mode. A Program Trace Synchronization
message is transmitted when Debug Mode is exited.
9.5.2
Branch message packets
The program trace messages contain packets which identify the address of the taken branch,
the target of the branch, and the current program counter value. These packets are discussed
below.
9.5.2.1
Instruction count packet
In several of the program trace messages, an Instruction Count (I-CNT) packet is included, to
identify the number of sequentially executed instruction units since the last program trace message. In AVR32, this figure refers to bytes, i.e. compact instructions count two bytes and
extended instructions are four bytes.
The following rules apply to instruction counts:
• A taken indirect branch which generates a trace message is not included in the instruction
count.
• An indirect branch which is not taken is included in the instruction count.
• Speculatively fetched instructions are not counted until they are actually executed.
• The instruction counter is reset every time a program trace message is generated.
9.5.2.2
Compressed program counter packets
To save bandwidth, the Nexus messages employ compressed versions of the program counter
address. These include:
U-ADDR = StripLeadingZeros (Previous sent addr xor uncompressed address from pipeline).
F-ADDR = Full target address for a taken branch. Leading zeroes may be truncated.
9.5.3
Special cases
9.5.3.1
Debug Mode
When entering Debug Mode, a PTC message is generated with EVCODE = 0.
When exiting Debug Mode, a PTSY message is generated. If the instruction also generates a
branch message, the branch message with sync (i.e. PTDBS or PTIBS) is generated instead of
126
32002F–03/2010
AVR32
PTSY. In this case, the address of the instruction which generated the branch message can not
be explicitly reconstructed from the trace log, but the debugger will normally know which address
was returned to when Debug Mode was exited.
If a breakpoint occurs on the first instruction after exiting Debug Mode, a PTC message with
EVCODE = 0 is generated.
9.5.4
Messages
9.5.4.1
Program Trace, Direct Branch
This message is output by the target processor whenever there is a change of program flow
caused by a conditional or unconditional branch. The instruction count (I-CNT) is included to
identify the branch address. The following AVR32 instructions can cause a direct branch:
Table 9-31.
Direct branch instructions
Mnemonic
Description
br{cond3}
Compact
br{cond4}
Extended
rjmp
Compact
Branch if condition satisfied.
Table 9-32.
Branch if condition satisfied.
Direct Branch message without sync
Direct Branch Message
9.5.4.2
Packet Size
(bits)
Packet
Name
Packet
Type
Description
8
I-CNT
Variable
Number of bytes executed since the last taken branch.
6
TCODE
Fixed
Value = 3
Program Trace, Direct Branch with Target Address
This message is transmitted instead of the Direct Branch message when SQA enhanced program trace is enabled by writing DC:SQA to one. This simplifies real-time PC reconstruction in
the emulator for real-time code coverage and performance analysis purposes.
Table 9-33.
9.5.4.3
Direction: From target
Direct Branch message with Target Address
Direct Branch Message with Sync
Direction: From target
Packet
Size (bits)
Packet
Name
Packet
Type
Description
32
U-ADDR
Variable
The unique portion of the branch target address for a taken
indirect branch or exception. Most significant bits that have a
value of 0 are truncated.
8
I-CNT
Variable
Number of bytes executed since the last taken branch.
6
TCODE
Fixed
Value = 57
Program Trace, Indirect Branch
An indirect branch is output by the target processor whenever there is a change of program flow
caused by a subroutine call, return instruction, interrupt, or exception.
127
32002F–03/2010
AVR32
Messages for taken indirect branches and exceptions include how many sequential bytes were
executed since the last taken branch or exception, and the unique portion of the branch target
address or exception vector address. The unique portion of the branch is found by doing an
exclusively or on the branch target and the last sent UADDR / FADDR. Additionally, the cause
of the indirect branch is identified through an Event ID packet. Operations causing indirect
branches and their corresponding EVT-ID are shown below.
Table 9-34.
Operations causing indirect branch messages
Description
Operation
EVT-ID
Exception entry
Exception, interrupts (0 to 3), NMI, entry to Debug Mode
3
Subroutine call
acall, icall, mcall, jcall, scall, rcall instruction
2
Branch via register
contents
Any mov (except mov pc, lr) or load (except popm/ldm) with
PC as destination.
Any arithmetic instruction with PC as destination.
1
Return
ret{cond4}, rete, rets, retj, (mov pc, lr), popm/ldm loading
PC
0
Note that subrotine returns are often accomplished by a mov pc, lr, popm or ldm instruction with
PC included in the argument list. This generates an EVT-ID of 0 instead of 1..
Table 9-35.
Indirect branch message without sync
Indirect Branch Message
9.5.4.4
Direction: From target
Packet
Size (bits)
Packet
Name
Packet
Type
32
U-ADDR
Variable
The unique portion of the branch target address for a taken
indirect branch or exception. Most significant bits that have a
value of 0 are truncated.
8
I-CNT
Variable
Number of bytes executed since the last taken branch.
Description
2
EVT-ID
Fixed
Cause of indirect branch:
3: Exception entry
2: Call
1: Branch via register contents
0: Return
6
TCODE
Fixed
Value = 4
Program Trace Synchronization
This message is output by the PTU when any of the following conditions occurs:
1. Upon exit from reset. This is required to allow the number of instruction units executed
packet in a subsequent Program Trace Message to be correctly interpreted by the tool.
2. When program trace is enabled during normal execution of the embedded processor.
3. Upon exit from a power-down state. This is required to allow the number of instruction
units executed packet in a subsequent Program Trace Message to be correctly interpreted by the tool.
4. Upon exiting from Debug Mode.
5. An overrun condition had previously occurred in which one or more branch trace occurrences were discarded by the target processor’s debug logic.To inform the tool that an
overrun condition occurred, the target outputs an Error Message (TCODE = 8) with an
128
32002F–03/2010
AVR32
ECODE value of 00001 or 00111 immediately prior to the Program Trace Synchronization Message.
6. A debug control register field specifies that EVTI pin action is to generate program trace
synchronization, and the Event-In (EVTI) pin has been asserted.
7. Upon overflow of the sequential instruction unit counter.
8. After 256 branch messages without sync.
Table 9-36.
9.5.4.5
Program Trace Synchronization Message
Program Trace Sync Message
Direction: From target
Packet
Size (bits)
Packet
Name
Packet
Type
Description
32
PC
Variable
The full current instruction address. Most significant bits that
have a value of 0 are truncated.
8
I-CNT
Variable
Number of bytes executed since the last taken branch.
6
TCODE
Fixed
Value = 9
Program Trace, Direct Branch with Sync
If a Program Trace Synchronization message occurs on an instruction which transmits a direct
branch message, the Direct Branch with Sync message is transmitted instead of the Program
Trace Synchronization message. The Direct Branch with Sync message contains the instruction
count referring to the taken branch, as well as the complete PC value of the branch target.
The format for direct branch messages with sync is shown below. The AVR32 OCD system
never issues speculative branch messages and there is therefore no CANCEL packet.
Table 9-37.
9.5.4.6
Direct Branch message with Sync
Direct Branch Message with Sync
Direction: From target
Packet
Size (bits)
Packet
Name
Packet
Type
Description
32
F-ADDR
Variable
The full target address for a taken direct branch. Most
significant bits that have a value of 0 are truncated.
8
I-CNT
Variable
Number of bytes executed since the last taken branch.
6
TCODE
Fixed
Value = 11
Program Trace, Indirect Branch with Sync
If a Program Trace Synchronization message occurs on an instruction which transmits an indirect branch message, the Indirect Branch with Sync message is transmitted instead of the
Program Trace Synchronization message. The Indirect Branch with Sync message contains the
instruction count referring to the taken branch, as well as the complete PC value of the branch
target.
129
32002F–03/2010
AVR32
The format for indirect branch messages with sync is shown below. The AVR32 OCD system
never issues speculative branch messages and there is therefore no CANCEL packet.
Table 9-38.
9.5.4.7
Indirect Branch message with Sync
Indirect Branch Message with Sync
Direction: From target
Packet
Size (bits)
Packet
Name
Packet
Type
Description
32
F-ADDR
Variable
The full target address for a taken direct branch. Most
significant bits that have a value of 0 may be truncated.
8
I-CNT
Variable
Number of bytes executed since the last taken branch.
2
EVT-ID
Fixed
Cause of indirect branch:
3: Exception entry
2: Call
1: Branch via register contents
0: Return
6
TCODE
Fixed
Value = 12
Program Trace, Resource Full
This message is output whenever an internal resource (sequential instruction counter) has
reached its maximum value. To avoid losing information when this resource becomes full, the
Resource Full message is transmitted. The information from this message is added with information from subsequent messages to interpret the full picture of what has transpired. Multiple
Resource Full messages can occur before the arrival of the message that the information
belongs with.
Table 9-39.
Resource Full message
Program Trace, Resource Full
Direction: From target
Packet
Size (bits)
Packet
Name
Packet
Type
Description
8
RDATA
Variable
Number of bytes executed since the last taken branch.
4
RCODE
Fixed
Resource Code. This code indicates which internal resource
has reached its maximum value. Refer to Table 9-40 for
details.
6
TCODE
Fixed
Value = 27
Table 9-40.
Resourc
e Code
Resource Code (RCODE) description
Resource
Data Packet Value
0b0000
Program Trace - Sequential Instruction
Counter
Number of instruction units executed since
the last taken branch.
0b0001 0b1111
Reserved
130
32002F–03/2010
AVR32
9.5.4.8
Program Trace Correlation
Program Trace Correlation messages are used to correlate events to the program flow that may
not be associated with the instruction stream (e.g. Data Trace Messages). The occurrence of an
event listed in Table 9-41 will cause this message to be transmitted.
Table 9-41.
Program Trace Correlation message
Program Trace Correlation
Packet
Size (bits)
Packet
Name
Packet
Type
8
I-CNT
Variable
Number of instruction units executed since the last taken
branch.
4
EVCODE
Fixed
Event Code. Refer to Table 9-42.
6
TCODE
Fixed
Value = 33
Table 9-42.
9.5.5
Direction: From target
Description
Event Code (EVCODE) description
Event Code
(EVCODE)
Event Description
0b0000
Entry into Debug Mode
0b0001
Entry into Low Power Mode
0b0010 - 0b0011
Reserved
0b0100
Program Trace Disabled
0b0101 - 0b1111
Reserved
Registers
Program trace is enabled using the TM field in the Development Control register.
9.6
Data Trace
9.6.1
Overview
The AVR32 OCD system provides data trace via the AUX port. The CPU data memory accesses
can be monitored real-time using the Nexus class 2+ compliant Data Trace Unit. Both reads and
writes can be traced.
Data Trace information is transmitted through data trace messages, which can be of read or
write type, with or without sync. The messages contain information about the data address and
value which triggered the trace. Data addresses can be complete (with sync), or compressed
relative to the previous transmitted message (without sync). The value contains the data value
read or written from the data cache, and is of the same width as the access size (byte, halfword,
word, or doubleword).
The TM[1] bit in the Development Control register must be set to enable data trace. It is also
possible to trigger data trace using watchpoints. In this case, TM[1] will be set or cleared
automatically.
131
32002F–03/2010
AVR32
9.6.2
Using data trace channels as watchpoints
Data Trace is enabled for address ranges (trace channels) specified by pairs of Data Trace Start
and End Address registers (DTSA/DTEA). Each data access within that boundary will generate
an action as specified by the corresponding bits in the Data Trace Control register (DTC). The
AVR32 OCD system currently supports two data trace channels.
While each channel can be used to trigger data trace messages, it is also possible to trigger
watchpoint messages, providing flexibility when using the OCD system. Watchpoints can be
ranged, i.e. trigger on all accesses between DTSA through DTEA, or trigger on a single location,
if DTSA and DTEA are written to the same value.
Writing TnWP to one enables a watchpoint on accesses for data trace channel n. The watchpoint message is sent as a vendor defined trace watchpoint message.
It is possible to enable both trace and watchpoint on the same channel, but typically, only one of
the options will be used.
9.6.3
Messages
The Trace Watchpoint Hit message is described in Section 9.4.5.2 on page 122.
9.6.3.1
Data Trace, Data Write (DTDW)
This message is output by the target processor when it detects a memory write that matches the
OCD system’s data trace attributes.
Table 9-43.
9.6.3.2
Data Trace, Data Write message
Data Trace, Data Write message
Direction: From target
Packet
Size
Packet
Name
Packet
Type
Description
8 / 16 /
32
DATA
Variable
The data value written. The size will vary depending on the load
/ store instruction being traced.
32
U-ADDR
Variable
The unique portion of the data write address, which is relative to
the previous Data Trace Message (read or write).
2
DSZ
Fixed
Data size:
00 = 8 bits
01 = 16 bits
10 = 32 bits
6
TCODE
Fixed
Value=5
Data Trace, Data Write with Sync (DTDWS)
This message is an alternative to the Data Trace, Data Write Message. It is output instead of a
Data Trace, Data Write Message whenever a memory write occurs that matches the debug
logic’s data trace attributes, and when one of the following conditions has occurred:
1. The processor has exited from reset. This synchronization message is required to allow
the unique portion of the data write address of following Data Trace, Data Write Messages to be correctly interpreted by the tool.
2. When data trace is enabled during normal execution of the embedded processor.
132
32002F–03/2010
AVR32
3. Upon exit from a power-down state. This synchronization message is required to allow
the unique portion of the data write address of following Data Trace, Data Write Messages to be correctly interpreted by the tool.
4. The Event-In pin has been asserted and a debug control register field specifies that
EVTI pin action is to generate data trace synchronization.
5. An overrun condition had previously occurred in which one or more data trace occurrences were discarded by the target processor’s debug logic. To inform the tool that an
overrun condition occurred,the target outputs an Error Message (TCODE = 8) with an
ECODE value of 00010 or 00111 immediately prior to the Data Trace, Data Write with
Sync Message.
6. The Data Trace Message counter has expired indicating that at most 256 without-sync
versions of Data Trace Messages have been sent since the last with-sync version.
7. A data write is detected following the processor exiting from Debug Mode.
Table 9-44.
9.6.3.3
Data Trace, Data Write with Sync message
Data Trace, Data Write with Sync
message
Direction: From target
Packet
Size
Packet
Name
Packet
Type
Description
8 / 16 /
32
DATA
Variable
The data value written. The size will vary depending on the load /
store instruction being traced.
32
F-ADDR
Variable
The full address of the memory location written. Most significant
bits that have a value of 0 are truncated.
2
DSZ
Fixed
Data size:
00 = 8 bits
01 = 16 bits
10 = 32 bits
6
TCODE
Fixed
Value=13
Data Trace, Data Read (DTDR)
This message is output by the target processor when it detects a memory read that matches the
OCD system’s data trace attributes.
Table 9-45.
Data Trace, Data Read message
Data Trace, Data Read message
Direction: From target
Packet
Size
Packet
Name
Packet
Type
Description
8 / 16 /
32
DATA
Variable
The data value read. The size will vary depending on the load /
store instruction being traced.
32
U-ADDR
Variable
The unique portion of the data read address, which is relative to
the previous Data Trace Message (read or write).
2
DSZ
Fixed
Data size:
00 = 8 bits
01 = 16 bits
10 = 32 bits
6
TCODE
Fixed
Value=6
133
32002F–03/2010
AVR32
9.6.3.4
Data Trace, Data Read with Sync (DTDRS)
This message is an alternative to the Data Trace, Data Read Message. It is output instead of a
Data Trace, Data Read Message whenever a memory read occurs that matches the debug
logic’s data trace attributes, and when one of the following conditions has occurred:
The processor has exited from reset. This synchronization message is required to allow the
unique portion of the data write address of following Data Trace, Data Read Messages to be correctly interpreted by the tool.
When enabling data trace is during normal execution of the embedded processor.
Upon exit from a power-down state. This synchronization message is required to allow the
unique portion of the data write address of following Data Trace, Data Read Messages to be correctly interpreted by the tool.
The Event-In pin has been asserted and a debug control register field specifies that EVTI pin
action is to generate data trace synchronization.
An overrun condition had previously occurred in which one or more data trace occurrences were
discarded by the target processor’s debug logic. To inform the tool that an overrun condition
occurred, the target outputs an Error Message (TCODE = 8) with an ECODE value of 00010 or
00111 immediately prior to the Data Trace, Data Read with Sync Message.
The periodic Data Trace Message counter has expired indicating that 255 without-sync versions
of Data Trace Messages have been sent since the last with-sync version.
A data read is detected following the processor exiting from Debug Mode.
Table 9-46.
Data Trace, Data Read with Sync message
Data Trace, Data Read with Sync
message
Direction: From target
Packet
Size
Packet
Name
Packet
Type
Description
8 / 16 /
32
DATA
Variable
The data value read. The size will vary depending on the load /
store instruction being traced.
32
F-ADDR
Variable
The full address of the memory location written. Most significant
bits that have a value of 0 are truncated.
2
DSZ
Fixed
Data size:
00 = 8 bits
01 = 16 bits
10 = 32 bits
6
TCODE
Fixed
Value=14
134
32002F–03/2010
AVR32
9.6.3.5
Data Trace, Read-Modify-Write (DTRMW)
This message is generated when a Read-Modify-Write (RMW) instruction is generated with a
target address within an active data trace window. These instructions have the format "memc/s/t
imm, bp", and can clear, set, or toggle a specified bit in memory.
Table 9-47.
9.6.3.6
Data Trace, Read-Modify-Write message
Data Trace, Read-Modify-Write
message
Direction: From target
Packet
Size
Packet
Name
Packet
Type
Description
32
U-ADDR
Variable
The unique portion of the data write address, which is relative to
the previous Data Trace Message (read or write).
5
BIT
Variable
Bit argument of the RMW instruction.
2
TYPE
Fixed
Bit operation:
00 = Reserved
01 = Clear
10 = Set
11 = Toggle
6
TCODE
Fixed
Value=58
Data Trace, Read-Modify-Write with Sync (DTRMWS)
This message is output instead of DTRMW under the same conditions as shown for DTDWS.
Table 9-48.
Data Trace, Read-Modify-Write with Sync message
Data Trace, Read-Modify-Write
with sync message
Direction: From target
Packet
Size
Packet
Name
Packet
Type
Description
32
F-ADDR
Variable
The full address of the memory location written. Most significant
bits that have a value of 0 are truncated.
5
BIT
Variable
Bit argument of the RMW instruction.
2
TYPE
Fixed
Bit operation:
00 = Reserved
01 = Clear
10 = Set
11 = Toggle
6
TCODE
Fixed
Value=59
135
32002F–03/2010
AVR32
9.6.4
Registers
9.6.4.1
Data Trace Control register (DTC)
This register controls actions taken on data accesses within all data trace channels.
Table 9-49.
R/W
R/W
9.6.4.2
Data Trace Control Register
Bit Number
31:30
Field Name
RWT0
Init. Val.
Description
0
RWT0 - Read/Write Trace channel 0
00 = No trace enabled
x1 = Enable data read trace
1x = Enable data write trace
RWT1 - Read/Write Trace channel 1
00 = No trace enabled
x1 = Enable data read trace
1x = Enable data write trace
R/W
29:28
RWT1
0
R
27:2
Reserved
0
R/W
1
T1WP
0
T1WP - Trace Channel 1 Watchpoint
R/W
0
T0WP
0
T0WP - Trace Channel 0 Watchpoint
Data Trace Start/End Address register (DTSA/DTEA)
DTSAn and DTEAn define the inclusive data access range [DTSAn : DTEAn] for trace channel
n. Each trace channel 0 and 1 has its own DTSA/DTEA register pair. If DTSA=DTEA, the trace
channel will match on accesses to a single location. If DTSA>DTEA, no match will occur for the
trace channel.
DTSA0, DTSA1
Table 9-50.
Data Trace Start Address Register
R/W
Bit Number
Field Name
Init. Val.
Description
R/W
31:0
DTSA
0
DTSA - Start address for trace visibility
DTEA0, DTEA1
Table 9-51.
9.7
9.7.1
Data Trace End Address Register
R/W
Bit Number
Field Name
Init. Val.
Description
R/W
31:0
DTEA
0
DTEA - End address for trace visibility
Ownership Trace
Functional description
The AVR32 OCD system implements Ownership Trace in compliance with the Nexus standard.
Ownership trace provides a macroscopic view, such as task flow reconstruction, when debugging software written in a high level (or object oriented) language. It offers the highest level of
abstraction for tracking operating system software execution. This is especially useful when the
developer is not interested in debugging at lower levels.
Ownership trace is especially important for embedded processors with a memory management
unit, in which all processes can use the same virtual program and data spaces. Ownership trace
136
32002F–03/2010
AVR32
offers development tools a mechanism to decipher which set of symbolics and sources are
associated for lower levels of visibility and debugging.
Ownership trace information is transmitted out the AUX using an Ownership Trace Message.
OTM facilitates ownership trace by providing visibility of which process ID or operating system
task is activated. An Ownership Trace Message is transmitted to indicate when a new process/task is activated, allowing development tools to trace ownership flow. Additionally, an
Ownership Trace Message is also transmitted periodically during runtime at a minimum frequency of every 256 Program Trace or Data Trace Messages.
In the AVR32, this feature is supported through an Ownership Trace Register, which automatically produces an Ownership Trace Message when written to. The RTOS scheduler routine
writes the new process ID to this register during process switching using the mtdr instruction.
The TM[0] bit in the Development Control register must be set to enable ownership trace.
9.7.2
Messages
9.7.2.1
Ownership Trace (OT)
• The ownership trace message is sent:
• When the Ownership Trace Process ID (PID) register is written.
• When program trace with sync message is generated due to overflow in the periodic
message counter.
• When a data trace with sync message is generated due to overflow in the periodic message
counter.
• After a Transmit Queue overrun if the CPU has written to PID when the queue was full.
If there is no room in the Transmit Queue for the message, and the CPU is not halted to prevent
overruns, an error message is produced.
Table 9-52.
9.7.3
Ownership Trace Message
Ownership Trace Message
Direction: From target
Packet
Size
Packet
Name
Packet
Type
Description
32
PROCESS
Fixed
Task / process ID.
6
TCODE
Fixed
Value = 2
Registers
9.7.3.1
Ownership Trace Process ID (PID)
The CPU should write the current Process ID value to this register, whenever the RTOS performs a process switch. This will automatically create an Ownership Trace Message to be
transmitted to the tool. This register can be written from any privileged CPU mode.
137
32002F–03/2010
AVR32
The tool can read and write this register, although it is recommended that only the CPU writes
this register.
Table 9-53.
R/W
RW
9.8
Ownership Trace Process ID (PID)
Bit Number
31:0
Field Name
PROCESS
Init. Val.
Description
0
PROCESS - Process ID
The unique Process ID number of the currently
running process.
Memory Service Unit
The Memory Service Unit (MSU) provides access to complex memory operations, such as CRC
checking and NanoTrace. The MSU is accessed by SAB registers, but these are not mapped
into the OCD register space, and needs to be accessed with MEMORY_WORD_ACCESS or
MEMORY_SERVICE_ACCESS JTAG commands. In addition the MSU registers are mapped in
the system register space and are available tothe CPU. Refer to Section 2.5 ”System registers”
on page 11 for details.
9.8.1
CRC
The MSU can calculate a Cyclic Redundancy Check (CRC) value for a memory area. The algorithm used is the industry standard CRC32 algorithm using the generator polynomial
0xEDB88320.
9.8.1.1
Starting CRC calculation
To calculate CRC for a memory range, you need to write the start address into the ADDRHI and
ADDRLO registers, and the size of the memory range into the LENGTH register. Both the start
address and the length must be word aligned.
The initial value used for the CRC calculation must be written to the DATA register. This value
will usually be 0xFFFFFFFF, but can be e.g. the result of a previous CRC calculation if generating a common CRC of separate memory blocks.
Once completed, the calculated CRC value can be read out of the DATA register. The read
value must be inverted to match standard CRC32 implementations, or kept non-inverted if used
as starting point for subsequent CRC calculations.
If the device has enabled protection features, e.g. the protection fuse has been set on devices
with onboard flash memory, it is only possible to calculate CRC on a predefined memory area. In
most cases this area will be the entire onboard flash memory. The ADDRHI, ADDRLO,
LENGTH, and DATA registers will be forced to predefined values once the CRC operation is
started, and user-written values are ignored. This allows the user to verify the contents of a protected device, while denying malicious users the option of analyzing the memory contents
through selective CRC calculations.
The actual test is started by writing OP_CRC to the CTRL register. A running CRC operation can
be cancelled by writing OP_IDLE to CTRL.
138
32002F–03/2010
AVR32
9.8.1.2
Interpreting the results
The user should monitor the RESULT register. The possible values are:
Table 9-54.
9.8.2
CRC results
Result
Description
NOT_IMPL
The CRC feature is not implemented in this device.
CANCELED
The CRC operation was canceled by writing a value different from OP_CRC to CTRL.
BUSY
The CRC operation is running.
DONE
The CRC operation has completed.
BUS_ERR
Part of the specified memory could not be read or written.
The offending address can be read out of ADDRHI and ADDRLO.
NanoTrace
The MSU redirect OCD trace output from the normal trace output port to memory. This feature is
called NanoTrace, and enables trace functionality with low cost debuggers, or even trace support in self hosted debuggers.
9.8.2.1
Starting NanoTrace
The memory range to write the trace data to is specified by writing the start address into the
ADDRHI and ADDRLO registers, and the size of the memory range into the LENGTH register. In
addition, the TAIL register may need to be updated, see below for details. Both the start address
and the length must be word aligned.
The MSU starts expecting trace data when OP_NANOTRACE is written to the CTRL register.
The OCD system must be separately configured to actually produce trace data.
The ntbc field of CTRL controls how the memory service behaves when the trace buffer is full.
9.8.2.2
Controlling buffer overflow
The trace buffer as specified by the initial ADDRHI, ADDRLO and LENGTH registers works as a
circular buffer. The ADDRLO register is used by the MSU as the insert or head pointer, and the
TAIL register is used by the debugger as the extract or tail pointer. ADDRHI is used together
with both ADDRLO and TAIL to generate complete SAB addresses.
While writing trace data to memory, the ADDRLO and LENGTH registers are continuously
updated with the current address being written and bytes remaining until the end of the buffer.
When LENGTH reaches zero, the original contents of ADDRLO and LENGTH are restored, and
trace data are again written from the beginning of the buffer.
The trace buffer is considered to be full when the ADDRLO register reaches the same value as
TAIL. When starting trace, the TAIL register should be written to the same value as ADDRLO,
meaning that the entire buffer can be written before ADDRLO matches TAIL. During tracing, a
debugger may read out trace data from the buffer area between ADDRLO and TAIL, and then
update TAIL. This will release space in the buffer, and can potentially allow continuos tracing if
the trace buffer can be read out faster than trace data is generated.
When the buffer does fill up, the behavior of the MSU depends on the setting of the ntbc field:
• OVER: The unit keeps tracing, overwriting old trace data. The wrap field in the STATUS
register is set to indicate that trace data are lost. The OCD system is asked to generate a
synchronization message as soon as possible. The wrap field must be manually cleared by
writing it to zero.
139
32002F–03/2010
AVR32
• DISABLE: The MSU leaves the NanoTrace mode, and goes to idle mode. The OCD system
will continue as before, but trace data will go to the trace output port if enabled.
• BREAK_STOP: The MSU will ask the OCD system to enter debug mode, so no more trace
data is generated. There will be a few more trace messages that do not fit in the buffer, so the
CPU is forced to stop until the debugger clears room in the trace buffer. This ensures that no
trace data is lost, but this also means that this mode will deadlock self hosted debuggers, as
the CPU is stalled and can never release room in the trace buffer!
• BREAK_FLUSH: The MSU will ask the OCD system to enter debug mode, so no more trace
data is generated. There will be a few more trace messages that do not fit in the buffer, but
the MSU will flush these and allow the CPU to enter OCD mode. This means that the last few
messages will be lost!
When restarting NanoTrace after entering debug mode using one of the BREAK modes, the
TAIL register must be updated in order to tell the MSU that there is free space in the trace buffer.
It is perfectly valid to update TAIL to the value it already contains - this is interpreted as making
the entire buffer available. After this, the CPU can be restarted by issuing a retd instruction
through the OCD system. If ntbc is set to BREAK_FLUSH the first new trace message will be a
synchronization message.
9.8.2.3
Interpreting the results
The debugger should monitor the RESULT register. The possible values are:
Table 9-55.
NanoTrace results
Result
Description
NOT_IMPL
The NanoTrace feature is not implemented in this device.
CANCELED
The NanoTrace operation was canceled by writing a value different from
OP_NANOTRACE to CTRL.
BUSY
The NanoTrace operation is running.
DONE
The NanoTrace operation has completed. This can only happen when the ntbc field is
written to DISABLE.
BUS_ERR
Part of the specified memory could not be written to.
The offending address can be read out of ADDRHI and ADDRLO.
140
32002F–03/2010
AVR32
9.8.3
MSU Register summary
Table 9-56.
Offset
MSU Register Summary
Register
Name
Access
Reset State
0
Address register, high part
ADDRHI
Read/Write
-
1
Address register, low part
ADDRLO
Read/Write
-
2
Length register
LENGTH
Read/Write
-
3
Control register
CTRL
Read/Write
0x0
4
Status register
STATUS
Read-only
0x0
5
Data register
DATA
Read/Write
0x0
6
Tail address register
TAIL
Read/Write
-
141
32002F–03/2010
AVR32
9.8.3.1
Address Register, High Part
Name:
ADDRHI
Access Type: Read/Write
Address offset: 0
31
–
23
–
15
–
7
–
30
–
22
–
14
–
6
–
29
–
21
–
13
–
5
–
28
–
20
–
12
–
4
–
27
–
19
–
11
–
3
26
–
18
–
10
–
2
25
–
17
–
9
–
1
24
–
16
–
8
–
0
ADDRHI
• ADDRHI: Address High part
Bits 35:32 of full SAB address
142
32002F–03/2010
AVR32
9.8.3.2
Address Register, Low Part
Name:
ADDRLO
Access Type: Read/Write
Address offset: 1
31
30
29
28
23
22
21
20
27
26
25
24
19
18
17
16
11
10
9
8
3
2
1
0
0
0
ADDRLO
ADDRLO
15
14
13
12
7
6
5
4
ADDRLO
ADDRLO
• ADDRLO: Address Low part
Bits 31-2 of full SAB address.
• Bits 1:0
Always zero.
143
32002F–03/2010
AVR32
9.8.3.3
Length Register
Name:
LENGTH
Access Type: Read/Write
Address offset: 2
31
30
29
28
23
22
21
20
27
26
25
24
19
18
17
16
11
10
9
8
3
2
1
0
0
0
LENGTH
LENGTH
15
14
13
12
7
6
5
4
LENGTH
LENGTH
• LENGTH: Length value
• Bits 1:0
Always zero.
144
32002F–03/2010
AVR32
9.8.3.4
Control Register
Name:
CTRL
Access Type: Read/Write
Address offset: 3
31
–
23
–
15
–
7
–
30
–
22
–
14
–
6
–
29
–
21
–
13
–
5
28
–
20
–
12
–
4
27
–
19
–
11
–
3
NTBC
26
–
18
–
10
–
2
25
–
17
–
9
–
1
24
–
16
–
8
–
0
OP
• OP: Requested operation:
Table 9-57. OP field of control register
OP
Name
Description
0
NONE
No operation, or cancel current operation
1
CRC
2
NTRACE
Others
N/A
Calculate CRC of memory area
Write trace data from debug system to memory
Reserved
• NTBC: NanoTrace Buffer Control
What to do when trace buffer is full:
Table 9-58.
NTBC field of control register
OP
Name
Description
0
OVER
Overwrite old buffer contents
1
DISABLE
2
BREAK_STOP
Make the CPU enter debug mode, but stop CPU until place has been freed for the last few trace
frames.
Note: This may deadlock self-hosted debuggers!
3
BREAK_FLUSH
Make the CPU enter debug mode, and flush the last few trace frames that don’t fit in the buffer.
Disable nanotrace
145
32002F–03/2010
AVR32
9.8.3.5
Status Register
Name:
STATUS
Access Type: Read/Write
Address offset: 4
31
–
23
–
15
–
7
–
30
–
22
–
14
–
6
–
29
–
21
–
13
–
5
–
28
–
20
–
12
–
4
–
27
–
19
–
11
–
3
WRAP
26
–
18
–
10
–
2
25
–
17
–
9
–
1
RESULT
24
–
16
–
8
–
0
• RESULT: Result of current or last operation
Table 9-59. Result field of status register
OP
Name
Description
0
DONE
The previous operation finished successfully
1
BUSY
Some operation is currently active
2
NOT_IMPL
The requested operation is not implemented
3
BUS_ERR
Unable to access part of the requested memory area
4
FAILED
5
LOCKED
6
CANCELED
Others
N/A
The requested operation failed
The requested operation cannot be performed because chip security features are enabled
The previous operation was canceled by the user
Reserved
• WRAP
The NanoTrace write buffer address has wrapped back to the start address.
146
32002F–03/2010
AVR32
9.8.3.6
Data Register
Name:
DATA
Access Type: Read/Write
Address offset: 5
31
30
29
28
23
22
21
20
27
26
25
24
19
18
17
16
11
10
9
8
3
2
1
0
DATA
DATA
15
14
13
12
7
6
5
4
DATA
DATA
• DATA: Generic Data Register
147
32002F–03/2010
AVR32
9.8.3.7
Tail Address Register
Name:
TAIL
Access Type: Read/Write
Address offset: 7
31
30
29
28
23
22
21
20
27
26
25
24
19
18
17
16
11
10
9
8
3
2
1
0
0
0
TAIL
TAIL
15
14
13
12
7
6
5
4
TAIL
TAIL
• TAIL: Tail Address Register for NanoTrace buffer.
• Bits 1:0
Always zero.
148
32002F–03/2010
AVR32
9.9
OCD Message Summary
Table 9-60.
Message Summary
TCODE
Message
Public /
Vendor
Defined
0
Debug Status (DEBS)
Public
page 100
1
Reserved
2
Ownership Trace (OT)
Public
page 137
3
Program Trace, Direct Branch (PTDB)
Public
page 127
4
Program Trace, Indirect Branch (PTIB)
Public
page 127
5
Data Trace, Data Write (DTDW)
Public
page 132
6
Data Trace, Data Read (DTDR)
Public
page 133
7
Reserved
8
Error (ERROR)
Public
page 116
9
Program Trace Synchronization (PTSY)
Public
page 128
10
Reserved
11
Program Trace, Direct Branch with Sync (PTDBS)
Public
page 129
12
Program Trace, Indirect Branch with Sync (PTIBS)
Public
page 129
13
Data Trace, Data Write with Sync (DTDWS)
Public
page 132
14
Data Trace, Data Read with Sync (DTDRS)
Public
page 134
15
Watchpoint Hit (WH)
Public
page 122
16–26
Reserved
27
Program Trace Resource Full (PTRF)
Public
page 130
28–32
Reserved
33
Program Trace Correlation (PTC)
Public
page 131
34–55
Reserved
56
Trace Watchpoint Hit (TWH)
Vendor
page 122
57
Direct Branch with Target Address (DBTA)
Vendor
page 127
58
Data Trace, Read-Modify-Write (DTRMW)
Vendor
page 135
59
Data Trace, Read-Modify-Write with Sync (DTRMWS)
Vendor
page 135
60-62
Reserved
Vendor
63 (0x3F)
Vendor Defined Extension Message
Reserved
Vendor
Page
Table 9-62 shows the messages which can be transmitted by the target on the AUX port. OCD
registers can be written by the tool using the JTAG mechanism described in “Debug Port” on
page 109.
149
32002F–03/2010
AVR32
Table 9-63 shows the format of the transmitted messages. Packets shown in bold are variable
length, the others are fixed length. All variable length packets can be truncated by omitting leading zeroes, but will always end on a port boundary.
Table 9-61.
Message formats
Message format
Nexus Message
TCODE
[5:0]
Packet 1
Debug Status
0
STATUS[31:0]
Ownership Trace
2
Error
8
Program Trace,
Direct Branch
3
Program Trace,
Direct Branch with
Target Address
57
Program Trace,
Indirect Branch
4
Program Trace
Synchronization
9
Program Trace,
Direct Branch with
Sync
11
Program Trace,
Indirect Branch
with Sync
12
Program Trace
Resource Full
27
RCODE[3:0]
Program Trace
Correlation
33
EVCODE[3:0]
Data Trace, Data
Write
5
DSZ[1:0]
Data Trace, Data
Read
6
DSZ[1:0]
Data Trace, ReadModify-Write
58
TYPE[1:0]
Data Trace, Data
Write with Sync
13
DSZ[1:0]
Data Trace, Data
Read with Sync
14
DSZ[1:0]
Data Trace, ReadModify-Write with
Sync
59
TYPE[1:0]
Watchpoint Hit
15
WPHIT[7:0]
-
-
Trace Watchpoint
Hit
56
WPHIT[1:0]
-
-
Packet 2
Packet 3
PROCESS [31:0]
-
-
ECODE[4:0]
-
-
-
-
I-CNT[7:0]
I-CNT[7:0]
EVT-ID[1:0]
U-ADDR[31:0]
I-CNT[7:0]
I-CNT[7:0]
PC[31:0]
I-CNT[7:0]
F-ADDR[31:0]
U-ADDR[31:0]
-
-
EVT-ID[1:0]
I-CNT[7:0]
F-ADDR[31:0]
RDATA[7:0]
I-CNT[7:0]
U-ADDR[31:0]
DATA[31:0]
U-ADDR[31:0]
DATA[31:0]
BIT[4:0]
U_ADDR[31:0]
F-ADDR[31:0]
DATA[31:0]
F-ADDR[31:0]
DATA[31:0]
BIT[4:0]
F_ADDR[31:0]
150
32002F–03/2010
AVR32
9.10
OCD Message Summary
Table 9-62.
Message Summary
TCODE
Message
Public /
Vendor
Defined
0
Debug Status (DEBS)
Public
page 100
1
Reserved
2
Ownership Trace (OT)
Public
page 137
3
Program Trace, Direct Branch (PTDB)
Public
page 127
4
Program Trace, Indirect Branch (PTIB)
Public
page 127
5
Data Trace, Data Write (DTDW)
Public
page 132
6
Data Trace, Data Read (DTDR)
Public
page 133
7
Reserved
8
Error (ERROR)
Public
page 116
9
Program Trace Synchronization (PTSY)
Public
page 128
10
Reserved
11
Program Trace, Direct Branch with Sync (PTDBS)
Public
page 129
12
Program Trace, Indirect Branch with Sync (PTIBS)
Public
page 129
13
Data Trace, Data Write with Sync (DTDWS)
Public
page 132
14
Data Trace, Data Read with Sync (DTDRS)
Public
page 134
15
Watchpoint Hit (WH)
Public
page 122
16–26
Reserved
27
Program Trace Resource Full (PTRF)
Public
page 130
28–32
Reserved
33
Program Trace Correlation (PTC)
Public
page 131
34–55
Reserved
56
Trace Watchpoint Hit (TWH)
Vendor
page 122
57
Direct Branch with Target Address (DBTA)
Vendor
page 127
58
Data Trace, Read-Modify-Write (DTRMW)
Vendor
page 135
59
Data Trace, Read-Modify-Write with Sync (DTRMWS)
Vendor
page 135
60-62
Reserved
Vendor
63 (0x3F)
Vendor Defined Extension Message
Reserved
Vendor
Page
Table 9-62 shows the messages which can be transmitted by the target on the AUX port. OCD
registers can be written by the tool using the JTAG mechanism described in “Debug Port” on page
109.
151
32002F–03/2010
AVR32
Table 9-63 shows the format of the transmitted messages. Packets shown in bold are variable
length, the others are fixed length. All variable length packets can be truncated by omitting leading zeroes, but will always end on a port boundary.
Table 9-63.
Message formats
Message format
Nexus Message
TCODE
[5:0]
Packet 1
Debug Status
0
STATUS[31:0]
Ownership Trace
2
Error
8
Program Trace,
Direct Branch
3
Program Trace,
Direct Branch with
Target Address
57
Program Trace,
Indirect Branch
4
Program Trace
Synchronization
9
Program Trace,
Direct Branch with
Sync
11
Program Trace,
Indirect Branch
with Sync
12
Program Trace
Resource Full
27
RCODE[3:0]
Program Trace
Correlation
33
EVCODE[3:0]
Data Trace, Data
Write
5
DSZ[1:0]
Data Trace, Data
Read
6
DSZ[1:0]
Data Trace, ReadModify-Write
58
TYPE[1:0]
Data Trace, Data
Write with Sync
13
DSZ[1:0]
Data Trace, Data
Read with Sync
14
DSZ[1:0]
Data Trace, ReadModify-Write with
Sync
59
TYPE[1:0]
Watchpoint Hit
15
WPHIT[7:0]
-
-
Trace Watchpoint
Hit
56
WPHIT[1:0]
-
-
Packet 2
Packet 3
PROCESS [31:0]
-
-
ECODE[4:0]
-
-
-
-
I-CNT[7:0]
I-CNT[7:0]
EVT-ID[1:0]
U-ADDR[31:0]
I-CNT[7:0]
I-CNT[7:0]
PC[31:0]
I-CNT[7:0]
F-ADDR[31:0]
U-ADDR[31:0]
-
-
EVT-ID[1:0]
I-CNT[7:0]
F-ADDR[31:0]
RDATA[7:0]
I-CNT[7:0]
U-ADDR[31:0]
DATA[31:0]
U-ADDR[31:0]
DATA[31:0]
BIT[4:0]
U_ADDR[31:0]
F-ADDR[31:0]
DATA[31:0]
F-ADDR[31:0]
DATA[31:0]
BIT[4:0]
F_ADDR[31:0]
152
32002F–03/2010
AVR32
9.11
OCD Register Summary
Use the index shown in the "Register index" column when accessing OCD registers by the
Nexus access mechanism (see Section 9.3.2 on page 109).Use the index shown in the
"mtdr/mfdr index" column when accessing OCD registers by mtdr/mfdr instructions from the
CPU (see Section 9.2.10 on page 98). These indexes are identical to the register index multiplied by 4.
Table 9-64.
OCD Register Summary
Register
Index
mtdr/mf
dr index
Register
Access
Type
Page
0
0
Device ID (DID)
R
page 100
1
4
Reserved
—
2
8
Development Control (DC)
R/W
3
12
Reserved
—
4
16
Development Status (DS)
R
5-6
20-24
Reserved
—
7
28
Reserved
—
8
32
Reserved
—
9
36
Reserved
—
10
40
Reserved
—
11
44
Watchpoint Trigger (WT)
R/W
12
48
Reserved
—
13
52
Data Trace Control (DTC)
R/W
page 136
14–15
56-60
Data Trace Start Address (DTSA) Channel 0 to 1
R/W
page 136
16-17
64-68
Reserved
—
18–19
72-76
Data Trace End Address (DTEA) Channel 0 to 1
R/W
20-21
80-84
Reserved
—
22
88
PC Breakpoint/Watchpoint Control 0A (BWC0A)
R/W
page 123
23
92
PC Breakpoint/Watchpoint Control 0B (BWC0B)
R/W
page 123
24
96
PC Breakpoint/Watchpoint Control 1A (BWC1A)
R/W
page 123
25
100
PC Breakpoint/Watchpoint Control 1B (BWC1B)
R/W
page 123
26
104
PC Breakpoint/Watchpoint Control 2A (BWC2A)
R/W
page 123
27
108
PC Breakpoint/Watchpoint Control 2B (BWC2B)
R/W
page 123
28
112
Data Breakpoint/Watchpoint Control 3A (BWC3A)
R/W
page 124
29
116
Data Breakpoint/Watchpoint Control 3B (BWC3B)
R/W
page 124
30
120
PC Breakpoint/Watchpoint Address 0A (BWA0A)
R/W
page 122
31
124
PC Breakpoint/Watchpoint Address 0B (BWA0B)
R/W
page 122
32
128
PC Breakpoint/Watchpoint Address 1A (BWA1A)
R/W
page 122
33
132
PC Breakpoint/Watchpoint Address 1B (BWA1B)
R/W
page 122
page 104
page 106
page 125
page 136
153
32002F–03/2010
AVR32
Table 9-64.
OCD Register Summary
Register
Index
mtdr/mf
dr index
Register
Access
Type
Page
34
136
PC Breakpoint/Watchpoint Address 2A (BWA2A)
R/W
page 122
35
140
PC Breakpoint/Watchpoint Address 2B (BWA2B)
R/W
page 122
36
144
Data Breakpoint/Watchpoint Address 3A (BWA3A)
R/W
page 123
37
148
Data Breakpoint/Watchpoint Address 3B (BWA3B)
R/W
page 123
38
152
Breakpoint/Watchpoint Data 3A (BWD3A)
R/W
page 123
39
156
Breakpoint/Watchpoint Data 3B (BWD3B)
R/W
page 123
40–65
160-260
Reserved
—
64
256
Nexus Configuration (NXCFG)
R
page 100
65
260
Debug Instruction Register (DINST)
R/W
page 108
66
264
Debug Program Counter (DPC)
R/W
page 109
67
268
Reserved
—
68
272
Debug Communication CPU Register (DCCPU)
R/W
page 101
69
276
Debug Communication Emulator Register (DCEMU)
R/W
page 102
70
280
Debug Communication Status Register (DCSR)
R/W
page 102
71
284
Ownership Trace Process ID (PID)
R/W
page 137
72
288
Debug Communication Control Register (DCCR)
R/W
page 103
73
292
Peripheral Debug Register(PDBG)
R/W
74-75
296-300
Reserved
—
76
304
AUX port Control (AXC)
R/W
77– 255
3081020
Reserved
—
page 116
154
32002F–03/2010
AVR32
10. Revision History
Doc. Rev.
Date
Comments
32002F
2010-03-12
Improved description of events and priority.
Replaced invalid reference in the OCD.PDBG register.
Note added about overall system interrupt latency.
Added MSU system registers.
32002E
2009-09-01
Added Floating-Point hardware description.
32002D
2009-08-01
Added OCD DCCPU and DCEMU interrupts.
Added PDBG register for individual module masks.
Added AVR32 architecture revision 3 secure state support.
COUNT/COMPARE system register reset-on-match now programmable by CPUCR
Corrected LDM, STM, and SCALL instruction cycle count in cycle count chapter.
Corrected maximum IRQ latency in the Pipeline chapter.
32002C
2007-11-19
MPU compilant with revision 2 of AVR32 Architecture.
Added cycle counts for new instruction in version 2 of the CPU.
Added COUNT/COMPARE system register reset-on-match.
Added CPU Local Bus.
Reconfigured OCD AXC register.
32002B
2007-08-03
Added Memory Service Unit (MSU) description. Added description of peripheral behavior
in Debug.
32002A
2007-03-30
Initial revision.
155
32002F–03/2010
AVR32
Table of contents
1
Introduction .............................................................................................. 2
1.1The AVR family .........................................................................................................2
1.2 The AVR32 Microprocessor Architecture .................................................................2
1.3Exceptions and Interrupts ..........................................................................................3
1.4Java Support .............................................................................................................3
1.5FlashVault .................................................................................................................3
1.6Microarchitectures .....................................................................................................3
1.7The AVR32UC architecture .......................................................................................4
1.8AVR32UC CPU revisions ..........................................................................................5
2
Programming Model ................................................................................ 7
2.1Architectural compatibility ..........................................................................................7
2.2Implementation options .............................................................................................7
2.3Register file configuration ..........................................................................................7
2.4The Status Register ...................................................................................................8
2.5System registers ......................................................................................................11
2.6COMPARE and COUNT registers ...........................................................................17
2.7Configuration Registers ...........................................................................................17
3
Pipeline ................................................................................................... 20
3.1Overview .................................................................................................................20
3.2Prefetch unit ............................................................................................................20
3.3Decode unit .............................................................................................................20
3.4EX pipeline stage ....................................................................................................21
3.5Support for unaligned addresses ............................................................................22
3.6Forwarding hardware and hazard detection ............................................................22
3.7Event handling .........................................................................................................22
3.8Special concerns .....................................................................................................24
3.9Entry points for events .............................................................................................26
3.10Interrupt latencies ..................................................................................................37
3.11NMI latency ...........................................................................................................38
4
Floating Point Hardware ........................................................................ 40
4.1Compliance .............................................................................................................40
4.2Operations ...............................................................................................................40
4.3Instruction set ..........................................................................................................42
i
32002F–03/2010
AVR32
4.4Detailed instruction description ...............................................................................43
5
Secure State ........................................................................................... 59
5.1Basic concept ..........................................................................................................59
5.2Typical use scenario ................................................................................................59
5.3Secure state boot sequence ....................................................................................60
5.4Secure state debugging ..........................................................................................60
5.5Events in secure state .............................................................................................60
6
Memory System ..................................................................................... 61
6.1Memory sections .....................................................................................................61
6.2Memory interfaces ...................................................................................................62
6.3IF stage interface .....................................................................................................62
6.4EX stage interfaces .................................................................................................62
6.5IRAM Write buffer ....................................................................................................64
6.6Memory barriers ......................................................................................................64
7
Memory Protection Unit ........................................................................ 66
7.1Memory map in systems with MPU .........................................................................66
7.2Understanding the MPU ..........................................................................................66
7.3Example of MPU functionality .................................................................................70
8
Instruction Cycle Summary .................................................................. 71
8.1Definitions ................................................................................................................71
8.2Special considerations ............................................................................................71
8.3CPU revision ...........................................................................................................72
8.4ALU instructions ......................................................................................................72
8.5Multiply instructions .................................................................................................75
8.6MAC instructions .....................................................................................................76
8.7MulMac64 instructions .............................................................................................76
8.8Divide instructions ...................................................................................................77
8.9Saturate instructions ................................................................................................77
8.10Load and store instructions ...................................................................................77
8.11Multiple data memory access instructions .............................................................81
8.12Branch instructions ................................................................................................82
8.13Call instructions .....................................................................................................82
8.14Return from execution mode instructions ..............................................................82
8.15Swap instructions ..................................................................................................83
8.16System register instructions ..................................................................................83
ii
32002F–03/2010
AVR32
8.17System control instructions ...................................................................................83
8.18Read-modify-write instructions ..............................................................................84
8.19Code example .......................................................................................................84
9
OCD system ............................................................................................ 87
9.1Overview .................................................................................................................87
9.2CPU Development Support .....................................................................................90
9.3Debug Port ............................................................................................................109
9.4Breakpoints ...........................................................................................................117
9.5Program trace ........................................................................................................125
9.6Data Trace .............................................................................................................131
9.7Ownership Trace ...................................................................................................136
9.8Memory Service Unit .............................................................................................138
9.9OCD Message Summary ......................................................................................149
9.10OCD Message Summary ....................................................................................151
9.11OCD Register Summary ......................................................................................153
10 Revision History ................................................................................... 155
iii
32002F–03/2010
Headquarters
International
Atmel Corporation
2325 Orchard Parkway
San Jose, CA 95131
USA
Tel: 1(408) 441-0311
Fax: 1(408) 487-2600
Atmel Asia
Room 1219
Chinachem Golden Plaza
77 Mody Road Tsimshatsui
East Kowloon
Hong Kong
Tel: (852) 2721-9778
Fax: (852) 2722-1369
Atmel Europe
Le Krebs
8, Rue Jean-Pierre Timbaud
BP 309
78054 Saint-Quentin-enYvelines Cedex
France
Tel: (33) 1-30-60-70-00
Fax: (33) 1-30-60-71-11
Atmel Japan
9F, Tonetsu Shinkawa Bldg.
1-24-8 Shinkawa
Chuo-ku, Tokyo 104-0033
Japan
Tel: (81) 3-3523-3551
Fax: (81) 3-3523-7581
Technical Support
[email protected]
Sales Contact
www.atmel.com/contacts
Product Contact
Web Site
www.atmel.com
Literature Requests
www.atmel.com/literature
Disclaimer: The information in this document is provided in connection with Atmel products. No license, express or implied, by estoppel or otherwise, to any
intellectual property right is granted by this document or in connection with the sale of Atmel products. EXCEPT AS SET FORTH IN ATMEL’S TERMS AND CONDITIONS OF SALE LOCATED ON ATMEL’S WEB SITE, ATMEL ASSUMES NO LIABILITY WHATSOEVER AND DISCLAIMS ANY EXPRESS, IMPLIED OR STATUTORY
WARRANTY RELATING TO ITS PRODUCTS INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
PURPOSE, OR NON-INFRINGEMENT. IN NO EVENT SHALL ATMEL BE LIABLE FOR ANY DIRECT, INDIRECT, CONSEQUENTIAL, PUNITIVE, SPECIAL OR INCIDENTAL DAMAGES (INCLUDING, WITHOUT LIMITATION, DAMAGES FOR LOSS OF PROFITS, BUSINESS INTERRUPTION, OR LOSS OF INFORMATION) ARISING OUT OF
THE USE OR INABILITY TO USE THIS DOCUMENT, EVEN IF ATMEL HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. Atmel makes no
representations or warranties with respect to the accuracy or completeness of the contents of this document and reserves the right to make changes to specifications
and product descriptions at any time without notice. Atmel does not make any commitment to update the information contained herein. Unless specifically provided
otherwise, Atmel products are not suitable for, and shall not be used in, automotive applications. Atmel’s products are not intended, authorized, or warranted for use
as components in applications intended to support or sustain life.
© 2007 Atmel Corporation. All rights reserved. Atmel ®, logo and combinations thereof, AVR ® and others are registered trademarks or trademarks of Atmel Corporation or its subsidiaries. Other terms and product names may be trademarks of others.
32002F–03/2010