RM0078
Reference manual
SPEAr1340 architecture and functionality
Introduction
The SPEAr1340 is a member of the SPEAr® (structured processor enhanced architecture)
family of embedded microprocessors, targeting high-performance human-machine interface
(HMI) applications. It offers an unprecedented combination of integer/floating-point CPU
performance, media processing, security features, and aggressive power reduction control
for next-generation products.
SPEAr1340 is based on ARM's latest multi-core technology (Cortex-A9 SMP/AMP, ARMv7
instruction set) and is manufactured using ST's 55 nm HCMOS low-power silicon process.
This document provides technical details about the architecture and functionality of
SPEAr1340, and is intended to be used by systems-level and board-level product designers,
as well as software developers.
The SPEAr1340 address map and detailed register descriptions are provided in the
companion reference manual: RM0089, Reference manual, SPEAr1340 address map and
registers.
November 2012
Doc ID 018553 Rev 3
www.st.com
Contents
1 Device overview
1.1 Simplified block diagram
1.2 Summary of features
1.3 IP groups
2 CPU subsystem (A9SM)
2.1 Overview
2.2 Pins
2.3 Clocks
2.4 Interrupts
2.5 Functional description
2.5.1 CORTEXA9INTEGRATION
2.5.2 A9 CoreSight subsystem (A9CS)
2.5.3 Clock manager (CMR)
2.5.4 Snoop control unit (SCU)
2.5.5 Global timer (GTIM)
2.5.6 Timer and watchdog blocks (WDTIM)
2.5.7 Generic interrupt controller (GIC)
2.6 Programming
2.6.1 Programming the global timer registers
3 Multilayer interconnect matrix (BUSMATRIX)
3.1 Overview
3.2 Pins
3.3 Clocks
3.4 Interrupts
3.5 Functional description
3.5.1 Crossbars (XB)
3.5.2 Shared link (SL)
3.5.3 S3220
3.5.4 Masters (IAs) and slaves (TAs)
4 System configuration registers (MISC)
5 Reset and clock generator (RCG)
5.1 Overview
5.2 Pins
5.3 Clocks
5.4 Functional description
5.4.1 Main clock sources
5.4.2 PLLs
5.4.3 Fractional clock generator (SSCG)
5.4.4 XYSYNT clock divider
5.4.5 AMBA clock configuration
5.4.6 A9SM clock configuration
5.4.7 GMAC clock configuration
5.4.8 I2S clock configuration
5.4.9 UART clock configuration
5.4.10 C3 clock configuration
5.4.11 CLCD clock configuration
5.4.12 GPT clock configuration
5.4.13 MPMC clock configuration
5.4.14 Gate unit
5.4.15 Reset generator
5.5 Programming
5.5.1 Programming PLLs
5.5.2 Changing system modes
5.5.3 Setting cpu_clk = 600 MHz and hclk = 166 MHz
5.5.4 Configuring the fractional clock generator (SSCG)
6 Power management
6.1 Overview
6.2 Power domains
6.2.1 Power domain management: power states
6.2.2 Power domain management: configuration registers
6.2.3 Power management procedures
6.3 Clock power management
6.4 IP power management
6.4.1 Standard IP power management
6.4.2 USBPHY power management
6.4.3 MPMC/DDR PHY power management
6.4.4 PCIE/SATA/MIPHY power management
6.4.5 ADC power management
6.5 Voltage regulators
6.6 Power control module (PCM)
6.6.1 Pins
6.6.2 Functional description
7 BootROM
7.1 Overview
7.1.1 Useful terms and definitions
7.2 Functional description
7.2.1 Hardware components used
7.2.2 OTP configuration
7.2.3 Boot device selection
7.2.4 Software architecture
7.2.5 System initialization
7.2.6 Boot device initialization and code-shadowing
7.2.7 Xloader authentication and execution
7.2.8 Image header authentication
7.2.9 Default boot mode
7.3 Secure boot
7.3.1 Overview
7.3.2 First stage secure boot process
7.3.3 Life cycle
7.3.4 Services
7.3.5 Security table in BootROM
7.3.6 BootROM and RAM layout
7.3.7 OTP layout
7.3.8 Usage examples
7.3.9 BootROM signed image format
7.3.10 Image signature cryptography
7.4 Additional information
7.4.1 BootROM on Core 1
7.4.2 Error codes
7.4.3 List of supported devices
7.4.4 BootROM table
7.4.5 Terminology
7.4.6 References
8 Static RAMs (SRAM)
8.1 Overview
9 One-time programmable antifuse (OTP)
9.1 Overview
9.2 Pins
9.3 Clocks
9.4 Functional description
9.4.1 OTP banks
9.5 Programming
9.5.1 Writing
9.5.2 Masking
10 General purpose timers (GPT)
10.1 Overview
10.2 Pins
10.3 Clocks
10.4 Interrupts
10.5 Functional description
11 Real-time clock (RTC)
11.1 Overview
11.2 Pins
11.3 Clocks
11.4 Interrupts
11.5 Functional description
12 Direct memory access controllers (DMAC)
12.1 Overview
12.2 Pins
12.3 Clocks
12.4 Interrupts
12.5 Functional description
12.5.1 DMAC wrapper
12.5.2 DMAC multiplexing
12.5.3 DMAC transfers
12.5.4 Generating requests for the AHB master bus interface
12.5.5 AHB master interface arbitration
12.5.6 Scatter/Gather
12.5.7 Endianness
12.6 Programming
12.6.1 DMAC transfer types
12.6.2 Programming example
12.6.3 Programming a channel
12.6.4 Disabling a channel prior to transfer completion
12.6.5 Defined-length burst support on DMAC
13 Cryptographic co-processor (C3)
13.1 Overview
13.2 Functional description
13.2.1 AHB Master Interface (HIF)
13.2.2 C3 RAM Buffer (MEMORY)
13.2.3 Instruction dispatching subsystem (IDS)
13.2.4 Couple and chaining module (CCM)
13.2.5 AHB Slave interface (SIF)
13.2.6 System registers (SYS)
13.2.7 Reset logic (MRGEN)
13.2.8 Channels
13.3 Operation
13.3.1 Generic flow type instructions
13.3.2 Move channel instruction set
13.3.3 DES/3DES channel instruction set
13.3.4 AES (MPCM) channel instruction set
13.3.5 Unified hash with HMAC (UHH) channel instruction set
13.3.6 Unified hash with HMAC 2 (UHH2) channel instruction set
13.3.7 Public key (PKA) channel instruction set
13.3.8 RNG channel instruction set
14 Temperature sensor (THSENS)
14.1 Overview
14.2 Pins
14.3 Clocks
14.4 Interrupts
14.5 Functional description
14.5.1 Low power modes
15 Multiport DDR2/3 controller (MPMC)
15.1 Overview
15.2 Pins
15.3 Clocks
15.3.1 Changing the input clock frequency
15.4 Interrupts
15.5 Resets
15.6 Functional description
15.6.1 AXI interface
15.6.2 AHB interface
15.6.3 Initialization protocol
15.6.4 Exclusive access
15.6.5 Error responses
15.6.6 Multiport arbiter
15.6.7 Command queue with placement logic
15.6.8 Other memory controller features
15.6.9 Address mapping
16 Static memory controller (FSMC)
16.1 Overview
16.2 Pins
16.3 Clocks
16.4 Interrupts
16.5 Functional description
16.5.1 NAND Flash controller
16.5.2 NOR Flash / SRAM controller
16.5.3 Asynchronous operating modes
16.5.4 ECC calculation
16.5.5 Bus turn around
17 Serial NOR Flash controller (SMI)
17.1 Overview
17.2 Pins
17.3 Clocks
17.3.1 Latency
17.4 Functional description
17.4.1 AHB interface
17.4.2 Memory device compatibility
17.4.3 Hardware mode
17.4.4 Software mode
17.4.5 Booting from external memory
17.4.6 External memory read request
17.4.7 External memory write request
17.4.8 Write burst mode
17.4.9 Read while write
17.4.10 Erasing and write status register
18 Memory card interface (MCIF)
18.1 Overview
18.2 Pins
18.3 Clocks
18.4 Functional description
18.4.1 SD2.0/SDIO2.0/MMC4.3 AHB Host controller
18.4.2 Not using DMA
18.4.3 Using DMA
18.4.4 Using ADMA
18.4.5 Abort transaction
18.4.6 Synchronization
18.4.7 CF4.1/xD1.3 AHB Host controller
19 Giga/Fast Ethernet controller (GMAC)
19.1 Overview
19.2 Pins
19.3 Clocks
19.4 Functional description
19.4.1 Descriptors
19.4.2 Precision Time Protocol (PTP)
19.4.3 Advanced Timestamps
19.4.4 AV feature
19.4.5 Energy efficient ethernet
19.5 Programming
19.5.1 Initializing DMA
19.5.2 Initializing GMAC
19.5.3 Performing normal receive and transmit operation
19.5.4 Stopping and starting transmission
19.5.5 GMII link transitions
19.5.6 IEEE 1588 time stamping
19.5.7 AV feature initialization steps
19.5.8 Energy efficient ethernet initialization steps
20 USB 2.0 host controllers (UHC)
20.1 Overview
20.2 Pins
20.3 Clocks
20.4 Functional description
21 USB OTG controller (UOC)
21.1 Overview
21.2 Pins
21.3 Clocks
22 PCI express controller (PCIe)
22.1 Overview
22.2 Pins
22.3 Clocks
22.4 Resets
22.5 Interrupts
22.6 Functional description
22.6.1 AXI bridge interface
22.6.2 Common xpress port logic (CXPL)
22.6.3 Transmit application-dependent module (XADM)
22.6.4 Receive application-dependent module (RADM)
Contents
RM0078
22.7
22.6.5
Configuration-dependent module (CDM) . . . . . . . . . . . . . . . . . . . . . . . 338
22.6.6
Power management control (PMC) . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
22.6.7
Local bus controller (LBC) and data bus interface (DBI) . . . . . . . . . . . 339
22.6.8
Message generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
22.6.9
Hot plug control (HOTPLUG_CTRL) module . . . . . . . . . . . . . . . . . . . . 344
Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
22.7.1
Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
22.7.2
Link establishment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
22.7.3
Transmit TLP processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
22.7.4
Receive TLP processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
22.7.5
Error handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
22.7.6
Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
22.7.7
Address translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
22.7.8
Outbound iATU operation: address match mode . . . . . . . . . . . . . . . . . 366
22.7.9
Inbound iATU operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
22.7.10 Gen2 5.0GT/s operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
22.7.11 Power management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
22.8
Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
22.8.1
Programming example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
22.8.2
Programming example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
23
Serial ATA controllers (SATA) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
23.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
23.2
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
23.3
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
23.4
Resets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
23.5
Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
23.6
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
23.6.1
Bus interface unit (BIU) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
23.6.2
Generic registers (GCSR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
23.6.3
Port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
23.7
Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
23.7.1
Software initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
23.7.2
Software manipulation of Port DMA . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
24
SATA/PCIe physical interface (MiPHY) . . . . . . . . . . . . . . . . . . . . . . . . 392
24.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
24.2
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
24.3
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393
24.3.1
Reference clock configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393
24.3.2
Recommended clock frequencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394
24.3.3
SerDes clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
24.4
Resets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
24.5
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396
24.5.1
PLL description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396
24.5.2
SerDes description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397
24.5.3
Compensation module (COMPENS) description . . . . . . . . . . . . . . . . . 397
24.6
Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398
25
Asynchronous serial ports (UART) . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
25.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
25.2
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 400
25.3
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401
25.4
Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401
25.5
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
25.5.1
Main interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
25.5.2
Modem operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
25.5.3
Hardware flow control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 406
25.5.4
IrDA SIR ENDEC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 408
25.5.5
Baud rate generation and transmit logic . . . . . . . . . . . . . . . . . . . . . . . 410
26
Synchronous serial port (SSP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
26.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
26.2
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
26.3
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
26.4
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
26.4.1
Main interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
26.5
Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
26.5.1
Bit rate generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
26.5.2
Frame format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
26.6
Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 418
26.6.1
Defining the chip select . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 418
26.6.2
Enabling SSP operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 418
26.6.3
Configuring SSP as master or slave . . . . . . . . . . . . . . . . . . . . . . . . . . 419
27
I2C bus controllers (I2C) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 421
27.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 421
27.2
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 421
27.3
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 421
27.4
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
27.4.1
Main interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
27.4.2
I2C terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
27.4.3
I2C behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424
27.4.4
I2C protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
27.4.5
Multiple master arbitration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
27.4.6
Clock synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431
27.4.7
IC_CLK frequency configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 432
27.4.8
SDA hold time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
27.4.9
DMA controller interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437
27.5
Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
27.5.1
Slave mode operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
27.5.2
Master mode operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448
27.5.3
Disabling I2C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
28
General purpose I/O (GPIOA-B) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
28.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
28.2
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
28.3
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
28.4
Resets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
28.5
Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
28.6
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
28.6.1
APB interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
28.6.2
Interrupt detection logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
28.7
Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454
28.7.1
Interrupt configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454
28.7.2
Operation of the input/output lines (I/O read/write) . . . . . . . . . . . . . . . 455
29
Extended general purpose I/O (XGPIO) . . . . . . . . . . . . . . . . . . . . . . . . 457
29.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
29.2
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
29.3
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
29.4
Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
29.5
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
29.5.1
XGPIO IN read . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
29.5.2
XGPIO OUT write . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
29.5.3
Using an XGPIO pin as an interrupt . . . . . . . . . . . . . . . . . . . . . . . . . . 459
30
Keyboard controller (KBD) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 461
30.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 461
30.2
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 461
30.3
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
30.4
Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
30.5
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
30.5.1
Operating modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
31
A/D converter (ADC) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
31.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
31.2
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
31.3
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
31.4
Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
31.5
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466
31.5.1
Enhanced mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466
31.5.2
Touchscreen mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
31.5.3
High-resolution mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
31.5.4
DMA handshaking interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 468
32
PWM generators (PWM) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
32.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
32.1.1
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
32.2
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
32.3
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
32.3.1
Prescaler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
32.3.2
Pulse generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
32.4
Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
32.4.1
Configuring a channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
33
HDMI CEC interfaces (CEC) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
33.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
33.2
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
33.3
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
33.4
Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
33.5
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
33.5.1
Control logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
33.5.2
Bit timing logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479
33.5.3
Bit shaping logic (BSL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
33.5.4
Prescaler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
33.5.5
Normal functional behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
33.5.6
Error conditions and error handling . . . . . . . . . . . . . . . . . . . . . . . . . . . 482
34
Display controller (CLCD) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484
34.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484
34.2
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484
34.3
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485
34.4
Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485
34.5
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 486
34.5.1
LCD controller core . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 486
34.5.2
Master and slave bus interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
34.5.3
Timing and control unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
34.5.4
DMA controller & memory interface . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
34.5.5
Frame buffer organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 488
34.5.6
Input FIFOs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
34.5.7
Pixel unpack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
34.5.8
Palette lookup table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
34.5.9
Output FIFO and formatter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
34.5.10 Power sequencing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
34.5.11 Pulse-width modulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
34.5.12 Overlay windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
35
Graphics processing unit (GPU) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
35.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
35.2
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500
35.3
Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500
35.4
Resets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500
35.5
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501
35.5.1
Geometry processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502
35.5.2
Pixel processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503
35.5.3
Memory management unit (MMU) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
35.6
Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
35.6.1
3D system level operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 506
35.6.2
2D system level operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509
35.6.3
Graphics pipeline level operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511
36
Video decoder (VDEC) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512
36.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512
36.2
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514
36.3
Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514
36.3.1
Decoder interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514
36.3.2
Post-processor interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517
36.4
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 518
36.4.1
H.264 decoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 519
36.4.2
MPEG-4 / H.263 / Sorenson Spark decoder . . . . . . . . . . . . . . . . . . . . 521
36.4.3
MPEG-2 / MPEG-1 decoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 523
36.4.4
JPEG decoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525
36.4.5
VC-1 decoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526
36.4.6
RV decoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528
36.4.7
VP6 decoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 530
36.4.8
VP7/VP8 decoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 532
36.4.9
AVS decoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
36.4.10 DivX decoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 536
36.4.11 Post processor (PP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 536
36.4.12 Video frame storage formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 548
37
Video encoder (VENC) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555
37.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555
37.2
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556
37.3
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556
37.3.1
Bus interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556
37.3.2
Video stabilization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556
37.3.3
Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557
37.3.4
Multi-instance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 558
38
Camera input interfaces (CAM) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559
38.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559
38.2
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559
38.3
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559
38.4
Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 560
38.5
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 560
38.5.1
Data capture and conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 560
38.5.2
Data transfers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 561
38.5.3
Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 561
38.5.4
Performance levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 563
38.5.5
Capture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564
38.6
Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564
38.6.1
Selecting synchronization type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564
38.6.2
Masking interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564
39
Video input parallel port (VIP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 565
39.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 565
39.2
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 566
39.3
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 566
39.4
Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 566
39.5
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567
40
I2S digital audio interfaces (I2S) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 569
40.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 569
40.2
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 570
40.3
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 570
40.4
Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 570
40.5
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 571
40.5.1
Transmit channels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 571
40.5.2
Receive channels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 571
40.5.3
Audio data interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 571
40.5.4
External sclk gating and enable signal . . . . . . . . . . . . . . . . . . . . . . . . 571
40.6
Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572
40.6.1
Using the I2S transmitter (Tx mode) . . . . . . . . . . . . . . . . . . . . . . . . . . 572
40.6.2
Using the I2S receiver (Rx mode) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572
40.6.3
Configuring channels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572
40.6.4
Using interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
40.6.5
Programming FIFO thresholds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
40.6.6
Exchanging data with system memory . . . . . . . . . . . . . . . . . . . . . . . . 573
41
S/PDIF digital audio ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
41.1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
41.2
Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
41.3
Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
41.4
Functional description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
41.4.1
SPDIF IN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
41.4.2
SPDIF OUT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
Appendix A Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575
Appendix B Acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579
Appendix C Copyright statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585
Revision history . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 586
List of tables
Table 1. Summary of SPEAr1340 features . . . . . 29
Table 2. SPEAr1340 IP groups . . . . . 32
Table 3. CortexA9 subsystem clocks . . . . . 34
Table 4. CORTEXA9INTEGRATION and PL310 configuration parameters . . . . . 36
Table 5. A9CS memory map . . . . . 40
Table 6. Interrupt output source selection . . . . . 46
Table 7. IA group organization . . . . . 51
Table 8. Connectivity matrix . . . . . 52
Table 9. RCG clocks . . . . . 57
Table 10. PLL source clocks . . . . . 62
Table 11. PLL output clocks . . . . . 62
Table 12. PLL division factors . . . . . 64
Table 13. Jitter at PLL output clock . . . . . 64
Table 14. Selection of ? value . . . . . 64
Table 15. PLL modes . . . . . 65
Table 16. SSCGn output frequencies . . . . . 66
Table 17. XYSYNT clocks . . . . . 68
Table 18. A9SM clocks . . . . . 71
Table 19. GMAC clocks . . . . . 72
Table 20. Setting GMAC clocks to different modes . . . . . 73
Table 21. Reset sources . . . . . 81
Table 22. Allowed power states . . . . . 85
Table 23. Allowed power states . . . . . 85
Table 24. Allowed wakeup events . . . . . 86
Table 25. Power management configuration registers . . . . . 87
Table 26. Clock power states . . . . . 90
Table 27. Standard IPs power states . . . . . 91
Table 28. USBPHY power states . . . . . 91
Table 29. USBPHY power management-related registers . . . . . 92
Table 30. ADC power state . . . . . 94
Table 31. ADC power management-related registers . . . . . 94
Table 32. PCM internal pins . . . . . 97
Table 33. OTP Bank M configuration . . . . . 106
Table 34. Hardware boot selection (STRAP[0..3]) . . . . . 107
Table 35. IP configuration . . . . . 113
Table 36. USB device descriptors . . . . . 121
Table 37. USB configuration descriptors . . . . . 122
Table 38. USB interface descriptors . . . . . 122
Table 39. USB IN endpoint descriptors . . . . . 123
Table 40. USB OUT endpoint descriptors . . . . . 123
Table 41. USB string descriptors . . . . . 123
Table 43. Security parameters . . . . . 135
Table 44. Error codes . . . . . 143
Table 45. Supported NAND devices . . . . . 143
Table 47. BANK 1/ 2 bit mapping . . . . . 150
Table 48. BANK M bit mapping . . . . . 150
Table 49. DMAC MUX - selecting the peripheral . . . . . 159
Table 50. DMAC MUX - selecting the peripheral . . . . . 159
Table 51. DMAC MUX - selecting the flow controller and data direction . . . . . 160
Table 52. DMAC MUX - selecting the DMAC core . . . . . 160
Table 53. Transfer types and flow controller combinations . . . . . 162
Table 54. Programming of transfer types and channel register update method . . . . . 168
Table 55. MOVE_INIT bit encoding . . . . . 208
Table 56. MOVE_INIT bits nn definition . . . . . 208
Table 57. MOVE_DATA bit encoding . . . . . 208
Table 58. DES START ECB bit encoding . . . . . 209
Table 59. Bit a definition . . . . . 209
Table 60. Bit b definition . . . . . 209
Table 61. DES START CBC bit encoding . . . . . 210
Table 62. DES APPEND ECB bit encoding . . . . . 210
Table 63. DES APPEND CBC bit encoding . . . . . 211
Table 64. HASH [MD5/SHA1/SHA2] INIT bit encoding . . . . . 218
Table 65. HASH [MD5/SHA1/SHA2] INIT bits aa definition . . . . . 218
Table 66. HASH [MD5/SHA1/SHA2] APPEND instruction . . . . . 218
Table 67. HASH [MD5/SHA1/SHA2] END bit encoding . . . . . 218
Table 68. HASH [MD5/SHA1/SHA2] END bit t definition . . . . . 219
Table 69. HASH CONTEXT SAVE bit encoding . . . . . 219
Table 70. HASH CONTEXT RESTORE bit encoding . . . . . 219
Table 71. HMAC [MD5/SHA1/SHA2] INIT bit encoding . . . . . 220
Table 72. HMAC [MD5/SHA1/SHA2] APPEND bit encoding . . . . . 220
Table 73. HMAC [MD5/SHA1/SHA2] END bit encoding . . . . . 221
Table 74. HMAC [MD5/SHA1/SHA2] END bit t definition . . . . . 221
Table 75. HMAC CONTEXT SAVE bit encoding . . . . . 221
Table 76. HMAC CONTEXT RESTORE bit encoding . . . . . 221
Table 77. HASH [SHA384/SHA512] INIT bit encoding . . . . . 222
Table 78. HASH [SHA384/SHA512] INIT bits aa definition . . . . . 222
Table 79. HASH [SHA384/SHA512] APPEND bit encoding . . . . . 223
Table 80. HASH [SHA384/SHA512] END bit encoding . . . . . 223
Table 81. HASH CONTEXT SAVE bit encoding . . . . . 224
Table 82. HASH CONTEXT RESTORE bit encoding . . . . . 224
Table 83. HMAC [SHA384/SHA512] INIT bit encoding . . . . . 224
Table 84. HMAC [SHA384/SHA512] APPEND bit encoding . . . . . 225
Table 85. HMAC [SHA384/SHA512] END bit encoding . . . . . 225
Table 86. HMAC CONTEXT SAVE bit encoding . . . . . 225
Table 87. HMAC CONTEXT RESTORE bit encoding . . . . . 226
Table 88. MONTY_EXP instruction data structure . . . . . 227
Table 89. MONTY_PAR bit encoding . . . . . 227
Table 90. Input data structure . . . . . 227
Table 91. Resulting data structure . . . . . 227
Table 92. MOD_EXP bit encoding . . . . . 228
Table 93. Input data structure . . . . . 228
Table 94. Resulting data structure . . . . . 228
Table 95. MONTY_EXP bit encoding . . . . . 229
Table 96. Input data structure . . . . . 229
Table 97. Resulting data structure . . . . . 229
Table 98. ECC_MUL bit encoding . . . . . 230
Table 99. Input data structure . . . . . 230
Table 100. Resulting data structure . . . . . 230
Table 101. ECC_MONTY_MUL bit encoding . . . . . 231
Table 102. Input data structure . . . . . 231
Table 103. Resulting data structure . . . . . 231
Table 104. GET_VAL instruction . . . . . 232
Table 105. AXI transfer type limitations . . . . . 238
Table 106. Configured AXI settings . . . . . 240
Table 107. Write response signals . . . . . 242
Table 108. AHB transfer type limitations . . . . . 243
Table 109. Relative priority example . . . . . 247
Table 110. System D specifications . . . . . 249
Table 111. System D operation . . . . . 249
Table 112. Out of range access parameters . . . . . 254
Table 113. NAND bank selection . . . . . 259
Table 114. NOR/SRAM bank selection . . . . . 260
Table 115. External memory address . . . . . 260
Table 116. FSMC asynchronous operating modes . . . . . 261
Table 117. Supported instruction set . . . . . 270
Table 118. Transmit descriptor words 0 through 3 (TDES0 - TDES3) . . . . . 293
Table 119. Transmit descriptor words 6 and 7 (TDES6 and TDES7) . . . . . 297
Table 120. Receive descriptor fields (RDES0 through RDES3) . . . . . 298
Table 121. Extended status - receive descriptor fields 4 (RDES4) . . . . . 301
Table 122. Time-stamp snapshot - receive descriptor fields 6 and 7 (RDES6 & RDES7) . . . . . 302
Table 123. AXI bridge DBI -> CDM / ELBI access details . . . . . 343
Table 124. Result of filtering rules applied to request TLPs and completion (CPL) TLPs: EP mode . . . . . 349
Table 125. Result of filtering rules to request TLPs and completions (CPL) TLPs: RC mode . . . . . 351
Table 126. Error message (Msg) format . . . . . 355
Table 127. Possible causes for typical errors . . . . . 357
Table 128. Message classes based on the message code . . . . . 358
Table 129. Message transmission . . . . . 360
Table 130. Message reception . . . . . 361
Table 131. Controlling the routing of received messages . . . . . 364
Table 132. Registers used for programming the iATU . . . . . 365
Table 133. PCIe core completion timeout ranges versus PCI express specification . . . . . 371
Table 134. p1_clk_osc selection truth table . . . . . 393
Table 135. UART interrupt summary with combined outputs . . . . . 401
Table 136. Meaning of modem input/output in DTE and DCE modes . . . . . 405
Table 137. Control bits to enable and disable hardware flow control . . . . . 407
Table 138. External CS selection . . . . . 418
Table 139. I2C definition of bits in first byte . . . . . 427
Table 140. ic_clk in relation to high and low counts . . . . . 434
Table 141. Triggering an interrupt from pin 2 . . . . . 453
Table 142. Block signals and external interconnection cross reference . . . . . 462
Table 143. Key-code table (hex values) . . . . . 464
Table 144. Mapping between external pins and PARDATAREG bits . . . . . 464
Table 145. RX_ERROR conditions, types, and actions . . . . . 475
Table 146. Wait loop . . . . . 476
Table 147. TX_ERROR conditions, types, and actions . . . . . 478
Table 148. Frame buffer support for palette load (PSS =1) . . . . . 488
Table 149. Frame buffer organization, PSS =0 or BPP = 16, 18, 24 bpp . . . . . 489
Table 150. Frame buffer organization, PSS =1, BPP = 1 bpp . . . . . 489
Table 151. Frame buffer organization, PSS =1, BPP = 2 bpp . . . . . 489
Table 152. Frame buffer organization, PSS =1, BPP = 4 bpp . . . . . 489
Table 153. Frame buffer organization, PSS =1, BPP = 8 bpp . . . . . 490
Table 154. LEB_LEP, Input FIFO Read Side bits [31:16] . . . . . 491
Table 155. LEB_LEP, Input FIFO Read Side bits [15:0] . . . . . 492
Table 156. BEB_BEP, Input FIFO Read Side bits [31:16] . . . . . 493
Table 157. BEB_BEP, Input FIFO Read Side bits [15:0] . . . . . 494
Table 158. LEB_BEP, Input FIFO Read Side bits [31:16] . . . . . 494
Table 159. LEB_BEP, Input FIFO Read Side bits [15:0] . . . . . 495
Table 160. Supported standards, profiles and levels . . . . . 513
Table 161. Deviations from the supported profiles and levels . . . . . 513
Table 162. Decoder interrupt register (SWREG1 OFFSET 0X4) . . . . . 515
Table 163. Post-processing interrupt register (swreg60 offset 0xf0) . . . . . 517
Table 164. H.264 / SVC decoder base layer features . . . . . 519
Table 165. MPEG-4 / H.263 / Sorenson Spark decoder features . . . . . 521
Table 166. MPEG-2 / MPEG-1 features . . . . . 523
Table 167. JPEG decoder features . . . . . 525
Table 168. VC-1 decoder features . . . . . 526
Table 169. RV decoder features . . . . . 528
Table 170. VP6 features . . . . . 530
Table 171. VP7/VP8 features . . . . . 532
Table 172. AVS features . . . . . 534
Table 173. DivX features . . . . . 536
Table 174. Post processor features . . . . . 537
Table 175. 64-bit data bus parameter divisibility requirements . . . . . 546
Table 176. 32-bit data bus parameter divisibility requirements . . . . . 546
Table 177. QCIF video frame luminance data pixel numbers . . . . . 549
Table 178. QCIF video frame luminance pixel data storage in raster-scan order. All pixels in a row are stored in consecutive memory locations . . . . . 549
Table 179. QCIF video frame luminance pixel data storage in tiled order. All pixels in a macroblock are stored in consecutive memory locations . . . . . 549
Table 180. Video stabilization features . . . . . 557
Table 181. Connectivity features . . . . . 557
Table 182. CAM interrupts summary . . . . . 560
Table 183. Maximum picture size according to data format and buffer size . . . . . 563
Table 184. VIP internal pins . . . . . 566
Table 185. Single link 16-bit data storing format . . . . . 567
Table 186. Single link 24-bit data storing format . . . . . 567
Table 187. Single link 32-bit data storing format . . . . . 567
Table 188. Dual link 16-bit data storing format . . . . . 568
Table 189. Dual link 24-bit data storing format . . . . . 568
Table 190. I2S interrupts . . . . . 570
Table 191. Channel configurations . . . . . 572
Table 192. Interrupt configurations with respect to interrupt pins . . . . . 573
Table 193. SPEAr1340 external interrupts . . . . . 575
Table 194. List of acronyms . . . . . 579
Table 195. Document revision history . . . . . 586
List of figures
Figure 1. SPEAr1340 simplified block diagram . . . . . 28
Figure 2. CortexA9 subsystem top level block diagram . . . . . 33
Figure 3. CORTEXA9INTEGRATION internal block diagram . . . . . 35
Figure 4. A9CS block diagram with CORTEXA9INTEGRATION . . . . . 39
Figure 5. Secure and non-secure interrupt priority formats . . . . . 46
Figure 6. SPEAr1340 block diagram with BUSMATRIX topology details . . . . . 49
Figure 7. RCG block diagram . . . . . 56
Figure 8. RCG integration in SPEAr1340 . . . . . 61
Figure 9. PLL overview . . . . . 63
Figure 10. X=1 , Y= 4 (duty cycle < 50 %) . . . . . 67
Figure 11. System clock controller . . . . . 69
Figure 12. AMBA clock generation . . . . . 70
Figure 13. A9SM clock domain . . . . . 71
Figure 14. GMAC clock generation . . . . . 73
Figure 15. I2S_M clock generation . . . . . 75
Figure 16. UART clock generation . . . . . 76
Figure 17. C3 clock generation . . . . . 76
Figure 18. CLCD clock generation . . . . . 77
Figure 19. MPMC clocks scheme . . . . . 79
Figure 20. Reset generator . . . . . 80
Figure 21. Reset waveform . . . . . 81
Figure 22. SPEAr1340 power islands . . . . . 85
Figure 23. Power states transition graph . . . . . 86
Figure 24. PCM block diagram . . . . . 96
Figure 25. PCM core block diagram . . . . . 99
Figure 26. Relevant PCM core interface timing . . . . . 99
Figure 27. Configuration funnel block diagram . . . . . 100
Figure 28. Configuration funnel selection flow graph . . . . . 101
Figure 29. Domain checker block diagram . . . . . 102
Figure 30. BootROM start-up sequence . . . . . 105
Figure 31. SYSROM memory map . . . . . 108
Figure 32. SYSRAM0 memory map . . . . . 109
Figure 33. System initialization . . . . . 110
Figure 34. SD/MMC card detection sequence . . . . . 119
Figure 35. BootROM flowchart . . . . . 125
Figure 36. Header authentication flow . . . . . 127
Figure 37. Default boot mode flow . . . . . 128
Figure 38. First stage secure boot process . . . . . 130
Figure 39. BootROM and RAM layout . . . . . 135
Figure 40. Boot image format . . . . . 139
Figure 41. BootROM on Core 1 . . . . . 142
Figure 42. GPT block diagram . . . . . 152
Figure 43. DMAC block diagram . . . . . 156
Figure 44. DMAC wrapper . . . . . 157
Figure 45. DMAC handshaking lines allocation . . . . . 158
Figure 46. Multiblock transfer using linked lists when DMAH_CHx_STAT_SRC set to true . . . . . 167
Figure 47. Multiblock transfer using linked lists when DMAH_CHx_STAT_SRC set to false . . . . . 167
Figure 48. Mapping of block descriptor (LLI) in memory to channel registers when DMAH_CHx_STAT_SRC set to True . . . . . 169
Figure 49. Mapping of block descriptor (LLI) in memory to channel registers when DMAH_CHx_STAT_SRC set to False . . . . . 170
Figure 50. Flowchart for DMA programming example . . . . . 173
Figure 51. Multi-block with linked address for source and destination . . . . . 180
Figure 52. Multi-block with linked address for source and destination where SARx and DARx between successive blocks are contiguous . . . . . 181
Figure 53. DMA transfer flow for source and destination linked list address . . . . . 182
Figure 54. Multi-block DMA transfer with source and destination address auto-reloaded . . . . . 184
Figure 55. DMA transfer flow for source and destination address auto-reloaded . . . . . 185
Figure 56. Multi-block DMA transfer with source address auto-reloaded and linked list destination address . . . . . 189
Figure 57. DMA transfer flow for source address auto-reloaded and linked list destination address . . . . . 190
Figure 58. Multi-block DMA transfer with source address auto-reloaded and contiguous destination address . . . . . 193
Figure 59. DMA transfer flow for source address auto-reloaded and contiguous destination address . . . . . 194
Figure 60. Multi-block DMA transfer with linked list source address and contiguous destination address . . . . . 198
Figure 61. DMA transfer for linked list source address and contiguous destination address . . . . . 198
Figure 62. C3 block diagram . . . . . 201
Figure 63. C3 channel architecture . . . . . 205
Figure 64. AES (MPCM) channel instruction set . . . . . 212
Figure 65. MPCM Core block RAM diagram . . . . . 214
Figure 66. MPCM vector tables . . . . . 215
Figure 67. THSENS block interface . . . . . 233
Figure 68. MPMC clocks scheme . . . . . 236
Figure 69. Multiport memory controller architecture . . . . . 238
Figure 70. AXI interface blocks . . . . . 241
Figure 71. Weighted round-robin priority group structure . . . . . 247
Figure 72. Memory controller memory map: maximum . . . . . 256
Figure 73. Alternate memory map . . . . . 257
Figure 74. FSMC and embedded MPU boundary . . . . . 258
Figure 75. SRAM asynchronous read access . . . . . 262
Figure 76. SRAM asynchronous write access . . . . . 262
Figure 77. SRAM asynchronous read access with FSMC_REn toggling . . . . . 263
Figure 78. SRAM asynchronous write access with FSMC_REn toggling . . . . . 263
Figure 79. NOR Flash asynchronous read access . . . . . 264
Figure 80. NOR Flash asynchronous write access . . . . . 264
Figure 81. NOR Flash asynchronous read access with FSMC_REn toggling . . . . . 265
Figure 82. NOR Flash asynchronous write access with FSMC_REn toggling . . . . . 265
Figure 83. Asynchronous read access with extended address . . . . . 266
Figure 84. Asynchronous write access with extended address . . . . . 266
Figure 85. SMI block diagram . . . . . 268
Figure 86. SD/SDIO/MMC Host controller block diagram . . . . . 274
Figure 87. Data transfer using DAT line sequence (not using DMA) . . . . . 278
Figure 88. Data transfer using DAT line sequence (using DMA) . . . . . 280
Figure 89. Data transfer using DAT line sequence (using ADMA) . . . . . 282
Figure 90. Synchronous abort sequence . . . . . 284
Figure 91. Data path synchronization . . . . . 285
Figure 92. CF/xD Host controller block diagram . . . . . 287
Figure 93. Transmitter descriptor fields - alternate (enhanced) format . . . . . 293
Figure 94. Transmit descriptor fetch (read) for alternate (enhanced) format . . . 293
Figure 95. Receive descriptor fields - alternate (enhanced) format . . . 297
Figure 96. Networked time synchronization . . . 303
Figure 97. System time update using fine method . . . 305
Figure 98. UHC block diagram . . . 318
Figure 99. USB open Host controller block diagram . . . 321
Figure 100. UOC module in SPEAr1340 . . . 324
Figure 101. PCIe port system block diagram . . . 327
Figure 102. PCIe integration in SPEAr1340 . . . 327
Figure 103. PCIe main interfaces . . . 329
Figure 104. DM core block diagram (with AHB/AXI bridge module) . . . 330
Figure 105. System level view of the PCIe AXI core . . . 332
Figure 106. PCIe AXI core top-level interfaces . . . 333
Figure 107. CXPL module block diagram . . . 334
Figure 108. XADM block diagram . . . 335
Figure 109. RADM block diagram . . . 337
Figure 110. LBC context . . . 340
Figure 111. LBC switch . . . 341
Figure 112. PCIe configuration space address map (per function) . . . 342
Figure 113. DBI access to LBC . . . 343
Figure 114. Receive TLP processing flow . . . 348
Figure 115. Default request TLP routing (assuming no TLPs with CA/CRS/UR completion status) . . . 353
Figure 116. Message transmission: EP mode . . . 359
Figure 117. Message transmission: RC mode . . . 360
Figure 118. Message reception: EP mode . . . 362
Figure 119. Message reception: RC mode . . . 363
Figure 120. iATU address region mapping: outbound and inbound (address match mode) . . . 366
Figure 121. iATU address region mapping: inbound (bar match mode) . . . 368
Figure 122. Relationship of power down states between link partners . . . 369
Figure 123. SATA block diagram . . . 374
Figure 124. Bus interface unit block diagram . . . 376
Figure 125. Transport layer functional block diagram . . . 381
Figure 126. Link layer functional block diagram . . . 385
Figure 127. Port power control module diagram . . . 388
Figure 128. MiPHY application diagram . . . 392
Figure 129. Reference clock selection circuitry . . . 394
Figure 130. SerDes clocks . . . 395
Figure 131. MiPHY functional block diagram . . . 396
Figure 132. MiPHY module in SPEAr1340 . . . 398
Figure 133. UART block diagram . . . 400
Figure 134. Hardware flow control between two similar devices . . . 406
Figure 135. Hardware flow control transfer diagram (start of transfer) . . . 406
Figure 136. Hardware flow control transfer diagram (end of transfer) . . . 407
Figure 137. UART/IrDA block diagram . . . 409
Figure 138. IrDA data modulation (3/16) . . . 410
Figure 139. UART character frame . . . 410
Figure 140. RXFIFO payload . . . 411
Figure 141. UART transfer bit diagram . . . 411
Figure 142. Baud rate divisor . . . 411
Figure 143. SSP block diagram . . . 413
Figure 144. I2C block diagram . . . 421
Figure 145. Master/slave and transmitter/receiver relationships . . . 423
Figure 146. Data transfer on the I2C bus . . . 425
Figure 147. START and STOP condition . . . 426
Figure 148. 7-bit address format . . . 426
Figure 149. 10-bit address format . . . 427
Figure 150. Master-transmitter protocol . . . 428
Figure 151. Master-receiver protocol . . . 429
Figure 152. START BYTE transfer . . . 429
Figure 153. Multiple master arbitration . . . 430
Figure 154. Multi-master clock synchronization . . . 431
Figure 155. I2C master implementing tHD;DAT when IC_SDA_HOLD = 3 . . . 436
Figure 156. Breakdown of DMA transfer into burst transactions . . . 438
Figure 157. Breakdown of DMA transfer into single and burst transactions . . . 438
Figure 158. Case 1 watermark levels . . . 439
Figure 159. Case 2 watermark levels . . . 440
Figure 160. I2C Receive FIFO . . . 441
Figure 161. Burst transaction – pclk = hclk . . . 442
Figure 162. Back-to-back burst transaction – hclk = 2*pclk . . . 442
Figure 163. Single transaction . . . 444
Figure 164. Burst transaction + 3 back-to-back singles – hclk = 2*pclk . . . 444
Figure 165. GPIOA and GPIOB block diagram . . . 450
Figure 166. GPIO detailed block diagram . . . 451
Figure 167. GPIO interrupt registers . . . 453
Figure 168. Example to write to address 0x098 . . . 456
Figure 169. Example to read from address 0x0C4 . . . 456
Figure 170. XGPIO block diagram . . . 457
Figure 171. Mapping of XGPIO40 pad to XGPIO registers . . . 458
Figure 172. Interrupt detection logic on XGPIOs . . . 459
Figure 173. Keyboard controller block diagram . . . 461
Figure 174. Timing diagram of ADC conversions . . . 466
Figure 175. PWM block diagram . . . 469
Figure 176. Output pulse generation example (Duty = 3, Period = 7) . . . 471
Figure 177. CEC block diagram . . . 472
Figure 178. CEC control logic . . . 474
Figure 179. Example: a complete message reception . . . 475
Figure 180. Example: RX_ERROR . . . 476
Figure 181. Example: a complete message transmission . . . 477
Figure 182. Example: a TX_ERROR . . . 478
Figure 183. Quanta counter timing . . . 479
Figure 184. Bit shaping logic timing . . . 480
Figure 185. Message description . . . 481
Figure 186. Bit timing . . . 481
Figure 187. Signal-free time . . . 482
Figure 188. Arbitration phase . . . 482
Figure 189. Bit error . . . 482
Figure 190. LCD controller block diagram . . . 484
Figure 191. A single overlay window over a background graphics window . . . 498
Figure 192. GPU top level block diagram . . . 499
Figure 193. GPU functional block diagram . . . 501
Figure 194. The GPU software architecture . . . 505
Figure 195. Typical 3D graphics flow . . . 506
Figure 196. Geometry processor data structure . . . 507
Figure 197. Pixel processor data structure . . . 508
Figure 198. 2D graphics process flow . . . 509
Figure 199. GPU image filter process flow . . . 510
Figure 200. Typical graphics pipeline flow . . . 511
Figure 201. Decoder functional block diagrams . . . 512
Figure 202. Video decoder detailed block diagram . . . 518
Figure 203. H.264 decoder initialization . . . 519
Figure 204. H.264 / SVC decoder basic process . . . 520
Figure 205. MPEG-4 / H.263 / Sorenson Spark decoder initialization . . . 521
Figure 206. MPEG-4 / H.263 / Sorenson Spark decoder basic process . . . 522
Figure 207. MPEG-2 / MPEG-1 decoder initialization . . . 523
Figure 208. MPEG-2 / MPEG-1 decoder basic process . . . 524
Figure 209. JPEG decoder basic process . . . 525
Figure 210. VC-1 decoder initialization . . . 526
Figure 211. VC-1 decoder basic process . . . 527
Figure 212. RV decoder initialization . . . 528
Figure 213. RV decoder basic process . . . 529
Figure 214. VP6 decoder initialization . . . 530
Figure 215. VP6 decoder basic process . . . 531
Figure 216. VP7/VP8 decoder initialization . . . 532
Figure 217. VP7/VP8 decoder basic process . . . 533
Figure 218. AVS decoder initialization . . . 534
Figure 219. AVS decoder basic process . . . 535
Figure 220. Data flow and functional block diagram - standalone mode . . . 539
Figure 221. Data flow and functional block diagram - combined mode . . . 540
Figure 222. Post processor flowchart - standalone mode . . . 541
Figure 223. Post processor flowchart - combined mode . . . 543
Figure 224. External memory use . . . 548
Figure 225. YCbCr 4:2:0 planar video frame storage format . . . 550
Figure 226. YCbCr 4:2:0 semi-planar video frame storage format . . . 551
Figure 227. YCbCr 4:2:2 interleaved video frame storage format . . . 552
Figure 228. AYCbCr 4:4:4 interleaved video frame storage format . . . 553
Figure 229. RGB 16bpp video frame storage format . . . 553
Figure 230. RGB 32bpp video frame storage format . . . 554
Figure 231. Encoder functional block diagram . . . 555
Figure 232. Stabilization picture dimensions . . . 557
Figure 233. CAM block diagram . . . 559
Figure 234. External VSYNC and HSYNC synchronization . . . 562
Figure 235. ITU656 embedded synchronization . . . 563
Figure 236. Video input block diagram . . . 565
Figure 237. I2S block diagram . . . 569
1 Device overview
The SPEAr1340 device is a system-on-chip belonging to the SPEAr® (Structured Processor
Enhanced Architecture) family of embedded microprocessors. The product is suited to
consumer and professional applications that require an advanced human-machine interface
(HMI) combined with high performance, such as low-cost tablets, thin clients,
media phones, and industrial/printer smart panels.
The device hardware supports both real-time (RTOS) and high-level
(HLOS) operating systems, such as Android, Linux, and Windows Embedded Compact 7.
The architecture of SPEAr1340 is based on several internal components, communicating
through a multilayer interconnection matrix (BUSMATRIX). This switching structure enables
different data flows to be carried out concurrently, improving the overall platform efficiency.
In particular, high-performance master agents are directly interconnected with the DDR
memory controller in order to reduce access latency. The overall memory bandwidth
assigned to each master port can be programmed and optimized through an internal
weighted round-robin (WRR) arbitration scheme.
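The effect of weighted round-robin arbitration on bandwidth allocation can be illustrated with a short sketch. This is a generic model of the technique, not the SPEAr1340 implementation: the port names and weights are hypothetical, every port is assumed to have a request pending at all times, and the programming interface for the actual weights is not described in this section.

```python
from itertools import islice


def wrr_schedule(weights):
    """Return an endless iterator of port names in weighted round-robin order.

    Assumes every port always has a request pending (saturated traffic).
    """
    if not weights or any(w < 1 for w in weights.values()):
        raise ValueError("each port needs an integer weight >= 1")

    def grants():
        while True:
            # One full round: each port receives as many grants as its weight.
            for port, weight in weights.items():
                for _ in range(weight):
                    yield port

    return grants()


def bandwidth_share(weights):
    """Fraction of grants each port receives over one full round."""
    total = sum(weights.values())
    return {port: weight / total for port, weight in weights.items()}


# Hypothetical master ports and weights, chosen only for illustration.
EXAMPLE_WEIGHTS = {"cpu": 4, "gpu": 2, "display": 2}

if __name__ == "__main__":
    print(list(islice(wrr_schedule(EXAMPLE_WEIGHTS), 8)))
    print(bandwidth_share(EXAMPLE_WEIGHTS))
```

Under these assumptions, doubling a port's weight relative to the others doubles the number of grants it receives per round, which is how a programmed weight translates into a share of memory bandwidth.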
Figure 1 on page 28 is the internal connectivity block diagram.
Table 1 on page 29 lists device features and capabilities.
Table 2 on page 32 lists device IP groups and their constituent IPs.
1.1 Simplified block diagram
Figure 1. SPEAr1340 simplified block diagram
[Block diagram. Labeled groups and blocks: MPCore (CPU0 and CPU1, each a Cortex-A9 CPU with
FPU, PTM, 32 KB ICache, 32 KB DCache, and a timer and watchdog; SCU with snoop filtering and
cache transfers; global timer; interrupt controller; ACP; AXI bus master 0 and master 1);
512 KB L2 cache; CoreSight (JTAG, trace); Memory (BootROM, SRAMs, DDR2/3 controller, static
memory controller, serial memory I/F, memory card I/F); High-speed connectivity (two USB 2.0
host controllers, USB 2.0 OTG controller, USB PHYs, Giga/Fast Ethernet controller, PCIe
controller, SATA controller, PHY); Graphics, video, audio (2D/3D GPU, video decoder, video
encoder, display controller, camera I/F (4x), video input, I2S audio I/F (8 in, 8 out),
S/PDIF audio I/F); Low-speed connectivity (GPIO, XGPIO, I2C (2x), SSP, UART (2x), KBD,
CEC (2x), ADC, PWM (4x)); reset and clock generator, THSENS, OTP, power control,
configuration registers, DMA controller (2x), timers, security coprocessor, and RTC with
optional battery. All blocks connect through the BUSMATRIX interconnect.]
1.2 Summary of features
Table 1. Summary of SPEAr1340 features
Category | Features | Details
Cortex A9 subsystem | CPU cores | ARM Cortex A9 with FPU, dual-core, up to 600 MHz; 32 KB L1 ICache per core; 32 KB L1 DCache per core
Cortex A9 subsystem | L2 Cache | 512 KB, shared
Cortex A9 subsystem | Debug & trace | Coresight sub system, 2 x PTM debug I/F
Cortex A9 subsystem | Other features | Shared interrupt controller (GIC); 1x 64-bit global timer; 2x 32-bit timers (one per core); 2x watchdog/timers (one per core); snoop control unit; ACP
Interconnect | Multilayer bus matrix | up to 166 MHz
System-level | Reset and clock generation | -
System-level | System configuration registers (MISC) | -
System-level | One-time programmable antifuse | 510 + 209 bits
System-level | Temperature sensor | -
System-level | DMA controllers | 2 x DMAC modules, total 16 channels
System-level | General purpose timers | 2 x GPT modules, total 8 timers (4 with capture mode)
System-level | Real-time clock | -
System-level | Power control module | -
System-level | Security co-processor | HW acceleration for DES, 3DES, AES, universal hashing, SHA1/2, MD5, HMAC PKA, True_RNG
Internal / external memories | BootROM | 32 KB; stores resident bootstrap firmware
Internal / external memories | System SRAM | 32 KB
Internal / external memories | Always-on SRAM | 4 KB
Internal / external memories | DDR controller | DDR2-1066/DDR3-1066, up to 533 MHz; 16-/32-bit; up to 2 GB address space
Internal / external memories | Static memory controller | 16-bit interface; supports NAND Flash, parallel NOR Flash, static RAM
Internal / external memories | Serial Flash controller | Supports serial NOR Flash, up to 2 banks, 16 MB each
Internal / external memories | Memory card interface | Supported standards: SD/SDIO 2.0; SDHC; MMC 4.x; CF/CF+ 4.1; xD
Graphics, video & audio | 2D/3D graphics processing unit | ARM MALI 200
Graphics, video & audio | Video decoder | Supported standards: H.264 1080p; MPEG-1/2/4 1080p; H.263 SD; Sorenson Spark 1080p; WMV9/VC-1 1080p; RealVideo; DivX; VP6, VP7, VP8, AVS; JPEG 67 Mpixels
Graphics, video & audio | Video encoder | H.264 1080p; JPEG 64 Mpixels
Graphics, video & audio | Display controller | Up to 24 bpp, 1920x1080 @60 fps; embedded PWM
Graphics, video & audio | Camera input interfaces | -
Graphics, video & audio | Video input parallel port | -
Graphics, video & audio | I2S digital audio interfaces | 2 modules for total 8 x input + 8 x output channels
Graphics, video & audio | SPDIF digital audio interface | -
High-speed connectivity | USB 2.0 host controllers | 2 x USB 2.0 host ports
High-speed connectivity | USB OTG controller | -
High-speed connectivity | Ethernet controller | 1 x Giga/Fast Ethernet port (external GMII/RGMII/MII/RMII PHY)
High-speed connectivity | PCI Express controller | 1 port, alternative to SATA
High-speed connectivity | SATA gen-2 controller | 1 port, alternative to PCIe
High-speed connectivity | PCIe/SATA physical interface | -
Low-speed connectivity | General purpose IOs | 2 modules, total 16 IOs
Low-speed connectivity | Extended general purpose IOs | -
Low-speed connectivity | I2C bus controllers | 2 ports, master/slave
Low-speed connectivity | Synchronous serial port | Master/slave, 4 chip select signals
Low-speed connectivity | Asynchronous serial ports | 2 x UART ports, also IrDA capable
Low-speed connectivity | Keyboard controller | 6x6 matrix
Low-speed connectivity | HDMI/CEC | 2 x interfaces
Low-speed connectivity | Analog-to-digital converter | 10-bit, 1 Msps, 8 channels; also suitable for resistive touchscreen interface
Low-speed connectivity | Pulse width modulators | 4 x PWM outputs
1.3 IP groups
Table 2. SPEAr1340 IP groups
IP group | Constituent IPs
Overview, processors, & busses | CPU subsystem (A9SM); Multilayer interconnect matrix (BUSMATRIX)
General device resources | BootROM; Direct memory access controllers (DMAC); General purpose timers (GPT); One-time programmable antifuse (OTP); Power control module (PCM); Reset and clock generator (RCG); Real-time clock (RTC); Cryptographic co-processor (C3); Static RAMs (SRAM); System configuration registers (MISC); Temperature sensor (THSENS)
Memory interfaces | Multiport DDR2/3 controller (MPMC); Memory card interface (MCIF); Serial NOR Flash controller (SMI); Static memory controller (FSMC)
Graphics, video, & audio | Camera input interfaces (CAM); Display controller (CLCD); Graphics processing unit (GPU); I2S digital audio interfaces (I2S); S/PDIF digital audio ports; Video decoder (VDEC); Video encoder (VENC); Video input parallel port (VIP)
High-speed connectivity | Giga/Fast Ethernet controller (GMAC); PCI express controller (PCIe); Serial ATA controllers (SATA); SATA/PCIe physical interface (MiPHY); USB 2.0 host controllers (UHC); USB OTG controller (UOC)
Other connectivity | A/D converter (ADC); Asynchronous serial port (UART0); Extended general purpose I/O (XGPIO); General purpose I/O (GPIOA-B); HDMI CEC interfaces (CEC); I2C bus controllers (I2C); Keyboard controller (KBD); PWM generators (PWM); Synchronous serial port (SSP)
2 CPU subsystem (A9SM)
This chapter focuses on the A9SM functionality and operation.
For the A9SM feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
2.1 Overview
The CPU subsystem is based on the ARM Cortex A9 processor, and has a dual core
configuration.
Figure 2 shows the main blocks of the CPU subsystem:
● a dual Cortex A9 core (CORTEXA9INTEGRATION)
● a CoreSight subsystem (A9CS)
● a clock manager (CMR)
● an L2 cache controller (PL310)
and the bus interfaces:
● two AXI masters for PL310 BUSMATRIX connections
● an AXI slave for the accelerator coherency port (ACP)
● two APB slave interfaces; one for access to the clock manager, and one for access to
the internal CoreSight components
Figure 2. CortexA9 subsystem top level block diagram
[Block diagram. Blocks: CORTEXA9INTEGRATION, CoreSight subsystem (A9CS), L2 cache controller
(PL310), clock manager (CMR). Interfaces: AXI-ACP (Q0), AXI slave for the accelerator
coherency port; APB-SYS (B0), APB slave interface for access to the internal CoreSight
components; APB-CMR (B1), APB slave interface for access to the clock manager; AXI-M0 (10)
and AXI-M1 (20), AXI masters for PL310 BUSMATRIX connections.]
2.2 Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
2.3 Clocks
Note: This section gives a general presentation of the A9SM clocks. For more details, refer to
Chapter 5: Reset and clock generator (RCG).
Although all A9SM clocks are theoretically synchronous, for implementation reasons they
are classified into different scopes.
● CLK_CORE and PERIPHCLK are considered fully asynchronous with respect to the
other A9SM clocks. This provides better tolerance to on-chip variations (OCV).
The clock tree from CLK_CORE to CORTEXA9INTEGRATION must be as short as
possible.
● ATCLK, PCLKDBG, TRACECLKIN, DAPCLK, CTMCLK, and CTICLK are the clocks inside
A9CS. These clocks must be considered synchronous and equal to the main A9CS
clock (ATCLK).
● TRACECLK (output) is one-half of TRACECLKIN.
Clock gating is present to reduce power consumption.
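The relationships stated above can be checked against the nominal frequencies listed in Table 3. The following is a minimal sketch of such a consistency check: the dictionary values are copied from Table 3, while the function and constant names are illustrative only. DAPCLK is left out because Table 3 gives no frequency for it.

```python
# Nominal frequencies in MHz, copied from Table 3.
A9SM_CLOCKS_MHZ = {
    "CLK1GHz": 1200,
    "CLK_CORE": 600,
    "PERIPHCLK": 250,
    "ATCLK": 250,
    "CTICLK": 250,
    "CTMCLK": 250,
    "TRACECLKIN": 250,
    "PCLKDBG": 250,
    "PCLKSYS": 250,
    "TRACECLK": 125,
    "PCLKDBG_SOC": 100,
    "ATCLK_SOC": 166,
    "ACLKS_SOC": 166,
}

# A9CS clocks that the text says must equal the main A9CS clock (ATCLK).
# DAPCLK is omitted because Table 3 lists no frequency for it.
A9CS_SYNC_CLOCKS = ("PCLKDBG", "TRACECLKIN", "CTMCLK", "CTICLK")


def check_a9cs_clocks(clocks):
    """Return a list of rule violations; the list is empty if all rules hold."""
    problems = []
    for name in A9CS_SYNC_CLOCKS:
        if clocks[name] != clocks["ATCLK"]:
            problems.append(f"{name} differs from ATCLK")
    # TRACECLK (output) is one-half of TRACECLKIN.
    if clocks["TRACECLK"] * 2 != clocks["TRACECLKIN"]:
        problems.append("TRACECLK is not one-half of TRACECLKIN")
    return problems


if __name__ == "__main__":
    print(check_a9cs_clocks(A9SM_CLOCKS_MHZ))
```

A check of this kind is only a documentation aid; the actual clock configuration is described in Chapter 5: Reset and clock generator (RCG).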
Table 3. CortexA9 subsystem clocks
Clock | Frequency (MHz) | Block | Type | A9SM
CLK1GHz | 1200 | CMR | Input | External
CLK_CORE | 600 | CORTEXA9INTEGRATION | Input | Internal
PERIPHCLK | 250 | CORTEXA9INTEGRATION | Input | Internal
ATCLK | 250 | A9CS | Input | Internal
CTICLK | 250 | A9CS | Input | Internal
CTMCLK | 250 | A9CS | Input | Internal
TRACECLKIN | 250 | A9CS | Input | Internal
PCLKDBG | 250 | A9CS | Input | Internal
PCLKSYS | 250 | A9CS | Input | Internal
TRACECLK | 125 | A9CS | Output | External
PCLKDBG_SOC | 100 | A9CS | Input | External
ATCLK_SOC | 166 | A9CS | Input | External
ACLKS_SOC | 166 | CORTEXA9INTEGRATION | Input | External
2.4 Interrupts
Refer to Section 2.5.7: Generic interrupt controller (GIC) and Appendix A: Interrupts.
2.5 Functional description
This section describes the main blocks and functions of the CPU subsystem.
2.5.1 CORTEXA9INTEGRATION
The CORTEXA9INTEGRATION comprises:
● two Cortex-A9 processors in a cluster and a snoop control unit (SCU) that ensures
coherency within the cluster
● a global timer (GTIM)
● a private timer and watchdog unit per processor (WDTIM)
● a generic interrupt controller (GIC) with 128 dedicated external lines
● a second master port with programmable address filtering capability (disabled at reset)
● an accelerator coherency port (ACP) suitable for coherent memory transfers
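As an illustration of what 128 external interrupt lines mean for software, the sketch below maps an external line number to a GIC interrupt ID. It assumes the standard ARM GIC numbering, in which IDs 0 to 15 are software-generated interrupts (SGI), IDs 16 to 31 are private peripheral interrupts (PPI), and shared peripheral interrupts (SPI) start at ID 32, and it assumes that external line 0 corresponds to the first SPI. The function names are illustrative; the actual source-to-line assignment is given in Appendix A: Interrupts.

```python
NUM_EXTERNAL_LINES = 128  # number of dedicated external lines on this GIC
FIRST_SPI_ID = 32         # assumption: standard GIC layout, SPIs start at ID 32


def external_line_to_gic_id(line):
    """Map an external interrupt line (0..127) to its assumed GIC interrupt ID."""
    if not 0 <= line < NUM_EXTERNAL_LINES:
        raise ValueError(f"external line {line} out of range")
    return FIRST_SPI_ID + line


def classify_gic_id(gic_id):
    """Return the interrupt class for a GIC interrupt ID under the assumed layout."""
    if 0 <= gic_id < 16:
        return "SGI"
    if 16 <= gic_id < FIRST_SPI_ID:
        return "PPI"
    if FIRST_SPI_ID <= gic_id < FIRST_SPI_ID + NUM_EXTERNAL_LINES:
        return "SPI"
    raise ValueError(f"interrupt ID {gic_id} not implemented in this sketch")


if __name__ == "__main__":
    print(external_line_to_gic_id(0), external_line_to_gic_id(127))
```

Under these assumptions the external lines occupy interrupt IDs 32 through 159.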
Figure 3. CORTEXA9INTEGRATION internal block diagram
[Block diagram. Two Cortex-A9 MPCore processors (CPU0 and CPU1), each with a timer and
watchdog, connect through instruction, data, and coherency buses to slave ports 0 and 1 of
the snoop control unit (SCU). The SCU contains four tag RAMs per CPU, tag control, snoop
filtering, and cache-to-cache transfers. Also shown: the general timer, the interrupt
controller, Master 0 and Master 1 (each an AXI RW 64-bit bus), and the accelerator coherency
port (ACP, AXI RW 64-bit bus).]
CORTEXA9INTEGRATION and PL310
CORTEXA9INTEGRATION and PL310 support full parity error detection.
Table 4 lists the CORTEXA9INTEGRATION and PL310 configuration parameters. An RTL
parameter is a register transfer level (RTL) static parameter (not changeable at runtime). A
PIN parameter is a signal available at the CortexA9 subsystem boundary that is used to
impose a predefined behavior (primarily to determine reset values).
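As a worked check of a few Table 4 entries, the sketch below derives the L2 cache size from the PL310 WAYSIZE pin and the number of ways, and the AXI ID widths from pl310_AXI_ID_MAX. The way-size encoding used here (16 KB for code 0b001, doubling with each step up to 512 KB for 0b110) is an assumption taken from the standard PL310 encoding; Table 4 itself only confirms that 3'b011 means 64 KB. Function names are illustrative.

```python
def pl310_way_size_kb(waysize_code):
    """Decode the 3-bit WAYSIZE code into a way size in KB.

    Assumed encoding: 0b001 = 16 KB, 0b010 = 32 KB, ... 0b110 = 512 KB.
    """
    if not 0b001 <= waysize_code <= 0b110:
        raise ValueError("reserved WAYSIZE code")
    return 8 << waysize_code


def l2_cache_size_kb(num_ways, waysize_code):
    """Total L2 cache size = number of ways x way size."""
    return num_ways * pl310_way_size_kb(waysize_code)


def axi_id_widths(axi_id_max):
    """AXI ID widths as stated in Table 4 for pl310_AXI_ID_MAX."""
    return {"slave": axi_id_max + 1, "master": axi_id_max + 3}


if __name__ == "__main__":
    # Table 4 values: pl310_NB_WAYS = 8, WAYSIZE = 3'b011, pl310_AXI_ID_MAX = 4
    print(l2_cache_size_kb(8, 0b011), "KB")
    print(axi_id_widths(4))
```

With the Table 4 values, eight ways of 64 KB give 512 KB, which agrees with the shared L2 cache size listed in Table 1.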
Table 4. CORTEXA9INTEGRATION and PL310 configuration parameters
Type | Name | Value | Description
CORTEXA9INTEGRATION
RTL | CORE_NUM | 2 | 2 CPUs in system
RTL | MP_MODE | YES | Multiprocessor system (enable SCU)
RTL | ACP_PRESENT | YES | ACP port present
RTL | MASTER_NUM | 2 | Number of AXI masters
RTL | INT_NUM | 128 | Number of external interrupt lines
RTL | POWER_DOMAIN_WRAPPER | NO | Internal power-down feature not enabled
RTL | PTM_INTERFACE_PRESENT | YES | PTM interface present for each CPU (for CS)
RTL | PARITY | YES | Parity fail signal generation enabled on all internal RAMs
RTL | PRELOAD_ENGINE_PRESENT | YES | Preload engine present
RTL | PRELOAD_ENGINE_FIFO_SIZE | 8 | Number of entries in the preload engine
PIN | CFGSDISABLE | zero | No restrictions in writing GIC registers in secure mode at reset
[CPU0/1]
RTL | DCACHESIZE | 32 K | Data cache size in bytes
RTL | ICACHESIZE | 32 K | Instruction cache size in bytes
RTL | TLBSIZE | 128 | Number of TLB entries
RTL | JAZELLE_PRESENT | YES | Java processor present
RTL | FPU_PRESENT | YES | Vectorized Floating Point Unit present
RTL | NEON_PRESENT | NO | NEON instruction set not available
PIN | CFGEND[1:0] | 0x0 | Little endian at reset
PIN | CFGNMFI[1:0] | 0x0 | NMFI bit in the CP15 c1 control register set to 0 at reset
PIN | CP15SDISABLE[1:0] | 0x0 | No restrictions in CP15 access at reset
PIN | VINITHI[1:0] | 0x3 | High vector table at reset
PIN | TEINIT | 0x0 | Default exception handling state at reset: ARM
PL310
RTL | pl310_PARITY | YES | Parity fail signal generation enabled on all internal RAMs
RTL | pl310_S1 | YES | 2 AXI slaves
RTL | pl310_M1 | YES | 2 AXI masters
RTL | pl310_AXI_ID_MAX | 4 | AXI ID width on the PL310 slave ports: pl310_AXI_ID_MAX+1; AXI ID width on the PL310 master ports: pl310_AXI_ID_MAX+3
RTL | pl310_LOCKDOWN_BY_MASTER | YES | Enable lockdown by master support
RTL | pl310_LOCKDOWN_BY_LINE | YES | Enable lockdown by line support
RTL | pl310_ADDRESS_FILTERING | YES | Address filtering on 2nd AXI master enabled
RTL | pl310_TAG_SETUP_LAT | 0 | Setup time for tag RAM = 0 core clock cycles
RTL | pl310_TAG_READ_LAT | 1 | Tag RAM read latency = 1 core clock cycle
RTL | pl310_TAG_WRITE_LAT | 1 | Tag RAM write latency = 1 core clock cycle
RTL | pl310_DATA_SETUP_LAT | 0 | Setup time for data RAM = 0 core clock cycles
RTL | pl310_DATA_READ_LAT | 2 | Data RAM read latency = 2 core clock cycles
RTL | pl310_DATA_WRITE_LAT | 2 | Data RAM write latency = 2 core clock cycles
RTL | pl310_NB_WAYS | 8 | Number of ways = 8
RTL | pl310_SPECULATIVE_READ | YES | Enable the capability of emitting speculative reads
RTL | pl310_DATA_BANKING | NO | Allow data reorganization in banks
PIN | CFGBIGEND | zero | Little endian at reset
PIN | WAYSIZE | 3'b011 | Way size of 64 KB
2.5.2 A9 CoreSight subsystem (A9CS)
The A9CS is dedicated to debugging and tracing. It is a modular and fully customizable
subsystem. In normal functional mode, A9CS is powered off.
These are the main A9CS components:
AMBA advanced trace bus (ATB)
The ATB transfers trace data through CoreSight infrastructure in a SoC. Trace sources are
ATB masters, and sinks are ATB slaves. Link components provide both master and slave
interfaces.
Trace port interface unit (TPIU)
The TPIU is an ATB slave that drains trace data off the chip. It acts as a bridge between the
on-chip trace data and a data stream that is captured by a Trace Port Analyzer (TPA). The
Formatter within the TPIU combines the source data and IDs into a single data stream, to
enable serialization of data, inserting trigger packets on trigger detection.
Embedded trace buffer (ETB)
The ETB is an ATB slave and provides on-chip storage of trace data using a configurable
sized RAM. The ETB accepts trace data from CoreSight trace source components through
an AMBA trace bus (ATB). The Formatter in the ETB combines the source data and IDs into
a single data stream. The Formatter operates in an identical manner to the Formatter in the
TPIU. In this implementation, the ETB size is 8 Kbytes.
Program trace macrocell (PTM)
The PTM for the Cortex-A9 processor is a module that performs real-time instruction flow
tracing based on the Program Flow Trace (PFT) architecture. The PTM-A9 generates
information that trace tools use to reconstruct the execution of all or part of a program.
Cross trigger interface (CTI)
The CTI combines and maps the trigger requests, and broadcasts them to all other
interfaces on the ECT as channel events. When the CTI receives a channel event it maps
this onto a trigger output. This enables subsystems to cross trigger with each other. The
receiving and transmitting of triggers is performed through the trigger interface.
Cross trigger matrix (CTM)
This block controls the distribution of channel events. It provides Channel Interfaces (CIs) for
connection to either CTIs or CTMs. This enables multiple CTIs to be linked together.
Debug Access Port (DAP)
The DAP comprises a number of components supplied in a single configuration. All the
supplied components fit into the various architectural components for Debug Ports (DPs),
which are used to access the DAP from an external debugger and Access Ports (APs), to
access on-chip system resources. The debug port and access ports together are referred to
as the DAP. The DAP provides real-time access by the debugger software to the JTAG scan
chains in the chip and to all debug and trace configuration registers. For multicore systems,
debug access is maintained even if one core is powered down or asleep.
Debug Access Port ROM table (DAP ROM)
The DAP provides an internal ROM table connected to the master Debug APB port of the
APB Mux. The ROM table stores the locations of the components on the Debug APB. The
ROM table is a read-only device; writes are ignored.
Figure 4 is the block diagram for both A9CS and CORTEXA9INTEGRATION.
An active power up request of the debug domain must be applied to the APB-CMR. This
request can be done either by the processor or by the debug access port (DAP) through the
JTAG interface.
All CoreSight peripherals are mapped within a space of 128 Kbytes. CoreSight components
are mapped within the memory space of the system and each one has a 4-Kbyte address
space reserved. Table 5 provides the complete list.
The two ROM tables, DAPROM and CortexA9ROM (see Table 5: A9CS memory map, and
DAPROM register details in RM0089, Reference manual, SPEAr1340 address map and
registers) contain all the entries required to perform a topology detection of the system by
reading their contents. Starting from the DAPROM, it is possible to follow the link to the
CortexA9ROM.
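Figure 4.
A9CS block diagram with CORTEXA9INTEGRATION
[Block diagram: inside CORTEXA9INTEGRATION, CPU0 and CPU1 each connect to a PTM (PTM0, PTM1) and a CTI (CTI0, CTI1), with a CTM and the CORTEXA9 ROM. Inside A9CS, the PTM trace streams travel over ATB links through the FUNNEL ETB and FUNNEL TPIU to the ETB and to the TPIU, which drives the trace port; A9CS also contains a CTI, a CTM, the DAP with its JTAG interface, and the DAP ROM. APB-SYS provides system access.]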
In both ROM tables, each entry has the following fields:
●
Bit 0: component present (1) or not (0)
●
Bit 1: component with 32 bit (1) or 8 bit (0) data
●
Bit 11-2: always 0
●
Bit 31-12: base address of the component.
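The field layout above can be unpacked with a few masks. The following is a minimal sketch in C, not code taken from this manual: the type and function names are hypothetical, and it takes the field description at face value (bits 31-12 are returned in place as a 4-Kbyte aligned value, without deciding whether a given table stores an absolute address or an offset).

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of one 32-bit ROM table entry (names are illustrative). */
typedef struct {
    bool     present;      /* bit 0: component present (1) or not (0) */
    bool     is_32bit;     /* bit 1: 32-bit (1) or 8-bit (0) data     */
    uint32_t base_address; /* bits 31-12, kept in place (4 KB aligned) */
} rom_entry_t;

/* Split a raw ROM table word into its fields; bits 11-2 are ignored. */
rom_entry_t rom_entry_decode(uint32_t raw)
{
    rom_entry_t e;
    e.present      = (raw & 0x00000001u) != 0u;
    e.is_32bit     = (raw & 0x00000002u) != 0u;
    e.base_address = raw & 0xFFFFF000u;
    return e;
}
```

A topology scan would typically read successive entries, skip those with the present bit cleared, and visit each remaining base address.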
The system designer must define the external ROM table; for this purpose, a dedicated
input at the A9SM boundary is provided: EXTROMTABLEOFFSET[31:12] and
EXTROMTABLEOFFSETV.
Table 5.
A9CS memory map
CoreSight component | Base address | OFFSET from DAPROM
DAPROM | 0xE0780000 | 0x00000
TPIU | 0xE0781000 | 0x01000
CTI | 0xE0782000 | 0x02000
ETB | 0xE0783000 | 0x03000
FUNNEL TPIU | 0xE0784000 | 0x04000
FUNNEL ETB | 0xE0785000 | 0x05000
RESERVED | 0xE0786000 | 0x06000
CORTEXA9 ROM | 0xE07A0000 | 0x20000
RESERVED | 0xE07A1000 | 0x21000
CORE0 CP14 | 0xE07B0000 | 0x30000
CORE0 PMU | 0xE07B1000 | 0x31000
CORE1 CP14 | 0xE07B2000 | 0x32000
CORE1 PMU | 0xE07B3000 | 0x33000
RESERVED | 0xE07B4000 | 0x34000
CORE0 CTI | 0xE07B8000 | 0x38000
CORE1 CTI | 0xE07B9000 | 0x39000
RESERVED | 0xE07BA000 | 0x3A000
CORE0 PTM | 0xE07BC000 | 0x3C000
CORE1 PTM | 0xE07BD000 | 0x3D000
RESERVED | 0xE07BE000 | 0x3E000
2.5.3
Clock manager (CMR)
The clock manager is the block that takes the clock coming from the system (the PLL is
outside of A9SM) and divides it into clock signals for each internal block.
To enable the A9CS clocks, you can either:
●
program the CMR register through the APB-CMR interface
–or–
●
use the signals provided internally and managed by the CMR itself to enable the A9CS
clocks through the JTAG interface
To disable the A9CS clocks, use only the first of the above methods.
You can use the clock manager to control the clock gating of the debug part, for instance the
CoreSight subsystem (A9CS). This is the role of the APB-CMR interface, a standard APB3
interface whose main task is to provide a bus interface for A9CS clock enable/disable (for
more detail, see Clock manager registers in RM0089, Reference manual, SPEAr1340
address map and registers).
2.5.4
Snoop control unit (SCU)
The SCU connects the two Cortex-A9 processors to the memory system through the AXI
interfaces.
The SCU functions are to:
●
maintain data cache coherency between the Cortex-A9 processors
●
initiate L2 AXI memory accesses
●
arbitrate between Cortex-A9 processors requesting L2 accesses
●
manage ACP accesses
Note:
The A9 SCU does not support hardware management of coherency of the instruction cache.
Address filtering
In the two-master port configuration, the SCU can be given an address range that redirects
all memory transactions within the range to the second master port. The SCU routes all
other memory transactions to the first master port.
When filtering is off, exclusive accesses go to port M0; when filtering is on, exclusive
accesses go to either port M0 or port M1, depending on the address. If the exclusive access
is in the filtering range, it goes to M1; if not, it goes to M0.
The SCU register bank provides the filtering mode enable bits and the address range
selection registers (see Filtering Start Address Register, Filtering End Address Register,
and SCU Control Register in RM0089, Reference manual, SPEAr1340 address map and
registers).
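The routing rule described above can be modeled as a single comparison. This is an illustrative sketch only: the function and type names are hypothetical, and treating the range as half-open (start included, end excluded) is an assumption that should be checked against the Filtering Start/End Address Register descriptions in RM0089.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { SCU_PORT_M0 = 0, SCU_PORT_M1 = 1 } scu_port_t;

/*
 * Model of SCU address filtering: with filtering enabled, addresses inside
 * the programmed range go to master port M1; everything else goes to M0.
 * Assumption: the range is [filter_start, filter_end).
 */
scu_port_t scu_route(uint32_t addr, bool filtering_on,
                     uint32_t filter_start, uint32_t filter_end)
{
    if (filtering_on && addr >= filter_start && addr < filter_end) {
        return SCU_PORT_M1;
    }
    return SCU_PORT_M0;
}
```

The same rule applies to exclusive accesses, which follow the address when filtering is on and always use M0 when it is off.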
SCU event monitoring
The individual CPU event monitors can be configured to gather statistics on the operation of
the SCU. Refer to Cortex-A9 technical reference manual for more detail on monitoring
events.
2.5.5
Global timer (GTIM)
The global timer is:
●
a 64-bit incrementing counter with an auto-incrementing feature
●
memory mapped in the same address space as the private timers
●
accessed at reset in secure state only (using the SCU Access Control Register)
●
accessible to all Cortex-A9 processors. Each Cortex-A9 processor has a 64-bit
comparator that is used to assert a private interrupt when the global timer has reached
the comparator value. All the Cortex-A9 processors in a design use a common ID,
ID[27], for this interrupt. This ID is sent to the interrupt controller as a private peripheral
interrupt (see Interrupt distributor section).
Global timer interrupt
The global timer interrupt (ID[27]) is set as pending in the interrupt distributor when the
counter register has the same value as the comparator register, after the event flag is set in
the global timer interrupt status register.
See also, Section 2.6.1: Programming the global timer registers.
2.5.6
Timer and watchdog blocks (WDTIM)
The watchdog can be configured as a timer. Both the timer and watchdog blocks have the
following features:
●
a 32-bit counter that generates an interrupt when it reaches zero
●
an 8-bit prescaler for better control of the interrupt period
●
configurable single-shot or auto-reload modes
●
configurable starting values for the counter
●
same clock as the interrupt controller clock
Calculating timer intervals
Use the following equation to calculate the timer intervals; this equation can be used to
calculate the period between two events generated by a timer or watchdog.
\[ \text{Interval} = \frac{(\text{Prescaler\_value} + 1) \times (\text{Load\_value} + 1)}{\text{PERIPHCLK}} \]
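As a worked example of the interval equation, the helper below evaluates it in floating point. It is a sketch under stated assumptions: the function name is hypothetical, PERIPHCLK is passed in hertz so the result is in seconds, and the 250 MHz clock used in the usage note is an arbitrary illustrative value, not a SPEAr1340 specification.

```c
#include <stdint.h>

/*
 * Interval between two timer or watchdog events:
 *   (Prescaler_value + 1) * (Load_value + 1) / PERIPHCLK
 * prescaler is the 8-bit prescaler field, load the 32-bit load value,
 * periphclk_hz the PERIPHCLK frequency in Hz (caller-supplied).
 */
double timer_interval_seconds(uint8_t prescaler, uint32_t load,
                              double periphclk_hz)
{
    double ticks = ((double)prescaler + 1.0) * ((double)load + 1.0);
    return ticks / periphclk_hz;
}
```

For instance, with a hypothetical 250 MHz PERIPHCLK, a prescaler value of 249 and a load value of 999999 give 250 × 1000000 / 250000000 = 1 second.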
Timer and watchdog interrupts
The timer interrupt ID[29] is set as pending in the interrupt distributor when the timer counter
register reaches zero, after the event flag is set in the timer interrupt status register.
The watchdog interrupt ID[30] is set as pending in the interrupt distributor when the
watchdog counter register reaches zero, after the event flag is set in the watchdog interrupt
status register.
2.5.7
Generic interrupt controller (GIC)
The generic interrupt controller is a single functional unit located in a Cortex-A9
multiprocessor design. It is memory-mapped. The Cortex-A9 processors access it by using
a private interface through the SCU. The GIC collates interrupts from a large number of
sources and provides:
●
masking of interrupts
●
prioritization of interrupts
●
distribution of interrupts to the target Cortex-A9 processors
●
tracking of the status of interrupts
●
generation of interrupts by software
●
support for security extensions
Interrupt sources can be of the following types:
●
Software generated interrupts (SGI)
An interrupt generated by writing to the Software generated interrupt register (ICDSGIR).
A maximum of 16 SGIs can be generated for each Cortex-A9 processor interface.
●
Private peripheral interrupts (PPI)
An interrupt generated by a peripheral that is specific to a single Cortex-A9 processor.
There are 5 PPIs for each Cortex-A9 processor interface.
●
Shared peripheral interrupts (SPI)
An interrupt generated by a peripheral that the generic interrupt controller can route to
any, or all, Cortex-A9 processor interfaces.
The generic interrupt controller supports 128 SPIs.
●
Lockable shared peripheral interrupts (LSPI)
There are 31 LSPIs. You can configure and then lock these interrupts against further
change using CFGSDISABLE. The LSPIs are present only if the SPIs are present.
The generic interrupt controller consists of an interrupt distributor and Cortex-A9 processor
interfaces.
Interrupt distributor
The interrupt distributor consists of a register-based list of interrupts, their priorities and
activation requirements, Cortex-A9 processor targets, and their pending and active status.
The interrupt distributor centralizes all interrupt sources, determines the priority of each
interrupt and distributes the interrupt with the highest priority to the Cortex-A9 processor
interfaces that connect to the processors in the system. The processor interface
acknowledges interrupts and changes interrupt priority masks. Hardware ensures that an
interrupt targeted at several processors can be taken by only one processor at a time.
When the interrupt distributor detects an interrupt assertion, it sets the status of the interrupt
for the targeted Cortex-A9 processors to pending. Level-triggered interrupts cannot be
marked as pending if they are active for at least one Cortex-A9 processor.
When an interrupt is triggered by the software interrupt register or the set-pending register,
the status of that interrupt for the targeted Cortex-A9 processor or processors is set to
pending. This interrupt then has the same behavior as a hardware interrupt. The distributor
does not differentiate between software and hardware triggered interrupts.
When multiple pending interrupts have the same priority, the selected interrupt is the one
with the lowest ID. If there are multiple pending software-generated interrupts with the same
ID, the lowest Cortex-A9 processor source is selected.
For each processor the prioritization and selection block searches for the pending interrupt
with the highest priority. This interrupt is then sent with its priority to the processor interface.
The prioritization logic is physically duplicated to enable the simultaneous selection of the
highest priority interrupt for each processor. The processor interface returns information to
the distributor when the processor acknowledges (pending to active transition) or clears an
interrupt (active to inactive transition). With the given interrupt ID, the interrupt distributor
updates the status of this interrupt according to the information sent by the processor
interface.
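The selection rule described above (highest priority first, lowest ID on a tie) can be sketched as a linear scan. This is an illustration only, not the hardware algorithm: the structure and function names are hypothetical, and it relies on the convention stated elsewhere in this section that a numerically lower priority value means a higher priority.

```c
#include <stddef.h>
#include <stdint.h>

/* One pending interrupt as seen by the prioritization logic (illustrative). */
typedef struct {
    uint16_t id;        /* interrupt ID */
    uint8_t  priority;  /* lower value = higher priority */
} pending_irq_t;

/*
 * Return the index of the pending interrupt to forward to the processor
 * interface, or -1 if the list is empty. Ties on priority are broken by
 * choosing the lowest interrupt ID.
 */
int select_pending(const pending_irq_t *list, size_t count)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (best < 0 ||
            list[i].priority < list[(size_t)best].priority ||
            (list[i].priority == list[(size_t)best].priority &&
             list[i].id < list[(size_t)best].id)) {
            best = (int)i;
        }
    }
    return best;
}
```

In hardware this search is duplicated per processor; a software model would simply call the function once per target processor with that processor's pending list.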
Interrupt distributor interrupt sources. All interrupt sources are identified by a unique ID.
They have their own configurable priority and a list of targeted Cortex-A9 processors, which
is a list of processors that the interrupt is sent to when triggered by the interrupt distributor.
Interrupt sources can be of the following types:
●
Software generated interrupts (SGI)
Each Cortex-A9 processor has private interrupts, ID[0:15], that can be triggered only by
software. These interrupts are aliased so that there is no requirement for a requesting
Cortex-A9 processor to determine its own CPU ID when it deals with SGIs. The priority
of an SGI depends on the value set by the receiving Cortex-A9 processor in the banked
SGI priority registers, not the priority set by the sending Cortex-A9 processor.
●
A legacy nFIQ pin, PPI(0)
In legacy FIQ mode, the legacy nFIQ pin, on a per Cortex-A9 processor basis,
bypasses the interrupt distributor logic and directly drives interrupt requests into the
Cortex-A9 processor. When a Cortex-A9 processor uses the generic interrupt
controller, rather than the legacy pin in the legacy mode, by enabling its own Cortex-A9
processor interface, the legacy nFIQ pin is treated like other interrupt lines and uses
ID[28].
●
Private timer, PPI(1)
Each Cortex-A9 processor has its own private timers that can generate interrupts,
using ID[29].
●
Watchdog timers, PPI(2)
Each Cortex-A9 processor has its own watchdog timers that can generate interrupts,
using ID[30].
●
A legacy nIRQ pin, PPI(3)
In legacy IRQ mode, the legacy nIRQ pin, on a per Cortex-A9 processor basis,
bypasses the interrupt distributor logic and directly drives interrupt requests into the
Cortex-A9 processor. When a Cortex-A9 processor uses the generic interrupt
controller, rather than the legacy pin in the legacy mode, by enabling its own Cortex-A9
processor interface, the legacy nIRQ pin is treated like other interrupt lines and uses
ID[31].
●
Global timer, PPI(4)
The global timer uses ID[27].
●
Shared peripheral interrupts (SPI)
SPIs are triggered by events generated on associated interrupt input lines. The
interrupt controller architecture can support up to 224 interrupt input lines; this
implementation provides 128 (see INT_NUM in Table 4). The interrupt input lines
can be configured as either edge sensitive (posedge), or level sensitive (high level).
SPIs start at ID[32].
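The ID ranges listed above can be summarized in a small classifier. This is a sketch with hypothetical names: the SPI count of 128 is taken from INT_NUM in Table 4, ID 1023 is the spurious ID mentioned later in this section, and IDs that the text does not assign (for example 16 to 26) are reported as unassigned rather than guessed.

```c
#include <stdint.h>

typedef enum {
    IRQ_SGI,        /* software generated interrupt, ID[0:15] */
    IRQ_PPI,        /* private peripheral interrupt, ID[27:31] */
    IRQ_SPI,        /* shared peripheral interrupt, from ID[32] */
    IRQ_SPURIOUS,   /* ID 1023 */
    IRQ_UNASSIGNED  /* not described in this section */
} irq_class_t;

#define NUM_SPI 128u  /* INT_NUM in Table 4 */

/* Map an interrupt ID to the source type described in the list above. */
irq_class_t irq_classify(uint32_t id)
{
    if (id <= 15u)                        return IRQ_SGI;
    if (id >= 27u && id <= 31u)           return IRQ_PPI;
    if (id >= 32u && id < 32u + NUM_SPI)  return IRQ_SPI;
    if (id == 1023u)                      return IRQ_SPURIOUS;
    return IRQ_UNASSIGNED;
}
```

Within the PPI range, ID 27 is the global timer, 28 the legacy nFIQ pin, 29 the private timer, 30 the watchdog, and 31 the legacy nIRQ pin.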
Cortex-A9 processor interfaces
The Cortex-A9 processor interfaces are slaves to the Cortex-A9 processors. They perform
priority masking and preemption handling for a connected processor. There is one Cortex-A9 processor interface for each processor.
A pending interrupt is accepted only if its priority is higher than the priority mask and also
than the priority of the highest priority interrupt active on that Cortex-A9 processor. If
a pending interrupt is accepted, the effect is that an interrupt request is made to the
processor for interrupt exception entry. If the processor then reads its interrupt acknowledge
register, the processor interface records the priority of this interrupt and marks it as active in
the interrupt distributor for that processor.
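The acceptance condition above reduces to two comparisons. The sketch below is illustrative only: the names are hypothetical, priorities use the convention that a numerically lower value is a higher priority, and representing "no interrupt currently active" by the value 0xFF is an assumption made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed idle value for the running priority when nothing is active. */
#define RUNNING_PRIO_IDLE 0xFFu

/*
 * A pending interrupt is accepted only if it has a higher priority
 * (numerically lower value) than both the priority mask and the highest
 * priority interrupt currently active on the processor.
 */
bool irq_accepted(uint8_t pending_prio, uint8_t priority_mask,
                  uint8_t running_prio)
{
    return pending_prio < priority_mask && pending_prio < running_prio;
}
```

An interrupt whose priority equals the mask or the running priority is therefore not accepted, which is what prevents an interrupt from preempting another at the same level.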
If an interrupt is sent to several processors, only the first one to read its interrupt
acknowledge register gets this interrupt ID; the other processors read the spurious ID, or
another pending interrupt ID. If the interrupt is cleared
before the Cortex-A9 processor reads its interrupt acknowledge register, for example
because of a priority mask change or a write to the interrupt pending clear register, the
Cortex-A9 processor gets the interrupt ID value 1023, indicating a spurious interrupt.
The interrupt active to inactive transition is triggered by a Cortex-A9 processor writing the
completed interrupt ID in its end of interrupt register.
Security extensions support
The generic interrupt controller enables all implemented interrupts to be individually defined
as secure or non-secure.
You can program secure interrupts to use either the IRQ or FIQ interrupt mechanism of a
Cortex-A9 processor through the FIQEn bit in the ICPICR register. Non-secure interrupts
are always signalled using the IRQ mechanism of a Cortex-A9 processor.
Note:
A non-secure access to a register of a secure interrupt behaves as RAZ/WI.
Priority formats. The software view of priority fields depends on the status of the access
request to the priority field (NS-prot), and the security status (NS-int) of the interrupt that the
priority field refers to.
The priority space is partitioned to ensure that secure interrupts can always be given a
priority higher than any non-secure interrupt. The non-secure domain observes a smaller
available range of priority levels than the range available to the secure domain as Figure 5
shows.
In Figure 5, priority format A shows the format this implementation uses for secure
accesses. Priority format B shows the format this implementation uses for non-secure
accesses. Bit D is the most significant bit (MSB) of the non-secure interrupt priority view.
The least significant bit (LSB) is always zero. Priority format C shows the non-secure
interrupt priority internal format as viewed by secure accesses. The MSB is usually one and
it is automatically set for non-secure writes.
Note:
Priority zero is the highest priority. The lowest priority is priority 0x1F.
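The relationship between the non-secure view (format B) and the secure view of a non-secure interrupt (format C) can be sketched as a shift and a forced MSB. The code below is an assumption-laden illustration, not a statement of the register layout: it assumes five implemented priority bits held in bits [7:3] of an 8-bit field (consistent with 0x1F being the lowest priority), and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed position of the five implemented priority bits: [7:3]. */
#define PRIO_IMPL_MASK 0xF8u

/*
 * Format B to format C: a non-secure write of D C B A (LSB zero) is seen by
 * secure reads as 1 D C B A, i.e. shifted right by one with the MSB set.
 */
uint8_t prio_ns_to_secure(uint8_t ns_view)
{
    return (uint8_t)(0x80u | ((ns_view >> 1) & PRIO_IMPL_MASK));
}

/*
 * Format C to format B: drop the MSB by shifting left by one; the least
 * significant implemented bit of the non-secure view is always zero.
 */
uint8_t prio_secure_to_ns(uint8_t secure_view)
{
    return (uint8_t)(((secure_view & PRIO_IMPL_MASK) << 1) & PRIO_IMPL_MASK);
}
```

Under these assumptions every non-secure priority lands in the lower-priority half of the secure range (MSB set), which is how secure interrupts can always be given a higher priority than any non-secure interrupt.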
Figure 5.
Secure and non-secure interrupt priority formats
[Figure: three 8-bit priority layouts, bit 7 (MSB) to bit 0 (LSB). Priority format A (secure accesses): bits E D C B A followed by SBZ bits. Priority format B (non-secure accesses to non-secure interrupts): bits D C B A followed by SBZ bits. Priority format C (non-secure writes as viewed by secure reads): 1 D C B A followed by SBZ bits. A non-secure access to a secure interrupt silently fails, with no exception generated (RAZ/WI).]
Interrupt output source selection. There are two legacy interrupt inputs, nFIQ[n] and
nIRQ[n], for each Cortex-A9 processor. When you use the legacy mode, the interrupt
controller disables the corresponding Cortex-A9 processor interface and routes the legacy
interrupt inputs to the Cortex-A9 processor, generating FIQ and IRQ exceptions respectively.
Otherwise, these pins are used as PPI(0) and PPI(3).
Table 6 shows the bits in the ICPICR register that enable you to select the signals that drive
the interrupt outputs of a Cortex-A9 processor interface.
Table 6.
Interrupt output source selection
ICPICR Bit[3] FIQEn | ICPICR Bit[1] EnableNS | ICPICR Bit[0] EnableS | FIQ exception generated by | IRQ exception generated by
0 | 0 | 0 | nFIQ[n] | nIRQ[n]
0 | 0 | 1 | nFIQ[n] | Secure interrupts
0 | 1 | 0 | nFIQ[n] | Non-Secure interrupts
0 | 1 | 1 | nFIQ[n] | Secure and non-Secure interrupts
1 | 0 | 0 | nFIQ[n] | nIRQ[n]
1 | 0 | 1 | Secure interrupts | nIRQ[n]
1 | 1 | 0 | nFIQ[n] | Non-Secure interrupts
1 | 1 | 1 | Secure interrupts | Non-Secure interrupts
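The eight combinations of Table 6 can be captured as a lookup indexed by the three ICPICR bits. This is a sketch for illustration: the bit positions come from the table headings (FIQEn bit 3, EnableNS bit 1, EnableS bit 0), while the type, macro, and function names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    SRC_LEGACY_PIN,           /* nFIQ[n] or nIRQ[n] */
    SRC_SECURE,               /* secure interrupts */
    SRC_NONSECURE,            /* non-secure interrupts */
    SRC_SECURE_AND_NONSECURE  /* both */
} irq_src_t;

typedef struct {
    irq_src_t fiq;  /* what generates the FIQ exception */
    irq_src_t irq;  /* what generates the IRQ exception */
} irq_outputs_t;

#define ICPICR_ENABLE_S  (1u << 0)
#define ICPICR_ENABLE_NS (1u << 1)
#define ICPICR_FIQEN     (1u << 3)

/* Return the FIQ/IRQ sources selected by an ICPICR value, per Table 6. */
irq_outputs_t icpicr_outputs(uint32_t icpicr)
{
    /* Index = FIQEn:EnableNS:EnableS, in the same row order as Table 6. */
    static const irq_outputs_t table[8] = {
        { SRC_LEGACY_PIN, SRC_LEGACY_PIN },
        { SRC_LEGACY_PIN, SRC_SECURE },
        { SRC_LEGACY_PIN, SRC_NONSECURE },
        { SRC_LEGACY_PIN, SRC_SECURE_AND_NONSECURE },
        { SRC_LEGACY_PIN, SRC_LEGACY_PIN },
        { SRC_SECURE,     SRC_LEGACY_PIN },
        { SRC_LEGACY_PIN, SRC_NONSECURE },
        { SRC_SECURE,     SRC_NONSECURE }
    };
    unsigned idx = ((icpicr & ICPICR_FIQEN)     ? 4u : 0u) |
                   ((icpicr & ICPICR_ENABLE_NS) ? 2u : 0u) |
                   ((icpicr & ICPICR_ENABLE_S)  ? 1u : 0u);
    return table[idx];
}
```

A table is used rather than derived logic so that each entry can be checked directly against the corresponding row.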
Using CFGSDISABLE. The interrupt controller provides the facility to prevent write
accesses to critical configuration registers when you assert CFGSDISABLE. This signal
controls the read and write behavior for the secure control registers in the distributor and
Cortex-A9 processor interfaces, and the lockable shared peripheral interrupts (LSPIs) in the
interrupt controller.
If you use CFGSDISABLE, ARM recommends that you assert CFGSDISABLE during the
system boot process, after the software has configured the registers. Ideally, the system
must deassert CFGSDISABLE only if a hard reset occurs. When CFGSDISABLE is HIGH,
the interrupt controller prevents write accesses to the following registers in the:
●
Distributor
The enable_set register
●
Secure interrupts defined by LSPI field in the ic_type register:
–
Interrupt security registers
–
Enable set registers
–
Enable clear registers
–
Pending set registers
–
Pending clear registers
–
Priority level registers
–
SPI target registers
–
Interrupt configuration register
●
Cortex-A9 processor interface
The ICPICR register, except for the EnableNS bit.
Note:
When CFGSDISABLE is HIGH the interrupt controller permits write access only to the
EnableNS bit. All other bits are read-only. After you assert CFGSDISABLE, it changes the
register bits to read-only and therefore the behavior of these secure interrupts cannot
change, even in the presence of rogue code executing in the secure domain.
2.6
Programming
2.6.1
Programming the global timer registers
This section provides information about how to program the global timer registers.
Programming the global timer counter register
1.
Clear the timer enable bit in the timer control register
2.
Write the lower 32-bit timer counter register
3.
Write the upper 32-bit timer counter register
4.
Set the timer enable bit
Note:
You must use this register with 32-bit accesses. You cannot use the STRD/LDRD
instructions.
Reading the global timer counter register
1.
Read the upper 32-bit timer counter register
2.
Read the lower 32-bit timer counter register
3.
Read the upper 32-bit timer counter register again. If the value differs from the upper
32-bit value read previously, go back to step 2.
Otherwise, the 64-bit timer counter value is correct.
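The write and read sequences above can be sketched in C as follows. This is not code from this manual: the register structure, its field order, and the control bit position are assumptions based on the usual Cortex-A9 MPCore global timer layout and should be checked against RM0089, and the function names are hypothetical. Each register is touched with separate 32-bit accesses only.

```c
#include <stdint.h>

/* Assumed global timer register block (verify offsets against RM0089). */
typedef struct {
    uint32_t counter_lo;     /* lower 32 bits of the counter */
    uint32_t counter_hi;     /* upper 32 bits of the counter */
    uint32_t control;        /* timer control register */
    uint32_t int_status;     /* interrupt status register */
    uint32_t comparator_lo;  /* lower 32 bits of the comparator */
    uint32_t comparator_hi;  /* upper 32 bits of the comparator */
    uint32_t auto_increment; /* auto-increment register */
} gtim_regs_t;

#define GTIM_CTRL_TIMER_EN (1u << 0)  /* assumed bit position */

/* Write sequence: disable, write low word, write high word, enable. */
void gtim_write_counter(volatile gtim_regs_t *t, uint64_t value)
{
    t->control &= ~GTIM_CTRL_TIMER_EN;
    t->counter_lo = (uint32_t)value;
    t->counter_hi = (uint32_t)(value >> 32);
    t->control |= GTIM_CTRL_TIMER_EN;
}

/* Read sequence: re-read the high word until it is stable around the low read. */
uint64_t gtim_read_counter(volatile gtim_regs_t *t)
{
    uint32_t hi = t->counter_hi;
    for (;;) {
        uint32_t lo = t->counter_lo;
        uint32_t hi_again = t->counter_hi;
        if (hi_again == hi) {
            return ((uint64_t)hi << 32) | lo;
        }
        hi = hi_again;  /* low word rolled over; retry with the new high word */
    }
}
```

The retry loop matters only when the lower word wraps between the two reads of the upper word; in that case the first low value may belong to either upper value, so it is discarded.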
Programming the global timer compare register
Use the following steps to ensure that updates to this register do not set the timer interrupt
status register.
1.
Clear the COMPEN bit in the timer control register
2.
Write the lower 32-bit comparator value register
3.
Write the upper 32-bit comparator value register
4.
Set the COMPEN bit and, if necessary, the IRQ enable bit
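The comparator update steps can be sketched the same way. As before, this is a hedged illustration: the register structure and the COMPEN and IRQ enable bit positions are assumptions drawn from the usual Cortex-A9 MPCore global timer layout (check RM0089), and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed global timer register block (verify offsets against RM0089). */
typedef struct {
    uint32_t counter_lo;
    uint32_t counter_hi;
    uint32_t control;
    uint32_t int_status;
    uint32_t comparator_lo;
    uint32_t comparator_hi;
    uint32_t auto_increment;
} gtim_regs_t;

#define GTIM_CTRL_COMP_EN (1u << 1)  /* COMPEN, assumed bit position */
#define GTIM_CTRL_IRQ_EN  (1u << 2)  /* IRQ enable, assumed bit position */

/*
 * Update the 64-bit comparator without raising a false event: disable the
 * comparison, write both halves with 32-bit accesses, then re-enable it.
 */
void gtim_set_compare(volatile gtim_regs_t *t, uint64_t value, bool irq_enable)
{
    t->control &= ~GTIM_CTRL_COMP_EN;
    t->comparator_lo = (uint32_t)value;
    t->comparator_hi = (uint32_t)(value >> 32);
    t->control |= GTIM_CTRL_COMP_EN | (irq_enable ? GTIM_CTRL_IRQ_EN : 0u);
}
```

Disabling the comparison first avoids a match against a half-updated comparator value between the two 32-bit writes.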
3
Multilayer interconnect matrix (BUSMATRIX)
This chapter focuses on BUSMATRIX functionality and operation.
For the BUSMATRIX feature list, refer to the SPEAr1340 datasheet:
●
Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
●
RM0089, Reference manual, SPEAr1340 address map and registers
3.1
Overview
The multilayer interconnect matrix is the connectivity infrastructure that enables data
exchange between the various blocks of the device. This structure supports parallel
communications between master and slave components, and ensures the maximum level of
system throughput.
Figure 6.
SPEAr1340 block diagram with BUSMATRIX topology details
[Block diagram: the A9 subsystem module (two A9 CPUs, each with FPU, PTM interface, and instruction and data caches; SCU with cache-to-cache transfers and snoop filtering; GIC; watchdogs/timers; ACP; CoreSight subsystem; 512 KB L2 cache) and the other initiators, each labeled with its IA ID, connect through the SMX0 and SMX1 crossbars, the SMX2 shared link, and the S3220 to the targets, each labeled with its slave index. MPMC ports 0 to 5 serve a 16/32-bit DDR2/3 interface (with ECC) at 533 MHz. IA IDs and protocols are listed in Table 7; slave indexes are listed in Table 8.]
3.2
Pins
The BUSMATRIX does not have any off-chip signals.
3.3
Clocks
Refer to Chapter 5: Reset and clock generator (RCG).
3.4
Interrupts
Refer to Appendix A: Interrupts.
3.5
Functional description
Note:
In this document, initiator agent (IA) and master are used synonymously, and target agent
(TA) and slave are used synonymously.
3.5.1
Crossbars (XB)
SMX0 and SMX1 can enable full connectivity between all of the IAs and TAs that require it.
Crossbars are meant for performance and latency control.
3.5.2
Shared link (SL)
SMX2 allows full connectivity between all IAs and TAs by maintaining a unique channel that
uses time division to share its use. A shared link is easier to implement, and alleviates the
problem of a high frequency design by allowing a better implementation for IAs and TAs that
do not require high bandwidth.
3.5.3
S3220
The S3220 can manage up to four transactions in parallel, and can easily adapt to serve a
peripheral with slow register access due to low-speed data FIFOs.
3.5.4
Masters (IAs) and slaves (TAs)
Table 7 lists IAs by connectivity (group ID), and provides individual IDs, initiating IPs, and
protocol types.
Table 8 provides an IA and TA connectivity matrix.
Table 7.
IA group organization
IA group ID | IA ID | Initiating IP | Protocol type
10 | 10 | A9SM | AXI-64
20 | 20 | A9SM | AXI-64
30 | 30 | UHC0 | AHB-32
30 | 31 | UHC0 | AHB-32
30 | 32 | UHC1 | AHB-32
30 | 33 | UHC1 | AHB-32
30 | 34 | UOC | AHB-32
30 | 36 | GMAC | AXI-32
35 | 35 | PCIE/SATA0 | AXI-64
40 | 40 | DMAC0 | AHB-64
50 | 50 | DMAC1 | AHB-64
55 | 55 | VDEC | AXI-64
60 | 60 | CLCD | AXI-64
70 | 70 | C3 | AHB-32
70 | 71 | MCIF | AHB-32
70 | 72 | VIP | AHB-64
75 | 75 | VENC | AXI-64
100 | 100 | GPU | AXI-64
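For software that needs to reason about initiators, Table 7 can be held as a constant array with a lookup by IA ID. This is a sketch with hypothetical names; in particular, placing IA 36 (GMAC) in group 30 follows the row order of the table as extracted and should be confirmed against the original document.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of Table 7 (structure and names are illustrative). */
typedef struct {
    uint8_t     ia_id;
    uint8_t     group_id;
    const char *ip;
    const char *protocol;
} ia_entry_t;

static const ia_entry_t ia_table[] = {
    {  10,  10, "A9SM",       "AXI-64" },
    {  20,  20, "A9SM",       "AXI-64" },
    {  30,  30, "UHC0",       "AHB-32" },
    {  31,  30, "UHC0",       "AHB-32" },
    {  32,  30, "UHC1",       "AHB-32" },
    {  33,  30, "UHC1",       "AHB-32" },
    {  34,  30, "UOC",        "AHB-32" },
    {  36,  30, "GMAC",       "AXI-32" },  /* group assumed from row order */
    {  35,  35, "PCIE/SATA0", "AXI-64" },
    {  40,  40, "DMAC0",      "AHB-64" },
    {  50,  50, "DMAC1",      "AHB-64" },
    {  55,  55, "VDEC",       "AXI-64" },
    {  60,  60, "CLCD",       "AXI-64" },
    {  70,  70, "C3",         "AHB-32" },
    {  71,  70, "MCIF",       "AHB-32" },
    {  72,  70, "VIP",        "AHB-64" },
    {  75,  75, "VENC",       "AXI-64" },
    { 100, 100, "GPU",        "AXI-64" }
};

/* Return the table row for an IA ID, or NULL if the ID is not listed. */
const ia_entry_t *ia_lookup(uint8_t ia_id)
{
    for (size_t i = 0; i < sizeof ia_table / sizeof ia_table[0]; i++) {
        if (ia_table[i].ia_id == ia_id) {
            return &ia_table[i];
        }
    }
    return NULL;
}
```

The group ID returned by the lookup is the value used as the column index in the connectivity matrix.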
Known limitations
●
The AXI protocol supports sequences of locked transactions, which are restricted to a
single read-modify-write sequence.
●
Read and write transactions must be made to the same address.
●
The bridge converts AXI exclusive transactions to an OCP2
ReadLinked/WriteConditional pair.
OCP2 limits ReadLinked/WriteConditional requests to a single request per thread at
the target core, and the bridge extends this restriction to AXI exclusive transactions.
Doc ID 018553 Rev 3
51/590
Multilayer interconnect matrix (BUSMATRIX)
Table 8.
RM0078
Connectivity matrix
IA group index
Slave
index
IP
Comment
10
H
20
30
35
40
50
55
60
70
75
80 100
X
K
X
X
J
X
MPMC
X
X
DDR memory access
L
X
X
X
M
X
N
X
IO space and
configuration space
X
X
NOR/SRAM memory
space
X
X
X
X
Configuration registers
X
X
X
X
NAND memory space
X
X
PCIE/SATA0
DBI space
X
X
A5_0
I2S_S
X
X
X
X
A5_1
I2S_M
Configuration and data
registers
X
X
X
X
CF/xD
X
X
X
X
X
X
X
SD/SDIO/MMC
X
X
X
X
X
X
X
Standard shared RAM
(32 KB)
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
C3
X
X
PCIE0
C6
X
X
X
C5
A1_1
X
X
X
ACP
FSMC
X
X
A9SM
A0
X
X
Q
A1_0
X
X
X
X
X
X
X
X
X
MCIF
C4
A8
SYSRAM0
D11_1 UART1
D11_0 I2C1
Configuration and DMA
port
X
X
E0
BUSMATRIX
Configuration registers for
SMX
X
X
I_0
GPU
Configuration registers
X
X
X
X
X
X
X
X
X
X
I_1
VIP
Configuration registers
X
X
X
X
X
X
X
X
X
X
Configuration and data
registers
X
X
X
X
X
X
X
X
X
X
CEC
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
I_2
I_3
I_4
CAM0
I_5
CAM1
Configuration and data
registers
X
X
I_6
CAM2
I_7
CAM3
X
X
X
X
X
X
X
X
X
X
I_8
SPDIF OUT
X
X
X
X
X
X
X
X
X
X
I_9
SPDIF IN
X
X
X
X
X
X
X
X
X
X
52/590
Doc ID 018553 Rev 3
X
RM0078
Multilayer interconnect matrix (BUSMATRIX)
Table 8.
Connectivity matrix (continued)
IA group index
Slave
index
IP
Comment
10
20
30
35
40
50
55
60
70
75
80 100
A2
UART0
Configuration and data
registers
X
X
X
X
X
X
X
X
A6
ADC
Configuration and data
registers
X
X
X
X
X
X
X
X
A3
SSP
Configuration and data
registers
X
X
X
X
X
X
X
X
A7
PWM
Configuration and data
registers
X
X
X
X
X
X
X
X
A4
I2C0
Configuration and data
registers
X
X
X
X
X
X
X
X
A9
KBD
Configuration registers
X
X
X
X
X
X
X
X
B4
GPT0
X
X
X
X
X
X
X
X
B5
GPT1
X
X
X
X
X
X
X
X
Configuration registers
B15
GPT2
X
X
X
X
X
X
X
X
B16
GPT3
X
X
X
X
X
X
X
X
B9
RTC
X
X
X
X
X
X
X
X
B7
GPIOA
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
Configuration registers
X
X
X
X
X
X
X
X
APB-SYS for internal
(A9SM) Coresight access
X
X
X
X
X
X
X
X
APB-CMR for clock
manager access
X
X
X
X
X
X
X
X
Configuration registers
Configuration registers
B8
GPIOB
B10
MISC
B0
A9SM
B1
A10
SYSRAM1
Memory for always-on
support (4 KB)
X
X
X
X
X
X
X
X
C0
CLCD
Configuration registers
X
X
X
X
X
X
X
X
C1
C3
Configuration registers
X
X
X
X
X
X
X
X
D0
GMAC
Configuration registers
X
X
X
X
X
X
X
X
D6
XGPIO
Registers
X
X
X
X
X
X
X
X
D5
UOC
Control and status
registers programming
interface
X
X
X
X
X
X
X
X
OHCI
X
X
X
X
X
X
X
X
EHCI
X
X
X
X
X
X
X
X
OHCI
X
X
X
X
X
X
X
X
EHCI
X
X
X
X
X
X
X
X
D1
UHC0
D2
D3
UHC1
D4
Table 8.
Connectivity matrix (continued)
IA group index
Slave
index
IP
B13
SMI
B3
B6
RM0078
Comment
10
20
NAND/NOR memory
access
X
Configuration registers
75
80 100
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
Configuration registers
X
X
X
X
X
X
X
X
Internal peripherals (to be
set using PERIPHBASE)
X
X
L2CC configuration space
(to be set using
REGFILEBASE)
X
X
DMAC0
30
35
40
50
55
X
X
X
X
X
X
X
X
X
X
X
X
X
X
60
70
Configuration registers
B14
DMAC1
B12_0 MIPHY
B12_1 VENC
Programming port
B12_2 VDEC
B2
MPMC
P0
A9SM
P1
E1
BUSMATRIX
Configuration registers for
S3220
X
X
X
X
X
X
X
X
B11
SYSROM
Embedded ROM (32 KB)
X
X
X
X
X
X
X
X
Each time an IA accesses outside its allowed address map, either it receives a bus error, or
an interrupt is raised through the BUSMATRIX interrupt line to signal an abnormal
transaction.
READ operations always return a bus error.
For WRITE operations, the interconnect distinguishes between posted and unposted
transactions. Because there is no wait for a response for posted transactions, the bus
signals the event through its interrupt line (sideband signaling) rather than transporting an
in-band error. For information on how to handle this condition, refer to Appendix A:
Interrupts.
Information on programming posted and unposted transactions is provided in the individual
IP chapters.
4
System configuration registers (MISC)
Using a 32-bit APB interface, the miscellaneous registers configure the SPEAr1340 global
parameters (such as clocks, resets, and pads) and peripherals.
SPEAr1340 registers are described in the companion reference manual: RM0089,
Reference manual, SPEAr1340 address map and registers.
5
Reset and clock generator (RCG)
This chapter focuses on RCG functionality and operation.
For the RCG feature list, refer to the SPEAr1340 datasheet:
●
Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
●
RM0089, Reference manual, SPEAr1340 address map and registers
5.1
Overview
The reset and clock generator (RCG) provides the system clocks and resets. It can be
configured through the miscellaneous registers.
Figure 7.
RCG block diagram
[Figure: the inputs osci1, osci3, XGPIO90 and XGPIO132 feed PLL1, PLL2 and PLL3, which generate the clocks pll1out/vco1div2, pll2out/vco2div2 and pll3out/vco3div2. PLL1 is primarily used to generate the 1 GHz clock for the AMBA subsystem; PLL3 is primarily used to generate the 1.2 GHz clock for the AMBA subsystem. The gate unit contains the gating cells (driven by MISC registers) that enable/disable clocks. The reset generator drives the system resets. The blocks are controlled by the MISC control signals.]
See also: Figure 8: RCG integration in SPEAr1340.
5.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
5.3
Clocks
Table 9.
RCG clocks
IP
A9SM
ADC
RCG name
CAM1
CAM2
CAM3
CAM4
CEC0
CEC1
Maximum
frequency
(MHz)
CLK1GHZ
pclk_a9sm
PCLK
83
PERIP1_CLK_ENB[0]
atclks_a9sm
ACLKS_SOC
166
PERIP2_CLK_ENB[6]
aclkm0_a9sm
ACLKM0
166
aclkm1_a9sm
ACLKM0
166
pclk_adc
PCLK
83
PERIP1_CLK_ENB[30]
clk_adc
ADC_CLK
20
See Chapter 31
ahclkclk_i
166
PERIP1_CLK_ENB[0]
hclk_c3
hclk
166
PERIP1_CLK_ENB[29]
clk_c3
clk48m
48
CAM1_PIXCLK
PIXCLK
<=100
hclk_cam1
HCLK
CAM2_PIXCLK
PixCLK
hclk_cam2
HCLK
CAM3_PIXCLK
PixCLK
hclk_cam3
HCLK
CAM4_PIXCLK
PIXCLK
hclk_cam4
HCLK
166
PERIP3_CLK_ENB[7]
hclk_cec0
HCLK
166
PERIP3_CLK_ENB[5]
ck_cec0 = hclk_cec0
ck
166
PERIP3_CLK_ENB[5]
hclk_cec1
HCLK
166
PERIP3_CLK_ENB[4]
ck_cec1 = hclk_cec1
ck
166
PERIP3_CLK_ENB[4]
Doc ID 018553 Rev 3
1200
Configuration registers
and references
clk1ghz
BUSMATRIX hclk_bus
C3
IP native name
166
<=100
166
<=100
166
<=100
See A9SM clock configuration
Connected to the PAD and
enabled through
PERIP3_CLK_ENB[10]
PERIP3_CLK_ENB[10]
Connected to the PAD and
enabled through
PERIP3_CLK_ENB[9]
PERIP3_CLK_ENB[9]
Connected to the PAD and
enabled through
PERIP3_CLK_ENB[8]
PERIP3_CLK_ENB[8]
Connected to the PAD and
enabled through
PERIP3_CLK_ENB[7]
57/590
Reset and clock generator (RCG)
Table 9.
RM0078
RCG clocks (continued)
IP
RCG name
IP native name
Maximum
frequency
(MHz)
Configuration registers
and references
hclk_clcd
hclk
166
PERIP1_CLK_ENB[27]
aclk_clcd
aclk
166
PERIP1_CLK_ENB[27]
clk_clcd
pclk_in
200
See CLCD clock configuration
DMAC
hclk_dma
hclk_i
166
PERIP1_CLK_ENB[25]
FSMC
hclk_fsmc
hclk_i
166
PERIP1_CLK_ENB[4]
hclk_gmac
hclk_i
166
PERIP1_CLK_ENB[8]
clk_tx
clk_tx_i
125
clk_rx
clk_rx_i
125
clk_rmii
clk_rmii_i
50
clk_ptp_ref = osci1
clk_ptp_ref_i
24
GPIOA
pclk_gpioa
PCLK
83
PERIP1_CLK_ENB[23]
GPIOB
pclk_gpiob
PCLK
83
PERIP1_CLK_ENB[24]
pclk_gpt0
pclk
83
PERIP1_CLK_ENB[21]
clk_timer0
timer_clk
pclk_gpt1
pclk
clk_timer1
timer_clk
pclk_gpt2
pclk
clk_timer2
timer_clk
pclk_gpt3
pclk
clk_timer3
timer_clk
hclk_gpu
MALI_SUBSYS_AXI_
m_aclk
clk_gpu = gen3_clk
MALI_200Mhz_clk
pclk_i2c0
pclk
83
clk_i2c0= hclk
ic_clk
166
pclk_i2c1
pclk
83
clk_i2c1= hclk
ic_clk
166
pclk_i2s_m
pclk
83
i2s_m_sclk
sclk
<=12
i2s_m_sclk
I2S_OUT_BITCLK
<=12
CLCD
GMAC
GPT0
GPT1
GPT2
GPT3
GPU
I2C0
I2C1
I2S_M
<=83
83
<=83
83
<=83
83
<=83
166
<=200
See GMAC clock configuration
See GPT clock configuration
PERIP1_CLK_ENB[22]
See GPT clock configuration
PERIP2_CLK_ENB[4]
See GPT clock configuration
PERIP2_CLK_ENB[5]
See GPT clock configuration
PERIP3_CLK_ENB[6]
See Fractional clock generator
(SSCG)
PERIP1_CLK_ENB[18]
PERIP3_CLK_ENB[2]
PERIP1_CLK_ENB[20]
See I2S clock configuration
I2S_OUT_OVRSAMP_CLK
I2S _S
KBD
58/590
pclk_i2s_s
pclk_kbd
pclk
83
sclk
<=12
pclk
83
Doc ID 018553 Rev 3
PERIP1_CLK_ENB[19]
See I2S clock configuration
PERIP2_CLK_ENB[3]
RM0078
Table 9.
Reset and clock generator (RCG)
RCG clocks (continued)
IP
RCG name
IP native name
Maximum
frequency
(MHz)
Configuration registers
and references
hclk_sd
hclk_sd
166
PERIP1_CLK_ENB[6]
hclk_cf_xd
hclk_cf_xd
166
PERIP1_CLK_ENB[7]
clk_sd
clk_sd
10<clk<83
clk_cf_xd
clk_cf_xd
25<clk<166 See XYSYNT clock divider
pclk_ao
pclk
83
Always-on
hclk_mpmc
hclk
166
PERIP2_CLK_ENB[0]
aclk_mpmc
aclk
166
PERIP2_CLK_ENB[0]
clk_mpmc_phy
clk
533
PERIP2_CLK_ENB[1]
clk_mpmc_ctrl
clk_d2
266
PERIP2_CLK_ENB[1]
clk_mpmc_ddr
clk_ref
533
PERIP2_CLK_ENB[1]
OTP
pclk_o
clk_i
83
Always-on
PCIE
aclk_pcie_sata
aclk
166
PERIP1_CLK_ENB[12]
PCM
pclk_ao
pclk
83
Always-on
PWM
pclk_pwm
PCLK
83
PERIP3_CLK_ENB[3]
pclk_rtc
pclk
83
PERIP1_CLK_ENB[31]
clk_32k
clk32k
aclk_pcie_sata
aclk
166
PERIP1_CLK_ENB[12]
hclk_smi
hclk_i
166
PERIP1_CLK_ENB[5]
clk_smi
smi_clk
50
See Chapter 17
hclk_spdif_in
HCLK_I
166
PERIP3_CLK_ENB[12]
clk_spdif_in
CLK_APPL
hclk_spdif_out
HCLK
clk_spdif_out
clk_appl
pclk_ssp
pclk
83
clk_ssp=pclk_ssp
sspclk
83
SYSRAM0
hclk_sysram0
hclk
166
PERIP1_CLK_ENB[3]
SYSRAM1
hclk_sysram1
hclk
166
PERIP1_CLK_ENB[2]
SYSROM
hclk_sysrom
hclk
166
PERIP1_CLK_ENB[1]
pclk_ao
pclk
83
Always-on
clk_thsens
thsclk
pclk_uart0
clk_uart0
MCIF
MISC
MPMC
RTC
SATA
SMI
SPDIF (in)
SPDIF (out)
SSP
THSENS
UART0
See XYSYNT clock divider
32 KHz
<=200
166
<=147
PERIP_CLK_CFG[14]
See Fractional clock generator
(SSCG)
PERIP3_CLK_ENB[13]
PERIP_CLK_CFG[15]
See Fractional clock generator
(SSCG)
PERIP1_CLK_ENB[17]
187.5 KHz
PERIP2_CLK_ENB[8]
pclk
83
PERIP1_CLK_ENB[15]
uartclk
125
See UART clock configuration
Doc ID 018553 Rev 3
59/590
Reset and clock generator (RCG)
Table 9.
IP
UART1
UHC0
UHC1
UOC
VDEC
VENC
VIP
XGPIO
60/590
RM0078
RCG clocks (continued)
RCG name
IP native name
Maximum
frequency
(MHz)
Configuration registers
and references
pclk_uart1
pclk
83
PERIP3_CLK_ENB[1]
clk_uart1
uartclk
125
See UART clock configuration
freeclk_usb
phy_clk_i
30
clk48_uhc0
ohci_clk48_i
48
clk12_uhc0
ohci_clk12_i
12
clk30_uhc0
utmi_phy_clock_i
30
hclk_uhc0
hclk_i
166
freeclk_usb
phy_clk_i
30
clk48_uhc1
ohci_clk48_i
48
clk12_uhc1
ohci_clk12_i
12
clk30_uhc1
utmi_phy_clock_i
30
hclk_uhc1
hclk_i
166
PERIP1_CLK_ENB[10]
hclk_uoc
hclk
166
PERIP1_CLK_ENB[11]
clk30_uoc
utmi_clk
30
clk_vdec = gen0_clk
DCLK
<=200
hclk_vdec
HCLK
166
PERIP3_CLK_ENB[16]
aclk_vdec
ACLK
166
PERIP3_CLK_ENB[16]
clk_venc = gen1_clk
ENC_CLK
hclk_venc
HCLK
166
PERIP3_CLK_ENB[15]
aclk_venc
ACLK
166
PERIP3_CLK_ENB[15]
VIP_PIXCLK pad
PIX_CLK_I
<=193
PERIP3_CLK_ENB[11]
hclk_video_in
HCLK
166
PERIP3_CLK_ENB[11]
hclk_xgpio
Hclk_i
166
PERIP3_CLK_ENB[18]
Doc ID 018553 Rev 3
<=200
PERIP1_CLK_ENB[9]
See Fractional clock generator
(SSCG)
See Fractional clock generator
(SSCG)
5.4
Functional description
This section describes the main blocks and functionality of the RCG.
Figure 8 shows how the reset and clock generator is integrated in the device.
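Figure 8.
RCG integration in SPEAr1340
[Figure: the RCG and its clock sources osci1 (24 MHz), osci2 (32 kHz, OSCI32) and osci3 (25/100 MHz), shown across the AlwaysON, ARM, BUS, GPU and CODEC power domains. Labeled blocks and signals: RTC, PCM, MISC, divider, cpu_clk, clk1ghz, CPU0, CPU1, USB PHY PLL (usb_48, usb_30), DDR PHY PLL, PLL, hclk/pclk, MPMC, GMAC, UOC, UHC, PCIe/SATA, CAMIF, CLCD, SPDIF, VIP, GPU.]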
Note:
AlwaysON, ARM, BUS, GPU and CODEC are the names of SPEAr1340 power domains.
For more information, see Chapter 6: Power management.
5.4.1
Main clock sources
●
osci1: 24 MHz clock from internal oscillator connected to external quartz
●
osci2: 32 kHz clock from internal oscillator used for the RTC block (optional)
●
osci3: 25/100 MHz clock from the MIPHY macro (optional)
For a complete list of RCG clocks see Section 5.3: Clocks.
5.4.2
PLLs
Note:
See also: Section 5.5.1: Programming PLLs
PLL1, PLL2 and PLL3 in the RCG module, as well as the memory controller subsystem
dedicated PLL (PLL4) are the main sources of system clocks.
Table 10 lists the PLL source clocks, and the fields of register PLL_CFG that configure
them.
At reset, osci1 is the default clock source for all PLLs.
Table 10.
PLL source clocks
| PLL | Source | PLL_CFG register field |
|------|------------------------|------------------------|
| PLL1 | osci1, osci3, XGPIO90 | pll1_clk_sel |
| PLL2 | osci1, osci3, XGPIO132 | pll2_clk_sel |
| PLL3 | osci1, osci3, XGPIO132 | pll3_clk_sel |
| PLL4 | osci1 | PLL4 generates the memory controller clocks, and is always fed by clock osci1. |
Table 11 lists the PLL output clocks, their reset values, and the registers that configure them.
Table 11.
PLL output clocks
| PLL | Frequency (after reset) | MISC register |
|------|-------------------------|---------------|
| PLL1 | pll1out at 1 GHz, vco1div2 at 500 MHz, vco1div4 at 250 MHz | PLL1_CTR, PLL1_FRQ, PLL1_MOD |
| PLL2 | pll2out at 125 MHz, vco2div2 at 500 MHz | PLL2_CTR, PLL2_FRQ, PLL2_MOD |
| PLL3 | pll3out at 65 MHz, vco3div2 at 520 MHz | PLL3_CTR, PLL3_FRQ, PLL3_MOD |
| PLL4 | pll4out at 533 MHz | PLL4_CTR, PLL4_FRQ, PLL4_MOD |
Figure 9 shows the PLL components.
Figure 9.
PLL overview
[Figure: clk_in passes through the predivision factor (prediv_N) into the analog PLL (CP & VCO, a classic phase-locked-loop circuit). The VCO output passes through the postdivision factor (postdiv_P) to produce pllout, and through div2 to produce vcodiv2. The clksel signal selects the feedback clock: either the internal divider (fbkdiv_M) or extfbclk from the external divider (dithering logic), which modulates the VCO frequency.]
Analog PLL
The analog PLL features are:
●
Input clock frequency range: 4 MHz to 350 MHz
●
VCO frequency range: 800 MHz to 1600 MHz
●
Output frequency range: 12.5 MHz to 1600 MHz
●
Power-down mode: consumption is only due to leakage
●
Maximum lock time: 150 us
For the feedback reference clock, it is possible to choose between the output of an internal
divider (fbkdiv_M) and an external divider (dithering logic).
Because the configuration of the internal divider is static, the output frequency is a constant
value. The VCO frequency can be calculated as follows:
$$f_{VCO} = \frac{2 \cdot M[15:8] \cdot f_{in}}{N}$$
Where
–
fin/N is the reference clock after the prediv_N divider: the frequency range is
(4 MHz, 50 MHz).
–
fin is the frequency of the input clock listed in Table 10.
The output frequency can be calculated as follows:
$$f_{out} = \frac{2 \cdot M[15:8] \cdot f_{in}}{N \cdot 2^{P}}$$
Where:
–
M[15:8] is the feedback division factor: pll_fbkdiv_M[15:8] field of PLLx_FRQ
register
–
N is the pre division factor: pll_prediv_N field of PLLx_FRQ register
–
P is the post-division factor: pll_postdiv_P field of PLLx_FRQ register
Table 12 lists division factor ranges.
Table 12.
PLL division factors
| Factor | Range (decimal) |
|---------|-----------------|
| M[15:8] | 8 to 255 |
| N | 1 to 7 |
| P | 0 to 6 |
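The two formulas above can be checked numerically. The following is a minimal sketch, not a driver: the function name and the M, N, P values in the usage note are illustrative assumptions, chosen only because they reproduce the reset frequencies of Table 11 from a 24 MHz input. The actual reset values of the PLLx_FRQ fields are given in RM0089.

```python
def pll_frequencies(f_in_mhz, m, n, p):
    """Return (f_vco, f_out, vcodiv2, vcodiv4) in MHz for the internal feedback divider.

    m is the integer feedback factor M[15:8], n the predivision factor,
    p the postdivision exponent (the output is divided by 2**p).
    """
    # Ranges from Table 12.
    if not 8 <= m <= 255:
        raise ValueError("M[15:8] must be in 8..255")
    if not 1 <= n <= 7:
        raise ValueError("N must be in 1..7")
    if not 0 <= p <= 6:
        raise ValueError("P must be in 0..6")

    f_vco = 2 * m * f_in_mhz / n
    # VCO range from the analog PLL feature list.
    if not 800 <= f_vco <= 1600:
        raise ValueError("VCO frequency outside 800..1600 MHz")

    f_out = f_vco / (2 ** p)
    return f_vco, f_out, f_vco / 2, f_vco / 4
```

For example, with osci1 at 24 MHz, `pll_frequencies(24, 125, 6, 0)` gives a 1 GHz VCO and pllout, with vcodiv2 at 500 MHz and vcodiv4 at 250 MHz, and `pll_frequencies(24, 65, 3, 4)` gives a 1040 MHz VCO with pllout at 65 MHz and vcodiv2 at 520 MHz.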
The PLL also generates two other auxiliary clocks:
●
vcodiv2: VCO frequency divided by 2
●
vcodiv4: VCO frequency divided by 4
Table 13 can be used to evaluate the jitter introduced at the output. Note that the jitter
introduced by the input source is not taken into account, only device and supply noise is
considered here.
Table 13.
Jitter at PLL output clock
| Jitter type | A: Jitter due to supply noise (ps) | B: Jitter due to device noise (% of PLL output time period) | Total jitter |
|----------------------|----|------|-----------------|
| Single-period jitter | 25 | 0.16 | +/- (A + σ * B) |
| Cycle-to-cycle jitter | 25 | 0.32 | +/- (A + σ * B) |
The σ value is chosen depending on the percentage of the samples exceeding the
calculated jitter.
Table 14.
Selection of σ value
| σ value | Percentage of samples exceeding the jitter value (%) |
|---|------------|
| 1 | 31.73 |
| 2 | 4.555 |
| 3 | 0.27 |
| 4 | 6.30*1e-03 |
| 5 | 5.63*1e-05 |
| 6 | 2.00*1e-07 |
| 7 | 2.82*1e-10 |
For example:
If the output clock frequency is 1 GHz, the jitter is:
+/- (25 ps + 3* 0.32/100 * 1000 ps ) = +/- (25 ps + 9.6 ps ) = +/- 34.6 ps
Only 0.27 % of samples (3 σ) exceed the calculated jitter of +/-34.6 ps.
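The same calculation can be scripted for other output frequencies and σ values. This is a small sketch using the A and B figures from Table 13; the function and dictionary names are illustrative assumptions.

```python
SUPPLY_NOISE_PS = 25.0  # column A of Table 13, same for both jitter types

# Column B of Table 13: device noise as a percentage of the output period.
DEVICE_NOISE_PERCENT = {
    "single_period": 0.16,
    "cycle_to_cycle": 0.32,
}


def total_jitter_ps(f_out_hz, sigma, jitter_type="cycle_to_cycle"):
    """Return the +/- total jitter in picoseconds: A + sigma * B."""
    period_ps = 1e12 / f_out_hz
    device_noise_ps = DEVICE_NOISE_PERCENT[jitter_type] / 100.0 * period_ps
    return SUPPLY_NOISE_PS + sigma * device_noise_ps
```

Calling `total_jitter_ps(1e9, 3)` reproduces the worked example above (about 34.6 ps for cycle-to-cycle jitter at 3 σ).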
Dithering logic
●
Programmable modulation period
●
Selectable modulation depth. Recommended range: 0-2.5%
●
Selectable Sigma-Delta order. Recommended: 2nd order
●
Maximum modulation frequency : fmod(max) = 100 KHz
To enable the dithering logic, set the clksel signal to 1 (PLLx_CTR register, field
pll_control1[5]). In this mode the internal feedback divider is bypassed, and the external
logic is used to generate the VCO reference signal.
The external divider is driven in order to generate a triangular wave. The algorithm that
performs this modulation is based on a sigma-delta converter fed by a triangular wave.
When the external feedback is enabled, the output signal frequency is calculated as:
$$f_{out} = \frac{2 \cdot M \cdot f_{in}}{256 \cdot N \cdot 2^{P}}$$
Where M is the feedback division factor of the external divider (PLLx_FRQ register,
pll_fbkdiv_M field).
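As a rough illustration of the external feedback formula, the sketch below treats M as the full 16-bit pll_fbkdiv_M value, so that the division by 256 gives it 8 fractional bits. That reading is an assumption drawn from the formula itself, and the function name is hypothetical.

```python
def pll_out_external_feedback(f_in_mhz, m16, n, p):
    """Output frequency in MHz when the external (dithering) divider is selected.

    m16 is assumed to be the whole 16-bit pll_fbkdiv_M field, i.e. the
    feedback factor scaled by 256.
    """
    if not 0 <= m16 <= 0xFFFF:
        raise ValueError("pll_fbkdiv_M is a 16-bit field")
    return 2 * m16 * f_in_mhz / (256 * n * 2 ** p)
```

With `m16 = 125 << 8` the result matches the integer case (1000 MHz from 24 MHz with N = 6, P = 0), while `m16 = (125 << 8) + 128` corresponds to a feedback factor of 125.5 and yields 1004 MHz, a frequency that the internal divider cannot produce.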
PLL modes
Register: PLLx_CTR
Field: pll_control1[2:1]
Table 15.
PLL modes
| Mode | Description |
|------|-------------|
| Non dithered | The PLL behaves as a normal PLL (internal feedback divider). |
| Fractional-N | VCO frequencies can be selected that are not integer multiples of the reference frequency. In this mode the external divider is selected. |
| Dithering (double side modulation) | A triangular wave is added to the VCO frequency. |
| Dithering (single side modulation) | Similar to double side modulation, but the modulation only subtracts from the main frequency. |
In dithering mode, the PLLx_MOD registers configure the output clock modulation period
and slope parameters. Use the reference frequency (fref), the frequency of the modulation
wave (fmod) and the modulation depth (md) to compute the field values:
$$\text{pll\_modperiod} = \frac{f_{ref}\,[\text{kHz}]}{4 \cdot f_{mod}\,[\text{kHz}]}$$
8
f md M pll_slope = --------------------------------------pll_mod per iod
Where:
$f_{ref} = f_{osci} / N$ is the frequency at the output of the input divider
fmod is the frequency of the modulation wave
M = pll_fbkdiv_M
md is the modulation depth in respect of the nominal frequency of the undithered clock
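The modulation period field can be estimated as follows. This sketch covers only the pll_modperiod relation (the reference frequency divided by four times the modulation frequency); rounding to the nearest integer is an assumption, and the function name is illustrative.

```python
def pll_modperiod(f_osci_khz, n, f_mod_khz):
    """Return the pll_modperiod value for a given modulation frequency."""
    # Maximum modulation frequency from the dithering logic feature list.
    if f_mod_khz > 100:
        raise ValueError("fmod must not exceed 100 kHz")
    f_ref_khz = f_osci_khz / n  # frequency at the output of the input divider
    return round(f_ref_khz / (4 * f_mod_khz))
```

For instance, with a 24 MHz oscillator, N = 6 and a 50 kHz modulation wave, `pll_modperiod(24000, 6, 50)` evaluates to 20.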
5.4.3
Fractional clock generator (SSCG)
An SSCG is a clock synthesizer able to divide an input clock by a fractional factor.
Main features:
●
Input frequency (fin): 250 to 500 MHz
●
Output frequency (fout): fin/16 to fin
●
Single period jitter: maximum value +/- 230 ps
●
Output clock period resolution: $2^{-13} / f_{in}$
The output clock period is calculated as follows:
–
Tout = 2* To * Tin
where:
–
Tout is the output period
–
To is the division parameter; it is a 17-bit fixed point representation of the division
factor, with the first 3 MSBs representing the integer part.
–
Tin is the input clock period
Example
fin = 500 MHz , fout = 48 MHz
To = fin / (2 * fout ) = 5.2083
The corresponding fixed-point value (14 fractional bits) is calculated as:
To = 5.2083 * 2^14 = 85332 => To = 17b10100110101010100
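The conversion in this example can be sketched in a few lines. The function names are illustrative assumptions, and the fractional part is truncated, which is what reproduces the value in the example. Note that the example rounds To to 5.2083 before scaling; using the unrounded ratio 500/96 gives 85333 instead of 85332.

```python
FRACTION_BITS = 14  # 17-bit word, 3 integer bits, 14 fractional bits
WORD_BITS = 17


def sscg_division_factor(f_in_mhz, f_out_mhz):
    """Return To from Tout = 2 * To * Tin, i.e. To = fin / (2 * fout)."""
    return f_in_mhz / (2 * f_out_mhz)


def sscg_to_word(division_factor):
    """Encode To as a 17-bit fixed-point integer (fraction truncated)."""
    word = int(division_factor * (1 << FRACTION_BITS))
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("division factor does not fit in 17 bits")
    return word
```

`format(sscg_to_word(5.2083), "017b")` yields the bit pattern shown in the example.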
Table 16.
SSCGn output frequencies
| SSCG | Input clock | Input frequency range (MHz) | IP | Register |
|-------|-----------------------------|---------|--------------------|--------------------------------|
| SSCG0 | vco1div4, vco3div2, pll3out | 250-450 | VIDEO_DEC | GEN_CLK_SSCG0, PLL_CFG[28:27] |
| SSCG1 | vco1div4, vco3div2, pll3out | 250-450 | VIDEO_ENC | GEN_CLK_SSCG1, PLL_CFG[28:27] |
| SSCG2 | vco1div4, vco2div2, pll2out | 250-450 | SPDIF_OUT | GEN_CLK_SSCG2, PLL_CFG[30:29] |
| SSCG3 | vco1div4, vco2div2, pll2out | 250-450 | GPU, SPDIF_IN | GEN_CLK_SSCG3, PLL_CFG[30:29] |
| SSCG4 | vco1div2 | 250-600 | CPU, AMBA Subsystem | SYS_CLK_SSCG |
| SSCG5 | vco1div4, pll2out | 250-450 | CLCD | CLCD_CLK_SSCG |
| SSCG6 | vco1div2 | 250-600 | AMBA Subsystem | AMBA_CLK_SSCG |
5.4.4
XYSYNT clock divider
XYSYNT is a clock divider based on an integer counter.
The input clock can be divided by an integer value by setting the parameters X and Y.
The output frequency is calculated as follows:
Formula 1
$$f_{out} = f_{in} \cdot \frac{X}{Y}, \quad \text{with } X \le Y/2$$
In this case, the output signal is high for only one input clock period (see Figure 10).
Figure 10. X=1, Y=4 (duty cycle < 50 %)
[Figure: timing diagram showing the input clock period Tin and the output period Tout = Tin * Y/X.]
If a duty cycle of 50% is required, it is possible to use this formula:
Formula 2
$$f_{out} = f_{in} \cdot \frac{X}{2 \cdot Y}$$
Use the synt_clkout_sel field in the XYSYNT-related configuration registers to choose
between the two formulas:
●
synt_clkout_sel = 1 selects the first formula
●
synt_clkout_sel = 0 selects the second one (DC = 50%).
Note:
The maximum XYSYNT input frequency is 600 MHz.
To have a fixed output period, the Y/X ratio should be an integer.
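A short sketch of the two XYSYNT formulas follows. The function name is an illustrative assumption; the selection values follow the synt_clkout_sel description in this section.

```python
def xysynt_out_mhz(f_in_mhz, x, y, synt_clkout_sel):
    """Return the XYSYNT output frequency in MHz.

    synt_clkout_sel = 1 selects Formula 1 (fin * X / Y, duty cycle < 50%),
    synt_clkout_sel = 0 selects Formula 2 (fin * X / (2 * Y), duty cycle 50%).
    """
    if f_in_mhz > 600:
        raise ValueError("maximum XYSYNT input frequency is 600 MHz")
    if synt_clkout_sel == 1:
        if 2 * x > y:
            raise ValueError("Formula 1 requires X <= Y / 2")
        return f_in_mhz * x / y
    return f_in_mhz * x / (2 * y)
```

Both `xysynt_out_mhz(500, 1, 4, 1)` and `xysynt_out_mhz(500, 1, 2, 0)` divide a 500 MHz input down to 125 MHz; only the second produces a 50% duty cycle.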
Table 17 lists XYSYNT input and output clocks, and the XYSYNT-related configuration
registers.
Table 17.
XYSYNT clocks
XYSYNT
Input clock
Output clock
IP
Register
I2S_DIV1, I2S_DIV2
Vco1div2
pll2out,
pll3out
I2S_OUT_REFCLK
i2s1_sclk
i2s1_refclk
I2S_M
I2S_CLK_CFG
C3_CLK_SYNT
vco1div2
clk_c3_synt
C3
C3_CLK_SYNT
UART0_CLK_SYNT
vco1div2
clk_uart0_synt
UART0
UART0_CLK_SYNT
UART1_CLK_SYNT
vco1div2
clk_uart1_synt
UART1
UART1_CLK_SYNT
GMAC_CLK_SYNT
MAC_GTXCLK125
pll2out
osci3
clk_tx, clk_rx
GMAC
GMAC_CLK_SYNT
MCIF_SD_CLK_SYNT
vco1div2
clk_sd
MCIF (SD)
MCIF_SD_CLK_SYNT
MCIF_CFXD_CLK_SYNT
vco1div2
clk_cf_xd
MCIF(CF/XD) MCIF_CFXD_CLK_SYNT
ADC_CLK_SYNT
hclk
clk_adc
ADC
ADC_CLK_SYNT
5.4.5
AMBA clock configuration
The following clocks feed the AMBA subsystem:
●
CPU_CLK: the CPU clock nominally running at 500 MHz (PLL1 source). The maximum
frequency is 600 MHz.
●
HCLK/ACLK: the AHB/AXI clock, nominally running at 166 MHz
●
PCLK: the APB clock, nominally running at 83 MHz
By default, all these clocks are generated by the same root: SYS_CLK (see Figure 12:
AMBA clock generation), and the ratios between them are fixed:
●
sys_clk: cpu_clk = 2:1
●
cpu_clk: hclk = 3:1
●
hclk:pclk = 2:1
SSCG4 can also be used as the source for the hclk/pclk clocks. This makes it possible to
decouple the source of cpu_clk from that of the hclk/pclk clocks:
●
sys_clk : cpu_clk = 2:1
●
hclk: pclk = 2:1
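The ratios listed above can be expressed as a small helper. This is a sketch under the stated ratios only; the function name and the optional parameter for an SSCG-sourced HCLK are illustrative assumptions.

```python
def amba_clocks_mhz(sys_clk_mhz, hclk_from_sscg_mhz=None):
    """Return (cpu_clk, hclk, pclk) in MHz.

    By default all clocks derive from sys_clk with fixed ratios
    (sys_clk:cpu_clk = 2:1, cpu_clk:hclk = 3:1, hclk:pclk = 2:1).
    If hclk_from_sscg_mhz is given, hclk is decoupled from cpu_clk and
    only the hclk:pclk = 2:1 ratio still applies.
    """
    cpu_clk = sys_clk_mhz / 2
    if hclk_from_sscg_mhz is None:
        hclk = cpu_clk / 3
    else:
        hclk = hclk_from_sscg_mhz
    pclk = hclk / 2
    return cpu_clk, hclk, pclk
```

With a 1 GHz system clock, `amba_clocks_mhz(1000)` gives a 500 MHz CPU clock, an HCLK of roughly 166 MHz and a PCLK of roughly 83 MHz.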
Because the AMBA clocks feed most of the SoC registers, they are responsible for most of
the dynamic power consumption. To optimize power resources, the AMBA subsystems can
be set to three different power modes, depending on the source of their clocks.
A system clock controller (see Figure 11) defines the system clock (SYS_CLK in) source.
See also: Section 5.5.2: Changing system modes
Figure 11. System clock controller
[Figure: state diagram of the AMBA subsystems power modes NORMAL, SLOW and DOZE. MRESET leads to DOZE; the transitions are labeled PLL_TIMOUT (between NORMAL and SLOW) and XTAL_TIMOUT (between SLOW and DOZE).]
NORMAL: System clocks' source is a PLL output or SSCG6. Nominally: PLL1 at 1 GHz, PLL3 at 1.2 GHz.
SLOW: System clocks are driven by the osci1 (default) clock or by its divided version; use the oscidiv_cfg and oscidiv_en fields of SYS_CLK_CTRL to enable and set the divisor.
DOZE: Reset state. System clock is driven by a low frequency oscillator. After reset, osci1 is selected; resetting the osci2_dis bit of PERIP_CLK_CFG selects osci2 (32 kHz).
Notes:
Switching between clock sources PLL2 and PLL3 can produce glitches; when switching from one to the other, use SLOW mode. All other clock switches are glitchless.
If osci2_dis is set, it is not possible to switch from SLOW mode to DOZE mode.
Use the following registers to switch among system modes:
– SYS_CLK_CTRL
– SYS_CLK_OSCITIMER
– SYS_CLK_PLLTIMER
All of the clocks come either from external pads or from the internal signal clk_int. Figure 12
illustrates the system clocks (SYS_CLK, HCLK and PCLK) generation circuit.
In the boot process the system switches from DOZE to SLOW mode using the osci1 clock;
once in SLOW mode, it is not possible to switch back from SLOW to DOZE mode if
osci2_dis is set.
All clock transitions between modes are performed without glitches. Once in NORMAL
mode, the clock can be switched without a glitch, between pll1out, SSCG6 output, and
pll2out/pll3out; only a switch between pll2out and pll3out can cause glitches.
To change between the pll2out and pll3out sources, the system must first leave NORMAL
mode (for instance, by switching to SLOW mode), then change the clksys_src setting, and
finally switch back to NORMAL mode.
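The switching rule can be summarized as a small planning helper. This is only a sketch of the sequence described in the text: the source names are used as plain strings, the returned steps are descriptive text rather than register writes, and the function name is hypothetical.

```python
# The only pair of NORMAL-mode sources whose direct switch can glitch.
GLITCHY_PAIR = frozenset({"pll2out", "pll3out"})
NORMAL_SOURCES = ("pll1out", "sscg6", "pll2out", "pll3out")


def plan_source_switch(current, target):
    """Return the ordered steps to change the NORMAL-mode clock source."""
    for name in (current, target):
        if name not in NORMAL_SOURCES:
            raise ValueError("unknown clock source: " + name)
    if current == target:
        return []
    if frozenset({current, target}) == GLITCHY_PAIR:
        # pll2out <-> pll3out must not be switched while in NORMAL mode.
        return [
            "switch to SLOW mode",
            "set clksys_src to " + target,
            "switch to NORMAL mode",
        ]
    return ["set clksys_src to " + target]
```

A switch from pll1out to pll3out needs a single step, whereas a switch from pll2out to pll3out is wrapped in a detour through SLOW mode.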
To select the source in NORMAL mode, configure the clksys_src field of SYS_CLK_CTRL
register.
To configure SSCG6, configure the register SYS_CLK_SSCG; it is fed by vco1div2
(nominally at 500 MHz).
To decouple the HCLK (ACLK) clock from SYS_CLK (and so the CPU clock), program the
hclk_sel field of register SYS_CLK_CTRL.
Setting hclk_sel selects the SSCG4 output for HCLK, making it possible to set a different
ratio between CPU_CLK and HCLK.
The maximum frequency for HCLK is 166 MHz.
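Figure 12. AMBA clock generation
[Figure: the CLOCK CTRL block, driven by sys_mode_req (SYS_CLK_CTRL), selects sys_clk (clk1ghz) from osci2 (DOZE), osci1div (SLOW, osci1 through OSCIDIV configured by oscidiv_cfg in SYS_CLK_CTRL), or the NORMAL source. The NORMAL source is selected by clksys_src (SYS_CLK_CTRL) among pll1out, pll2out, pll3out (GLM3) and an SSCG output fed by vco1div2. sys_clk passes through div6 into the HCLK MUX, controlled by hclk_sel (SYS_CLK_CTRL), which outputs hclk; hclk passes through div2 to produce pclk. Labeled blocks and registers: SSCG6, SSCG4, SYS_CLK_SSCG, AMBA_CLK_SSCG.]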
5.4.6
A9SM clock configuration
The main A9SM clock source is the CLK1GHZ (see Figure 13).
Because CLK1GHZ is connected to SYS_CLK (see Figure 12 ), its nominal value is 1 GHz
when the PLL1 source is selected. The maximum value is 1.2 GHz, when cpu_clk=
600 MHz.
For further details on configuring SYS_CLK clock, see AMBA clock configuration on
page 68.
Figure 13. A9SM clock domain
[Figure: the CORTEXA9INTEGRATION block containing A9 Core #0 and A9 Core #1 (each with DBG, PMU, PTM, CTI, TRM and WD blocks), the SCU, the GIC (128 IRQs), the PL310 (with address filtering), the ACP and the two AXI master ports (AXI 0, AXI 1), together with the CoreSight subsystem (CTM, CTI, ETB, TPIU, ROM, Funnel 1, Funnel 2, Replicators, DAP, APB decoders) and the clock manager (ClkMan). Clocks shown: CLK1GHZ, PCLK, ACLKM0, ACLKM1, ACLKS_SOC, ATCLK_SOC, PCLKDBG_SOC and JTAG TCK. Legend: AXI, APB, APB Debug, ATB, Triggers, Synchronizer, 500 MHz block.]
Table 18.
A9SM clocks
A9SM clock
Maximum frequency
MISC register
Description
CLK1GHZ
SYS_CLK at 1.2 GHz
Main clock source
CLK_CORE
(CLK1GHZ /2) at 600 MHz
Used by the two internal CPUs;
generated by dividing CLK1GHZ by two,
and is nominally 600 MHz.
PERIPHCLK
(CLK1GHZ /4) at 300 MHz
ATCLK
(CLK1GHZ /4) at 300 MHz
NA
Used by internal peripherals (WD, GIC);
generated by dividing CLK1GHZ by four,
and is nominally 300 MHz
Feeds the TRACE unit; derived from
CLK1GHZ, and runs at 300 MHz.
The clocks for the two AXI interfaces;
they are in phase with the system.
ACLKM0, ACLKM1 aclk at 166 MHz
Doc ID 018553 Rev 3
71/590
Reset and clock generator (RCG)
Table 18.
RM0078
A9SM clocks (continued)
A9SM clock
Maximum frequency
MISC register
Description
ACLKS_SOC
aclk at 166 MHz
PERIP2_CLK_ENB[6]
Used by the SCU, and runs at the same
frequency as system clock HCLK.
PCLK
pclk at 83 MHz
PERIP1_CLK_ENB[0]
The APB interface clock, in phase with
system clock pclk.
5.4.7
GMAC clock configuration
The GMAC block supports the following PHY interfaces:
●
GMII: Gigabit media independent interface
●
RGMII: Reduced GMII
●
MII: Media independent interface
●
RMII: Reduced MII
To select the PHY interface, configure the register GMAC_CLK_CFG[5:3], macphy_sel field.
Table 19 lists the GMAC clocks for all interfaces.
All of the clocks come either from external pads or from the internal signal clk_int (see
Figure 14)
Table 19.
GMAC clocks
| macphy_sel field | Source: clk_tx (MHz) | Source: clk_rx (MHz) | MAC_GTXCLK (MHz) |
|------------|-------------------------|-----------------------------|------------------|
| 000: MII | 25 / 2.5 (MAC_TXCLK) | 25 / 2.5 (MAC_RXCLK) | – |
| 000: GMII | 125 (clk_int) | 125 (MAC_RXCLK) | 125 |
| 001: RGMII | 125 / 25 / 2.5 (clk_int) | 125 / 25 / 2.5 (MAC_RXCLK) | 125 (both edges) |
| 100: RMII | 25 / 2.5 (clk_int) | 25 / 2.5 (clk_int) | 50 |
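Figure 14. GMAC clock generation
[Figure: clk_sel selects pll2out or osci3 as the input of GMAC_CLK_SYNT; synth_en (GMAC_CLK_CFG) selects clk_int between the MAC_GTXCLK125 pad and the GMAC_CLK_SYNT output. Depending on macphy_sel and mac_speed (GMAC_CLK_CFG), clk_int is divided (labels: 1, 5, 50 and 2, 20) to produce clk_tx (GMII, RGMII, RMII) and clk_rmii, and is driven on the MAC_GTXCLK pad. In MII mode clk_tx comes from the MAC_TXCLK pad. clk_rx comes from the MAC_RXCLK pad (MII, GMII, RGMII) or from the internal clock (RMII).]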
Table 20.
Setting GMAC clocks to different modes
| PHY | Clock | Source | Description |
|-----|-------|--------|-------------|
| MII | clk_rx, clk_tx | MAC_TXCLK, MAC_RXCLK pads | In this mode, there is no need to configure a divider or MUX. |
| GMII | clk_rx, clk_tx | clk_rx: pad (MAC_RXCLK); clk_tx = clk_int | GMAC_CLK_CFG and GMAC_CLK_SYNT must be programmed to set clk_int = 125 MHz. clk_int is also present on MAC_GTXCLK for the external PHY. |
| RMII | clk_rx, clk_tx | clk_int = clk_rmii | clk_int must be set to 50 MHz. Internal dividers generate 25 MHz and 2.5 MHz. clk_rmii is also present on MAC_GTXCLK for the external PHY. The clk_int clock can be selected between the pad MAC_GTXCLK125 and the output of the GMAC_CLK_SYNT divider using the synth_en field of register GMAC_CLK_CFG. The GMAC_CLK_CFG and GMAC_CLK_SYNT registers can be used to: set the XYSYNT source using clk_sel; program the XYSYNT division factors; enable XYSYNT (synth_en = 1). |
| RGMII | clk_rx, clk_tx | clk_tx = clk_int; clk_rx = MAC_RXCLK | clk_int must be set to 125 MHz. The internal divider generates the 25/2.5 MHz frequencies based on the mac_speed signal when in 100/10 Mbit/s. To configure the clk_int clock, use the GMAC_CLK_SYNT and GMAC_CLK_CFG registers. There is no need to configure clk_rx; it is connected directly to the pad. |
Examples:
In GMII and RGMII modes, clk_int = 125 MHz using PLL2 as source
1.
Program PLL2 to generate a 500 MHz clock.
2.
Select pll2out as the source for GMAC_CLK_SYNT by setting clk_sel = 2b01.
3.
Configure the GMAC_CLK_SYNT to divide by 4:
a)
synth_clkout_sel = 0
b)
synt_xdiv=1
c)
synt_ydiv=2
4.
Enable the GMAC_CLK_SYNT source for clk_int by setting synth_en = 1.
5.
Select the GMII/RGMII mode through macphy_sel field of GMAC_CLK_CFG register.
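The arithmetic behind this example can be checked with a short sketch. It assumes the 50% duty-cycle XYSYNT formula (fout = fin * X / (2 * Y)) and the internal divisors of 5 and 50 for 100 Mbit/s and 10 Mbit/s operation; the function names and the speed-to-divisor table are illustrative assumptions, not register definitions.

```python
def gmac_clk_int_mhz(pll2out_mhz, synt_xdiv, synt_ydiv):
    """clk_int from GMAC_CLK_SYNT with the 50% duty-cycle formula."""
    return pll2out_mhz * synt_xdiv / (2 * synt_ydiv)


# Link speed in Mbit/s -> divisor applied to the 125 MHz clk_int in RGMII mode.
RGMII_DIVISOR = {1000: 1, 100: 5, 10: 50}


def rgmii_clk_tx_mhz(clk_int_mhz, link_speed_mbit):
    """clk_tx in RGMII mode for a given link speed."""
    return clk_int_mhz / RGMII_DIVISOR[link_speed_mbit]
```

With PLL2 at 500 MHz, synt_xdiv = 1 and synt_ydiv = 2, `gmac_clk_int_mhz(500, 1, 2)` gives the required 125 MHz, from which `rgmii_clk_tx_mhz` derives 25 MHz and 2.5 MHz for the slower link speeds.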
In RMII mode, clk_int = 50 MHz using MAC_GTXCLK125 as source
1.
Select MAC_GTXCLK125 as the source for GMAC_CLK_SYNT by setting clk_sel =
2b01.
2.
Select pll2out as the source for GMAC_CLK_SYNT by setting clk_sel = 2b00.
3.
Disable the GMAC_CLK_SYNT by setting synth_en = 0 in GMAC_CLK_CFG register
4.
Select the RMII mode setting macphy_sel = 3'b100.
5.4.8
I2S clock configuration
RCG generates two clocks for the I2S master block:
●
I2S_M_SCLK to internal I2S_M block and I2S_OUT_BITCLK to external device
●
I2S_OUT_OVRSAMP_CLK to external device.
I2S_M_SCLK is generated from I2S_OUT_OVRSAMP_CLK, and it is synchronous to it.
To configure the clock sources and the dividers parameters, use the I2S_CLK_CFG register.
The serial clock of the I2S slave block (I2S_S) is provided by the on-board I2S master
device, through the I2S_IN_BITCLK pad.
Figure 15. I2S_M clock generation
[Figure: refout_div_src (I2S_CLK_CFG) selects among vco1div2, pll2out, pll3out and I2S_OUT_REFCLK. I2S_DIV1, configured by refout_div_x,y, refout_div_sel and refout_div_en (I2S_CLK_CFG), generates I2S_OUT_OVRSAMP_CLK. I2S_DIV2, configured by sclk_div_x,y, sclk_div_sel and sclk_div_en (I2S_CLK_CFG), generates I2S_M_SCLK for the I2S_M block and I2S_OUT_BITCLK. The I2S_S block is clocked from I2S_IN_BITCLK.]
5.4.9
UART clock configuration
The UART clock can be generated by three sources:
●
48 MHz clock from USBPHY (clk_usb48 in Figure 16)
●
24 MHz clock coming from the main oscillator (osci1 in Figure 16)
●
vco1div2: through the UARTx_SYNT, the vco1div2 clock is divided to generate
clk_uartx_synt (refer to the UARTx_CLK_SYNT register)
To select the source, use the MISC register PERIP_CLK_CFG.
The clk_uartx clock is divided internally in the UART block to generate the desired baud
rate (see Chapter 25: Asynchronous serial ports (UART)).
Figure 16. UART clock generation
uartclkx_sel
(PERIP_CLK_CFG)
clk_usb48
osci1
vco1div2
clk_uartx
UARTx_SYNT
clk_uartx_synt
UARTx_CLK_SYNT
5.4.10
C3 clock configuration
The C3 clock (clk_c3 in Figure 17) has two sources:
●
48 MHz clock from USBPHY (clk_usb48 in Figure 17)
●
vco1div2: C3_SYNT divides the vco1div2 clock to generate the clk_c3_synt (see
register C3_CLK_SYNT)
To select the source, use the MISC register PERIP_CLK_CFG.
Figure 17. C3 clock generation
c3clk_sel
(PERIP_CLK_CFG)
clk_usb48
clk_c3
vco1div2
C3_SYNT
clk_c3_synt
C3_CLK_SYNT
5.4.11
CLCD clock configuration
To select the CLCD clock, you must configure the following registers:
●
PLL_CFG[31] to select the SSCG input clock source.
●
CLCD_CLK_SSCG register to configure SSCG5.
●
PERIP_CLK_CFG[3:2] to select:
–
48 MHz clock coming from the USB PHY (clk_usb48 in Figure 18):
–
the SSCG5 clock (clk_sscg5 in Figure 18)
–
XGPIO123 primary pad
–
pll3out
The clk_clcd clock is only one option for the CLCD panel clock. For more information, see
Chapter 34: Display controller (CLCD).
Figure 18. CLCD clock generation
clcdclk_sel
(PERIP_CLK_CFG)
clk_usb48
vco1div4
SSCG5
clk_sscg5
clk_clcd
pll2out
XGPIO132
CLCD_CLK_SSCG
pll3out
clcd_synth_sel
(PLL_CFG)
5.4.12
GPT clock configuration
All four GPT prescalers (see Chapter 10: General purpose timers (GPT)) are fed by the
clock clk_timer.
To select the clk_timer source, use the PERIP_CLK_CFG register, field gpt_clk_sel:
●
If gpt_clk_sel = 0 (default), osci1 is selected.
●
If gpt_clk_sel = 1, pclk is selected. In this case, the clk_timer is synchronous with APB
pclk.
Disabling the clk_timer
When the CPUs are in debug state, the gpt_dbg_en field of the SOC_CFG register can be
configured to disable the clk_timer:
00: clk_timer is not gated when CPUs enter debug state.
01: clk_timer is gated when CPU0 enters debug state.
10: clk_timer is gated when CPU1 enters debug state.
11: clk_timer is gated when either CPU0 or CPU1 enters debug state.
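The four gpt_dbg_en encodings behave like a two-bit mask, with one bit per CPU. The sketch below models that reading; the function name is an illustrative assumption, and the bit-per-CPU interpretation is inferred from the four encodings listed above.

```python
def clk_timer_gated(gpt_dbg_en, cpu0_in_debug, cpu1_in_debug):
    """Return True if clk_timer is gated for the given debug states.

    gpt_dbg_en is the 2-bit field value: bit 0 enables gating on CPU0
    debug entry, bit 1 enables gating on CPU1 debug entry.
    """
    if not 0 <= gpt_dbg_en <= 0b11:
        raise ValueError("gpt_dbg_en is a 2-bit field")
    gate_on_cpu0 = bool(gpt_dbg_en & 0b01)
    gate_on_cpu1 = bool(gpt_dbg_en & 0b10)
    return (gate_on_cpu0 and cpu0_in_debug) or (gate_on_cpu1 and cpu1_in_debug)
```

For example, with `gpt_dbg_en = 0b01` the timer clock stops only while CPU0 is in debug state, regardless of CPU1.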
5.4.13
MPMC clock configuration
The memory controller uses two clock sources:
● The first clock source, used for the six AXI data ports (aclk_MPMC) and the AHB register port (hclk_MPMC), is the same as that used in the system for the AMBA interconnect; the default frequency is 166 MHz. This clock can be enabled/disabled using the mpmc_amba_clken field of the PERIP2_CLK_ENB register. The source for this clock is hclk (see Figure 12).
● The second clock source is used by the memory controller, the physical (PHY) interface, the MIM structure, and the on-board DDR memory. The frequency ratio between the memory controller and the PHY interface is fixed at 1:2. The maximum frequency for the PHY interface and memory interface is 533 MHz. The memory interface has a dedicated clean clock source (clk_MPMC_ddr). This clock runs asynchronously with respect to hclk and cpu_clk.
The frequency can be programmed through the MISC registers PLL4_FREQ and PLL4_CTR. The controller and the PHY clock can be enabled/disabled by the mpmc_ctrl_phy_clken field of the miscellaneous register PERIP2_CLK_CFG.
The MIM is used to translate the controller frequency into the PHY frequency. Figure 19 shows the memory controller clock relationships.
Figure 19. MPMC clocks scheme
(Block diagram not reproduced. Labels shown: osci1 (24 MHz), PLL4, clk_mpmc_phy (533 MHz), /2 divider, clk_mpmc_ctrl (266 MHz), clock_mod, aclk_ddr_ctrl, hclk_ddr_ctrl, MPMC, DFI Bridge, DDR PHY, DDR.)
5.4.14
Gate unit
The IP clocks can be enabled and disabled through registers PERIP1_CLK_ENB,
PERIP2_CLK_ENB, and PERIP3_CLK_ENB.
Note:
For the complete clock list, see Table 9: RCG clocks.
The clock enable sequence needs to be done at system start, before the IP software reset
release (see PERIPx_SW_RST register description).
Disabling/enabling the clock when the IP is not in reset state could produce glitches on the
clock line.
5.4.15
Reset generator
The main hardware reset is asserted by the MRESETn pad.
As shown in Figure 20, the reset signal passes through a filter that suppresses glitches with
widths less than 9 ns.
To generate the hresetn root reset, the reset is first synchronized on the osci1 clock, and
then on PCLK.
Most SoC resets are generated starting from the main hresetn through a reset module
(dashed box in Figure 20): the reset is asserted asynchronously with respect to the relevant clock, and
it is released synchronously on the same clock. Figure 21 shows the reset sequence.
Figure 20. Reset generator
(Block diagram not reproduced. Signals shown: MRESET, sw_reset (from MISC), cache_parity_fail (from A9SM), wdog_req (from A9SM), ack_power_state and ack_power_state[3] (from PCM), PERIP_SW_RST (from MISC), die_id_valid, resetn_cpu (to A9SM), nresets (to clocksys), hresetn, hresetn_x, presetn_x, rstn_x. The synchronizer stages are clocked by osci1, pclk, and clk1ghz; the diagram also contains a delay element (dly) and a block labeled "Count 60".)
Figure 21. Reset waveform
(Waveform not reproduced. Signals shown: pclk (sys_clk/2), MRESETn, sw_reset, cache_parity_fail, hresetn. Annotations: "60 PCLK pulses", "@2 MHz".)
When a power island is switched off, all of the resets of that island are asserted. This is
accomplished by the PCM module through the ack_power_state[3:0] signals. For more
details, see Chapter 6: Power control module (PCM).
The A9SM module can assert a reset when the internal watchdog timer expires.
When this happens, the whole SoC except the RCG, MISC, PCM, and A9SM modules is reset and the CPUs
jump to the BOOT code. For more details, see Chapter 2: CPU subsystem (A9SM).
A software reset is available by programming the SYS_SW_RES register; this reset acts like
the main hardware reset.
The PERIP1_SW_RST, PERIP2_SW_RST, and PERIP3_SW_RST registers are used to
assert a software reset to individual IPs.
Table 21 summarizes the reset sources and targets.
Table 21. Reset sources
Reset source | Source | Target | Register & reference
MRESETn | External | SoC | –
Ack_power_state[n] | PCM | Power Island[n] | PCM_CFG
Sw reset | MISC | SoC | SYS_SW_RES
Cache_parity_fail | A9SM | SoC | A9SM_PARITY_CFG
wdog_req(1) | A9SM | SoC – {RCG, PCM, MISC, A9SM, USBPHY, DDRPHY, MIPHY} | See Chapter 2
PERIPx_SW_RST[n] | MISC | IP[n] | PERIP1_SW_RST, PERIP2_SW_RST, PERIP3_SW_RST
1. Note that after a watchdog reset only the CPUs and other blocks are reset (not the entire SoC), hence a
software reset has to be asserted to guarantee that the system works properly.
5.5
Programming
5.5.1
Programming PLLs
1. Start the analog PLL:
a) Set the PLLx_FRQ.
b) Enable the PLL: register PLLx_CTR field pll_enable = 1.
c) Wait until the PLL is locked: register PLLx_CTR field pll_lock = 1.
2. Switch to an external divider:
a) Change the feedback divider from internal to external: register PLLx_CTR field pll_control1[5] = 1. Everything but dithering mode can be changed here (modulation period, slope).
b) Toggle pll_control1[0] from 1 to 0, and back to 1.
c) Wait until the PLL is locked.
d) Change to dither mode: pll_control1[2:1].
e) Toggle pll_control1[0] from 1 to 0, and back to 1.
3. Make modulation changes:
a) Program dither mode to off; change the modulation period and slope as needed.
b) Toggle pll_control1[0] from 1 to 0, and back to 1.
c) Wait until the PLL is locked.
d) Change to dither mode again.
e) Toggle pll_control1[0] from 1 to 0, and back to 1.
The lock signal is meaningless during modulation.
5.5.2
Changing system modes
1. Enable the xtal and PLL timeout counters by setting the xtaltimeout_en and plltimeout_en bits of register SYS_CLK_CTRL.
2. Set the SYS_CLK_OSCITIMER timeout value for the DOZE to SLOW transition.
For example, SYS_CLK_OSCITIMER = 0x100.
3. Set the SYS_CLK_PLLTIMER timeout value for the SLOW to NORMAL transition.
For example, SYS_CLK_PLLTIMER = 0x100.
4. Set SYS_CLK_CTRL[sys_mode_req] = 0x2 to switch to SLOW mode.
5. Configure PLL1 at 1 GHz (see Section 5.5.1: Programming PLLs).
6. Set SYS_CLK_CTRL[sys_mode_req] = 0x4 to switch to NORMAL mode.
5.5.3
Setting cpu_clk = 600 MHz and hclk = 166 MHz
1. Configure the system to SLOW mode following steps 1 to 4 described in Section 5.5.2: Changing system modes.
2. Set PLL1 to 1 GHz.
3. Set PLL2 to 1.2 GHz.
4. Configure the SSCG4 clock to 166 MHz and wait for AMBA_CLK_SSCG[lock] = 1.
For example, AMBA_CLK_SSCG[T0] = 0x603B since vco1div2 = 500 MHz.
5. Select the SSCG4 source for hclk.
For example, SYS_CLK_CTRL[hclk_sel] = 1.
6. Select PLL2 for NORMAL mode.
For example, SYS_CLK_CTRL[clksys_src] = 3'b110.
7. Switch to NORMAL mode.
5.5.4
Configuring the fractional clock generator (SSCG)
This section gives a configuration example for SSCG:
fin = 500 MHz, fout = 48 MHz
The value for To is:
To = fin / (2 * fout) = 5.2083
The corresponding fixed-point value is calculated as:
To = 5.2083 * 2^14 = 85332 => To = 17'b10100110101010100
Modulation example:
fm = 100 kHz, Dt = 2.5% and fin = 500 MHz
Dt = 0.025
Since LSB = 2^-9 and the field is 8 bits wide:
Dt = 0.025 * 2^9 = 12.8 => Dt = 8'b00001100
fmod = fm / fin = 0.0002
Since the fm field is encoded with 8 bits and LSB = 2^-16:
fmod = 0.0002 * 2^16 = 13.1 => fmod = 8'b00001101
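The fixed-point conversions above can be checked with a few lines of Python. This is a minimal sketch that assumes the fractional part is truncated (12.8 becomes 12 in the Dt example); the function names are illustrative only.

```python
from fractions import Fraction


def to_fixed(value: Fraction, frac_bits: int) -> int:
    """Truncate a non-negative value to fixed point with frac_bits fractional bits."""
    return int(value * (1 << frac_bits))


def sscg_to(fin_hz: int, fout_hz: int) -> int:
    """To = fin / (2 * fout), encoded with LSB = 2^-14."""
    return to_fixed(Fraction(fin_hz, 2 * fout_hz), 14)


def sscg_dt(depth: Fraction) -> int:
    """Modulation depth Dt, encoded with LSB = 2^-9."""
    return to_fixed(depth, 9)


def sscg_fmod(fm_hz: int, fin_hz: int) -> int:
    """fmod = fm / fin, encoded with LSB = 2^-16."""
    return to_fixed(Fraction(fm_hz, fin_hz), 16)


if __name__ == "__main__":
    print(sscg_to(500_000_000, 48_000_000))
    print(format(to_fixed(Fraction("5.2083"), 14), "017b"))
    print(sscg_dt(Fraction(25, 1000)), sscg_fmod(100_000, 500_000_000))
```

Note that converting the exact ratio 500/96 gives 85333, one LSB above the 85332 obtained in the example from the rounded intermediate value 5.2083. The difference in output frequency is negligible, but it explains why a computed value may not match the example bit for bit.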
6
Power management
This chapter focuses on SPEAr1340 power management.
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
6.1
Overview
The SPEAr1340 device enables you to choose from among a significant number of different
configurations that can optimize the overall power consumption, depending on the target
application. To this end, in addition to the usual technology-related and architecture-based
solutions (LP libraries, high threshold-voltage cell usage, fine-grain clock gating, and power-aware synthesis flow), SPEAr1340 also employs high-level mechanisms that enable savings
both on leakage and dynamic power.
The leakage and power configuration-specific savings are obtained by deploying a power
shutoff strategy over a hierarchical partitioning of the design in power domains that
responds to both a logical and a physical rationale. These techniques lead to the definition
of shutoff modes.
6.2
Power domains
SPEAr1340 is partitioned into power domains: sections of core logic with digital supplies that
can be independently managed through the deployment of embedded switches.
SPEAr1340 has five power domains:
● AlwaysON
● ARM
● CODEC
● GPU
● BUS
Figure 22 shows these power domains and the IPs present in each domain. On the left side
of the figure you can see the macros with dedicated supply pads and the IOs.
Note:
The power domain names reflect only a part of the supported features.
Figure 22. SPEAr1340 power islands
(Diagram not reproduced. Contents recovered from the figure labels:)
Power domains and the IPs present in each domain:
● AlwaysOn: Always-on RAM, GMAC, GPIO, MISC, PCM, RCG, RTC, UOC
● ARM: ARM Cortex-A9, L2 cache
● BUS: A9SM register slices, ADC, BootROM, BUSMATRIX, C3, CLCD, DMAC, FSMC, GPT, I2C, I2S, KBD, MCIF, MPMC, PWM, SMI, SSP, SRAM, UART, UHC
● CODEC: PCIE/SATA (MiPHY), VDEC, VENC
● GPU: CAM, CEC, GPU, S/PDIF, VIP
Hard macros with dedicated supply pads:
● DDR PHY (I/Os: 1V5/1V8, logic: 1V2, PLL: 2.5V + 1.2V)
● RTC OSCI 32 KHz (1.5V)
● USB2.0 PHY, 3 ports (2.5V + 1.2V_3V3)
● PCIE/SATA PHY (MiPHY, 1 port) (2.5V + 1.2V)
● ADC (2.5V, + 1.2V power_bus)
● System PLLs & OSCI (1.2V + 2.5V)
I/Os: 3V3 TTL I/Os, 3V3 TTL/1V8 CMOS I/Os, 3V3 TTL/2V5 CMOS I/Os, 3V3 PCI/TTL I/Os
Note:
1 RTC is battery operated; it has a non-switchable power supply.
2 MiPHY has a dedicated, non-switchable power supply.
3 The DDR PHY has a dedicated, board-switchable power supply.
6.2.1
Power domain management: power states
The power islands can be configured in five different states, as shown below:
Table 23. Allowed power states
ID(1) | Power state | Comment
0 | AlwaysOn-Only | All switchable power islands are OFF. DDRPHY can selectively be switched (board switches) ON or OFF.
4 | ARM + BUS | –
5 | ARM + CODEC + BUS | –
6 | ARM + GPU + BUS | –
7 | ON | All switchable power islands are ON.
1. These IDs refer to power states numbering in Figure 23 below.
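When writing power management software, it can help to keep the allowed target states in a small lookup table and to reject any other ID before touching the hardware. The Python sketch below encodes Table 23 as data; the names are illustrative, and the internal transition-only states of Figure 23 are deliberately left out.

```python
# Switchable domains that are ON in each user-selectable power state (Table 23).
ALLOWED_POWER_STATES = {
    0: frozenset(),                                  # AlwaysOn-Only
    4: frozenset({"ARM", "BUS"}),
    5: frozenset({"ARM", "CODEC", "BUS"}),
    6: frozenset({"ARM", "GPU", "BUS"}),
    7: frozenset({"ARM", "CODEC", "GPU", "BUS"}),    # ON
}


def is_allowed_target(state_id: int) -> bool:
    """Return True if state_id is one of the power states a user may request."""
    return state_id in ALLOWED_POWER_STATES


def switched_on_domains(state_id: int) -> frozenset:
    """Return the switchable domains powered in the given state."""
    if not is_allowed_target(state_id):
        raise ValueError(f"power state {state_id} is not an allowed target")
    return ALLOWED_POWER_STATES[state_id]
```

The AlwaysOn domain is not listed because it is never switched off.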
Figure 23 is a visual representation of the allowed states and transitions. The power states
marked in blue color are only used internally for transitions.
The end user may set as target any of the allowed power states listed in the graphic. The
hardware automatically sequences the power transitions as shown below. Sequencing is
applied to guarantee power supply integrity: two power domains are never switched on
simultaneously.
Figure 23. Power states transition graph
(State diagram not reproduced. States shown: POWERDOWN, SLEEP, ARM, ARM+BUS, ARM+BUS+CODEC, ARM+GPU+BUS, GPU, GPU+BUS, GPU+BUS+CODEC. Transitions are labeled TURN ARM ON/OFF, TURN BUS ON/OFF, TURN CODEC ON/OFF, and TURN GPU ON/OFF.)
In the case of transitions starting from those power states in which the ARM core is powered
on, the target state is programmed by the CPU itself. In the case of transitions starting from
the power states in which no processing core is available, an alternative mechanism is
needed and made available: a 'wakeup event' triggers the load of the desired configuration.
The possible sources of wakeup events are listed in the table below.
Table 24. Allowed wakeup events
Wakeup event | Description
USB wakeup trigger | Exit from USB SUSPEND mode
GPIO wakeup trigger | '0' to '1' event on the dedicated GPIO
RTC wakeup trigger | ALARM interrupt event
GMAC wakeup trigger | Interrupt event generated by the reception of a valid wakeup frame (magic packet or remote wakeup frame)
6.2.2
Power domain management: configuration registers
This section lists the registers (both IP-specific and miscellaneous) related to the power
management of switchable domains. The second column shows the state of these registers
after SoC power-up/reset. These settings correspond to state “RESET”.
Table 25. Power management configuration registers
Base address | Offset | Register name | Register fields
E0700 | 100 | PCM_CFG | All the register fields are used: wakeup_en, wakeup_trig, sw_config, config_ack, config_bad, ack_power_state, ddr_phy_no_shutoff
E0700 | 104 | PCM_WKUP_CFG | All the register fields are used: rtc_wkup_config, gpio_wkup_config, usbdev_wkup_config, ethernet_wkup_config
E0700 | 108 | SWITCH_CTR | All the register fields are used: pd1_ctrl, pd2_ctrl, pd3_ctrl, pd4_ctrl
E0700 | 310 | PERIP2_CLK_ENB | mpmc_amba_clken
E0700 | 31C | PERIP2_SW_RST | mpmc_amba_swrst
E0700 | 200 | SYS_CLK_CTRL | sys_mode_req
E0700 | 214 | PLL1_CTR | pll_enable
E0700 | 220 | PLL2_CTR | pll_enable
E0700 | 22C | PLL3_CTR | pll_enable
E0700 | 238 | PLL4_CTR | pll_enable
EC000 | 018 | MPMC_CTRL_REG_06 | pwrup_srefresh_exit
EC000 | 02C | MPMC_CTRL_REG_11 | selfrefresh, start
EC000 | 318 | MPMC_CTRL_REG_129 | cke_status
EC060 | 400 | GPIODIR | Fields 3 and 2
EC060 | 030 | GPIODATA | Fields 3 and 2
6.2.3
Power management procedures
This section describes the software procedures for:
1. Activating transitions that do not require wake-up mechanisms (software-driven through the BUS)
2. Activating transitions that require wake-up mechanisms through one of the wake-up sources (GMAC, RTC, GPIO, USB)
3. Managing the SUSPEND-TO-RAM feature when the target state is State 0 (AlwaysOn-Only)
Procedure 1: Activating transitions that do not require wake-up mechanisms
Note:
You can use this procedure to switch to any state other than the AlwaysOn-Only (0000) state
(see Table 23).
1. Clear the fields of the PCM_CFG register by setting:
a) config_ack to '0'
b) config_bad to '0'
c) sw_config to a value equal to ack_power_state
2. Set field sw_config of register PCM_CFG to the desired value.
3. Poll config_ack for acknowledgment of the execution of the power state transition.
The field config_bad is used for debug purposes; it indicates that an illegal state has been
requested.
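The control flow of Procedure 1 can be sketched in Python against a simulated register. FakePcmCfg below is a hypothetical in-memory stand-in for the PCM_CFG fields, not real hardware access; it assumes that sw_config takes the state IDs of Table 23 and that the hardware acknowledges a legal request immediately. On a real system the field accessors would be replaced by read-modify-write operations on PCM_CFG.

```python
class FakePcmCfg:
    """Hypothetical model of the PCM_CFG fields used by Procedure 1."""

    LEGAL_TARGETS = (4, 5, 6, 7)  # states reachable without wake-up (assumption)

    def __init__(self, current_state: int) -> None:
        self.fields = {
            "config_ack": 0,
            "config_bad": 0,
            "sw_config": current_state,
            "ack_power_state": current_state,
        }

    def read(self, name: str) -> int:
        return self.fields[name]

    def write(self, name: str, value: int) -> None:
        self.fields[name] = value
        # Model only: a new sw_config value is served or rejected at once.
        if name == "sw_config" and value != self.fields["ack_power_state"]:
            if value in self.LEGAL_TARGETS:
                self.fields["ack_power_state"] = value
                self.fields["config_ack"] = 1
            else:
                self.fields["config_bad"] = 1


def request_power_state(pcm: FakePcmCfg, target: int, max_polls: int = 1000) -> bool:
    """Run steps 1 to 3 of Procedure 1; return True once the request is acknowledged."""
    pcm.write("config_ack", 0)                            # step 1a
    pcm.write("config_bad", 0)                            # step 1b
    pcm.write("sw_config", pcm.read("ack_power_state"))   # step 1c
    pcm.write("sw_config", target)                        # step 2
    for _ in range(max_polls):                            # step 3
        if pcm.read("config_bad"):
            return False
        if pcm.read("config_ack"):
            return True
    return False
```

Bounding the polling loop and checking config_bad are additions for robustness; the procedure itself only requires polling config_ack.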
Procedure 2: Activating transitions that require wake-up mechanisms
Note:
You can use this procedure to switch to the AlwaysOn-Only (0000) state (see Table 23).
1. Clear the fields of the PCM_CFG register and enable peripherals for wake-up by setting:
a) wakeup_en field of desired wake-up peripheral to '1'
b) wakeup_trig field to '0000'
c) config_ack to '0'
d) config_bad to '0'
e) sw_config to a value equal to ack_power_state
2. Prepare the desired configuration for wake-up by setting one or more of the following fields of register PCM_WKUP_CFG:
a) rtc_wkup_config if RTC is enabled for wake-up
b) gpio_wkup_config if GPIO is enabled for wake-up
c) usbdev_wkup_config if USBDEV is enabled for wake-up
d) ethernet_wkup_config if ETHERNET is enabled for wake-up
3. Set field sys_mode_req of register SYS_CLK_CTRL to '010' to switch the system to SLOW MODE.
4. Set fields pll_enable of registers PLL*_CTR to '0' to power down all the PLLs.
5. Set field sw_config of register PCM_CFG to '0000'.
6. To wake the system up, trigger the wake-up event from the chosen external source.
Procedure 3: Managing the SUSPEND-TO-RAM feature when the target state is State 0
Note:
You can use this procedure to switch to the AlwaysOn-Only state (0000) (see Table 23)
while preserving the contents of the external DDR module.
1. Put the DDR in self-refresh mode by setting field srefresh (offset 16) of MPMC register MPMC_CTRL_REG_11 to ‘1’.
2. Check the CKE signal by reading bit cke_status (offset 8) of MPMC_CTRL_REG_129. Check that the register bit is set to ‘0’.
3. To avoid corruption of the CKE and RESET signals to the external DDR module when the memory controller is powered off, activate control from GPIO by setting:
a) gpioa_clken of MISC register PERIP1_CLK_ENB to ‘1’
b) fields 3 and 2 of GPIOA register GPIODIR to ‘1’ to configure the needed IOs as outputs
c) fields 3 and 2 of GPIOA register GPIODATA to ‘11’ to drive a ‘1’ on the selected IOs. This enables the on-board logic that forces the CKE and RESET signals to the external DDR to the appropriate value, allowing the external DDR to remain in self-refresh mode regardless of the status of the memory controller (MPMC) inside the SoC.
4. Stop the memory controller by setting field start (offset 24) of MPMC register MPMC_CTRL_REG_11 to ‘0’.
5. Latch the register values of MPMC in Always-on RAM, to save the result of the levelling procedure.
6. Reset and clock-gate the AMBA interface of the MPMC by setting:
a) mpmc_amba_clken of MISC register PERIP2_CLK_ENB to ‘0’
b) mpmc_amba_swrst of MISC register PERIP2_SW_RST to ‘1’
7. Apply Procedure 2 described above.
8. Set field sys_mode_req of register SYS_CLK_CTRL to '100' to switch the system to NORMAL MODE.
9. Remove reset and clock gating from the AMBA interface of the MPMC by setting:
a) mpmc_amba_clken of MISC register PERIP2_CLK_ENB to ‘1’
b) mpmc_amba_swrst of MISC register PERIP2_SW_RST to ‘0’
10. Restore the values from Always-on RAM to reprogram the controller. During this operation, be sure that both bits start and srefresh are set to ‘0’.
11. To drive the CKE signal to the external DDR, release the on-board logic by setting field 2 of GPIOA register GPIODATA + 0x30 to ‘0’.
12. Set the bit pwrup_srefresh_exit (offset 8) of MPMC_CTRL_REG_06 to ‘1’.
13. Restart the memory controller by setting the start field of MPMC register MPMC_CTRL_REG_11 to ‘1’.
14. Check that bit cke_status of the MPMC_CTRL_REG_129 register is set to ‘1’.
15. To drive the RESET signal to the external DDR, release the on-board logic by setting field 3 of GPIOA register GPIODATA + 0x30 to ‘0’.
6.3
Clock power management
The reset and clock generator (RCG) provides the system clocks and resets. It is highly
configurable through the miscellaneous registers.
The RCG can be configured in three different states, as shown in Table 26.
Table 26. Clock power states
ID | State | Comment
1 | DOZE | The system clock source is the RTC oscillator (nominally @32 kHz).
2 | SLOW | The system clock source is the main oscillator (nominally @24 MHz).
3 | NORMAL | The system clock source is the PLL1 (nominally @1 GHz) or SSCG.
To change the power state, configure the sys_mode_req field of the miscellaneous register
SYS_CLK_CTRL as follows:
– 3’b001 for DOZE state
– 3’b010 for SLOW state
– 3’b100 for NORMAL state
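The sys_mode_req field is one-hot: exactly one of the three bits selects the requested state. A minimal Python sketch of the encoding follows; the dictionary and function names are illustrative, and the position of the field inside SYS_CLK_CTRL is not modeled here.

```python
# One-hot sys_mode_req encodings, as listed above.
SYS_MODE_REQ = {
    "DOZE": 0b001,
    "SLOW": 0b010,
    "NORMAL": 0b100,
}


def sys_mode_name(sys_mode_req: int) -> str:
    """Translate a sys_mode_req field value back to its state name."""
    for name, code in SYS_MODE_REQ.items():
        if code == sys_mode_req:
            return name
    raise ValueError(f"not a valid one-hot sys_mode_req value: {sys_mode_req:#05b}")
```

Rejecting values such as 3’b011 in software is a defensive choice for this sketch; the text above does not state how the hardware treats them.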
To change the clock source within each state, configure the same register appropriately.
Note:
For more information on clock configuration, refer to Chapter 5: Reset and clock generator
(RCG).
6.4
IP power management
6.4.1
Standard IP power management
This paragraph provides power management information for the IPs that do not feature
specific power management procedures.
For standard IPs, two power states are generally available: the DISABLED and the
OPERATIVE one (see Table 27 below). Both are programmable through the miscellaneous
registers (MISC).
Table 27. Standard IPs power states
ID | State | Comment
1 | DISABLED | IP under reset, clock disabled
2 | OPERATIVE | IP not under reset, clock enabled
● To enable/disable the clock, configure the PERIP1_CLK_EN and PERIP2_CLK_EN registers.
● To activate/deactivate reset, configure the PERIP1_SW_RST and PERIP2_SW_RST registers.
Note:
See also: RM0089, Reference manual, SPEAr1340 address map and registers for the
description of the MISC registers.
6.4.2
USBPHY power management
The USBPHY is the physical interface of the USB subsystem. It can be configured in two
different states, as shown in Table 28.
Table 28. USBPHY power states
ID | State | Comment
1 | SUSPEND | In this mode the PHY clock along with the 48 MHz clock shuts off.
2 | OPERATIONAL | Normal operational mode
Table 29 lists the registers (both IP-specific and miscellaneous) related to the power
management of USBPHY. The second column shows the state of these registers after SoC
power-up/reset.
Table 29. USBPHY power management-related registers
Address | Value after reset | Register name
0xE0700314 | 0x0000183B | USBPHY_GEN_CFG
USBPHY configuration procedure
This section describes how to configure power options for the USBPHY.
Changing the device state to SUSPEND
No register write is needed to put the USBPHY in SUSPEND state. The USBPHY enters SUSPEND
state automatically when there is no activity on the USB line. All three ports must receive the
SUSPENDM signal from their attached controllers to reach this SUSPEND
state. To further reduce the power consumption in SUSPEND state, program the following
register:
# MISC SETTINGS (USBPHY)
1. Set address 0xE0700314 to value |= 0x1
This sets the COMMONONN signal to '1'. When the USBPHY is in SUSPEND state, this
setting allows all USBPHY internal blocks that are common to
the three individual PHYs (XO bias, PLL) to be powered down.
When a wakeup occurs, the USBPHY leaves SUSPEND state. However, to power up the PLL and
XO bias again (that is, to make all clocks available), COMMONONN must be written back to ‘0’. This
can be done as follows:
Set address 0xE0700314 to value &= ~0x1
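The two read-modify-write operations above only touch bit 0 of USBPHY_GEN_CFG. The Python sketch below computes the value to write back from the value read; the constant and function names are illustrative, and the actual bus access is left out.

```python
USBPHY_GEN_CFG_ADDR = 0xE0700314   # address from Table 29
USBPHY_GEN_CFG_RESET = 0x0000183B  # value after reset from Table 29
COMMONONN_MASK = 0x1               # bit 0, as used in the steps above


def with_commononn(reg_value: int, commononn: bool) -> int:
    """Return reg_value with the COMMONONN bit set or cleared, other bits unchanged."""
    if commononn:
        return (reg_value | COMMONONN_MASK) & 0xFFFFFFFF
    return reg_value & ~COMMONONN_MASK & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(with_commononn(USBPHY_GEN_CFG_RESET, False)))
```

According to the reset value in Table 29, bit 0 is already '1' after reset, so the first write is only needed if software has cleared the bit earlier.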
6.4.3
MPMC/DDR PHY power management
The JEDEC standard for DDR3 memories defines two main modes beyond the operational
one: self-refresh and power-down. To enter these modes, it is necessary to use
the corresponding commands.
The self-refresh command can be used to retain data in the DDR3 SDRAM even if the rest
of the system is powered down. When in self-refresh mode, the DDR3 SDRAM retains data
without external clocking.
When the DDR3 SDRAM has entered self-refresh mode, all the external control signals,
except for CKE and RESET#, are “don't care”. In order to keep the DRAM in self-refresh
mode, RESET# must be kept high and CKE low.
The self-refresh mode can be used to change the clock frequency at which MPMC operates.
The memory controller must stop processing requests, the clock must be adjusted, the
memory controller's timing parameters must be reprogrammed and then the memory
controller can be restarted. To retain the data in DRAM during this process the memory can
be put in self-refresh mode via a self-refresh command.
The power-down mode is synchronously entered when CKE is registered low (along with
NOP or Deselect command). CKE is not allowed to go low while the Mode register set
command, MPR operations, ZQCAL operations, DLL locking or read/write operation are in
progress. CKE is allowed to go low while any other operation, such as row activation,
precharge, auto-precharge, or refresh, is in progress, but the power-down IDD specification does not apply until those operations have finished.
Entering the power-down mode disables all the input and output buffers, including CK, CK#,
ODT, CKE, and RESET# and DRAM content will be lost.
The self-refresh mode is useful when the user wishes to power down
or reset the MPMC without disturbing the contents of memory. To prevent the memory from being
erased, the CKE and RESET# signals must remain constant. As the MPMC is not able to
drive these signals, the system handles this responsibility through on-board logic driven by
the GPIO IP. When the MPMC has been restored to active state, it regains control over the
mentioned signals. When the CKE signal is de-asserted, the memory enters self-refresh.
This must occur before the reset signal is asserted to the MPMC or the MPMC is powered
down. If the CKE signal is not released first, the memory may be left in an unknown state.
When power is re-applied to the MPMC or the reset signal is released, the MPMC must be
informed of the type of wakeup required: a full initialization or a memory wakeup where the
memory devices are just pulled out of self-refresh. This information is conveyed to the
MPMC through the pwrup_srefresh_exit parameter. If the pwrup_srefresh_exit parameter is
cleared to 'b0, the MPMC will perform a full memory initialization. If the pwrup_srefresh_exit
parameter is set to 'b1, this allows the controller to exit power-down mode by executing a
self-refresh exit instead of the full memory initialization. This parameter provides means to
skip full initialization when the DRAM devices are in a known self-refresh state.
The MPMC puts the attached DRAM device(s) in self-refresh mode via the srefresh parameter. To do
so, the current memory burst for the current transaction (if any) will complete, all banks will
be closed, the self-refresh command will be issued to the DRAM, and the memory clock
enable signal will be de-asserted. The system will remain in self-refresh mode until this
parameter is cleared to 'b0. The DRAM devices will return to normal operating mode after
the self-refresh exit delay of the device and any DLL initialization time for the DRAM is
reached. The memory controller will resume processing of the commands from the
interruption point. When a self-refresh exit command is executed, an automatic refresh is
requested. By setting the bit srefresh_exit_no_refresh, the automatic refresh request is
inhibited.
The MPMC puts the attached DRAM device(s) in power-down mode via the power_down
parameter. When this parameter is set to 'b1, the memory controller will complete
processing of the current memory burst for the current transaction (if any), issue a pre-charge all command and then disable the clock enable signal to the DRAM devices. Any
subsequent commands in the command queue will be suspended until this parameter is
cleared to 'b0. The DRAM command will be lost in this case.
6.4.4
PCIE/SATA/MIPHY power management
The PCIE/SATA controllers can be disabled (clock-gated and reset) for minimum dynamic
consumption, as for standard IPs. This automatically implies the minimum power consumption
state for MIPHY (reset by controller). This is also achieved when powering down the PCIE
power island.
The power states of the PCIE and SATA controllers and of the shared MIPHY physical layer can be configured
through the link interface.
Please refer to the standard procedures for link power management, which can be found in
the following documents:
● PCI Express® Base Specification revision 2.0
● Serial ATA revision 3.0
6.4.5
ADC power management
ADC can be configured only in one state, as shown in Table 30.
Table 30. ADC power state
ID | State | Comment
1 | POWER DOWN | ADC macro is in power down mode
Table 31 lists the registers (both top level and IP level) related to the power management of
ADC. The second column shows the state of these registers after SoC power-up/reset.
Table 31. ADC power management-related registers
Address | Value after reset | Register name
0xE0700274 | 0x0000003F | PERIP1_CLK_EN[30], adc_clken
0xE070027C | 0xFFFFFFC0 | PERIP1_SW_RST[30], adc_swrst
0xE0080000 | 0x00000000 | ADC_STATUS
ADC configuration procedure
To change the device state in order to force POWER DOWN state:
# MISC SETTINGS (ADC)
Set address 0xE070027C to value 0xBFFFFFC0
Set address 0xE0700274 to value 0x4000003F
# ADC STATUS SETTINGS
Set address 0xE0080000 to value 0x00000000
# MISC SETTINGS (ADC)
Set address 0xE0700274 to value 0x0000003F
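The four values written in the sequence above follow from the reset values in Table 31 and from bit 30 of each MISC register. The Python sketch below derives them; it assumes that adc_swrst = 1 means "reset asserted" and that the other bits still hold their reset values. Production code would normally read each register and modify only bit 30.

```python
ADC_BIT = 1 << 30                  # adc_clken / adc_swrst bit position (Table 31)
PERIP1_CLK_EN_ADDR = 0xE0700274
PERIP1_SW_RST_ADDR = 0xE070027C
ADC_STATUS_ADDR = 0xE0080000
PERIP1_CLK_EN_RESET = 0x0000003F
PERIP1_SW_RST_RESET = 0xFFFFFFC0


def adc_power_down_writes() -> list:
    """Return the (address, value) pairs of the ADC power-down sequence, in order."""
    return [
        # Clear adc_swrst (assumed to release the ADC from reset).
        (PERIP1_SW_RST_ADDR, PERIP1_SW_RST_RESET & ~ADC_BIT & 0xFFFFFFFF),
        # Set adc_clken so that the ADC registers can be accessed.
        (PERIP1_CLK_EN_ADDR, PERIP1_CLK_EN_RESET | ADC_BIT),
        # Write ADC_STATUS to force the POWER DOWN state.
        (ADC_STATUS_ADDR, 0x00000000),
        # Clear adc_clken again to gate the ADC clock.
        (PERIP1_CLK_EN_ADDR, PERIP1_CLK_EN_RESET),
    ]


if __name__ == "__main__":
    for address, value in adc_power_down_writes():
        print(f"Set address {address:#010X} to value {value:#010X}")
```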
6.5
Voltage regulators
SPEAr1340 has three internal voltage regulators that generate a 2V5 supply output from a
3V3 supply input:
● MIPHY single-lane regulator: the voltage controlled by this regulator is internally connected to the MIPHY supply, but it is also externally visible on MIPHY_S_0_VDD2PLL2V5. This regulator is always active; it is not possible to bypass it.
● VREG1 regulator: used only for USB. This regulator is always active (its power-down pin is connected to constant 0).
● VREG2 regulator: used for all PLLs (PLL1, PLL2, PLL3, DDR PLL), ADC and OTP. This regulator is switchable; its power-down pin is controlled through a hardwired connection to a dedicated PCM output (see the PCM core description in Section 6.6.2: Functional description).
Note:
See also: Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded
MPU, Electrical characteristics chapter.
6.6
Power control module (PCM)
PCM is the core of the SPEAr1340 leakage power management system. Its role is to
properly manage the power supply shutoff of the switchable sections of the embedded MPU.
This section describes the structure and functionality of the PCM to allow the end user to fully
understand the effect of the power domain-related power management procedures on the
hardware.
Note:
For the PCM feature list, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex
A9 HMI embedded MPU.
Figure 24. PCM block diagram
(Block diagram not reproduced. Blocks shown: Configuration funnel, PCM Core, Domain checker. Signals shown: GETH_config, USB_config, GPIO_config, RTC_config, FW_config, wakeup_en, wakeup_trig, config_vector, ack_power_state, ack_power_state_o, USBPHY_suspend, BUSMATRIX_req, BUSMATRIX_ack, isolate_vector_o, shutoff_vector_o, ack_o, bad_o, reg_powerdown_o, V_is_ok_vector, V_core_ok_1 to V_core_ok_4, DDR_1V2_ok_i, DDR1V2_OFF, DDR1V8_OFF. The legend distinguishes MISC connections, PAD connections, and direct connections.)
Note:
For the pin description, refer to Table 32: PCM internal pins.
6.6.1
Pins
Table 32 lists PCM internal pins.
For the description of the external pins, refer to Doc ID 023063, Data sheet, SPEAr1340,
Dual-core Cortex A9 HMI embedded MPU.
Table 32. PCM internal pins
Pin name | Direction | Description
clk_i | In | –
resetn_i | In | –
slow_clk_i | In | –
slow_resetn_i | In | –
FW_config | In | Standard power configuration driven from registers belonging to MISC
{USB, GPIO, RTC, GETH}_config | In | Wakeup power configuration driven from registers belonging to MISC
wakeup_en | In | Wakeup enable signals driven from registers belonging to MISC
wakeup_trig | In | Wakeup triggers from peripherals (USB, GPIO, RTC, GMAC). GPIO trigger comes from IO.
USBPHY_suspend | In | Indicates the SUSPEND status of USB PHY
BUSMATRIX_ack | In | Acknowledge for the interconnect matrix shutdown request
BUSMATRIX_req | Out | Request for interconnect matrix shutdown
V_is_ok_vector_i | In | Outputs of voltage detectors relative to the 4 supply domains. Each element indicates whether the power domain it represents has reached functional voltage level (‘1’) or not (‘0’).
DDR_1V2_ok_i | In | Output of the voltage detector relative to the DDRPHY power domain
acknowledge_o | Out | Indicates that the power configuration driven from the MISC registers has been acknowledged. Pin configuration should not change again until the previous one has been acknowledged.
shutoff_vector_o | Out | Vector that directly drives the power switches. One element for each power domain. ‘1’ means powerdown (open the switch).
isolate_vector_o | Out | Vector that drives isolation cells. One element for each power domain. ‘1’ means isolate functionality is active.
bad_o | Out | Indicates that the last configuration requested is a bad one (not belonging to the set of allowed configurations).
power_state_o | Out | Outputs the last power configuration that has been received AND served correctly.
DDR_1V2_shutoff_o | Out | Controls the (optional) external (i.e. board) power switch on the DDRPHY 1V2 supply line.
DDR_1V8_shutoff_o | Out | Controls the (optional) external (i.e. board) power switch on the DDRPHY 1V5/1V8 supply line.
reg_powerdown_o | Out | Controls the shutdown of the 2V5 voltage regulator that supplies the system PLLs and ADC.
6.6.2
Functional description
The power control module coordinates the control activities related to shutoff mode
management. It is divided into three main sub-blocks:
● the PCM core: the main state machine. This block takes as input a new power island configuration (a bit vector containing the desired status for each of the available power islands of SPEAr1340) and the current status of the power islands. It outputs the control signals for the power island switches and for the isolation cells (see Section: PCM core).
It provides control for the shutdown of the VREG2 voltage regulator, related to VREG2_2V5_OUT. The regulator is automatically powered down when all of the switchable power islands are switched off, and automatically powered up upon wakeup. This control can be bypassed by programming the MISC register PCM_CFG (see MISC registers in RM0089, Reference manual, SPEAr1340 address map and registers).
● a configuration funnel: a muxing block that selects the source of the current configuration to be fed to the PCM core (see Section: Configuration funnel).
● a domain checker: this block processes (synchronizes and debounces) the outputs of voltage detectors that report the current status of each power island (see Section: Domain checker).
PCM core
The PCM core (Figure 25) comprises the following:
● the main state machine: a small sequencer for controlling the external power supply
  switches (for DDR physical layer power supplies), and
● a configuration sequencer: this component can internally limit the number of transitions
  between power states (where a power state is simply one of the shutoff modes
  mentioned before), while still giving the user full accessibility from any power
  state to any other one.
Figure 25. PCM core block diagram
[Block diagram. Inputs: resetn_i, clk_i, config_vector_i, V_is_ok_vector_i. Internal blocks:
configuration sequencer (produces filtered_config_vector), PCM FSM, external switch control
(driven by shutoff(0)). Outputs: shutoff_vector_o, isolate_vector_o, ack_power_state_o,
reg_powerdown_o, ack_o, bad_o, DDR1V2_OFF, DDR1V8_OFF.]
Figure 26 shows the relevant interface timing. These signals comply with the following rules:
● The external power configuration command must remain stable at least until it is
  acknowledged.
● On domain shutoff, the acknowledge must be asserted after V_is_OK indicates that power
  has effectively been shut off (the last internal event of the shutoff procedure).
● On domain power-up, the acknowledge must be asserted after the isolation line has been
  deasserted (the last internal event of the power-up procedure).
● On shutdown, the isolate command is issued first, followed by the shutoff command.
● When powering back up (shutoff back to 0), V_is_OK must be asserted before the isolation
  logic is released.
Figure 26. Relevant PCM core interface timing
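The ordering rules listed for the PCM core interface can be illustrated with a small C model.
This is only a sketch and not the PCM implementation: the pcm_domain structure, the event
log and the function names are hypothetical, and the voltage detector is modelled as
responding immediately instead of after a real settling delay.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event identifiers, logged in the order in which they occur. */
enum pcm_event {
    EV_ISOLATE_ON, EV_SHUTOFF_ON, EV_V_OFF,
    EV_SHUTOFF_OFF, EV_V_OK, EV_ISOLATE_OFF, EV_ACK
};

#define PCM_LOG_MAX 16

struct pcm_domain {
    bool isolate;   /* element of isolate_vector_o for this domain */
    bool shutoff;   /* element of shutoff_vector_o for this domain */
    bool v_is_ok;   /* debounced voltage detector output */
    enum pcm_event log[PCM_LOG_MAX];
    size_t log_len;
};

static void pcm_log(struct pcm_domain *d, enum pcm_event ev)
{
    assert(d->log_len < PCM_LOG_MAX);
    d->log[d->log_len++] = ev;
}

/* Shutdown: isolate first, then shut off; acknowledge once V_is_OK has dropped. */
void pcm_domain_shutdown(struct pcm_domain *d)
{
    d->isolate = true;
    pcm_log(d, EV_ISOLATE_ON);
    d->shutoff = true;
    pcm_log(d, EV_SHUTOFF_ON);
    d->v_is_ok = false;            /* modelled as immediate */
    pcm_log(d, EV_V_OFF);
    pcm_log(d, EV_ACK);
}

/* Power-up: release shutoff, wait for V_is_OK, release isolation, then acknowledge. */
void pcm_domain_powerup(struct pcm_domain *d)
{
    d->shutoff = false;
    pcm_log(d, EV_SHUTOFF_OFF);
    d->v_is_ok = true;             /* modelled as immediate */
    pcm_log(d, EV_V_OK);
    d->isolate = false;
    pcm_log(d, EV_ISOLATE_OFF);
    pcm_log(d, EV_ACK);
}
```

In both procedures the acknowledge is the last logged event, which mirrors the rule that it
follows the last internal event of the shutoff or power-up sequence.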
Configuration funnel
Figure 27 shows the configuration funnel sub-block structure. Figure 28 describes the
implemented selection function.

Figure 27. Configuration funnel block diagram
[Block diagram. Configuration sources GETH_config, USB_config, GPIO_config, RTC_config and
FW_config feed a MUX followed by a REG stage that produces config_vector. A SELECTOR /
SELECTION FUNCTION block drives the MUX using wakeup_en, wakeup_trig, ack_power_state,
USBPHY_suspend, BUSMATRIX_req and BUSMATRIX_ack. The legend distinguishes MISC
connections, PCM connections and direct connections.]

Figure 28. Configuration funnel selection flow graph
[Flow graph starting at START. Decision nodes: "Is device in ALWAYSON?", "Is FW requiring
alwayson?", "Is FW requiring BUS OFF?", "Is BUS MATRIX idle?", "Is USB PHY suspended?",
"Is USB ENABLED to wakeup?". Actions: "Propagate FW configuration", "Send Request to
BUSMATRIX", "Propagate "0000" configuration", and "If an enabled peripheral triggers a
wakeup event, propagate its preloaded configuration, else keep current configuration stable".]
Domain checker
The domain checker sub-block provides a simple interface to the voltage detector outputs. It
resynchronizes, debounces and, in general, processes the signals that report the current
status (ON/OFF) of each power supply rail.
Figure 29. Domain checker block diagram
[Block diagram. The inputs V_core_ok_1, V_core_ok_2, V_core_ok_3, V_core_ok_4 and
DDR_1v2_ok_i each pass through a debouncer; shutoff_vector_i(0) feeds a "DDR2/3 OK"
counter (for 1V8). The results form domain_ok_vector_o.]
Each debouncer:
- synchronizes Vok from the switch
- propagates '1' if the input has been stable at '1' for the last N cycles, or '0' if it has
  been stable at '0' for the last N cycles; otherwise, it holds the previous value.
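The debouncer behaviour described for Figure 29 can be modelled in a few lines of C. This is
a behavioural sketch under stated assumptions, not the hardware implementation: the
structure, the function names and the way N is passed are hypothetical, and the input is
assumed to be already synchronized to the clock.

```c
#include <assert.h>
#include <stdbool.h>

struct debouncer {
    unsigned n;        /* number of stable cycles required (N) */
    unsigned run;      /* length of the current run of identical samples */
    bool last_sample;  /* most recent synchronized input sample */
    bool out;          /* debounced output */
};

void debouncer_init(struct debouncer *d, unsigned n, bool initial)
{
    assert(n > 0);
    d->n = n;
    d->run = 0;
    d->last_sample = initial;
    d->out = initial;
}

/* Call once per clock cycle with the synchronized detector value. */
bool debouncer_step(struct debouncer *d, bool sample)
{
    if (sample == d->last_sample) {
        if (d->run < d->n)
            d->run++;              /* saturate the run counter at N */
    } else {
        d->last_sample = sample;   /* input changed: restart the run */
        d->run = 1;
    }
    if (d->run >= d->n)
        d->out = d->last_sample;   /* stable for N cycles: propagate */
    return d->out;                 /* otherwise the previous value is held */
}
```

With N = 3, a single-cycle glitch on the input never reaches the output, while three
consecutive identical samples do.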
7 BootROM
This chapter describes the device startup sequence from power on to bootloader execution,
and provides an overview of the SoC device internal mechanism (both hardware and
software) after the device resets.
For the BootROM feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers

7.1 Overview
BootROM is the booting firmware prestored in the on-chip 32 KB ROM of SPEAr1340. On
power-on, the processor comes out of reset and fetches the first instruction from the RESET
vector location. For ARM processors, the high order reset vector is at 0xFFFF0000, where
the start of the BootROM memory is mapped. In this way, the ARM processor starts fetching
and executing instructions from the BootROM.
The main tasks performed by the BootROM code are:
● performing basic hardware initialization
● selecting the booting device; BootROM selects the booting device after reset by
  reading the status of the STRAP[3:0] pins
● locating the next level bootloader on the booting device
● validating the next level bootloader image
● passing control to the next level bootloader (execution of the next level bootloader)
● taking care of error cases

7.1.1 Useful terms and definitions
This section lists some useful terms and their definitions.
● BootROM is the very first firmware code to be fetched and executed by the ARM Cortex A9
  core when it comes out of reset.
● Bootloading is the activity of locating and loading the next level of software during the
  boot activity.
● X-Loader is a piece of code that corresponds to the first level of bootloading. It may
  reside in non-volatile memory (Flash), or be loaded from an external peripheral; it is
  located and authenticated by BootROM. Its primary task is to configure the DRAM
  controller with the proper settings according to the timing of the external memory part.
● XIP (execute in place) is a method in which firmware code can be fetched/executed by
  the CPU directly from the nonvolatile memory in which it resides.
● Code-shadowing is a software technique in which external firmware code is copied
  from a memory that does not support in-place execution (XIP), like NAND Flash, to a
  different memory that enables direct execution, like static RAM.
● OTP is one-time programmable memory that can be customized for the final application
  during the manufacturing process.
● SYSROM is the 32 KB read-only memory embedded in the device at address
  0xFFFF0000, where the BootROM code resides.
● SYSRAM0 is the 32 KB static RAM embedded in the device at address 0xB3800000
  that is used by the BootROM code to store its global and local variables during
  execution.
● ALWAYS-ON state is a power state condition in which the embedded MPU is almost
  completely powered off. The exceptions are a small island that includes the wake-up logic
  and SRAM memory, and the DDR, which remains powered in self-refresh mode. The OS state
  resides permanently in memory and the system can be resumed quickly.
● UOC is the USB On-The-Go controller.

7.2 Functional description
BootROM has several features, such as:
● Security feature: BootROM enables the security feature provided by the cryptographic
  co-processor (C3) through OTP. BootROM then receives images encrypted with a
  security key, reads the key from OTP and decrypts the images. If the image decryption
  is successful, the execution proceeds further. If the image is not encrypted using the
  security key, the decryption fails.
  For information on how to enable the security feature, see Section 7.2.2: OTP
  configuration.
● Data CRC (DCRC): BootROM validates the next-level bootloader images through data
  CRC. Data CRC is enabled through OTP. When enabled, CRC is performed on the
  image data. If the calculated image CRC matches that of the header, the execution
  proceeds further.
  For information on how to enable data CRC, see Section 7.2.2: OTP configuration.
● Wake-up triggering: In order to implement power control, BootROM supports wake-up
  triggering. When the SoC is powered up and BootROM execution starts, BootROM
  checks whether the power-up is the result of wake-up triggering. If any of the
  PCM_CFG[9:5] bits is set, BootROM jumps to the "Always-on RAM" at address 0xE0800000.
● Pen Holding mechanism: When the Cortex A9 comes out of reset, both of its cores start
  fetching instructions from the reset vector location, that is, from address
  0xFFFF0000, where the 32 KB SYSROM memory is located. The pen holding
  mechanism is used to stop Core 1 while Core 0 continues booting. See also:
  Section 7.4.1: BootROM on Core 1.
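The wake-up check described in the feature list reduces to a mask test on PCM_CFG. The
following C sketch illustrates it; the macro and function names are hypothetical, and the
convention of returning 0 to mean "continue the normal boot flow" is an assumption made
only for this example. The bit range [9:5] and the address 0xE0800000 are taken from the
text.

```c
#include <stdbool.h>
#include <stdint.h>

#define ALWAYS_ON_RAM_BASE    0xE0800000u                      /* from the text */
#define PCM_CFG_WAKEUP_SHIFT  5u
#define PCM_CFG_WAKEUP_MASK   (0x1Fu << PCM_CFG_WAKEUP_SHIFT)  /* bits [9:5] */

/* True if any of the PCM_CFG[9:5] bits is set in the register value. */
bool pcm_wakeup_triggered(uint32_t pcm_cfg)
{
    return (pcm_cfg & PCM_CFG_WAKEUP_MASK) != 0u;
}

/*
 * Returns the address to branch to on wake-up, or 0 when the normal boot
 * flow should continue (hypothetical convention for this sketch).
 */
uint32_t boot_resume_target(uint32_t pcm_cfg)
{
    return pcm_wakeup_triggered(pcm_cfg) ? ALWAYS_ON_RAM_BASE : 0u;
}
```

Real firmware would read the value from the PCM_CFG miscellaneous register; here it is
passed as a parameter so that the logic can be exercised on a host.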
7.2.1 Hardware components used
BootROM copies the first-level bootloader, Xloader, into SYSRAM0. Depending on the
booting mode selected, the source of Xloader may be Serial NOR Flash, Parallel NOR
Flash, NAND Flash, an SD/MMC card, UART0 or USB OTG.
For more information on supported booting devices, see Section 7.2.3: Boot device
selection.
Figure 30 illustrates BootROM start-up sequence.
Figure 30. BootROM start-up sequence
[Diagram of the boot stages. Recoverable labels: boot stages, SYSRAM, external code device,
BootROM, XLoader, 2nd level bootloader, high vectors, "embedded in SPEAr", "external".]

7.2.2 OTP configuration
The SPEAr1340 OTP module embeds three 255-bit banks: Bank 1, Bank 2 and Bank M.
Bank M contains one predefined bit (255) used for test purposes, and dedicated bits
controlled by the BootROM. Table 33 shows how Bank M is mapped in the OTP.
Note: Refer to Chapter 9: One-time programmable antifuse (OTP) for information on OTP banks.

Table 33. OTP Bank M configuration

Bits  Fields      Offset   Description
1     XXXX        255      1 bit reserved for blowing at final test
2     S1-S0       254-253  1 bit + 1 redundancy for encryption key section enable
2     V1-V0       252-251  1 bit + 1 redundancy for vendor ID section enable
2     J1-J0       250-249  1 bit + 1 redundancy for JTAG disable
2     T1-T0       248-247  1 bit + 1 redundancy for TEST disable
2     E1-E0       246-245  Reserved
2     C1-C0       244-243  1 bit + 1 redundancy for data CRC check enable
2     U1-U0       242-241  Reserved
137   Security    240-104  128 bits + 9 ECC for encryption key (see Note 1)
72    USB/PCI ID  103-32   64 bits + 8 ECC for USB/PCI vendor IDs (see Note 2)
16    WP B2       31-16    8 bits + 8 redundancy for masking OTP Bank 2
16    WP B1       15-0     8 bits + 8 redundancy for masking OTP Bank 1

Note: 1 To enable the security feature, set either of the security bits S0 or S1.
      2 The OTP USB/PCI vendor IDs section can be programmed by users who wish to provide
        their own customized vendor and product IDs for USB.
The following C structure highlights the fields of the 72-bit USB/PCI ID section (in little
endian):
/*
 * This OTP structure keeps all the USB and PCIe information that might be
 * customized by the final user.
 */
typedef struct otp_basic {
    u16 otp_usb_vid;
    u16 otp_usb_pid;
    u16 otp_pci_vid;
    u16 otp_pci_pid;
    u8 otp_ecc; /* an H(127, 120) code is required, with 7 bits of ECC */
} otp_basic;
See also: Section : OTP section access on page 145.
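The redundant bit pairs of Table 33 can be tested with a small helper, sketched below. The
bit offsets come from Table 33; everything else is an assumption made for illustration: the
representation of Bank M as eight 32-bit words with bit k stored in word k / 32 at position
k % 32, the enumeration and the function names. The "either bit enables the feature" rule
is stated in the text for the security and vendor ID pairs; applying it to the other pairs is
also an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset of the lower bit of each redundant pair in Bank M (from Table 33). */
enum otp_bank_m_pair {
    OTP_PAIR_SECURITY  = 253,  /* S1-S0: bits 254-253 */
    OTP_PAIR_VENDOR_ID = 251,  /* V1-V0: bits 252-251 */
    OTP_PAIR_JTAG_DIS  = 249,  /* J1-J0: bits 250-249 */
    OTP_PAIR_TEST_DIS  = 247,  /* T1-T0: bits 248-247 */
    OTP_PAIR_DATA_CRC  = 243   /* C1-C0: bits 244-243 */
};

/* Assumed layout: bit k of the bank is bit (k % 32) of word (k / 32). */
static bool otp_bit(const uint32_t bank[8], unsigned bit)
{
    return ((bank[bit / 32u] >> (bit % 32u)) & 1u) != 0u;
}

/* A pair counts as set when either the bit or its redundant copy is set. */
bool otp_pair_is_set(const uint32_t bank[8], enum otp_bank_m_pair low_bit)
{
    unsigned bit = (unsigned)low_bit;
    return otp_bit(bank, bit) || otp_bit(bank, bit + 1u);
}
```

The actual OTP access procedure and bit ordering are described in the OTP chapter and are
not reproduced by this sketch.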
7.2.3 Boot device selection
The device has seven external strapping pins (STRAP[6:0]) that are sampled by internal
hardware logic during the power-on reset sequence, and latched on the BOOTSTRAP_CFG
miscellaneous register (0xE0700004).
After latching, STRAP[6:0] pins are reusable for different purposes. When used as output
pins, they require no special conditions, but when used as input pins, the application must
keep them in a non-driving (tri-state) mode for at least 2 µs after MRESETn is released.
Note:
For the description of STRAP[6:0] pins, refer to Doc ID 023063, Data sheet, SPEAr1340,
Dual-core Cortex A9 HMI embedded MPU.
The STRAP[3:0] pins are used to select the internal booting method (see Table 34).
By reading the status of these pins, BootROM selects the booting device from among the
following:
● Serial NOR Flash
● Parallel NOR Flash
● NAND Flash
● SD/MMC card
● UART0
● OTG as USB device
Table 34 describes SPEAr1340 boot selection.
Note: For pad configuration details, refer to RM0089, Reference manual, SPEAr1340 address
map and registers, Miscellaneous chapter, "Pad configuration options" table.
Table 34. Hardware boot selection (STRAP[0..3])

Primary source               Backup source(1)   STRAP3  STRAP2  STRAP1  STRAP0
Bypass                       na                 0       0       0       0
Serial NOR Flash             USB OTG (Device)   0       0       0       1
NAND Flash                   USB OTG (Device)   0       0       1       0
Parallel NOR Flash (8-bit)   USB OTG (Device)   0       0       1       1
Parallel NOR Flash (16-bit)  USB OTG (Device)   0       1       0       0
UART0                        na                 0       1       0       1
rfu(2)                       na                 0       1       1       0
rfu                          na                 0       1       1       1
USB OTG (Device)             na                 1       0       0       0
Serial NOR Flash             UART0              1       0       0       1
NAND Flash                   UART0              1       0       1       0
Parallel NOR Flash (8-bit)   UART0              1       0       1       1
Parallel NOR Flash (16-bit)  UART0              1       1       0       0
MMC/SD memory card           na                 1       1       0       1
rfu                          na                 1       1       1       0
rfu                          na                 1       1       1       1

1 The backup source is used if the primary source is not available.
2 Reserved for future use.
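The boot selection of Table 34 maps naturally onto a 16-entry lookup table. The sketch
below is illustrative only: the enumeration, the structure and the function name are
hypothetical, and it assumes that STRAP[3:0] are latched in bits [3:0] of
BOOTSTRAP_CFG, which this document does not state (see RM0089 for the actual register
layout).

```c
#include <stdint.h>

enum boot_src {
    BOOT_NONE,      /* "na" in Table 34 */
    BOOT_RFU,       /* reserved for future use */
    BOOT_BYPASS,
    BOOT_SNOR,
    BOOT_NAND,
    BOOT_PNOR8,
    BOOT_PNOR16,
    BOOT_UART0,
    BOOT_USB_OTG,
    BOOT_SD_MMC
};

struct boot_sel {
    enum boot_src primary;
    enum boot_src backup;
};

/* Index is the STRAP[3:0] value, transcribed row by row from Table 34. */
static const struct boot_sel boot_table[16] = {
    [0x0] = { BOOT_BYPASS,  BOOT_NONE    },
    [0x1] = { BOOT_SNOR,    BOOT_USB_OTG },
    [0x2] = { BOOT_NAND,    BOOT_USB_OTG },
    [0x3] = { BOOT_PNOR8,   BOOT_USB_OTG },
    [0x4] = { BOOT_PNOR16,  BOOT_USB_OTG },
    [0x5] = { BOOT_UART0,   BOOT_NONE    },
    [0x6] = { BOOT_RFU,     BOOT_NONE    },
    [0x7] = { BOOT_RFU,     BOOT_NONE    },
    [0x8] = { BOOT_USB_OTG, BOOT_NONE    },
    [0x9] = { BOOT_SNOR,    BOOT_UART0   },
    [0xA] = { BOOT_NAND,    BOOT_UART0   },
    [0xB] = { BOOT_PNOR8,   BOOT_UART0   },
    [0xC] = { BOOT_PNOR16,  BOOT_UART0   },
    [0xD] = { BOOT_SD_MMC,  BOOT_NONE    },
    [0xE] = { BOOT_RFU,     BOOT_NONE    },
    [0xF] = { BOOT_RFU,     BOOT_NONE    }
};

/* Assumes STRAP[3:0] occupy the four least significant bits of the value. */
struct boot_sel boot_select(uint32_t bootstrap_cfg)
{
    return boot_table[bootstrap_cfg & 0xFu];
}
```

A table keeps the mapping in one place and makes the reserved codes explicit, so a caller
can treat BOOT_RFU as an error case.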
7.2.4 Software architecture
BootROM code is the first piece of software executed after the power-on phase. It is
logically divided into three main parts:
● System initialization: this part is common for all boot modes.
● Boot device initialization and code-shadowing: this part is different for each boot mode.
  It consists of initializing the boot device and copying the Xloader image from that device
  to SRAM memory.
● Xloader authentication and execution: this part is common for all boot modes.
SYSROM code memory map
The 32 KB SYSROM memory is located at address 0xFFFF0000, where the ARM Cortex
A9 starts fetching just after reset. The BootROM code resides here.
Figure 31 shows the memory map of the SYSROM.
Figure 31. SYSROM memory map

Address range              Content
0xFFFF7F00 - 0xFFFF7FFF    Version Table
0xFFFF7E00 - 0xFFFF7EFF    Security Table
0xFFFF5000 - 0xFFFF7DFF    Security Code
0xFFFF4C38 - 0xFFFF4FFF    unused
0xFFFF4A4C - 0xFFFF4C37    rw data
0xFFFF4398 - 0xFFFF4A4B    ro data
0xFFFF0020 - 0xFFFF4397    Text
0xFFFF0000 - 0xFFFF001F    Vector Table
SYSRAM0 memory map
At start-up, the DDR memory is not initialized. Instead, SPEAr1340 has a 32 KB SYSRAM0
memory that BootROM uses for the following purposes:
● Stack
● BSS
● RW data
● Code-shadowing
● Others: Pen holding and internal watchdog reset
Figure 32 shows the memory map of the SYSRAM0.

Figure 32. SYSRAM0 memory map

Address range              Content
0xB3801500 - 0xB3807FFF    Code Shadowing Area
0xB38007F4 - 0xB38014FF    RAM Variables
0xB3800608 - 0xB38007F3    RW data
0xB3800600 - 0xB3800607    Pholding and Watchdog Reset
0xB3800000 - 0xB38005FF    Stack

7.2.5 System initialization
When the SoC is powered up and BootROM starts executing, it first performs basic system
initialization. This set of steps is common irrespective of the boot mode chosen.
Figure 33 shows the tasks performed during system initialization.
Figure 33. System initialization
[Flowchart. From the system reset / BootROM entry point: system initialization (disable MMU,
invalidate and enable I-Cache); "Wake-up triggered?" (YES: jump to ALWAYS-ON RAM location);
"Watchdog reset?" (YES: write the watchdog ID to SYSRAM0 and set bit 0 of the SYS_SW_RES
register to trigger a complete SoC reset); initialize PLL, set up stack, initialize data
segment and BSS; "Security enabled?" (YES: execute security functions; if security execution
is not successful, HANG); check boot type; proceed to the individual boot modes.]
Figure 33 is explained as follows:
1. As a first step, the memory management unit (MMU) is disabled. BootROM does not
   use virtual memory, so the MMU is not needed. To improve system performance,
   the instruction cache (I-Cache) is initialized: BootROM invalidates the
   I-Cache and then enables it.
2. If wake-up is triggered by any of the sources, the code jumps to the ALWAYS-ON RAM
   memory location (0xE0800000). This sequence is important for resuming from sleep.
   BootROM expects the wake-up code to be already present in the ALWAYS-ON
   RAM. This RAM is in the ALWAYS-ON domain and hence is not powered off during
   sleep. The wake-up code in the ALWAYS-ON RAM area is put there by higher
   layer software before going into sleep, and is responsible for resuming the system in
   the original state.
   Note: Even if the security feature is enabled, BootROM simply passes control to the
   ALWAYS-ON RAM, considering that the code comes from a trusted source. No security
   checks are performed.
3. If wake-up is not triggered, the reset may be an internal watchdog reset. In this case,
   BootROM writes the ID of the watchdog that caused the reset (WD0 or WD1) to the
   watchdog reset location in SYSRAM0 (0xB3800604) and triggers a complete SoC reset by
   setting bit 0 of the SYS_SW_RES miscellaneous register (0xE0700204).
4. If none of the above is true, this is a normal system start-up. BootROM
   initializes the PLL, sets up the stack, and initializes the BSS and data segment. Refer
   to PLL initialization for more details.
5. If either of the security bits (S1 or S0) is set in OTP (see Section 7.2.2: OTP
   configuration), BootROM executes the initial security functions. If security execution is
   successful, BootROM proceeds to the next step. Otherwise, it hangs. For example,
   if bit 4 of the SOC_CFG miscellaneous register (0xE0700000) is set, it corresponds to an
   unsupported test mode, so BootROM hangs.
   As the timer is required by almost all boot modes, BootROM also initializes the timer.
6. BootROM reads the BOOTSTRAP_CFG miscellaneous register (0xE0700004) to
   determine the STRAP pins configuration. According to the boot type, BootROM jumps
   to the corresponding boot mode.
PLL initialization
When the system comes out of reset, its clock source is osci1 (24 MHz). There are two
different factors that BootROM must take into consideration while configuring the clocks of
various subsystems:
● The clock frequencies of the individual subsystems must be within their limits: for
  instance, the C3 maximum frequency is 48 MHz.
● The power consumed by the SoC must be the minimum possible.
The aim in programming the PLL is to configure the minimum possible supported frequency.
PLL programming is done through the M, N and P parameters, which can take a defined set of
values:
Fin = 24 MHz
Fref = Fin / N
Fvco = Fref * 2 * M
Fout = Fvco / 2^P
where:
M can be from 8 to 255
N can be from 1 to 7
P can be from 0 to 6
On SPEAr1340, PLL1_FRQ is configured as 0x11000201, where:
● M = 0x11
● P = 0x2
● N = 0x1
Resulting in:
● Fref = 24 / 1 = 24 MHz
● Fvco = (24 * 2 * 17) = 816 MHz
● Fpll = Fout = (816 / (2^2)) = 204 MHz
This leads to:
● CPU Freq = Fpll / 2 = 102 MHz
● AHB Freq = Fpll / 6 = 34 MHz
● APB Freq = Fpll / 12 = 17 MHz
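The PLL arithmetic above can be checked with a short C function. This is a sketch under
stated assumptions: the field positions (M in bits [31:24], P in bits [10:8], N in bits
[7:0]) are inferred from the example value 0x11000201 and are not taken from the register
description in RM0089; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct pll_freqs {
    uint64_t fref_hz;  /* Fref = Fin / N */
    uint64_t fvco_hz;  /* Fvco = Fref * 2 * M */
    uint64_t fout_hz;  /* Fout = Fvco / 2^P */
};

/* Decode a PLL1_FRQ-style value using the assumed field layout. */
struct pll_freqs pll_decode(uint32_t pll_frq, uint32_t fin_hz)
{
    uint32_t m = (pll_frq >> 24) & 0xFFu;  /* assumed: bits [31:24] */
    uint32_t p = (pll_frq >> 8) & 0x7u;    /* assumed: bits [10:8]  */
    uint32_t n = pll_frq & 0xFFu;          /* assumed: bits [7:0]   */
    struct pll_freqs f;

    assert(n != 0u);                       /* N ranges from 1 to 7 */
    f.fref_hz = (uint64_t)fin_hz / n;
    f.fvco_hz = f.fref_hz * 2u * m;        /* 64-bit to avoid overflow for large M */
    f.fout_hz = f.fvco_hz >> p;            /* divide by 2^P */
    return f;
}
```

For PLL1_FRQ = 0x11000201 and Fin = 24 MHz this reproduces the values given in the text:
Fref = 24 MHz, Fvco = 816 MHz and Fout = 204 MHz, from which the CPU, AHB and APB
frequencies follow by dividing by 2, 6 and 12.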
7.2.6 Boot device initialization and code-shadowing
After basic system initialization, BootROM initializes the IP that is used to access
the boot device. The steps involved are:
● Pad configuration
  SPEAr1340 pads are multiplexed with different IPs. By default, all multiplexed pads are
  in input mode. Hence, it is necessary to configure the pads related to a particular IP as
  required by the IP (basically setting the direction of the pad).
● Controller configuration
  The controller used to access the device is initialized. Additionally, if the boot device
  uses a synthesizer for the device clock, BootROM configures it.
● Device access
  To access the device and get Xloader, BootROM performs the following tasks:
  a) Initializes the boot device (if required)
  b) Copies the Xloader header (64 bytes) from the device to the stack
  c) Authenticates the header
  d) Copies the Xloader image from the boot device to SRAM
Table 35 summarizes the major configuration differences for each boot mode.

Table 35. IP configuration

Boot mode    Controller  Clock source (controller)        Clock frequency (MHz)  Image source address
Boot Bypass  SMI         SMI clock = HCLK/SMI_PRESCALER   8.5                    0xE6000000
SNOR         SMI         SMI clock = HCLK/SMI_PRESCALER   8.5                    0xE6000000
PNOR         FSMC        HCLK                             34                     0xA0000000
NAND         FSMC        HCLK                             34                     0xB0800000
SD/MMC       MCIF        MCIF XY Synthesizer              40.78                  The first primary partition must be a FAT
                                                                                 partition. The image name must be xloader.img.
UART         UART0       Osci1                            24                     Sent by host using Kermit protocol
USB OTG      UDC         ohci_clk48_i                     48                     Sent by flashing utility

The following sections provide IP configuration details for each boot mode.
Boot Bypass
Boot Bypass uses the serial NOR device. In this mode, no validation or authentication is
performed on the image data; the code directly jumps to the image load address found in
the image header. The image header is authenticated, however. When using Boot Bypass, it
is the sole responsibility of the user to ensure that the image being executed comes from a
trusted source.
Note:
In this mode, security should be disabled. If security is enabled and Boot Bypass is selected,
the SoC hangs.
Pad configuration
To access the serial NOR device, the following pads are enabled:
● SMI_DATAIN
● SMI_DATAOUT
● SMI_CLK
● SMI_CS0n (this pin should be used for booting from serial NOR)
● SMI_CS1n
Controller configuration
The SMI controller is used to access serial NOR devices. These are the configuration
options:
● Bank used: Bank 0
● Controller clock: 8.5 MHz
● Chip deselect time: 300 ns
Device access
In Boot Bypass mode, the image header (first 64 bytes) is copied word-by-word from the
SNOR Flash to stack and authenticated. If the header authentication is successful, the code
directly jumps to the image load address (which should lie in the SNOR memory area).
Serial NOR (SNOR)
The serial NOR boot mode is used to boot from serial NOR Flash.
Pad configuration
See Boot Bypass.
Controller configuration
See Boot Bypass.
Device access
Once the configuration is complete, the header (first 64 bytes) is copied word-by-word from
the SNOR Flash to the stack and authenticated. If the header authentication is successful,
the load address is extracted from it and the entire image is copied word-by-word from the
SNOR Flash to the load address (in SRAM).
Parallel NOR (PNOR)
BootROM supports booting from 8-bit and 16-bit PNOR devices. The FSMC controller is
used to access the PNOR Flash.
Pad configuration
Pads enabled for 8-bit PNOR devices:
● FSMC_AD0 - FSMC_AD25
● FSMC_RB0
● FSMC_ALE_AD17
● FSMC_CE0n (this pin should be used for booting from parallel NOR)
● FSMC_CE1n
● FSMC_CLE_AD16
● FSMC_REn
● FSMC_RSTPWDWN0
● FSMC_RSTPWDWN1
● FSMC_RWPRT0n
● FSMC_RWPRT1n
● FSMC_WEn
● FSMC_IO0 - FSMC_IO7
Pads enabled for 16-bit PNOR devices:
● FSMC_AD0 - FSMC_AD25
● FSMC_RB0
● FSMC_ALE_AD17
● FSMC_CE0n (this pin should be used for booting from parallel NOR)
● FSMC_CE1n
● FSMC_CLE_AD16
● FSMC_REn
● FSMC_RSTPWDWN0
● FSMC_RSTPWDWN1
● FSMC_RWPRT0n
● FSMC_RWPRT1n
● FSMC_WEn
● FSMC_IO0 - FSMC_IO15
Controller configuration
FSMC control register for Bank0 (GenMemCtrl0):
● Wait check during the first data access is enabled
● Reset / power-down signal is sent to the Flash memory
● Type of memory specified as PNOR
● Bank0 is enabled
Timing register (GenMemCtrl_tim0):
● Duration of address state phase: 5 HCLK cycles
● Duration of hold address phase: 5 HCLK cycles
● Duration of Data_ST phase: 67 HCLK cycles
● Burst turn around duration: 5 HCLK cycles
● Burst cycle length: 5 HCLK cycles
● Data latency used: 6 HCLK cycles
Device access
The Xloader header and image data are copied using the library copy function. However, the
copy function differs depending on the device width:
● 8-bit PNOR: byte-by-byte
● 16-bit PNOR: half-word by half-word
NAND Flash
BootROM expects the X-Loader to be present in any of the first four blocks (block 0 to
block 3) of the NAND device. It uses a pure skip-block algorithm, in which bad blocks are
skipped, starting from block 0 up to block 3. If BootROM does not find Xloader in any of
these four blocks, it jumps to the default boot mode.
SPEAr1340 supports booting from a variety of NAND devices. It supports both legacy NAND
devices and newer Open NAND Flash Interface (ONFI) devices (see Section 7.4.3: List of
supported devices). The FSMC controller is used to access the NAND Flash.
Pad configuration
Pads enabled for all NAND 8-bit devices:
● FSMC_RB0
● FSMC_ALE_AD17
● FSMC_CE0n (this pin should be used for booting from NAND)
● FSMC_CE1n
● FSMC_CLE_AD16
● FSMC_REn
● FSMC_RSTPWDWN1
● FSMC_RWPRT0n
● FSMC_RWPRT1n
● FSMC_WEn
● FSMC_IO0 - FSMC_IO7
Pads enabled for all NAND 16-bit devices:
● FSMC_RB0
● FSMC_ALE_AD17
● FSMC_CE0n (this pin should be used for booting from NAND)
● FSMC_CE1n
● FSMC_CLE_AD16
● FSMC_REn
● FSMC_RSTPWDWN1
● FSMC_RWPRT0n
● FSMC_RWPRT1n
● FSMC_WEn
● FSMC_IO0 - FSMC_IO15
Controller configuration
FSMC control register for Bank0 (GenMemCtrl0):
● Wait sensitivity is activated
● NAND is selected as memory type
● CLE to RE delay is set as Tclk * 3
● ALE to RE delay is set as Tclk * 3
● The NAND device is enabled
Note: Tclk = 29.4 ns (HCLK is set to 34 MHz).
Timing registers
To detect NAND devices without issues, the timing registers of the FSMC controller are
initialized with appropriate timing values. The two timing registers to be configured are:
● GenMemCtrl_Comm0 (timing register for common mode and NAND bank 0)
● GenMemCtrl_Attrib0 (timing register for PC card attribute mode and wait mode for NAND
  bank 0)
BootROM initializes the following timing settings in these registers. The values below are
hardcoded into the BootROM code and cannot be tuned. They have been deliberately chosen to
be relaxed in order to allow booting from any NAND device. In later stages, after BootROM,
these values can be changed according to the NAND device used.
● THIZ = 0x01: the total time is THIZ = 29.4 ns
● THOLD = 0x04: the total time is THOLD = 117.65 ns
● TWAIT = 0x06: the total time is TWAIT = 205.9 ns
● TSET = 0x00: the total time is TSET = 29.4 ns
Hence, both GenMemCtrl_Comm0 and GenMemCtrl_Attrib0 registers are initialized with
value 0x01040600.
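The register value 0x01040600 is consistent with the four timing fields being packed one
per byte. The following C sketch shows that packing; the byte positions (THIZ in bits
[31:24], THOLD in [23:16], TWAIT in [15:8], TSET in [7:0]) are inferred from the value
quoted above and should be checked against the FSMC register description in RM0089. The
function name is hypothetical.

```c
#include <stdint.h>

/* Pack the four NAND timing fields, one per byte (assumed layout). */
uint32_t fsmc_nand_timing(uint8_t thiz, uint8_t thold, uint8_t twait, uint8_t tset)
{
    return ((uint32_t)thiz  << 24) |
           ((uint32_t)thold << 16) |
           ((uint32_t)twait << 8)  |
            (uint32_t)tset;
}
```

Calling fsmc_nand_timing(0x01, 0x04, 0x06, 0x00) yields 0x01040600, the value that
BootROM writes to both GenMemCtrl_Comm0 and GenMemCtrl_Attrib0.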
Note:
For a detailed description of FSMC timing requirements, refer to Doc ID 023063, Data sheet,
SPEAr1340, Dual-core Cortex A9 HMI embedded MPU, “Timing characteristics” chapter.
Device access
Once the controller has been configured, the next step is to detect the Flash chip present.
The procedure followed is:
1. The reset command (0xFF) is issued.
2. The device ID is read by issuing command 0x90.
3. The device ID is read again to eliminate bus hold and interface concerns.
   If the two ID reads match, a NAND device is present on the board.
4. The manufacturer ID is read as well.
5. The device ID read is compared with the static table in BootROM memory. This table
   contains the IDs of legacy NAND devices.
   If the device ID read from the chip is found in the static table, the device parameters
   such as page_size, block_size, and so on are read from the table. Otherwise, the chip
   is probed for ONFI-compliant Flash.
6. Once the chip parameters are known, the NAND chip is read page-by-page.
7. The first 64 bytes constitute the image header. Once the header is validated, the entire
   image is read page-by-page. Refer to Section 7.2.8: Image header authentication for
   details on header authentication.
SD/MMC
BootROM supports booting from SD and MMC cards.
Note:
The file system used by the card is the FAT file system.
Pad configuration
Pads enabled for SD/MMC:
● MCIF_nCE_SD_MMC
● MCIF_DATA_DIR
● MCIF_SD_CMD
● MCIF_LEDS
● MCIF_ADDR1_CLE_CLK
● MCIF_nCD_SD_MMC
● MCIF_DMARQ_RnB_WP
● MCIF_DATA0
● MCIF_DATA[1:3]_SD
● MCIF_DATA[4:7]
Controller configuration
Peripheral register (PERIP_CFG miscellaneous register):
● Configure MCIF for SD/MMC card
MCIF clock synthesizer (MCIF_SD_CLK_SYNT miscellaneous register):
● Set X parameter as 1
● Set Y parameter as 8
● Output clock synthesizer frequency:
  Fout = Fin * X / (2 * Y) for 50% duty cycle
  For instance, Fout = 12.65 MHz
Device access
The first step after configuring the controller is to determine the card type: SD or MMC. This
is done by sending a command to the card that exists in the SD specification and is not
present in MMC cards.
BootROM checks whether the SD card supports version 2 of the SD specification (the first
version to define the high capacity bit) with the command CMD8, then tests for the bit with
CMD41. If the card does not support version 2 of the SD specification, CMD41 is still issued
(without requesting the high capacity bit) to differentiate between a standard capacity SD
card and an MMC card.
The following flowchart shows this procedure.
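Figure 34. SD/MMC card detection sequence
[Flowchart. From Start, a first command distinguishes (OK / timeout) an SD version 2 card
from an SD version 1 card, an MMC card or no card. For SD cards, a command requesting the
high capacity bit classifies the card as SD standard (bit cleared) or SD high capacity (bit
set); a timeout leads to "MMC or no card". For MMC cards, a command requesting high density
followed by a bit check classifies the card as MMC standard or MMC high density; a timeout
leads to "No card". All successful paths end at "Proceed with card configuration".]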
Once the card type has been determined, image data is read from it using the standard
SD/MMC card reading procedures. The protocol used for both SD and MMC is based on
transactions where the host initiates a transfer by sending a command and the card
responds with status information, the actual data and a CRC. To read a particular FAT
sector, SD/MMC driver sets the start address to read from and the number of bytes to be
read (for a FAT sector, this is always 512 bytes).
UART
When UART booting is selected, UART0 is configured for a baud rate of 115200 bps. In
UART boot mode, BootROM supports data transfer using only the Kermit protocol.
Pad configuration
Pads enabled for UART0:
● UART0_RXD
● UART0_TXD
Controller configuration
● Select osci1 (24 MHz) as the UART0 clock source
● Baud rate divisor values are set for 115200 bps
● Line control registers are configured for:
  – 8-bit word length
  – No parity
  – 1 stop bit
  – FIFO enabled
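As an illustration of the baud rate divisor setting, the sketch below computes integer and
fractional divisor values for a 24 MHz UART clock and 115200 bps. It assumes a
PL011-style UART, in which the divisor is Fuartclk / (16 * baud) with a 6-bit fractional
part; this document does not state the divisor format, so the formula, the structure and
the function name are assumptions to be checked against the UART chapter.

```c
#include <assert.h>
#include <stdint.h>

struct uart_baud_div {
    uint32_t ibrd;  /* integer part of the divisor */
    uint32_t fbrd;  /* fractional part, in 1/64 units */
};

/* Assumed PL011-style divisor: uartclk / (16 * baud), rounded to 1/64. */
struct uart_baud_div uart_baud_divisor(uint32_t uartclk_hz, uint32_t baud)
{
    struct uart_baud_div d;
    uint64_t div64;

    assert(baud != 0u);
    /* divisor * 64 = uartclk * 4 / baud; compute twice that, then round. */
    div64 = (8ull * uartclk_hz / baud + 1u) / 2u;
    d.ibrd = (uint32_t)(div64 >> 6);
    d.fbrd = (uint32_t)(div64 & 0x3Fu);
    return d;
}
```

Under these assumptions, a 24 MHz clock and 115200 bps give an integer part of 13 and a
fractional part of 1 (24000000 / (16 * 115200) is approximately 13.02).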
Device access
As mentioned above, in UART boot mode the Kermit protocol is used to transfer data. Data
is sent in the form of Kermit packets. The data in each packet is enclosed between a packet
header and footer. Refer to Kermit protocol manual for more details on data transmission:
https://www-vs.informatik.uni-ulm.de/teach/ws05/rn1/Kermit%20Protocol.pdf
Each data byte transmitted by the host is received in UART0's data register (UARTDR,
0xE0000000). As in the other boot modes, if the header authentication is successful,
BootROM starts copying the image data to the SRAM memory.
USB OTG
USB OTG boot mode is selected as the primary source by setting STRAP[3:0] to 1000 (see
Table 34). USB boot mode is useful because it is used (in conjunction with the Flashing
utility) to burn the next-level bootloaders and/or the operating system onto the SNOR, PNOR
and NAND Flash chips.
The USB OTG supports two modes of operation: slave mode and DMA mode. For USB
booting, the SPEAr device is configured in slave mode to perform transfers over a bulk
endpoint.
Pad configuration
There is no pad multiplexing for USB IP pins. No pad needs to be enabled for USB OTG
boot mode.
Controller configuration
The following steps are performed during USB OTG initialization:
1. All test registers and state machines in the USB 2.0 nanoPHY are reset.
2. The transmit and receive logic of the UHC1 port is reset.
3. BootROM waits until the USB 2.0 nanoPHY PLL is locked.
4. OTG HCLK is reset and enabled.
5. Vendor ID (VID) and Product ID (PID) are initialized: if either of OTP bits 251 or 252
   is set, BootROM reads the VID and PID from the OTP memory. Otherwise, it uses the default
   VID (0x0483) and PID (0x3802). Then, BootROM prepares the USB string descriptors with
   the following data:
   – MANUFACTURER - ST MICROELECTRONICS
   – PRODUCT_NAME - ST SPEAr SoC Family
   – DEVICE_ID - As read from the DIE_ID_3 and DIE_ID_4
After that, BootROM initializes the USB OTG controller registers and prepares the core for
device mode. The following settings are applied:
● Global AHB configuration register (GAHBCFG):
  – Periodic TxFIFO is marked completely empty
  – IN Endpoint TxFIFO is marked completely empty
● All interrupts except the following are masked:
  – OUT Endpoints Interrupt
  – USB Reset
  – Enumeration Done
  – Receive FIFO Non-Empty
● Soft disconnect is removed by setting bit 1 in the Device control register (DCTL)
Once this configuration is done, BootROM waits for the USB RESET interrupt. The RESET
interrupt is raised when the device is connected to the host using an OTG cable. Once the
RESET interrupt is received, the generic USB device enumeration procedure is performed
and the device is enumerated as a HIGH SPEED DEVICE.
The following tables list the descriptors used.
Table 36. USB device descriptors
Field               Length  Offset  Hex     Description
                    (bits)  (bits)  value
bLength             8       0       0x12    Descriptor size is 18 bytes.
bDescriptorType     8       8       0x01    DEVICE descriptor type
bcdUSB              16      16      0x0200  USB Specification version 2.00
bDeviceClass        8       32      0x00    Each interface specifies its own class information.
bDeviceSubClass     8       40      0x00    Each interface specifies its own subclass information.
bDeviceProtocol     8       48      0x00    The device does not use class-specific protocols on a
                                            device basis.
bMaxPacketSize0     8       56      0x40    Maximum packet size for endpoint zero is 64.
idVendor            16      64      0x0483  Vendor ID is 1155: STMicroelectronics.
idProduct           16      80      0x3802  Product ID is 14338.
bcdDevice           16      96      0x0100  The device release number is 1.00.
iManufacturer       8       112     0x01    The manufacturer string descriptor index is 1.
iProduct            8       120     0x02    The product string descriptor index is 2.
iSerialNumber       8       128     0x03    The serial number string descriptor index is 3.
bNumConfigurations  8       136     0x01    The device has 1 possible configuration.
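As an illustration only, the device descriptor fields listed in Table 36 can be laid out as the 18-byte array that a device returns for a GET_DESCRIPTOR (DEVICE) request. This is a sketch, not BootROM source code: the array name and the helper function are hypothetical, and the multi-byte fields are stored least-significant byte first, as the USB 2.0 specification requires.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical array built from the values in Table 36 (not BootROM source). */
static const uint8_t spear_boot_device_desc[18] = {
    0x12,       /* bLength: 18 bytes */
    0x01,       /* bDescriptorType: DEVICE */
    0x00, 0x02, /* bcdUSB: 0x0200, little-endian */
    0x00,       /* bDeviceClass */
    0x00,       /* bDeviceSubClass */
    0x00,       /* bDeviceProtocol */
    0x40,       /* bMaxPacketSize0: 64 */
    0x83, 0x04, /* idVendor: 0x0483 */
    0x02, 0x38, /* idProduct: 0x3802 */
    0x00, 0x01, /* bcdDevice: 0x0100 */
    0x01,       /* iManufacturer */
    0x02,       /* iProduct */
    0x03,       /* iSerialNumber */
    0x01        /* bNumConfigurations */
};

/* Read a 16-bit little-endian field at a byte offset (table offset in bits / 8). */
static uint16_t desc_le16(const uint8_t *d, size_t byte_off)
{
    return (uint16_t)(d[byte_off] | ((uint16_t)d[byte_off + 1] << 8));
}
```

For example, desc_le16(spear_boot_device_desc, 64 / 8) reads back the idVendor field at bit offset 64.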
Table 37. USB configuration descriptors
Field                Length  Offset  Hex     Description
                     (bits)  (bits)  value
bLength              8       0       0x09    Descriptor size is 9 bytes.
bDescriptorType      8       8       0x02    CONFIGURATION descriptor type
wTotalLength         16      16      0x0020  The total length of data for this configuration is 32. This
                                             includes the combined length of all the descriptors
                                             returned.
                                             Warning: The value of wTotalLength is not equal to real
                                             length
bNumInterfaces       8       32      0x01    This configuration supports 1 interface.
bConfigurationValue  8       40      0x01    The value 1 should be used to select this configuration.
iConfiguration       8       48      0x00    The device does not have the string descriptor describing
                                             this configuration.
bmAttributes         8       56      0xC0    Configuration characteristics:
                                             Bit 7: Reserved (set to 1)
                                             Bit 6: Self-powered (set to 1)
                                             Bit 5: Remote Wakeup (set to 0)
                                             Note: The rest of the bits are reserved and set to 0.
bMaxPower            8       64      0x00    The maximum power consumption of the device in this
                                             configuration is 0 mA.
Table 38. USB interface descriptors
Field               Length  Offset  Hex     Description
                    (bits)  (bits)  value
bLength             8       72      0x09    Descriptor size is 9 bytes.
bDescriptorType     8       80      0x04    INTERFACE descriptor type
bInterfaceNumber    8       88      0x00    The number of this interface is 0.
bAlternateSetting   8       96      0x00    The value used to select the alternate setting for this
                                            interface is 0.
bNumEndpoints       8       104     0x02    The number of endpoints used by this interface is 2
                                            (excluding endpoint zero).
bInterfaceClass     8       112     0x00    Unknown class
bInterfaceSubClass  8       120     0x00    The subclass code is 0.
bInterfaceProtocol  8       128     0x02    The protocol code is 2.
iInterface          8       136     0x00    The device does not have a string descriptor describing
                                            this interface.
Table 39. USB IN endpoint descriptors
Field             Length  Offset  Hex     Description
                  (bits)  (bits)  value
bLength           8       144     0x07    Descriptor size is 7 bytes.
bDescriptorType   8       152     0x05    ENDPOINT descriptor type
bEndpointAddress  8       160     0x81    This is an IN endpoint with endpoint number 1.
bmAttributes      8       168     0x02    Types
                                          Transfer: BULK
                                          Pkt Size Adjust: No
wMaxPacketSize    16      176     0x0200  Maximum packet size for this endpoint is 512 Bytes. If
                                          High-Speed, 0 additional transactions per frame.
bInterval         8       192     0x00    The polling interval value is every 0 Frames. Undefined
                                          for High-Speed.
Table 40. USB OUT endpoint descriptors
Field             Length  Offset  Hex     Description
                  (bits)  (bits)  value
bLength           8       200     0x07    Descriptor size is 7 bytes.
bDescriptorType   8       208     0x05    ENDPOINT descriptor type
bEndpointAddress  8       216     0x02    This is an OUT endpoint with endpoint number 2.
bmAttributes      8       224     0x02    Types
                                          Transfer: BULK
                                          Pkt Size Adjust: No
wMaxPacketSize    16      232     0x0200  Maximum packet size for this endpoint is 512 Bytes. If
                                          High-Speed, 0 additional transactions per frame.
bInterval         8       248     0x00    The polling interval value is every 0 Frames. If
                                          High-Speed, 0 uFrames/NAK.
Table 41. USB string descriptors
Field         Length  Offset  Hex     Description
              (bits)  (bits)  value
String Descriptor 1
bLength       8       0       0x28    Unicode String Length is 40 bytes (19 chars).
bUnicodeType  8       8       0x03    Second Byte of Unicode
STRING        8       16      0x53    String: ST SPEAr SoC Family
String Descriptor 2
bLength       8       0       0x04    Descriptor size is 4 bytes.
bUnicodeType  8       8       0x03    Second Byte of this descriptor
wLANGID[0]    16      16      0x0409  Language Id: 1033
String Descriptor 3
bLength       8       0       0x24    Unicode String Length is 36 bytes (17 chars).
bUnicodeType  8       8       0x03    Second Byte of Unicode
STRING        8       16      0x30    String: SPEAr
Device access
In the USB boot mode, SPEAr acts as a USB device. BootROM uses the following custom
protocol for data transmission between the host and the device:
1. The first packet sent by the host is a 64-byte packet. These 64 bytes constitute the
   image header.
2. The length of all other packets, except the last one, is equal to 512 bytes.
3. The last packet length is less than or equal to 512 bytes.
Note: BootROM validates the first packet (image header). If the validation is not successful, it
ignores all subsequent packets. Otherwise, it stores the image at the load address (present
in the image header).
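The packet framing described above can be sketched from the host side as follows. This is a minimal illustration, not a description of the ST flashing utility: the function and macro names are hypothetical, and it assumes that the image data (data_len bytes) follows the 64-byte header packet and is split into 512-byte packets.

```c
#include <stddef.h>

#define USB_BOOT_HEADER_LEN 64u   /* first packet: image header */
#define USB_BOOT_PACKET_LEN 512u  /* every following packet, except possibly the last */

/* Total number of packets sent by the host, header packet included. */
static size_t usb_boot_packet_count(size_t data_len)
{
    return 1u + (data_len + USB_BOOT_PACKET_LEN - 1u) / USB_BOOT_PACKET_LEN;
}

/* Length of packet `index` (0 is the header packet); 0 if index is out of range. */
static size_t usb_boot_packet_len(size_t data_len, size_t index)
{
    size_t sent;

    if (index >= usb_boot_packet_count(data_len))
        return 0;
    if (index == 0)
        return USB_BOOT_HEADER_LEN;
    sent = (index - 1u) * USB_BOOT_PACKET_LEN;   /* data bytes already sent */
    return (data_len - sent < USB_BOOT_PACKET_LEN) ? data_len - sent
                                                   : USB_BOOT_PACKET_LEN;
}
```

With 1300 bytes of image data, this gives one 64-byte header packet, two 512-byte packets and a final 276-byte packet.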
7.2.7 Xloader authentication and execution
Once the image is loaded in the SRAM, it is authenticated. Depending upon the OTP
configuration, the following checks are possible (in the order written below):
1. Data CRC verification (with Xloader devoted field) and RSA PUBLIC KEY signature
   verification. If data CRC is enabled, the CRC check is done against the image data.
2. Image decryption: if security is enabled, the image is decrypted using the security key
   from OTP.
If both of these checks are successful, the I-cache is invalidated and the code jumps to the
load address present in the image header (ih_load).
It is possible for the Xloader to return to the BootROM code. If this happens, Xloader
returns to the BootROM with the address of the next-level bootloader (possibly u-boot). This
address lies in the memory map of the boot mode selected. For instance, if SNOR boot
mode is selected and Xloader returns to the BootROM, it returns with an address lying
in the SNOR memory location where the next-level bootloader is present. Again, BootROM
copies the image header, validates it, copies the image data, validates and executes the image.
In this way, it is possible to execute multiple images from BootROM.
Error scenarios
If the data CRC check or the image decryption fails:
● For memory boot modes (SNOR, PNOR, NAND) the default boot mode is executed.
  Refer to Section 7.2.9: Default boot mode for details.
● For peripheral boot modes (UART, SD/MMC, USB) no action is taken. The SoC should
  be restarted by the user.
Note: Security checks are applied also in the default boot mode.
BootROM flow summary
The following flowchart summarizes the complete BootROM software design.
Figure 35. BootROM flowchart
[Flowchart. The connecting arrows are not reproduced here; the box and decision labels are:]
● SYSTEM RESET - BootROM entry point
● System initialization: 1. Disables MMU; 2. Invalidates I-Cache and enables it
● Wake up triggered? YES: Jump to ALWAYSOn-RAM location (0xE0800000)
● 1. Initialize PLL; 2. Setup Stack; 3. Initialize Data segment and BSS
● Watchdog reset? Set bit 0 of SYS_SW_RST register to trigger complete SoC reset.
  On 0xB3800604 write "WD0"
● Security enabled? Execute security functions. Security execution successful? HANG
● Check boot type. Boot Bypass? Is security enabled? HANG
● 1. Initialize SMI IP and enable pads; 2. Get image header and authenticate.
  Header authentication successful? Jump to image load address / Go to default boot mode
● 1. Initialize the peripherals/controller required by the boot mode; 2. Get X-loader header
  from the boot source; 3. Authenticate X-Loader header. Header authentication successful?
  Go to default boot mode
● Receive complete X-loader image. Data CRC enabled? Data CRC ok? Security enabled?
  Security validation successful? Jump to image load address
Here is a brief description of the flow:
1. BootROM does the basic system initialization.
2. After initialization, the BOOTSTRAP_CFG miscellaneous register (0xE0700004) is read to
   get the boot mode selected.
3. If it is a Boot Bypass, the SMI controller is initialized and the image header is read from
   the SMI Flash (0xE6000000). Once the header authentication is successful, the code
   jumps to the load address (present in the header). Refer to Section 7.2.8: Image
   header authentication for details on header authentication.
4. If it is not a Boot Bypass, the image header is read from the source. The source can be
   any of the following: SNOR Flash, PNOR Flash, NAND Flash, MMC card, SD card,
   the Kermit protocol (UART boot) or USB packets (USB boot).
5. The image header is authenticated and the load address is extracted from it. BootROM
   then copies the complete image from the source to the load address.
6. If data CRC is enabled, then the data CRC check is performed.
7. If security is enabled, the image is decrypted using the security key and more security
   checks are performed.
8. Finally, the code jumps to the image load address.
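The decision logic of the steps above, together with the error scenarios of this section, can be condensed into a single function. This is only a sketch of the documented behavior under simplifying assumptions: all type and function names are hypothetical, each check is reduced to a boolean input, and wake-up and watchdog handling are left out.

```c
#include <stdbool.h>

typedef enum { BOOT_JUMP_TO_IMAGE, BOOT_DEFAULT_MODE, BOOT_HANG } boot_outcome_t;

/* Hypothetical summary of the conditions evaluated during boot. */
typedef struct {
    bool security_init_ok;    /* initial security function execution succeeded */
    bool security_enabled;    /* security enabled in OTP */
    bool boot_bypass;         /* Boot Bypass selected by the bootstrap pins */
    bool header_ok;           /* image header authentication result */
    bool dcrc_enabled;        /* data CRC check enabled */
    bool dcrc_ok;             /* data CRC check result */
    bool security_checks_ok;  /* decryption and other security checks result */
} boot_inputs_t;

static boot_outcome_t bootrom_flow(const boot_inputs_t *in)
{
    if (!in->security_init_ok)
        return BOOT_HANG;
    if (in->boot_bypass) {
        if (in->security_enabled)        /* Boot Bypass requires security disabled */
            return BOOT_HANG;
        return in->header_ok ? BOOT_JUMP_TO_IMAGE : BOOT_DEFAULT_MODE;
    }
    if (!in->header_ok)
        return BOOT_DEFAULT_MODE;
    if (in->dcrc_enabled && !in->dcrc_ok)
        return BOOT_DEFAULT_MODE;
    if (in->security_enabled && !in->security_checks_ok)
        return BOOT_DEFAULT_MODE;
    return BOOT_JUMP_TO_IMAGE;
}
```

Note that BOOT_DEFAULT_MODE only means that the default boot mode logic is entered; for peripheral boot modes that logic takes no action and waits for a reset.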
Error scenarios
● If the initial security function execution fails, the SoC hangs.
● Boot Bypass requires security to be disabled. If security is enabled and Boot Bypass is
  selected, the SoC hangs.
● If the header authentication fails, the default boot mode is triggered.
● If the data CRC (DCRC) check fails, the default boot mode is triggered.
● If the image decryption or any other security check fails, the default boot mode is
  triggered.
7.2.8 Image header authentication
The image header is 64 bytes long. It has the following structure:
#include <stdint.h>
#define IH_NMLEN 32 /* Image name length, as in the standard U-Boot header: 7*4 + 4 + 32 = 64 bytes */
typedef struct image_header
{
    uint32_t ih_magic;           /* Image Header Magic Number */
    uint32_t ih_hcrc;            /* Image Header CRC Checksum */
    uint32_t ih_time;            /* Image Creation Timestamp */
    uint32_t ih_size;            /* Image Data Size */
    uint32_t ih_load;            /* Data Load Address */
    uint32_t ih_ep;              /* Entry Point Address */
    uint32_t ih_dcrc;            /* Image Data CRC Checksum */
    uint8_t  ih_os;              /* Operating System */
    uint8_t  ih_arch;            /* CPU architecture */
    uint8_t  ih_type;            /* Image Type */
    uint8_t  ih_comp;            /* Compression Type */
    uint8_t  ih_name[IH_NMLEN];  /* Image Name */
} image_header_t;
Figure 36 describes the image header authentication logic.
Figure 36. Header authentication flow
[Flowchart. Entry point - Image Header Authentication. Three decisions are evaluated in
sequence: "ih_magic = IH_MAGIC?", then "Calculated hcrc = ih_hcrc?" (after "Calculate
header crc"), then "ih_load >= 0xB2800000 and < bootrom_mem_end?". The two exits are
"Valid Header, Return SUCCESS" and "Invalid header! Return FAILURE".]
The magic number is defined as:
#define IH_MAGIC 0x27051956
To be considered valid, the image header should follow the rules below:
1. ih_magic should match the value IH_MAGIC.
2. ih_hcrc should match the calculated CRC of image_header_t.
3. ih_load should not fall between the SYSRAM0 start address and bootrom_mem_end.
When data CRC is enabled, ih_dcrc is used to validate the complete X-Loader image. It
should match the calculated CRC of the image.
Note: bootrom_mem_end is the SYSRAM0 address up to which the BSS region extends. It is
less than 0xB3801500.
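The three rules above can be sketched in C as shown below. This is not the BootROM implementation. It assumes the CRC is the standard CRC-32 (polynomial 0xEDB88320) computed over the 64-byte header with ih_hcrc set to zero, as in the U-Boot tools, and it ignores byte-order conversion of the header fields. The function names are hypothetical, and the SYSRAM0 start address and bootrom_mem_end are passed as parameters because their exact values are not restated here.

```c
#include <stdint.h>
#include <stddef.h>

#define IH_MAGIC 0x27051956u
#define IH_NMLEN 32

typedef struct image_header {
    uint32_t ih_magic, ih_hcrc, ih_time, ih_size, ih_load, ih_ep, ih_dcrc;
    uint8_t  ih_os, ih_arch, ih_type, ih_comp;
    uint8_t  ih_name[IH_NMLEN];
} image_header_t;

/* Bitwise CRC-32 (assumed algorithm, see the text above). */
static uint32_t crc32_ieee(const uint8_t *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Returns 1 if the header passes the three rules, 0 otherwise. */
static int header_is_valid(const image_header_t *h,
                           uint32_t sysram0_start, uint32_t bootrom_mem_end)
{
    image_header_t tmp;

    if (h->ih_magic != IH_MAGIC)                       /* rule 1 */
        return 0;
    tmp = *h;
    tmp.ih_hcrc = 0;                                   /* CRC field excluded (assumption) */
    if (crc32_ieee((const uint8_t *)&tmp, sizeof tmp) != h->ih_hcrc)
        return 0;                                      /* rule 2 */
    if (h->ih_load >= sysram0_start && h->ih_load < bootrom_mem_end)
        return 0;                                      /* rule 3: BootROM's own RAM */
    return 1;
}
```

The third check protects the stack, data and BSS area used by BootROM itself from being overwritten by the image being loaded.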
7.2.9 Default boot mode
BootROM executes the default boot mode only if one of the following primary boot modes fails:
● Boot Bypass
● SNOR boot
● NAND boot
● PNOR boot
The system needs to be reset if one of the following primary boot modes fails:
● UART boot
● SD/MMC boot
● USB boot
Therefore, depending upon the bootstrap pin configuration, USB booting or UART booting is
triggered if any failure occurs in a primary boot mode of the first group.
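The selection described above can be written as a small function. This is a sketch with hypothetical names; the flag usb_is_default stands for whatever the bootstrap pin configuration selects.

```c
#include <stdbool.h>

typedef enum {
    MODE_BOOT_BYPASS, MODE_SNOR, MODE_NAND, MODE_PNOR,  /* fall back to default boot */
    MODE_UART, MODE_SDMMC, MODE_USB                     /* require a reset on failure */
} boot_mode_t;

typedef enum { FALLBACK_WAIT_FOR_RESET, FALLBACK_UART, FALLBACK_USB } fallback_t;

/* Action taken when the primary boot mode `failed_mode` has failed. */
static fallback_t default_boot_action(boot_mode_t failed_mode, bool usb_is_default)
{
    switch (failed_mode) {
    case MODE_UART:
    case MODE_SDMMC:
    case MODE_USB:
        return FALLBACK_WAIT_FOR_RESET;   /* nothing is done, the user resets the SoC */
    default:
        return usb_is_default ? FALLBACK_USB : FALLBACK_UART;
    }
}
```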
Figure 37 explains the default boot mode behavior.
Figure 37. Default boot mode flow
[Flowchart. Default boot mode triggered. Is the current boot mode UART, USB or SDMMC?
YES: Do nothing. Let the user reset the SoC. NO: Is USB configured as default boot mode?
YES: Go to USB boot mode. NO: Go to UART boot mode.]
7.3 Secure boot
7.3.1 Overview
The first stage boot is the ROM code inside of the SPEAr SoC. Once the SPEAr is
provisioned with security information, and the secure mode is enabled in the OTP, the device
requires cryptographically signed code for the second stage boot. It is up to the second
stage boot to provide security for the following stages. Services are provided to the second
stage that allow code to use the same cryptographic functions for the following stages.
Alternately, the second stage can provide its own security model.
7.3.2 First stage secure boot process
The boot process for secure boot requires a signed and encrypted image. Device boot will
fail if the image supplied does not have a signature, or if it has a corrupted or invalid
signature. The device will not retry the boot process if an invalid image is detected. A device
reset is required to resume the boot process.
The boot image is protected by an RSA PKCS#1 v2.1 digital signature, and the boot code is
protected from casual viewing with a key derived from OTP data and image broadcast
descriptor (a user-defined field in the image header).
The following procedure outlines the boot process. Details of the cryptographic algorithms
can be found in Section 7.3.10: Image signature cryptography.
1. Load the boot code into the SRAM on the device, using any boot source selected by
   the standard boot loader.
   Note: RAM is required for the boot process due to in-place decryption.
2. Retrieve the digital signature from the end of the boot image.
3. Use C3 to generate a hash of the public keys and check it against the public key
   hash stored in OTP. Return an error code if the public key hash is incorrect.
4. Using C3, hash the boot image.
5. Verify the signature held in Flash using the Master Public Key and PKCS#1 v2.1
   EMSA-PSS. Return an error code if it does not match.
6. Extract the OTP data, signature data and image broadcast descriptor into the C3 buffer.
7. Perform key derivation on the data from step 6 to generate the AES key.
8. Decrypt the code in place (SRAM) using the KDF key.
9. Hand off the code execution to the boot code.
Boot code has access to the cryptographic API outlined above and could repeat the
process, or implement its own solution.
Figure 38. First stage secure boot process
[Flowchart. The connecting arrows are not reproduced here; the box and decision labels are:]
● Reset
● security = *(sec_interface_table*) 0xFFFF7E00
● security.init(SEC_NORMAL_SECURE_MODE)
● Error? Yes: For(;;);
● sec_state = security.get_state()
● Boot bypass && Sec_state == None: Verify CRC, Execute from Flash
● Boot from Peripheral/Flash (normal boot loader process)
● Load successful?
● Verify header CRC, Load image, Verify image digital signature
● Verify header CRC, Load image, Verify image CRC, Execute from eSRAM
● Image verified? Yes: Decrypt in place, Execute
● For(;;);
7.3.3 Life cycle
Life cycle is part of any security application. The ROM-based services and intrinsic chip
capabilities are limited to a fixed set of functions, and are considered unchangeable. This
does not prevent implementation of life cycle modifications in subsequent boot code.
Secure ROM life cycle states
From the device point of view there are two states:
● Development
  – Non-Secure ROM device shipped to customer
  – JTAG enabled
● Release
  – Secure ROM device shipped to customer
  – JTAG disabled and 'Secure' indicator provided to ROM
  – Provision OTP with security credentials
Once the second link of the chain of trust is up and running the customer code can
implement any kind of life-cycle required for specific needs. It is assumed that the first level
boot does not change frequently, and implements a strong security model.
7.3.4 Services
BootROM services
typedef struct boot_rom_callbacks_{
    unsigned long table_version;
    void * (*get_soc_type)(void);
    unsigned long (*get_boot_type)(void);
    void *nand_info;
    int (*nand_read)(void *nand, unsigned int offset, unsigned int *length,
                     unsigned char *buffer);
    unsigned char * (*get_version)(void);
    int (*get_otpbits)(unsigned long bit_off, unsigned long bit_cnt,
                       unsigned long *buffer);
    unsigned long (*hamming_encode)(unsigned long parity, void *data, unsigned int d,
                                    unsigned int nr);
    void (*hamming_fix)(void *data, unsigned int d, unsigned int nr,
                        unsigned int fix);
} boot_rom_callbacks_t;
Secure ROM services
The secure ROM service table is located at 0xFFFF7E00, and can be verified by checking
the table version and revision fields. These services are used by the BootROM to validate
the second stage boot image that gets loaded into SRAM, before the BootROM transfers
control to the SRAM code.
Error codes and definitions
/**
 * Security interface error code definition
 * \ingroup values
 */
typedef enum sec_err_{
    SEC_SUCCESS = 0,            /**< Regular success code */
    SEC_UNSUPPORTED = 1,        /**< The requested feature is not supported in this
                                     configuration */
    SEC_IMAGE_VERIF_FAILED = 2, /**< Indicate digital signature verification failed */
    SEC_INVALID_PARAMS = 3,     /**< Indicate an invalid parameter has been passed */
    SEC_ALREADY_PRESENT = 4,    /**< Indicate the operation has already been
                                     performed and cannot be performed another time (life
                                     cycle) */
    SEC_SELF_TEST_FAILED = 5,   /**< Indicate self test has failed. \note The caller
                                     shall enter infinite loop */
    SEC_OUT_OF_RESOURCES = 6,   /**< internal memory allocation failed, probably need
                                     more memory or there is a fragmentation issue */
    SEC_INVALID_BLOB = 7,       /**< The blob given as input is invalid */
    SEC_INVALID_KEY = 8,        /**< The keys given as parameter are not valid i.e.
                                     their hash didn't match */
    SEC_INVALID_STATE = 9,      /**< The operation requested is not supported in the
                                     current state of the device */
    SEC_OTP_CORRUPT = 10,       /**< The operation requested reported a corruption
                                     within the OTP memory */
}sec_err_t;
/**
 * \typedef sec_init_t
 * Defines the initialization mode:
 * \ingroup values
 */
typedef enum sec_init_{
    SEC_NORMAL_SECURE_MODE,     /**< Regular operating mode */
    SEC_FAKE_SECURE_MODE,       /**< Test mode used in order to simulate secure mode
                                     in one of the following 2 cases:
                                     - a. ROM only device and bonding option is NOT SECURE
                                     - b. NV RAM device and fuse in life cycle hasn't been
                                     blown yet */
    SEC_ALWAYS_LAST,            /**< Sentinel for the enum type */
}sec_init_t;
typedef enum sec_state_{
    SEC_SECURITY_ENABLED,       /**< Indicates security is enabled at device level */
    SEC_SECURITY_DISABLED,      /**< Indicates security is disabled at device level */
}sec_state_t;
7.3.5 Security table in BootROM
The security services implemented in the BootROM are provided at a well known location
(see BootROM and RAM layout). These services allow subsequent loader code to access
the cryptographic algorithms, and C3 hardware. The services are called by C code and
conform to the standard ARM C calling conventions. The services allow the caller to assess
the state of the secure boot environment and implement the same cryptography used by the
secure ROM code in the follow-on boot stages.
Table 42. Security table
Field            Definition
revision         Firmware revision - reflects which services are provided
table_rev        Table revision - reflects table structure
mem_size         Size of the memory required by the security module
init             Performs initialization (an array of callback function is expected)
get_state        Returns the current security state of the device
verify_image     Verifies the current image
decrypt_image    Decrypts the current image using PKCS#1 v2.1
sign_challenge   Signs the incoming challenge using PKCS#1 v2.1 (future support)
create_rng_pool  Creates random pool - creates a pool of random numbers in OTP. Random
                 pool (RNG_POOL) is a security parameter in OTP and is used as part of
                 the KDF to generate the encryption key for the firmware.
provision        Flips life cycle state to next state (future support)
seal             Seals code or data (future support)
unseal           Unseals code or data (future support)
clear_lifecycle  De-commission the device - all secrets are lost (future support)
C function call information for security services
typedef struct sec_interface_table_{
    unsigned long revision;  /**< Firmware revision number */
    unsigned long table_rev; /**< API revision */
    unsigned long mem_size;  /**< Size of memory required by the security module */
    sec_err_t (*sec_init_fn)(sec_init_t flag, void * mem, unsigned long mem_size,
                             boot_rom_callbacks_t * rom_cb);
    sec_state_t (*sec_get_state_fn)(void * mem);
    sec_err_t (*sec_verify_image_fn)(void * mem, unsigned char * image,
                                     unsigned long image_size);
    sec_err_t (*sec_decrypt_image_fn)(void * mem, unsigned char * image_src,
                                      unsigned char * image_dst, unsigned long image_size);
    sec_err_t (*sec_sign_challenge_fn)(void * mem, unsigned char * challenge,
                                       unsigned long challenge_size,
                                       unsigned char * response,
                                       unsigned long * response_length);
    sec_err_t (*sec_create_rnd_pool_fn)(void * mem, unsigned char * pub_keys,
                                        unsigned char * out_buf);
    sec_err_t (*sec_provision_fn)(void * mem, unsigned long * cycle);
    sec_err_t (*sec_seal_blob_fn)(void * mem, unsigned char * blob_in,
                                  unsigned long sensitive_offset,
                                  unsigned long blob_in_len,
                                  unsigned char * blob_out);
    sec_err_t (*sec_unseal_blob_fn)(void * mem, unsigned char * blob_in,
                                    unsigned long sensitive_offset,
                                    unsigned long blob_in_len,
                                    unsigned char * blob_out);
    sec_err_t (*sec_clear_life_cycle_fn)(void * mem );
}sec_interface_table_t;
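A follow-on boot stage could call these services along the lines of the sketch below. It is an illustration under several assumptions, not ST reference code: the function verify_and_decrypt and all mock_ names are hypothetical, the table type is cut down to its first entries (the leading layout is unchanged), in-place decryption is assumed to be requested by passing the same pointer as source and destination, and a mock table stands in for the ROM table at 0xFFFF7E00 so that the code can run on a host.

```c
#include <stddef.h>

/* Types repeated from this section so that the sketch compiles on its own. */
typedef enum sec_err_ {
    SEC_SUCCESS = 0, SEC_UNSUPPORTED, SEC_IMAGE_VERIF_FAILED, SEC_INVALID_PARAMS,
    SEC_ALREADY_PRESENT, SEC_SELF_TEST_FAILED, SEC_OUT_OF_RESOURCES,
    SEC_INVALID_BLOB, SEC_INVALID_KEY, SEC_INVALID_STATE, SEC_OTP_CORRUPT
} sec_err_t;
typedef enum sec_init_ { SEC_NORMAL_SECURE_MODE, SEC_FAKE_SECURE_MODE,
                         SEC_ALWAYS_LAST } sec_init_t;
typedef enum sec_state_ { SEC_SECURITY_ENABLED, SEC_SECURITY_DISABLED } sec_state_t;
typedef struct boot_rom_callbacks_ boot_rom_callbacks_t;   /* opaque in this sketch */

typedef struct sec_interface_table_ {
    unsigned long revision;
    unsigned long table_rev;
    unsigned long mem_size;
    sec_err_t (*sec_init_fn)(sec_init_t, void *, unsigned long, boot_rom_callbacks_t *);
    sec_state_t (*sec_get_state_fn)(void *);
    sec_err_t (*sec_verify_image_fn)(void *, unsigned char *, unsigned long);
    sec_err_t (*sec_decrypt_image_fn)(void *, unsigned char *, unsigned char *,
                                      unsigned long);
    /* remaining entries (sign_challenge to clear_life_cycle) omitted here */
} sec_interface_table_t;

/* Initialize the module, then verify and decrypt `image` in place if security is on. */
static sec_err_t verify_and_decrypt(const sec_interface_table_t *t, void *mem,
                                    boot_rom_callbacks_t *rom_cb,
                                    unsigned char *image, unsigned long size)
{
    sec_err_t err = t->sec_init_fn(SEC_NORMAL_SECURE_MODE, mem, t->mem_size, rom_cb);
    if (err != SEC_SUCCESS)
        return err;
    if (t->sec_get_state_fn(mem) != SEC_SECURITY_ENABLED)
        return SEC_SUCCESS;                  /* nothing to verify or decrypt */
    err = t->sec_verify_image_fn(mem, image, size);
    if (err != SEC_SUCCESS)
        return err;
    return t->sec_decrypt_image_fn(mem, image, image, size);
}

/* Host-side mocks standing in for the ROM services (no real cryptography). */
static int mock_fail_verify;
static sec_err_t mock_init(sec_init_t f, void *m, unsigned long n,
                           boot_rom_callbacks_t *cb)
{ (void)f; (void)m; (void)n; (void)cb; return SEC_SUCCESS; }
static sec_state_t mock_state(void *m) { (void)m; return SEC_SECURITY_ENABLED; }
static sec_err_t mock_verify(void *m, unsigned char *img, unsigned long n)
{ (void)m; (void)img; (void)n;
  return mock_fail_verify ? SEC_IMAGE_VERIF_FAILED : SEC_SUCCESS; }
static sec_err_t mock_decrypt(void *m, unsigned char *src, unsigned char *dst,
                              unsigned long n)
{ (void)m; for (unsigned long i = 0; i < n; i++) dst[i] = src[i] ^ 0xFFu;
  return SEC_SUCCESS; }
static const sec_interface_table_t mock_table = {
    1, 1, 64, mock_init, mock_state, mock_verify, mock_decrypt
};
```

On the target, the table pointer would be obtained from the fixed ROM address given in this section instead of from mock_table.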
7.3.6 BootROM and RAM layout
Following is the final Flash layout, which includes all the BootROM sections, all the secure
ROM sections and the tables.
Figure 39. BootROM and RAM layout
[Memory map diagram with two columns, "ROM area" and "SRAM area". Recoverable labels, in
the order printed: 0xFFFF0000, BootROM, 0xFFFF4FFF, Security extensions, 0xFFFF5000,
Security service table.]
7.3.7 OTP layout
SPEAr has three 256-bit banks (Bank 1, Bank 2 and Bank M) embedded into the OTP
module, which is an array of one-time programmable anti-fuse memory cells reserved for
system assigned purposes.
There are two types of data stored in OTP:
Modifiable data
This data can be updated from a 0 to a 1 at any time and the results affect the function of the
device over time.
Unmodifiable (fixed) data
This data may never be modified and is protected from modification attempts by a CRC and
ECC. Any modification of this data will result in either an ECC/CRC failure or an ECC
correction (if the modification is within the tolerance of the ECC correction capabilities).
The purpose of this data is to remain constant for the entire life of the device, until the end
of life, when it should be destroyed by zeroizing the data (setting all bits to 1, the burned
state).
This OTP area should be write protected (if possible) by hardware mechanisms.
The security parameters (all unmodifiable) are as follows:
Table 43. Security parameters
Parameter        Description
RSA PUBLIC KEY   2048 bits
RNG_POOL         128 bits (from bank M bit 104)
PUBLIC_KEY_HASH  256 bits: 1 bit at Bank 1 (bit 177) and 255 bits at Bank 2 (bits 0 to 254)
                 PUBLIC_KEY_HASH = SHA256 (RSA_PUBLIC_KEY)
DATA[256 bits]   SHA-256 (RNG_POOL + PUBLIC_KEY_HASH)
CRC[32 bits]     DATA[bits 0-31] xor DATA[bits 32-63] xor DATA[bits 64-95] xor DATA[bits
                 96-127] xor DATA[bits 128-159] xor DATA[bits 160-191] xor DATA[bits
                 192-223] xor DATA[bits 224-255]
                 The CRC protects the random pool (RNG_POOL) and the public key hash
                 (PUBLIC_KEY_HASH) from modification. The CRC is generated by taking
                 the result of the SHA-256 of this information (DATA defined above) and
                 performing a 32-bit XOR of the resulting eight 32-bit words of DATA. The
                 resulting 32 bits are broken up into two locations in Bank 1 (31 bits in one
                 location and 1 bit (the high bit) in CRC MSB).
ECC              hamming_code(PUBLIC_KEY_HASH + RNG_POOL + CRC)
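The CRC row of Table 43 amounts to an XOR fold of the eight 32-bit words of DATA, followed by a split into a 31-bit part and a 1-bit part. The sketch below illustrates this with hypothetical function names; how the SHA-256 output bytes are packed into 32-bit words is not specified in this section, so the words are taken as an input.

```c
#include <stdint.h>

/* XOR the eight 32-bit words of DATA (the SHA-256 result) into one 32-bit value. */
static uint32_t otp_fold_crc(const uint32_t data[8])
{
    uint32_t crc = 0;
    for (int i = 0; i < 8; i++)
        crc ^= data[i];
    return crc;
}

/* Part stored in the 31-bit "Security CRC" field of Bank 1. */
static uint32_t otp_crc_low31(uint32_t crc)
{
    return crc & 0x7FFFFFFFu;
}

/* Part stored in the 1-bit "CRC MSB" field of Bank 1. */
static uint32_t otp_crc_msb(uint32_t crc)
{
    return crc >> 31;
}
```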
BANK 1 configuration for secure boot
Bit range       Width   Field
0xff            1 bit   xxxx
0xfe....0xe0    31b     Security CRC
0xdf....0xc0    8b      VE1
                8b      ST1
                8b      VE0
                8b      ST0
0xbf....0xa0    20b     Reserved (1)
                1b      CRC MSB
                1b      Key MSB
                10b     Security ECC
0x9f....0       160b    Unused
Note 1: These bits are not used for secure boot, they are reserved for other purposes.
Field description
Security CRC: Low 31 bits of the CRC
VE0/VE1: Version number: 0
ST0/ST1: Security Lifecycle: 3 = SECURE_BOOT
CRC MSB: Bit 31 of the CRC
Key MSB: Bit 255 of PUBLIC_KEY_HASH
Security ECC: ECC code
BANK 2 configuration for secure boot
Bit range       Width   Field
0xff            1 bit   xxxx
0xfe....0       255b    Public key hash
Field description:
Public key hash: Bits 0 to 254 of PUBLIC_KEY_HASH
BANK M configuration for secure boot
Fields are listed from the most significant bit (0xff) downwards.
Width    Field
1 bit    xxxx
1b each  S1, S0, V1, V0, J1, J0, T1, T0, E1, E0
4b       Reserved
137b     Security
72b      USB/PCI IDs
32b      WP bits
Refer to the “OTP configuration” section for the field description of Bank M.
OTP life cycle identification
The life cycle of the OTP is controlled during initial provisioning. Since the ARM has
complete access to the OTP bits, changes to the lifecycle cannot be controlled by the ROM.
Security of the OTP must be enforced by the application and is out of scope for the ROM.
ST0 and ST1 should be programmed to the same value. The value is read from OTP as
(value = ST0 or ST1). Therefore, if ST0 is 3 and ST1 is 4, the resulting value (used) would
be 0x7.
STx is a walking set of bits defined as follows:
00000000 = No security
00000001 = Security option 1 (see Note 1)
0000001x = Security option 3 (see Note 1)
000001xx = Provisioned (security enabled)
00001xxx-1xxxxxxx = Decommissioned
Note 1: Security options 1 and 3 are used only for debugging. These modes allow you to
pre-test code by enabling security checks and disabling CRC/ECC checks.
ST0 and ST1 are not part of the XOR calculation. Any bits that are defined twice are
duplicated and when read, are read as a logical OR (value1 OR value2). That way, if any
OTP bits did not get blown correctly, the secondary blown (1 value) bit will override the value
and be the used value of 1. Therefore for the lifecycle, a value of 1100 and a value of 0100
would be 1100. This should correct a single bit error in the lifecycle. Furthermore, the
lifecycle is defined such that once one of the higher bits is set, there is no way to go back.
Once it is provisioned, the only thing that can be done to change it is move to
decommissioned.
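The way the duplicated ST0 and ST1 fields are combined and decoded can be sketched as follows. The enumeration and function names are hypothetical; the decoding follows the walking-bit definition given above, where the highest bit set decides the state.

```c
#include <stdint.h>

typedef enum {
    LC_NO_SECURITY,      /* 00000000 */
    LC_OPTION_1,         /* 00000001 */
    LC_OPTION_3,         /* 0000001x */
    LC_PROVISIONED,      /* 000001xx */
    LC_DECOMMISSIONED    /* 00001xxx to 1xxxxxxx */
} lifecycle_t;

/* Combine the two copies with a logical OR, then decode the highest bit set. */
static lifecycle_t otp_lifecycle(uint8_t st0, uint8_t st1)
{
    uint8_t v = (uint8_t)(st0 | st1);

    if (v & 0xF8u)
        return LC_DECOMMISSIONED;
    if (v & 0x04u)
        return LC_PROVISIONED;
    if (v & 0x02u)
        return LC_OPTION_3;
    if (v & 0x01u)
        return LC_OPTION_1;
    return LC_NO_SECURITY;
}
```

With ST0 = 3 and ST1 = 4, the combined value is 0x7, which decodes as provisioned.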
7.3.8 Usage examples
Examples of operating system integration
This section covers:
● Supervisor protection
● Hypervisor protection
Supervisor protection
Supervisor-only protection is implemented entirely by the OS. The boot ROM acts only as a
root of trust and the OS must guarantee that the code in DDR cannot access any of the
secrets held in internal SRAM. In that case the vector table cannot be remapped to
external DDR.
The biggest drawback of this implementation is that it requires all the supervisor code to run
out of internal memory.
Hypervisor protection
Hypervisor protection is similar to supervisor protection in the sense that it requires the
hypervisor code to run entirely out of internal memory. The main advantage here is that the
OS can run in external DDR (as it runs in user mode).
The performance penalty is significant in that case, and the OS needs some modification as
the hypervisor actually implements para-virtualization and not full virtualization.
This adds latency on every interrupt and might not be acceptable in some cases.
The hypervisor, held in external Flash and working in conjunction with the boot ROM, extends
the trust model: all accesses to any of the keys are fully protected, handled by the
hypervisor itself and enforced by the MMU.
Expected flow for the deployment of secure boot
The following flow describes operations for development and production of secure mode
boot code:
1. The device is shipped from the ST manufacturing plant (or TSMC) in a non-secure state,
   meaning that the device always boots up in non-secure mode. It is useful to stay in
   this mode as long as the development is in progress.
2. After development, test a final image. To do this:
   a) Encrypt image using the secure ROM SDK
      – Create RSA keys
      – Sign image
      – Generate OTP provisioning data
   b) Write OTP provisioning data to OTP
3. Once secure-mode debug is completed, disable JTAG on the device by setting the
   JTAG disable OTP bits. Development has ceased, and the production OTP flashing
   solution (using above OTP data) is deployed.
4. Boot in secure mode. From this point on the device boots in secure mode.
5. Firmware update
   a) The authentication is handled by customer code
   b) Signed image using same factory root keys (step 2a) is downloaded to the device
   c) Flash new image
6. Decommission by setting the OTP lifecycle bit (ST0 and ST1) to 0xf
7.3.9 BootROM signed image format
The bootROM requires that the second stage loader be wrapped in a U-Boot header format.
The SecureROM SDK will take an unsigned image and cryptographic keys generated by the
SDK, and sign the image. It will also provide the recommended OTP settings.
Note: A standard U-Boot image header is used to identify the image. The image is not actually
U-boot, as U-boot is too large to fit within SRAM. The entire second stage image, plus the
BootROM's data area and the digital signature must all fit in SRAM (32KB) for secure boot to
work properly.
The structure of the cryptographic trailer is defined as follows:
Figure 40. Boot image format
[Layout diagram, from the start to the end of the boot image:]
● U-Boot Header (unencrypted)
● U-Boot Image (encrypted)
● Digital signature (unencrypted)
The span marked "U-Boot payload size (pad to 16 byte boundary)" covers the U-Boot Image and
the Digital signature.
NOTES:
1) U-Boot Header's payload size includes the Digital Signature.
2) The load address for the image must be in SRAM or DDR memory. This is required due to
   the image decryption phase.
Since the digital signature is included in the U-Boot payload size, and the signature is
required to be in SRAM when the signature verification is performed, the overall image size
supported is reduced by the size of the digital signature (0x70 bytes). The digital
signature must also be aligned on a 16-byte boundary, therefore the image must be padded
with up to 15 bytes to align it correctly.
#define KEY_LEN_BYTES (256/8)
typedef struct image_crypto_header_{
    unsigned char pub_enc[KEY_LEN_BYTES];   // 0x00
    unsigned char pub_sig[KEY_LEN_BYTES];   // 0x20
    unsigned long hdr_revision;             // 0x40
    unsigned long broadcast_desc;
    unsigned long revision;
    unsigned long signature_properties;
    unsigned char signature[KEY_LEN_BYTES]; // 0x50-0x6f
}image_crypto_header_t;
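The offsets in the comments and the 0x70-byte size can be checked on any host with the sketch below. It mirrors the structure with uint32_t in place of unsigned long, on the assumption that unsigned long is 32 bits wide on the target; the type name and the helper max_padded_image_len are hypothetical, as is the idea of a single byte budget for image plus trailer.

```c
#include <stdint.h>
#include <stddef.h>

#define KEY_LEN_BYTES (256/8)

/* Mirror of image_crypto_header_t with fixed-width fields (assumption: the
 * target's unsigned long is 32 bits wide). */
typedef struct {
    uint8_t  pub_enc[KEY_LEN_BYTES];     /* 0x00 */
    uint8_t  pub_sig[KEY_LEN_BYTES];     /* 0x20 */
    uint32_t hdr_revision;               /* 0x40 */
    uint32_t broadcast_desc;             /* 0x44 */
    uint32_t revision;                   /* 0x48 */
    uint32_t signature_properties;       /* 0x4c */
    uint8_t  signature[KEY_LEN_BYTES];   /* 0x50-0x6f */
} crypto_trailer_layout_t;

/* Largest padded image length (multiple of 16) that still leaves room for the
 * trailer inside `budget` bytes; 0 if the trailer alone does not fit. */
static size_t max_padded_image_len(size_t budget)
{
    if (budget < sizeof(crypto_trailer_layout_t))
        return 0;
    return (budget - sizeof(crypto_trailer_layout_t)) & ~(size_t)15u;
}
```

The real budget is smaller than the SRAM size, because the BootROM data area must fit in SRAM as well.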
7.3.10 Image signature cryptography
The boot image is encrypted (using NIST SP800-108 KDF of OTP data), and signed with a
PKCS#1 v2.1 PSS algorithm using an RSA-2048 private key. The public key is included in
the signed blob, and is verified by comparing the thumbprint of the RSA-2048 public keys
used to sign code with OTP data. If they match, the public keys are considered valid, and
are used to verify the PKCS#1 signature.
The KDF using RL and the security value from OTP allows the user to select an encryption
key based on a device key, a version and security key. This key is used independently from
the validation, and decrypts the boot image before execution.
OTP parameters:
security value, VE0, VE1, and public key hash
Cryptographic header parameters:
broadcast_desc
A || B:
the concatenation of binary strings A and B
Doc ID 018553 Rev 3
139/590
BootROM
RM0078
Parameters
B: Blob of data to sign (boot code)
Bp: B filled to 16 byte boundary
Be: Bp encrypted
Ke: RSA private key exponent
Km: RSA private key modulus
KS: RSA public key of Ke/Km
KP: RSA public key value (future support for key-chaining)
PKD: SHA_256 (KP || KS)
RC: hexadecimal string 01020304050607
RL: (VE0 | VE1) XOR broadcast_desc
Rv: security value (128 bits from OTP "security" field)
RK: KDF(KI = Rv, Label = 8, Context = RC, L = RL)
Ri: hexadecimal string a6a6a6a6a6a6a6a6a6a6a6a6a6a6a6a6
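As a small illustration of the RL definition, the sketch below combines the two redundant OTP version fields with a bitwise OR and then XORs the result with broadcast_desc. The 32-bit operand width and the function name compute_rl() are assumptions made for this example; the field widths are not stated in this section.

```c
#include <stdint.h>

/* RL = (VE0 | VE1) XOR broadcast_desc.
 * Assumption: all three values fit in 32 bits. */
uint32_t compute_rl(uint32_t ve0, uint32_t ve1, uint32_t broadcast_desc)
{
    return (ve0 | ve1) ^ broadcast_desc;
}
```

Because VE0 and VE1 are provisioned with the same number, the OR yields that number even if some bits of one copy failed to program.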
Provisioning
1. Compute PKD and store in OTP public key hash
2. Create 128 bits of security data from random number generator and store in OTP security
3. Set OTP VE0 and VE1 to the same number
4. Set OTP ST0 and ST1 to 4 (provisioned)
5. Compute CRC and store in OTP
6. Compute ECC and store in OTP
Signature generation algorithm
1. Create Bp using B (boot image including U-Boot header). Round the size up to the
nearest 16-byte boundary and fill with 0xa6.
2. Verify that the image fits in the SRAM (SRAM size minus bootROM SRAM usage).
3. Compute RK from OTP and cryptographic header data.
4. Encrypt Bp with AES CBC-128 using key RK and IV Ri. This creates Be.
5. Create signature using RSA PKCS#1 v2.1 EMSA-PSS with:
– Message payload = Be
– RSA public key Ks
6. Store RSA public key Ks in cryptographic header pub_sig field.
7. Store signature in cryptographic header.
8. Store broadcast_desc in cryptographic header.
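Step 1 of the signature generation algorithm can be sketched as follows. The function rounds the image length up to the next 16-byte boundary and fills the gap with 0xa6. The function name, the capacity argument and the choice of returning 0 on failure are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAD_ALIGN 16u    /* alignment required for Bp */
#define PAD_BYTE  0xa6u  /* fill value used to create Bp */

/* Pads buf (holding len bytes of B) in place to create Bp.
 * Returns the padded length, or 0 if the padded image does not fit
 * in capacity bytes (an empty image is also reported as 0). */
size_t pad_boot_image(uint8_t *buf, size_t len, size_t capacity)
{
    size_t padded = (len + PAD_ALIGN - 1u) & ~(size_t)(PAD_ALIGN - 1u);

    if (padded < len || padded > capacity)
        return 0;
    memset(buf + len, PAD_BYTE, padded - len);
    return padded;
}
```

At most 15 fill bytes are added, which matches the padding bound stated for the digital signature alignment.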
Signature verification algorithm
Preconditions: bootloader loads Be and cryptographic header into SRAM, and OTP has
been programmed. ST0 OR ST1 = 4, 5, 6, or 7.
1. Compute RK from OTP and cryptographic header data.
2. Compute PKD from public keys stored in cryptographic header.
3. Validate public keys by comparing PKD to OTP. If the PKD stored in OTP (step 1 of
provisioning) is the same as the PKD calculated in step 2, then the RSA keys are valid.
4. Using public key Ks in the cryptographic header pub_sig field, verify the signature in the
cryptographic header against Be using PKCS#1 v2.1 PSS.
5. Decrypt Be in place (in SRAM) with RK computed in step 1.
6. Transfer control to decrypted B.
7.4
Additional information
7.4.1
BootROM on Core 1
SPEAr1340 is based on the ARM Cortex-A9 processor. Only one core is needed for system
startup. The other one must be stalled, and may continue only when the SMP OS has been
loaded. The BootROM must therefore behave differently on the two cores. It does so by
reading the core ID at run time with the following instruction:
MRC p15, 0, r0, c0, c0, 5 /* Get our cpu id */
Core 1 is stalled using the Pen holding mechanism: Core 1 enters into a tight read-loop on
the fixed SYSRAM0 location (0xB3800600), waiting for an external event (holding pen
release). This event eventually happens in the SMP OS, when the OS writes a valid address
at the pen holding location.
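The instruction shown earlier in this section reads the Multiprocessor Affinity Register (MPIDR). On the Cortex-A9 MPCore, bits [1:0] of that register hold the CPU ID. The sketch below wraps the read and extracts the core number; the function names are illustrative, and the inline assembly is compiled only for ARM targets so that the extraction logic can also be exercised on a host.

```c
#include <stdint.h>

/* Cortex-A9 MPCore: MPIDR bits [1:0] hold the CPU ID within the cluster. */
static inline unsigned mpidr_cpu_id(uint32_t mpidr)
{
    return mpidr & 0x3u;
}

#if defined(__arm__)
/* Reads MPIDR with the same MRC instruction used by the BootROM. */
static inline uint32_t read_mpidr(void)
{
    uint32_t value;
    __asm__ volatile("mrc p15, 0, %0, c0, c0, 5" : "=r"(value));
    return value;
}
#endif
```

On the target, mpidr_cpu_id(read_mpidr()) returns 0 on the boot core and 1 on the core that is parked in the holding pen.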
Figure 41 shows the BootROM flow on Core 1.
Figure 41. BootROM on Core 1
[Flowchart. Elements shown: SYSTEM RESET (BootROM entry point); system initialization
(1. disables MMU, 2. invalidates I-Cache and enables it); decision "Wake up triggered?"
with a jump to the ALWAYSOn-RAM location (0xE0800000); decision "Watchdog reset?" with a
write of "WD1" to 0xB3800604 followed by setting bit 0 of the SYS_SW_RST register to
trigger a complete SoC reset; check of the address at the pen holding location
(0xB3800600); decision "Address = 0xFFFFFF?"; jump to the address.]
7.4.2
Error codes
If error reporting is enabled and BootROM encounters any error, it reports the same on the
error reporting device. The following table summarizes various error codes and their
meaning.
Table 44. Error codes
Error code   Possible boot mode   Definition
101          na                   This error signifies unsupported boot mode.
102          Any                  Image header is corrupted.
103          Any                  Image data is corrupted.
106          NAND                 BootROM is unable to initialize the NAND chip.
107          NAND                 BootROM is unable to read NAND chip. Possibly, the first four blocks are bad blocks.
114          SD/MMC               BootROM is unable to find/initialize SD/MMC card.
115          SD/MMC               BootROM is unable to read data from SD/MMC card.
116          USB OTG              The first USB data packet is not 64 bytes long.
117          USB OTG              Image data of size not equal to the one specified in the image header.
7.4.3
List of supported devices
SNOR
All devices, whose read command is 0x03, are supported.
PNOR
TBD
NAND
All ONFI devices complying with the following versions:
● ONFI v1.0
● ONFI v2.0
● ONFI v2.1
● ONFI v2.2
● ONFI v2.3
Additionally, the following devices are supported.
Table 45. Supported NAND devices
Device part        Vendor    Density    Bus width   Page size
NAND02GW3B2CN6     ST        2 Gbit     x8          2048 bytes + 64 bytes
NAND02GW3A         Numonyx   2 Gbit     x8          2048 bytes + 64 bytes
NAND08GW3B2CN6     Numonyx   8 Gbit     x8          2048 + 64 spare bytes
NAND512W3A2C2A6    ST        512 Mbit   x8          512 + 16 spare bytes
NAND01GW4B2AN6     ST        1 Gbit     x16         1024 words + 32 spare
NAND01GW3B2BN6     ST        1 Gbit     x8          2048 + 64 spare bytes
NAND04GW3B2BN6     ST        4 Gbit     x8          2048 + 64 spare bytes
NAND128W3A28N6     ST        128 Mbit   x8          512 + 16 spare bytes
NAND256W3A2BN6     ST        256 Mbit   x8          512 + 16 spare bytes
K9K8G08V0A         SAMSUNG   8 Gbit     x8          512 + 16 spare bytes
K9F4G08V0A         SAMSUNG   4 Gbit     x8          512 + 16 spare bytes
K9F2G08V0A         SAMSUNG   2 Gbit     x8          512 + 16 spare bytes
K9F1208V0A         SAMSUNG   64 Mbit    x8
K9F8G08V0M         SAMSUNG   8 Gbit     x8
K9F1G16U0M         SAMSUNG   1 Gbit     x16         1024 words + 32 spare
KM29U256           SAMSUNG   256 Mbit   x8          512 + 16 spare bytes
NAND01GR3B         ST        1 Gbit     x8          2048 + 64 spare bytes
SD/MMC
● All SD cards complying with v2.0 and v1.0 are supported.
● All MMC cards are supported.
USB
High/Full-speed USB v2.0 Host is supported.
7.4.4
BootROM table
BootROM defines a table of re-entrant BootROM routines that can be used by the next
bootloading levels as a library.
The table format is the following one, represented as a C structure:
#define TABLE_VERSION_2_0 2
#define TABLE_VERSION_2_1 3
const __attribute__ ((section(".table"))) struct table_s spear_table = {
    .table_version  = TABLE_VERSION_2_1,    /* offset 0x00 */
    .get_boot_type  = getboottype,          /* offset 0x04 */
    .get_soc_type   = getsoctype,           /* offset 0x08 */
    .nand_info      = &nand_info[0],        /* offset 0x0C */
    .nand_read      = nand_read_skip_bad,   /* offset 0x10 */
    .get_version    = getversion,           /* offset 0x14 */
    .get_otpbits    = get_otpbits,          /* offset 0x18 */
    .hamming_encode = hamming_encode,       /* offset 0x1C */
    .hamming_fix    = hamming_fix           /* offset 0x20 */
};
Apart from table_version, the table fields are pointers to functions or structures. They can
be divided into a few groups according to their functionality.
Generic info
The purpose of this table section is to get BootROM or SoC generic information.
/* BootROM table version */
u32 table_version;

/* This routine returns a string containing the BootROM version
 *
 * Format:
 * BOOTROM_VERSION = $(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION)
 */
u8 * (*get_version)(void);

/* This routine returns the SOC type we are running onto.
 * It returns a pointer to following structure:
 * struct soc_type_s {
 *     u8 soc;
 *     u8 revision;
 * } ;
 */
struct soc_type_s * (*get_soc_type)(void);

/* This routine returns the boot type selected with the strapping
 * option (four bits)
 */
u8 (*get_boot_type)(void);
NAND Flash info and access
The code needed to access the NAND Flash is usually significant in size. To reduce the
footprint of the second-stage bootloader, it is suggested to use this table section.
/* To read the NAND, the nand_read() routine can be used. The
 * first argument of nand_read(), the nand_info geometry, can be found
 * in the table itself.
 */
nand_info_t *nand_info;
int (*nand_read)(nand_info_t *nand, size_t offset, size_t *length, u_char *buffer);
OTP section access
This section describes the routines exported by BootROM to access the OTP section. The
OTP bits must be accessed and any possible error (due to the nature of OTP technology)
must be corrected.
/* Following are the routines exported by BootROM to access the
 * OTP section.
 */
int (*get_otpbits)(u32 bit_off, u32 bit_cnt, u32 *buffer);
u32 (*hamming_encode)(u32 parity, void *data, unsigned int d, unsigned int nr);
void (*hamming_fix)(void *data, unsigned int d, unsigned int nr, unsigned int fix);
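Putting the member declarations together in the order implied by the offset comments gives a structure that a second-stage bootloader could use to call into the BootROM. The sketch below is an assumption-laden illustration: the table address is not given in this section, nand_info_t is kept opaque, u_char is written as unsigned char, and query_boot_type() and mock_boot_type() are hypothetical helpers (the mock only stands in for the BootROM routine when the code is exercised on a host).

```c
#include <stddef.h>
#include <stdint.h>

typedef uint8_t  u8;
typedef uint32_t u32;
typedef struct nand_info nand_info_t;   /* opaque: geometry layout not given here */

struct soc_type_s {
    u8 soc;
    u8 revision;
};

/* Member order follows the offset comments (valid for 32-bit pointers). */
struct table_s {
    u32 table_version;                                                       /* 0x00 */
    u8 (*get_boot_type)(void);                                               /* 0x04 */
    struct soc_type_s *(*get_soc_type)(void);                                /* 0x08 */
    nand_info_t *nand_info;                                                  /* 0x0C */
    int (*nand_read)(nand_info_t *nand, size_t offset, size_t *length,
                     unsigned char *buffer);                                 /* 0x10 */
    u8 *(*get_version)(void);                                                /* 0x14 */
    int (*get_otpbits)(u32 bit_off, u32 bit_cnt, u32 *buffer);               /* 0x18 */
    u32 (*hamming_encode)(u32 parity, void *data, unsigned int d,
                          unsigned int nr);                                  /* 0x1C */
    void (*hamming_fix)(void *data, unsigned int d, unsigned int nr,
                        unsigned int fix);                                   /* 0x20 */
};

/* Hypothetical helper: returns the boot type, or -1 if the table is
 * missing, older than min_version, or lacks the routine. */
int query_boot_type(const struct table_s *t, u32 min_version)
{
    if (t == NULL || t->table_version < min_version || t->get_boot_type == NULL)
        return -1;
    return t->get_boot_type();
}

/* Host-side mock standing in for the BootROM routine (illustration only). */
static u8 mock_boot_type(void)
{
    return 0x5;
}
```

Checking table_version before using a member lets a bootloader refuse tables older than the version it was written against.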
7.4.5
Terminology
Table 46. Useful terms
Term
Description
Authenticate
Prove the integrity or identity of an operator or an object.
Authorize
Grant an authenticated entity access to a service or an object.
Authorization session
Security protocol that enables the GPE to authenticate service
requests from an authorized operator; this is a [FIPS140] requirement
enforced by a GPE.
Blob
Binary large object; opaque data which is sealed.
Client
Consumer of secure ROM services
Credential
Authentication value which provides proof of knowledge (password) or
proof of ownership (biometrics or smartcard); analogous to [TPM] Auth
and [FIPS140] authentication data.
Cryptographic boundary
from [FIPS140] - an explicitly defined continuous perimeter that
establishes the physical bounds of a cryptographic module and
contains all the hardware and software components of a cryptographic
module.
Cryptographic module
From [FIPS140] - the set of hardware and/or software that implements
NIST Approved security functions (including cryptographic algorithms
and key generation) and is contained within the cryptographic
boundary.
Cryptographic service
From [FIPS140] - an available GPE security command
Endpoint
An operator capable of cryptographically exchanging information with
the device; the exchange may provide authentication, confidentiality,
and integrity.
HAL
Hardware Abstraction Layer
Identity
Derived from [FIPS140] - an operator which is uniquely and
individually authenticated by the cryptographic module; an identity is
associated with a role or roles for authorization; identity-based
authentication is a [FIPS140-2] security level 3 requirement.
IV
Initialization vector, an input parameter to the AES
encryption/decryption service.
KDF
Key derivation function, a cryptographic hash function which derives
one or more secret keys from secret values and/or other known
information.
Mechanism
A set of primitives used to implement any of multiple policies; in the
context of security, often stated as 'protection mechanism'.
Operator
From [FIPS140] - a consumer of cryptographic services external to the
GPE, which may be human or automation.
Permanent state
Lifecycle state variables that are in shielded locations and survive
power cycles.
Policy
A particular organizational strategy which is implemented with
mechanisms; in the context of security, a 'security policy' protects
information resources using 'protection mechanisms'.
Role
Derived from [FIPS140] - a class of authenticated operators whose
members are authorized to invoke specific cryptographic services and
are not authorized to invoke others.
Root keys (ROOT_AES, ROOT_HMAC)
Private AES256 and HMAC256 keys used by the secure ROM to seal
blobs for external storage; they are unique for each device.
Seal
A secure ROM activity which allows SPEAr secrets to be stored
outside of the SPEAr; it produces a signed (SPEAR_HMAC) and
encrypted (SPEAR_AES) blob for external storage.
TPM
Trusted platform module
User
A class of authenticated operators which consume cryptographic
services; equivalent to the [FIPS140] User role.
Zeroize
Invalidate a critical security parameter in a shielded location.
7.4.6
References
● [AES Key Wrap] National Institute of Standards and Technology (NIST), AES Key Wrap
Specification, 2001 November
● [ANS X9.31 1998] American National Standard for Financial Services, Digital Signature
Using Reversible Public Key Cryptography for the Financial Services Industry (rDSA)
● [ANS X9.62 2005] American National Standard for Financial Services, Public Key
Cryptography for the Financial Service Industry, ECDSA
● [FIPS140] see [FIPS140-2] and [FIPS140-3]
● [FIPS140-2] National Institute of Standards and Technology (NIST), Security
Requirements for Cryptographic Modules, FIPS Pub 140-2, 2001 May
● [FIPS140-3] National Institute of Standards and Technology (NIST), Security
Requirements for Cryptographic Modules, FIPS Pub 140-3 Draft, 2007
● [FIPS 186] National Institute of Standards and Technology (NIST), Digital Signature
Standard, FIPS Pub 186, 2006 March
● [NIST SP800-56] National Institute of Standards and Technology (NIST),
Recommendation for Pair-Wise Key Establishment Scheme
● [NIST SP800-57] National Institute of Standards and Technology (NIST),
Recommendation for Key Management - Part 1: General, NIST Special Publication
800-57, 2007 March
● [NIST SP800-90] National Institute of Standards and Technology (NIST),
Recommendation for Random Number Generation Using Deterministic Random Bit
Generators
● [NIST 931 RNG Ext] National Institute of Standards and Technology (NIST), Random
Number Generator Based on ANSI X9.31 Appendix A.2.4 Using TDES and AES
● [PKCS#1 v2.1] RSA Laboratories, RSA Cryptography Standard
8
Static RAMs (SRAM)
This chapter focuses on SRAM functionality and operation.
For the SRAM feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
8.1
Overview
The SPEAr1340 device integrates 2 instances of static RAM blocks, identified as SYSRAM0
(32 KB) and SYSRAM1 (4 KB).
SYSRAM0 is a single port static RAM with 32 KB size. When all power islands are switched
off, SYSRAM0 loses its data contents.
SYSRAM1 is a single port static RAM with 4 KB size. When all power islands are switched
off, SYSRAM1 maintains its data contents.
A part of these memory areas is used during the bootstrap phase by BootROM firmware.
After booting, all SRAM areas are fully available for general purpose applications.
For the address space location of the two SRAMs, refer to the companion document:
RM0089, Reference manual, SPEAr1340 address map and registers.
9
One-time programmable antifuse (OTP)
This chapter focuses on OTP functionality and operation.
For the OTP feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
9.1
Overview
The OTP block is an array of one-time programmable antifuse memory cells.
Because all OTP banks have an embedded charge pump that provides the high voltage
required for antifuse programming sessions, no additional high voltage pad is required at the
chip interface.
Because OTP is software programmable, no dedicated programming interface is required at
the chip level.
9.2
Pins
OTP has no external pins.
9.3
Clocks
The OTP block receives a single clock: PCLK, the APB clock, nominally running at
83.5 MHz. Write operations to OTP banks must be performed with the system running in SLOW
mode (OSCI1 clock), so that PCLK runs at 2 MHz.
9.4
Functional description
The OTP block has three main functionalities:
● Write
● Read
DATA values can be read back from MISC (OTP outputs are mirrored to MISC
registers) together with a valid bit. DATA is refreshed after each reset. The exact
availability time is cell-dependent; the valid bit indicates whether data is already
available.
● Masking
Masking can inhibit write operations to some words. Use masking to prevent the
contents of specific 32-bit words in Bank 1 (or 2) from being altered.
See also: Section 9.5: Programming.
9.4.1
OTP banks
OTP embeds three 255-bit banks, with the following features:
● BANK 1: 255-bit data bank with write-protect mechanism
● BANK 2: 255-bit data bank with write-protect mechanism
● BANK M: 255-bit bank, logically partitioned as described in next section.
OTP banks bit mapping and usage
BANK 1/BANK 2
In BANK 1 and BANK2, there is one predefined bit (255) used for test purposes. The rest of
the bits are available for the user.
Table 47. BANK 1/2 bit mapping
Bits      Size       Field   Description
255       1 bit      XXXX    1 bit reserved for blowing at final test
254...0   255 bits   Data    255 bits available for data writing
BANK M
BANK M contains one predefined bit (255) used for test purposes, and dedicated bits
controlled by the BootROM. For a detailed description of these bits, refer to Chapter 7:
BootROM.
Table 48. BANK M bit mapping
Bits        Size       Field
255         1 bit      XXXX
254...251   4 bits     Reserved
250         1 bit      J1
249         1 bit      J0
248         1 bit      T1
247         1 bit      T0
246         1 bit      E1
245         1 bit      E0
244...32    213 bits   Reserved
31...16     16 bits    WP bits B2
15...0      16 bits    WP bits B1
Field descriptions:
XXXX: 1 bit reserved for blowing at final test
Reserved (254...251): BootROM controlled (see Table 33: OTP Bank M configuration in
Chapter 7: BootROM)
J1 | J0: 1 bit + 1 redundancy for JTAG disable (both bits should be programmed at "1" in
order to permanently disable the JTAG interface)
T1 | T0: 1 bit + 1 redundancy for TEST disable (both bits should be programmed at "1" in
order to permanently disable the TEST interface)
E1 | E0: Reserved
Reserved (244...32): BootROM controlled (see Table 33: OTP Bank M configuration in
Chapter 7: BootROM)
WP bits B2: 8 bits + 8 redundancy for masking bank 2 (each couple of bits (0-1; 2-3; ...
14-15) should be programmed at "11" in order to inhibit write operations to the
corresponding 32-bit word of BANK 2)
WP bits B1: 8 bits + 8 redundancy for masking bank 1 (each couple of bits (0-1; 2-3; ...
14-15) should be programmed at "11" in order to inhibit write operations to the
corresponding 32-bit word of BANK 1)
9.5
Programming
9.5.1
Writing
1. Check that all previous write operations have finished: read the appropriate MISC
register.
2. Program data bits to the appropriate MISC registers.
3. Start the OTP write: program the appropriate write bit to the MISC register.
Note:
1 The three banks must not be programmed in parallel.
2 It is strongly advised to apply either a redundancy (OR function between 2 bits) or an
ECC scheme to the data being written, because of the typical reliability of the fuse
burning process.
Changing a programmed value
Under normal conditions (after a standard write with no masking applied), after data has
been programmed (step 3, above):
● Overwriting a 0 with a 1 is possible, as shown in the following example.
● Overwriting a 1 with a 0 has no effect.
● Overwriting a 1 with a 1 can damage an antifuse.
Example: changing 0001 to 0101
1. Write 0001 (as described in Section 9.5.1: Writing above)
2. Write 0100
Result: 0101
Note: To avoid antifuse damage, the second write must not be 0101.
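The overwrite rules can be modelled with three bitwise helpers, sketched below. Given the value already programmed and the value wanted, otp_safe_write_mask() returns only the bits that still need to be burnt, so that no bit already at 1 is written again. The function names are illustrative; they operate on plain integers and do not access the MISC registers.

```c
#include <stdint.h>

/* Bits to burn: requested 1s that are not already programmed. */
uint32_t otp_safe_write_mask(uint32_t current, uint32_t desired)
{
    return desired & ~current;
}

/* Model of the array contents after a write: bits only go from 0 to 1. */
uint32_t otp_after_write(uint32_t current, uint32_t written)
{
    return current | written;
}

/* Returns 1 if desired can be reached, 0 if it would require
 * turning a programmed 1 back into a 0. */
int otp_is_reachable(uint32_t current, uint32_t desired)
{
    return (current & ~desired) == 0u;
}
```

For the example in this section, otp_safe_write_mask(0x1, 0x5) gives 0x4, the value used for the second write.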
9.5.2
Masking
Writing a 1 in one of the first 32 bits of BANK M inhibits all write operations to the
corresponding 32-bit word of BANK 1 (or 2).
The first couple of bits of the MASK bank inhibits writing to the first 32-bit word of Bank 1;
the second couple of bits of the MASK bank inhibits writing to the second 32-bit word of
Bank 1, and so on, then moving to Bank 2.
Each mask bit has a redundant copy (OR function between the two).
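The mapping between mask bits and protected words can be sketched as follows. The function takes the first 32 bits of BANK M (bits 15...0 for Bank 1, bits 31...16 for Bank 2) and reports whether a given 32-bit word is write-protected, treating a word as protected when either bit of its couple is set (the OR function). The function name and the way the 32 bits are obtained are assumptions made for this example.

```c
#include <stdint.h>

/* bank_m_low: bits 31...0 of BANK M.
 * bank: 1 or 2; word: index 0..7 of the 32-bit word in that bank.
 * Returns 1 if the word is write-protected, 0 if not, -1 on bad arguments. */
int otp_word_write_protected(uint32_t bank_m_low, unsigned bank, unsigned word)
{
    unsigned shift;

    if ((bank != 1u && bank != 2u) || word > 7u)
        return -1;
    shift = (bank == 2u ? 16u : 0u) + 2u * word;
    /* OR of the mask bit and its redundant copy */
    return ((bank_m_low >> shift) & 0x3u) != 0u;
}
```

Programming both bits of a couple, as recommended in Table 48, keeps the protection effective even if one of the two antifuses fails to program.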
10
General purpose timers (GPT)
This chapter focuses on GPT functionality and operation.
For the GPT feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
10.1
Overview
The SPEAr1340 device integrates 4 instances of a general purpose timer digital block,
identified as GPT0, GPT1, GPT2 and GPT3. Each instance is a dual timer, for a total of 8
independent timers.
Figure 42. GPT block diagram
[Block diagram. Blocks shown: two MUTIMER blocks, each with a TOGGLE_FF, a DECODER and
WRAP_APB. Signals shown: TIMER_CLK, CLK, RESETn, TIMER_DEBUG, MT_CAPT1, MT_CAPT2, MT_INT1,
MT_INT2, MT_INT1_CLK, MT_INT2_CLK, and the APB signals PSELgpt, PADDR, PENABLE, PWRITE,
PWDATA and PRDATA.]
10.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
10.3
Clocks
Refer to Chapter 5: Reset and clock generator (RCG).
10.4
Interrupts
Refer to Appendix A: Interrupts.
10.5
Functional description
General purpose timers can be used for precise timing measurements, and for measuring the
frequency of input signals. Each GPT is essentially a counter that increments at a rate set
by the input clock and the timer prescaler. An application can monitor the counter to
determine elapsed time. GPT supports both timer and capture modes.
The timer clock is generated by a programmable 4-bit prescaler unit that performs a clock
division by 1, 2, 4, 8, 16, 32, 64, 128, and 256.
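The relation between the prescaler setting and elapsed time can be sketched as follows. The example assumes that prescaler code n selects a division by 2^n for n = 0 to 8, which matches the list of division factors but is not stated explicitly here; the register encoding should be checked in RM0089. The function names and the clock frequency argument are illustrative.

```c
#include <stdint.h>

/* Division factor for a prescaler code.
 * Assumption: code n divides by 2^n, for n = 0..8. Returns 0 otherwise. */
uint32_t gpt_prescale_divisor(unsigned code)
{
    return code <= 8u ? (1u << code) : 0u;
}

/* Time in nanoseconds taken by 'ticks' counter increments.
 * clk_hz is the timer input clock before the prescaler.
 * Returns 0 on invalid arguments. The intermediate product can
 * overflow for very large tick counts. */
uint64_t gpt_ticks_to_ns(uint64_t ticks, uint32_t clk_hz, unsigned code)
{
    uint32_t div = gpt_prescale_divisor(code);

    if (div == 0u || clk_hz == 0u)
        return 0;
    return ticks * div * 1000000000ull / clk_hz;
}
```

With a 1 MHz input clock and a division by 256, for instance, each counter increment corresponds to 256 microseconds.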
The following modes of operation are available:
● Auto-reload mode
When the timer is enabled, the counter is cleared and starts incrementing. When it
reaches the compare register value, an interrupt source is activated, the counter is
automatically cleared and restarts incrementing. The process is repeated until the timer
is disabled.
● Single-shot mode
When the timer is enabled, the counter is cleared and starts incrementing. When it
reaches the compare register value, an interrupt source is activated, the counter is
stopped and the timer is disabled.
● Capture function
This function is provided for the measurement of input timing signals. After
initialization, when a rising transition occurs at the MT_CAPTx input, the current counter
value is stored into the rising edge capture register (TIMER_REDG_CAPTx).
In the same way, when a falling transition occurs at the MT_CAPTx input, the current
counter value is stored into the falling edge capture register (TIMER_FEDG_CAPTx).
You can read the values stored in the two capture registers and compute the duration of
the rising to falling edge (or vice versa) time interval.
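The interval computation for the capture function can be sketched as a modular subtraction of the two captured values. The counter width is passed as a parameter because it is not stated in this chapter. The sketch also assumes that the counter wraps at its full width and wraps at most once between the two edges; if the counter is cleared at a compare value instead, the modulus is different. The function name is illustrative.

```c
#include <stdint.h>

/* Number of counter increments between two captured values.
 * first, second: values read from the capture registers, in order of occurrence.
 * counter_bits: counter width in bits (assumption: supplied by the caller). */
uint32_t gpt_capture_interval(uint32_t first, uint32_t second, unsigned counter_bits)
{
    uint32_t mask = (counter_bits >= 32u) ? 0xFFFFFFFFu
                                          : ((1u << counter_bits) - 1u);
    return (second - first) & mask;
}
```

Multiplying the result by the period of the prescaled timer clock gives the pulse width in time units.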
11
Real-time clock (RTC)
This chapter focuses on RTC functionality and operation.
For the RTC feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
11.1
Overview
The RTC is a block that keeps track of the real time of day. It also functions as an alarm and
a calendar. The time is displayed in 24-hour format, and time/calendar values are stored in
binary-coded decimal format.
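Because the time and calendar values are stored in binary-coded decimal format, software usually converts them to and from plain binary. A minimal sketch for two-digit fields (values 0 to 99) is shown below; the function names are illustrative and the layout of the individual RTC register fields is described in RM0089.

```c
#include <stdint.h>

/* Converts a two-digit BCD value (for example 0x59) to binary (59). */
unsigned bcd_to_bin(uint8_t bcd)
{
    return (unsigned)(bcd >> 4) * 10u + (bcd & 0x0Fu);
}

/* Converts a binary value in the range 0..99 to two-digit BCD. */
uint8_t bin_to_bcd(unsigned value)
{
    return (uint8_t)(((value / 10u) << 4) | (value % 10u));
}
```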
The time of day, alarm and calendar, status and control registers can all be accessed via a
standard 32-bit APB bus. All read/write operations last 2 cycles.
RTC provides a self-isolation mode that is activated during power down. This feature allows
RTC to continue working if power is not supplied to the rest of the circuit. It is realized
by supplying separate power and clock connections.
A set of 16 general purpose registers (GP-Reg) is provided, which can be used to save
data during the power down state. The GP-Reg-set runs on the 32 kHz oscillator clock and is
powered by the RTC battery. Each register is 32 bits wide and is address-mapped on the
32-bit APB bus. A bit in the status register reflects the status of any pending write to the
GP-Reg-set. Write operations to the GP-Reg-set must therefore be sequential: wait for this
pending status bit to be cleared before writing again to the GP-Reg-set.
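The sequential write rule for the GP-Reg-set can be sketched as a bounded polling loop. The register addresses and the position of the pending bit are not given in this chapter, so the sketch takes them as arguments; the function name and the polling limit are assumptions made for this example.

```c
#include <stdint.h>

/* Writes value to a GP register once no earlier write is pending.
 * status: pointer to the RTC status register
 * pending_mask: mask of the "write pending" status bit
 * gp_reg: pointer to the target GP register
 * max_polls: number of retries before giving up
 * Returns 0 on success, -1 if the pending bit never cleared. */
int rtc_gpreg_write(volatile const uint32_t *status, uint32_t pending_mask,
                    volatile uint32_t *gp_reg, uint32_t value, unsigned max_polls)
{
    while ((*status & pending_mask) != 0u) {
        if (max_polls-- == 0u)
            return -1;
    }
    *gp_reg = value;
    return 0;
}
```

Since the GP-Reg-set runs on the 32 kHz clock, a pending write may take many APB clock cycles to complete, which is why a retry limit is used instead of an unbounded loop.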
11.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
11.3
Clocks
Refer to Chapter 5: Reset and clock generator (RCG).
11.4
Interrupts
Refer to Appendix A: Interrupts.
11.5
Functional description
The RTC block is composed of two sub-blocks: the timer (RTC_32K) and the APB interface
(RTC_48M). The timer block is powered by a separate external battery and is clocked
by a 32768 Hz clock. It provides two main functions:
● Time and calendar update
● Power monitoring and self-isolation
The APB interface is powered by the main chip power supply and is clocked by an 83 MHz
clock. It provides the following functions:
● Synchronization between 32 kHz and 48 MHz domains
● Timer registers read and write
● Alarm programming
● Interrupt generation
● Isolation monitoring
12
Direct memory access controllers (DMAC)
This chapter focuses on DMAC functionality and operation.
For the DMAC feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers related to the DMAC, refer to the
system configuration registers (MISC) in the following companion document:
● RM0089, Reference manual, SPEAr1340 address map and registers
12.1
Overview
The SPEAr1340 device integrates 2 instances of a DMA controller digital block, identified as
DMAC0 and DMAC1.
The DMAC is an AHB-central DMA controller core that transfers data from a source
peripheral to a destination peripheral over two AHB buses. A wrapper is designed to
instantiate 2 DMAC cores (each with 2 AHB master interfaces), 2 ICMs (which arbitrate the
same master interface of each DMAC) and a MUX (which manages multiple peripheral
handshaking interfaces).
Figure 43. DMAC block diagram
[Block diagram. Blocks shown inside the DMAC: channels (up to Channel N) with FIFO, DMA
hardware request interface, arbiter, master interface and AHB slave interface.]
12.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
12.3
Clocks
The DMAC clock is HCLK, the AHB clock.
See also: Chapter 5: Reset and clock generator (RCG).
12.4
Interrupts
Each DMAC can generate 5 different types of interrupts to ARM (INT_FLAG):
● Error interrupt (IntErr): generated when an ERROR response is received from an
AHB slave on the HRESP bus during a transfer
● Destination transaction complete interrupt (IntDstTran): generated after completion
of the last AHB transfer of the requested transaction from the handshaking interface on
the destination side
● Source transaction complete interrupt (IntSrcTran): generated after completion of
the last AHB transfer of the requested transaction from the handshaking interface on
the source side
● Block complete interrupt (IntBlock): generated on DMA block transfer completion to
the destination peripheral
● Transfer complete interrupt (IntTfr): generated on DMA transfer completion to the
destination peripheral
Also, the bitwise OR of all bits of the INT_FLAG bus is driven on the INT_COMBINED
output.
See also: Appendix A: Interrupts.
12.5
Functional description
12.5.1
DMAC wrapper
SPEAr1340 provides a DMAC wrapper with 56 DMA lines. These lines are connected to 32
hardware handshaking interfaces allowed by the 2 DMAC cores. Each core is configured
with 16 handshaking interfaces. The DMAC wrapper uses two interconnection modules
(ICMs) to arbitrate the same master interface of each DMAC core.
Figure 44 shows how the 2 ICMs are connected to the master interfaces of both DMACs.
Figure 44. DMAC wrapper
[Block diagram. The DMAC wrapper contains DMAC0 and DMAC1, whose master interfaces are
connected to ICM0 and ICM1.]
12.5.2
DMAC multiplexing
The DMAC wrapper uses a multiplexer (MUX) to:
● select the peripheral
● select the DMAC core
● manage a peripheral with a handshaking interface
The DMAC multiplexing consists of the following steps.
Step 1: selecting the peripheral
Each DMAC is configured with 16 handshaking lines. The first 12 can be selected by
configuring the miscellaneous registers. The last 4 handshaking lines for each DMAC are
always mapped in a fixed way.
Figure 45. DMAC handshaking lines allocation
[Diagram. Peripheral request lines HS0 to HS27 (labelled examples: ADC_TX on HS0, I2S_RX on
HS11, CAM1_EVEN on HS16, CAM1_ODD on HS17) are allocated to the handshaking interfaces
HS0' to HS15' of DMAC0 and HS0" to HS15" of DMAC1.]
To select the peripheral, you must configure the miscellaneous register DMAC_HS_SEL
according to the allocated handshaking interface (HS), as shown in Table 49.
Table 49. DMAC MUX - selecting the peripheral
Handshaking    DMAC_HS_SEL   Peripheral when   Peripheral when
interface #    bit           bit = 0           bit = 1
0              0             ADC_TX            Reserved
1              1             Reserved          Reserved
2              2             SPDIF_TX          Reserved
3              3             SPDIF_RX          Reserved
4              4             SSP_TX            Reserved
5              5             SSP_RX            Reserved
6              6             UART0_TX          Reserved
7              7             UART0_TX          Reserved
8              8             I2C0_TX           Reserved
9              9             I2C0_TX           Reserved
10             10            I2S_TX            Reserved
11             11            I2S_RX            Reserved
12             -             UART1_TX
13             -             UART1_TX
14             -             I2C1_TX
15             -             I2C1_TX
16             12            CAM1_EVEN         Reserved
17             13            CAM1_ODD          Reserved
18             14            CAM2_EVEN         Reserved
19             15            CAM2_ODD          Reserved
20             16            CAM3_EVEN         Reserved
21             17            CAM3_ODD          Reserved
22             18            CAM4_EVEN         Reserved
23             19            CAM4_ODD          Reserved
24             20            Reserved          Reserved
25             21            Reserved          Reserved
26             22            Reserved          Reserved
27             23            Reserved          Reserved
28             -             Reserved
29             -             Reserved
30             -             Reserved
31             -             Reserved
Example:
CAM1_EVEN corresponds to line HS#16.
According to Table 49, HS#16 corresponds to bit 12.
Therefore, to select CAM1_EVEN, set DMAC_HS_SEL[12] to 0.
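The correspondence between handshaking interface numbers and DMAC_HS_SEL bits in Table 49 follows a simple pattern, sketched below: lines 0 to 11 use the bit of the same number, lines 16 to 27 use bits 12 to 23, and the remaining lines have no selection bit. The function name is illustrative.

```c
/* Returns the DMAC_HS_SEL bit index for handshaking line hs,
 * or -1 if the line has no selection bit in Table 49. */
int dmac_hs_sel_bit(unsigned hs)
{
    if (hs <= 11u)
        return (int)hs;
    if (hs >= 16u && hs <= 27u)
        return (int)(hs - 4u);
    return -1;
}
```

For CAM1_EVEN on HS#16, the function returns 12, which is the bit used in the example.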
Step 2: selecting the flow controller and the data direction
The DMAC controller is compatible with the ARM DMA controller. To select the flow
controller and the data flow direction, configure the miscellaneous registers
DMAC_FLOW_SEL and DMAC_DIR_SEL as follows:
– To select if the flow controller is DMAC or the peripheral, configure the
DMAC_FLOW_SEL[HS#] register.
– To select the data direction (from or to the peripheral), configure the
DMAC_DIR_SEL[HS#] register.
Table 51. DMAC MUX - selecting the flow controller and data direction
DMAC_FLOW_SEL[i]   DMAC_DIR_SEL[i]   Flow controller   Data direction
0                  0                 DMAC              From the peripheral
0                  1                 DMAC              To the peripheral
1                  x                 Peripheral        Not needed
Step 3: selecting the DMAC core
To select which of the two DMAC cores the peripheral requests must be sent to, you must
configure the miscellaneous register DMAC_SEL as shown in Table 52 below.
Table 52. DMAC MUX - selecting the DMAC core
DMAC_SEL[i]   DMAC core involved
0             DMAC core 0
1             DMAC core 1
Step 4: Assigning a handshaking interface on a DMAC channel
To route a handshaking interface on a DMAC channel, you must configure the
corresponding CFGx register (where x is the channel).
– To assign a HS interface as the source of a channel, set the SRC_PER bits of the
corresponding CFGx register.
– To define a HS interface as the destination of a channel, set the DST_PER bits of the
corresponding CFGx register.
Since both the SRC_PER and DST_PER fields are 4 bits wide ([42:39] and [46:43]
respectively), you can write any value from 0 to 15.
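The two field updates can be sketched as read-modify-write operations on a 64-bit image of the CFGx register, using the bit positions given in this step. The macro and function names are illustrative, and how the 64-bit value is read from and written back to the hardware (for example as two 32-bit accesses) is outside the scope of this sketch.

```c
#include <stdint.h>

#define CFG_SRC_PER_SHIFT 39u     /* SRC_PER = CFGx[42:39] */
#define CFG_DST_PER_SHIFT 43u     /* DST_PER = CFGx[46:43] */
#define CFG_PER_MASK      0xFull  /* both fields are 4 bits wide */

/* Returns cfg with SRC_PER set to handshaking interface hs (0..15). */
uint64_t dmac_cfg_set_src_per(uint64_t cfg, unsigned hs)
{
    cfg &= ~(CFG_PER_MASK << CFG_SRC_PER_SHIFT);
    return cfg | (((uint64_t)hs & CFG_PER_MASK) << CFG_SRC_PER_SHIFT);
}

/* Returns cfg with DST_PER set to handshaking interface hs (0..15). */
uint64_t dmac_cfg_set_dst_per(uint64_t cfg, unsigned hs)
{
    cfg &= ~(CFG_PER_MASK << CFG_DST_PER_SHIFT);
    return cfg | (((uint64_t)hs & CFG_PER_MASK) << CFG_DST_PER_SHIFT);
}
```

Clearing the field before setting it ensures that a previous handshaking interface number does not leak into the new value.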
DMAC configuration example
Here is an example of how to route a peripheral to a DMAC, for instance the camera
interface to the DMAC0 core.
1. According to Table 49, CAM1_EVEN corresponds to line HS#16, which is controlled by
bit 12. To select CAM1_EVEN, set DMAC_HS_SEL[12] to 0.
2. According to the DMAC_SEL[0] register description:
hs0_16_map:
0 : hs0 on DMA0 , hs16 on DMA1
1 : hs0 on DMA1 , hs16 on DMA0
Therefore, CAM1_EVEN can be routed on the HS#0 of both DMACs:
– If DMAC_SEL[0] = 1, CAM1_EVEN is assigned to DMAC0 and ADC_TX to DMAC1.
– If DMAC_SEL[0] = 0, CAM1_EVEN is assigned to DMAC1 and ADC_TX to DMAC0.
3. Select if the peripheral is source or destination:
– If the CAM is source, set SRC_PER = CFGx[42:39] = 0x0.
– If the peripheral is destination, set DST_PER = CFGx[46:43] = 0x0.
The same procedure should be followed for the other peripherals.
Summarizing:
CAM1_EVEN   0x0
CAM1_ODD    0x1
CAM2_EVEN   0x2
CAM2_ODD    0x3
CAM3_EVEN   0x4
CAM3_ODD    0x5
CAM4_EVEN   0x6
CAM4_ODD    0x7
12.5.3
DMAC transfers
This section discusses how a single block transfer, made up of transactions, is performed.
The device that controls the length of a block is known as the flow controller. The DMAC, the
source peripheral or the destination peripheral must be assigned as the flow controller.
● If the block size is known prior to when the channel is enabled, then the DMAC should
be programmed as the flow controller.
● If the block size is unknown when the DMAC channel is enabled, either the source or
destination peripheral must be the flow controller.
Table 53 lists valid transfer types and flow controller combinations.
See also: Section 12.6.1: DMAC transfer types on page 166 for programming information.
Table 53. Transfer types and flow controller combinations
Transfer type | Flow controller
Memory to memory | DMAC
Memory to peripheral | DMAC
Memory to peripheral | Peripheral
Peripheral to memory | DMAC
Peripheral to memory | Peripheral
Peripheral to peripheral | DMAC
Peripheral to peripheral | Source peripheral
Peripheral to peripheral | Destination peripheral
Handshaking interfaces are used at the transaction level to control the flow of single or burst
transactions. The operation of the handshaking interface is different and depends on
whether the peripheral or the DMAC is the flow controller.
The peripheral uses the handshaking interface to indicate to the DMAC that it is ready to
transfer or accept data over the AHB bus. A non-memory peripheral can request a DMA
transfer through the DMAC using one of two types of handshaking interfaces:
● Hardware handshaking: accomplished using a dedicated handshaking interface
● Software handshaking: accomplished through memory-mapped registers
Software selects between the hardware or software handshaking interface on a per-channel
basis. The type of handshaking interface depends on whether the peripheral is a flow
controller or not. For a memory peripheral there is no handshaking interface with the DMAC,
and therefore the memory peripheral can never be a flow controller. Once the channel is
enabled, the transfer proceeds immediately without waiting for a transaction request.
Software handshaking
When the slave peripheral requires the DMAC to perform a DMA transaction, it
communicates this request by sending an interrupt to the CPU or interrupt controller. The
interrupt service routine then uses the software handshake registers to initiate and control a
DMA transaction. This group of software registers is used to implement the software
handshaking interface.
Handshaking interface – DMAC flow controller
When the peripheral is not the flow controller, the DMAC tries to efficiently transfer the data
using as little bus bandwidth as possible. Generally, the DMAC tries to transfer the data
using burst transactions and, where possible, fill or empty the channel FIFO in single bursts
– provided that the software has not limited the burst length. The DMAC can also lock the
arbitration for the master bus interface so that a channel is permanently granted the master
bus interface. Additionally, the DMAC can assert the AMBA HLOCK signal to lock the
system arbiter.
Single transaction region
There are cases where a DMA block transfer cannot be completed using only burst
transactions. Typically this occurs when the block size is not a multiple of the burst
transaction length. In these cases, the block transfer uses burst transactions up to the point
where the amount of data left to complete the block is less than the amount of data in a burst
transaction. At this point, the DMAC samples the “single” status flag and completes the
block transfer using single transactions. The peripheral asserts a single status flag to
indicate to the DMAC that there is enough data or space to complete a single transaction
from or to the source/destination peripheral. The single transaction region is the time
interval where the DMAC uses single transactions to complete the block transfer; burst
transactions are exclusively used outside this region.
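The split between burst transactions and the single transaction region can be illustrated with a small C sketch. It assumes that the block size and the burst transaction length are counted in the same unit, and that the remainder is completed with single transactions only (no early-terminated burst). The structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: how a block divides into transactions. */
struct block_split {
    uint32_t bursts;   /* full burst transactions before the single region */
    uint32_t singles;  /* single transactions used to complete the block */
};

/* Both arguments must use the same unit (for example, data items). */
struct block_split split_block(uint32_t block_size, uint32_t burst_len)
{
    struct block_split s;

    assert(burst_len > 0u);
    s.bursts  = block_size / burst_len;  /* whole bursts that fit in the block */
    s.singles = block_size % burst_len;  /* data left over after the last burst */
    return s;
}
```

For instance, a block of 18 items with a burst transaction length of 4 gives four bursts followed by two single transactions; a block of 16 items has no single transaction region.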
Early-terminated burst transaction
When a source or destination peripheral is in the single transaction region, a burst
transaction can still be requested. In that case, however, src_burst_size_bytes or
dst_burst_size_bytes is greater than the number of bytes left to complete in the
source/destination block transfer at the time that the burst transaction is triggered. The
burst transaction is therefore started and “early-terminated” at block completion without
transferring the programmed amount of data (that is, src_burst_size_bytes or
dst_burst_size_bytes); only the amount required to complete the block transfer is
transferred. An early-terminated burst transaction occurs between the
DMAC and the peripheral only when the peripheral is not the flow controller.
Handshaking interface – Peripheral flow controller
When the peripheral is the flow controller, it controls the length of the block and must
communicate to the DMAC when the block transfer is completed. The peripheral does this
by telling the DMAC that the current transaction – burst or single – is the last transaction in
the block. When the peripheral is the flow controller and the block size is not a multiple of the
source/destination burst transaction length, then the peripheral must use single transactions
to complete a block transfer.
When the peripheral is the flow controller, it indicates directly to DMAC which type of
transaction – single or burst – to perform. Where possible, the DMAC uses the maximum
possible burst length. It can also lock the arbitration for the master bus so that a channel is
permanently granted the master bus interface. The DMAC can also assert the HLOCK
signal to lock the system arbiter.
Setting up transfers
Transfers are set up by programming fields of the CTLx and CFGx registers for that channel.
A single block is made up of numerous transactions – single and burst – which are in turn
composed of AHB transfers. A peripheral requests a transaction through the handshaking
interface to the DMAC. The operation of the handshaking interface is different and depends
on what is acting as the flow controller.
12.5.4
Generating requests for the AHB master bus interface
Each channel has a source state machine and destination state machine running in parallel.
These state machines generate the request inputs to the arbiter, which arbitrates for the
master bus interface (one arbiter per master bus interface).
When the source/destination state machine is granted control of the master bus interface,
and when the master bus interface is granted control of the external AHB bus, then AHB
transfers between the peripheral and the DMAC (on behalf of the granted state machine)
can take place. AHB transfers from the source peripheral or to the destination peripheral
cannot proceed until the channel FIFO is ready. For burst transaction requests and for
transfers involving memory peripherals, the criterion for “FIFO readiness” is controlled by
the FIFO_MODE field of the CFGx register.
The definition of FIFO readiness is the same for:
● Single transactions
● Burst transactions, where CFGx.FIFO_MODE = 0
● Transfers involving memory peripherals, where CFGx.FIFO_MODE = 0
The channel FIFO is deemed ready when the space/data available is sufficient to complete
a single AHB transfer of the specified transfer width. FIFO readiness for source transfers
occurs when the channel FIFO contains enough room to accept at least a single transfer of
CTLx.SRC_TR_WIDTH width. FIFO readiness for destination transfers occurs when the
channel FIFO contains data to form at least a single transfer of CTLx.DST_TR_WIDTH
width.
When CFGx.FIFO_MODE = 1, the criterion for FIFO readiness for burst transaction
requests and transfers involving memory peripherals is as follows:
● A FIFO is ready for a source burst transfer when the FIFO is less than half empty;
● A FIFO is ready for a destination burst transfer when the FIFO is greater than or equal
to half full.
When the source/destination peripheral is not memory, the source/destination state
machine waits for a single/burst transaction request. Upon receipt of a transaction request
and only if the channel FIFO is “ready” for source/destination AHB transfers, a request for
the master bus interface is made by the source/destination state machine.
When the source/destination peripheral is memory, the source/destination state machine
must wait until the channel FIFO is “ready”. A request is then made for the master bus
interface. There is no handshaking mechanism employed between a memory peripheral
and the DMAC.
12.5.5
AHB master interface arbitration
Each DMAC channel has two request lines that request ownership of a particular master
bus interface: channel source and channel destination request lines.
Source and destination arbitrate separately for the bus. Once a source/destination state
machine gains ownership of the master bus interface and the master bus interface has
ownership of the AHB bus, then AHB transfers can proceed between the peripheral and
DMAC.
An arbitration scheme decides which of the request lines is granted the particular master
bus interface. Each channel has a programmable priority. A request for the master bus
interface can be made at any time, but is granted only after the current AHB transfer (burst
or single) has completed. Therefore, if the master interface is transferring data for a lower
priority channel and a higher priority channel requests service, then the master interface will
complete the current burst for the lower priority channel before switching to transfer data for
the higher priority channel.
12.5.6
Scatter/Gather
Scatter is relevant to a destination transfer. The destination address is incremented or
decremented by a programmed amount – the destination scatter interval (DSI) field of the
DSRx register – multiplied by the number of bytes in a single AHB transfer to the destination
when a scatter boundary is reached. The number of destination transfers between
successive scatter boundaries is programmed into the destination scatter count (DSC) field
of the DSRx register.
Scatter is enabled by writing a 1 to the CTLx.DST_SCATTER_EN field. The CTLx.DINC
field determines if the address is incremented, decremented, or remains fixed when a
scatter boundary is reached. If the CTLx.DINC field indicates a fixed-address control
throughout a DMA transfer, then the CTLx.DST_SCATTER_EN field is ignored, and the
scatter feature is automatically disabled.
Gather is relevant to a source transfer. The source address is incremented or decremented
by a programmed amount – the source gather interval (SGI) field of the SGRx register –
multiplied by the number of bytes in a single AHB transfer from the source when a gather
boundary is reached. The number of source transfers between successive gather
boundaries is programmed into the source gather count (SGC) field of the SGRx register.
Gather is enabled by writing a 1 to the CTLx.SRC_GATHER_EN field. The CTLx.SINC field
determines if the address is incremented, decremented, or remains fixed when a gather
boundary is reached. If the CTLx.SINC field indicates a fixed-address control throughout a
DMA transfer, then the CTLx.SRC_GATHER_EN field is ignored, and the gather feature is
automatically disabled.
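The gather address sequence can be sketched in C. This is a model under stated assumptions, not a description of the hardware implementation: the source address is incrementing (CTLx.SINC), each AHB transfer advances the address by one transfer width, and each gather boundary (every SGC transfers) adds a further SGI multiplied by the transfer width. The function name is hypothetical; scatter on the destination side would follow the same arithmetic with DSC and DSI.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Address of the n-th source AHB transfer (n counts from 0).
 *   sar         start source address
 *   width_bytes bytes in a single AHB transfer from the source
 *   sgc         source gather count: transfers between gather boundaries
 *   sgi         source gather interval, in units of width_bytes
 * Assumption: incrementing addressing, interval added on top of the
 * normal per-transfer increment.
 */
uint32_t gather_src_addr(uint32_t sar, uint32_t n, uint32_t width_bytes,
                         uint32_t sgc, uint32_t sgi)
{
    assert(sgc > 0u);
    uint32_t boundaries = n / sgc;   /* gather boundaries already crossed */
    return sar + n * width_bytes + boundaries * sgi * width_bytes;
}
```

With 16-bit transfers, a gather count of 2 and an illustrative interval of 4, the first four transfers read from offsets 0, 2, 12 and 14, and the next transfer would start at offset 24.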
12.5.7
Endianness
The endianness of the AHB slave interface is statically configured to little-endian for both the
DMACs.
Endianness of each AHB master interface for both DMACs can be dynamically configured
by programming a miscellaneous register.
Because two DMACs are instantiated in the wrapper and each DMAC has two master
interfaces, four pins are connected to MISC:
● DMA0_BIG_END_M1
● DMA0_BIG_END_M2
● DMA1_BIG_END_M1
● DMA1_BIG_END_M2
0 = Little-endian
1 = Big-endian
Default value: 0.
12.6
Programming
12.6.1
DMAC transfer types
A DMA transfer may consist of single or multi-block transfers. On successive blocks of a
multi-block transfer, the SARx/DARx register in the DMAC is reprogrammed using one of
the following methods:
● Block chaining using linked lists
● Auto-reloading
● Contiguous address between blocks
On successive blocks of a multi-block transfer, the CTLx register in the DMAC is
reprogrammed using either of the following methods:
● Block chaining using linked lists
● Auto-reloading
When block chaining using linked lists is the multi-block method of choice, the LLPx
register in the DMAC is reprogrammed on successive blocks using block chaining with
linked lists.
A block descriptor consists of six registers: SARx, DARx, LLPx, CTLx, SSTATx, and
DSTATx. The first four registers, along with the CFGx register, are used by the DMAC to set
up and describe the block transfer.
Note:
The terms Linked List Item (LLI) and block descriptor are synonymous.
Multi-block transfers
Multi-block transfers are enabled by setting the DMAH_CHx_MULTI_BLK_EN configuration
parameter to True.
Note:
Multi-block transfers—in which the source and destination are swapped during the
transfer—are not supported. In a multi-block transfer, the direction must not change for the
duration of the transfer.
Block chaining using linked lists
To enable multi-block transfers using block chaining, you must set the configuration
parameter DMAH_CHx_MULTI_BLK_EN to True and the DMAH_CHx_HC_LLP parameter
to False.
In this case, the DMAC reprograms the channel registers prior to the start of each block by
fetching the block descriptor for that block from system memory. This is known as an LLI
update.
DMAC block chaining uses a Linked List Pointer register (LLPx) that stores the address in
memory of the next linked list item. Each LLI contains the corresponding block descriptors:
1. SARx
2. DARx
3. LLPx
4. CTLx
5. SSTATx
6. DSTATx
To set up block chaining, you program a sequence of Linked Lists in memory.
LLI accesses are always 32-bit accesses (Hsize = 2) aligned to 32-bit boundaries and
cannot be changed or programmed to anything other than 32-bit, even if the AHB master
interface of the LLI supports more than a 32-bit data width.
The SARx, DARx, LLPx, and CTLx registers are fetched from system memory on an LLI
update. If configuration parameter DMAH_CHx_CTL_WB_EN = True, then the updated
contents of the CTLx, SSTATx, and DSTATx registers are written back to memory on block
completion. Figure 46 and Figure 47 show how you use chained linked lists in memory to
define multi-block transfers using block chaining.
Figure 46. Multiblock transfer using linked lists when DMAH_CHx_STAT_SRC set to true
[Figure: LLPx(0) points to LLI(0) in system memory, LLPx(1) to LLI(1), and LLPx(2) to the
following item. Each LLI holds SARx, DARx, LLPx, CTLx[31:0], CTLx[63:32], write-back for
SSTATx, and write-back for DSTATx.]
It is assumed that no allocation is made in system memory for the source status when the
configuration parameter DMAH_CHx_STAT_SRC is set to False. If this parameter is False,
then the order of a Linked List item is as follows:
1. SARx
2. DARx
3. LLPx
4. CTLx
5. DSTATx
Figure 47. Multiblock transfer using linked lists when DMAH_CHx_STAT_SRC set to false
[Figure: LLPx(0) points to LLI(0) in system memory, LLPx(1) to LLI(1), and LLPx(2) to the
following item. Each LLI holds SARx, DARx, LLPx, CTLx[31:0], CTLx[63:32], and write-back
for DSTATx.]
Note:
In order to not confuse the SARx, DARx, LLPx, CTLx, SSTATx, and DSTATx register locations
of the LLI with the corresponding DMAC memory mapped register locations, the LLI register
locations are prefixed with LLI; that is, LLI.SARx, LLI.DARx, LLI.LLPx, LLI.CTLx,
LLI.SSTATx, and LLI.DSTATx.
Figure 48 and Figure 49 show the mapping of a Linked List Item stored in memory to the
channel registers block descriptor.
Rows 6 through 10 of Table 54 show the required values of LLPx, CTLx, and CFGx for
multi-block DMA transfers using block chaining.
For rows 6 through 10 of Table 54, the LLI.CTLx, LLI.LLPx, LLI.SARx, and LLI.DARx
register locations of the LLI are always affected at the start of every block transfer. The
LLI.LLPx and LLI.CTLx locations are always used to reprogram the DMAC LLPx and CTLx
registers. However, depending on the Table 54 row number, the LLI.SARx/LLI.DARx
address may or may not be used to reprogram the DMAC SARx/DARx registers.
Table 54. Programming of transfer types and channel register update method
Transfer type | LLP.LOC = 0 | LLP_SRC_EN (CTLx) | RELOAD_SRC (CFGx) | LLP_DST_EN (CTLx) | RELOAD_DST (CFGx) | CTLx, LLPx update method | SARx update method | DARx update method | Write back(1)
1. Single-block or last transfer of multi-block | Yes | 0 | 0 | 0 | 0 | None, user reprograms | None (single) | None (single) | No
2. Auto-reload multi-block transfer with contiguous SAR | Yes | 0 | 0 | 0 | 1 | CTLx, LLPx are reloaded from initial values | Contiguous | Auto-reload | No
3. Auto-reload multi-block transfer with contiguous DAR | Yes | 0 | 1 | 0 | 0 | CTLx, LLPx are reloaded from initial values | Auto-reload | Contiguous | No
4. Auto-reload multi-block transfer | Yes | 0 | 1 | 0 | 1 | CTLx, LLPx are reloaded from initial values | Auto-reload | Auto-reload | No
5. Single-block or last transfer of multi-block | No | 0 | 0 | 0 | 0 | None, user reprograms | None (single) | None (single) | Yes
6. Linked list multi-block transfer with contiguous SAR | No | 0 | 0 | 1 | 0 | CTLx, LLPx loaded from next Linked List item | Contiguous | Linked list | Yes
7. Linked list multi-block transfer with auto-reload SAR | No | 0 | 1 | 1 | 0 | CTLx, LLPx loaded from next Linked List item | Auto-reload | Linked list | Yes
8. Linked list multi-block transfer with contiguous DAR | No | 1 | 0 | 0 | 0 | CTLx, LLPx loaded from next Linked List item | Linked list | Contiguous | Yes
9. Linked list multi-block transfer with auto-reload DAR | No | 1 | 0 | 0 | 1 | CTLx, LLPx loaded from next Linked List item | Linked list | Auto-reload | Yes
10. Linked list multi-block transfer | No | 1 | 0 | 1 | 0 | CTLx, LLPx loaded from next Linked List item | Linked list | Linked list | Yes
1. This column assumes that the configuration parameter DMAH_CHx_CTL_WB_EN = True. If DMAH_CHx_CTL_WB_EN =
False, then there is never writeback of the control and status registers regardless of transfer type, and all rows of this
column are “No”.
Figure 48. Mapping of block descriptor (LLI) in memory to channel registers when
DMAH_CHx_STAT_SRC set to True
[Figure: the LLI is 32 bits wide (hsize = 32). The base address of the LLI is LLPx.LOC, and
each field sits at a fixed offset from it:]
LLI.SARx | {LLPx[31:2], 2'b00}
LLI.DARx | {LLPx[31:2], 2'b00} + 0x4
LLI.LLPx | {LLPx[31:2], 2'b00} + 0x8
LLI.CTLx[31:0] | {LLPx[31:2], 2'b00} + 0xc
LLI.CTLx[63:32] | {LLPx[31:2], 2'b00} + 0x10
LLI.SSTATx | {LLPx[31:2], 2'b00} + 0x14
LLI.DSTATx | {LLPx[31:2], 2'b00} + 0x18
Figure 49. Mapping of block descriptor (LLI) in memory to channel registers when
DMAH_CHx_STAT_SRC set to False
[Figure: the LLI is 32 bits wide (hsize = 32). The base address of the LLI is LLPx.LOC, and
each field sits at a fixed offset from it:]
LLI.SARx | {LLPx[31:2], 2'b00}
LLI.DARx | {LLPx[31:2], 2'b00} + 0x4
LLI.LLPx | {LLPx[31:2], 2'b00} + 0x8
LLI.CTLx[31:0] | {LLPx[31:2], 2'b00} + 0xc
LLI.CTLx[63:32] | {LLPx[31:2], 2'b00} + 0x10
LLI.DSTATx | {LLPx[31:2], 2'b00} + 0x14
Note:
Throughout this chapter, there are descriptions about fetching the LLI.CTLx register from
the location pointed to by the LLPx register. This exact location is the LLI base address
(stored in LLPx register) plus the fixed offset. For example, in Figure 48, the location of the
LLI.CTLx register is LLPx.LOC + 0xc.
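The fixed offsets of Figure 48 can be mirrored by a C structure, sketched here for the case where DMAH_CHx_STAT_SRC is True. The type, member, and function names are hypothetical; only the field order and the offsets come from the figure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One Linked List Item as laid out in memory (seven 32-bit words). */
struct dmac_lli {
    uint32_t sar;     /* LLI.SARx        at base + 0x00 */
    uint32_t dar;     /* LLI.DARx        at base + 0x04 */
    uint32_t llp;     /* LLI.LLPx        at base + 0x08 */
    uint32_t ctl_lo;  /* LLI.CTLx[31:0]  at base + 0x0c */
    uint32_t ctl_hi;  /* LLI.CTLx[63:32] at base + 0x10 */
    uint32_t sstat;   /* LLI.SSTATx      at base + 0x14 */
    uint32_t dstat;   /* LLI.DSTATx      at base + 0x18 */
};

/* Address of one LLI field: {LLPx[31:2], 2'b00} plus the fixed offset. */
uint32_t lli_field_addr(uint32_t llpx, size_t field_offset)
{
    assert(field_offset < sizeof(struct dmac_lli));
    return (llpx & ~UINT32_C(3)) + (uint32_t)field_offset;
}
```

Calling lli_field_addr(llpx, offsetof(struct dmac_lli, ctl_lo)) reproduces the LLPx.LOC + 0xc location quoted in the note above. When DMAH_CHx_STAT_SRC is False, the sstat member would be dropped, which moves dstat to offset 0x14 as in Figure 49.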
Referring to Table 54, if the Write Back column entry is “Yes” and the configuration
parameter DMAH_CHx_CTL_WB_EN = True, then the CTLx[63:32] register is always
written to system memory (to LLI.CTLx[63:32]) at the end of every block transfer.
The source status is fetched and written to system memory at the end of every block
transfer if the Write Back column entry is “Yes,” DMAH_CHx_CTL_WB_EN = True,
DMAH_CHx_STAT_SRC = True, and CFGx.SS_UPD_EN is enabled.
The destination status is fetched and written to system memory at the end of every block
transfer if the Write Back column entry is “Yes,” DMAH_CHx_CTL_WB_EN = True,
DMAH_CHx_STAT_DST = True, and CFGx.DS_UPD_EN is enabled.
Auto-reloading of channel registers
During auto-reloading, the channel registers are reloaded with their initial values at the
completion of each block and the new values used for the new block. Depending on the row
number in Table 54, some or all of the SARx, DARx, and CTLx channel registers are
reloaded from their initial value at the start of a block transfer.
Contiguous address between blocks
In this case, the address between successive blocks is selected as a continuation from the
end of the previous block.
Enabling the source or destination address to be contiguous between blocks is a function of
the CTLx.LLP_SRC_EN, CFGx.RELOAD_SRC, CTLx.LLP_DST_EN, and
CFGx.RELOAD_DST fields (see Table 54).
Note:
You cannot select both SARx and DARx updates to be contiguous. If you want this
functionality, you should increase the size of the Block Transfer (CTLx.BLOCK_TS), or if this
is at the maximum value, use Row 10 of Table 54 and set up the LLI.SARx address of the
block descriptor to be equal to the end SARx address of the previous block. Similarly, set up
the LLI.DARx address of the block descriptor to be equal to the end DARx address of the
previous block. For more information, refer to Section : Multi-block transfer with linked list for
source and linked list for destination (Row 10).
Suspension of transfers between blocks
At the end of every block transfer, an end-of-block interrupt is asserted if:
1. Interrupts are enabled, CTLx.INT_EN = 1, and
2. The channel block interrupt is unmasked, MaskBlock[n] = 1, where n is the channel
number.
Note:
The block-complete interrupt is generated at the completion of the block transfer to the
destination.
For rows 6, 8, and 10 of Table 54, the DMA transfer does not stall between block transfers.
For example, at the end-of-block N, the DMAC automatically proceeds to block N + 1.
For rows 2, 3, 4, 7, and 9 of Table 54 (SARx and/or DARx auto-reloaded between block
transfers), the DMA transfer automatically stalls after the end-of-block interrupt is asserted,
if the end-of-block interrupt is enabled and unmasked.
The DMAC does not proceed to the next block transfer until a write to the ClearBlock[n]
block interrupt clear register, done by software to clear the channel block-complete interrupt,
is detected by hardware.
For rows 2, 3, 4, 7, and 9 of Table 54 (SARx and/or DARx auto-reloaded between block
transfers), the DMA transfer does not stall if either:
● Interrupts are disabled, CTLx.INT_EN = 0, or
● The channel block interrupt is masked, MaskBlock[n] = 0, where n is the channel
number.
Channel suspension between blocks is used to ensure that the end-of-block ISR (interrupt
service routine) of the next-to-last block is serviced before the final block
commences. This ensures that the ISR has cleared the CFGx.RELOAD_SRC and/or
CFGx.RELOAD_DST bits before completion of the final block. The reload bits
CFGx.RELOAD_SRC and/or CFGx.RELOAD_DST should be cleared in the end-of-block
ISR for the next-to-last block transfer.
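The stall rule described in this section can be condensed into a small C predicate. It is a sketch of the rule as stated in the text, not driver code; the function name and the use of a Table 54 row number as an argument are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true when the DMA transfer stalls after the end-of-block interrupt.
 *   row            Table 54 row number of the current block (1..10)
 *   int_en         CTLx.INT_EN = 1
 *   block_unmasked MaskBlock[n] = 1 for this channel
 */
bool stalls_between_blocks(int row, bool int_en, bool block_unmasked)
{
    assert(row >= 1 && row <= 10);
    /* Rows where SARx and/or DARx are auto-reloaded between blocks. */
    bool reload_row = (row == 2 || row == 3 || row == 4 || row == 7 || row == 9);
    /* The stall needs the end-of-block interrupt both enabled and unmasked. */
    return reload_row && int_en && block_unmasked;
}
```

When the predicate is true, the next block starts only after software writes to ClearBlock[n], which gives the end-of-block ISR of the next-to-last block time to clear the reload bits.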
Ending multi-block transfers
All multi-block transfers must end as shown in either Row 1 or Row 5 of Table 54. At the end
of every block transfer, the DMAC samples the row number, and if the DMAC is in the Row 1
or Row 5 state, then the previous block transferred was the last block and the DMA transfer
is terminated.
Row 1 and Row 5 are used for single-block transfers or terminating multi-block transfers.
Transfers initiated in rows 2, 3 or 4 can only end in row 1; similarly, transfers initiated in rows
6 through 10 can only end in row 5. Ending in the Row 5 state enables status fetch and
write-back for the last block. Ending in the Row 1 state disables status fetch and write-back
for the last block.
For rows 2, 3, and 4 of Table 54, (LLPx.LOC = 0 and CFGx.RELOAD_SRC and/or
CFGx.RELOAD_DST is set), multi-block DMA transfers continue until both the
CFGx.RELOAD_SRC and CFGx.RELOAD_DST registers are cleared by software. They
should be programmed to 0 in the end-of-block interrupt service routine that services the
next-to-last block transfer; this puts the DMAC into the Row 1 state.
For rows 6, 8, and 10 of Table 54 (both CFGx.RELOAD_SRC and CFGx.RELOAD_DST
cleared), the user must set up the last block descriptor in memory so that both
LLI.CTLx.LLP_SRC_EN and LLI.CTLx.LLP_DST_EN are 0.
The sampling of the LLPx.LOC bit takes place exclusively at the beginning of the transfer
when the channel is enabled. This determines whether writeback is enabled throughout the
complete transfer, and changing the value of this bit in subsequent blocks on the same
transfer does not have any effect.
Note:
The only allowed transitions between the rows of Table 54 are from any row into Row 1 or
Row 5. As already stated, a transition into row 1 or row 5 is used to terminate the DMA
transfer; all other transitions between rows are not allowed. Software must ensure that
illegal transitions between rows do not occur between blocks of a multi-block transfer. For
example, if block N is in row 10, then the only allowed rows for block N + 1 are rows
10 or 5.
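The transition rules can be expressed as a C check that software might run when preparing block N + 1. This sketch combines the note above with the earlier statement that transfers started in rows 2, 3 or 4 can only end in row 1 and transfers started in rows 6 through 10 can only end in row 5. The function name is hypothetical, and treating rows 1 and 5 as having no successor is an interpretation of the text.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if a block in Table 54 row to_row may follow one in from_row. */
bool row_transition_allowed(int from_row, int to_row)
{
    assert(from_row >= 1 && from_row <= 10);
    assert(to_row >= 1 && to_row <= 10);

    if (from_row == 1 || from_row == 5)
        return false;              /* last block: the DMA transfer terminates */
    if (to_row == from_row)
        return true;               /* continue with the same transfer type */
    if (from_row <= 4)
        return to_row == 1;        /* rows 2, 3, 4 can only end in row 1 */
    return to_row == 5;            /* rows 6 through 10 can only end in row 5 */
}
```

For a block in row 10, the check accepts only rows 10 and 5 for the next block, matching the example in the note.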
12.6.2
Programming example
The following flow diagram shows an overview of programming the DMA described in
Section : Programming example for linked list multi-block transfer.
Figure 50. Flowchart for DMA programming example
[Flowchart summary: starting from Idle, read ChEnReg until the channel is not busy, then
clear pending interrupts. Program the CTLx register: DONE bit; BLOCK_TS (block transfer
size); LLP_SRC_EN, LLP_DST_EN (block chaining for source/destination); TT_FC (transfer
type and flow control); SRC_TR_WIDTH and DST_TR_WIDTH (source and destination transfer
width); SMS/DMS (AHB layer for source/destination); SINC/DINC (address incrementing for
source/destination); SRC_MSIZE, DEST_MSIZE (source/destination burst transaction length);
SRC_GATHER_EN, DST_SCATTER_EN; INT_EN (interrupt enable). Program the channel
configuration: HS_SEL_SRC, HS_SEL_DST (source/destination handshaking interface) and, if
hardware handshaking is enabled, SRC_PER, DEST_PER; SS_UPD_EN, DS_UPD_EN
(source/destination status update enable); LOCK_B and, if set, the bus lock level LOCK_B_L;
LOCK_CH and, if set, the channel lock level LOCK_CH_L; FIFO_EMPTY, CH_SUSP, and
CH_PRIOR; PROTCTL; FIFO_MODE and FCMODE; RELOAD_SRC, RELOAD_DST; SRC_HS_POL,
DST_HS_POL; MAX_ABRST. Set the LLPx and SARx/DARx register locations of all LLI entries.
If gather is enabled, program the SGRx register; if scatter is enabled, program the DSRx
register. Clear pending interrupts, then write to ChEnReg to enable the DMAC channel.]
Programming example for linked list multi-block transfer
This section explains the step-by-step programming of the DMAC. The example
demonstrates row 10 of Table 54 for multi-block transfer with linked list for source and linked
list for destination. This example uses the DMAC to move four blocks of contiguous data
from source to destination memory using the linked list feature.
1. Set up the chain of linked list items – otherwise known as block descriptors – in
memory. Write the control information in the LLI.CTLx register location of the block
descriptor for each LLI in memory for Channel 1. In the LLI.CTLx register, the following
is programmed:
a) Set up the transfer type for a memory-to-memory transfer:
ctlx[22:20] = 3'b000;
b) Set up the transfer characteristics:
- Transfer width for the source in the SRC_TR_WIDTH field
ctlx[6:4] = 3'b001;
- Transfer width for the destination in the DST_TR_WIDTH field
ctlx[3:1] = 3'b001;
- Source master layer in the SMS field where the source resides
ctlx[26:25] = 2'b00;
- Destination master layer in the DMS field where the destination resides
ctlx[24:23] = 2'b00;
- Incrementing address for the source in the SINC field
ctlx[10:9] = 2'b00;
- Incrementing address for the destination in the DINC field
ctlx[8:7] = 2'b00;
2. Write the channel configuration information into the CFGx register for Channel 1:
a) The HS_SEL_SRC/HS_SEL_DST bits select which of the handshaking interfaces,
hardware or software, is active for source and destination requests on this channel.
cfgx[11] = 1'b0;
cfgx[10] = 1'b0;
These settings are ignored because both the source and destination are memory
types.
b) If the hardware handshaking interface is activated for the source or destination
peripheral, assign the handshaking interface to the source and destination
peripheral by programming the SRC_PER and DEST_PER bits:
cfgx[46:43] = 4'b0000;
cfgx[42:39] = 4'b0000;
These settings are ignored because both the source and destination are memory
types.
3. The following For loop, shown as a programming example, sets the following:
– LLI.LLPx register locations of all LLI entries in memory (except the last) to non-zero
and point to the base address of the next Linked List Item.
– LLI.SARx/LLI.DARx register locations of all LLI entries in memory point to the start
source/destination block address preceding that LLI fetch.
The For statement below configures the LLPx entries:
for (i = 0; i < 4; i = i + 1) begin
  if (i == 3)
    llpx = 0;              // end of LLI
  else
    llpx = llp_addr + 20;  // start of next LLI
  //-: Program SAR
  `AHB_MASTER.write(0, llp_addr, sarx, AhbWord32Attrb, handle[0]);
  //-: Program DAR
  `AHB_MASTER.write(0, (llp_addr + 4), darx, AhbWord32Attrb, handle[0]);
  //-: Program LLP
  `AHB_MASTER.write(0, (llp_addr + 8), llpx, AhbWord32Attrb, handle[0]);
  //-: Program CTL
  `AHB_MASTER.write(0, (llp_addr + 12), ctlx[31:0], AhbWord32Attrb, handle[0]);
  `AHB_MASTER.write(0, (llp_addr + 16), ctlx[63:32], AhbWord32Attrb, handle[0]);
  // update pointers
  llp_addr = llp_addr + 20; // start of next LLI
  // 4 16-bit words each with scatter/gather interval in each block
  // (will work only with scatter_gather count of 2)
  sarx = sarx + 24;
  darx = darx + 24;
end
4. If Gather is enabled (DMAH_CHx_SRC_GAT_EN = True and CTLx.SRC_GATHER_EN is
enabled), program the SGRx register for Channel 1.
5. If Scatter is enabled (DMAH_CHx_DST_SCA_EN = True and CTLx.DST_SCATTER_EN is
enabled), program the DSRx register for Channel 1.
6. Clear any pending interrupts on the channel from the previous DMA transfer by writing
to the Interrupt Clear registers.
7. Finally, enable the channel by writing a 1 to the ChEnReg.CH_EN bit; the transfer is
performed.
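The descriptor set-up of step 3 can also be written in C for software that builds the list in memory. The sketch below is an illustration under assumptions: the type and function names are hypothetical, the descriptor uses the same five-word layout (20 bytes, no status write-back words) as the loop above, and the CTLx values are supplied by the caller. Following the section on ending multi-block transfers, the last descriptor takes its own CTLx[31:0] value so that LLP_SRC_EN and LLP_DST_EN can be 0 in the final block.

```c
#include <assert.h>
#include <stdint.h>

/* Five-word LLI used by this example: SARx, DARx, LLPx, CTLx[31:0], CTLx[63:32]. */
struct lli5 {
    uint32_t sar, dar, llp, ctl_lo, ctl_hi;
};

/*
 * Fill 'count' descriptors.
 *   lli_base     bus address at which lli[0] is visible to the DMAC
 *   block_stride address step between blocks (24 in the loop above)
 *   ctl_lo_last  CTLx[31:0] for the last block
 */
void build_lli_chain(struct lli5 *lli, uint32_t count, uint32_t lli_base,
                     uint32_t sar, uint32_t dar, uint32_t block_stride,
                     uint32_t ctl_lo, uint32_t ctl_lo_last, uint32_t ctl_hi)
{
    assert(lli != 0 && count > 0u);
    for (uint32_t i = 0; i < count; i++) {
        int last = (i == count - 1u);

        lli[i].sar = sar + i * block_stride;
        lli[i].dar = dar + i * block_stride;
        /* Non-zero pointer to the next LLI, or 0 to mark the end of the list. */
        lli[i].llp = last ? 0u
                          : lli_base + (i + 1u) * (uint32_t)sizeof(struct lli5);
        lli[i].ctl_lo = last ? ctl_lo_last : ctl_lo;
        lli[i].ctl_hi = ctl_hi;
    }
}
```

A call with count = 4 and block_stride = 24 reproduces the addresses written by the For loop. The channel LLPx register would then be loaded with lli_base before the channel is enabled.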
12.6.3
Programming a channel
Three registers – LLPx, CTLx, and CFGx – need to be programmed to determine whether
single- or multi-block transfers occur, and which type of multi-block transfer is used. The
different transfer types are shown in Table 54.
The DMAC can be programmed to fetch the status from the source or destination peripheral;
this status is stored in the SSTATx and DSTATx registers. When the DMAC is programmed
to fetch the status from the source or destination peripheral, it writes this status and the
contents of the CTLx register back to memory at the end of a block transfer. The Write Back
column of Table 54 shows when this occurs.
The “Update Method” columns indicate where the values of SARx, DARx, CTLx, and LLPx
are obtained for the next block transfer when multi-block DMAC transfers are enabled.
Note:
In Table 54, all other combinations of LLPx.LOC = 0, CTLx.LLP_SRC_EN,
CFGx.RELOAD_SRC, CTLx.LLP_DST_EN, and CFGx.RELOAD_DST are illegal, and will
cause indeterminate or erroneous behavior.
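The relationship between the five control bits and the rows used in the programming examples of this section can be sketched in C as follows. This is an illustration only, not a substitute for Table 54: the function and type names are hypothetical, the mapping is inferred from the row descriptions in this section (contiguous, auto-reloaded, or linked list update per side), and Row 5 is assumed to differ from Row 1 only by a nonzero LLPx.LOC. Rows that are not used in the examples return 0 here.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    UPD_CONTIGUOUS,  /* neither LLP_x_EN nor RELOAD_x set */
    UPD_AUTO_RELOAD, /* RELOAD_x set */
    UPD_LINKED_LIST, /* LLP_x_EN set */
    UPD_ILLEGAL      /* both set */
} dmac_update_t;

/* Update method for one side (source or destination). */
static dmac_update_t dmac_update_method(int llp_en, int reload)
{
    if (llp_en && reload) return UPD_ILLEGAL;
    if (llp_en) return UPD_LINKED_LIST;
    if (reload) return UPD_AUTO_RELOAD;
    return UPD_CONTIGUOUS;
}

/* Returns the Table 54 row for the rows named in this section
 * (1, 3, 4, 5, 7, 8, 10), or 0 for anything else. Assumed mapping. */
static int dmac_table54_row(uint32_t llp_loc, int llp_src_en, int reload_src,
                            int llp_dst_en, int reload_dst)
{
    dmac_update_t s = dmac_update_method(llp_src_en, reload_src);
    dmac_update_t d = dmac_update_method(llp_dst_en, reload_dst);

    if (s == UPD_ILLEGAL || d == UPD_ILLEGAL) return 0;

    if (llp_loc == 0) {
        /* Without a list pointer, a linked list update is not possible. */
        if (s == UPD_LINKED_LIST || d == UPD_LINKED_LIST) return 0;
        if (s == UPD_CONTIGUOUS && d == UPD_CONTIGUOUS) return 1;
        if (s == UPD_AUTO_RELOAD && d == UPD_CONTIGUOUS) return 3;
        if (s == UPD_AUTO_RELOAD && d == UPD_AUTO_RELOAD) return 4;
        return 0;
    }

    if (s == UPD_CONTIGUOUS && d == UPD_CONTIGUOUS) return 5;
    if (s == UPD_AUTO_RELOAD && d == UPD_LINKED_LIST) return 7;
    if (s == UPD_LINKED_LIST && d == UPD_CONTIGUOUS) return 8;
    if (s == UPD_LINKED_LIST && d == UPD_LINKED_LIST) return 10;
    return 0;
}
```

A driver could use a check of this kind before enabling a channel, rejecting any combination that does not resolve to the intended row.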
Programming examples
- Single-block transfer (Row 1)
- Multi-block transfer with linked list for source and linked list for destination (Row 10)
- Multi-block transfer with source address auto-reloaded and destination address auto-reloaded (Row 4)
- Multi-block transfer with source address auto-reloaded and linked list destination address (Row 7)
- Multi-block transfer with source address auto-reloaded and contiguous destination address (Row 3)
- Multi-block DMA transfer with linked list for source and contiguous destination address (Row 8)
Single-block transfer (Row 1)
This section describes a single-block transfer, Row 1 in Table 54.
Note:
Row 5 in Table 54 is also a single-block transfer with write-back of control and status
information enabled at the end of the single-block transfer.
1.
Read the Channel Enable register to choose a free (disabled) channel; refer to
“ChEnReg” register.
2.
Clear any pending interrupts on the channel from the previous DMA transfer by writing
to the Interrupt Clear registers: ClearTfr, ClearBlock, ClearSrcTran, ClearDstTran, and
ClearErr. Reading the Interrupt Raw Status and Interrupt Status registers confirms that
all interrupts have been cleared.
3.
Program the following channel registers:
a)
Write the starting source address in the SARx register for channel x.
b)
Write the starting destination address in the DARx register for channel x.
c)
Program CTLx and CFGx according to Row 1, as shown in Table 54. Program the
LLPx register with 0.
d)
Write the control information for the DMA transfer in the CTLx register for channel
x. For example, in the register, you can program the following:
- Set up the transfer type (memory or non-memory peripheral for source and
destination) and flow control device by programming the TT_FC of the CTLx
register.
- Set up the transfer characteristics, such as:
• Transfer width for the source in the SRC_TR_WIDTH field.
• Transfer width for the destination in the DST_TR_WIDTH field.
• Source master layer in the SMS field where the source resides.
• Destination master layer in the DMS field where the destination resides.
• Incrementing/decrementing or fixed address for the source in the SINC field.
• Incrementing/decrementing or fixed address for the destination in the DINC
field.
e)
Write the channel configuration information into the CFGx register for channel x.
- Designate the handshaking interface type (hardware or software) for the source
and destination peripherals; this is not required for memory.
This step requires programming the HS_SEL_SRC/HS_SEL_DST bits,
respectively. Writing a 0 activates the hardware handshaking interface to handle
source/destination requests. Writing a 1 activates the software handshaking
interface to handle source and destination requests.
- If the hardware handshaking interface is activated for the source or destination
peripheral, assign a handshaking interface to the source and destination
peripheral; this requires programming the SRC_PER and DEST_PER bits,
respectively.
f)
If gather is enabled (parameter DMAH_CHx_SRC_GAT_EN = True and
CTLx.SRC_GATHER_EN is enabled), program the SGRx register for channel x.
g)
If scatter is enabled (parameter DMAH_CHx_DST_SCA_EN = True and
CTLx.DST_SCATTER_EN is enabled), program the DSRx register for channel x.
4.
After the DMAC-selected channel has been programmed, enable the channel by
writing a 1 to the ChEnReg.CH_EN bit. Ensure that bit 0 of the DmaCfgReg register is
enabled.
5.
Source and destination request single and burst DMA transactions in order to transfer
the block of data (assuming non-memory peripherals). The DMAC acknowledges at the
completion of every transaction (burst and single) in the block and carries out the block
transfer.
6.
Once the transfer completes, hardware sets the interrupts and disables the channel. At
this time, you can respond to either the Block Complete or Transfer Complete
interrupts, or poll for the transfer complete raw interrupt status register (RawTfr[n], n =
channel number) until it is set by hardware, in order to detect when the transfer is
complete. Note that if this polling is used, the software must ensure that the transfer
complete interrupt is cleared by writing to the Interrupt Clear register, ClearTfr[n],
before the channel is enabled.
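The software side of steps 1 to 4 and the polling in step 6 can be sketched against a plain C model of the registers. Everything here is an assumption made for illustration: the structure layout is a software model rather than the hardware address map (see RM0089 for the real one), the channel count of 8 is assumed, and the function names are hypothetical. The CTLx and CFGx values are passed in already encoded according to Row 1.

```c
#include <assert.h>
#include <stdint.h>

#define DMAC_NUM_CH 8 /* assumed channel count for this model */

typedef struct {
    uint32_t sar, dar, llp;
    uint64_t ctl, cfg;
} dmac_chan_t;

/* Software model only; not the hardware register layout. */
typedef struct {
    dmac_chan_t ch[DMAC_NUM_CH];
    uint32_t raw[5];  /* RawTfr, RawBlock, RawSrcTran, RawDstTran, RawErr */
    uint32_t dma_cfg; /* DmaCfgReg, bit 0 = DMAC enabled */
    uint32_t ch_en;   /* ChEnReg, bit n = channel n enabled */
} dmac_model_t;

/* Step 1: lowest-numbered disabled channel, or -1 if all are busy. */
static int dmac_find_free_channel(const dmac_model_t *d)
{
    for (int n = 0; n < DMAC_NUM_CH; n++)
        if (!(d->ch_en & (1u << n)))
            return n;
    return -1;
}

/* Step 2: models writes to ClearTfr, ClearBlock, ClearSrcTran,
 * ClearDstTran, and ClearErr for channel n. */
static void dmac_clear_irqs(dmac_model_t *d, int n)
{
    for (int i = 0; i < 5; i++)
        d->raw[i] &= ~(1u << n);
}

/* Steps 1 to 4. Returns the channel used, or -1. */
static int dmac_start_single_block(dmac_model_t *d, uint32_t src,
                                   uint32_t dst, uint64_t ctl, uint64_t cfg)
{
    int n = dmac_find_free_channel(d);
    if (n < 0)
        return -1;
    dmac_clear_irqs(d, n);
    d->ch[n].sar = src;
    d->ch[n].dar = dst;
    d->ch[n].llp = 0;     /* Row 1: LLPx programmed with 0 */
    d->ch[n].ctl = ctl;
    d->ch[n].cfg = cfg;
    d->dma_cfg |= 1u;     /* bit 0 of DmaCfgReg must be enabled */
    d->ch_en |= 1u << n;  /* enable the channel last */
    return n;
}

/* Step 6: poll RawTfr[n]. */
static int dmac_transfer_done(const dmac_model_t *d, int n)
{
    return (int)((d->raw[0] >> n) & 1u);
}
```

The ordering is the point of the sketch: interrupts are cleared before the channel registers are written, and the channel enable is the last write.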
Multi-block transfer with linked list for source and linked list for destination
(Row 10)
Note:
This type of multi-block transfer can only be enabled when either of the following parameters
is set:
●
DMAH_CHx_MULTI_BLK_TYPE = NO_HARDCODE,
or
●
DMAH_CHx_MULTI_BLK_TYPE = LLP_LLP
1.
Read the Channel Enable register (ChEnReg) to choose a free (disabled) channel.
2.
Set up the chain of Linked List Items (otherwise known as block descriptors) in
memory. Write the control information in the LLI.CTLx register location of the block
descriptor for each LLI in memory (see Figure 46) for channel x. For example, in the
register, you can program the following:
a)
Set up the transfer type (memory or non-memory peripheral for source and
destination) and flow control device by programming the TT_FC of the CTLx
register.
b)
Set up the transfer characteristics, such as:
- Transfer width for the source in the SRC_TR_WIDTH field.
- Transfer width for the destination in the DST_TR_WIDTH field.
- Source master layer in the SMS field where the source resides.
- Destination master layer in the DMS field where the destination resides.
- Incrementing/decrementing or fixed address for the source in the SINC field.
- Incrementing/decrementing or fixed address for the destination in the DINC field.
3.
Write the channel configuration information into the CFGx register for channel x.
a)
Designate the handshaking interface type (hardware or software) for the source
and destination peripherals; this is not required for memory.
This step requires programming the HS_SEL_SRC/HS_SEL_DST bits,
respectively. Writing a 0 activates the hardware handshaking interface to handle
source/destination requests for the specific channel. Writing a 1 activates the
software handshaking interface to handle source/destination requests.
b)
If the hardware handshaking interface is activated for the source or destination
peripheral, assign the handshaking interface to the source and destination
peripheral. This requires programming the SRC_PER and DEST_PER bits,
respectively.
4.
Make sure that the LLI.CTLx register locations of all LLI entries in memory (except the
last) are set as shown in Row 10 of Table 54. The LLI.CTLx register of the last Linked
List Item must be set as described in Row 1 or Row 5 of Table 54. Figure 46 shows a
Linked List example with two list items.
5.
Make sure that the LLI.LLPx register locations of all LLI entries in memory (except the
last) are non-zero and point to the base address of the next Linked List Item.
6.
Make sure that the LLI.SARx/LLI.DARx register locations of all LLI entries in memory
point to the start source/destination block address preceding that LLI fetch.
7.
If parameter DMAH_CHx_CTL_WB_EN = True, ensure that the LLI.CTLx.DONE field
of the LLI.CTLx register locations of all LLI entries in memory is cleared.
8.
If source status fetching is enabled (DMAH_CHx_CTL_WB_EN = True,
DMAH_CHx_STAT_SRC = True, and CFGx.SS_UPD_EN is enabled), program the
SSTATARx register so that the source status information can be fetched from the
location pointed to by the SSTATARx. For conditions under which the source status
information is fetched from system memory, refer to the Write Back column of Table 54.
9.
If destination status fetching is enabled (DMAH_CHx_CTL_WB_EN = True,
DMAH_CHx_STAT_DST = True, and CFGx.DS_UPD_EN is enabled), program the
DSTATARx register so that the destination status information can be fetched from the
location pointed to by the DSTATARx register. For conditions under which the
destination status information is fetched from system memory, refer to the Write Back
column of Table 54.
10. If gather is enabled (DMAH_CHx_SRC_GAT_EN = True and CTLx.SRC_GATHER_EN
is enabled), program the SGRx register for channel x.
11. If scatter is enabled (DMAH_CHx_DST_SCA_EN = True and
CTLx.DST_SCATTER_EN is enabled), program the DSRx register for channel x.
12. Clear any pending interrupts on the channel from the previous DMA transfer by writing
to the Interrupt Clear registers: ClearTfr, ClearBlock, ClearSrcTran, ClearDstTran, and
ClearErr. Reading the Interrupt Raw Status and Interrupt Status registers confirms that
all interrupts have been cleared.
13. Program the CTLx and CFGx registers according to Row 10, as shown in Table 54.
14. Program the LLPx register with LLPx(0), the pointer to the first linked list item.
15. Finally, enable the channel by writing a 1 to the ChEnReg.CH_EN bit; the transfer is
performed.
16. The DMAC fetches the first LLI from the location pointed to by LLPx(0).
Note:
The LLI.SARx, LLI.DARx, LLI.LLPx, and LLI.CTLx registers are fetched. The DMAC
automatically reprograms the SARx, DARx, LLPx, and CTLx channel registers from the
LLPx(0).
17. Source and destination request single and burst DMA transactions to transfer the block
of data (assuming non-memory peripheral). The DMAC acknowledges at the
completion of every transaction (burst and single) in the block and carries out the block
transfer.
18. Once the block of data is transferred, the source status information is fetched from the
location pointed to by the SSTATARx register and stored in the SSTATx register if
DMAH_CHx_CTL_WB_EN = True, DMAH_CHx_STAT_SRC = True, and
CFGx.SS_UPD_EN is enabled. For conditions under which the source status
information is fetched from system memory, refer to the Write Back column of Table 54.
The destination status information is fetched from the location pointed to by the
DSTATARx register and stored in the DSTATx register if DMAH_CHx_CTL_WB_EN =
True, DMAH_CHx_STAT_DST = True, and CFGx.DS_UPD_EN is enabled. For
conditions under which the destination status information is fetched from system
memory, refer to the Write Back column of Table 54.
19. If DMAH_CHx_CTL_WB_EN = True, then the CTLx[63:32] register is written out to
system memory. For conditions under which the CTLx[63:32] register is written out to
system memory, refer to the Write Back column of Table 54.
The CTLx[63:32] register is written out to the same location on the same layer
(LLPx.LMS) where it was originally fetched; that is, the location of the CTLx register of
the linked list item fetched prior to the start of the block transfer. Only the second word
of the CTLx register is written out – CTLx[63:32] – because only the CTLx.BLOCK_TS
and CTLx.DONE fields have been updated by the DMAC hardware. Additionally, the
CTLx.DONE bit is asserted to indicate block completion. Therefore, software can poll
the LLI.CTLx.DONE bit of the CTLx register in the LLI to ascertain when a block
transfer has completed.
Note:
Do not poll the CTLx.DONE bit in the DMAC memory map; instead, poll the LLI.CTLx.DONE
bit in the LLI for that block. If the polled LLI.CTLx.DONE bit is asserted, then this block
transfer has completed. This LLI.CTLx.DONE bit was cleared at the start of the transfer
(Step 7).
20. The SSTATx register is now written out to system memory if
DMAH_CHx_CTL_WB_EN = True, DMAH_CHx_STAT_SRC = True, and
CFGx.SS_UPD_EN is enabled. It is written to the SSTATx register location of the LLI
pointed to by the previously saved LLPx.LOC register.
The DSTATx register is now written out to system memory if
DMAH_CHx_CTL_WB_EN = True, DMAH_CHx_STAT_DST = True, and
CFGx.DS_UPD_EN is enabled. It is written to the DSTATx register location of the LLI
pointed to by the previously saved LLPx.LOC register.
The end-of-block interrupt, int_block, is generated after the write-back of the control
and status registers has completed.
Note:
The write-back location for the control and status registers is the LLI pointed to by the
previous value of the LLPx.LOC register, not the LLI pointed to by the current value of
the LLPx.LOC register.
21. The DMAC fetches the next LLI from the memory location pointed to by the current
LLPx register and automatically reprograms the SARx, DARx, LLPx, and CTLx channel
registers. The DMA transfer continues until the DMAC determines that the CTLx and
LLPx registers at the end of a block transfer match the ones described in Row 1 or Row
5 of Table 54 (as discussed earlier). The DMAC then knows that the previously
transferred block was the last block in the DMA transfer.
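The hardware behaviour described in steps 16 to 21 can be illustrated with a small software model that walks an LLI chain held in a word array. This is a sketch under assumptions, not a description of the silicon: the bit positions of LLP_SRC_EN, LLP_DST_EN, BLOCK_TS, and DONE are assumed (check the CTLx description in RM0089), transfers are modelled as 32-bit word copies, status fetch is omitted, and the model simply refuses an LLI whose DONE bit was not cleared beforehand (step 7). The name dmac_sim_llp_llp() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions; verify against the CTLx register description. */
#define CTL_LO_LLP_DST_EN (1u << 27)
#define CTL_LO_LLP_SRC_EN (1u << 28)
#define CTL_HI_BLOCK_TS   0x0FFFu
#define CTL_HI_DONE       (1u << 12)

/* Word offsets inside one 20-byte LLI. */
enum { LLI_SAR, LLI_DAR, LLI_LLP, LLI_CTL_LO, LLI_CTL_HI, LLI_WORDS };

/* `mem` models system memory as 32-bit words; all addresses are byte
 * addresses. Returns the number of blocks transferred, or -1 on error. */
static int dmac_sim_llp_llp(uint32_t *mem, size_t words, uint32_t llp0)
{
    uint32_t llp = llp0;
    int blocks = 0;

    for (;;) {
        size_t base = llp / 4;
        if (llp == 0 || llp % 4 != 0 || base + LLI_WORDS > words)
            return -1;

        /* LLI fetch: reprogram SARx, DARx, LLPx, and CTLx from memory. */
        uint32_t sar = mem[base + LLI_SAR];
        uint32_t dar = mem[base + LLI_DAR];
        uint32_t next = mem[base + LLI_LLP];
        uint32_t ctl_lo = mem[base + LLI_CTL_LO];
        uint32_t ctl_hi = mem[base + LLI_CTL_HI];
        uint32_t n = ctl_hi & CTL_HI_BLOCK_TS;

        if (ctl_hi & CTL_HI_DONE)
            return -1; /* model choice: DONE must be cleared first */
        if (sar / 4 + n > words || dar / 4 + n > words)
            return -1;

        /* Block transfer. */
        for (uint32_t i = 0; i < n; i++)
            mem[dar / 4 + i] = mem[sar / 4 + i];

        /* Write-back of CTLx[63:32] to the LLI just executed, that is,
         * the one addressed by the previous LLPx value, not `next`. */
        mem[base + LLI_CTL_HI] = ctl_hi | CTL_HI_DONE;
        blocks++;

        /* Row 1 or Row 5 in the fetched CTLx: this was the last block. */
        if (!(ctl_lo & (CTL_LO_LLP_SRC_EN | CTL_LO_LLP_DST_EN)))
            return blocks;
        llp = next;
    }
}
```

After the call, software can inspect the DONE bit in each LLI's second CTLx word to see which blocks have completed, which is the polling method recommended in the note to step 19.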
The DMA transfer might look like that shown in Figure 51.
Figure 51. Multi-block with linked address for source and destination
[Figure: source blocks 0 to 2 start at SAR(0) to SAR(2) on the source layer; destination blocks 0 to 2 start at DAR(0) to DAR(2) on the destination layer.]
If the user needs to execute a DMA transfer where the source and destination address are
contiguous, but where the amount of data to be transferred is greater than the maximum
block size CTLx.BLOCK_TS, then this can be achieved using the type of multi-block transfer
shown in Figure 52.
Figure 52. Multi-block with linked address for source and destination where SARx
and DARx between successive blocks are contiguous
[Figure: Block0 to Block3 are contiguous on both layers, starting at SAR(0) to SAR(3) on the source layer and DAR(0) to DAR(3) on the destination layer.]
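Splitting a large contiguous transfer into blocks that each fit within the maximum block size can be sketched as below. The names dmac_block_t and dmac_split_contiguous() are hypothetical, the maximum block size is passed in by the caller rather than assumed, and the sketch assumes the source and destination use the same transfer width so that both addresses advance by the same number of bytes per block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t sar;      /* start source address of this block */
    uint32_t dar;      /* start destination address of this block */
    uint32_t block_ts; /* number of items in this block */
} dmac_block_t;

/* Split `total_items` items of `item_bytes` bytes each into blocks of at
 * most `max_block_ts` items. Returns the number of blocks written to
 * `out`, or 0 if the arguments are invalid or `out_cap` is too small. */
static size_t dmac_split_contiguous(uint32_t sar, uint32_t dar,
                                    uint32_t total_items, uint32_t item_bytes,
                                    uint32_t max_block_ts,
                                    dmac_block_t *out, size_t out_cap)
{
    size_t n = 0;

    if (max_block_ts == 0 || item_bytes == 0)
        return 0;

    while (total_items > 0) {
        uint32_t ts = total_items < max_block_ts ? total_items : max_block_ts;
        if (n == out_cap)
            return 0;
        out[n].sar = sar;
        out[n].dar = dar;
        out[n].block_ts = ts;
        n++;
        /* Next block starts where this one ends on both layers. */
        sar += ts * item_bytes;
        dar += ts * item_bytes;
        total_items -= ts;
    }
    return n;
}
```

Each resulting entry supplies the SARx, DARx, and CTLx.BLOCK_TS values for one Linked List Item. For example, 10000 four-byte items with an assumed limit of 4095 items per block yield three blocks of 4095, 4095, and 1810 items.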
The DMA transfer flow is shown in Figure 53.
Figure 53. DMA transfer flow for source and destination linked list address
[Flowchart: channel enabled by software; LLI fetch; hardware reprograms SARx, DARx, CTLx, and LLPx; DMAC block transfer; source/destination status fetch; write-back of control and source/destination status to LLI; block-complete interrupt generated. If the DMAC is in Row 1 or Row 5 of the "Programming of transfer types and channel register update method" table, the DMAC transfer complete interrupt is generated and the channel is disabled by hardware; otherwise the flow returns to the LLI fetch.]
Multi-block transfer with source address auto-reloaded and destination
address auto-reloaded (Row 4)
Note:
This type of multi-block transfer can only be enabled when either of the following parameters
is set:
●
DMAH_CHx_MULTI_BLK_TYPE = NO_HARDCODE
or
●
DMAH_CHx_MULTI_BLK_TYPE = RELOAD_RELOAD
1.
Read the Channel Enable register (ChEnReg) to choose an available (disabled)
channel.
2.
Clear any pending interrupts on the channel from the previous DMA transfer by writing
to the Interrupt Clear registers: ClearTfr, ClearBlock, ClearSrcTran, ClearDstTran, and
ClearErr. Reading the Interrupt Raw Status and Interrupt Status registers confirms that
all interrupts have been cleared.
3.
Program the following channel registers:
a)
Write the starting source address in the SARx register for channel x.
b)
Write the starting destination address in the DARx register for channel x.
c)
Program CTLx and CFGx according to Row 4, as shown in Table 54. Program the
LLPx register with 0.
d)
Write the control information for the DMA transfer in the CTLx register for channel
x. For example, in the register, you can program the following:
- Set up the transfer type (memory or non-memory peripheral for source and
destination) and flow control device by programming the TT_FC of the CTLx
register.
- Set up the transfer characteristics, such as:
• Transfer width for the source in the SRC_TR_WIDTH field.
• Transfer width for the destination in the DST_TR_WIDTH field.
• Source master layer in the SMS field where the source resides.
• Destination master layer in the DMS field where the destination resides.
• Incrementing/decrementing or fixed address for the source in the SINC field.
• Incrementing/decrementing or fixed address for the destination in the DINC
field.
e)
If gather is enabled (DMAH_CHx_SRC_GAT_EN = True and
CTLx.SRC_GATHER_EN is enabled), program the SGRx register for channel x.
f)
If scatter is enabled (DMAH_CHx_DST_SCA_EN = True and
CTLx.DST_SCATTER_EN is enabled), program the DSRx register for channel x.
g)
Write the channel configuration information into the CFGx register for channel x.
Ensure that the reload bits, CFGx.RELOAD_SRC and CFGx.RELOAD_DST, are
enabled.
- Designate the handshaking interface type (hardware or software) for the source
and destination peripherals; this is not required for memory.
This step requires programming the HS_SEL_SRC/HS_SEL_DST bits,
respectively. Writing a 0 activates the hardware handshaking interface to handle
source/destination requests for the specific channel. Writing a 1 activates the
software handshaking interface to handle source/destination requests.
- If the hardware handshaking interface is activated for the source or destination
peripheral, assign the handshaking interface to the source and destination
peripheral. This requires programming the SRC_PER and DEST_PER bits,
respectively.
4.
After the DMAC selected channel has been programmed, enable the channel by writing
a 1 to the ChEnReg.CH_EN bit. Ensure that bit 0 of the DmaCfgReg register is
enabled.
5.
Source and destination request single and burst DMAC transactions to transfer the
block of data (assuming non-memory peripherals). The DMAC acknowledges on
completion of each burst/single transaction and carries out the block transfer.
6.
When the block transfer has completed, the DMAC reloads the SARx, DARx, and CTLx
registers. Hardware sets the block-complete interrupt. The DMAC then samples the
row number, as shown in Table 54. If the DMAC is in Row 1, then the DMA transfer
has completed. Hardware sets the transfer complete interrupt and disables the
channel. You can either respond to the Block Complete or Transfer Complete
interrupts, or poll for the transfer complete raw interrupt status register
(RawTfr[n], where n is the channel number) until it is set by hardware, in order
to detect when the transfer is complete.
Note that if this polling is used, software must ensure that the transfer complete
interrupt is cleared by writing to the Interrupt Clear register, ClearTfr[n], before the
channel is enabled. If the DMAC is not in Row 1, the next step is performed.
7.
The DMA transfer proceeds as follows:
a)
If interrupts are enabled (CTLx.INT_EN = 1) and the block-complete interrupt is
unmasked (MaskBlock[x] = 1’b1, where x is the channel number), hardware sets
the block-complete interrupt when the block transfer has completed. It then stalls
until the block-complete interrupt is cleared by software. If the next block is to be
the last block in the DMA transfer, then the block-complete ISR (interrupt service
routine) should clear the reload bits in the CFGx.RELOAD_SRC and
CFGx.RELOAD_DST registers. This puts the DMAC into Row 1, as shown in
Table 54. If the next block is not the last block in the DMA transfer, then the reload
bits should remain enabled to keep the DMAC in Row 4.
b)
If interrupts are disabled (CTLx.INT_EN = 0) or the block-complete interrupt is
masked (MaskBlock[x] = 1’b0, where x is the channel number), then hardware
does not stall until it detects a write to the block-complete interrupt clear register;
instead, it immediately starts the next block transfer. In this case, software must
clear the reload bits in the CFGx.RELOAD_SRC and CFGx.RELOAD_DST
registers to put the DMAC into Row 1 of Table 54 before the last block of the DMA
transfer has completed.
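The handling described in step 7a, where the block-complete ISR clears the reload bits just before the last block, can be sketched with a small software model. The bit positions of RELOAD_SRC and RELOAD_DST are assumed (check the CFGx description in RM0089), and reload_chan_t, reload_chan_init(), block_complete_isr(), and reload_chan_run() are hypothetical names. The run function stands in for the hardware: it counts one block per iteration, samples the row, and calls the ISR only when the channel is still in Row 4.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions within the low word of CFGx. */
#define CFG_RELOAD_SRC (1u << 30)
#define CFG_RELOAD_DST (1u << 31)

typedef struct {
    uint32_t cfg_lo;       /* models CFGx[31:0] */
    uint32_t block_irq;    /* models the block-complete interrupt */
    unsigned blocks_done;  /* blocks acknowledged by the ISR */
    unsigned blocks_total; /* blocks wanted in the whole DMA transfer */
} reload_chan_t;

/* A one-block transfer starts in Row 1, so reload is never enabled. */
static void reload_chan_init(reload_chan_t *c, unsigned blocks_total)
{
    assert(blocks_total >= 1);
    c->cfg_lo = (blocks_total > 1) ? (CFG_RELOAD_SRC | CFG_RELOAD_DST) : 0u;
    c->block_irq = 0;
    c->blocks_done = 0;
    c->blocks_total = blocks_total;
}

/* Runs while the DMAC is stalled after a block. If the next block is the
 * last one, clear both reload bits so the DMAC ends up in Row 1. */
static void block_complete_isr(reload_chan_t *c)
{
    c->blocks_done++;
    if (c->blocks_done + 1 == c->blocks_total)
        c->cfg_lo &= ~(CFG_RELOAD_SRC | CFG_RELOAD_DST);
    c->block_irq = 0; /* models the write to ClearBlock; releases the stall */
}

/* Stand-in for the hardware flow. Returns the number of blocks executed. */
static unsigned reload_chan_run(reload_chan_t *c)
{
    unsigned executed = 0;

    for (;;) {
        executed++; /* block transfer, then SARx/DARx/CTLx reload */
        if (!(c->cfg_lo & (CFG_RELOAD_SRC | CFG_RELOAD_DST)))
            return executed; /* Row 1: transfer complete */
        c->block_irq = 1;    /* Row 4: stall until software clears it */
        block_complete_isr(c);
    }
}
```

With blocks_total set to 4, the ISR runs three times and clears the reload bits on its third invocation, so exactly four blocks are executed. When the interrupt is masked (step 7b), there is no stall, and the same bits must be cleared by other means before the last block completes.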
The transfer is similar to that shown in Figure 54.
Figure 54. Multi-block DMA transfer with source and destination address auto-reloaded
[Figure: Block0, Block1, Block2 through BlockN share a single SAR on the source layer and a single DAR on the destination layer.]
The DMA transfer flow is shown in Figure 55.
Figure 55. DMA transfer flow for source and destination address auto-reloaded
[Flowchart: channel enabled by software; block transfer; reload SARx, DARx, and CTLx. If the DMAC is in Row 1 of the "Programming of transfer types and channel register update method" table, the DMAC transfer complete interrupt is generated and the channel is disabled by hardware. Otherwise, if CTLx.INT_EN = 1 and MASKBLOCK[x] = 1, the block-complete interrupt is generated and the flow stalls until the block-complete interrupt is cleared by software before returning to the block transfer; if not, the flow returns to the block transfer directly.]
Multi-block transfer with source address auto-reloaded and linked list
destination address (Row 7)
Note:
This type of multi-block transfer can only be enabled when either of the following parameters
is set:
●
DMAH_CHx_MULTI_BLK_TYPE = 0
or
●
DMAH_CHx_MULTI_BLK_TYPE = RELOAD_LLP
1.
Read the Channel Enable register (ChEnReg) in order to choose a free (disabled)
channel.
2.
Set up the chain of linked list items (otherwise known as block descriptors) in memory.
Write the control information in the LLI.CTLx register location of the block descriptor for
each LLI in memory (see Figure 46) for channel x. For example, in the register you can
program the following:
a)
Set up the transfer type (memory or non-memory peripheral for source and
destination) and flow control peripheral by programming the TT_FC of the CTLx
register.
b)
Set up the transfer characteristics, such as:
- Transfer width for the source in the SRC_TR_WIDTH field.
- Transfer width for the destination in the DST_TR_WIDTH field.
- Source master layer in the SMS field where the source resides.
- Destination master layer in the DMS field where the destination resides.
- Incrementing/decrementing or fixed address for the source in the SINC field.
- Incrementing/decrementing or fixed address for the destination in the DINC field.
3.
Write the starting source address in the SARx register for channel x.
Note:
The values in the LLI.SARx register locations of each of the Linked List Items (LLIs) set up
in memory, although fetched during an LLI fetch, are not used.
4.
Write the channel configuration information into the CFGx register for channel x.
- Designate the handshaking interface type (hardware or software) for the source
and destination peripherals; this is not required for memory.
This step requires programming the HS_SEL_SRC/HS_SEL_DST bits, respectively. Writing a 0
activates the hardware handshaking interface to handle source/destination
requests for the specific channel. Writing a 1 activates the software handshaking
interface to handle source/destination requests.
- If the hardware handshaking interface is activated for the source or destination
peripheral, assign the handshaking interface to the source and destination
peripheral; this requires programming the SRC_PER and DEST_PER bits,
respectively.
5.
Make sure that the LLI.CTLx register locations of all LLIs in memory (except the last)
are set as shown in Row 7 of Table 54, while the LLI.CTLx register of the last Linked
List item must be set as described in Row 1 or Row 5 of Table 54. Figure 46 shows a
Linked List example with two list items.
6.
Ensure that the LLI.LLPx register locations of all LLIs in memory (except the last) are
non-zero and point to the next Linked List Item.
7.
Ensure that the LLI.DARx register locations of all LLIs in memory point to the start
destination block address preceding that LLI fetch.
8.
If DMAH_CHx_CTL_WB_EN = True, ensure that the LLI.CTLx.DONE fields of the
LLI.CTLx register locations of all LLIs in memory are cleared.
9.
If source status fetching is enabled (DMAH_CHx_CTL_WB_EN = True,
DMAH_CHx_STAT_SRC = True, and CFGx.SS_UPD_EN is enabled), program the
SSTATARx register so that the source status information can be fetched from the
location pointed to by the SSTATARx. For conditions under which the source status
information is fetched from system memory, refer to the Write Back column of Table 54.
10. If destination status fetching is enabled (DMAH_CHx_CTL_WB_EN = True,
DMAH_CHx_STAT_DST = True, and CFGx.DS_UPD_EN is enabled), program the
DSTATARx register so that the destination status information can be fetched from the
location pointed to by the DSTATARx register. For conditions under which the
destination status information is fetched from system memory, refer to the Write Back
column of Table 54.
11. If gather is enabled (DMAH_CHx_SRC_GAT_EN = True and CTLx.SRC_GATHER_EN
is enabled), program the SGRx register for channel x.
12. If scatter is enabled (DMAH_CHx_DST_SCA_EN = True and
CTLx.DST_SCATTER_EN is enabled), program the DSRx register for channel x.
13. Clear any pending interrupts on the channel from the previous DMA transfer by writing
to the Interrupt Clear registers: ClearTfr, ClearBlock, ClearSrcTran, ClearDstTran, and
ClearErr. Reading the Interrupt Raw Status and Interrupt Status registers confirms that
all interrupts have been cleared.
14. Program the CTLx and CFGx registers according to Row 7, as shown in Table 54.
15. Program the LLPx register with LLPx(0), the pointer to the first Linked List item.
16. Finally, enable the channel by writing a 1 to the ChEnReg.CH_EN bit; the transfer is
performed. Ensure that bit 0 of the DmaCfgReg register is enabled.
17. The DMAC fetches the first LLI from the location pointed to by LLPx(0).
Note:
The LLI.SARx, LLI.DARx, LLI.LLPx, and LLI.CTLx registers are fetched. The LLI.SARx
register – although fetched – is not used.
18. Source and destination request single and burst DMAC transactions in order to transfer
the block of data (assuming non-memory peripherals). The DMAC acknowledges at the
completion of every transaction (burst and single) in the block and carries out the block
transfer.
19. Once the block of data is transferred, the source status information is fetched from the
location pointed to by the SSTATARx register and stored in the SSTATx register if
DMAH_CHx_CTL_WB_EN = True, DMAH_CHx_STAT_SRC = True, and
CFGx.SS_UPD_EN is enabled. For conditions under which the source status
information is fetched from system memory, refer to the Write Back column of Table 54.
The destination status information is fetched from the location pointed to by the
DSTATARx register and stored in the DSTATx register if DMAH_CHx_CTL_WB_EN =
True, DMAH_CHx_STAT_DST = True, and CFGx.DS_UPD_EN is enabled. For
conditions under which the destination status information is fetched from system
memory, refer to the Write Back column of Table 54.
20. If DMAH_CHx_CTL_WB_EN = True, then the CTLx[63:32] register is written out to
system memory. For conditions under which the CTLx[63:32] register is written out to
system memory, refer to the Write Back column of Table 54.
The CTLx[63:32] register is written out to the same location on the same layer
(LLPx.LMS) where it was originally fetched; that is, the location of the CTLx register of
the linked list item fetched prior to the start of the block transfer. Only the second word
of the CTLx register is written out – CTLx[63:32] – because only the CTLx.BLOCK_TS
and CTLx.DONE fields have been updated by hardware within the DMAC. The
LLI.CTLx.DONE bit is asserted to indicate block completion. Therefore, software can
poll the LLI.CTLx.DONE bit field of the CTLx register in the LLI to ascertain when a
block transfer has completed.
Note:
Do not poll the CTLx.DONE bit in the DMAC memory map. Instead, poll the LLI.CTLx.DONE
bit in the LLI for that block. If the polled LLI.CTLx.DONE bit is asserted, then this block
transfer has completed. This LLI.CTLx.DONE bit was cleared at the start of the transfer
(Step 8).
21. The SSTATx register is now written out to system memory if
DMAH_CHx_CTL_WB_EN = True, DMAH_CHx_STAT_SRC = True, and
CFGx.SS_UPD_EN is enabled. It is written to the SSTATx register location of the LLI
pointed to by the previously saved LLPx.LOC register.
The DSTATx register is now written out to system memory if
DMAH_CHx_CTL_WB_EN = True, DMAH_CHx_STAT_DST = True, and
CFGx.DS_UPD_EN is enabled. It is written to the DSTATx register location of the LLI
pointed to by the previously saved LLPx.LOC register.
The end-of-block interrupt, int_block, is generated after the write-back of the control
and status registers has completed.
Note:
The write-back location for the control and status registers is the LLI pointed to by the
previous value of the LLPx.LOC register, not the LLI pointed to by the current value of the
LLPx.LOC register.
22. The DMAC reloads the SARx register from the initial value. Hardware sets the block-complete interrupt. The DMAC samples the row number, as shown in Table 54. If the
DMAC is in Row 1 or Row 5, then the DMA transfer has completed. Hardware sets the
transfer complete interrupt and disables the channel. You can either respond to the
Block Complete or Transfer Complete interrupts, or poll for the transfer complete raw
interrupt status register (RawTfr[n], n = channel number) until it is set by hardware, in
order to detect when the transfer is complete. Note that if this polling is used, software
must ensure that the transfer complete interrupt is cleared by writing to the Interrupt
Clear register, ClearTfr[n], before the channel is enabled. If the DMAC is not in Row 1
or Row 5 as shown in Table 54, the following steps are performed.
23. The DMA transfer proceeds as follows:
a)
If interrupts are enabled (CTLx.INT_EN = 1) and the block-complete interrupt is
unmasked (MaskBlock[x] = 1’b1, where x is the channel number), hardware sets
the block-complete interrupt when the block transfer has completed. It then stalls
until the block-complete interrupt is cleared by software. If the next block is to be
the last block in the DMA transfer, then the block-complete ISR (interrupt service
routine) should clear the CFGx.RELOAD_SRC source reload bit. This puts the
DMAC into Row 1, as shown in Table 54. If the next block is not the last block in
the DMA transfer, then the source reload bit should remain enabled to keep the
DMAC in Row 7, as shown in Table 54.
b)
If interrupts are disabled (CTLx.INT_EN = 0) or the block-complete interrupt is
masked (MaskBlock[x] = 1’b0, where x is the channel number), then hardware
does not stall until it detects a write to the block-complete interrupt clear register;
instead, it immediately starts the next block transfer. In this case, software must
clear the source reload bit, CFGx.RELOAD_SRC in order to put the device into
Row 1 of Table 54 before the last block of the DMA transfer has completed.
24. The DMAC fetches the next LLI from the memory location pointed to by the current LLPx
register and automatically reprograms the DARx, CTLx, and LLPx channel registers.
Note that if the next block is the last block of the DMA transfer, then the CTLx and LLPx
registers just fetched from the LLI should match Row 1 or Row 5 of Table 54.
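The effect of this transfer type, in which every block restarts from the same source address while the destination comes from the list, can be illustrated with a deliberately simplified C model. It is a sketch under assumptions: row7_lli_t and row7_sim() are hypothetical names, the list uses C pointers instead of LLPx addresses, a NULL next pointer stands in for a last LLI whose CTLx matches Row 1 or Row 5, and status fetch and write-back are reduced to a done flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct row7_lli {
    uint32_t sar_unused;   /* LLI.SARx: fetched, but not used in Row 7 */
    uint32_t *dar;         /* start of this block's destination */
    struct row7_lli *next; /* NULL on the last item (simplification) */
    uint32_t block_ts;     /* items in this block */
    int done;              /* stands in for LLI.CTLx.DONE */
} row7_lli_t;

/* Copies the same source block to each destination in the list.
 * Returns the number of blocks transferred. */
static int row7_sim(const uint32_t *src, row7_lli_t *first)
{
    int blocks = 0;

    for (row7_lli_t *lli = first; lli != NULL; lli = lli->next) {
        /* SARx was reloaded with its initial value, so every block
         * reads from src[0] onward regardless of lli->sar_unused. */
        for (uint32_t i = 0; i < lli->block_ts; i++)
            lli->dar[i] = src[i];
        lli->done = 1;
        blocks++;
    }
    return blocks;
}
```

With two list items, the same source words appear at both destinations, and the value stored in sar_unused has no effect on the result.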
The DMA transfer might look like that shown in Figure 56.
Figure 56. Multi-block DMA transfer with source address auto-reloaded and linked
list destination address
[Figure: the source blocks all start at the same address SAR on the source layer;
destination blocks Block0 to BlockN start at DAR(0) to DAR(N) on the destination layer.]
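As a rough illustration of the address pattern in Figure 56, the sketch below lists the source and destination start address used for each block of a Row 7 transfer. The `Lli` class and the `row7_block_addresses` function are hypothetical names invented for this example and are not part of any driver; the model assumes that only the destination address changes from block to block.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Lli:
    """Hypothetical view of one linked list item; only DAR matters here."""
    dar: int


def row7_block_addresses(sar: int, llis: List[Lli]) -> List[Tuple[int, int]]:
    """Return the (source, destination) start address of every block.

    The source address is reloaded to the same SAR for each block, while
    the destination address comes from the LLI fetched for that block.
    """
    return [(sar, lli.dar) for lli in llis]
```

Calling `row7_block_addresses(sar, llis)` with one `Lli` per block yields one address pair per block, all sharing the same source address.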
The DMA transfer flow is shown in Figure 57.
Figure 57. DMA transfer flow for source address auto-reloaded and linked list
destination address
[Flowchart: channel enabled by software; LLI fetch; hardware reprograms DARx, CTLx,
and LLPx; DMAC block transfer; source/destination status fetch; write-back of control and
source/destination status to LLI; reload SARx. Decision: is the DMAC in Row 1 of the
“Programming of transfer types and channel register update method” table? If yes, the
DMAC transfer-complete interrupt is generated and the channel is disabled by hardware.
If no, a second decision checks whether CTLx.INT_EN = 1 and MASKBLOCK[x] = 1: if yes,
the block-complete interrupt is generated and the DMAC stalls until the block interrupt is
cleared by software; if no, the flow continues without stalling.]
Multi-block transfer with source address auto-reloaded and contiguous
destination address (Row 3)
Note: This type of multi-block transfer can only be enabled when either of the following
parameters is set:
● DMAH_CHx_MULTI_BLK_TYPE = 0
or
● DMAH_CHx_MULTI_BLK_TYPE = RELOAD_CONT
1. Read the Channel Enable register (ChEnReg) to choose a free (disabled) channel.
2. Clear any pending interrupts on the channel from the previous DMA transfer by writing
to the Interrupt Clear registers: ClearTfr, ClearBlock, ClearSrcTran, ClearDstTran, and
ClearErr. Reading the Interrupt Raw Status and Interrupt Status registers confirms that
all interrupts have been cleared.
3. Program the following channel registers:
a) Write the starting source address in the SARx register for channel x.
b) Write the starting destination address in the DARx register for channel x.
c) Program CTLx and CFGx according to Row 3, shown in Table 54. Program the
LLPx register with 0.
d) Write the control information for the DMA transfer in the CTLx register for channel
x. For example, in the register, you can program the following:
- Set up the transfer type (memory or non-memory peripheral for source and
destination) and flow control device by programming the TT_FC field of the CTLx
register.
- Set up the transfer characteristics, such as:
• Transfer width for the source in the SRC_TR_WIDTH field.
• Transfer width for the destination in the DST_TR_WIDTH field.
• Source master layer in the SMS field where the source resides.
• Destination master layer in the DMS field where the destination resides.
• Incrementing/decrementing or fixed address for the source in the SINC field.
• Incrementing/decrementing or fixed address for the destination in the DINC
field.
e) If gather is enabled (DMAH_CHx_SRC_GAT_EN = True and
CTLx.SRC_GATHER_EN is enabled), program the SGRx register for channel x.
f) If scatter is enabled (DMAH_CHx_DST_SCA_EN = True and
CTLx.DST_SCATTER_EN is enabled), program the DSRx register for channel x.
g) Write the channel configuration information into the CFGx register for channel x.
- Designate the handshaking interface type (hardware or software) for the source
and destination peripherals; this is not required for memory.
This step requires programming the HS_SEL_SRC/HS_SEL_DST bits,
respectively. Writing a 0 activates the hardware handshaking interface to handle
source/destination requests for the specific channel. Writing a 1 activates the
software handshaking interface to handle source/destination requests.
- If the hardware handshaking interface is activated for the source or destination
peripheral, assign the handshaking interface to the source and destination
peripheral. This requires programming the SRC_PER and DEST_PER bits,
respectively.
4. After the DMAC channel has been programmed, enable the channel by writing a 1 to
the ChEnReg.CH_EN bit. Ensure that bit 0 of the DmaCfgReg register is enabled.
5. Source and destination request single and burst DMAC transactions to transfer the
block of data (assuming non-memory peripherals). The DMAC acknowledges at the
completion of every transaction (burst and single) in the block and carries out the block
transfer.
6. When the block transfer has completed, the DMAC reloads the SARx register; the
DARx register remains unchanged. Hardware sets the block-complete interrupt. The
DMAC then samples the row number, as shown in Table 54. If the DMAC is in Row 1,
then the DMA transfer has completed. Hardware sets the transfer-complete interrupt
and disables the channel. You can either respond to the Block Complete or Transfer
Complete interrupts, or poll the transfer-complete raw interrupt status register
(RawTfr[n], n = channel number) until it is set by hardware, in order to detect when the
transfer is complete. Note that if this polling is used, software must ensure that the
transfer complete interrupt is cleared by writing to the Interrupt Clear register,
ClearTfr[n], before the channel is enabled. If the DMAC is not in Row 1, the next step is
performed.
7. The DMA transfer proceeds as follows:
a) If interrupts are enabled (CTLx.INT_EN = 1) and the block-complete interrupt is
unmasked (MaskBlock[x] = 1’b1, where x is the channel number), hardware sets
the block-complete interrupt when the block transfer has completed. It then stalls
until the block-complete interrupt is cleared by software. If the next block is to be
the last block in the DMA transfer, then the block-complete ISR (interrupt service
routine) should clear the source reload bit, CFGx.RELOAD_SRC. This puts the
DMAC into Row 1, as shown in Table 54. If the next block is not the last block in
the DMA transfer, then the source reload bit should remain enabled to keep the
DMAC in Row 3, as shown in Table 54.
b) If interrupts are disabled (CTLx.INT_EN = 0) or the block-complete interrupt is
masked (MaskBlock[x] = 1’b0, where x is the channel number), then hardware
does not stall until it detects a write to the block-complete interrupt clear register;
instead, it starts the next block transfer immediately. In this case, software must
clear the source reload bit, CFGx.RELOAD_SRC, to put the device into Row 1 of
Table 54 before the last block of the DMA transfer has completed.
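The interaction between the block-complete ISR and the source reload bit in steps 6 and 7a can be sketched with a small software model. This is a toy simulation, not a register-accurate description of the DMAC: the class `Row3Channel`, the function `run_transfer`, the fixed `block_bytes` length, and the incrementing destination address are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Row3Channel:
    """Toy software model of one channel; not a register-accurate DMAC."""
    sar: int                   # reloaded at the end of every block
    dar: int                   # continues from where the last block ended
    block_bytes: int           # assumed fixed block length in bytes
    reload_src: bool = True    # stands in for CFGx.RELOAD_SRC
    enabled: bool = False      # stands in for ChEnReg.CH_EN
    raw_block: bool = False    # stands in for RawBlock[x]
    raw_tfr: bool = False      # stands in for RawTfr[x]
    blocks: List[Tuple[int, int]] = field(default_factory=list)

    def transfer_block(self) -> None:
        """Move one block, then sample the row as described in step 6."""
        self.blocks.append((self.sar, self.dar))
        self.dar += self.block_bytes   # contiguous, incrementing destination
        self.raw_block = True          # block-complete interrupt is set
        if not self.reload_src:        # Row 1: this was the last block
            self.raw_tfr = True
            self.enabled = False


def run_transfer(total_blocks: int, sar: int, dar: int,
                 block_bytes: int) -> Row3Channel:
    """Run a transfer with the block-complete interrupt unmasked (case 7a)."""
    if total_blocks < 1:
        raise ValueError("a transfer needs at least one block")
    # A single-block transfer starts in Row 1, so reload is never enabled.
    ch = Row3Channel(sar, dar, block_bytes, reload_src=total_blocks > 1)
    ch.enabled = True
    completed = 0
    while ch.enabled:
        ch.transfer_block()
        completed += 1
        if ch.enabled:
            # Block-complete ISR: the DMAC stalls until the interrupt is cleared.
            if completed == total_blocks - 1:
                ch.reload_src = False  # the next block is the last one
            ch.raw_block = False       # models the write to ClearBlock
    return ch
```

In this model the ISR clears the reload bit exactly once, while the DMAC is stalled after the second-to-last block, which is the condition that step 7a describes for moving the channel into Row 1.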
The transfer is similar to that shown in Figure 58.
Figure 58. Multi-block DMA transfer with source address auto-reloaded and
contiguous destination address
[Figure: the source blocks all start at the same address SAR on the source layer;
destination blocks Block0, Block1, and Block2 are placed contiguously at DAR(0), DAR(1),
and DAR(2) on the destination layer.]
The DMA transfer flow is shown in Figure 59.
Figure 59. DMA transfer flow for source address auto-reloaded and contiguous
destination address
[Flowchart: channel enabled by software; block transfer; reload SARx and CTLx.
Decision: is the DMAC in Row 1 of the “Programming of transfer types and channel
register update method” table? If yes, the DMAC transfer-complete interrupt is generated
and the channel is disabled by hardware. If no, a second decision checks whether
CTLx.INT_EN = 1 and MASKBLOCK[x] = 1: if yes, the block-complete interrupt is
generated and the DMAC stalls until the block interrupt is cleared by software; if no, the
flow continues without stalling.]
Multi-block DMA transfer with linked list for source and contiguous
destination address (Row 8)
Note: This type of multi-block transfer can only be enabled when either of the following
parameters is set:
● DMAH_CHx_MULTI_BLK_TYPE = 0
or
● DMAH_CHx_MULTI_BLK_TYPE = LLP_CONT
1. Read the Channel Enable register (ChEnReg) to choose a free (disabled) channel.
2. Set up the linked list in memory. Write the control information in the LLI.CTLx register
location of the block descriptor for each LLI in memory (see Figure 46) for channel x.
For example, in the register, you can program the following:
a) Set up the transfer type (memory or non-memory peripheral for source and
destination) and flow control device by programming the TT_FC field of the CTLx
register.
b) Set up the transfer characteristics, such as:
- Transfer width for the source in the SRC_TR_WIDTH field.
- Transfer width for the destination in the DST_TR_WIDTH field.
- Source master layer in the SMS field where the source resides.
- Destination master layer in the DMS field where the destination resides.
- Incrementing/decrementing or fixed address for the source in the SINC field.
- Incrementing/decrementing or fixed address for the destination in the DINC field.
3. Write the starting destination address in the DARx register for channel x.
Note: The values in the LLI.DARx register location of each Linked List Item (LLI) in memory,
although fetched during an LLI fetch, are not used.
4. Write the channel configuration information into the CFGx register for channel x.
a) Designate the handshaking interface type (hardware or software) for the source
and destination peripherals; this is not required for memory.
This step requires programming the HS_SEL_SRC/HS_SEL_DST bits. Writing a 0
activates the hardware handshaking interface to handle source/destination
requests for the specific channel. Writing a 1 activates the software handshaking
interface to handle source/destination requests.
b) If the hardware handshaking interface is activated for the source or destination
peripheral, assign the handshaking interface to the source and destination
peripherals. This requires programming the SRC_PER and DEST_PER bits,
respectively.
5. Ensure that all LLI.CTLx register locations of the LLI (except the last) are set as shown
in Row 8 of Table 54, while the LLI.CTLx register of the last Linked List item must be set
as described in Row 1 or Row 5 of Table 54. Figure 46 shows a Linked List example
with two list items.
6. Ensure that the LLI.LLPx register locations of all LLIs in memory (except the last) are
non-zero and point to the next Linked List Item.
7. Ensure that the LLI.SARx register location of all LLIs in memory point to the start
source block address preceding that LLI fetch.
8. If DMAH_CHx_CTL_WB_EN = True, ensure that the LLI.CTLx.DONE fields of the
LLI.CTLx register locations of all LLIs in memory are cleared.
9. If source status fetching is enabled (DMAH_CHx_CTL_WB_EN = True,
DMAH_CHx_STAT_SRC = True, and CFGx.SS_UPD_EN is enabled), program the
SSTATARx register so that the source status information can be fetched from the
location pointed to by SSTATARx. For conditions under which the source status
information is fetched from system memory, refer to the Write Back column of Table 54.
10. If destination status fetching is enabled (DMAH_CHx_CTL_WB_EN = True,
DMAH_CHx_STAT_DST = True, and CFGx.DS_UPD_EN is enabled), program the
DSTATARx register so that the destination status information can be fetched from the
location pointed to by the DSTATARx register. For conditions under which the
destination status information is fetched from system memory, refer to the Write Back
column of Table 54.
11. If gather is enabled (DMAH_CHx_SRC_GAT_EN = True and CTLx.SRC_GATHER_EN
is enabled), program the SGRx register for channel x.
12. If scatter is enabled (DMAH_CHx_DST_SCA_EN = True and
CTLx.DST_SCATTER_EN is enabled), program the DSRx register for channel x.
13. Clear any pending interrupts on the channel from the previous DMA transfer by writing
to the Interrupt Clear registers: ClearTfr, ClearBlock, ClearSrcTran, ClearDstTran, and
ClearErr. Reading the Interrupt Raw Status and Interrupt Status registers confirms that
all interrupts have been cleared.
14. Program the CTLx and CFGx registers according to Row 8, as shown in Table 54.
15. Program the LLPx register with LLPx(0), the pointer to the first Linked List item.
16. Finally, enable the channel by writing a 1 to the ChEnReg.CH_EN bit; the transfer is
performed. Ensure that bit 0 of the DmaCfgReg register is enabled.
17. The DMAC fetches the first LLI from the location pointed to by LLPx(0).
Note:
The LLI.SARx, LLI.DARx, LLI.LLPx, and LLI.CTLx registers are fetched. The LLI.DARx
register location of the LLI – although fetched – is not used. The DARx register in the DMAC
remains unchanged.
18. Source and destination request single and burst DMAC transactions to transfer the
block of data (assuming non-memory peripherals). The DMAC acknowledges at the
completion of every transaction (burst and single) in the block and carries out the block
transfer.
19. Once the block of data is transferred, the source status information is fetched from the
location pointed to by the SSTATARx register and stored in the SSTATx register if
DMAH_CHx_CTL_WB_EN = True, DMAH_CHx_STAT_SRC = True, and
CFGx.SS_UPD_EN is enabled. For conditions under which the source status
information is fetched from system memory, refer to the Write Back column of Table 54.
The destination status information is fetched from the location pointed to by the
DSTATARx register and stored in the DSTATx register if DMAH_CHx_CTL_WB_EN =
True, DMAH_CHx_STAT_DST = True, and CFGx.DS_UPD_EN is enabled. For
conditions under which the destination status information is fetched from system
memory, refer to the Write Back column of Table 54.
20. If DMAH_CHx_CTL_WB_EN = True, then the CTLx[63:32] register is written out to
system memory. For conditions under which the CTLx[63:32] register is written out to
system memory, refer to the Write Back column of Table 54.
The CTLx[63:32] register is written out to the same location on the same layer
(LLPx.LMS) where it was originally fetched; that is, the location of the CTLx register of
the linked list item fetched prior to the start of the block transfer. Only the second word
of the CTLx register is written out, CTLx[63:32], because only the CTLx.BLOCK_TS
and CTLx.DONE fields have been updated by hardware within the DMAC. Additionally,
the CTLx.DONE bit is asserted to indicate block completion. Therefore, software can
poll the LLI.CTLx.DONE bit field of the CTLx register in the LLI to ascertain when a
block transfer has completed.
Note:
Do not poll the CTLx.DONE bit in the DMAC memory map. Instead, poll the LLI.CTLx.DONE
bit in the LLI for that block. If the polled LLI.CTLx.DONE bit is asserted, then this block
transfer has completed. This LLI.CTLx.DONE bit was cleared at the start of the transfer
(Step 8).
21. The SSTATx register is now written out to system memory if
DMAH_CHx_CTL_WB_EN = True, DMAH_CHx_STAT_SRC = True, and
CFGx.SS_UPD_EN is enabled. It is written to the SSTATx register location of the LLI
pointed to by the previously saved LLPx.LOC register.
The DSTATx register is now written out to system memory if
DMAH_CHx_CTL_WB_EN = True, DMAH_CHx_STAT_DST = True, and
CFGx.DS_UPD_EN is enabled. It is written to the DSTATx register location of the LLI
pointed to by the previously saved LLPx.LOC register.
The end-of-block interrupt, int_block, is generated after the write-back of the control
and status registers has completed.
Note:
The write-back location for the control and status registers is the LLI pointed to by the
previous value of the LLPx.LOC register, not the LLI pointed to by the current value of the
LLPx.LOC register.
22. The DMAC does not wait for the block interrupt to be cleared, but continues and fetches
the next LLI from the memory location pointed to by the current LLPx register and
automatically reprograms the SARx, CTLx, and LLPx channel registers. The DARx
register is left unchanged. The DMA transfer continues until the DMAC samples that
the CTLx and LLPx registers at the end of a block transfer match those described in
Row 1 or Row 5 of the “CTLx.SRC_MSIZE and DST_MSIZE decoding” table(1). The
DMAC then knows that the previously transferred block was the last block in the DMA
transfer.
The DMAC transfer might look like that shown in Figure 60. Note that the destination
address is decrementing.
1. For this table, refer to DMAC chapter in RM0089, Reference manual, SPEAr1340 address map and registers.
Figure 60. Multi-block DMA transfer with linked list source address and contiguous
destination address
[Figure: source blocks Block0, Block1, and Block2 start at SAR(0), SAR(1), and SAR(2) on
the source layer; destination blocks Block0, Block1, and Block2 are placed contiguously at
DAR(0), DAR(1), and DAR(2) on the destination layer.]
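The linked list requirements in steps 5 to 8 can be summarized with a small builder. This is a minimal sketch under stated assumptions: the `Lli` class, `build_row8_chain`, and the `lli_stride` spacing are hypothetical, the `row` field is only a symbolic tag for the Table 54 row that the real LLI.CTLx and LLI.LLPx values would encode, and the last item is assumed to use Row 1 with a zero LLPx (Row 5 is the other permitted choice).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Lli:
    """Hypothetical linked list item, reduced to the fields used here."""
    address: int   # where this LLI is stored in memory
    sar: int       # LLI.SARx: start address of the source block
    llp: int       # LLI.LLPx: address of the next LLI
    row: int       # Table 54 row that LLI.CTLx/LLI.LLPx are programmed for
    done: bool     # LLI.CTLx.DONE, cleared before the transfer starts


def build_row8_chain(lli_base: int, lli_stride: int,
                     src_blocks: List[int]) -> List[Lli]:
    """Build one LLI per source block, linked in order."""
    if not src_blocks:
        raise ValueError("at least one source block is required")
    last = len(src_blocks) - 1
    chain = []
    for n, sar in enumerate(src_blocks):
        is_last = n == last
        chain.append(Lli(
            address=lli_base + n * lli_stride,
            sar=sar,
            # All items except the last point to the next LLI (step 6).
            llp=0 if is_last else lli_base + (n + 1) * lli_stride,
            # All items except the last are Row 8; the last is Row 1 here.
            row=1 if is_last else 8,
            done=False,   # step 8
        ))
    return chain
```

The address of the first item, `chain[0].address`, is the value that would be written to LLPx as LLPx(0) in step 15.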
The DMA transfer flow is shown in Figure 61.
Figure 61. DMA transfer for linked list source address and contiguous destination
address
[Flowchart: channel enabled by software; LLI fetch; hardware reprograms SARx, CTLx,
and LLPx; DMAC block transfer; source/destination status fetch; write-back of control and
source/destination status to LLI; block-complete interrupt generated. Decision: is the
DMAC in Row 1 of the “Programming of transfer types and channel register update
method” table? If no, the flow continues with the next block. If yes, the DMAC
transfer-complete interrupt is generated and the channel is disabled by hardware.]
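Polling the LLI.CTLx.DONE bit in memory, as recommended in the note to step 20, might look like the sketch below. The reader callback, the function name, and the poll limit are hypothetical; the position of DONE within CTLx[63:32] is an assumption that should be checked against RM0089.

```python
from typing import Callable

# Assumption: DONE is bit 12 of CTLx[63:32], that is, bit 44 of CTLx.
CTL_HI_DONE = 1 << 12


def wait_for_block_done(read_lli_ctl_hi: Callable[[], int],
                        max_polls: int = 1000) -> bool:
    """Poll the upper CTLx word stored in the LLI until DONE is set.

    read_lli_ctl_hi must read the CTLx[63:32] location of the LLI in system
    memory, not the CTLx register in the DMAC memory map.
    Returns True once DONE is seen, or False after max_polls reads.
    """
    for _ in range(max_polls):
        if read_lli_ctl_hi() & CTL_HI_DONE:
            return True
    return False
```

Bounding the loop with `max_polls` is a design choice of this sketch, so that a block that never completes cannot hang the caller.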
12.6.4 Disabling a channel prior to transfer completion
Under normal operation, software enables a channel by writing a 1 to the channel enable
register, ChEnReg.CH_EN, and hardware disables a channel on transfer completion by
clearing the ChEnReg.CH_EN register bit.
The recommended way for software to disable a channel without losing data is to use the
CH_SUSP bit in conjunction with the FIFO_EMPTY bit in the Channel Configuration
Register (CFGx).
1. If software wishes to disable a channel prior to the DMA transfer completion, then it can
set the CFGx.CH_SUSP bit to tell the DMAC to halt all transfers from the source
peripheral. Therefore, the channel FIFO receives no new data.
2. Software can now poll the CFGx.FIFO_EMPTY bit until it indicates that the channel
FIFO is empty.
3. The ChEnReg.CH_EN bit can then be cleared by software once the channel FIFO is
empty.
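The three steps above can be sketched as a short routine. The `FakeChannel` class is a hypothetical stand-in for real register accesses, and its behavior (one FIFO entry drained per poll once suspended) is an assumption chosen only to make the example self-contained.

```python
class FakeChannel:
    """Hypothetical stand-in for CFGx and ChEnReg accesses on one channel."""

    def __init__(self, fifo_entries: int) -> None:
        self.fifo_entries = fifo_entries
        self.suspended = False   # models CFGx.CH_SUSP
        self.enabled = True      # models ChEnReg.CH_EN

    def set_ch_susp(self) -> None:
        self.suspended = True

    def read_fifo_empty(self) -> bool:
        # Once suspended, the FIFO only drains; model one entry per poll.
        if self.suspended and self.fifo_entries > 0:
            self.fifo_entries -= 1
        return self.fifo_entries == 0

    def clear_ch_en(self) -> None:
        self.enabled = False


def disable_without_data_loss(ch: FakeChannel, max_polls: int = 1000) -> bool:
    """Suspend, wait for the FIFO to drain, then disable the channel."""
    ch.set_ch_susp()                 # step 1: stop fetching from the source
    for _ in range(max_polls):
        if ch.read_fifo_empty():     # step 2: poll CFGx.FIFO_EMPTY
            ch.clear_ch_en()         # step 3: safe to clear CH_EN
            return True
    return False                     # channel left suspended and enabled
```

If the poll limit is reached, the sketch leaves the channel suspended rather than disabling it, so no FIFO data is discarded by the routine itself.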
When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is
high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not permit a
single word of CTLx.DST_TR_WIDTH to be formed. However, there may still be data in the
channel FIFO, but not enough to form a single transfer of CTLx.DST_TR_WIDTH. In this
scenario, once the channel is disabled, the remaining data in the channel FIFO is not
transferred to the destination peripheral.
It is permissible to remove the channel from the suspension state by writing a 0 to the
CFGx.CH_SUSP bit. The DMA transfer then completes in the normal manner.
Note:
If a channel is disabled by software, an active single or burst transaction is not guaranteed to
receive an acknowledgement. If the DMAC is configured to use defined length bursts
(DMAH_INCR_BURSTS = 0), disabling the channel via software prior to completing a
transfer is not supported.
Abnormal transfer termination
A DMA transfer may be terminated abruptly by software by clearing the channel
enable bit, ChEnReg.CH_EN. You must not assume that the channel is disabled
immediately after the ChEnReg.CH_EN bit is cleared over the AHB slave interface.
Consider this as a request to disable the channel. You must poll ChEnReg.CH_EN and
confirm that the channel is disabled by reading back 0. A case where the channel is not
disabled after a channel disable request is where either the source or destination has
received a split or retry response. The DMAC must keep re-attempting the transfer to the
system HADDR that originally received the split or retry response until an OKAY response is
returned; to do otherwise is an AMBA protocol violation.
Software may terminate all channels abruptly by clearing the global enable bit in the DMAC
Configuration Register (DmaCfgReg[0]). Again, you must not assume that all channels are
disabled immediately after the DmaCfgReg[0] is cleared over the AHB slave interface.
Consider this as a request to disable all channels. You must poll ChEnReg and confirm that
all channels are disabled by reading back 0.
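The request-then-confirm pattern described above can be sketched as follows. The callbacks, the `SlowChannel` model, and the use of `TimeoutError` are assumptions made for this example; a real driver would perform the register accesses over the AHB slave interface.

```python
from typing import Callable


def request_channel_disable(clear_ch_en: Callable[[], None],
                            read_ch_en: Callable[[], int],
                            max_polls: int = 1000) -> int:
    """Clear CH_EN, then poll until it reads back 0.

    Returns the number of reads that were needed.
    """
    clear_ch_en()   # this is only a request to disable the channel
    for polls in range(1, max_polls + 1):
        if read_ch_en() == 0:
            return polls
    raise TimeoutError("channel still enabled; a split or retry may be pending")


class SlowChannel:
    """Hypothetical channel that stays enabled for a few reads after the request."""

    def __init__(self, busy_reads: int) -> None:
        self.busy_reads = busy_reads
        self.disable_requested = False

    def clear_ch_en(self) -> None:
        self.disable_requested = True

    def read_ch_en(self) -> int:
        if not self.disable_requested:
            return 1
        if self.busy_reads > 0:
            self.busy_reads -= 1   # models a pending split or retry response
            return 1
        return 0
```

The same loop shape applies to the global case, with the write going to DmaCfgReg[0] and the read-back covering all channel bits of ChEnReg.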
Note: If the channel enable bit is cleared while there is data in the channel FIFO, this data
is not sent to the destination peripheral and is not present when the channel is re-enabled.
For read-sensitive source peripherals, such as a source FIFO, this data is therefore lost.
When the source is not a read-sensitive device (such as memory), disabling a channel
without waiting for the channel FIFO to empty may be acceptable, since the data is
available from the source peripheral upon request and is not lost.
If a channel is disabled by software, an active single or burst transaction is not guaranteed to
receive an acknowledgement.
If the DMAC is configured to use defined length bursts (DMAH_INCR_BURSTS = 0),
disabling the channel via software prior to completing a transfer is not supported.
12.6.5 Defined-length burst support on DMAC
By default, the DMAC supports incremental (INCR) bursts only. To achieve better
performance, defined length bursts, such as INCR4, INCR8 and INCR16 are required. The
DMAC can be configured to use defined-length bursts by setting the configuration
parameter DMAH_INCR_BURSTS to 0. In this mode, the DMAC will select the largest valid
defined-length burst to complete the transfer.
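To make the idea of selecting the largest valid defined-length burst concrete, the sketch below splits a number of beats into INCR16, INCR8, and INCR4 bursts. It is an illustration only: the greedy strategy, the function name, and the use of single transfers for any remainder are assumptions, and the model ignores address-boundary and configuration constraints that real hardware has to respect.

```python
from typing import List


def split_into_bursts(beats: int) -> List[str]:
    """Greedily cover a number of beats with the largest bursts first."""
    if beats < 0:
        raise ValueError("beats must not be negative")
    bursts: List[str] = []
    for size in (16, 8, 4):
        while beats >= size:
            bursts.append(f"INCR{size}")
            beats -= size
    # Assumption: whatever is left is moved as single transfers.
    bursts.extend(["SINGLE"] * beats)
    return bursts
```

For example, 29 beats are covered by one burst of each defined length plus one single transfer.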
13 Cryptographic co-processor (C3)
This chapter focuses on C3 functionality and operation.
For the C3 feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
13.1 Overview
C3 is a set of macro-functions (channels) controlled by two instruction dispatchers.
Instruction flows are created and stored in memory by the host processor; they are then
read from memory and dispatched to the appropriate channels.
Figure 62 shows the C3 block diagram.
Figure 62. C3 block diagram
[Figure: block diagram showing the instruction dispatcher (ID) with its IRQ output; the
DES/3DES, AES, UHH, UHH2, PKA, Move, and RNG channels; the coupling and chaining
module (CCM); the AHB Master interface (HIF) on the initiator bus (M); the RAM Buffer
(MEMORY); the AHB Slave interface (SIF) on the target bus (S); and the System registers
(SYS).]
13.2 Functional description
The main blocks of C3 are described in the following sections.
13.2.1 AHB Master Interface (HIF)
The Master interface (HIF) interfaces channels and instruction dispatchers (ID) to the
initiator bus (AMBA AHB) and to the internal RAM Buffer (MEMORY). The purpose of the
HIF is to allow read and write accesses generated by channels and instruction dispatchers
to be transferred to the initiator bus or to the internal memory. An arbiter in the HIF prevents
data access collisions from occurring. ID0 has the highest priority to perform accesses on
this block, followed in order by ID1 and Channels #0 to #7 (lowest priority). Read transfers
have higher priority than write transfers.
Every module attached to the HIF receives its own bus error signal. This signal is set by the
HIF if a bus error condition is detected for a bus transaction initiated by the corresponding
module.
The HIF is able to route requests to the internal memory instead of the bus. The HIF is also
able to route write requests to a byte bucket (data written there is thrown away).
Transactions can simultaneously occur on the bus, the internal memory and the byte bucket.
Before the internal memory or the byte bucket can be used, the HIF must be programmed
with the base address of the transactions that are to target the internal memory or the byte
bucket instead of the bus.
To program the memory and byte bucket base addresses, you must configure the related C3
HIF registers (see RM0089, Reference manual, SPEAr1340 address map and registers).
Write transaction requests coming from IDs or channels that are within an address window
of 64 KB starting from the programmed byte bucket base address will be routed to the byte
bucket. This means that everything written to this address window is thrown away. Read
transactions from this address window are not affected by the byte bucket: they are normally
routed either to the internal memory or to the bus.
Transaction requests coming from IDs or channels that are within an address window of 64
KB starting from the programmed memory base address will be routed to the internal
memory. Higher addresses of the internal memory window are aliased if the internal
memory is smaller than 64 KB.
A burst transaction always completes on the initial target even if addresses span two
different targets.
The Move channel can be used to transfer data between the internal memory and the bus,
in either direction. The content of the internal memory is undefined at startup or after an
asynchronous master reset.
The byte bucket has priority if both the byte bucket base address and the memory base
address are programmed with the same addresses.
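The routing rules described in this section can be sketched as a small decision function. This is a simplified model under stated assumptions: the function name and return format are invented for illustration, the internal memory size of 16 KB is taken from the RAM buffer description, aliasing is modeled as a simple modulo on the offset, and the byte bucket is given priority for any write that falls in both windows, although the text only states this for identical base addresses.

```python
from typing import Optional, Tuple

WINDOW_BYTES = 64 * 1024   # size of each HIF address window
MEMORY_BYTES = 16 * 1024   # internal RAM buffer size


def route(addr: int, is_write: bool, memory_base: int,
          bucket_base: int) -> Tuple[str, Optional[int]]:
    """Return the target of a transaction and the address seen by that target."""
    in_bucket = bucket_base <= addr < bucket_base + WINDOW_BYTES
    in_memory = memory_base <= addr < memory_base + WINDOW_BYTES
    if is_write and in_bucket:
        # Writes into the byte bucket window are thrown away.
        return ("byte_bucket", None)
    if in_memory:
        # Assumed aliasing: the 16 KB memory repeats within the 64 KB window.
        return ("memory", (addr - memory_base) % MEMORY_BYTES)
    # Everything else, including reads from the bucket window, goes to the bus.
    return ("bus", addr)
```

The model treats each transaction independently; it does not represent the rule that a burst always completes on its initial target.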
The internal memory content can also be accessed from the AHB Slave interface (SIF). The
internal memory can be accessed by an ID or channel and simultaneously from the AHB
slave interface (SIF).
13.2.2 C3 RAM Buffer (MEMORY)
The AHB Master interface is able to route requests to an internal memory instead of the
bus. The internal RAM is 16 KB in size, organized as 4096 words of 32 bits each.
13.2.3 Instruction dispatching subsystem (IDS)
The IDS is a structural block that instantiates up to 4 instruction dispatchers (ID) and an
instruction dispatcher multiplexer (IDM). The OR logic port is drawn at the IDS hierarchical
level for simplicity, although this logic is actually located in the IDM. The IDs are connected
in a daisy chain to propagate information about channel and lane signal allocation. IDs can
be replaced with zero logic blocks if not all instruction dispatchers are needed. Each ID
interfaces to the HIF to fetch instructions, to the SIF to allow access to its registers, to the
CCM to send coupling/chaining commands, to the SYS to communicate interrupt states, and
indirectly to the channels (via the IDM) to forward (dispatch) instructions.
SPEAr1340 implements 2 instruction dispatchers: ID0 and ID1. ID2 and ID3 are not
available.
Instruction dispatcher (ID)
An ID requests instructions from the HIF to fill an instruction queue. It determines which
channel each instruction must be dispatched to by decoding the upper bits of the first word
of every instruction. If the target of an instruction is channel 0 and the instruction decodes
to a flow type instruction (NOP, NEXT, STOP, COUPLE, UNCOUPLE or WAIT), the
instruction is not dispatched: it is executed by the ID.
See Section 13.3: Operation for details about the encodings.
Channel selection
An ID must allocate a channel to take ownership of it. An ID is not allowed to dispatch
instructions to a channel without having allocated it. This guarantees that only one
ID at a time has control of a given channel. When a channel is allocated by an ID, it
receives a select signal.
An ID enters the error state if it tries to allocate a channel that is already allocated or if the
channel is not in the idle state.
Two IDs could simultaneously allocate a channel. This situation remains unnoticed by
them until they start dispatching instructions.
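The ownership rules above can be sketched with a small bookkeeping model. The `ChannelTable` class and the `IdError` exception are hypothetical names for this illustration; raising an exception stands in for an ID entering the error state, and the model processes requests one at a time, so it does not reproduce the simultaneous allocation case.

```python
from typing import List, Optional


class IdError(Exception):
    """Stands in for an instruction dispatcher entering the error state."""


class ChannelTable:
    """Toy model of channel ownership for channels #0 to #7."""

    def __init__(self, channels: int = 8) -> None:
        self.owner: List[Optional[int]] = [None] * channels
        self.idle: List[bool] = [True] * channels

    def allocate(self, id_number: int, channel: int) -> None:
        """Give ownership of a channel to an ID if it is free and idle."""
        if self.owner[channel] is not None:
            raise IdError(f"channel {channel} is already allocated")
        if not self.idle[channel]:
            raise IdError(f"channel {channel} is not idle")
        self.owner[channel] = id_number

    def may_dispatch(self, id_number: int, channel: int) -> bool:
        """An ID may only dispatch to a channel that it has allocated."""
        return self.owner[channel] == id_number
```

A dispatcher model would call `allocate` before its first dispatch to a channel and check `may_dispatch` for each instruction it forwards.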
Instruction dispatching
Once channels are allocated by using the above described signals, the dispatching of
instructions can begin. Typically, the first word of an instruction is dispatched simultaneously
with the channel selection. Each instruction is composed of up to four 32-bit words. Words of
multi-word instructions are dispatched sequentially to a channel using lanes.
The instruction dispatcher multiplexer (IDM) multiplexes lanes coming from IDs to drive the
final four lanes that are shared by all channels.
When a channel is allocated by an ID, the ID continuously monitors its state (CSTAT). If the
channel reports an error, the ID also enters the error state, aborting the current dispatching
and stopping program execution.
13.2.4 Couple and chaining module (CCM)
This is a switch matrix that can be used to interconnect two channels in a master/slave
mode. This block is controlled by the IDs when they execute COUPLE and UNCOUPLE
instructions. The maximum number of channel pairs that can be simultaneously
interconnected corresponds to the number of CCM data paths: one CCM data path is used
for each master/slave interconnection. Channels can be cascaded (a channel can
simultaneously be a master and a slave).
13.2.5 AHB Slave interface (SIF)
Most C3 blocks have configuration and/or status registers that can be accessed using the
AHB slave interface. The SIF bridges AHB requests to a simpler set of signals for the
different C3 blocks that need to map registers in the AHB address space. The SIF also
decodes AHB addresses in order to select the correct C3 internal block. The SIF
interfaces to modules using a single clock data transfer protocol to keep latencies on the
AHB bus to a minimum. There is, however, a feature available to modules that permits them
to introduce wait states in read cycles.
13.2.6 System registers (SYS)
The SYS block implements the system registers as described in RM0089, Reference
manual, SPEAr1340 address map and registers. It collects status information about
channels and instruction dispatchers. This information can be read through the slave
interface. The SYS block is able to acknowledge ID interrupts and it is able to issue an
asynchronous reset command to the MRGEN block.
13.2.7 Reset logic (MRGEN)
This module drives the asynchronous reset buffer tree of the C3. It receives its input from
two sources: asynchronously from the system (top-level HRESET_n port) and
synchronously from the SYS module. The MRGEN module has two purposes: to
synchronize the release of the external reset and to permit self-reset (software reset) of the
C3.
13.2.8 Channels
SPEAr1340 implements the following channels:
● Channel 0: MOVE channel
● Channel 1: DES/3DES channel
● Channel 2: AES channel
● Channel 3: UHH channel
● Channel 4: UHH2 channel
● Channel 5: PKA channel
● Channel 6: RNG channel
● Channel 7: empty
The features of each channel are described in Doc ID 023063, Data sheet, SPEAr1340,
Dual-core Cortex A9 HMI embedded MPU.
The instruction set for each channel is described in Section 13.3: Operation.
Each channel has four components. Two of them are common to all channels (CB_IF and
CPIF) and the other two (CU and CB) are specific to the channel. A channel is connected to
the Instruction dispatching subsystem (IDS), to the HIF, to the CCM and to the SIF.
Figure 63 shows a typical C3 channel architecture. The main components are described
below.
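Figure 63. C3 channel architecture
[Figure: a channel connected to the instruction dispatcher (ID), the coupling/chaining
module (CCM), the HIF, and the target bus. Inside the channel: the block control unit (CU),
the core block (CB), and the core block interface (CB_IF) with its two FIFOs.]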
The components of a channel are:
● Control unit (CU): This block decodes instructions coming from the IDS and, in
accordance with them, configures the core block (CB) and, if necessary, activates it so
that it begins data processing. The CU interfaces the FIFOs in the CB_IF block to the
HIF and maintains the source, destination and count registers. The CU also drives a
set of multiplexers inside the CB_IF for coupling and chaining operations driven by the
CCM. The CU is tailored to the needs of each channel, but most of its logic can be
implemented by copying it from an existing channel. The MOVE channel may be used
as a template to build new channels.
● Core block interface (CB_IF): This block contains the FIFOs of the read and write
paths of the channel. It also contains most of the coupling and chaining logic. The CB_IF
creates an interface from the rich and complex signal set of the CU to the much simpler
CB interface. The CB_IF is the same for every channel; only the size of its FIFOs can
be changed, by modifying an RTL parameter.
● Core block (CB): It performs the main task of the channel, that is, functional data
processing.
● CPIF: Channel interface to the SIF for AHB slave register access. Both the CU and the
CB have an interface for register access. The CPIF does the address decoding and
output signal multiplexing and interfaces to the main AHB slave interface, the SIF. The
CPIF also stores the Channel ID value (see registers description), which is configured
using an RTL parameter. The CPIF is the same for every channel.
13.3
Operation
This section describes the instruction encoding for the generic flow type and for each
channel.
In the following subsections:
1. [x] denotes the value of the Additional Instruction Words field for this instruction.
2. (x-y) denotes the acceptable values for this field.
3. Unused fields must be zero.
4. The opcodes for module 0 are shared between the Instruction Dispatchers and the
channel 0 module. Two opcodes are available for channel 0 operations (bits 25-23 = 4
or 5). For example, the move channel can be allocated to channel 0 and the two move
operations can be encoded in the operation field.
5. xxxx stands for "don't care".
13.3.1
Generic flow type instructions
This section specifies the encoding of the flow type instructions.
Flow type instructions are decoded by the C3 dispatchers when the module number (bits 31-28)
is 0. Any other module number causes the instruction and its arguments to be dispatched
to the channel associated with that number.
Bits 31-28 -> Module Number (0)
Bits 27-26 -> Additional Instruction Words (0-3)
Bits 25-23 -> Operation (0-7)
0 = STOP [0]
    If and only if bits 27-26 = 1 (one additional word), the status register content is
    written to memory (at the address pointed to by this additional word) when the STOP
    instruction is executed.
    Bits 22-0 -> unused
1 = WAIT [0]
    Bits 22-16 -> unused
    Bits 15-0 -> Number of clock cycles to wait (0-65535)
2 = NEXT_Inst_List (*list_start) [1]
    Bits 22-0 -> unused
    *list_start -> 32-bit pointer to start of next instruction list
3 = NOP
    Bits 22-0 -> unused
4 = CHANNEL 0 SPECIFIC OPCODE (see Section 13.3.2: Move channel instruction set)
5 = CHANNEL 0 SPECIFIC OPCODE (see Section 13.3.2: Move channel instruction set)
6 = COUPLE [0]
    Bits 22-19 -> Master device (0-15)
    Bit 18 -> Coupling/Chaining selection
        0 = couple master device inputs (coupling)
        1 = couple master device outputs (chaining)
    Bits 17-14 -> Slave device (0-15)
        (This value should correspond to the Module Number)
    Bits 13-11 -> Coupling/Chaining Path Number
    Bits 10-0 -> unused
7 = UNCOUPLE [0]
    Bits 22-14 -> unused
    Bits 13-11 -> Coupling/Chaining Path Number
    Bits 10-0 -> unused
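The field layout above can be turned into a small word builder. The following C sketch packs the first word of a flow type instruction from the fields listed in this section; the function and enumerator names are hypothetical and only illustrate the bit positions.

```c
#include <assert.h>
#include <stdint.h>

/* Operation codes (bits 25-23) of the generic flow type instructions. */
enum c3_flow_op {
    C3_OP_STOP = 0, C3_OP_WAIT = 1, C3_OP_NEXT = 2, C3_OP_NOP = 3,
    C3_OP_COUPLE = 6, C3_OP_UNCOUPLE = 7
};

/* First word of a flow type instruction: module number 0 in bits 31-28,
 * additional word count in bits 27-26, operation in bits 25-23. */
static inline uint32_t c3_flow_word0(unsigned extra_words, enum c3_flow_op op,
                                     uint32_t operand)
{
    assert(extra_words <= 3u);
    assert(operand <= 0x7FFFFFu);            /* bits 22-0 */
    return ((uint32_t)extra_words << 26) | ((uint32_t)op << 23) | operand;
}

/* WAIT: bits 15-0 hold the number of clock cycles. */
static inline uint32_t c3_wait(unsigned cycles)
{
    assert(cycles <= 0xFFFFu);
    return c3_flow_word0(0u, C3_OP_WAIT, cycles);
}

/* COUPLE: master in bits 22-19, selection in bit 18, slave in bits 17-14,
 * path number in bits 13-11. */
static inline uint32_t c3_couple(unsigned master, int chaining,
                                 unsigned slave, unsigned path)
{
    assert(master <= 15u && slave <= 15u && path <= 7u);
    return c3_flow_word0(0u, C3_OP_COUPLE,
                         ((uint32_t)master << 19) |
                         ((uint32_t)(chaining != 0) << 18) |
                         ((uint32_t)slave << 14) |
                         ((uint32_t)path << 11));
}
```

A NEXT_Inst_List instruction would be built as c3_flow_word0(1, C3_OP_NEXT, 0), followed by the 32-bit list pointer as the additional word.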
13.3.2
Move channel instruction set
The Move channel executes MOVE_INIT and MOVE_DATA instructions as specified below.
Bits 31-28 -> Module Number (0)
    (Assumes that the Move channel is in the Channel 0 instruction space)
Bits 27-26 -> Additional Instruction Words (0-3)
Bits 25-23 -> Operation (0-7)
0-3 = Not used [0]
4 = MOVE_Init (data) [1]
    Bits 22-0 -> unused
    data -> 32-bit mask for logical operations
5 = MOVE_Data (len, *src, *dest) [2]
    Bits 22-21 -> Logical operation
        0 = no operation
        1 = logical AND
        2 = logical OR
        3 = logical XOR
    Bits 20-16 -> unused
    Bits 15-0 -> Length of block to move (0-65535)
    *src -> 32-bit pointer to start of source data
    *dest -> 32-bit pointer to destination address
6-7 = Not used
The Move channel also supports a slight variation of the MOVE_INIT instruction that
accepts the function parameter used to set the operator (bits nn in Table 55). Instructions
that do not conform to the bit encodings given in this section are
unknown to the Move channel and cause it to enter the error state.
MOVE_INIT instruction
The MOVE_INIT instruction is 2 words long. This instruction is used to set the Function and
Operand of the Move channel. The function is encoded in the first instruction word whereas
the operand is represented by the second instruction word.
Table 55. MOVE_INIT bit encoding
W#  Bit encoding
1   0000 0110 0nn0 0000 xxxx xxxx xxxx xxxx
2   (32 bit operand)
Bits nn in Table 55 set the function of the Move channel:
Table 56. MOVE_INIT bits nn definition
Bits 22, 21 (nn)  Function
00                null
01                AND
10                OR
11                XOR
MOVE_DATA instruction
The MOVE_DATA instruction is 3 words long. This instruction is used to set the Source
Register, the Destination Register and the Count Register of the Move Channel (values of the
MOVE_SRCR, MOVE_DSTR and MOVE_CNTR registers) and, if the Count is non-zero, to start a
copy operation. The Function and the Count are encoded in the first instruction word, the
second word represents the Source Address and the third word represents the Destination
Address.
Table 57. MOVE_DATA bit encoding
W#  Bit encoding
1   0000 1010 1nn0 0000 cccc cccc cccc cccc
2   (32 bit Source Address)
3   (32 bit Destination Address)
Bits nn in the above table set the Function of the Move Channel and have the
same encoding as in the MOVE_INIT instruction.
Bits 15 to 0 in the first instruction word (cccc in the above table) represent the Count in Bytes
of data to be copied. If the Count is different from zero, the Move Channel begins to copy
data. The Count must be a multiple of 4 Bytes, and the Source and Destination Addresses must
be 32-bit aligned; otherwise the Move Channel goes into the error state.
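As an illustration of Table 55 and Table 57, the sketch below builds MOVE_INIT and MOVE_DATA instruction words and checks the count and alignment constraints stated above before encoding. The function names are hypothetical; only the bit patterns and the constraints come from this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Function field nn, as defined for MOVE_INIT and reused by MOVE_DATA. */
enum move_fn { MOVE_FN_NULL = 0, MOVE_FN_AND = 1, MOVE_FN_OR = 2, MOVE_FN_XOR = 3 };

/* Count must be a multiple of 4 bytes; both addresses must be 32-bit aligned. */
static inline bool move_data_args_ok(uint32_t count, uint32_t src, uint32_t dst)
{
    return count <= 0xFFFFu && (count % 4u) == 0u &&
           (src % 4u) == 0u && (dst % 4u) == 0u;
}

/* MOVE_INIT: 0000 0110 0nn0 0000 xxxx xxxx xxxx xxxx, then the operand. */
static inline void move_init_encode(enum move_fn fn, uint32_t operand,
                                    uint32_t out[2])
{
    assert(out != 0);
    out[0] = 0x06000000u | ((uint32_t)fn << 21);
    out[1] = operand;
}

/* MOVE_DATA: 0000 1010 1nn0 0000 cccc cccc cccc cccc, then source and
 * destination addresses. Returns false instead of encoding a request
 * that would put the channel into the error state. */
static inline bool move_data_encode(enum move_fn fn, uint32_t count,
                                    uint32_t src, uint32_t dst, uint32_t out[3])
{
    assert(out != 0);
    if (!move_data_args_ok(count, src, dst))
        return false;
    out[0] = 0x0A800000u | ((uint32_t)fn << 21) | count;
    out[1] = src;
    out[2] = dst;
    return true;
}
```

Rejecting bad arguments in software is a design choice of this sketch; the hardware itself reports such requests through its error state.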
13.3.3
DES/3DES channel instruction set
This channel can compute DES and 3DES encryption and decryption in ECB and CBC
mode by executing DES START and APPEND instructions. Instructions that do not conform
to the following bit encodings or to the generic flow type instructions are unknown to the
DES/3DES channel, which then goes into the error state.
There are 2 different DES instructions:
●
DES START: used for setting the operation parameters, such as the key and the
initialization vector.
●
DES APPEND: used for passing the data to encrypt or decrypt.
DES START instruction
The DES START instruction can be applied with 2 different modes of operation:
●
ECB
●
CBC
ECB
The DES START ECB instruction is 2 words long. This instruction is used to set the key for
the following operations. The length of the key is encoded in the first instruction word, while
the second word represents the Source Address for the key.
Table 58. DES START ECB bit encoding
W#  Bit encoding
1   0001 01ab 000x xxxx cccc cccc cccc cccc
2   (32 bit Source Address for the key)
Bit a in Table 58 is used to set the algorithm to use:
Table 59. Bit a definition
Bit 25 (a)  Operation
0           DES
1           3DES
Bit b in Table 58 is used to set the operation to perform:
Table 60. Bit b definition
Bit 24 (b)  Operation
0           Encryption
1           Decryption
Bits 15 to 0 in the first instruction word (cccc in Table 58) represent the length in Bytes of the
key.
CBC
The DES START CBC instruction is 3 words long. This instruction is used to set the key and
the initialization vector for the following operations. The length of the key is encoded in the
first instruction word, the second word represents the Source Address for the key and the
third word represents the Source Address for the Initialization Vector (IV).
Table 61. DES START CBC bit encoding
W#  Bit encoding
1   0001 10ab 001x xxxx cccc cccc cccc cccc
2   (32 bit Source Address for the key)
3   (32 bit Source Address for the IV)
Bits a and b in the above table are used to set the algorithm and the operation to perform
and have the same encoding as in the ECB instruction. Bits 15 to 0 in the first instruction
word (cccc in the above table) represent the length in Bytes of the key.
DES APPEND instruction
The DES APPEND instruction can be applied with 2 different modes of operation:
●
ECB
●
CBC
ECB
The DES APPEND ECB instruction is 3 words long. This instruction is used for passing the
data to process (encrypt or decrypt). The length of the data to process is encoded in the first
instruction word, the second word represents the Source Address and the third word
represents the Destination Address.
Table 62. DES APPEND ECB bit encoding
W#  Bit encoding
1   0001 10ab 100x xxxx cccc cccc cccc cccc
2   (32 bit Source Address for the data)
3   (32 bit Destination Address for the data)
Bit a in the above table is used to set the algorithm to use, while bit b is used to set the
operation to perform (see Table 59 and Table 60). Bits 15 to 0 in the first instruction word
(cccc in the above table) represent the length in Bytes of the data to process.
CBC
The DES APPEND CBC instruction is 3 words long. This instruction is used for passing the
data to process (encrypt or decrypt). The length of the data to process is encoded in the first
instruction word, the second word represents the Source Address and the third word
represents the Destination Address.
Table 63. DES APPEND CBC bit encoding
W#  Bit encoding
1   0001 10ab 101x xxxx cccc cccc cccc cccc
2   (32 bit Source Address for the data)
3   (32 bit Destination Address for the data)
Bits a and b in the above table are used to set the algorithm and the operation to perform
and have the same encoding as in the ECB instruction. Bits 15 to 0 in the first instruction
word (cccc in the above table) represent the length in Bytes of the data to process.
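Tables 58 and 61 to 63 share one layout: bit 23 separates START (0) from APPEND (1) and bit 21 separates ECB (0) from CBC (1). The following C sketch encodes the four instructions on that basis. The names are hypothetical, and the reading of bits 23 and 21 is inferred from the four bit patterns rather than stated explicitly in the text.

```c
#include <assert.h>
#include <stdint.h>

enum des_alg  { DES_ALG_DES = 0, DES_ALG_3DES = 1 };   /* bit 25 (a) */
enum des_dir  { DES_ENCRYPT = 0, DES_DECRYPT = 1 };    /* bit 24 (b) */
enum des_mode { DES_MODE_ECB = 0, DES_MODE_CBC = 1 };  /* bit 21 */

/* DES START: 2 words in ECB (key pointer), 3 words in CBC (key and IV
 * pointers). Returns the number of words written to out. */
static inline unsigned des_start_encode(enum des_alg a, enum des_dir b,
                                        enum des_mode m, uint16_t key_len,
                                        uint32_t key, uint32_t iv,
                                        uint32_t out[3])
{
    unsigned extra = (m == DES_MODE_CBC) ? 2u : 1u;

    assert(out != 0);
    out[0] = (1u << 28) | ((uint32_t)extra << 26) | ((uint32_t)a << 25) |
             ((uint32_t)b << 24) | ((uint32_t)m << 21) | key_len;
    out[1] = key;
    if (m == DES_MODE_CBC)
        out[2] = iv;
    return extra + 1u;
}

/* DES APPEND: always 3 words (length, source address, destination address). */
static inline void des_append_encode(enum des_alg a, enum des_dir b,
                                     enum des_mode m, uint16_t len,
                                     uint32_t src, uint32_t dst,
                                     uint32_t out[3])
{
    assert(out != 0);
    out[0] = (1u << 28) | (2u << 26) | ((uint32_t)a << 25) |
             ((uint32_t)b << 24) | (1u << 23) | ((uint32_t)m << 21) | len;
    out[1] = src;
    out[2] = dst;
}
```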
13.3.4
AES (MPCM) channel instruction set
The following figure lists all possible instruction encodings that the MPCM Channel understands.
Figure 64. AES (MPCM) channel instruction set
(Figure not reproduced. It is a table with columns MNEMO, CHN, WN and the bit fields of WORD0, followed by the contents of WORD1 and WORD2 (PARAM, SRCPTR, DSTPTR or ADDRESS). Instruction mnemonics listed: SETUP, EXEC, SETUP_P, EXEC_P, SETUP_D, EXEC_D, SETUP_S, EXEC_S, SETUP_PD, EXEC_PD, SETUP_PS, EXEC_PS, SETUP_SD, EXEC_SD, SETUP_PSD, EXEC_PSD, NOP, SWITCHVT and DOWNLOAD; the remaining encodings are reserved.)
Bits 27-26 of Instruction Word #0 indicate the number of additional words that the
instruction has.
Bits 21-16 of Instruction Word #0 represent an MPCM micro-sequence number when marked p in
the table, or an MPCM vector table number when marked t in the table.
Bits 15-0 of Instruction Word #0 represent the number of bytes to be handled by the MPCM
Channel when marked n in the table.
pppppp: MPCM micro-sequence number.
tttttt: MPCM vector table number.
nn..nnnn: Byte Count.
NOP Instruction
No operation. Do nothing. The MPCM is not started.
DOWNLOAD Instruction
The DOWNLOAD instruction is used to program the MPCM RAM memory with
micro-sequences. It has two additional instruction dwords:
● Word1 indicates at which MPCM RAM address the micro-sequence must be placed.
● Word2 tells the CU where to load the data from.
Bits 21-16 (pppppp) of the DOWNLOAD instruction word0 indicate which micro-sequence
number is being downloaded; bits 15-0 (nn..nnnn) indicate the length in Bytes of the
micro-sequence.
The C prototype of this instruction is:
mpcm_download(int prgno, int addr, const char *srcpt, int n);
prgno: bits 21-16 of word0
addr: word1
srcpt: word2
n: bits 15-0 of word0
Each micro-sequence instruction is 8 Bytes wide, so n must be a multiple of 8 Bytes. If n is
not a multiple of 8 Bytes, the CU reports an AERR error.
Each address location of the MPCM RAM contains one complete 8-Byte micro-sequence
instruction.
Example: you want to download an 80 Bytes long micro-sequence as program #3 in the
MPCM RAM address h71. The micro-sequence must be loaded by the C3 from address
location h1000. The C function would be:
mpcm_download(3, 0x71, 0x1000, 80);
which must be encoded in the MPCM instruction:
29030050
00000071
00001000
After execution of this DOWNLOAD instruction, the MPCM channel will be ready to execute
micro-sequence #3. The micro-sequence is placed in the MPCM RAM locations h71-h7A.
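The worked example can be reproduced with a short encoder. In this C sketch the constant 0x29000000 is simply read off the example word 29030050 (channel 2, two additional words, prgno = 3, n = 0x50); it is not derived from Figure 64, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Upper bits of DOWNLOAD word0, taken from the example word 29030050. */
#define MPCM_DOWNLOAD_BASE 0x29000000u

/* Encode DOWNLOAD: word0 = base | prgno (bits 21-16) | n (bits 15-0),
 * word1 = MPCM RAM address, word2 = source pointer.
 * Returns false when n is not a multiple of 8 bytes (AERR case). */
static inline bool mpcm_download_encode(unsigned prgno, uint32_t addr,
                                        uint32_t src, uint32_t n,
                                        uint32_t out[3])
{
    assert(out != 0);
    if (prgno > 63u || n > 0xFFFFu || (n % 8u) != 0u)
        return false;
    out[0] = MPCM_DOWNLOAD_BASE | ((uint32_t)prgno << 16) | n;
    out[1] = addr;
    out[2] = src;
    return true;
}

/* Last MPCM RAM location occupied: one location per 8-byte
 * micro-sequence instruction. n must be a non-zero multiple of 8. */
static inline uint32_t mpcm_last_location(uint32_t addr, uint32_t n)
{
    assert(n >= 8u && (n % 8u) == 0u);
    return addr + n / 8u - 1u;
}
```

With the values of the example (80 bytes at h71), ten locations are used, which gives the range h71-h7A quoted above.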
The MPCM Core Block RAM is implicitly split into two sections. The initial address region
contains Vector Tables for micro-sequence addresses whereas higher addresses contain
the downloaded micro-sequences.
Figure 65. MPCM Core block RAM diagram
Figure 65 shows that the micro-sequence downloaded in the previous example has
been placed at address h071 and that address location 3 is the vector to this
micro-sequence.
When you request the MPCM channel to execute micro-sequence #3, it executes it
starting from address location h071.
There can be up to 64 entries in the Vector Table (the pppppp field in MPCM channel
instructions is 6-bit wide).
The user is free to choose how to organize the memory. Note that it is not obligatory to
allocate space for a full Vector Table. If you download only 4 micro-sequences, the Vector
Table can have only 4 entries (h000-h003) and the first micro-sequence can already be
placed at address h004.
SWITCHVT Instruction
The MPCM offers a mechanism to have multiple vector tables in case you need to download
more than 64 micro-sequences. In fact there can be up to 64 different vector tables in the
MPCM Core Block RAM, leading to a theoretical maximum of 64x64 = 4096 downloadable
micro-sequences.
SWITCHVT is a single-dword instruction. Bits 21-16 (tttttt) of the SWITCHVT
instruction word0 indicate which vector table to use for all following DOWNLOAD, SETUP
and EXECUTE instructions.
The C prototype of this instruction is:
mpcm_switchvt(int vtno);
vtno: bits 21-16 of word0
Example: you want to switch to Vector Table #1. The C function would be:
mpcm_switchvt(1);
which must be encoded in the MPCM instruction:
21010000
After execution of this SWITCHVT instruction the MPCM will use Vector Table #1 for
DOWNLOAD, SETUP and EXECUTE instructions.
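A corresponding encoder for SWITCHVT is sketched below. As with DOWNLOAD, the constant 0x21000000 is read off the example word 21010000 (channel 2, no additional words, vtno = 1) and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Upper bits of SWITCHVT word0, taken from the example word 21010000. */
#define MPCM_SWITCHVT_BASE 0x21000000u

/* Encode SWITCHVT: the vector table number goes in bits 21-16 (tttttt).
 * Returns false when vtno does not fit the 6-bit field. */
static inline bool mpcm_switchvt_encode(unsigned vtno, uint32_t *word0)
{
    assert(word0 != 0);
    if (vtno > 63u)
        return false;
    *word0 = MPCM_SWITCHVT_BASE | ((uint32_t)vtno << 16);
    return true;
}
```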
When the MPCM starts up, it uses Vector Table #0 by default. If you do not need to
download more than 64 micro-sequences, you should never use the SWITCHVT instruction.
Vector tables are contiguous in the initial address region of the MPCM Core Block RAM.
Figure 66. MPCM vector tables
EXECUTE Instructions
EXECUTE instructions are used to set the Source Address, Destination Address and Count
Register of the MPCM Channel’s CU and to run downloaded micro-sequences. Since it is
not always necessary to set all these registers, a variety of EXECUTE instructions is
offered (for instance, there is no need to set the Source Address Register for a
micro-sequence that does not need input data).
One set of EXECUTE instructions can also forward a parameter to the
MPCM Core Block before the micro-sequence is launched. This can be handy for
more complex modes.
Previous values of the Source Address and Destination Address registers are used if you do
not set them in an EXECUTE instruction.
The C prototypes of these instructions are:
mpcm_execute(int prgno, int n);
mpcm_execute_p(int prgno, unsigned int param, int n);
mpcm_execute_d(int prgno, char *dstpt, int n);
mpcm_execute_s(int prgno, const char *srcpt, int n);
mpcm_execute_pd(int prgno, unsigned int param, char *dstpt, int n);
mpcm_execute_ps(int prgno, unsigned int param, const char *srcpt, int n);
mpcm_execute_sd(int prgno, const char *srcpt, char *dstpt, int n);
mpcm_execute_psd(int prgno, unsigned int param, const char *srcpt, char *dstpt, int n);
See the previous paragraphs for encoding of these instructions.
Each of these instructions sets up the MPCM Core Block to execute the micro-sequence
prgno. If the EXECUTE instruction specifies param, this will be forwarded to the MPCM
Core Block before it is launched.
The Byte Count n specified in these EXECUTE instructions is also forwarded to the MPCM
Core Block before it is launched. In some more complex modes the micro-sequence could
need this information.
For each of these instructions the CU of the MPCM Channel will then load n Bytes from
*srcpt and forward them to the MPCM Core Block for processing. If n is zero no data will be
loaded. If *srcpt is not set by this EXECUTE instruction the previous value of *srcpt is used.
For each of these instructions the CU of the MPCM Channel will also store any data
generated by the MPCM Core Block to *dstpt. If *dstpt is not set by this EXECUTE
instruction the previous value will be used.
The CU of the MPCM Channel can handle any value of Byte Count (n). The CU always
loads the minimum number of dwords needed to satisfy the request (for example, if you
request the processing of 2 Bytes, the CU loads 1 dword). The minimum value of n is zero;
the maximum value of n is 65,535 Bytes (64 KB - 1 Byte).
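The number of dwords fetched for a given Byte Count is a round-up division. The sketch below assumes that a dword is 32 bits (4 Bytes), which is consistent with the 32-bit instruction words used throughout this section; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Minimum number of 32-bit dwords covering n bytes (n = 0..65535). */
static inline uint32_t mpcm_dwords_for_bytes(uint32_t n)
{
    assert(n <= 0xFFFFu);
    return (n + 3u) / 4u;
}
```

A request for 2 Bytes therefore maps to 1 dword, as in the example above, and a request for 0 Bytes loads nothing.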
SETUP Instructions
SETUP instructions are similar to EXECUTE instructions with the difference that the MPCM
Core Block is not launched. These are primarily used to setup the MPCM Channel when it is
going to be a Slave in Couple/Chaining operations: the MPCM Core Block will later be
launched by a Master Channel.
SETUP instructions can be used to set the Source Address and the Destination Address of
the MPCM Channel’s CU (note that the Count Register is not affected). The MPCM Core
Block is set-up to execute downloaded micro-sequences.
As with the EXECUTE instructions, one set of SETUP instructions can also
forward a parameter to the MPCM Core Block.
Previous values of the Source Address and Destination Address registers are retained if you
do not set them in a SETUP instruction.
The C prototypes of these instructions are:
mpcm_setup(int prgno, int n);
mpcm_setup_p(int prgno, unsigned int param, int n);
mpcm_setup_d(int prgno, char *dstpt, int n);
mpcm_setup_s(int prgno, const char *srcpt, int n);
mpcm_setup_pd(int prgno, unsigned int param, char *dstpt, int n);
mpcm_setup_ps(int prgno, unsigned int param, const char *srcpt, int n);
mpcm_setup_sd(int prgno, const char *srcpt, char *dstpt, int n);
mpcm_setup_psd(int prgno, unsigned int param, const char *srcpt, char *dstpt, int n);
Each of these instructions sets up the MPCM Core Block to execute the micro-sequence
prgno. If the SETUP instruction specifies param, this will be forwarded to the MPCM Core
Block. Note that the micro-sequence is not started; it can be launched later
by a Master Channel.
The Byte Count n specified in these SETUP instructions is forwarded to the MPCM Core
Block. In some more complex modes the micro-sequence could need this information. The
Byte Count does not affect the Count Register.
After SETUP instructions the MPCM Channel will be ready to accept data from a Master
Channel in coupling/chaining operations.
13.3.5
Unified hash with HMAC (UHH) channel instruction set
The UHH Channel executes HASH [MD5/SHA1/SHA2/CONTEXT] and HMAC
[MD5/SHA1/SHA2/CONTEXT] instructions. Instructions that do not conform to the following
bit encodings or to the generic flow type instructions are unknown to the UHH Channel,
which then goes into the error state.
HASH instruction
There are 4 different HASH instructions:
●
HASH MD5
●
HASH SHA1
●
HASH SHA2
●
HASH CONTEXT
The first 3 instructions are used for computing the digest of a message and work in the
same way. The last one is used for saving and restoring the context.
HASH [MD5/SHA1/SHA2] instructions
Each HASH [MD5/SHA1/SHA2] instruction is composed of 3 subinstructions:
●
INIT
●
APPEND
●
END
INIT
The HASH [MD5/SHA1/SHA2] INIT instruction is 1 word long. This instruction is used to set
the function.
Table 64. HASH [MD5/SHA1/SHA2] INIT bit encoding
W#  Bit encoding
1   0011 000a a00x xxxx xxxx xxxx xxxx xxxx
Bits aa in the above table are used to set the algorithm to use:
Table 65. HASH [MD5/SHA1/SHA2] INIT bits aa definition
Bits 24, 23 (aa)  Algorithm
00                MD5
01                SHA-1
10                SHA-256
11                CONTEXT (see HASH CONTEXT instruction)
APPEND
The HASH [MD5/SHA1/SHA2] APPEND instruction is 2 words long. The length of the
message is encoded in the first instruction word, while the second word represents the
Source Address for the message.
Table 66. HASH [MD5/SHA1/SHA2] APPEND instruction
W#  Bit encoding
1   0011 010a a01x xxxx cccc cccc cccc cccc
2   (32 bit Source Address for the message)
Bits aa in Table 66 are used to set the algorithm to use and have the same encoding as in
the INIT instruction. Bits 15 to 0 in the first instruction word (cccc in the above table)
represent the Count in Bytes of the input message.
END
The HASH [MD5/SHA1/SHA2] END instruction is 2 words long. The second word
represents the Destination Address for the digest.
Table 67. HASH [MD5/SHA1/SHA2] END bit encoding
W#  Bit encoding
1   0011 010a a10t xxxx xxxx xxxx xxxx xxxx
2   (32 bit Destination Address for the digest)
Bits aa in Table 67 are used to set the algorithm to use and have the same encoding as in
the INIT instruction.
Bit t in Table 67 is used to truncate the result to 96 bits:
Table 68. HASH [MD5/SHA1/SHA2] END bit t definition
Bit 20 (t)  Trunc
0           Full digest
1           Truncated 96-bit digest
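When sizing the buffer at the Destination Address, the digest length depends on the algorithm and on bit t. The following C sketch returns that length in bytes; the full digest sizes are the standard ones for MD5 (128 bits), SHA-1 (160 bits) and SHA-256 (256 bits), and the names are hypothetical. It assumes the destination buffer should be at least this large.

```c
#include <assert.h>
#include <stddef.h>

/* Algorithm codes, bits aa of Table 65. */
enum uhh_alg { UHH_MD5 = 0, UHH_SHA1 = 1, UHH_SHA256 = 2 };

/* Digest length in bytes produced by HASH END for the given algorithm,
 * or 12 bytes (96 bits) when the truncation bit t is set. */
static inline size_t uhh_digest_bytes(enum uhh_alg alg, int truncate)
{
    assert(alg == UHH_MD5 || alg == UHH_SHA1 || alg == UHH_SHA256);
    if (truncate)
        return 96u / 8u;
    switch (alg) {
    case UHH_MD5:    return 16u;
    case UHH_SHA1:   return 20u;
    case UHH_SHA256: return 32u;
    }
    return 0u;
}
```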
HASH CONTEXT instruction
The HASH CONTEXT instruction is composed of 2 subinstructions:
●
SAVE
●
RESTORE
SAVE
The HASH CONTEXT SAVE instruction is 2 words long. The second word represents the
Destination Address for the context.
Table 69. HASH CONTEXT SAVE bit encoding
W#  Bit encoding
1   0011 0101 10xx xxxx xxxx xxxx xxxx xxxx
2   (32 bit Destination Address for the context)
RESTORE
The HASH CONTEXT RESTORE instruction is 2 words long. The second word represents
the Source Address for the context.
Table 70. HASH CONTEXT RESTORE bit encoding
W#  Bit encoding
1   0011 0101 11xx xxxx xxxx xxxx xxxx xxxx
2   (32 bit Source Address for the context)
HMAC instruction
There are four different HMAC instructions:
●
HMAC MD5
●
HMAC SHA1
●
HMAC SHA2
●
HMAC CONTEXT
The first 3 instructions are used for computing the HMAC of a message and work in the
same way. The last one is used for saving and restoring the context.
HMAC [MD5/SHA1/SHA2] instructions
Each HMAC [MD5/SHA1/SHA2] instruction is composed of 3 subinstructions:
●
INIT
●
APPEND
●
END
INIT
The HMAC [MD5/SHA1/SHA2] INIT instruction is 2 words long.
Table 71. HMAC [MD5/SHA1/SHA2] INIT bit encoding
W#  Bit encoding
1   0011 011a a00x xxxx cccc cccc cccc cccc
2   (32 bit Source Address for the key)
Bits aa in the above table are used to set the algorithm to use and have the same encoding
as in the HASH INIT instruction.
Bits 15 to 0 in the first instruction word (cccc in the above table) represent the length in
Bytes of the key.
APPEND
The HMAC [MD5/SHA1/SHA2] APPEND instruction is 2 words long. This instruction is used
to set the Source Address Register for the message and to start the computation of the
HMAC.
The length of the message is encoded in the first instruction word, while the second word
represents the Source Address for the message.
Table 72. HMAC [MD5/SHA1/SHA2] APPEND bit encoding
W#  Bit encoding
1   0011 011a a01x xxxx cccc cccc cccc cccc
2   (32 bit Source Address for the message)
Bits aa in the above table are used to set the algorithm to use and have the same encoding
as in the INIT instruction.
Bits 15 to 0 in the first instruction word (cccc in the above table) represent the Count in Bytes
of the input message.
END
The HMAC [MD5/SHA1/SHA2] END instruction is 3 words long. The second word
represents the Source Address for the key, while the third word represents the Destination
Address for the HMAC.
Table 73. HMAC [MD5/SHA1/SHA2] END bit encoding
W#  Bit encoding
1   0011 101a a10t xxxx cccc cccc cccc cccc
2   (32 bit Source Address for the key)
3   (32 bit Destination Address for the HMAC)
Bits aa in the above table are used to set the algorithm to use and have the same encoding
as in the INIT instruction.
Bits 15 to 0 in the first instruction word (cccc in the above table) represent the length in
Bytes of the key.
Bit t in the above table is used to truncate the result to 96 bits:
Table 74. HMAC [MD5/SHA1/SHA2] END bit t definition
Bit 20 (t)  Trunc
0           Full HMAC
1           Truncated 96-bit HMAC
HMAC CONTEXT instruction
The HMAC CONTEXT instruction is composed of 2 subinstructions:
●
SAVE
●
RESTORE
SAVE
The HMAC CONTEXT SAVE instruction is 2 words long. The second word represents the
Destination Address for the context.
Table 75. HMAC CONTEXT SAVE bit encoding
W#  Bit encoding
1   0011 0111 10xx xxxx xxxx xxxx xxxx xxxx
2   (32 bit Destination Address for the context)
RESTORE
The HMAC CONTEXT RESTORE instruction is 2 words long. The second word represents
the Source Address for the context.
Table 76. HMAC CONTEXT RESTORE bit encoding
W#  Bit encoding
1   0011 0111 11xx xxxx xxxx xxxx xxxx xxxx
2   (32 bit Source Address for the context)
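Tables 64, 66, 67 and 71 to 73 follow a common pattern: bit 25 selects HASH (0) or HMAC (1), bits 24-23 carry the algorithm and bits 22-21 select INIT, APPEND or END. The C sketch below builds the first instruction word from that pattern. The names are hypothetical, and the reading of bit 25 as a HASH/HMAC selector is inferred from the tables rather than stated in the text.

```c
#include <assert.h>
#include <stdint.h>

enum uhh_alg { UHH_MD5 = 0, UHH_SHA1 = 1, UHH_SHA256 = 2 };  /* bits 24-23 */
enum uhh_sub { UHH_INIT = 0, UHH_APPEND = 1, UHH_END = 2 };  /* bits 22-21 */

/* First word of a HASH (hmac = 0) or HMAC (hmac != 0) subinstruction on
 * channel 3. count is the message or key length where the table shows
 * cccc, and must be 0 where the table shows xxxx. */
static inline uint32_t uhh_word0(int hmac, enum uhh_alg alg, enum uhh_sub sub,
                                 int truncate, uint16_t count)
{
    unsigned extra;  /* additional instruction words, bits 27-26 */
    uint32_t w;

    assert(alg <= UHH_SHA256 && sub <= UHH_END);
    if (hmac)
        extra = (sub == UHH_END) ? 2u : 1u;
    else
        extra = (sub == UHH_INIT) ? 0u : 1u;

    w = (3u << 28) | ((uint32_t)extra << 26) | ((uint32_t)(hmac != 0) << 25) |
        ((uint32_t)alg << 23) | ((uint32_t)sub << 21) | count;
    if (sub == UHH_END && truncate)
        w |= 1u << 20;  /* bit t: truncate the result to 96 bits */
    return w;
}
```

The address words that follow word0 (message, key or destination pointers) are appended by the caller in the order given in the tables.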
13.3.6
Unified hash with HMAC 2 (UHH2) channel instruction set
Note:
The channel described in this document (which supports SHA384 and SHA512) is called
UHH2, to distinguish it from the UHH channel, which supports MD5, SHA1 and SHA256. A
separate channel has been developed for SHA384 and SHA512 because these algorithms
operate on 64-bit words (instead of 32-bit words as for SHA1 and SHA256). The UHH2
channel is used in almost the same way as the UHH channel. There are 3 main differences:
– SHA384 replaces MD5 and SHA512 replaces SHA1 in the instruction encoding
– the digest size is 384 bits for SHA384 and 512 bits for SHA512
– the size of the context for saving/restoring is increased (see details in CONTEXT
sections)
The UHH2 channel executes HASH [SHA384/SHA512/CONTEXT] and HMAC
[SHA384/SHA512/CONTEXT] instructions. Instructions that do not conform to the following
bit encodings or to the generic flow type instructions are unknown to the UHH2 channel,
which then goes into the error state.
There are 3 different HASH instructions:
●
HASH SHA384
●
HASH SHA512
●
HASH CONTEXT
The first 2 instructions are used for computing the digest of a message and work in the
same way. The last one is used for saving and restoring the context.
HASH [SHA384/SHA512] instructions
Each HASH [SHA384/SHA512] instruction is composed of 3 subinstructions:
●
INIT
●
APPEND
●
END
INIT
The HASH [SHA384/SHA512] INIT instruction is 1 word long. This instruction is used to set
the function.
Table 77. HASH [SHA384/SHA512] INIT bit encoding
W#  Bit encoding
1   0100 000a a00x xxxx xxxx xxxx xxxx xxxx
Bits aa in the above table are used to set the algorithm to use:
Table 78. HASH [SHA384/SHA512] INIT bits aa definition
Bits 24, 23 (aa)  Algorithm
00                SHA384
01                SHA512
10                not used
11                CONTEXT (see HASH CONTEXT instruction)
APPEND
The HASH [SHA384/SHA512] APPEND instruction is 2 words long. The length of the
message is encoded in the first instruction word, while the second word represents the
Source Address for the message.
Table 79. HASH [SHA384/SHA512] APPEND bit encoding
W#  Bit encoding
1   0100 010a a01x xxxx cccc cccc cccc cccc
2   (32 bit Source Address for the message)
Bits aa in the above table are used to set the algorithm to use and have the same encoding
as in the INIT instruction. Bits 15 to 0 in the first instruction word (cccc in the above table)
represent the count in bytes of the input message.
END
The HASH [SHA384/SHA512] END instruction is 2 words long. The second word represents
the Destination Address for the digest.
Table 80. HASH [SHA384/SHA512] END bit encoding
W#  Bit encoding
1   0100 010a a100 xxxx xxxx xxxx xxxx xxxx
2   (32 bit Destination Address for the digest)
Bits aa in the above table are used to set the algorithm to use and have the same encoding
as in the INIT instruction.
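The three HASH subinstructions of the UHH2 channel can be encoded directly from Tables 77, 79 and 80. The C sketch below does so and also returns the digest size given in the note at the start of this section; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum uhh2_alg { UHH2_SHA384 = 0, UHH2_SHA512 = 1 };  /* bits 24-23, Table 78 */

/* HASH INIT: 0100 000a a00x xxxx ... (1 word). */
static inline uint32_t uhh2_hash_init(enum uhh2_alg alg)
{
    assert(alg <= UHH2_SHA512);
    return 0x40000000u | ((uint32_t)alg << 23);
}

/* HASH APPEND: 0100 010a a01x xxxx cccc ... (followed by the message pointer). */
static inline uint32_t uhh2_hash_append(enum uhh2_alg alg, uint16_t count)
{
    assert(alg <= UHH2_SHA512);
    return 0x44200000u | ((uint32_t)alg << 23) | count;
}

/* HASH END: 0100 010a a100 xxxx ... (followed by the digest pointer). */
static inline uint32_t uhh2_hash_end(enum uhh2_alg alg)
{
    assert(alg <= UHH2_SHA512);
    return 0x44400000u | ((uint32_t)alg << 23);
}

/* Digest size in bytes: 384 bits for SHA384, 512 bits for SHA512. */
static inline size_t uhh2_digest_bytes(enum uhh2_alg alg)
{
    assert(alg <= UHH2_SHA512);
    return (alg == UHH2_SHA384) ? 48u : 64u;
}
```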
HASH CONTEXT instruction
The HASH CONTEXT instruction is composed of 2 subinstructions:
●
SAVE
●
RESTORE
SAVE
The HASH CONTEXT SAVE instruction is 2 words long. The second word represents the
Destination Address for the context.
Table 81. HASH CONTEXT SAVE bit encoding
W#  Bit encoding
1   0100 0101 10xx xxxx xxxx xxxx xxxx xxxx
2   (32 bit Destination Address for the context)
RESTORE
The HASH CONTEXT RESTORE instruction is 2 words long. The second word represents
the Source Address for the context.
Table 82. HASH CONTEXT RESTORE bit encoding
W#  Bit encoding
1   0100 0101 11xx xxxx xxxx xxxx xxxx xxxx
2   (32 bit Source Address for the context)
HMAC instruction
There are 3 different HMAC instructions:
●
HMAC SHA384
●
HMAC SHA512
●
HMAC CONTEXT
The first 2 instructions are used for computing the HMAC of a message and work in the
same way. The last one is used for saving and restoring the context.
HMAC [SHA384/SHA512] instructions
Each HMAC [SHA384/SHA512] instruction is composed of 3 subinstructions:
●
INIT
●
APPEND
●
END
INIT
The HMAC [SHA384/SHA512] INIT instruction is 2 words long.
Table 83. HMAC [SHA384/SHA512] INIT bit encoding
W#  Bit encoding
1   0100 011a a00x xxxx cccc cccc cccc cccc
2   (32 bit Source Address for the key)
Bits aa in the above table are used to set the algorithm to use and have the same encoding
as in the HASH INIT instruction.
Bits 15 to 0 in the first instruction word (cccc in the above table) represent the length in
Bytes of the key.
APPEND
The HMAC [SHA384/SHA512] APPEND instruction is 2 words long. The length of the
message is encoded in the first instruction word, while the second word represents the
Source Address for the message.
Table 84. HMAC [SHA384/SHA512] APPEND bit encoding
W#  Bit encoding
1   0100 011a a01x xxxx cccc cccc cccc cccc
2   (32 bit Source Address for the message)
Bits aa in the above table are used to set the algorithm to use and have the same encoding
as in the INIT instruction.
Bits 15 to 0 in the first instruction word (cccc in the above table) represent the Count in Bytes
of the input message.
END
The HMAC [SHA384/SHA512] END instruction is 3 words long. The second word
represents the Source Address for the key, while the third word represents the Destination
Address for the HMAC.
Table 85. HMAC [SHA384/SHA512] END bit encoding
W#  Bit encoding
1   0100 101a a100 xxxx cccc cccc cccc cccc
2   (32 bit Source Address for the key)
3   (32 bit Destination Address for the HMAC)
Bits aa in the above table are used to set the algorithm to use and have the same encoding
as in the INIT instruction.
Bits 15 to 0 in the first instruction word (cccc in the above table) represent the length in
Bytes of the key.
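As an illustration, the first instruction word of the three subinstructions can be assembled from the bit patterns in Tables 83 to 85. This is a minimal sketch, not driver code. The function names are hypothetical, the value of aa is taken as a parameter because its encoding is defined with the HASH INIT instruction, and the don't-care (x) bits are assumed to be written as 0.

```python
# Sketch only: helper names are hypothetical; don't-care (x) bits are written as 0.
HMAC_INIT_BASE = 0x46000000    # 0100 011a a00x xxxx ...
HMAC_APPEND_BASE = 0x46200000  # 0100 011a a01x xxxx ...
HMAC_END_BASE = 0x4A400000     # 0100 101a a100 xxxx ...


def _first_word(base, aa, count):
    """Merge the algorithm bits (aa, bits 24:23) and the 16-bit cccc field."""
    if not 0 <= aa <= 3:
        raise ValueError("aa is a 2-bit field")
    if not 0 <= count <= 0xFFFF:
        raise ValueError("cccc is a 16-bit field")
    return base | (aa << 23) | count


def hmac_init(aa, key_len, src_addr):
    """INIT: 2 words, cccc carries the key length in bytes."""
    return [_first_word(HMAC_INIT_BASE, aa, key_len), src_addr]


def hmac_append(aa, msg_len, src_addr):
    """APPEND: 2 words, cccc carries the message byte count."""
    return [_first_word(HMAC_APPEND_BASE, aa, msg_len), src_addr]


def hmac_end(aa, key_len, key_addr, dst_addr):
    """END: 3 words, key source address then HMAC destination address."""
    return [_first_word(HMAC_END_BASE, aa, key_len), key_addr, dst_addr]
```

Keeping the three base patterns as named constants makes it easy to compare them against the tables bit by bit.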
HMAC CONTEXT instruction
The HMAC CONTEXT instruction is composed of 2 subinstructions:
● SAVE
● RESTORE
SAVE
The HMAC CONTEXT SAVE instruction is 2 words long. The second word represents the
Destination Address for the context.
Table 86. HMAC CONTEXT SAVE bit encoding
W#   Bit encoding
1    0100 0111 10xx xxxx xxxx xxxx xxxx xxxx
2    (32 bit Destination Address for the context)
RESTORE
The HMAC CONTEXT RESTORE instruction is 2 words long. The second word represents
the Source Address for the context.
Table 87. HMAC CONTEXT RESTORE bit encoding
W#   Bit encoding
1    0100 0111 11xx xxxx xxxx xxxx xxxx xxxx
2    (32 bit Source Address for the context)
13.3.7
Public key (PKA) channel instruction set
The PKA Channel executes the MONTY_PAR, MOD_EXP, MONTY_EXP, ECC_MUL and
ECC_MONTY_MUL instructions. Instructions that conform neither to the following bit
encodings nor to the generic flow type instructions are unknown to the PKA Channel, which
then enters the error state.
Data structures and endianness
Each instruction for the PKA channel requires pointers for the input and the output data
structures to manage. The input data structures are variable, in composition (depending on
the operation to execute) and in size (depending on the used size for the underlying finite
field). All these structures must follow the indications about size and order for the involved
operands as described for each instruction.
The size of each field of the data structures is provided in W or E (number of 32-bit words),
where:
W = (op_len) / 32
E = (exp_len) / 32 for RSA/DH and E = (k_len) / 32 for ECC
W depends only on the underlying finite field size (op_len), while E depends on the
exponent to be used.
Note:
● If the maximum allowed length for RSA and DH is 2048 bits, then W and E are less than
or equal to 64 words (corresponding to 256 bytes).
● If the maximum allowed length for ECC is 384 bits, then W and E are less than or equal
to 12 words (corresponding to 48 bytes).
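The word counts in the note follow directly from the bit lengths. A small sketch, assuming that a length which is not a multiple of 32 bits is rounded up to the next whole word (the rounding is an assumption, and the helper name is hypothetical):

```python
def words_for_bits(bit_len):
    """Number of 32-bit words needed to hold bit_len bits (rounding up is assumed)."""
    return (bit_len + 31) // 32


W = words_for_bits(2048)  # maximum RSA/DH operand length
E = words_for_bits(384)   # maximum ECC scalar length
print(W, W * 4)           # words and bytes for 2048 bits
print(E, E * 4)           # words and bytes for 384 bits
```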
The input data structures must follow the specified order and size for all the parameters.
Both input and output data structures are Big-endian. This means that the first word
represents the most significant word of the first operand. The first byte of the word is also
the most significant one.
For instance, in the case of the MONTY_EXP instruction, the input data structure is
composed of the 4 operands in the following order, as described by the instruction
specification:
1. op_len (one single 32-bit word)
2. mod (W 32-bit words, depending on the op_len value)
3. exp_len (one single 32-bit word)
4. exp (E 32-bit words, depending on the exp_len value)
The representation in memory of this data structure is:
Table 88. MONTY_EXP instruction data structure
MSB op_len    ...   ...   LSB op_len
MSB mod       ...   ...   -
-             ...   ...   -
-             ...   ...   LSB mod
MSB exp_len   ...   ...   LSB exp_len
MSB exp       ...   ...   -
-             -     -     -
-             -     -     LSB exp
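The layout above can be reproduced in software when preparing the input buffer. The following sketch packs the four operands of the example in big-endian order. The function name is hypothetical and the rounding of W and E up to whole words is an assumption.

```python
def pack_monty_exp_input(op_len, mod, exp_len, exp):
    """Return the example structure as bytes: op_len, mod, exp_len, exp."""
    w = (op_len + 31) // 32   # words in mod (rounding up is an assumption)
    e = (exp_len + 31) // 32  # words in exp (rounding up is an assumption)
    return (
        op_len.to_bytes(4, "big")       # one 32-bit word, most significant byte first
        + mod.to_bytes(4 * w, "big")    # W words, most significant word first
        + exp_len.to_bytes(4, "big")    # one 32-bit word
        + exp.to_bytes(4 * e, "big")    # E words, most significant word first
    )


buf = pack_monty_exp_input(op_len=64, mod=0x0102030405060708,
                           exp_len=32, exp=0x00010001)
print(buf.hex())
```

`int.to_bytes` raises `OverflowError` if an operand does not fit in the declared length, which catches an op_len or exp_len that is too small for the value supplied.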
MONTY_PAR instruction
The MONTY_PAR instruction is 3 words long. This instruction is used to set the Source
Address Register, the Destination Address Register and the Count Register of the PKA
Channel (values of PKA_SRCR, PKA_DSTR and PKA_CNTR registers) and to start the
computation of the Montgomery’s parameter. The value depends only on the underlying
finite field, so its computation is the same for both RSA/DH and ECC.
The Function and the Count are encoded in the first instruction word, the second word
represents the Source Address and the Destination Address is represented by the third
instruction word.
Table 89. MONTY_PAR bit encoding
W#   Bit encoding
1    0101 1000 1xxx xxxx cccc cccc cccc cccc
2    (32 bit Source Address)
3    (32 bit Destination Address)
Bits 15 to 0 in the first instruction word (cccc in the above table) represent the Count in Bytes
of the input data structure. Count must be a multiple of 4 Bytes and the Source and
Destination Addresses must be 32 bit aligned; otherwise the PKA Channel enters the error
state.
The input data structure to pass is:
Table 90. Input data structure
Name     Size in words   Description
op_len   1               Length of the operands in bits
mod      W               Modulus
The resulting data structure is:
Table 91. Resulting data structure
Name          Size in words   Description
R^2 (mod n)   W               Montgomery’s parameter
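The constraints on Count and on the addresses can be checked in software before the instruction is issued. The sketch below builds the three MONTY_PAR words from Table 89 and derives Count from the Table 90 structure. The function names are hypothetical, the don't-care (x) bits are assumed to be written as 0, and the rounding of W up to whole words is an assumption.

```python
MONTY_PAR_OPCODE = 0x58800000  # 0101 1000 1xxx xxxx; x bits written as 0 (assumption)


def monty_par_count(op_len):
    """Bytes in the Table 90 input structure: the op_len word plus W modulus words."""
    w = (op_len + 31) // 32  # rounding up is an assumption
    return 4 * (1 + w)


def encode_monty_par(count, src_addr, dst_addr):
    """Return the three instruction words, rejecting values the channel would refuse."""
    if not 0 <= count <= 0xFFFF or count % 4:
        raise ValueError("Count must fit in 16 bits and be a multiple of 4 bytes")
    for addr in (src_addr, dst_addr):
        if not 0 <= addr <= 0xFFFFFFFF or addr % 4:
            raise ValueError("addresses must be 32 bit aligned")
    return [MONTY_PAR_OPCODE | count, src_addr, dst_addr]


words = encode_monty_par(monty_par_count(2048), 0x1000, 0x2000)
```

Rejecting bad values in software avoids having to recover the channel from the error state.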
MOD_EXP instruction
The MOD_EXP instruction is 4 words long. This instruction is used to set the Source
Address Register for secret data, the Source Address Register for public data, the
Destination Address Register and the Count Register of the PKA Channel (values of
PKA_SRCR, PKA_PSRCR, PKA_DSTR and PKA_CNTR registers) and to start the
computation of the modular exponentiation. In this case the input data structure has to
include the Montgomery’s parameter to use for the computation.
The Function and the Count are encoded in the first instruction word, the second word
represents the Source Address for the secret data, the third word represents the Source
Address for the public data and the Destination Address is represented by the fourth
instruction word.
Table 92. MOD_EXP bit encoding
W#   Bit encoding
1    0101 1101 0xxx xxxx cccc cccc cccc cccc
2    (32 bit Source Address for secret data)
3    (32 bit Source Address for public data)
4    (32 bit Destination Address)
Bits 15 to 0 in the first instruction word (cccc in the above table) represent the Count in Bytes
of the input data structures (both secret and public). Count must be a multiple of 4 Bytes and
the Source and Destination Addresses must be 32 bit aligned; otherwise the PKA Channel
enters the error state.
The input data structure with the secret data to pass is:
Table 93. Input data structure
Name          Size in words   Description
op_len        1               Length of the operands in bits
mod           W               Modulus
exp_len       1               Length of the exponent in bits
exp           E               Exponent
R^2 (mod n)   W               Montgomery’s parameter
base          W               Base
The resulting data structure is:
Table 94. Resulting data structure
Name                 Size in words   Description
Base^Exp (mod Mod)   W               Result of the modular exponentiation
MONTY_EXP instruction
The MONTY_EXP instruction is 4 words long. This instruction is used to set the Source
Address Register for secret data, the Source Address Register for public data, the
Destination Address Register and the Count Register of the PKA Channel (values of
PKA_SRCR, PKA_PSRCR, PKA_DSTR and PKA_CNTR registers) and to start the
computation of the modular exponentiation. In this case the Montgomery’s parameter to use
for the operation is computed by the channel.
The Function and the Count are encoded in the first instruction word, the second word
represents the Source Address for the secret data, the third word represents the Source
Address for the public data and the Destination Address is represented by the fourth
instruction word.
Table 95. MONTY_EXP bit encoding
W#   Bit encoding
1    0101 1101 1xxx xxxx cccc cccc cccc cccc
2    (32 bit Source Address for secret data)
3    (32 bit Source Address for public data)
4    (32 bit Destination Address)
Bits 15 to 0 in the first instruction word (cccc in the above table) represent the Count in Bytes
of the input data structures (both secret and public). Count must be a multiple of 4 Bytes and
the Source and Destination Addresses must be 32 bit aligned; otherwise the PKA Channel
enters the error state.
The input data structure with the secret data to pass is:
Table 96. Input data structure
Name      Size in words   Description
op_len    1               Length of the operands in bits
mod       W               Modulus
exp_len   1               Length of the exponent in bits
exp       E               Exponent
base      W               Base
The resulting data structure is:
Table 97. Resulting data structure
Name                 Size in words   Description
Base^Exp (mod Mod)   W               Result of the modular exponentiation
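When validating a driver, the W-word result can be compared with a software computation of the same modular exponentiation. This is a sketch under stated assumptions: the helper name is hypothetical, W is rounded up to whole words, and the result is taken to be stored as W big-endian words, as stated for the output data structures in general.

```python
def expected_mod_exp_result(base, exp, mod, op_len):
    """Software reference: Base^Exp (mod Mod) as W big-endian 32-bit words."""
    w = (op_len + 31) // 32  # rounding up is an assumption
    return pow(base, exp, mod).to_bytes(4 * w, "big")


reference = expected_mod_exp_result(base=4, exp=13, mod=497, op_len=32)
print(reference.hex())
```

Python's three-argument `pow` performs the modular exponentiation directly, so the same reference applies to both MOD_EXP and MONTY_EXP results.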
ECC_MUL instruction
The ECC_MUL instruction is 3 words long. This instruction is used to set the Source
Address Register, the Destination Address Register and the Count Register of the PKA
Channel (values of PKA_SRCR, PKA_DSTR and PKA_CNTR registers) and to start the
computation of the scalar multiplication of an EC point. In this case the input data structure
has to include the Montgomery’s parameter to use for the computation.
The Function and the Count are encoded in the first instruction word, the second word
represents the Source Address and the Destination Address is represented by the third
instruction word.
Table 98. ECC_MUL bit encoding
W#   Bit encoding
1    0101 1011 0xxx xxxx cccc cccc cccc cccc
2    (32 bit Source Address)
3    (32 bit Destination Address)
Bits 15 to 0 in the first instruction word (cccc in the above table) represent the Count in Bytes
of the input data structure. Count must be a multiple of 4 Bytes and the Source and
Destination Addresses must be 32 bit aligned; otherwise the PKA Channel enters the error
state.
The input data structure with the secret data to pass is:
Table 99. Input data structure
Name          Size in words   Description
op_len        1               Length of the operands in bits
mod           W               Modulus of the finite field
a_sign        1               Sign of the a parameter
a             1               Parameter of the elliptic curve
Px            W               x-coordinate of the base point
Py            W               y-coordinate of the base point
k_len         1               Length of the scalar k
k             E               Scalar k
R^2 (mod n)   W               Montgomery’s parameter
The resulting data structure is:
Table 100. Resulting data structure
Name   Size in words   Description
kPx    W               x-coordinate of the result of the scalar multiplication
kPy    W               y-coordinate of the result of the scalar multiplication
ECC_MONTY_MUL instruction
The ECC_MONTY_MUL instruction is 3 words long. This instruction is used to set the
Source Address Register, the Destination Address Register and the Count Register of the
PKA Channel (values of PKA_SRCR, PKA_DSTR and PKA_CNTR registers) and to start
the computation of the scalar multiplication of an EC point. In this case the Montgomery’s
parameter to use for the operation is computed by the channel.
The Function and the Count are encoded in the first instruction word, the second word
represents the Source Address and the Destination Address is represented by the third
instruction word.
Table 101. ECC_MONTY_MUL bit encoding
W#   Bit encoding
1    0101 1011 1xxx xxxx cccc cccc cccc cccc
2    (32 bit Source Address)
3    (32 bit Destination Address)
Bits 15 to 0 in the first instruction word (cccc in the above table) represent the Count in Bytes
of the input data structure. Count must be a multiple of 4 Bytes and the Source and
Destination Addresses must be 32 bit aligned; otherwise the PKA Channel enters the error
state.
The input data structure with the secret data to pass is:
Table 102. Input data structure
Name     Size in words   Description
op_len   1               Length of the operands in bits
mod      W               Modulus of the finite field
a_sign   1               Sign of the a parameter
a        1               Parameter of the elliptic curve
Px       W               x-coordinate of the base point
Py       W               y-coordinate of the base point
k_len    1               Length of the scalar k
k        E               Scalar k
The resulting data structure is:
Table 103. Resulting data structure
Name   Size in words   Description
kPx    W               x-coordinate of the result of the scalar multiplication
kPy    W               y-coordinate of the result of the scalar multiplication
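The five PKA encodings differ only in bits 31 to 23 of the first instruction word, so a debug helper can recover the mnemonic and the Count from a raw word. The sketch below uses the patterns of Tables 89, 92, 95, 98 and 101. The names are hypothetical, and the generic flow type instructions are not covered, so any other word is reported as None.

```python
# Bits 31..23 of the first instruction word, taken from the bit encoding tables.
PKA_OPCODES = {
    0x58800000: "MONTY_PAR",      # 0101 1000 1
    0x5D000000: "MOD_EXP",        # 0101 1101 0
    0x5D800000: "MONTY_EXP",      # 0101 1101 1
    0x5B000000: "ECC_MUL",        # 0101 1011 0
    0x5B800000: "ECC_MONTY_MUL",  # 0101 1011 1
}
PKA_OPCODE_MASK = 0xFF800000  # selects bits 31..23


def decode_pka_word(word):
    """Return (mnemonic or None, Count in bytes) for a first instruction word."""
    name = PKA_OPCODES.get(word & PKA_OPCODE_MASK)
    count = word & 0xFFFF
    return name, count


print(decode_pka_word(0x5D800104))
```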
13.3.8
RNG channel instruction set
The RNG Channel instruction set consists of the GET_VAL instruction. Instructions that
conform neither to the following bit encoding nor to the generic flow type instructions are
unknown to the RNG Channel, which then enters the error state.
GET_VAL instruction
The RNG channel can fill a memory area with generated random values, starting from a
passed destination pointer. The size of the values to generate is passed in the instruction.
The GET_VAL instruction is 2 words long. This instruction is used to set the Destination
Address Register of the RNG Channel (value of RNG_DSTR register).
The size of the values to generate is encoded in the first instruction word, while the second
word represents the Destination Address, where the generated values have to be stored.
Table 104. GET_VAL instruction
W#   Bit encoding
1    0110 010x xxxx xxxx cccc cccc cccc cccc
2    (32 bit Destination Address)
Bits 27 to 26 represent the number of additional words to pass with the instruction. Since
there is only one additional word for the destination pointer, they must be “01” to obtain
valid random numbers.
Bit 25 represents the GET_VAL instruction and must be 0.
Bits 24 to 16 are unused and should be zero.
Bits 15 to 0 in the first instruction word (cccc in the above table) represent the size in bytes
of the generated output values. The size must be a multiple of 4 bytes and the Destination
Address must be 32 bit aligned; otherwise the RNG Channel enters the error state. The RNG
channel can fill up to 64 Kbytes with a single instruction.
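The field description above translates into a short encoder. The sketch below also splits a larger request into several instructions. The names are hypothetical, and the per-instruction limit of 0xFFFC bytes is a conservative assumption (the largest multiple of 4 that a 16-bit field can hold), since the encoding of a full 64 Kbyte request is not described here.

```python
GET_VAL_OPCODE = 0x64000000  # 0110 010x: bits 27:26 = 01, bit 25 = 0, bits 24:16 = 0
MAX_CHUNK = 0xFFFC           # largest multiple of 4 in a 16-bit field (assumed limit)


def encode_get_val(size, dst_addr):
    """Return the two GET_VAL instruction words for one request."""
    if not 0 < size <= MAX_CHUNK or size % 4:
        raise ValueError("size must be a non-zero multiple of 4 bytes that fits 16 bits")
    if not 0 <= dst_addr <= 0xFFFFFFFF or dst_addr % 4:
        raise ValueError("destination address must be 32 bit aligned")
    return [GET_VAL_OPCODE | size, dst_addr]


def get_val_program(total, dst_addr):
    """Split a request of `total` bytes into consecutive GET_VAL instructions."""
    program = []
    offset = 0
    while offset < total:
        chunk = min(MAX_CHUNK, total - offset)
        program.append(encode_get_val(chunk, dst_addr + offset))
        offset += chunk
    return program


program = get_val_program(0x10000, 0x20000000)
```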
14
Temperature sensor (THSENS)
This chapter focuses on THSENS functionality and operation.
For the THSENS feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
14.1
Overview
The THSENS block is an embedded sensor for junction temperature monitoring.
Figure 67. THSENS block interface
[Block diagram: THSENS cell signals RSTN, DATA, CLK, DATAREADY, DCORRECT, PDN and
OVERFLOW; THSENS wrapper signals pclk_i, presetn_i, hi_thresh_i, lo_thresh_i,
int_hi_thresh_o and int_lo_thresh_o]
14.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
14.3
Clocks
The block receives the APB clock (wrapper logic) and a slower clock for the embedded
THSENS_065LP library cell.
See also: Chapter 5: Reset and clock generator (RCG).
14.4
Interrupts
The THSENS block generates the following interrupts:
● The “hi” interrupt line is set to ‘1’ when the measured temperature is higher than the “hi”
threshold value.
● The “lo” interrupt line is set to ‘1’ when the measured temperature is lower than the “lo”
threshold value.
See also: Appendix A: Interrupts.
14.5
Functional description
The block provides access to the THSENS_065LP library cell for temperature measurement
inside the SPEAr1340 embedded MPU. The output temperature value is provided on the
DATA output, which is registered on PCLK and latched on a version of DATAREADY
resynchronized to PCLK, because the THSENS_065LP cell runs on a slow clock.
Block inputs and outputs are accessed through the dedicated MISC register THSENS_CFG
(see the MISC chapter in RM0089, Reference manual, SPEAr1340 address map and
registers).
Temperature measurement range is from 20 degrees to 125 degrees Celsius. The typical
value to be set on DCORRECT input is 10.
As additional features with respect to simple temperature measurement, the wrapper allows
the generation of two interrupts that depend on the comparison of the value of temperature
read by the sensor with two threshold values fed as input to the wrapper. See Section 14.4:
Interrupts for the description of these interrupts.
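The comparison performed by the wrapper can be modelled in a few lines, for example in a test bench. This is a behavioural sketch only: the function name is hypothetical, and all three values are assumed to be expressed in the same units as the DATA output.

```python
def thsens_interrupts(temp, hi_thresh, lo_thresh):
    """Return (hi, lo) interrupt line levels for a measured value."""
    hi = temp > hi_thresh  # "hi" line: measurement above the "hi" threshold
    lo = temp < lo_thresh  # "lo" line: measurement below the "lo" threshold
    return hi, lo


print(thsens_interrupts(temp=100, hi_thresh=90, lo_thresh=30))
```

Both comparisons are strict, so a measurement equal to a threshold raises neither line in this model.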
14.5.1
Low power modes
For lower power consumption, use the PDN input pin to put the THSENS block in power-down mode.
15
Multiport DDR2/3 controller (MPMC)
This chapter focuses on MPMC functionality and operation.
For the MPMC feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
15.1
Overview
MPMC is a high performance multichannel memory controller able to support DDR2 and
DDR3 double data rate memory devices. The multiport architecture ensures that memory is
shared efficiently among different high-bandwidth client modules.
15.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
15.3
Clocks
See also: Chapter 5: Reset and clock generator (RCG).
The memory controller uses two different clock sources from two different PLLs present in
the system. The first clock source used for the six AXI data ports (axiY_ACLK) and AHB
register port (regHCLK) is the same used in the system for the AMBA interconnect. Its
frequency is fixed at 200 MHz. You can gate this clock through the mpmc_amba_clken
parameter in bit 0 of the PERIP2_CLK_ENB register.
The second clock source is used by the memory controller, the physical (PHY) interface, the
MIM structure and the memory. The relationship between the frequency value of the
memory controller and the PHY interface is fixed to 1:2. The translation between the
interconnect frequency and the controller frequency is accomplished by the FIFOs present
for each data AXI port. The maximum frequency for the PHY interface and memory interface
is 533 MHz. The memory interface has a dedicated clean clock source (I_DDRPHY_clk_ref).
The value can be programmed by the miscellaneous circuit through registers PLL4_FREQ
and PLL4_CTR. The MIM is used to translate the controller frequency into the PHY
frequency.
Figure 68 shows the relationship between the memory controller clocks.
Figure 68. MPMC clocks scheme
[Block diagram. Blocks: PLL1, PLL4, Interconnect, Controller core, MIM DFI to DFI bridge, PHY.
Clock labels: axiY_ACLK/regHCLK = 166 MHz; i_MPMC_clk = 266 MHz; i_DDRPHY_clk = 533 MHz;
i_DDRPHY_clk_ref (clean) = 533 MHz.
Relationships: axiY_ACLK/i_MPMC_clk = async; i_MPMC_clk/i_DDRPHY_clk = sync (/2)]
15.3.1
Changing the input clock frequency
The operating frequency of the memory controller is dependent on an ASIC-level input
clock. There are situations in which you may wish to modify the frequency of the clock
without resetting the memory controller. To change the clock frequency at which the memory
controller operates, the memory controller must stop processing requests, the clock must be
adjusted, the memory controller timing parameters must be reprogrammed and then the
memory controller can be restarted. The procedure to change the clock frequency is as
follows:
1. Ensure that the memory controller is idle, that is, that the controller_busy signal is low.
You can check the status of the controller_busy signal through the miscellaneous
register MPMC_CTR_STS.
2. Put the memory devices into self-refresh mode by setting the srefresh parameter to
‘b1. Do not use any means other than this parameter. To check whether the devices
have entered self-refresh mode, check the cke_status signal (register
MPMC_CTRL_REG_129).
3. Stop the memory controller by writing a ‘b0 to the start parameter.
4. The clock frequency may now be changed. Once the clock frequency has stabilized,
program the parameters with the updated values. Review any other parameters that
may be affected by the frequency change, such as: caslat, caslat_lin, caslat_lin_gate,
any of the timing parameters, and so forth, and modify as necessary.
5. After updating all parameters, restart the memory controller by writing a ‘b1 to the start
parameter. This forces the DLL to lock to the new frequency.
6. Once the DLL has locked and the PHY has initialized, the memory controller input
signal dfi_init_complete is asserted (check that bit 8 of the int_status parameter is
set). At this point, you can bring the memory devices out of self-refresh by clearing the
srefresh parameter to ‘b0. You do not need to wait to send commands to the memory
controller after clearing the srefresh parameter; the memory controller will adjust for
self-refresh exit time before processing memory commands.
7. If any of the memory mode registers require updating at this point, you must set the
values in the EMRS parameters, and then write them to the memory devices by setting
the write_modereg parameter to ‘b1.
15.4
Interrupts
The memory controller has the following main interrupt line connected to the processor
general interrupt circuit (refer to Chapter 2: CPU subsystem (A9SM)):
● The MPMC controller_int signal is connected to the interrupt line ID[92].
The controller_int signal is a level sensitive signal that is asserted when the memory
controller detects an interrupt condition.
15.5
Resets
There are two sets of reset logic within the multiport memory controller: the reset for the
core/phy and the reset for the AMBA ports.
The reset signal for the core/phy is the asynchronous active-low reset
(resetn_MPMC_ctrl/resetn_MPMC_phy) that resets all critical flip-flops in the system to
ensure that the core emerges from reset in a known state. When the core is reset, all
parameters are reset, and any commands within the core are lost. The core reset does not
reset the AMBA ports, but resetting the core without resetting the AMBA ports will generate
unknown behavior.
The AMBA port reset is the active-low asynchronous signal
(aresetn_MPMC/hresetn_MPMC). When this reset is asserted, the associated AMBA port
resets, the port FIFOs clear, and the pointers reset. To prevent corruption within the memory
controller, reset AXI ports only at initialization, or while a port is idle at the interface with no
commands within the core.
It is not required that both resets be activated simultaneously, but it is required that both
resets be asserted concurrently for at least five cycles. First remove the core reset, then
reset the port.
15.6
Functional description
The multiport memory controller was designed for high memory bandwidth utilization and
efficient arbitration for high priority requests. The architecture of the multiport system is
shown in Figure 69 and consists of the following:
● 6 AXI interfaces
● Arbiter
● Command queue with placement logic
● Write data queue
● Read data queue
● DRAM command processing
● Register port with an AHB interface
Figure 69. Multiport memory controller architecture
[Block diagram: AXI bus, AXI interfaces (write data interfaces and read data interfaces),
arbiter, MPMC core (command queue with placement logic, write data queue, read data queue,
DRAM command processing), PHY interface, AHB bus and register port (programmable register
settings)]
The interface blocks contain FIFOs for commands, read and write data, and handle any
clock domain crossings, if required. From the port interface blocks, commands are
processed through an arbiter which feeds single commands to the command queue of the
memory controller core. Write and read data is routed directly to the write and read data
queues of the memory controller core independently of the arbiter.
Each port has a distinct write data interface to the write data queue of the memory controller
core. However, for read data, all ports share a single read data interface back to the port
interface blocks.
15.6.1
AXI interface
The MPMC has 6 AXI data ports that function as AXI slaves to external AXI masters.
Transfers are burst-based of variable byte counts. The transfer types INCR and WRAP are
fully supported. FIXED burst types are not supported.
Table 105. AXI transfer type limitations
Name                        Description
axiY_ARBURST/axiY_AWBURST   AXI port Y read command burst type.
                            – ‘b00 = Reserved (FIXED is not supported)
                            – ‘b01 = INCR
                            – ‘b10 = WRAP
                            – ‘b11 = Reserved
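A bus monitor or test bench can apply the table above to check a burst field before a command reaches the port. A minimal sketch, with hypothetical names:

```python
# Burst encodings accepted by the MPMC AXI ports (from Table 105).
SUPPORTED_BURSTS = {0b01: "INCR", 0b10: "WRAP"}


def burst_type(axburst):
    """Return the burst name, or None for the reserved encodings 'b00 and 'b11."""
    if not 0 <= axburst <= 0b11:
        raise ValueError("the burst field is 2 bits wide")
    return SUPPORTED_BURSTS.get(axburst)


print(burst_type(0b01), burst_type(0b00))
```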
For the ports connected to the AXI bus directly, the thread ID signals (axiY_ARID or
axiY_AWID) identify which of the 16 thread IDs is associated with the command. This
thread ID is combined with the originating port to create a source ID which is used in the
core to maintain originator information. There are no restrictions on mapping of thread IDs
to AXI bus masters. The AXI interfaces handle all communication between the AXI bus and
the core. Each port always supports full-size transfers where the full data port width is
utilized on each beat. In addition, each port can be independently programmed to support
narrow transfers, where the bytes-per-beat size is less than the port data width. There are
no fixed timing requirements on a port between the traffic channels when the narrow transfer
option is disabled, and in this case, write data may arrive before, with, or after the write
command. When the narrow transfer option is enabled, a port does not accept write data
until it has received the command and is aware of the total byte count associated with that
command.
AXI transaction management
For optimization of the core, read commands from different thread IDs on a port, or read
commands from different ports, may be automatically re-arranged in the core to execute
out-of-order. When commands from different thread IDs are re-ordered, read data returned
to the AXI port interfaces will also be out-of-order and may be interleaved. To avoid
reordering within a port, the AXI bus master should use one thread ID for all commands
from any port. Note that read commands from the same thread ID on the same port will
always execute in the same order as they were accepted into their port.
Write commands have more ordering restrictions. Write commands from different ports may
be re-ordered, but write commands from the same port, from the same or different thread
ID, must remain in order. Write data may not be interleaved. While the AXI interface does
support multiple outstanding write instructions, the write data is expected to arrive in order.
Because the read and write channels are distinct, read and write commands from different
thread IDs on a port, from the same thread ID on a port, or from different ports, may be
re-ordered. These commands will be automatically re-arranged for optimal command
execution, as long as there are no collisions between the commands.
An incoming AXI transaction is mapped into a core-level transaction, then synchronized
from the AXI clock domain to the core clock domain and stored in the AXI port FIFOs. Each
instruction consists of an address, size, length and thread ID. Because a port may utilize
multiple thread IDs, the source ID that is used in the core is a combination of both the port
and thread information. This concatenation occurs in the arbiter and this source ID is used
in the placement logic. From the AXI FIFOs, the transaction is presented to the arbiter which
arbitrates requests from all ports and forwards a single transaction to the core.
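The combination of port and thread information can be pictured as a simple bit concatenation. The sketch below assumes that the port number occupies the bits above the 4-bit thread ID. The actual field order used inside the core is not stated in this section, so the layout and the names are illustrative only.

```python
THREAD_ID_BITS = 4  # each port carries one of 16 thread IDs
NUM_PORTS = 6       # AXI data ports 0 to 5


def source_id(port, thread_id):
    """Concatenate port and thread ID (port in the upper bits: assumed layout)."""
    if not 0 <= port < NUM_PORTS:
        raise ValueError("port out of range")
    if not 0 <= thread_id < (1 << THREAD_ID_BITS):
        raise ValueError("thread ID out of range")
    return (port << THREAD_ID_BITS) | thread_id


print(hex(source_id(5, 0xF)))
```

With this layout, two commands carrying the same thread ID on different ports still receive distinct source IDs, which is the property the placement logic relies on.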
AXI port configuration options
Each AXI port in the memory controller has been defined for the requirements of the
intended system. The configuration options are:
● Datapath width: Each port has a data interface width of 64 bits.
● Width of the ID: Each port is configured with a thread ID of 4 bits.
● Priority definition: Command priority is defined based on the port and the command
type. For each port Y, there is an axiY_r_priority parameter which defines priorities for
all read commands and an axiY_w_priority parameter which defines priorities for all
write commands. Supported priority values range from 0 to 7, with 0 as the highest
priority.
● Register port: An asynchronous AHB port is used to write the registers.
● Buffering: Each data port contains a command, a read and a write FIFO, and a
response storage array. In addition, each programmable port contains an
asynchronous response FIFO to synchronize the memory response to the port time
domain when operating asynchronously. The depth of each buffer in each port is listed
in Table 106.
Table 106. Configured AXI settings
Port     Port data   Command      Write FIFO   Read FIFO   Write response   Write response
number   width       FIFO depth   depth        depth       FIFO depth       storage array depth
0-6      64          8            8            8           8                8
● Exclusive access buffer depth: Exclusive access is an optional AXI feature that is
only supported by the memory controller. This type of access will only be used if
exclusive access commands are issued to the memory controller by driving the
axiY_ARLOCK signal to ‘b10 with a read command. Each port of this memory
controller contains 1 exclusive buffer and therefore each port may monitor the
exclusivity of up to 1 transaction at any time. Refer to Section 15.6.3: Initialization
protocol for more information.
● Locked access: There may be an occasion where a particular user wishes to have
access to the memory without interruption from other ports. The AXI locked access
option allows this functionality. The process is completely controlled by the user
through the types of commands sent to the memory controller.
● Error detection: When an illegal operational condition is detected on a new AXI
transaction entering the port, the port responds through an AXI error signal and the
controller interrupt signal, and the error signature is recorded in the register space. The
AXI error signal flagged is dependent on the type of transaction that caused the error
(read or write). The controller interrupt and the signature information is dependent on
type of error (command or data).
AXI port FIFOs
Incoming transactions from the AXI interfaces are processed by the interface logic and
mapped into equivalent transactions on the core bus. These transactions are queued into
each port’s command FIFO.
Each programmable port contains four FIFOs: command, read data, write data and
response synchronization. The response synchronization FIFOs are only used when
operating in asynchronous mode. In addition to the FIFOs, each port contains a storage
array to hold the read and write responses.
The five channels of traffic and their relationship to the port FIFOs are shown in Figure 70.
Figure 70. AXI interface blocks
[Block diagram: each port (0 to N) has write command, read command, write data, read data
and write response channels. The AXI interface block of each port contains port arbitration,
a command FIFO, a write FIFO, a read FIFO, a write response array and, when asynchronous, a
write response synch FIFO. In the core, the arbiter feeds the command queue with placement
logic, the write FIFOs feed the write data queue (N channels), and the read data queue
(1 channel for all ports) feeds the read FIFOs.]
● Command FIFO: The command FIFO accepts a single command from the in-port
arbitration logic and holds the following command information: address, command type,
encoded number of beats, encoded bytes-per-beat, bufferable/cacheable flag, coherent
bufferable flag, thread ID, exclusive access / locked access status.
● Read FIFO: The read FIFO holds the data signals sent back from the memory
controller, thread ID, last data byte and read data response. There is only one
streaming read data interface out from the core for all AXI ports, regardless of the
number of ports or the number of thread IDs for any port. The memory controller maps
this data stream back to the proper port. With this singular data interface, the AXI port
must be ready to accept the read data as soon as it is available on the internal core bus
to avoid stalling the memory controller. The read FIFO is also responsible for
synchronizing the data to the AXI time domain.
● Write FIFO: The write FIFO holds the data to send to memory, thread ID and data
mask. The purpose of the write FIFO is to allow the AXI bus to offload its write data
completely before the data is transferred to the core buffers. Each port has a distinct
write channel into the core write data queue. If there are multiple thread IDs for a port,
they will all share the channel for that port.
● Write response interface: When a write request is accepted into the AXI interface, an
entry will be created in the response storage array for that command. When a write
response is ready, the array will verify that this is the oldest command for any thread ID
on that port and if so, the response will be sent out. The timing of this response is
dependent on the type of response requested (buffered or cached) and the contents of
the command queue. The AXI interface may not issue write responses until responses
for all the older write commands have been issued; for cacheable responses, the entire
system may be held off waiting for that response.
This indication will be returned to the AXI master through the axiY_BRESP signal and
its associated valid indicator, axiY_BVALID. Different masters may require this
response at different stages of the write command. A master that needs to quickly
release the bus would optimally receive the completion response as soon as the port
has accepted the write command and all of the corresponding data. Another master
may wish to wait until the data has been accepted into the memory controller core, or
successfully written to memory.
Each data port is configured with two signals that work together to determine when an
instruction is considered complete and the write completion response (axiY_BRESP)
will be returned to the master. These signals are axiY_AWCACHE[0] and
axiY_AWCOBUF (axiY_AWCOBUF with Y=5…0 is controllable by means of the
miscellaneous register bits MPMC_CTR_STS[20:15]).
Table 107 details the relationship between axiY_AWCACHE and axiY_AWCOBUF.
Table 107. Write response signals

| axiY_AWCACHE[3:0] | axiY_AWCOBUF | Response information(1) |
|---|---|---|
| ’b0000 | Irrelevant | Non-bufferable write command. Response will be ready when the write data has been committed to memory. |
| ’b0001 | 0 | Standard bufferable write command. Response will be ready when the command and all associated data have been received by the AXI data port. There is no guarantee of data coherency across all AXI ports. |
| ’b0001 | 1 | Coherent bufferable write command. Response will be ready when the command has been accepted by the command queue in the memory controller core. This guarantees data coherency across all ports, while reducing the write response latency relative to the non-bufferable option. |
| ’bxxx- | – | All other settings are reserved. |

1. The response will only be sent if all of the older write responses have been issued for any thread ID on that
port.
Treat the axiY_AWCOBUF signals as any other write command control signal. If the system
cannot generate these signals on a per-command basis, it is recommended that these
signals be tied high or low.
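The decoding described for Table 107 can be sketched as a small lookup. This is an illustrative model only: the function name and the returned labels are hypothetical and do not correspond to hardware signals or register fields.

```python
def write_response_point(awcache, awcobuf):
    """Return the stage at which the write response becomes ready (per Table 107).

    awcache: 4-bit axiY_AWCACHE value; awcobuf: axiY_AWCOBUF bit (0 or 1).
    The returned strings are illustrative labels, not controller signals.
    """
    if awcache == 0b0000:
        # Non-bufferable write: axiY_AWCOBUF is irrelevant.
        return "committed_to_memory"
    if awcache == 0b0001:
        # Bufferable write: axiY_AWCOBUF selects standard or coherent behavior.
        return "accepted_by_core_command_queue" if awcobuf else "received_by_axi_port"
    raise ValueError("reserved axiY_AWCACHE setting")


for cache, cobuf in [(0b0000, 0), (0b0001, 0), (0b0001, 1)]:
    print(f"AWCACHE={cache:04b} AWCOBUF={cobuf}: {write_response_point(cache, cobuf)}")
```

A master that must release the bus quickly would correspond to the "received_by_axi_port" case, while a master that needs coherency across ports would use the coherent bufferable setting.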
15.6.2 AHB interface
The register interface is an independent AHB port to the memory controller. This port
converts the AHB register addresses to core register addresses. It operates
asynchronously and contains a 4-deep asynchronous FIFO. The register port supports only
the AHB SINGLE burst type (INCR and WRAP bursts are not supported) and only the NONSEQ
and IDLE transfer types (SEQ and BUSY are not supported). The port supports accesses with
a bytes-per-beat value equal to or less than the width of the AHB register bus.
Table 108. AHB transfer type limitations

| Name | Description |
|---|---|
| regHBURST | AHB register burst size. ’b000 = Single beat (SINGLE). All other settings are reserved. |
| regHTRANS | AHB register transaction type indicator. ’b00 = Idle. ’b01 = Reserved (busy is not supported). ’b10 = Non-sequential. ’b11 = Reserved (sequential is not supported). |
| regHRESP | AHB register transfer response. Only “Okay” and “Error” are supported for AHB. ’b00 = OKAY. ’b01 = ERROR. ’b10 = Reserved (RETRY is not supported). ’b11 = Reserved (SPLIT is not supported). |
All parameters related to the AHB port operation are located in the core register map. These
parameters are programmed during the initialization sequence along with all of the other
device parameters.
A typical boot-up sequence includes a reset of the AXI ports as well as the core, followed by
programming of the core through the AHB register port.
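As a quick illustration of the register-port limits in Table 108, the check below accepts only the supported burst and transfer types. The function name and the bus_width_bytes argument are assumptions introduced for this sketch; the encodings are those listed in the table.

```python
HBURST_SINGLE = 0b000                      # only supported regHBURST value
HTRANS_IDLE, HTRANS_NONSEQ = 0b00, 0b10    # only supported regHTRANS values


def register_port_accepts(hburst, htrans, bytes_per_beat, bus_width_bytes):
    """Return True if a transfer respects the register-port limits of Table 108.

    bus_width_bytes is the width of the AHB register bus in bytes (assumed input).
    """
    return (
        hburst == HBURST_SINGLE
        and htrans in (HTRANS_IDLE, HTRANS_NONSEQ)
        and bytes_per_beat <= bus_width_bytes
    )


print(register_port_accepts(0b000, 0b10, 4, 4))  # single, non-sequential, full width
print(register_port_accepts(0b001, 0b10, 4, 4))  # any burst other than SINGLE is reserved
```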
15.6.3 Initialization protocol
For correct operation, the memory controller requires a specific sequence after all power to
the system and to the memory devices is stable. The memory controller does not include
circuitry to control the activation of power and ground to the system. Once the power to the
memory devices and the system is stable, the memory controller must be initialized, and it
will then automatically initialize the memory devices. Use the following procedure to initialize
the memory controller:
1. Clear the resetn_MPMC_ctrl signal by driving it to ’b0. All programmable registers are
cleared.
2. Set the resetn_MPMC_ctrl signal synchronously with the memory controller clock by
driving the signal to ’b1.
3. Issue write register commands to configure the DRAM protocols. Keep the start
parameter de-asserted during this initialization step.
4. Assert the start parameter. This triggers the memory controller to execute the
initialization sequence using the parameters written into the registers. The memory
controller waits for the PHY to assert the dfi_init_complete signal (bit 8 of int_status
parameter), which indicates that the PHY and the memory devices are ready to accept
commands.
15.6.4 Exclusive access
The exclusive access feature allows a master to monitor if a memory area has been altered
since its last read. Exclusive access does not imply that the memory area is locked; other
thread IDs of that port, or other ports, may access the area for reads or writes even though
an exclusive access exists. If any writes occur to a memory area with a valid exclusive
access request, the master will lose exclusivity and be informed of this status when it
attempts to write to the area again. A loss of exclusivity does not trigger an interrupt or any
error conditions; however, the AXI protocol requires that the write data is not written to
memory if an exclusive write fails its exclusivity check. The master that has lost exclusivity
must determine whether to restart the sequence by requesting another exclusive read or to
write the data to the memory regardless via a non-exclusive write.
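The behavior described above can be illustrated with a toy software model. It is a sketch under stated assumptions, not the controller's implementation: the class name, the per-address granularity of the monitored area and the use of a string as master identifier are all hypothetical.

```python
class ExclusiveMonitor:
    """Toy model of exclusive access: monitoring without locking."""

    def __init__(self):
        self.memory = {}
        self.reservations = set()  # (master, address) pairs that are still exclusive

    def exclusive_read(self, master, addr):
        self.reservations.add((master, addr))
        return self.memory.get(addr, 0)

    def write(self, master, addr, value, exclusive=False):
        """Return True if the data was written to memory."""
        if exclusive and (master, addr) not in self.reservations:
            return False  # exclusivity lost: data is not written, no error is raised
        self.memory[addr] = value
        # Any write to the area ends every exclusive access that covers it.
        self.reservations = {r for r in self.reservations if r[1] != addr}
        return True


mon = ExclusiveMonitor()
mon.exclusive_read("A", 0x100)
mon.write("B", 0x100, 7)                              # the area is not locked
first_try = mon.write("A", 0x100, 9, exclusive=True)  # fails: area was altered
mon.exclusive_read("A", 0x100)                        # master A restarts the sequence
second_try = mon.write("A", 0x100, 9, exclusive=True)
print(first_try, second_try, mon.memory[0x100])
```

The failed exclusive write leaves memory unchanged; master A then chooses between repeating the exclusive read, as here, or issuing a non-exclusive write.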
15.6.5 Error responses
● AXI error response: When an illegal operational condition is detected on a new AXI
transaction entering the port, the port responds with an error condition. Instructions that
generate AXI errors result in unpredictable behavior, and may cause memory
corruption and/or hang conditions. When the programmable ports are programmed to
asynchronous mode, the error signature is serialized and sent to the memory controller
as a single data stream. This eliminates the need for an asynchronous FIFO to capture
error information, but adds a delay to interrupt generation that does not impact the
timing of the responses.
● Write errors: Error responses on a write operation are sent on the write response
channel through the axiY_BRESP bus. A single response is sent for each write
command. Write error responses are generated if the decoded bytes-per-beat is less
than the port data width when the narrow transfer option is not selected. This is an error
for write commands only when the axiY_AWLEN is greater than 0 (Command Error).
● Read errors: For read commands, the error is sent on the read data channel through
the axiY_RRESP bus. The response is sent along with each data word. Read error
responses are generated if a double-bit ECC error is detected on a read, and reporting
is enabled in the ctrl_raw parameter. In addition to the controller interrupt and
controller-level status signals, double-bit ECC errors on read commands also trigger an
AXI response. For default transfers, the error is sent with the beats that caused the
error. If the error was associated with a narrow transfer, the error is sent with each beat
of the erroneous data word.
● AXI error reporting: If an AXI command error occurs, a bit will be set in the int_status
parameter and the address and source ID of the command are saved in the
port_cmd_error_addr and port_cmd_error_id parameters, respectively. In addition, the
access type or types that relate to the error are stored in the port_cmd_error_type
parameter. Similarly, when a data error occurs, the source ID of the command is saved
in the port_data_error_id parameter. The access type or types that relate to the error
are stored in the port_data_error_type parameter. The bits in the error type parameters
are not exclusive. Multiple bits may be set to indicate the type of errors that occurred.
Reading the int_ack parameter allows future errors to be captured in these error
parameters. Read these parameters if the axiY_BRESP or axiY_RRESP signals are
set. If multiple errors occur prior to an acknowledgment of the first error, the parameters
still represent the first error attributes. Other error signatures are lost. If multiple errors
occur simultaneously on different ports, the error information represents the lowest
numbered erring port. Single-bit and double-bit ECC errors are also reported in the
int_status parameter and the error signature parameters as detailed in Section 15.6.5:
Error responses.
15.6.6 Multiport arbiter
The arbiter is responsible for arbitrating requests from the ports and sending requests to the
memory controller core. This memory controller supports the weighted round-robin
arbitration scheme, which is based on a three-step arbitration system. All commands are
routed into priority groups based on the priority of the requests. Then, within each priority
group, requests are serviced according to the “weight” (relative priority) of each port. Finally,
each priority group presents a single command to the priority select module, which passes
the highest priority command on to the memory controller core. This arbitration scheme
also supports two additional features. For situations where the priority and the relative
priority for multiple commands are identical, a port ordering system is included whereby the
user may adjust the order in which the ports are considered. Secondly, for situations where
two ports may be related, a mechanism is included which allows a pair of ports to share
arbitration bandwidth for bandwidth efficiency.
Round-robin operation
Round-robin operation is the simplest form of arbitration and is ideal for systems that do not
require requests to be treated preferentially to maintain bandwidth or minimize latency. This
scheme uses a counter that rotates through the port numbers, incrementing every time a
port request is granted. If the port that the counter is referencing has an active request, and
the memory controller core command queue is not full, then this request will be sent to the
memory controller core. If there is not an active request for that port, then the port will be
skipped and the next port will be checked. The counter will increment by one whenever any
request has been processed, regardless of which port’s request was arbitrated. Round-robin
arbitration ensures that each port’s requests can be successfully arbitrated into the memory
controller core every N cycles, where N is the number of ports in the memory controller. No port will
ever be locked out, and any port can have its requests serviced on every cycle as long as all
other ports are quiet and the command queue is not full.
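The counter behavior described above can be modelled in a few lines. This is a minimal sketch: the function name is hypothetical, and wrapping the counter modulo the number of ports is an assumption based on the statement that the counter rotates through the port numbers.

```python
def round_robin_grant(counter, requesting, num_ports, queue_full=False):
    """Return (granted_port or None, next_counter) for one arbitration cycle."""
    if queue_full:
        return None, counter  # nothing can be sent to the core this cycle
    for offset in range(num_ports):
        port = (counter + offset) % num_ports
        if port in requesting:
            # The counter advances by one whichever port was granted.
            return port, (counter + 1) % num_ports
    return None, counter  # no active requests


counter, busy_grants = 0, []
for _ in range(6):
    port, counter = round_robin_grant(counter, {0, 1, 2, 3}, num_ports=4)
    busy_grants.append(port)
print(busy_grants)  # every port is served once every 4 cycles

counter, lone_grants = 0, []
for _ in range(4):
    port, counter = round_robin_grant(counter, {2}, num_ports=4)
    lone_grants.append(port)
print(lone_grants)  # a lone requester is served on every cycle
```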
Port priority
For AXI ports, the priority is associated with a port, and each port has separate priority
parameters for reads and writes. These values are stored in the programmable parameters
axiY_r_priority and axiY_w_priority (where Y represents the port number) at controller
initialization. Internally, the ports are organized into priority groups based on their priority
setting. The priority value is also used by the placement logic inside the memory controller
core when filling the command queue. A priority value of 0 is highest priority, and a priority
value of (decimal) 7 is the lowest priority in the memory controller.
Note:
You can program at priority level 0, but it is better to reserve this priority value so that the
placement queue can elevate to this level through aging.
Relative priority
Inside each priority group, the relative priority is used to determine arbitration. The memory
controller contains 8 identical priority groups with logic that selects between the requests
from all commands at that priority level. The relative priority parameters
axiY_priorityZ_relative_priority (where Y is the port number and Z is the priority group)
“weight” the ports for each level and determine how the priority group will be arbitrated.
Figure 71 shows this type of arbitration system. By using the relative priority concept, the
arbitration is skewed in favor of certain ports based on user programming.
Note:
The relative priority parameters have a minimum acceptable value of 1 to prevent port
lockout. A 0 value will cause an error condition.
Figure 71. Weighted round-robin priority group structure
[Figure: commands from ports 0-5 pass through priority sorting into priority groups 0 to 7
(priority 0 commands to priority 7 commands). Each priority group Z is configured by the
programmable register settings axi0_priorityZ_relative_priority to
axi5_priorityZ_relative_priority and feeds the priority select module, which passes
commands to the command queue of the MC core.]
If the relative priorities are all programmed to the same value within any priority group, then
the arbitration will mimic a version of the simple round-robin scheme within that priority group.
Instead of incrementing whenever any request is processed, the round-robin counter
will only increment to the next port after the number of requests specified by the
axiY_priorityZ_relative_priority parameter has been processed.
Each port Y at priority level Z will be allocated a share equal to the ratio of that port’s relative
priority parameter (axiY_priorityZ_relative_priority) to the sum of all requesting ports’ relative
priority values. If a particular port is not requesting, then it is not included in the sum
calculation, which means that the arbitration will be split in relative proportion among the
requesting ports.
As an example, consider a system with 4 ports where all requests are at priority 0. This
system is described in Table 109.
Table 109. Relative priority example

| Parameter | System A |
|---|---|
| axi0_priority0_relative_priority | 1 |
| axi1_priority0_relative_priority | 2 |
| axi2_priority0_relative_priority | 3 |
| axi3_priority0_relative_priority | 4 |
For this system, port 0 will be serviced 1/(1+2+3+4) = 1/10 of the time and port 3 will be
serviced 4/(1+2+3+4) = 4/10 of the time. However, if port 2 is not actively requesting, then
port 0 will be serviced 1/(1+2+4) = 1/7 of the time and port 3 will be serviced 4/(1+2+4) = 4/7
of the time.
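The share calculation can be checked with exact fractions. The helper below is a hypothetical illustration (its name and arguments are not controller parameters); the weights are the System A values from Table 109.

```python
from fractions import Fraction


def arbitration_shares(relative_priority, requesting):
    """Share of arbitration per requesting port: weight / sum of requesting weights."""
    total = sum(relative_priority[port] for port in requesting)
    return {port: Fraction(relative_priority[port], total) for port in requesting}


weights = {0: 1, 1: 2, 2: 3, 3: 4}  # axiY_priority0_relative_priority, System A

all_ports = arbitration_shares(weights, [0, 1, 2, 3])
without_port2 = arbitration_shares(weights, [0, 1, 3])

print(all_ports[0], all_ports[3])          # 1/10 and 4/10 (printed reduced, as 2/5)
print(without_port2[0], without_port2[3])  # 1/7 and 4/7
```

Ports that are not requesting are simply left out of the list, which is why the denominators differ between the two cases.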
To ensure that relative priorities are maintained, there is a weight counter for each port
within each priority group. These counters track the number of transactions accepted for that
port in that priority group. When any counter value reaches the programmed relative port
priority, the scan order for that priority group will be internally modified. The port that has
met its relative priority will be dynamically positioned to the bottom of the scan order (and its
counter will be reset), allowing other ports a preferential position.
Note:
For ports that are not expected to issue requests at a certain priority level, program the
associated relative priority parameter to 0x1. This allows for minimum allocation without the
risk of lock out in case a command appears.
Port ordering
With simple round-robin arbitration, the ports are scanned based on their port number in
incrementing order in the system. Assuming that the command queue is not full, the port
referenced by the counter is examined for valid incoming transactions. If there is an active
request, it will be accepted. Otherwise, the next port in the scan order will be checked, and
its request accepted.
For the memory controller with weighted round-robin arbitration, the user has the option of
adjusting the order that the ports are scanned. This is useful if requests from certain ports
are more critical, or if a specific order may reduce contention between ports.
The three-bit axiY_port_ordering parameters are used to set this new scan order. A value of
’b000 gives the highest listing in the scan order, and a value of ’b111 is the lowest listing in
the scan order.
If the 6 axiY_port_ordering parameters are programmed with unique values, then the scan
order will be modified to proceed sequentially in this new order. If any of the port ordering
parameters share the same value, then those ports will still be equal in the arbitration test. In
this case, the port number will select between these ports, with the lower-numbered port
automatically being selected first.
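The resulting scan order can be sketched as a sort on the ordering value with the port number as tie-breaker. The function name is hypothetical; note that, according to the error conditions listed later in this section, non-unique ordering values are also flagged as a programming error.

```python
def scan_order(port_ordering):
    """Return port numbers in scan order.

    port_ordering[Y] is the 3-bit axiY_port_ordering value; lower values are
    scanned first, and equal values fall back to the lower port number.
    """
    return sorted(range(len(port_ordering)), key=lambda port: (port_ordering[port], port))


print(scan_order([5, 4, 3, 2, 1, 0]))  # unique values: order fully reversed
print(scan_order([1, 0, 1, 0, 2, 2]))  # ties resolved by port number
```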
Weighted round-robin arbitration summary
The memory controller weighted round-robin arbitration system combines the concepts of
round-robin operation, priority, relative priority and port ordering. The incoming commands
are separated into priority groups based on the priority of the associated port for that type of
command. Within each priority group, the relative priority values are examined to determine
the arbitration winner. If the relative priority values are identical and no individual command
can be selected, then the scan order is used to select between the requests. In the end, the
highest priority command, from the highest relative priority port, with the highest location in
the scan order will be selected and sent to the memory controller core.
As an example, consider the system described in Table 110. The counters refer to the
counters that exist for each port within each priority group to ensure that relative priorities
are maintained. For simplification, the command queue is considered to never be full and
commands are only received at priority level 0. The behavior is shown in Table 111. The
highest port in the scan order that is requesting always wins arbitration, and the scan order
is dynamically modified when any port counter reaches its allocated relative priority value.
Note that if the command queue was considered, then cycles where the command queue
was full would not have any arbitration winner and therefore, the counter values and scan
order would not change on that cycle.
Table 110. System D specifications

| Parameter | Port 0 | Port 1 | Port 2 | Port 3 |
|---|---|---|---|---|
| axiY_priority0_relative_priority | 4 | 3 | 2 | 1 |
| axiY_port_ordering | 0 | 1 | 2 | 3 |
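Table 111. System D operation

| Cycle | Ports requesting (one Y per requesting port) | Arbitration winner | Next counter P0 | Next counter P1 | Next counter P2 | Next counter P3 | Next scan order |
|---|---|---|---|---|---|---|---|
| Initial | – | – | – | – | – | – | P0-P1-P2-P3 |
| 0 | Y Y | P0 | 1 | 0 | 0 | 0 | P0-P1-P2-P3 |
| 1 | Y Y Y | P0 | 2 | 0 | 0 | 0 | P0-P1-P2-P3 |
| 2 | Y Y Y Y | P0 | 3 | 0 | 0 | 0 | P0-P1-P2-P3 |
| 3 | Y Y Y Y | P0 | 4 | 0 | 0 | 0 | P1-P2-P3-P0 |
| 4 | Y Y Y Y | P1 | 0 | 1 | 0 | 0 | P1-P2-P3-P0 |
| 5 | Y Y Y Y | P1 | 0 | 2 | 0 | 0 | P1-P2-P3-P0 |
| 6 | Y Y Y Y | P1 | 0 | 3 | 0 | 0 | P2-P3-P0-P1 |
| 7 | Y Y Y | P2 | 0 | 0 | 1 | 0 | P2-P3-P0-P1 |
| 8 | Y Y Y | P2 | 0 | 0 | 2 | 0 | P3-P0-P1-P2 |
| 9 | Y Y | P3 | 0 | 0 | 0 | 1 | P0-P1-P2-P3 |
| 10 | Y Y Y | P0 | 1 | 0 | 0 | 0 | P0-P1-P2-P3 |
| 11 | Y Y | P2 | 1 | 0 | 1 | 0 | P0-P1-P2-P3 |
| 12 | Y Y | P2 | 1 | 0 | 2 | 0 | P0-P1-P3-P2 |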
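The System D behavior can be reproduced with a short simulation of one priority group. This is a sketch, not the arbiter's actual logic: the function name is hypothetical, the command queue is assumed never to be full, and the specific ports requesting in each cycle are an assumption chosen so that the number of requesters and the winners agree with Table 111.

```python
def simulate_wrr(weights, ordering, requests_per_cycle):
    """Simulate one priority group; return (winner, counters, next scan order) per cycle."""
    ports = range(len(weights))
    scan = sorted(ports, key=lambda p: (ordering[p], p))  # initial scan order
    counters = [0] * len(weights)
    rows = []
    for requesting in requests_per_cycle:
        # The highest requesting port in the scan order wins arbitration.
        winner = next((p for p in scan if p in requesting), None)
        if winner is not None:
            counters[winner] += 1
        shown = list(counters)  # value reported in the "Next counter" columns
        if winner is not None and counters[winner] >= weights[winner]:
            counters[winner] = 0  # relative priority met: reset the counter
            scan.remove(winner)   # and move the port to the bottom of the scan order
            scan.append(winner)
        rows.append((winner, shown, list(scan)))
    return rows


ALL = {0, 1, 2, 3}
# Assumed request pattern (which ports request in each cycle is hypothetical).
requests = [{0, 1}, {0, 1, 2}, ALL, ALL, ALL, ALL, ALL,
            {1, 2, 3}, {1, 2, 3}, {2, 3}, {0, 1, 2}, {2, 3}, {2, 3}]
rows = simulate_wrr(weights=[4, 3, 2, 1], ordering=[0, 1, 2, 3],
                    requests_per_cycle=requests)
for cycle, (winner, counters, scan) in enumerate(rows):
    print(cycle, f"P{winner}", counters, "-".join(f"P{p}" for p in scan))
```

In this model a counter is shown at its programmed weight in the cycle where the weight is met and is reset before the next cycle, which matches the way the counter values appear in the table.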
Priority relaxing
A lower priority level will not win arbitration in weighted round-robin arbitration unless there
are no higher priority requests. This could mean that, in a situation where high priority
requests are being received continuously, lower priority requests could be locked out
indefinitely. To avoid this scenario and control the arbitration latency for lower-priority
commands, it is possible to disable priority groups temporarily. This is known as priority
relaxing, and it is a time-controlled function. Each higher priority group will be temporarily
disabled when the pre-set counter value for the lower priority group has been reached and a
request is waiting. The axiY_priority_relax parameters set the counter value for port Y at
which the priority relax condition will be triggered. The timing counters inside each port are
controlled by the weighted_round_robin_latency_control parameter.
When the latency control bit is set to ’b1, the timing counters are free-running. Any timing
counter may hit its axiY_priority_relax value at any point. When this occurs, higher-priority
groups are disabled to allow a waiting request for this port to be processed. This results in a
random latency for each port, but the maximum latency is fixed at the axiY_priority_relax
value. If the current port does not have any commands waiting when the timing counter hits
the relax value, then the counter will be reset and the arbiter will function normally.
When the weighted_round_robin_latency_control parameter is cleared to ’b0, the timing
counters only count while that port has a waiting request that is not being processed. In this
case, when the port’s axiY_priority_relax parameter value is reached, all priority groups at
priority levels higher than the waiting request are disabled. This port’s command is granted
arbitration and is moved through to the memory controller core.
Because the priority relax parameters and counters are associated with individual ports, it is
possible that multiple priority relax counters could reach their specified value
simultaneously. In this case, the lower priority command will be arbitrated first and then the
higher priority command. This situation could alter the arbitration latency slightly, causing it
to be longer than the expected value in the priority relax parameter.
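The two counter modes can be contrasted with a small per-port timer model. This is a sketch under assumptions: the class and method names are hypothetical, and resetting the counter when a request is granted in the non-free-running mode is an assumption that the text does not state.

```python
class RelaxTimer:
    """Per-port priority relax timer (illustrative model only)."""

    def __init__(self, relax_value, free_running):
        self.relax_value = relax_value    # axiY_priority_relax
        self.free_running = free_running  # weighted_round_robin_latency_control
        self.count = 0

    def tick(self, waiting):
        """Advance one cycle; return True when higher-priority groups are disabled."""
        if self.free_running or waiting:
            self.count += 1
        if self.count < self.relax_value:
            return False
        self.count = 0
        return waiting  # with no waiting command the counter just resets

    def granted(self):
        """Assumed behavior: a grant restarts the count in the non-free-running mode."""
        if not self.free_running:
            self.count = 0


free = RelaxTimer(relax_value=3, free_running=True)
free_result = [free.tick(waiting) for waiting in (False, False, True)]
gated = RelaxTimer(relax_value=3, free_running=False)
gated_result = [gated.tick(waiting) for waiting in (False, True, True, True)]
print(free_result)   # request arriving on the third cycle is relaxed immediately
print(gated_result)  # request waits the full relax value before being relaxed
```

The free-running case shows why the latency is variable but bounded: a request may arrive at any point in the count, so it waits at most the relax value.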
Port pairing
The memory controller arbiter incorporates a feature which allows adjacent ports to be
grouped together and considered jointly for arbitration. The
weighted_round_robin_weight_sharing parameter controls this function, with one bit per
pair of ports in the memory controller. Bit 0 controls ports 0 and 1, Bit 1 controls ports 2 and
3, etc. Because the ports are grouped together, their relative priorities are not considered
separately. As described in the Relative priority section, the general formula for port priority
allocation is the ratio of that port’s relative priority parameter
(axiY_priorityZ_relative_priority) to the sum of all requesting ports’ relative priority values. In
this case, the relative priority value of only one of the paired ports is used for the sum
calculation. This means that the bandwidth will be divided differently among the ports.
Note:
For port weight sharing to be used, the relative priority parameters for the port pair must be
programmed to the same value, and the port order of the paired ports must be sequential. If
either condition is not followed, an error bit is set to ’b1.
Error conditions
With the programming complexities of the weighted round-robin arbitration scheme, an error
reporting mechanism is included to notify users of illegal programming scenarios. These
error conditions generate a memory controller core interrupt and set a bit in the
wrr_param_value_err parameter to ’b1 (see bits 16-19 of the Controller configuration
register 47).
The potential error conditions are:
● Bit 16 = The 6 axiY_port_ordering parameters do not all contain unique values.
● Bit 17 = Any of the axiY_priorityZ_relative_priority parameters have been programmed
with a zero value. A 0 value leads to unknown behavior. The minimum allowable value
is 1.
● Bit 18 = Any ports, whose related bit of the weighted_round_robin_weight_sharing
parameter is set to ’b1, do not have the same values in their
axiY_priorityZ_relative_priority parameters.
● Bit 19 = For ports whose related bit of the weighted_round_robin_weight_sharing
parameter is set to ’b1, the values of the axiY_port_ordering parameters are not
sequential.
If bits 16, 18 or 19 are set to ’b1 in the wrr_param_value_err parameter, and any of the ports
are paired in the weighted_round_robin_weight_sharing parameter, then all weight sharing
data will be ignored during memory controller initialization and the ports will be prioritized by
port number. If port pairing is not being used, but the bit 16 error condition is set to ’b1, then
ports with a non-unique port ordering are prioritized by port number.
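A software pre-check of a planned configuration against these four conditions might look like the sketch below. The function name and argument layout are hypothetical, and "sequential" is interpreted here as port ordering values that differ by exactly one, which is an assumption.

```python
def wrr_param_value_err(port_ordering, relative_priority, weight_sharing):
    """Return a bit mask using the bit 16-19 positions listed above.

    port_ordering[Y]         axiY_port_ordering value for port Y
    relative_priority[Y][Z]  axiY_priorityZ_relative_priority for port Y, group Z
    weight_sharing           bit k pairs ports 2k and 2k+1
    """
    err = 0
    if len(set(port_ordering)) != len(port_ordering):
        err |= 1 << 16  # port ordering values are not unique
    if any(value == 0 for per_port in relative_priority for value in per_port):
        err |= 1 << 17  # a relative priority of zero
    for pair in range(len(port_ordering) // 2):
        if (weight_sharing >> pair) & 1:
            a, b = 2 * pair, 2 * pair + 1
            if relative_priority[a] != relative_priority[b]:
                err |= 1 << 18  # paired ports have different relative priorities
            if abs(port_ordering[a] - port_ordering[b]) != 1:
                err |= 1 << 19  # paired ports are not sequential (assumed: differ by 1)
    return err


ordering = [0, 1, 2, 3, 4, 5]
weights = [[1] * 8 for _ in range(6)]  # 6 ports, 8 priority groups
print(hex(wrr_param_value_err(ordering, weights, weight_sharing=0b001)))
print(hex(wrr_param_value_err([0, 0, 2, 3, 4, 5], weights, weight_sharing=0)))
```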
15.6.7 Command queue with placement logic
The memory controller core contains a command queue that accepts commands from the
arbiter. This command queue uses a placement algorithm to determine the order that
commands execute in the memory controller core. The placement logic follows many rules
to determine where new commands should be inserted into the queue, relative to the
contents of the command queue at the time. Placement is determined by considering
address collisions, source collisions, data collisions, command types and priorities. In
addition, the placement logic attempts to maximize efficiency of the memory controller core
through command grouping and bank splitting. Once placed into the command queue, the
relative order of commands is constant.
Many of the rules used in placement may be individually enabled/disabled. In addition, the
queue may be disabled by clearing the placement_en parameter, resulting in an in-line
queue that services requests in the order they are received. If the placement_en parameter
is cleared to ’b0, the placement algorithm will be ignored.
The rules of the placement algorithm are the following:
● Address collision/Data coherency violation: To avoid address collisions, reads or
writes that access the same chip select, bank and row as a command already in the
command queue will be inserted into the command queue after the original command,
even if the new command is of a higher priority. This factor may be enabled/disabled
through the addr_cmp_en parameter and should be disabled only if the system can
guarantee coherency of reads and writes.
● Source ID collision: Each port is assigned a specific source ID that is a combination of
the port and thread ID information, and identifies the source uniquely. This allows the
memory controller to map data from/to the correct source/destination. In general, read
commands from the same source ID will be placed in the command queue in order.
Therefore, a read command with the same source ID as a read command already in
the command queue will be processed after the original read command. All write
commands from a port, even with different source IDs, will be executed in order. If there
are no address conflicts, a read command could be executed ahead of a write
command with the same source ID, and likewise a write command could be executed
ahead of a read command with the same source ID. This feature will always be
enabled.
● Write buffer collision: Incoming write requests in the command queue are allocated
to one of the 4 write buffers of the memory controller core automatically based on
availability. New write commands will be designated to any available buffer. However,
back-to-back write requests from a particular source ID will be allocated to the same
write buffer as the previous command. Because the memory controller core must pull
data out of the buffers in the order it was stored, if a write command is linked to a buffer
that is associated with another command in the queue, then the new command will be
placed in the command queue after that command, regardless of priority. This feature
will always be enabled.
● Priority: The placement algorithm will attempt to place higher priority commands
ahead of lower priority commands, as long as they have no source ID, write buffer or
address collisions. Higher priority commands will be placed lower in the command
queue if they access the same address, are from the same requestor or use the same
buffer as lower priority commands already in the command queue. This feature is
enabled through the priority_en parameter.
● Bank splitting: Before accesses can be made to two different rows within the same
bank, the first active row must be closed (pre-charged) and the new row must be
opened (activated). Both activities require some timing overhead; therefore, for
optimization, the placement queue will attempt to insert the new command into the
command queue such that commands to other banks may execute during this timing
overhead. The placement of the new commands will still follow priority, source ID, write
buffer and address collision rules. The placement logic will also attempt to optimize the
memory controller core by inserting a command to the same bank as an existing
command in the command queue immediately after the original command. This
reduces the overall timing overhead by potentially eliminating one pre-charging/
activating cycle. This placement will only be possible if there are no priority, source ID,
write buffer or address collisions or conflicts with other commands in the command
queue. All bank splitting features are enabled through the bank_split_en parameter.
● Read/Write grouping: The memory suffers a small timing overhead when switching
from read to write mode. For efficiency, the placement queue will attempt to place a
new read command sequentially with other read commands in the command queue, or
a new write command sequentially with other write commands in the command queue.
Grouping will only be possible if no priority, source ID, write buffer or address collision
rules are violated. This feature is enabled through the rw_same_en parameter.
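The interaction between the collision rules and the priority rule can be sketched with a simplified insertion function. This is an illustration only and not the controller's algorithm: all names are hypothetical, and write buffer collisions, bank splitting and read/write grouping are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Cmd:
    is_write: bool
    port: int
    source_id: int
    cs: int
    bank: int
    row: int
    priority: int  # 0 is the highest priority


def collides(old, new):
    """Address and source ID rules only (simplified)."""
    same_area = (old.cs, old.bank, old.row) == (new.cs, new.bank, new.row)
    same_source_reads = (not old.is_write and not new.is_write
                         and old.source_id == new.source_id)
    same_port_writes = old.is_write and new.is_write and old.port == new.port
    return same_area or same_source_reads or same_port_writes


def placement_index(queue, new):
    """Index at which the new command would be inserted (0 = top of the queue)."""
    floor = 0
    for i, old in enumerate(queue):
        if collides(old, new):
            floor = i + 1  # must stay behind every colliding command
    for i in range(floor, len(queue)):
        if queue[i].priority > new.priority:
            return i       # ahead of the first lower-priority command
    return len(queue)


queue = [Cmd(False, 0, 1, cs=0, bank=0, row=10, priority=3),
         Cmd(True, 1, 2, cs=0, bank=1, row=20, priority=5)]
urgent = Cmd(False, 2, 3, cs=0, bank=2, row=30, priority=1)
same_row = Cmd(False, 2, 3, cs=0, bank=1, row=20, priority=1)
print(placement_index(queue, urgent))    # no collision: placed at the top
print(placement_index(queue, same_row))  # address collision: placed after the write
```

The second case shows a higher priority command being placed lower in the queue because it accesses the same chip select, bank and row as a command that is already queued.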
Once a command has been placed in the command queue, its order relative to the other
commands in the queue at that time is fixed. While this provides simplicity in the algorithm,
there are drawbacks. For this reason, the memory controller offers two options that affect
commands once they have been placed in the command queue:
● Command aging: Because commands can be inserted ahead of existing commands
in the command queue, the situation could occur where a low priority command
remains at the bottom of the queue indefinitely. To avoid such a lockout condition, aging
counters have been included in the placement logic that measure the number of cycles
that each command has been waiting. If command aging is enabled through the
active_aging parameter, then if an aging counter hits its maximum, the priority of the
associated command will be decremented by one (commands with lower priority values are
executed first). This increases the likelihood that this command will move to the top of
the command queue and be executed. Note that this command does not move relative
positions in the command queue when it ages; the new priority will be considered when
placing new commands into the command queue. Aging is controlled through a master
aging counter and command aging counters associated with each command in the
command queue. The age_count and command_age_count parameters hold the initial
values for each of these counters, respectively. When the master counter counts down
the age_count value, a signal is sent to the command aging counters to decrement.
When a command aging counter has completely decremented, the priority of
the associated command is decremented by one and the counter is reset.
Therefore, a command does not age by a priority level until the number of elapsed cycles has
reached the product of the age_count and command_age_count values. The maximum
number of cycles that any command can wait in the command queue until reaching the
top priority level is the product of the age_count value, the command_age_count value,
and the number of priority levels in the system.
●
15.6.8
High-priority command swapping: Commands are assigned priority values to
ensure that critical commands are executed more quickly in the memory controller than
less important commands. Therefore, it is desirable that high-priority commands pass
into the memory controller core as soon as possible. The placement algorithm takes
priority into account when determining the order of commands, but still allows a
scenario in which a high-priority command sits waiting at the top of the command
queue while another command, perhaps of a lower priority, is in process. The
high-priority command swapping feature allows this new high-priority command to be
executed more quickly. If the user has enabled the swapping function through the
swap_en parameter, then the entry at the top of the command queue will be compared
with the current command in progress. If the command queue’s top entry is of a higher
priority (not the same priority), and it does not have an address, source ID or write
buffer conflict with the current command being executed, then the original command
will be interrupted. For this memory controller, an additional check is performed before
a read command is interrupted. If the read command in progress and the read
command at the top of the command queue are from the same port, then the executing
command will only be interrupted if the swap_port_rw_same_en parameter is set to
’b1. If this parameter is cleared to ’b0, a read command from the same port as a read
command in progress, even with a higher priority and without any conflicts, would
remain at the top of the command queue while the current command completes.
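As a rough illustration of the two options above, the following Python sketch computes the aging bounds stated in the text and models the swap decision. The function names, and the convention that a numerically lower value means a higher priority, are assumptions made for this example; the actual logic is implemented in the controller hardware.

```python
def cycles_per_priority_step(age_count, command_age_count):
    """Cycles a command waits before its priority changes by one level."""
    return age_count * command_age_count


def max_wait_cycles(age_count, command_age_count, priority_levels):
    """Upper bound on the cycles a command waits before reaching top priority."""
    return age_count * command_age_count * priority_levels


def should_swap(swap_en, top_priority, current_priority, has_conflict,
                same_port_reads=False, swap_port_rw_same_en=False):
    """Return True if the command in progress would be interrupted.

    Assumption: a numerically lower priority value means a higher priority.
    has_conflict stands for any address, source ID or write buffer conflict.
    """
    if not swap_en:
        return False
    if top_priority >= current_priority:  # must be strictly higher priority
        return False
    if has_conflict:
        return False
    if same_port_reads and not swap_port_rw_same_en:
        return False
    return True
```

For example, with the arbitrary values age_count = 16, command_age_count = 4 and 8 priority levels, a command ages by one level every 64 cycles and waits at most 512 cycles.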
Other memory controller features
Out-of-range address checking
Because the master may attempt to write to an invalid address, all incoming addresses are
always checked against the addressable physical memory space. If a transaction is
addressed to an out-of-range memory location, bit 0 of the int_status parameter is set to 1b1
to alert the user of this condition. The memory controller records the address, source ID,
and the length and type of transaction that caused the out-of-range interrupt in the
out_of_range_addr, out_of_range_source_id, out_of_range_length and out_of_range_type
parameters.
Reading the out-of-range parameters causes the memory controller to clear these
parameters so that they can store out-of-range access information for future errors. The
interrupt is acknowledged by setting bit 0 of the int_ack parameter to 1b1, which in turn
causes bit 0 of the int_status parameter to clear to 1b0.
If a second out-of-range access occurs before the first out-of-range interrupt is
acknowledged, bit 1 of the int_status parameter is set to 1b1 to indicate that multiple
out-of-range accesses occurred. If the out-of-range parameters have been read when the
second out-of-range error occurs, the details of this transaction are stored in the out-of-range
parameters. If they have not been read, the details of the second error are lost.
Even though the address has been identified as erroneous, the memory controller still
processes the read or write transaction. A read transaction returns random data that the
user must receive to avoid stalling the memory controller. A standard, non-exclusive write
transaction will write the associated data to an unknown location in the memory array,
potentially over-writing other stored data. A command cannot be aborted once accepted
into the memory controller.
Table 112. Out of range access parameters
Parameter name
Description
out_of_range_addr [34:0]
Transaction Address
out_of_range_source_id [6:0]
Bits [6:4] = Port ID
Bits [3:0] = AXI Thread ID
out_of_range_length [6:0]
Total byte count of the transaction.
For write commands: (axiY_AWLEN + 1) x 2^axiY_AWSIZE.
For read commands: (axiY_ARLEN + 1) x 2^axiY_ARSIZE.
out_of_range_type [5:0]
6b000000 = Non-exclusive write
6b000001 = Non-exclusive read
6b000010 = Non-exclusive masked write
6b000100 = Wrapped write
6b000101 = Wrapped read
6b000110 = Wrapped masked write
6b001000 = Exclusive write
6b001001 = Exclusive read
6b001010 = Exclusive masked write
6b010000 = Flushed write
All other settings Reserved
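The field layout in Table 112 can be sketched in Python as follows. This is only an illustration of how software might interpret the values after reading the parameters; the function names are hypothetical, and how the parameters are read from the controller registers is not shown.

```python
# Encodings of out_of_range_type[5:0] as listed in Table 112.
OUT_OF_RANGE_TYPES = {
    0b000000: "Non-exclusive write",
    0b000001: "Non-exclusive read",
    0b000010: "Non-exclusive masked write",
    0b000100: "Wrapped write",
    0b000101: "Wrapped read",
    0b000110: "Wrapped masked write",
    0b001000: "Exclusive write",
    0b001001: "Exclusive read",
    0b001010: "Exclusive masked write",
    0b010000: "Flushed write",
}


def decode_source_id(source_id):
    """Split out_of_range_source_id[6:0] into (port ID, AXI thread ID)."""
    port_id = (source_id >> 4) & 0x7   # bits [6:4]
    thread_id = source_id & 0xF        # bits [3:0]
    return port_id, thread_id


def decode_type(type_value):
    """Map out_of_range_type[5:0] to its description."""
    return OUT_OF_RANGE_TYPES.get(type_value & 0x3F, "Reserved")


def transaction_bytes(axlen, axsize):
    """Total byte count: (AxLEN + 1) x 2^AxSIZE."""
    return (axlen + 1) * (1 << axsize)
```

For instance, a burst of four beats (AxLEN = 3) of 4 bytes each (AxSIZE = 2) gives a total byte count of 16.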
Self-refresh handshaking protocol
You may manually trigger the memory devices to enter self-refresh mode by setting the
srefresh parameter to 1b1 or by driving the user-interface signal srefresh_enter high. Either
of these methods will cause the memory controller to complete the active processes inside
the memory controller and then put the memory devices into self-refresh. The CKE input will
be deasserted in this mode.
In some circumstances, you may require confirmation that the memory devices have
entered self-refresh mode. For this, the srefresh_ack acknowledge signal has been
implemented. This acknowledge is only available when self-refresh mode is triggered by
driving the srefresh_enter pin, and only if the pin is held high until the acknowledge is
received. Pulsing the srefresh_enter signal is enough to trigger entry to the self-refresh
mode, but the acknowledge requires the signal to be held.
Once asserted, the srefresh_ack signal will be de-asserted when the srefresh_enter pin is
de-asserted.
Data byte disable
In addition to the DFI signals, the memory controller provides a sideband signal,
data_byte_disable, to the PHY indicating its data bus status. This signal’s width is a sum of
the width of the memory data bus and the ECC data bus. For each bit of the bus that the
memory controller is not using, the data_byte_disable signal will be driven to ‘b1. This signal
is a concatenation of ECC functionality and data path reduction. The PHY should use this
information to disable bits on the PHY/memory interface.
Half datapath option
This memory controller includes the option to reduce the usable size of the bus between the
memory controller and the memory devices. This feature is useful when a different memory
part, with a smaller data width, is utilized. To use a memory device with a smaller datapath,
the half datapath option must be enabled by setting the programmable reduc parameter to
1b1.
When the reduc parameter is set to 1b1, only the lower half of the DFI data bus is used. In
this setting, the upper half of the data_byte_disable signal will be driven high.
If the reduc parameter is cleared to 1b0, the memory controller will ignore the half datapath
option and function normally. In this case, the entire DFI data interface will be used.
Idle drive enable
For minimal power usage, the memory controller provides an option to disable the data and
strobe buses when the memory controller is idle. If the user sets the drive_dq_dqs
parameter to 1b1 and the memory controller is idle, the idle_drive_enable signal will be
driven high. The memory controller is considered idle when there are no transactions
currently in progress and the memory controller core command queue is empty.
15.6.9
Address mapping
The memory controller automatically maps user addresses to the DRAM memory in a
contiguous block. Addressing starts at user address 0 and ends at the highest available
address according to the size and number of DRAM devices present. This mapping is
dependent on how the memory controller was configured and how the parameters in the
internal MC registers are programmed. The exact number and values of these parameters
depend on the configuration and the type of memory for which the memory controller was
designed.
The mapping of the address space to the internal data storage structure of the DRAM
devices is based on the actual size of the DRAM devices available. The size is stored in
user-programmable parameters that must be initialized at power up. Certain DRAM devices
allow for different mapping options to be chosen, while other DRAM devices depend on the
burst length chosen.
DDR SDRAM address mapping options
The address structure of DDR SDRAM devices contains five fields. Each of these fields can
be individually addressed when accessing the DRAM. The address map for this memory
controller is ordered as follows:
Chip Select -- Row -- Bank -- Column -- Datapath
The maximum widths of the fields are based on the configuration settings. The actual widths
of the fields may be smaller if the device address width parameters (addr_pins,
eight_bank_mode and column_size) are programmed differently.
Maximum address space
The maximum user address range is determined by the width of the memory datapath, the
number of chip select pins, and the address space of the DRAM device. The maximum
amount of memory can be calculated by the following formula:
MaxMemBytes = ChipSelects x 2^Address x NumBanks x DPWidthBytes
For this memory controller, the maximum values for these fields are as follows:
●
Chip selects = 2
●
Device address = 15 + 14 (Row + Column)
●
Number of banks per chip select = 8
●
Memory datapath width in bytes = 4 bytes
As a result, the maximum accessible memory area is 1 GB or 2 GB depending on
configuration.
Memory mapping to address space
The maximum allowable address space and mapping into the DRAM devices for the
memory controller is shown in Figure 72. This map corresponds to a memory device with 15
row bits and 14 column bits.
Figure 72. Memory controller memory map: maximum
The addr_pins and column_size parameters can each range from the maximum configured
for the memory controller to seven bits smaller than the maximum configured. This allows
the memory controller to function with a wide variety of memory device sizes. The settings
for the addr_pins and column_size parameters control how the address map is used to
decode the user address to the DRAM chip selects and row and column addresses. The
eight_bank_mode parameter controls the address when eight bank mode is supported. It is
assumed that the values in these parameters never exceed the maximum values
configured. Using the example shown in Figure 72, if the memory controller is wired to
devices with 12 row pins and 12 column bits, the maximum accessible memory space would
be reduced. The accessible memory space for this configuration is 1024 MB. The address
map for this configuration is shown in Figure 73. Note that address bits 30 through 34 are
listed as ‘don’t care’ bits. These bits are ignored when the memory controller generates the
address to the DRAM devices, but they are used to verify that the address lies within the
usable address range of the memory controller. Therefore, the user should drive these bits
to ‘b0 to avoid the memory controller interpreting the command as being out-of-range and
setting one or both of the out-of-range interrupt bits.
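The size formula and the 'don't care' bits of the example above can be checked with a short Python sketch. The function names are hypothetical. Note that the formula describes the address space of the controller configuration; as stated in the text, the accessible memory area on the device is 1 GB or 2 GB depending on configuration.

```python
def max_mem_bytes(chip_selects, row_bits, col_bits, banks, dp_width_bytes):
    """MaxMemBytes = ChipSelects x 2^Address x NumBanks x DPWidthBytes."""
    return chip_selects * (1 << (row_bits + col_bits)) * banks * dp_width_bytes


def used_address_bits(chip_selects, row_bits, col_bits, banks, dp_width_bytes):
    """Number of low-order user address bits needed to span the memory."""
    size = max_mem_bytes(chip_selects, row_bits, col_bits, banks, dp_width_bytes)
    return (size - 1).bit_length()


# Example from the text: 12 row bits, 12 column bits, 2 chip selects,
# 8 banks and a 4-byte datapath.
size_mb = max_mem_bytes(2, 12, 12, 8, 4) // (1024 * 1024)
first_dont_care_bit = used_address_bits(2, 12, 12, 8, 4)
```

Here size_mb evaluates to 1024 and first_dont_care_bit to 30, which agrees with the 1024 MB figure and with address bits 30 through 34 being unused. With the maximum field widths (15 row bits and 14 column bits), the same formula yields 2^35 bytes, which corresponds to the 35-bit width of out_of_range_addr [34:0].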
Figure 73. Alternate memory map
Note:
1
The Chip Select, Row, Bank, and Column fields are used to address an entire memory
word, and the memory controller bits are used to address individual bytes within that user
word. For example, for a read starting at byte address 0x2, the memory controller bits must
be defined as 3b010 in order to address this byte directly. Reads and writes are memory
word-aligned if all the memory controller bits are 0.
2
The maximum accessible memory area is 1 GB or 2 GB, depending on configuration. When
the 2 GB address space is enabled, the ACP function is not available.
16
Static memory controller (FSMC)
This chapter focuses on FSMC functionality and operation.
For the FSMC feature list, refer to the SPEAr1340 datasheet:
●
Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
●
RM0089, Reference manual, SPEAr1340 address map and registers
16.1
Overview
The flexible static memory controller (FSMC) is an AHB peripheral that interfaces AHB
masters to a wide variety of memories. A wrapper is designed to contain this IP and a
multiplexer (MUX) which selects the appropriate signals to connect to the pads depending
on the type of memory.
Figure 74. FSMC and embedded MPU boundary
[Block diagram: inside the SoC boundary, the FSMC with its configuration registers, the AHB, the on-chip logic and the I/O pads; outside it, the on-board bus connecting SRAM, NOR Flash and NAND Flash]
16.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
16.3
Clocks
The FSMC receives HCLK, the AHB clock, running at 166 MHz.
See also: Chapter 5: Reset and clock generator (RCG).
16.4
Interrupts
Each NAND Flash connected to SPEAr1340 can generate an interrupt to signal the end of
the BUSY state. When this happens, the FSMC can forward the interrupt to the ARM core via
pc*_int, sampled according to the configuration register GenMemCtrl_int.
See also: Appendix A: Interrupts.
16.5
Functional description
The flexible static memory controller is used to interface an AHB bus to external memories.
The main purposes of the FSMC are:
●
Translate the AHB protocol into the appropriate external storage device protocol
●
Meet the timing of the external devices, slowing down and counting an appropriate
number of HCLK (AHB clock) cycles to complete the transaction to the external device.
16.5.1
NAND Flash controller
The following accesses are supported for NAND Flash:
●
Common memory space access: this is the normal way of accessing the NAND
Flash. The data size is specified in DeviceWidth field of GenMemCtrl_PCx registers
and corresponding timings must be specified in GenMemCtrl_Commx registers.
●
Attribute memory space access: this is the same as the common memory access
mode, except that timings are specified in the GenMemCtrl_Attrib register.
FSMC can support up to 2 memory banks for NAND Flash. The following table lists the
criteria used to select each bank.
Table 113. NAND bank selection
Address (HEX)    Region name               Bank selected    Chip select
0xB0800000       Common memory space       Bank 0           FSMC_CE0n
0xB0880000       Not used                  Bank 0           FSMC_CE0n
0xB0900000       Attribute memory space    Bank 0           FSMC_CE0n
0xB0980000       Not used                  Bank 0           FSMC_CE0n
0xB0A00000       Common memory space       Bank 1           FSMC_CE1n
0xB0A80000       Not used                  Bank 1           FSMC_CE1n
0xB0B00000       Attribute memory space    Bank 1           FSMC_CE1n
0xB0B80000       Not used                  Bank 1           FSMC_CE1n
16.5.2
NOR Flash / SRAM controller
FSMC can support up to 2 memory banks for NOR Flash and SRAM. The following table
shows the criteria used to select each bank.
Table 114. NOR/SRAM bank selection
Address (HEX)    Bank selected    Chip select
0xA0000000       Bank 0           FSMC_CE0n
0xA4000000       Bank 1           FSMC_CE1n
The lower bits of HADDR are issued to the external memory taking into account that
HADDR is expressed in bytes while the memory is addressed in memory words.
The following table is used and does not depend on the actual bus data transfer size HSIZE.
Table 115. External memory address
Memory word size    HADDR bits issued to memory
1 byte              HADDR[25:0]
2 bytes             HADDR[25:1]
When the bus data size (HSIZE) is smaller than the memory word size and the memory is an SRAM,
the controller uses the byte lanes (BLN outputs). For instance, reading or writing a byte
(HSIZE=00, generated by the ARM assembly instructions LDRB or STRB) from/to a 16-bit wide
SRAM is managed automatically by the controller with the BLN outputs.
If it is a Flash, the controller reads the whole memory word and uses only the information
needed. There is no hardware mechanism to avoid writing to a Flash memory less than one
full memory word.
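The bank selection criteria of Table 113 and Table 114, together with the address translation of Table 115, can be sketched as a small Python address decoder. The function names are hypothetical. The sketch assumes that each table row covers the range up to the next listed address (so the last NAND region is assumed to end at 0xB0C00000) and that each NOR/SRAM bank spans 64 MB, which is what the table spacing and HADDR[25:0] suggest.

```python
NAND_BASE = 0xB0800000
NAND_REGION_SIZE = 0x00080000          # spacing between rows of Table 113
NAND_REGIONS = ("Common memory space", "Not used",
                "Attribute memory space", "Not used")

NOR_BASE = 0xA0000000
NOR_BANK_SIZE = 0x04000000             # spacing between rows of Table 114


def nand_select(haddr):
    """Return (region name, bank, chip select) for a NAND address."""
    index = (haddr - NAND_BASE) // NAND_REGION_SIZE
    if not 0 <= index < 8:
        raise ValueError("address outside the NAND regions of Table 113")
    bank = index // 4
    return NAND_REGIONS[index % 4], bank, "FSMC_CE%dn" % bank


def nor_select(haddr):
    """Return (bank, chip select) for a NOR Flash / SRAM address."""
    bank = (haddr - NOR_BASE) // NOR_BANK_SIZE
    if not 0 <= bank < 2:
        raise ValueError("address outside the NOR/SRAM banks of Table 114")
    return bank, "FSMC_CE%dn" % bank


def external_address(haddr, word_bytes):
    """Address issued to the memory, expressed in memory words (Table 115)."""
    low = haddr & 0x03FFFFFF           # HADDR[25:0]
    if word_bytes == 1:
        return low                     # HADDR[25:0]
    if word_bytes == 2:
        return low >> 1                # HADDR[25:1]
    raise ValueError("memory word size must be 1 or 2 bytes")
```

For example, byte address 0xA0000006 in a 16-bit memory selects Bank 0 and is issued to the memory as word address 3.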
16.5.3
Asynchronous operating modes
The interface signals are synchronized by the internal clock HCLK. This clock is not output
to the memory; however, it is shown in the following figures as a reference.
When the extended mode is enabled (ExtendMode bit set in the GenMemCtrlx register),
there are four extended modes available (A, B, C and D) and it is possible to mix these
modes in read and write access. For example, read in mode A and write in mode B.
When the extended mode is disabled, the FSMC operates in Mode1 or Mode2 as follows:
●
Mode 1 is the default mode when SRAM memory type is selected (Bits 3:2
MemoryType = 0x0 in the GenMemCtrlx register).
●
Mode 2 is the default mode when NOR memory type is selected (Bits 3:2
MemoryType = 0x2 in the GenMemCtrlx register).
When the extended mode is disabled, it is not possible to mix modes in read and write
access.
Table 116. FSMC asynchronous operating modes
Asynchronous mode          Memory type: SRAM    Memory type: NOR
Extended mode disabled     Mode 1               Mode 2
Extended mode enabled      Mode A               Mode B, C, D
To select between the four asynchronous access modes, you must configure the
AccessMode bits in the GenMemCtrl_timx register as follows:
–
00: Access mode A
–
01: Access mode B
–
10: Access mode C
–
11: Access mode D
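The mode selection rules above can be summarized in a short Python sketch. The function name and argument layout are hypothetical; only the MemoryType and AccessMode encodings are taken from the text. The sketch does not check that the chosen extended mode suits the memory type (Table 116 pairs Mode A with SRAM and Modes B, C and D with NOR).

```python
MEMORY_TYPE_SRAM = 0x0   # MemoryType value for SRAM
MEMORY_TYPE_NOR = 0x2    # MemoryType value for NOR

ACCESS_MODES = {0b00: "Mode A", 0b01: "Mode B", 0b10: "Mode C", 0b11: "Mode D"}


def async_mode(memory_type, extend_mode, access_mode=0b00):
    """Return the asynchronous operating mode for one access direction."""
    if not extend_mode:
        if memory_type == MEMORY_TYPE_SRAM:
            return "Mode 1"
        if memory_type == MEMORY_TYPE_NOR:
            return "Mode 2"
        raise ValueError("memory type not covered by this sketch")
    return ACCESS_MODES[access_mode & 0b11]
```

Because read and write accesses may use different modes when the extended mode is enabled, the function would be called once per direction, for example with access mode 00 for reads and 01 for writes.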
The following sections describe the asynchronous access modes in more detail.
Note:
For a detailed description of FSMC timing requirements, please refer to Doc ID 023063,
Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU.
Mode 1 - SRAM asynchronous access
Figure 75 and Figure 76 show the timings for a typical SRAM access.
Figure 75. SRAM asynchronous read access
[Timing diagram: HCLK, FSMC_CExn, FSMC_ADx (address valid), FSMC_BLxn, FSMC_IO (data read), FSMC_REn, FSMC_WEn, data strobe; t1 = 4 cycles (Addr_ST = 3), t2 = 5 cycles (Data_ST = 4)]
Figure 76. SRAM asynchronous write access
[Timing diagram: HCLK, FSMC_CExn, FSMC_ADx (address valid), FSMC_BLxn, FSMC_IO (data write), FSMC_REn, FSMC_WEn; t1 = 4 cycles (Addr_ST = 3), t2 = 5 cycles (Data_ST = 4), with a 1 HCLK cycle interval marked]
Mode A - SRAM asynchronous access with FSMC_REn toggling
Figure 77 and Figure 78 show the timings for a typical SRAM access with FSMC_REn
toggling.
Mode A is similar to Mode 1, with the following differences:
●
FSMC_REn toggling
●
Independent read and write timings
Figure 77. SRAM asynchronous read access with FSMC_REn toggling
[Timing diagram: HCLK, FSMC_CExn, FSMC_ADx (address valid), FSMC_BLxn, FSMC_IO (data read), FSMC_REn, FSMC_WEn, data strobe; t1 = 4 cycles (Addr_ST = 3), t2 = 5 cycles (Data_ST = 4)]
Figure 78. SRAM asynchronous write access with FSMC_REn toggling
[Timing diagram: HCLK, FSMC_CExn, FSMC_ADx (address valid), FSMC_BLxn, FSMC_IO (data write), FSMC_REn, FSMC_WEn; t1 = 4 cycles (Addr_ST = 3), t2 = 5 cycles (Data_ST = 4), with a 1 HCLK cycle interval marked]
Mode 2/Mode B - NOR Flash asynchronous access
The only difference between Mode 2 and Mode B is that read and write timings are the
same when the extended mode is disabled (Mode 2), or can be different when the extended
mode is enabled (Mode B).
Mode 2 and Mode B are similar to Mode 1, with the following differences:
●
FSMC_REn toggling
●
Independent read and write timings when extended Mode is set (Mode B).
Figure 79 and Figure 80 show the timings for a typical NOR Flash access.
Figure 79. NOR Flash asynchronous read access
[Timing diagram: HCLK, FSMC_CExn, FSMC_ADx (address valid), FSMC_AV, FSMC_IO (data read), FSMC_REn, FSMC_WEn, data strobe; t1 = 4 cycles (Addr_ST = 3), t2 = 5 cycles (Data_ST = 4)]
Figure 80. NOR Flash asynchronous write access
[Timing diagram: HCLK, FSMC_CExn, FSMC_ADx (address valid), FSMC_AV, FSMC_IO (data write), FSMC_REn, FSMC_WEn; t1 = 4 cycles (Addr_ST = 3), t2 = 5 cycles (Data_ST = 4), with a 1 HCLK cycle interval marked]
Mode C - NOR Flash asynchronous access with FSMC_REn toggling
Figure 81 and Figure 82 show the timings for a typical NOR Flash access with FSMC_REn
toggling.
Mode C is similar to Mode 1, with the following differences:
●
FSMC_AV toggling
●
FSMC_REn toggling
Figure 81. NOR Flash asynchronous read access with FSMC_REn toggling
[Timing diagram: HCLK, FSMC_CExn, FSMC_ADx (address valid), FSMC_AV, FSMC_IO (data read), FSMC_REn, FSMC_WEn, data strobe; t1 = 4 cycles (Addr_ST = 3), t2 = 5 cycles (Data_ST = 4)]
Figure 82. NOR Flash asynchronous write access with FSMC_REn toggling
[Timing diagram: HCLK, FSMC_CExn, FSMC_ADx (address valid), FSMC_AV, FSMC_IO (data write), FSMC_REn, FSMC_WEn; t1 = 4 cycles (Addr_ST = 3), t2 = 5 cycles (Data_ST = 4), with a 1 HCLK cycle interval marked]
Mode D - Asynchronous access with extended address
Figure 83 and Figure 84 show the timings for an asynchronous access with extended
address.
Mode D is similar to Mode 1, with the following differences:
●
FSMC_AV toggling
●
FSMC_REn toggling extended beyond FSMC_AV change
Figure 83. Asynchronous read access with extended address
[Timing diagram: HCLK, FSMC_CExn, FSMC_ADx (address valid), FSMC_AV, FSMC_IO (data read), FSMC_REn (with OEN_delay), FSMC_WEn, data strobe; t1 = 3 cycles (Addr_ST = 2), th = 3 cycles (Hold_addr = 2), t2 = 3 cycles (Data_ST = 2)]
Figure 84. Asynchronous write access with extended address
[Timing diagram: HCLK, FSMC_CExn, FSMC_ADx (address valid), FSMC_AV, FSMC_IO (data write), FSMC_REn, FSMC_WEn; t1 = 3 cycles (Addr_ST = 2), th = 3 cycles (Hold_addr = 2), t2 = 3 cycles (Data_ST = 2), with a 1 HCLK cycle interval marked]
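The annotations in the timing figures of this section (for example t1 = 4 cycles for Addr_ST = 3, or th = 3 cycles for Hold_addr = 2) suggest that each phase lasts the programmed value plus one HCLK cycle. The Python sketch below works through that relation under this assumption, using the 166 MHz HCLK given in the Clocks section. The function names are hypothetical, and the sum covers only the annotated phases, not any extra cycles at the start or end of an access; the datasheet remains the reference for timing requirements.

```python
HCLK_HZ = 166_000_000  # AHB clock frequency stated in the Clocks section


def phase_cycles(field_value):
    """HCLK cycles for one phase, assuming duration = programmed value + 1."""
    return field_value + 1


def annotated_cycles(addr_st, data_st, hold_addr=None):
    """Sum of the annotated phases: t1 (+ th in Mode D) + t2."""
    total = phase_cycles(addr_st) + phase_cycles(data_st)
    if hold_addr is not None:
        total += phase_cycles(hold_addr)
    return total


def cycles_to_ns(cycles, hclk_hz=HCLK_HZ):
    """Convert a number of HCLK cycles to nanoseconds."""
    return cycles * 1e9 / hclk_hz
```

With the values shown in the figures, annotated_cycles(3, 4) and annotated_cycles(2, 2, hold_addr=2) both give 9 cycles, or roughly 54 ns at 166 MHz.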
16.5.4
ECC calculation
FSMC has 2 hardware ECC calculator blocks, based on BCH coding. Each block corrects
up to 8 bit errors in a 512-byte data block; the data block size is not programmable. Each
block serves a single NAND chip select; ECC hardware blocks are not shared among NAND
memories.
After having written the 512-byte data, BCH encoder takes about 29 AHB clock cycles to
calculate the ECC code. The bit 15 in GenMemCtrl_Status register flags when ECC
calculation is completed.
The ECC code is 104 bits and stored in the following registers:
●
GenMemCtrl_ECCrx, where x is the Bank
●
GenMemCtrl_ECC2rx, where x is the Bank
●
GenMemCtrl_ECC3rx, where x is the Bank
●
GenMemCtrl_Status[24:16]
All registers are read-only; attempts to write them are ignored.
When reading back, the 512 bytes of data must be stored temporarily in RAM. The 13 bytes
of ECC previously written must also be read; however, there is no need to store them in RAM
because the BCH decoder in the FSMC captures them automatically.
About 301 AHB clock cycles after all data has been read from the NAND, the BCH decoder
provides the information in the registers mentioned above.
The final correction must be done in software inverting the appropriate bit in the buffer RAM.
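The software correction step can be sketched in Python as shown below. How the decoder reports error locations is described in the register manual and is not covered here, so this example simply assumes that the locations have already been extracted as bit offsets into the 512-byte buffer (byte index x 8 + bit index). That convention, the function name, and the choice to skip offsets outside the data area are all assumptions.

```python
BLOCK_BYTES = 512        # fixed ECC data block size
MAX_CORRECTABLE = 8      # maximum number of bit errors per block


def apply_corrections(buffer, error_bit_offsets):
    """Invert the erroneous bits in a 512-byte buffer, in place.

    Assumption: each offset is (byte index * 8 + bit index). Offsets beyond
    the data area (for example, errors located in the ECC bytes) are skipped.
    Returns the number of data bits that were inverted.
    """
    if len(buffer) != BLOCK_BYTES:
        raise ValueError("buffer must hold exactly one 512-byte block")
    if len(error_bit_offsets) > MAX_CORRECTABLE:
        raise ValueError("more errors than the BCH code can correct")
    fixed = 0
    for offset in error_bit_offsets:
        byte_index, bit_index = divmod(offset, 8)
        if byte_index < BLOCK_BYTES:
            buffer[byte_index] ^= 1 << bit_index
            fixed += 1
    return fixed
```

The buffer is expected to be a mutable bytearray holding the data read from the NAND.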
16.5.5
Bus turn around
External memories share the same address and data buses. During a read access, the
data bus is driven by the selected memory. If the next access is made to a memory with
multiplexed I/Os, a data bus turnaround condition occurs because the controller needs to drive
addresses on the data bus. This can lead to bus contention if the previously accessed
memory has not released the bus fast enough.
To prevent this, each time the FSMC performs a read access (random read, single access,
or the last of a burst) to any kind of memory, a bus turnaround delay is introduced between
the completion of the current read transaction (FSMC_REn and chip select disabled by the
controller) and the next transaction. This delay lasts (BusTurn+1) AHB clock (HCLK) cycles,
where BusTurn is the value programmed in the timing register GenMemCtrl_tim of the
selected memory.
If the memory system includes only non-muxed memories, BusTurn can be set to the
minimum, otherwise set it to fulfill the worst (slowest) memory turnaround time.
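As a worked example, the following Python sketch derives the smallest BusTurn value that covers a given bus release time, using the (BusTurn+1) relation above and the 166 MHz HCLK from the Clocks section. The function names and the 20 ns release time used below are illustrative assumptions; the real value comes from the slowest memory's datasheet, and the upper limit of the BusTurn field is not checked here.

```python
import math

HCLK_HZ = 166_000_000  # AHB clock frequency stated in the Clocks section


def bus_turn_delay_cycles(bus_turn):
    """Turnaround delay in HCLK cycles for a programmed BusTurn value."""
    return bus_turn + 1


def min_bus_turn(release_time_ns, hclk_hz=HCLK_HZ):
    """Smallest BusTurn whose delay is at least release_time_ns."""
    t_hclk_ns = 1e9 / hclk_hz
    cycles_needed = math.ceil(release_time_ns / t_hclk_ns)
    return max(0, cycles_needed - 1)
```

For a memory that needs 20 ns to release the bus, min_bus_turn(20.0) returns 3, which corresponds to a delay of 4 HCLK cycles (about 24 ns).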
17
Serial NOR Flash controller (SMI)
This chapter focuses on SMI functionality and operation.
For the SMI feature list, refer to the SPEAr1340 datasheet:
●
Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
●
RM0089, Reference manual, SPEAr1340 address map and registers
17.1
Overview
The serial memory interface integrated in SPEAr1340 acts as an AHB slave interface (32-,
16- or 8-bit) to SPI-compatible off-chip memories. SMI allows the CPU to use these serial
memories either as data storage or for code execution.
Figure 85. SMI block diagram
[Block diagram: AHB slave interface on the AMBA AHB bus, SMI data processing and control, SMI clock prescaler (1 to 127), control and status register, transmit register, receive register/status register; clock, bank select, data/command and data/status lines to the SPI-compatible memories]
17.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
17.3
Clocks
The memory clock (smi_clk_o) is generated by SMI through its programmable prescaler
unit.
The incoming AHB bus frequency fAHB (HCLK signal) is divided by the value stored in the
PRESC field of CR1 register, resulting in the SMI clock frequency fSMICLK:
fSMICLK = fAHB / (PRESC value)
that is:
tSMICLK = tAHB • (PRESC value)
where tSMICLK and tAHB are the clock periods of the SMI clock and the AHB bus,
respectively.
fSMICLK can be up to 20 MHz in normal mode and up to 50 MHz in fast read mode.
Note:
If PRESC is an even value, the high time and low time of the SMI clock are both equal to half a
tSMICLK. In contrast, if PRESC is an odd value:
tSMICLK, high = tSMICLK • [(PRESC - 1) / 2] / PRESC
tSMICLK, low = tSMICLK • [(PRESC + 1) / 2] / PRESC
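The prescaler relations above can be illustrated with a short Python sketch. The function names are hypothetical. The 166 MHz AHB frequency used in the example is the HCLK value quoted for the FSMC and is assumed to apply here as well, and the 1 to 127 prescaler range is taken from the SMI block diagram. PRESC = 1 is excluded from the high/low calculation because the odd-value formula would give a zero high time and the text does not describe that case separately.

```python
import math


def smi_clock_hz(f_ahb_hz, presc):
    """fSMICLK = fAHB / PRESC."""
    return f_ahb_hz / presc


def smi_high_low(t_smiclk, presc):
    """Return (high time, low time) of the SMI clock for PRESC >= 2."""
    if presc < 2:
        raise ValueError("this sketch only covers PRESC >= 2")
    if presc % 2 == 0:
        return t_smiclk / 2, t_smiclk / 2
    high = t_smiclk * ((presc - 1) // 2) / presc
    low = t_smiclk * ((presc + 1) // 2) / presc
    return high, low


def min_prescaler(f_ahb_hz, f_max_hz):
    """Smallest PRESC that keeps fSMICLK at or below f_max_hz."""
    presc = max(1, math.ceil(f_ahb_hz / f_max_hz))
    if presc > 127:
        raise ValueError("required prescaler exceeds the 1 to 127 range")
    return presc
```

With a 166 MHz AHB clock, the 20 MHz normal-mode limit requires PRESC = 9 (about 18.4 MHz) and the 50 MHz fast-read limit requires PRESC = 4 (41.5 MHz).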
17.3.1
Latency
Assuming that SMI is not busy, the nominal latency for a 32-bit single read to a non-incrementing serial Flash address is:
●
73 tAHB maximum, if PRESC = 1 (that is, tAHB = tSMICLK);
●
(68 tSMICLK + 5 tAHB) maximum, if PRESC > 1 (that is, tAHB ≠ tSMICLK, and specifically
tSMICLK > tAHB), taking into account up to 9 clock periods in addition to 64 clock periods
required to both send command to serial Flash memory (1-byte opcode + 3-bytes
address) and receive back 32 bits.
Under the same assumption, the nominal latency for a 32-bit single write to a non-incrementing serial Flash address is:
●
5 tAHB maximum, if PRESC = 1 (that is, tAHB = tSMICLK);
●
(2 tSMICLK + 3 tAHB) maximum, if PRESC > 1 (that is, tAHB ≠ tSMICLK, and specifically
tSMICLK > tAHB).
For AHB read burst transfers, the maximum latency for all transfers after the first is the same
as the data size, that is (32 tSMICLK) for a word transfer, (16 tSMICLK) for a half-word, and
(8 tSMICLK) for a byte, because there are no mandatory extra commands (instruction opcode
and address).
For AHB Write Burst transfers, the maximum latency for the 2nd transfer is: (data size +
opcode + address bytes), and it is the same as data size for the transfers after that.
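The nominal latency figures above can be expressed as a small Python sketch. The function names are hypothetical, times are in whatever unit t_ahb is given in, and the results are the stated maxima for an idle SMI, before any of the additional delays that can increase nominal latency.

```python
def read_latency(t_ahb, presc):
    """Maximum nominal latency of a 32-bit single read."""
    if presc == 1:
        return 73 * t_ahb
    t_smiclk = t_ahb * presc
    return 68 * t_smiclk + 5 * t_ahb


def write_latency(t_ahb, presc):
    """Maximum nominal latency of a 32-bit single write."""
    if presc == 1:
        return 5 * t_ahb
    t_smiclk = t_ahb * presc
    return 2 * t_smiclk + 3 * t_ahb


def burst_read_beat_latency(t_ahb, presc, size_bits):
    """Maximum latency of each read burst transfer after the first."""
    if size_bits not in (8, 16, 32):
        raise ValueError("transfer size must be 8, 16 or 32 bits")
    return size_bits * t_ahb * presc
```

For example, with PRESC = 9 a single read takes up to 68 x 9 + 5 = 617 AHB clock periods, while each following word of a read burst takes up to 32 x 9 = 288.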
Nominal latency can be increased by:
●
On-going SMI transfer (read, write, read status register command or write enable
command)
●
Deselect time programming (field TCS in CR1 register), which adds
(TCS + 1) • smi_clk_o periods
●
Busy / idle transfer on AHB bus
●
Fast read which adds 1 dummy byte
●
Hold programming (field HOLD in CR1 register)
●
Boot delay time (see Section 17.4.5: Booting from external memory)
●
Frequency change
●
On-going programming
17.4
Functional description
17.4.1
AHB interface
The following rules apply to the access from the AHB to the SMI:
●
Endianness is fixed to little-endian
●
SPLIT/RETRY responses are not supported
●
Bursts must not cross bank boundaries
●
Size of data transfers for memories can be byte/half-word/word, otherwise ERROR
response on HRESP
●
Size of data transfers for registers must be 32-bit wide, otherwise ERROR response on
HRESP
●
Read requests: all types of BURST are supported. Wrapping bursts take more time
than incrementing bursts, as there is a break in the address increment
●
Write requests: wrapping bursts are not supported, and provoke an ERROR response
on HRESP
●
BUSY transfer: the SMI transfer is held until BUSY is inactive.
17.4.2
Memory device compatibility
The communication protocol used is SPI in CPOL = 1 and CPHA = 1 mode. The instructions
supported are listed in Table 117.
Table 117. Supported instruction set
Opcode    Description
0x03      Read data bytes
0x0B      Read data high speed
0x05      Read status register
0x06      Write enable
0x02      Page program
0xAB      Release from deep power-down
17.4.3
Hardware mode
At reset, the SMI operates in hardware mode. In this mode, the TR transmit register and RR
receive register must not be accessed. They are managed by the SMI hardware and used to
communicate with the external memory devices whenever an AHB master reads or writes to
an address in external memory.
17.4.4
Software mode
In software mode, TR transmit register and RR receive register are accessible. Direct AHB
transfers to/from external memories are not allowed. You can enable software mode by
setting the SW bit in the CR1 register.
Software mode is used to transfer any data or commands from the TR transmit register to
external memory and to read data directly in the RR receive register. The transfer is started
using the send bit in the CR2 register.
For example, software mode is used to erase Flash memory before writing. Erase cannot be
managed in hardware mode due to incompatibilities that exist between Flash devices from
different vendors.
In software mode, application code being executed by the core cannot be fetched from
external memory. It must either reside in internal memory, or be previously loaded from
external memory while the SMI is in hardware mode.
17.4.5
Booting from external memory
SPEAr1340 can boot externally only from a serial Flash located at Bank0 (which is
enabled after power-on reset). During the boot phase, the following instruction sequence is
automatically sent to Bank0:
1.
Release from deep power-down (opcode 0xAB), in order to be able to boot on this bank
even if it was in deep power-down mode
2.
29 µs delay to ensure Bank0 is successfully released
3.
Read status register (opcode 0x05), in order to check that Bank0 is neither in write nor
in erase cycle
4.
Read data bytes (opcode 0x03) at memory start location (that is, 0xE6000000) with a
19 MHz clock frequency.
Note:
1
All memory banks other than Bank0 are disabled at reset and they must be enabled by
setting dedicated BE bits in CR1 register before they can be accessed.
2
If an AHB request occurs while either the WEN bit or the RSR bit (both in CR2 register) is
set, the on-going command is first finished before the request from AHB is sent to the
memory.
17.4.6
External memory read request
A read request to external memory is served only if the SMI is in hardware mode (CR1
register, SW = 0), and write burst mode is not selected (CR1 register, WBM = 0), otherwise
the ERF1 flag in the SR register is set and an ERROR response is sent to AHB.
When a read request occurs in normal mode (CR1 register, FAST=0), the following
sequence is sent to the selected bank:
1.
Read data bytes opcode (0x03)
2.
3 or 2-byte address from the most to the least significant bit (depending on the
ADDR_LENGTH bit in the CR1 register)
3.
The clock is sent until the end of burst request from master.
When a read request occurs in high speed mode (CR1 register, FAST = 1), the following
sequence is sent to the selected bank:
1.
Read data bytes at high speed opcode (0x0B)
2.
3 or 2-byte address from the most to the least significant bit (depending on the
ADDR_LENGTH bit in the CR1 register)
3.
1 dummy byte (0x00)
4.
The clock is sent until the end of burst request from master.
The external memory bank remains selected as long as there is no external memory
address jump, and as long as no new commands are sent to the SMI (such as WEN, RSR,
SW mode or WBM mode, write request, bank disable, prescaler configuration change or
memory access error).
It also remains selected when the address rolls over from 0xFFFFFF to 0x000000 in the same
bank.
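The two read sequences above differ only in the opcode and the trailing dummy byte. The following is a minimal sketch of the bytes that precede the data phase, using the opcodes and address lengths quoted in the text. The function name `smi_read_header` and its parameters are hypothetical and are not part of any ST software; the SMI hardware generates this sequence itself, so the code is only a model of what appears on the serial bus.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcodes quoted in the text above. */
#define OPC_READ      0x03u  /* read data bytes (normal mode, FAST = 0) */
#define OPC_FAST_READ 0x0Bu  /* read data bytes at high speed (FAST = 1) */

/*
 * Hypothetical helper: fill `out` (at least 5 bytes) with the bytes sent
 * before the data phase and return how many there are. `addr_bytes` must
 * be 2 or 3 (the ADDR_LENGTH choice); any other value returns 0.
 */
size_t smi_read_header(int fast, unsigned addr_bytes, uint32_t addr, uint8_t *out)
{
    size_t n = 0;

    if (addr_bytes != 2 && addr_bytes != 3)
        return 0;

    out[n++] = fast ? OPC_FAST_READ : OPC_READ;
    for (unsigned i = addr_bytes; i-- > 0; )   /* most significant byte first */
        out[n++] = (uint8_t)(addr >> (8u * i));
    if (fast)
        out[n++] = 0x00;                       /* one dummy byte in high speed mode */
    return n;
}
```

With a 3-byte address, the header is 4 bytes long in normal mode and 5 bytes long in high speed mode.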
17.4.7
External memory write request
A write request from AHB is served only if the SMI is in hardware mode (CR1 register, SW =
0), otherwise the ERF1 flag in the SR register is set and an ERROR response is sent to
AHB. Wrapping bursts are not allowed as serial memories do not support them. They
generate an ERROR response to AHB.
When a write request occurs, it is sent to external memory if the following conditions are
met:
●
Bank in write mode: When a bank is in write mode, the corresponding WM flag is set
in the SR register. If this condition is not met when a write request occurs, the ERF2
flag in the SR is set and an ERROR response is sent to AHB. To enable write mode,
select the bank using the BS bits in the CR2 register and then set the WEN bit in the
CR2 register.
●
No write in progress: The WIP bit in the SR register must be cleared. If this condition
is not met, AHB is stalled until WIP = 0.
When these two conditions are met, the following sequence is sent to the selected bank:
1.
Page program opcode (0x02)
2.
3- or 2-byte address from the most to the least significant bit (depending on the
ADDR_LENGTH bit in the CR1 register)
3.
Transfer all the data bytes from bit 7 to bit 0, starting with address given previously and
incrementing it to the last depending on the size of the write request.
Write capability must be used only if the write in progress/busy bit of the external memory
status register is located in bit 0; otherwise the system becomes locked. After a write request is
sent to external memory, the write mode bit is reset and the read status register instruction is
automatically sent to this bank until WIP = 0. Bits 7:0 of the SR register are refreshed every
8 smi_clk_o periods with the contents of the status register read from the selected external
memory.
When memory programming is finished, the WCF flag in the SR register is set and an interrupt is
generated if the WCIE bit in the CR1 register is set.
To send a write request to a bank other than the one being programmed, the
software must wait for WIP = 1; otherwise error ERF2 is generated because of the
non-incrementing address. The bank being programmed must not be disabled in order to
write to another one.
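The acceptance conditions for a write request can be summarized as a small decision function. This is a sketch only: the enum and function names are hypothetical, and the order in which the checks are evaluated is an assumption, since the text does not state which condition takes priority when several apply at once.

```c
#include <stdbool.h>

/* Hypothetical names; outcomes follow the conditions described above. */
enum smi_write_result {
    SMI_WRITE_SENT,      /* page program sequence goes out to the bank */
    SMI_WRITE_STALLED,   /* AHB is stalled until WIP = 0 */
    SMI_WRITE_ERF1,      /* not in hardware mode: ERF1 set, ERROR response */
    SMI_WRITE_ERF2,      /* bank not in write mode: ERF2 set, ERROR response */
    SMI_WRITE_BAD_BURST  /* wrapping burst: ERROR response */
};

/* Check order is an assumption made for this sketch. */
enum smi_write_result smi_classify_write(bool sw_mode, bool wrapping_burst,
                                         bool bank_wm, bool wip)
{
    if (sw_mode)
        return SMI_WRITE_ERF1;       /* CR1 SW = 1 */
    if (wrapping_burst)
        return SMI_WRITE_BAD_BURST;  /* serial memories do not support wrapping */
    if (!bank_wm)
        return SMI_WRITE_ERF2;       /* WM flag of the bank is not set */
    if (wip)
        return SMI_WRITE_STALLED;    /* a previous program cycle is still running */
    return SMI_WRITE_SENT;
}
```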
17.4.8
Write burst mode
Write burst mode is used to keep the external memory selected after the AHB write request
(CR1 register, WBM = 1). In that case, the next AHB to external memory write request must
be sent to the next incremented address, and it must be of the same size. Otherwise the
ERF2 flag in the SR register is set and an ERROR response is sent to AHB. The external
memory selection is released by resetting WBM or disabling the bank, and then the external
memory page program cycle starts. If the bank is enabled, the read status register instruction is
automatically sent to this bank until WIP = 0. A memory access error (ERF1 or ERF2)
generates the nCSx release and the start of the external memory page program.
If write burst mode is not selected, the next incrementing AHB write request will be sent to
external memory if it occurs before the end of the previous serial transfer. Otherwise the
ERF2 flag in the SR register is set and an ERROR response is sent to AHB. Consequently,
it is mandatory to set WBM bit in order to perform several write requests which are not sent
in the same AHB incrementing burst. If WBM = 0 and no other write request occurs, the
external memory selection is released after sending the data, and the external memory
page program cycle starts.
Read requests to external memory are forbidden when WBM = 1, otherwise the ERF1 flag
in the SR register is set and an ERROR response is sent to AHB.
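The rule that each write in write burst mode must target the next incremented address with the same size can be modeled in a few lines. This is a software-side sketch with hypothetical names (`wbm_tracker`, `wbm_write_allowed`); a driver could use something similar to validate its own request stream before issuing it, and a rejected request corresponds to the case where the hardware would set ERF2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracker for one write burst (WBM = 1). */
struct wbm_tracker {
    bool     started;    /* false until the first write of the burst */
    uint32_t next_addr;  /* address the next write must target */
    uint32_t size;       /* size in bytes that every write must have */
};

/* Returns false for a request that would raise ERF2; the tracker is then unchanged. */
bool wbm_write_allowed(struct wbm_tracker *t, uint32_t addr, uint32_t size)
{
    if (t->started && (addr != t->next_addr || size != t->size))
        return false;
    t->started = true;
    t->size = size;
    t->next_addr = addr + size;
    return true;
}
```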
17.4.9
Read while write
If a read to the same bank which is in programming phase occurs, the AHB is stalled until
WIP = 0.
If a read to another bank occurs, the read status register sequence is stopped, the read
request is served and then the read status register sequence is re-sent to the memory being
programmed. So during a read while write, the external memory select is released after the
read command, in order to send the read status register sequence.
17.4.10
Erasing and write status register
In case of serial Flash, an erase may be necessary before writing. Due to incompatibility
between different serial Flash vendors, erase and write status register can be done only in
software mode.
The write enable instruction must be sent beforehand, and only through software mode,
in order not to corrupt the WM bit in the SR register, because the hardware cannot detect the
end of an internal Flash erase or write status register operation (and consequently the write
complete interrupt is not generated). The WIP bit can be checked by sending the RSR command
repeatedly.
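Because the hardware does not signal the end of an erase or write status register operation, software has to poll the WIP bit itself. The following is a minimal sketch of such a polling loop. It assumes that WIP is bit 0 of the Flash status register, as required earlier in this chapter. The callback type, `flash_wait_ready`, and the `fake_flash` stand-in are hypothetical; on real hardware the callback would issue the RSR command and return the status byte.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLASH_SR_WIP 0x01u  /* write in progress: bit 0 of the Flash status register */

/* Hypothetical callback: returns one status byte, for example via the RSR command. */
typedef uint8_t (*read_status_fn)(void *ctx);

/* Poll until WIP clears; give up after max_polls reads. */
bool flash_wait_ready(read_status_fn read_status, void *ctx, unsigned max_polls)
{
    while (max_polls-- > 0) {
        if ((read_status(ctx) & FLASH_SR_WIP) == 0)
            return true;
    }
    return false;
}

/* Stand-in for hardware: reports busy for a fixed number of reads. */
struct fake_flash {
    unsigned busy_reads;
};

uint8_t fake_read_status(void *ctx)
{
    struct fake_flash *f = ctx;

    if (f->busy_reads > 0) {
        f->busy_reads--;
        return FLASH_SR_WIP;
    }
    return 0;
}
```

Bounding the number of polls keeps a missing or unresponsive device from hanging the caller.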
18
Memory card interface (MCIF)
This chapter focuses on MCIF functionality and operation.
For the MCIF feature list, refer to the SPEAr1340 datasheet:
●
Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
●
RM0089, Reference manual, SPEAr1340 address map and registers
18.1
Overview
MCIF is a hardware IP that interfaces with the most common memory cards on the market:
●
SD/SDIO 2.0
●
CF/CF+ Rev 4.1
●
SDHC
●
MMC 4.2/4.3
●
xD
The device interface multiplexes different memory cards on the same IOs; only one memory
card is accessible at a given time. At the board level, discrete elements are required to
handle hot-swap management.
Figure 86. SD/SDIO/MMC Host controller block diagram
[Block diagram: the host controller sits between the AHB bus and the SDIO2.0 / SD2.0 Mem / MMC 3.31/4.2/4.3 device. Blocks: AHB interface, SD registers, bus monitor, synchronizer, power management, clock control, and the SD protocol unit containing the command control unit, the data control unit and the data FIFO (2 * 4K).]
18.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
18.3
Clocks
See Chapter 5: Reset and clock generator (RCG).
18.4
Functional description
18.4.1
SD2.0/SDIO2.0/MMC4.3 AHB Host controller
The SD2.0/SDIO2.0/MMC4.3 Host controller:
●
has an ARM processor interface that conforms to SD Host controller standard
specification version 2.0.
●
handles SDIO/SD protocol at the transmission level, packing data, adding cyclic
redundancy check (CRC) and start/end bit, and checking for transaction format
correctness.
●
provides a programmed IO method and a DMA data transfer method.
In the programmed IO method, the ARM processor transfers data using the buffer data
port register. Host controller support for DMA can be determined by checking the DMA
support in the capabilities register. DMA enables a peripheral to read or write to
memory without intervention from the CPU. The system address register points to the
first data address, and data is then accessed sequentially from that address.
The SD2.0/SDIO2.0/MMC4.3 Host controller comprises:
●
The Host_AHB interface, which acts as the bridge between AHB and Host controller.
●
Host controller registers: the SD/SDIO controller registers are programmed by the ARM
Processor through AHB target interface. Interrupts to the ARM Processor are
generated based on the values set in the Interrupt status register and Interrupt enable
registers.
●
Bus monitor, which checks for any violations occurring in the SD bus and time-out
conditions.
●
Clk_gen: the clock generation block generates the SD clock depending on the value
programmed by the ARM Processor in the clock control register.
●
CRC generator and checker (CRC7 and CRC16): the CRC7 and CRC16 generators
calculate the CRC for command and Data respectively to send the CRC to the
SD/SDIO card. The CRC7 and CRC16 checker checks for any CRC error in the
Response and Data sent by the SD/SDIO card. In order to detect data defects on the
cards the host may include error correction codes in the payload data. An ECC code is
used to store data on the card. This ECC code is used by the host or application to
decode the user data.
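As an illustration of the command CRC, the following sketch computes CRC7 in software with the generator polynomial x^7 + x^3 + 1 used for SD commands. The host controller performs this in hardware, so the code is only a reference model; the function names `sd_crc7` and `sd_cmd_crc_byte` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC7, polynomial x^7 + x^3 + 1, initial value 0, MSB first. */
uint8_t sd_crc7(const uint8_t *data, size_t len)
{
    uint8_t crc = 0;

    for (size_t i = 0; i < len; i++) {
        uint8_t d = data[i];

        for (int bit = 0; bit < 8; bit++) {
            crc <<= 1;
            if ((d ^ crc) & 0x80)  /* incoming bit XOR top bit of the 7-bit CRC */
                crc ^= 0x09;       /* x^3 + 1 terms of the polynomial */
            d <<= 1;
        }
    }
    return crc & 0x7F;
}

/* Last byte of a 48-bit command frame: CRC7 in bits 7:1, end bit = 1. */
uint8_t sd_cmd_crc_byte(uint8_t index, uint32_t arg)
{
    const uint8_t frame[5] = {
        (uint8_t)(0x40u | (index & 0x3Fu)),  /* start bit 0, transmission bit 1, index */
        (uint8_t)(arg >> 24), (uint8_t)(arg >> 16),
        (uint8_t)(arg >> 8),  (uint8_t)arg
    };

    return (uint8_t)((sd_crc7(frame, 5) << 1) | 1u);
}
```

For CMD0 with a zero argument this yields 0x95, and for CMD8 with argument 0x000001AA it yields 0x87, the values commonly seen in SD initialization traces.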
AHB interface
The SD2.0/SDIO2.0/MMC4.3 Host controller provides a programmed IO method in which
the ARM Host driver transfers data using the buffer data port register. The AHB target is the
Host control registers, and these registers are programmed by the ARM processor through
the AHB target interface.
In the programmed IO data transfer method, the data transaction is performed through the
AHB target interface. If the data transaction is done using the DMA data transfer method,
the AHB Interface initiates a read or write transaction with memory.
Interrupt controller
If any of the interrupt bits are set in the interrupt status register, the SD2.0/SDIO2.0/MMC4.3
Host controller generates an interrupt to the ARM processor.
Data FIFO
The SD/SDIO Host controller uses two 4K dual port FIFOs to perform both read and write
transactions.
For maximum throughput during a write transaction (a data transfer from the ARM
Processor to the SD2.0/SDIO2.0/MMC4.3 card), the two FIFOs are used alternately to store
data. As data from the first FIFO transfers to the SD2.0/SDIO2.0/MMC4.3 card, the second
FIFO fills, and as data from the second FIFO transfers, the first FIFO fills.
Similarly, for maximum throughput during a read transaction (a data transfer from the
SD2.0/SDIO2.0/MMC4.3 card to the ARM Processor), data from the
SD2.0/SDIO2.0/MMC4.3 card is alternately written to each FIFO. As data from one FIFO
transfers to the ARM Processor, the other FIFO fills.
If the Host controller cannot accept any data from the SD2.0/SDIO2.0/MMC4.3 card, it
either issues a read wait (if card supports read wait mechanism) to stop the data coming
from the card, or it stops the clock.
Note:
FIFO depth is 4K. Two 4K FIFOs are used to support a ping pong mechanism (to increase
the throughput).
DAT[0-7] control logic
The DAT[0-7] control logic block transmits data in the data lines during a write transaction
and receives data in the data lines during a read transaction.
Command control logic
The Command control logic block sends the command on the cmd line, and receives the
response coming from the SD2.0/SDIO2.0/MMC4.3 card.
Power control
The SD2.0/SDIO2.0/MMC4.3 Host controller supplies SD bus power depending on the
value programmed in the power control register by the ARM Processor. The ARM processor
supplies SD bus voltage according to card OCR and the supply voltage capabilities of the
Host controller. If the SD bus power is set to 1 in the power control register, the Host
controller supplies voltage to the card. If the Host driver selects an unsupported voltage in
the SD bus voltage select field, the Host controller may ignore a write to SD bus power and
retain a value of zero.
Stream write and read for both DMA and NON DMA transaction
WRITE_DAT_UNTIL_STOP(CMD20) writes a data stream from the host, beginning at the
given address and ending at a STOP_TRANSMISSION.
READ_DAT_UNTIL_STOP(CMD11) reads a data stream from the card, beginning at the
given address and ending at a STOP_TRANSMISSION.
The Host controller switches to the second FIFO after writing/reading a block of data to the
first FIFO, but in a stream transaction the block size is not programmed by the driver. Because of
this, for both stream write and stream read transactions the host driver should write the
maximum FIFO size value to the block size register. For example, if the FIFO size is 4 Kbytes, the
host driver should write 4 Kbytes to the block size register. This ensures that FIFO switching
occurs after writing/reading 4 Kbytes of data (= FIFO size).
Host enumeration
The SD2.0/SDIO2.0/MMC4.3 host is enumerated by an external ARM processor. The
processor is informed of card insertion or removal from the slot by means of interrupts. The
cards in the slot are enumerated by the SD host controller as instructed by the processor, by
means of target register sets.
Data transfer protocol
SD transfers are classified according to how the number of blocks is specified:
●
Single block transfer
The number of blocks is specified to the host controller before the transfer.
The number of blocks specified is always one.
●
Multiple block transfer
The number of blocks is specified to the host controller before the transfer.
The number of blocks specified is one or more.
●
Infinite block transfer
The number of blocks is not specified to the host controller before the transfer.
This transfer continues until an Abort transaction is executed.
For an SD memory card, the abort transaction is performed by CMD12.
For an SDIO card, the abort transaction is performed by CMD52.
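The classification above determines how a transfer ends. A minimal sketch, with hypothetical enum and function names, of the two decisions a host driver derives from it: whether the transfer must be terminated by an abort transaction, and which command index performs the abort for a given card type.

```c
#include <stdbool.h>

/* Hypothetical names for the transfer and card types described above. */
enum sd_transfer_kind { SD_XFER_SINGLE, SD_XFER_MULTIPLE, SD_XFER_INFINITE };
enum sd_card_kind { SD_CARD_MEMORY, SD_CARD_SDIO };

/* Only an infinite block transfer has no block count, so it ends with an abort. */
bool sd_transfer_ends_with_abort(enum sd_transfer_kind kind)
{
    return kind == SD_XFER_INFINITE;
}

/* Command index of the abort transaction: CMD12 for SD memory, CMD52 for SDIO. */
unsigned sd_abort_cmd_index(enum sd_card_kind card)
{
    return card == SD_CARD_SDIO ? 52u : 12u;
}
```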
18.4.2
Not using DMA
Figure 87 provides a flowchart of the data transfer procedure without using DMA.
Figure 87. Data transfer using DAT line sequence (not using DMA)
[Flowchart: steps (1) to (17) as listed below. After step (9) the flow splits into a write branch, steps (10-W) to (13-W), and a read branch, steps (10-R) to (13-R). At step (14), single and multiple block transfers continue with steps (15) and (16); infinite block transfers go to step (17), Abort Transaction.]
1. In the Block Size register, set the byte length of one data block.
2. In the Block Count register, set the number of data blocks to transfer.
3. In the Argument register, set the argument value of the command to be issued.
4. In the Transfer Mode register, set the appropriate value to Multi / Single Block Select and
Block Count Enable. Set the value appropriate for the issued command to Data Transfer
Direction, Auto CMD12 Enable, and DMA Enable.
5. In the Command register, set the value appropriate for the issued command.
Note: When writing the upper byte of the Command register, an SD command is issued.
6. Wait for a Command Complete interrupt.
7. Clear this bit: In the Normal Interrupt Status register, write 1 to Command Complete.
8. Read the Response register, and get the necessary information for the issued
command.
9. If writing to a card, go to step 10-W.
If reading from a card, go to step 10-R.
10-W. Wait for a Buffer Write Ready interrupt.
Non DMA write transfer: On receipt of a Buffer Write Ready interrupt, the ARM
processor acts as a master and begins to transfer data via the Buffer data port register
(fifo_1). The transmitter begins sending data on the SD bus when a block of data is
ready in fifo_1. While the data is being transmitted on the SD bus, the Buffer Write Ready
interrupt is sent to the ARM processor for the second block of data. The ARM
processor acts as a master and begins sending the second block of data via the Buffer
data port register to fifo_2. A Buffer Write Ready interrupt is asserted only when a FIFO
is empty and can receive a block of data.
11-W. Clear this bit: In the Normal Interrupt Status register, write 1 to Buffer Write Ready.
12-W. Write block data (according to the number of bytes specified in step 1) to the
Buffer Data Port register.
13-W. Repeat until all blocks are sent, and then go to step 14.
10-R. Wait for a Buffer Read Ready interrupt.
Non DMA read transfer: A Buffer Read Ready interrupt is asserted whenever a block of
data is ready in one of the FIFOs. On receipt of a Buffer Read Ready interrupt, the
ARM processor acts as a master and begins reading the data via the Buffer data port
register (fifo_1). The receiver begins reading data from the SD bus only when a FIFO is
empty and can receive a block of data. When both FIFOs are full, the host controller
stops the data flow from the card either by using a read wait mechanism (if the card
supports read wait) or by stopping the clock.
11-R. Clear this bit: In the Normal Interrupt Status register, write 1 to Buffer Read Ready.
12-R. Read block data (according to the number of bytes specified in step 1) from the Buffer
Data Port register.
13-R. Repeat until all blocks are received, and then go to step 14.
14. For a single or multiple block transfer, go to step 15.
For an infinite block transfer, go to step 17.
15. Wait for a Transfer Complete interrupt.
16. Clear this bit: In the Normal Interrupt Status register, write 1 to Transfer Complete.
17. Perform the Abort transaction sequence.
Note:
Steps 1 and 2 can be executed simultaneously; steps 4 and 5 can be executed
simultaneously.
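The Transfer Mode and Command register values set in the procedure above can be sketched as follows. The bit positions are assumptions taken from the SD Host Controller Standard Specification version 2.0, to which this controller is stated to conform; the authoritative SPEAr1340 register layout is in RM0089. The macro and function names are hypothetical, and only finite transfers (a known block count) are covered.

```c
#include <stdbool.h>
#include <stdint.h>

/* Transfer Mode register bits (assumed from the SD Host Controller Standard 2.0). */
#define SDHC_TM_DMA_EN        (1u << 0)
#define SDHC_TM_BLK_CNT_EN    (1u << 1)
#define SDHC_TM_AUTO_CMD12_EN (1u << 2)
#define SDHC_TM_READ          (1u << 4)  /* 1 = card to host */
#define SDHC_TM_MULTI_BLOCK   (1u << 5)

/* Hypothetical helper: Transfer Mode value for a transfer of block_count blocks. */
uint16_t sdhc_transfer_mode(bool read, uint16_t block_count, bool auto_cmd12, bool dma)
{
    unsigned tm = 0;

    if (dma)
        tm |= SDHC_TM_DMA_EN;
    if (read)
        tm |= SDHC_TM_READ;
    if (block_count > 1) {
        tm |= SDHC_TM_MULTI_BLOCK | SDHC_TM_BLK_CNT_EN;
        if (auto_cmd12)
            tm |= SDHC_TM_AUTO_CMD12_EN;  /* only meaningful for multiple blocks */
    }
    return (uint16_t)tm;
}

/* Command register value: command index in the upper byte, flags in the lower byte. */
uint16_t sdhc_command_word(uint8_t index, uint8_t flags)
{
    return (uint16_t)(((unsigned)(index & 0x3Fu) << 8) | flags);
}
```

Placing the command index in the upper byte matches the note that writing the upper byte of the Command register issues the command: all other fields should already be in place when that byte is written.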
18.4.3
Using DMA
Single transfers and 4-beat or 8-beat incrementing bursts are used to move data to and from
the system memory, primarily to avoid long hold times of the AHB bus by the master.
Figure 88 provides a flowchart of the data transfer procedure using DMA.
Figure 88. Data transfer using DAT line sequence (using DMA)
[Flowchart: steps (1) to (14) as listed below. At step (11), a DMA interrupt leads to steps (12) and (13) and then back to step (10); a Transfer Complete interrupt leads to step (14).]
1. In the System Address register, set the system address for DMA.
2. In the Block Size register, set the byte length of one data block.
3. In the Block Count register, set the number of data blocks to transfer.
4. In the Argument register, set the argument value of the command to be issued.
5. In the Transfer Mode register, set the value of Multi / Single Block Select and Block Count
Enable. Set the value corresponding to the issued command to Data Transfer Direction,
Auto CMD12 Enable and DMA Enable.
6. In the Command register, set the value of the command to be issued.
Note: When writing the upper byte of the Command register, an SD command is issued.
7. Wait for a Command Complete interrupt.
8. Clear this bit: In the Normal Interrupt Status register, write 1 to Command Complete.
9. Read the Response register and get the necessary information in accordance with the
issued command.
DMA read transfer (data flowing from host to card): On receipt of the response end bit
from the card for the write command, the SD Host controller acts as the master
and requests the AHB bus. After receiving the grant, the host controller begins reading
a block of data from the system memory and fills the first FIFO. Whenever a block of
data is ready, the transmitter begins sending the data on the SD bus. While the data is being
transmitted on the SD bus, the host controller requests the bus to fill the second block in
the second FIFO. Ping-pong FIFOs are used to increase the throughput.
Similarly, the host controller reads a block of data from the system memory whenever a
FIFO is empty.
This continues until all blocks are read from the system memory. The Transfer Complete
interrupt is set only after all blocks of data have been transferred to the card.
DMA write transfer (data flowing from card to host): The block of data received from the
card is stored in the first FIFO. Whenever a block of data is ready, the SD Host
controller acts as the master and requests the AHB bus. After receiving the grant, the
host controller begins writing a block of data into the system memory from the first
FIFO. While the data is being written into the system memory, the host controller receives
the second block of data and stores it in the second FIFO.
Similarly, the host controller writes a block of data into the system memory whenever
data is ready.
This continues until all blocks are transferred to system memory. The Transfer Complete
interrupt is set only after all blocks of data have been transferred to the system memory.
Note: The host controller receives a block of data from the card only when it has room to store a
block of data in a FIFO. When both FIFOs are full, the host controller stops the data flow from
the card either by using a read wait mechanism (if the card supports read wait) or by
stopping the clock.
10. Wait for the Transfer Complete interrupt and DMA interrupt.
11. If Transfer Complete = 1, go to step 14.
If DMA Interrupt = 1, go to step 12.
Transfer Complete has higher priority than DMA Interrupt.
12. Clear this bit: In the Normal Interrupt Status register, write 1 to DMA Interrupt.
13. In the System Address register, set the system address of the next data position
and go to step 10.
14. Clear these bits: In the Normal Interrupt Status register, write 1 to Transfer Complete
and DMA Interrupt.
Note:
Steps 2 and 3 can be executed simultaneously; steps 5 and 6 can be executed
simultaneously.
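The interrupt check in the DMA procedure gives Transfer Complete priority over DMA Interrupt. The sketch below models that decision. The bit positions in the Normal Interrupt Status register are assumptions taken from the SD Host Controller Standard Specification version 2.0 (see RM0089 for the actual layout), and the enum and function names are hypothetical.

```c
#include <stdint.h>

/* Normal Interrupt Status bits (assumed from the SD Host Controller Standard 2.0). */
#define SDHC_INT_TRANSFER_COMPLETE (1u << 1)
#define SDHC_INT_DMA               (1u << 3)

/* Hypothetical result of the interrupt status check. */
enum sdma_action {
    SDMA_KEEP_WAITING,  /* neither bit set: keep waiting */
    SDMA_NEXT_ADDRESS,  /* clear DMA Interrupt, program the next system address */
    SDMA_FINISH         /* clear Transfer Complete and DMA Interrupt, then end */
};

enum sdma_action sdma_next_action(uint16_t int_status)
{
    /* Transfer Complete has higher priority than DMA Interrupt. */
    if (int_status & SDHC_INT_TRANSFER_COMPLETE)
        return SDMA_FINISH;
    if (int_status & SDHC_INT_DMA)
        return SDMA_NEXT_ADDRESS;
    return SDMA_KEEP_WAITING;
}
```

Checking Transfer Complete first matters when both bits are set at once: the transfer is already over, so reprogramming the system address would be pointless.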
Example: the host wishes to transfer 4 KB of data to the card. Assuming that the maximum
block size is 256 bytes, the host driver programs the block size register with the value 256, and the
block count register with the value 16. The AHB Master and Transmitter inside the
SD2.0/SDIO2.0/MMC4.3 Host controller get the information (how much data to transfer)
from these registers. Using this information, the AHB master initiates a
data read transaction (to read a block of data, 256 bytes, from the system memory). The
following types of burst are used, primarily to avoid a long AHB bus hold by the master.
●
Single transfer
●
4-beat incrementing burst
●
8-beat incrementing burst
The first block is received in the first FIFO and the second block in the second FIFO.
Similarly, the remaining blocks are received in alternate FIFOs. Whenever a block of data is
ready in FIFO, the transmitter starts transmitting the block of data (256) on the SD bus. After
transmitting the entire block of data to the card, the transmitter waits for a status response
from the card. The transmitter sends the next block of data only when it receives a good
status response from the card for the previous block of data, otherwise the transaction is
aborted and the host starts a fresh transaction.
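The arithmetic in this example is a rounded-up division, and the FIFO assignment alternates with the block index. A minimal sketch follows; both function names are hypothetical.

```c
#include <stdint.h>

/* Number of blocks needed for total_bytes, rounding up; 0 if block_size is 0. */
uint32_t sd_block_count(uint32_t total_bytes, uint32_t block_size)
{
    if (block_size == 0)
        return 0;
    return total_bytes / block_size + (total_bytes % block_size != 0);
}

/* Ping-pong assignment: block 0 goes to the first FIFO (0), block 1 to the second (1), and so on. */
unsigned sd_fifo_for_block(uint32_t block_index)
{
    return block_index & 1u;
}
```

For the 4 KB transfer with 256-byte blocks, `sd_block_count(4096, 256)` gives the block count of 16 used in the example.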
18.4.4
Using ADMA
Figure 89 provides a flowchart of the data transfer procedure using ADMA.
Figure 89. Data transfer using DAT line sequence (using ADMA)
[Flowchart: steps (1) to (15), from Create Descriptor table to Abort ADMA Operation. At step (12), Check Interrupt Status, a Transfer Complete interrupt leads to step (13), Clr Transfer Complete Interrupt status; an ADMA Error interrupt leads to step (14), Clr ADMA Error Interrupt status, and step (15), Abort ADMA Operation.]
1. In the system memory, create a Descriptor table for ADMA.
2. In the ADMA System Address register, set the Descriptor address for ADMA.
3. In the Block Size register, set the byte length of one data block.
4. In the Block Count register, set the number of data blocks to transfer, as explained
in RM0089, Reference manual, SPEAr1340 address map and registers, MCIF chapter.
If the Block Count Enable in the Transfer Mode register is set to 1, total data length can
be designated by the Block Count register and the Descriptor Table. These two
parameters indicate the same data length, but transfer length is limited by the 16-bit
Block Count register. If the Block Count Enable in the Transfer Mode register is set to 0,
total data length is designated not by the Block Count register, but by the Descriptor
Table. In this case, ADMA reads more data from the SD card than the length programmed
in the descriptor. The excess read operation is aborted asynchronously, and the extra read
data is discarded when the ADMA completes.
5. In the Argument register, set the argument value.
6. In the Transfer Mode register, set the appropriate value. The host driver determines
Multi / Single Block Select, Block Count Enable, Data Transfer Direction, Auto CMD12
Enable and DMA Enable. Multi / Single Block Select and Block Count Enable are
determined as explained in RM0089, Reference manual, SPEAr1340 address map and
registers, MCIF chapter.
7. In the Command register, set the appropriate value.
Note: When writing to the upper byte [3] of the Command register, an SD command is issued and
DMA is started.
8. Wait for a Command Complete interrupt.
9. Clear this bit: In the Normal Interrupt Status register, write 1 to Command Complete.
10. Read the Response register and get the necessary information for the issued
command.
11. Wait for a Transfer Complete interrupt and an ADMA Error interrupt.
12. If Transfer Complete = 1, go to step 13.
If ADMA Error Interrupt = 1, go to step 14.
13. Clear this bit: In the Normal Interrupt Status register, write 1 to Transfer Complete
Status.
14. Clear this bit: In the Error Interrupt Status register, write 1 to ADMA Error Interrupt
Status.
15. Abort the ADMA operation. To stop SD card operation, issue an abort command. If
necessary, the host driver checks the ADMA Error Status register to detect why an
ADMA error was generated.
Note:
Steps 3 and 4 can be executed simultaneously; steps 6 and 7 can be executed
simultaneously.
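The descriptor table created in the first step of the ADMA procedure can be sketched as follows. The sketch assumes the ADMA2 descriptor format of the SD Host Controller Standard Specification version 2.0 (16-bit attribute, 16-bit length, 32-bit address, stored little-endian); whether the SPEAr1340 controller uses exactly this format should be checked against RM0089. The structure, macro, and function names are hypothetical, and the 32 KB chunk size is a choice made for this sketch so that every descriptor address stays 4-byte aligned when the buffer is aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* ADMA2 attribute bits (assumed from the SD Host Controller Standard 2.0). */
#define ADMA2_VALID 0x01u  /* descriptor line is valid */
#define ADMA2_END   0x02u  /* last line of the table */
#define ADMA2_TRAN  0x20u  /* Act2 = 1, Act1 = 0: transfer data */

#define ADMA2_CHUNK 0x8000u  /* 32 KB per descriptor, chosen for this sketch */

struct adma2_desc {
    uint16_t attr;  /* attribute bits */
    uint16_t len;   /* data length in bytes */
    uint32_t addr;  /* 32-bit system address of the data */
};

/*
 * Describe a contiguous buffer of `len` bytes at `addr`. Returns the number
 * of descriptors written, or 0 if len is 0 or the table is too small.
 */
size_t adma2_build(struct adma2_desc *table, size_t max_entries,
                   uint32_t addr, uint32_t len)
{
    size_t n = 0;

    while (len > 0) {
        uint32_t chunk = len > ADMA2_CHUNK ? ADMA2_CHUNK : len;

        if (n == max_entries)
            return 0;
        table[n].attr = ADMA2_TRAN | ADMA2_VALID;
        table[n].len = (uint16_t)chunk;
        table[n].addr = addr;
        addr += chunk;
        len -= chunk;
        n++;
    }
    if (n > 0)
        table[n - 1].attr |= ADMA2_END;  /* mark the final descriptor */
    return n;
}
```

The address of `table` is what would be written to the ADMA System Address register in the second step.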
18.4.5
Abort transaction
An abort transaction is performed using CMD12 for an SD memory card and CMD52 for an
SDIO card. The host driver (HD) must perform an abort transaction in two cases: when it
stops an infinite block transfer, and when it stops a transfer while a multiple block transfer is
executing.
There are two types of abort command: asynchronous abort, where the HD can issue an
abort command at any time unless command inhibit (CMD) = 1 in the current state register;
and synchronous abort, where the HD uses a Stop At Block Gap request in the block gap
control register to issue an abort command after the data transfer stops.
Synchronous abort
Figure 90 provides a flowchart of this procedure.
Figure 90. Synchronous abort sequence
1.
Stop SD transactions: In the Block Gap Control register, set the Stop At Block Gap
Request to 1.
2.
Wait for a Transfer Complete interrupt.
3.
Clear this bit: In the Normal Interrupt Status register, set Transfer Complete to 1.
4.
Issue an Abort Command
5.
Do a software reset: In the Software Reset register, set both Software Reset for DAT
Line and Software Reset for CMD Line to 1.
6.
In the Software Reset register, check both the Software Reset for DAT Line and the
Software Reset for CMD Line. If both are 0, the abort is complete. If either is 1, repeat
this step.
18.4.6
Synchronization
Data path synchronization
For both read and write transactions, a dual port RAM is used to store data using one clock
domain and to retrieve data using another clock domain.
Signal flow from clock domain A to clock domain B
In clock domain A, the input pulse (in_pulse) is latched at clock A, and the latched signal
(in_pulse_lat) is inverted whenever an input pulse (in_pulse) is detected.
In clock domain B, the latched signal is triple-flopped. The output pulse of clock domain B is
generated by XORing the output of the second and third stage synchronizers (flip flops).
Figure 91. Data path synchronization
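The toggle-based pulse transfer described above can be modeled cycle by cycle. This is a behavioral sketch, not the RTL of the controller: the structure and function names are hypothetical, and each function call stands for one rising edge of the corresponding clock.

```c
#include <stdbool.h>

/* Hypothetical model state: one toggle flop in domain A, three flops in domain B. */
struct pulse_sync {
    bool in_pulse_lat;   /* domain A: inverted on every input pulse */
    bool ff1, ff2, ff3;  /* domain B: three-stage synchronizer */
};

/* One clock A edge: invert the latched signal when an input pulse is present. */
void pulse_sync_clk_a(struct pulse_sync *s, bool in_pulse)
{
    if (in_pulse)
        s->in_pulse_lat = !s->in_pulse_lat;
}

/* One clock B edge: shift the synchronizer and return the output pulse. */
bool pulse_sync_clk_b(struct pulse_sync *s)
{
    s->ff3 = s->ff2;
    s->ff2 = s->ff1;
    s->ff1 = s->in_pulse_lat;
    return s->ff2 != s->ff3;  /* XOR of the second and third stages */
}
```

In this model each input pulse in domain A produces exactly one output pulse in domain B, on the second clock B edge after the toggle, regardless of whether the toggle went from 0 to 1 or from 1 to 0.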
18.4.7
CF4.1/xD1.3 AHB Host controller
The CFHOST controller provides a control interface to connect a CompactFlash Storage or
CF+ Card to the AMBA AHB slave interface.
It has the following features:
●
True IDE operating mode only.
For TrueIDE mode support, the CFHOST controller provides direct access to the ATA
Command/Control register set in the CF/CF+ Device. For data transfers in TrueIDE
mode (PIO), the CPU can directly read/ write the ATA DataPort register transparently
for transferring data or let the CFHOST controller perform PIO transfer protocol by
performing the INTRQ monitoring from the CompactFlash device for every block of
transfer, with DRQ size set to 512 (default), 1024, 2048 or 4096 Bytes.
●
Ultra DMA transfer protocol, to transfer data between the host controller and the
CF/CF+ device, for increased data transfer rates.
●
Advanced timing modes when generating transfers in True IDE mode and ultra DMA
mode.
●
1 byte transfer sizes and up to 256 blocks (where a block size is 512 bytes) between
the AHB Bus and the CF/CF+ device.
Transfers are always 16 bits wide. Block data transfers improve CPU
performance by offloading the individual transactions on the CF/CF+ Interface
to the CFHOST controller, while the CPU
performs data read/write in Burst mode on the AHB Bus.
Dual Internal Data FIFOs are used in ping-pong fashion during the Block transfer, and
the AHB Interface operates in PIO mode to move data efficiently between the
Host memory and the controller's FIFOs.
●
Complete PIO Transfer protocol by monitoring the INTRQ Signal and BSY/DRQ for
every Block Size (which is default of 512 bytes). For write transfers, the CFHOST
controller monitors the INTRQ Signal and DRQ status for every sector (512 bytes
default) that is to be transferred to the CF/CF+ Device. For read transfers, the CFHOST
controller monitors the INTRQ Signal and DRQ status for every Sector (512 bytes
default) that is to be read from the CF/CF+ device. In this mode, performance
increases substantially because the CPU only transfers data to/from the Data Port FIFOs in
the CFHOST controller in burst mode, while the CFHOST controller performs the
actual PIO transfer on the CF/CF+ Interface.
●
A dual-clock based architecture (one clock for the CF Interface and one clock for the
AHB Interface).
The dual clock architecture provides flexibility in running the CompactFlash (CF+)
Interface at the higher speeds that are part of the advanced timing modes.
Block diagram
Figure 92. CF/xD Host controller block diagram
[Block diagram: the host controller sits between the AHB bus (processor side) and the CF/CF+ and xD card buses. Blocks: AHB slave interface, operations registers, RAM 256 x 32, timing control, CF/CF+ interface controller, and xD interface with ECC.]
CF host control and status registers set block
This block:
●
Contains a set of registers used to operate the CFHOST controller. These registers
include the configuration and status registers, interrupt control register, transfer control
registers, and the data port registers (read/write data port register).
●
Provides access to the ATA Register set in the CF/CF+ device.
●
Generates interrupt to the CPU by monitoring various events.
What follows is a brief description of the registers and functions in this block. For a complete
description refer to the host controller registers in the RM0089, Reference
manual, SPEAr1340 address map and registers.
Configuration registers are used to configure the various modes in the CFHOST controller
that control the behavior of the Interface/Card. The main modes in which the CFHOST controller
can operate are: memory mode, I/O mode, true IDE mode and ultra DMA mode. The timing
mode is programmed in the timing mode register. The frequency of the cficlk is programmed
in the CFI clock configuration register. The CFI status register contains the current CF/CF+ card
interface status.
Interrupt control registers contain the IRQ register, which reports various events, and the
Interrupt Enable register, which controls generation of the ahbtarget_interrupt signal based on
those events. The interrupt block monitors the CF/CF+ interface signals and generates an
interrupt when an event occurs. The event status is returned through the interrupt register.
The CPU can enable or disable interrupt generation for each event individually. The interrupt
block also generates an interrupt when the currently invoked transfer on the CF/CF+ Interface
completes or, during the transfer, when the buffer is available to transfer the next block of
data.
Transfer control registers are a set of registers used to generate Transactions on the
CF/CF+ Interface. Based on the values programmed and current mode of the CFHOST
controller, transfers are generated on the CF/CF+ Interface in addition to the Ultra DMA
Mode and the True IDE Modes. The data is transferred through the Read/Write Data Port
Registers.
Data port registers are front end registers of the Read FIFO or Write FIFO. The write data
port register writes data into the Write FIFO to transfer data to the CF/CF+ Card. The read
data port register reads data from the Read FIFO when transferring data from the CF/CF+
Card to the CPU.
Extended data port registers cover a range of addresses (512 bytes each for write and
read) that act as a front end to the Write FIFO or Read FIFO. The extended data port can
be used when the CPU wishes to initiate a Burst Transfer with Incrementing Addresses. The
Extended Write Data Port Register space (offset from 0x0200 to 0x03FC) is used to write
data into the Write FIFO, similarly to the write data port register at offset 0x0024. The
extended read data port register space (offset from 0x0400 to 0x05FC) is used to read data
from the Read FIFO, similarly to the read data port register at offset 0x0028.
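The address arithmetic for the extended data port windows can be sketched as follows. This is a minimal illustration, not driver code: the offsets come from the text above, while the `regs` pointer (the controller register space viewed as 32-bit words) and the function names are hypothetical.

```c
#include <stdint.h>

/* Offsets quoted in the text above. */
#define CF_WR_DATA_PORT       0x0024u /* single write data port register   */
#define CF_RD_DATA_PORT       0x0028u /* single read data port register    */
#define CF_EXT_WR_PORT_FIRST  0x0200u /* extended write window 0x0200..0x03FC */
#define CF_EXT_RD_PORT_FIRST  0x0400u /* extended read window 0x0400..0x05FC  */
#define CF_EXT_PORT_WORDS     128u    /* 512 bytes / 4 bytes per word       */

/* Byte offset of the n-th 32-bit word in the extended write window. */
static uint32_t cf_ext_wr_offset(uint32_t word_index)
{
    return CF_EXT_WR_PORT_FIRST + 4u * (word_index % CF_EXT_PORT_WORDS);
}

/* Byte offset of the n-th 32-bit word in the extended read window. */
static uint32_t cf_ext_rd_offset(uint32_t word_index)
{
    return CF_EXT_RD_PORT_FIRST + 4u * (word_index % CF_EXT_PORT_WORDS);
}

/* Copy one 512-byte block to the Write FIFO using incrementing addresses.
 * 'regs' is a hypothetical pointer to the register space (word indexed). */
static void cf_write_block(volatile uint32_t *regs, const uint32_t *src)
{
    for (uint32_t i = 0; i < CF_EXT_PORT_WORDS; i++)
        regs[cf_ext_wr_offset(i) / 4u] = src[i];
}
```

Every address in a window reaches the same FIFO, so the incrementing addresses exist only to let the bus master issue an incrementing burst.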
TrueIDE registers provide a window with which the CPU can directly access the ATA
Registers in the CF/CF+ Device. All TrueIDE registers are 8-bit registers except the
DataPort register, which is a 16-bit register. Accesses to these registers are treated as
Non-Posted for read/write operations, and the transfers on the AHB interface are extended
until the transfer is completed on the CF/CF+ Interface. Accesses to these registers are
completed with Error response when the Interface is not operating in TrueIDE mode. The
CPU can use the ATA DataPort if it wishes to directly transfer the data between the CPU and
the CF/CF+ Device using TrueIDE PIO Mode transfer Protocol. The direct access of the ATA
DataPort is treated as transparent mode with the CFHOST controller providing only bus
protocol translation between the AHB bus and the CF/CF+ Interface.
CF host timing control block
The timing control block generates the timing information to the CF/CF+ Interface controller
block while the Interface block is generating transfers on the CF/CF+ Interface. The timing
information is based on the cficlk frequency, and on the current CF/CF+ Card timing mode.
The timing information is critical for the correct operation of the CFHOST controller to
prevent violating the timing protocol when generating transfers on the CF/CF+ Interface, and
for the correct operation of the CF/CF+ Card. The timing information is hard coded in this
block based on the timing values called for in the Specification. The timing information is
divided based on the Host controller’s operating mode and transfer mode.
CF host transaction controller block
The transaction controller is the main control for the CFHOST controller that manages
transaction sequence generation on the CF/CF+ Interface. Based on the current transfer
mode of the CFHOST controller, the controller generates the transfer sequence to the
CF/CF+ Interface controller when the Transfer Control Register is programmed to initiate
transfer.
In TrueIDE PIO mode, the controller can perform PIO Block transfers to the CF/CF+. The
controller completely handles the PIO Transfer protocol by monitoring INTRQ signal and
BSY/DRQ bit for every Block Transfer until the complete data is transferred. The INTRQ
signal is blocked from the CPU until the PIO Transfer completes, at which point the INTRQ is
passed through to the Host CPU. The address is always fixed to 0b000 to address the Data
Port Register.
The Transfer Count must be a multiple of the DRQ Block size that is programmed in the
CFHOST controller and the CF/CF+ Device. Supported DRQ block sizes are 512
bytes (default), 1024 bytes, 2048 bytes, and 4096 bytes. The DRQ block size is also used
to monitor the INTRQ signal for every new Block. Once the Transfer completes, the INTRQ
from the CF/CF+ Device is forwarded to the AHB Bus for the CPU to process the interrupts.
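The constraint on the PIO transfer count can be checked in software before the transfer control registers are programmed. The sketch below assumes only the block sizes listed above; the function names are hypothetical, and rejecting a zero count is an assumption rather than a documented rule.

```c
#include <stdbool.h>
#include <stdint.h>

/* True for the DRQ block sizes listed in the text: 512, 1024, 2048, 4096. */
static bool cf_drq_block_size_supported(uint32_t bytes)
{
    return bytes == 512u || bytes == 1024u || bytes == 2048u || bytes == 4096u;
}

/* True if 'count' bytes is a whole number of DRQ blocks.
 * Rejecting count == 0 is an assumption made for this sketch. */
static bool cf_pio_transfer_count_valid(uint32_t count, uint32_t drq_block)
{
    return cf_drq_block_size_supported(drq_block)
        && count != 0u
        && (count % drq_block) == 0u;
}
```

For example, a 4096-byte transfer is valid with any of the four block sizes, whereas a 1536-byte transfer is valid only with the default 512-byte block.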
In DMA mode, the controller can perform either TrueIDE MultiWord DMA transfers or
UltraDMA transfers to the CF/CF+ device. The transfer size can be from 2 to 65536 bytes.
The CFHOST controller handles the DMA protocol on the CF/CF+ Interface completely,
shielding the host CPU from the complexities of the protocol.
CF host data FIFO block
The data FIFO block contains a FIFO that is used for read/write transfers from/to the
CF/CF+ Card. This is an asynchronous FIFO, 32 bits wide and 256 words deep, with one side
operating on ahbclk while the other side operates on clk_xin. For Write transfers to the
CF/CF+ Card, this FIFO is called the Write FIFO and for read transfers from CF/CF+ Card,
this FIFO is called the Read FIFO.
The Write FIFO is used when transferring data from the CPU to the CF/CF+ Card (Write
Transfers). Once the transfer control registers are programmed, the CPU writes the data to
be transferred into the Write FIFO by means of the write data port register. The CFHOST
controller translates the 32-bit data into 16-bit wide data. After each block of data is
transferred into the Write FIFO, the CPU waits for a Buffer Available Interrupt to transfer the
next Block of data (up to value programmed in the Transfer Block Count Register) into the
Write FIFO.
The Read FIFO is used when transferring data from the CF/CF+ Card to the CPU (Read
Transfers). Once the Transfer Control Registers are programmed, the CPU reads the data
from the Read FIFO by means of the Read Data Port Register. The CFHOST controller
assembles the 32-bit data from multiple 16-bit transfers on the CF Interface. For each block
of data read from the CF/CF+ Card, the controller asserts a Buffer Available Interrupt to
indicate that there is data available in the FIFO.
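The width conversion performed between the 32-bit FIFO and the 16-bit CF data bus can be modelled in a few lines. This is only an illustration of the idea: the half-word ordering (low half first) is an assumption, since this section does not state the order used by the hardware, and the function names are hypothetical.

```c
#include <stdint.h>

/* Split one 32-bit FIFO word into two 16-bit bus transfers.
 * Assumption: the low half-word is transferred first. */
static void cf_split_word(uint32_t word, uint16_t out[2])
{
    out[0] = (uint16_t)(word & 0xFFFFu);
    out[1] = (uint16_t)(word >> 16);
}

/* Reassemble a 32-bit FIFO word from two 16-bit bus transfers,
 * using the same (assumed) ordering as cf_split_word(). */
static uint32_t cf_join_halfwords(const uint16_t in[2])
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 16);
}
```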
CompactFlash/CF+ interface block
The CF/CF+ Interface block interfaces with the CF/CF+ Interface signals and performs the
PIO/DMA Transfers based on the Transaction controller instructions. The Interface block
deals with one Memory, IO, PIO, or DMA Burst transfer at a time, while the Transaction
controller maintains the overall Transfer Counts. The Interface controller gets the Timing
Information from the Timing controller Block and uses this timing information when
performing PIO or DMA transfers.
xD Interface Block
AHB interface. The AHB slave block houses the Operational registers and handles the
reading and writing of these registers by the Arm Processor.
Synchronization module (SYNC) has handshake logic to communicate with the AHB
Interface, and on the other side communicates with the xD Card Interface.
ECC detection & correction module calculates the ECC code and contains the ECC detection
and correction logic. During an xD Card write command, the calculated ECC code is stored
in the xD Card ECC area. During a read, the calculated ECC code is compared with the ECC
received from the xD Card, and the error correction logic corrects a 1-bit error in a
byte.
Note:
Three bytes of ECC are calculated for every 256 bytes in the Main Area. ECC is not calculated
for the Redundant Area.
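As a worked example of the ratio in the note, the number of ECC bytes for a given Main Area size can be computed as shown below. The helper name is hypothetical and the 512-byte figure in the usage remark is only an illustrative size; the ECC algorithm itself is not described in this section and is not modelled here.

```c
#include <stdint.h>

#define XD_ECC_CHUNK_BYTES      256u /* Main Area bytes covered by one ECC code */
#define XD_ECC_BYTES_PER_CHUNK  3u   /* ECC bytes generated per chunk           */

/* Number of ECC bytes generated for 'main_area_bytes' of Main Area data.
 * Only whole 256-byte chunks are counted. */
static uint32_t xd_ecc_bytes(uint32_t main_area_bytes)
{
    return (main_area_bytes / XD_ECC_CHUNK_BYTES) * XD_ECC_BYTES_PER_CHUNK;
}
```

With these constants, a 512-byte Main Area yields two chunks and therefore six ECC bytes.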
xD controller handles all command, address, and data sequences, manages all hardware
protocols, and enables users to access xD Memory by reading or writing into the AHB slave
operational registers.
19
Giga/Fast Ethernet controller (GMAC)
This chapter focuses on GMAC functionality and operation.
For the GMAC feature list, refer to the SPEAr1340 datasheet:
●
Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
●
RM0089, Reference manual, SPEAr1340 address map and registers
19.1
Overview
The GMAC IP provides the capability to transmit and receive data over Ethernet.
19.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
19.3
Clocks
See Chapter 5: Reset and clock generator (RCG).
19.4
Functional description
The Giga/Fast Ethernet controller supports the following interfaces:
●
MII: Media independent interface
●
GMII: Gigabit media independent interface
●
RMII: Reduced MII
These interfaces are multiplexed on the device pads. To select between the available
interfaces, you must configure the miscellaneous register GMAC_CLK_CFG[5:3],
macphy_sel.
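A read-modify-write of the macphy_sel field can be sketched as below. Only the field position, bits [5:3] of GMAC_CLK_CFG, is taken from the text; the macro and function names are hypothetical, and the encodings that select MII, GMII, or RMII are documented in RM0089 and deliberately not reproduced here.

```c
#include <stdint.h>

#define GMAC_CLK_CFG_MACPHY_SEL_SHIFT 3u
#define GMAC_CLK_CFG_MACPHY_SEL_MASK  (0x7u << GMAC_CLK_CFG_MACPHY_SEL_SHIFT)

/* Return 'reg' with macphy_sel (bits [5:3]) replaced by 'sel'.
 * 'sel' is the raw 3-bit encoding; see RM0089 for its meaning. */
static uint32_t gmac_clk_cfg_set_macphy_sel(uint32_t reg, uint32_t sel)
{
    return (reg & ~GMAC_CLK_CFG_MACPHY_SEL_MASK)
         | ((sel << GMAC_CLK_CFG_MACPHY_SEL_SHIFT) & GMAC_CLK_CFG_MACPHY_SEL_MASK);
}

/* Extract the raw macphy_sel encoding from a GMAC_CLK_CFG value. */
static uint32_t gmac_clk_cfg_get_macphy_sel(uint32_t reg)
{
    return (reg & GMAC_CLK_CFG_MACPHY_SEL_MASK) >> GMAC_CLK_CFG_MACPHY_SEL_SHIFT;
}
```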
This section describes both normal and alternate/enhanced descriptor formats.
Note:
In SPEAr1340, only the alternate/enhanced format is supported by hardware.
19.4.1
Descriptors
Note:
In SPEAr1340, only the alternate/enhanced format is supported by hardware.
Alternate or enhanced descriptors
The alternate (or enhanced) descriptor structure has 8 DWORDS (32 bytes). The features of
the alternate descriptor structure are:
●
The alternative descriptor structure has been implemented to support buffers of up to
8 KB (useful for Jumbo frames).
●
There is a reassignment of control and status bits in TDES0, TDES1, RDES0
(Advanced timestamp or IPC full offload configuration), and RDES1.
●
The transmit descriptor stores the timestamp in TDES6 and TDES7 for the Advanced
Timestamp.
●
This receive descriptor structure is also used for storing the extended status (RDES4)
and timestamp (RDES6 and RDES7) for advanced timestamp feature or IPC full offload
feature.
●
For the Timestamp feature, the software needs to allocate 32 bytes (8 DWORDS) of
memory for every descriptor. When Timestamping or Receive IPC FullOffload engine
are not enabled, the extended descriptors are not required and the software can use
alternate descriptors with the default size of 16 bytes. The core also needs to be
configured for this change using bit 7 (ATDS: alternate descriptor size) of the Bus Mode
register.
●
When alternate descriptor is chosen without Timestamp or Full IPC Offload feature, the
descriptor size is always 4 DWORDs (DES0-DES3).
The bit mapping of the alternate descriptor structure (in little-endian mode) is given
below.
Note:
The effect of big-endian mode (byte swap) applies to this descriptor structure as well.
When alternate descriptor with only Full IPC Checksum Offload (Type 2) is selected, it is not
backward compatible with respect to status bits[7,5,0] in RDES0. In this mode, you should
enable the extended descriptor mode (8 DWORDS) to get the IPC checksum engine status
in RDES4.
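The two descriptor sizes described above can be expressed as a C layout. This is a sketch under stated assumptions: the structure and function names are hypothetical, the 8-word layout and the ATDS bit position (bit 7 of the Bus Mode register) come from the text, and cache alignment or descriptor skip length are not considered.

```c
#include <stdint.h>

/* Alternate (enhanced) descriptor with extended words: 8 DWORDs. */
struct gmac_alt_desc {
    uint32_t des[8]; /* des[0..3]: DES0-DES3, des[4..7]: DES4-DES7 */
};

_Static_assert(sizeof(struct gmac_alt_desc) == 32,
               "extended alternate descriptor must be 32 bytes");

#define GMAC_DMA_BUS_MODE_ATDS (1u << 7) /* alternate descriptor size */

/* Descriptor size in bytes implied by a Bus Mode register value:
 * 32 bytes when ATDS is set, 16 bytes (DES0-DES3 only) otherwise. */
static uint32_t gmac_desc_size_bytes(uint32_t bus_mode)
{
    return (bus_mode & GMAC_DMA_BUS_MODE_ATDS) ? 32u : 16u;
}
```

Software that allocates descriptor memory would typically derive the per-descriptor size from the same setting it writes to the Bus Mode register, so that the two cannot disagree.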
Transmit descriptors
Figure 93 shows the transmit descriptor structure. The application software must program
the control bits TDES0[31:20] during descriptor initialization. When the DMA updates the
descriptor, it writes back all the control bits except the OWN bit (which it clears) and updates
the status bits[19:0]. Table 118 describes transmitter descriptor word 0 (TDES0) through
word 3 (TDES3).
With advanced timestamp support, a timestamp snapshot can be enabled for a given frame
by setting TTSE: Transmit Timestamp Enable (TDES0 bit 25). When the descriptor is closed
(that is, when the OWN bit is cleared), the timestamp is written into TDES6 and TDES7.
This is indicated by the status bit TTSS: Transmit Timestamp Status (TDES0 bit 17). The
contents of TDES6 and TDES7 are listed in Table 119.
Note:
When either the advanced timestamp or IPC offload (Type 2) feature is enabled, the
software should set the DMA Bus Mode register[7], so that the DMA operates with extended
descriptor size. When this control bit is reset, the TDES4-TDES7 descriptor spaces are not
valid.
Figure 93. Transmitter descriptor fields - alternate (enhanced) format
TDES0: OWN [31] | Ctrl [30:26] | TTSE [25] | RES [24] | Ctrl [23:20] | RES [19:18] | TTSS [17] | Status [16:0]
TDES1: RES [31:29] | Buffer 2 Byte Count [28:16] | RES [15:13] | Buffer 1 Byte Count [12:0]
TDES2: Buffer 1 Address [31:0]
TDES3: Buffer 2 Address [31:0] or Next Descriptor Address [31:0]
TDES4: Reserved
TDES5: Reserved
TDES6: Transmit Time Stamp Low [31:0]
TDES7: Transmit Time Stamp High [31:0]
The DMA always reads or fetches four DWORDS of the descriptor from system memory to
obtain the buffer and control information as shown in Figure 94. For AV feature support,
TDES0 has additional control bits[6:3] for channel 1. For channel 0, the bits 6:3 are ignored.
Table 118 describes bits 6:3.
Figure 94. Transmit descriptor fetch (read) for alternate (enhanced) format
TDES0: OWN [31] | Ctrl [30:26] | TTSE [25] | RES [24] | Ctrl [23:20] | RES [19:18] | Reserved for Status [17:7] | SLOT Number [6:3] | Reserved for Status [3:0]
TDES1: RES [31:29] | Buffer 2 Byte Count [28:16] | RES [15:13] | Buffer 1 Byte Count [12:0]
TDES2: Buffer 1 Address [31:0]
TDES3: Buffer 2 Address [31:0] or Next Descriptor Address [31:0]
Table 118. Transmit descriptor words 0 through 3 (TDES0 — TDES3)
Bit
Description
TDES0
31
OWN: Own Bit
When set, this bit indicates that the descriptor is owned by the DMA. When this bit is reset, it indicates that
the descriptor is owned by the Host. The DMA clears this bit either when it completes the frame
transmission or when the buffers allocated in the descriptor are read completely.
To avoid a possible race condition between fetching a descriptor and the driver setting an ownership bit, set
the ownership bit of the frame’s first descriptor after all subsequent descriptors belonging to the same frame
are set.
30
IC: Interrupt on Completion
When set, this bit sets the Transmit Interrupt (Register 5[0]) after the present frame has been transmitted.
29
LS: Last Segment
When set, this bit indicates that the buffer contains the last segment of the frame. When this bit is set, the
TBS1: Transmit Buffer 1 Size or TBS2: Transmit Buffer 2 Size field in TDES1 should have a non-zero value.
28
FS: First Segment
When set, this bit indicates that the buffer contains the first segment of a frame.
27
DC: Disable CRC
When this bit is set, the GMAC does not append a cyclic redundancy check (CRC) to the end of the
transmitted frame. This is valid only when the first segment (TDES0[28]) is set.
26
DP: Disable Pad
When set, the GMAC does not automatically add padding to a frame shorter than 64 bytes. When this bit is
reset, the DMA automatically adds padding and CRC to a frame shorter than 64 bytes, and the CRC field is
added despite the state of the DC (TDES0[27]) bit. This is valid only when the first segment (TDES0[28]) is
set.
25
TTSE: Transmit Timestamp Enable
When set, this bit enables IEEE1588 hardware time stamping for the transmit frame referenced by the
descriptor. This field is valid only when the First Segment control bit (TDES0[28]) is set.
24
Reserved
23:22
CIC: Checksum Insertion Control
These bits control the checksum calculation and insertion. Bit encodings are as shown below.
2’b00: Checksum insertion disabled.
2’b01: Only IP header checksum calculation and insertion are enabled.
2’b10: IP header checksum and payload checksum calculation and insertion are enabled, but the pseudo-header checksum is not calculated in hardware.
2’b11: IP header checksum and payload checksum calculation and insertion are enabled, and the pseudo-header checksum is calculated in hardware.
When the configuration parameter IPC_FULL_OFFLOAD is not selected, this field is reserved.
21
TER: Transmit End of Ring
When set, this bit indicates that the descriptor list reached its final descriptor. The DMA returns to the base
address of the list, creating a descriptor ring.
20
TCH: Second Address Chained
When set, this bit indicates that the second address in the descriptor is the Next Descriptor address rather
than the second buffer address. When TDES0[20] is set, TBS2 (TDES1[28:16]) is a “don’t care” value.
TDES0[21] takes precedence over TDES0[20].
19:18 Reserved
17
TTSS: Transmit Timestamp Status
This field is used as a status bit to indicate that a timestamp was captured for the described transmit frame.
When this bit is set, TDES6 and TDES7 have a timestamp value captured for the transmit frame. This field
is only valid when the descriptor’s Last Segment control bit (TDES0[29]) is set.
16
IHE: IP Header Error
When set, this bit indicates that the GMAC transmitter detected an error in the IP datagram header. The
transmitter checks the header length in the IPv4 packet against the number of header bytes received from
the application and indicates an error status if there is a mismatch. For IPv6 frames, a header error is
reported if the main header length is not 40 bytes. Furthermore, the Ethernet Length/Type field value for an
IPv4 or IPv6 frame must match the IP header version received with the packet. For IPv4 frames, an error
status is also indicated if the Header Length field has a value less than 0x5.
15
ES: Error Summary
Indicates the logical OR of the following bits:
– TDES0[14]: Jabber Timeout
– TDES0[13]: Frame Flush
– TDES0[11]: Loss of Carrier
– TDES0[10]: No Carrier
– TDES0[9]: Late Collision
– TDES0[8]: Excessive Collision
– TDES0[2]: Excessive Deferral
– TDES0[1]: Underflow Error
– TDES0[16]: IP Header Error
– TDES0[12]: IP Payload Error
14
JT: Jabber Timeout
When set, this bit indicates the GMAC transmitter has experienced a jabber time-out. This bit is only set
when the GMAC configuration register’s JD bit is not set.
13
FF: Frame Flushed
When set, this bit indicates that the DMA/MTL flushed the frame due to a software Flush command given by
the CPU.
12
IPE: IP Payload Error
When set, this bit indicates that GMAC transmitter detected an error in the TCP, UDP, or ICMP IP datagram
payload.
The transmitter checks the payload length received in the IPv4 or IPv6 header against the actual number of
TCP, UDP, or ICMP packet bytes received from the application and issues an error status in case of a
mismatch.
11
LC: Loss of Carrier
When set, this bit indicates that a loss of carrier occurred during frame transmission (that is, the gmii_crs_i
signal was inactive for one or more transmit clock periods during frame transmission). This is valid only for
the frames transmitted without collision when the GMAC operates in Half-Duplex mode.
10
NC: No Carrier
When set, this bit indicates that the Carrier Sense signal from the PHY was not asserted during
transmission.
9
LC: Late Collision
When set, this bit indicates that frame transmission was aborted due to a collision occurring after the
collision window (64 byte-times, including preamble, in MII mode and 512 byte-times, including preamble
and carrier extension, in GMII mode). This bit is not valid if the Underflow Error bit is set.
8
EC: Excessive Collision
When set, this bit indicates that the transmission was aborted after 16 successive collisions while
attempting to transmit the current frame. If the DR (Disable Retry) bit in the GMAC Configuration register is
set, this bit is set after the first collision, and the transmission of the frame is aborted.
7
VF: VLAN Frame
When set, this bit indicates that the transmitted frame was a VLAN-type frame.
6:3
CC: Collision Count (Status field)
These status bits indicate the number of collisions that occurred before the frame was transmitted. This
count is not valid when the Excessive Collisions bit (TDES0[8]) is set. The core updates this status field only
in the half-duplex mode.
-or-
SLOTNUM: Slot Number Control Bits in AV Mode
These bits indicate the slot interval in which the data should be fetched from the corresponding buffers
addressed by TDES2 or TDES3.
When the transmit descriptor is fetched, the DMA compares the slot number value in this field with the slot
interval maintained in the core (Register 11xx). It fetches the data from the buffers only if there is a match in
values. These bits are valid only for the AV channel 1 (not channel 0).
2
ED: Excessive Deferral
When set, this bit indicates that the transmission has ended because of excessive deferral of over 24,288 bit
times (155,680 bit times in 1,000-Mbps mode or if Jumbo Frame is enabled) if the Deferral Check (DC) bit
in the GMAC Control register is set high.
1
UF: Underflow Error
When set, this bit indicates that the GMAC aborted the frame because data arrived late from the Host
memory. Underflow Error indicates that the DMA encountered an empty transmit buffer while transmitting
the frame. The transmission process enters the Suspended state and sets both Transmit Underflow
(Register 5[5]) and Transmit Interrupt (Register 5[0]).
0
DB: Deferred Bit
When set, this bit indicates that the GMAC defers before transmission because of the presence of carrier.
This bit is valid only in Half-Duplex mode.
TDES1
31:29 Reserved
28:16
TBS2: Transmit Buffer 2 Size
These bits indicate the second data buffer size in bytes. This field is not valid if TDES0[20] is set.
15:13 Reserved
12:0
TBS1: Transmit Buffer 1 Size
These bits indicate the first data buffer size, in bytes. If this field is 0, the DMA ignores this buffer and
uses Buffer 2 or the next descriptor, depending on the value of TCH (TDES0[20]).
TDES2
31:0
Buffer 1 Address Pointer
These bits indicate the physical address of Buffer 1. There is no limitation on the buffer address alignment.
TDES3
31:0
Buffer 2 Address Pointer (Next Descriptor Address)
Indicates the physical address of Buffer 2 when a descriptor ring structure is used. If the Second Address
Chained (TDES0[20]) bit is set, this address contains the pointer to the physical memory where the Next
Descriptor is present. The buffer address pointer must be aligned to the bus width only when TDES0[20] is
set. (LSBs are ignored internally.)
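The ownership rule given for the OWN bit (hand over the first descriptor of a frame last) can be sketched as follows. This is an illustration under stated assumptions, not a driver: the structure and function names are hypothetical, descriptors are assumed contiguous with 16-byte size, one buffer is used per descriptor with TBS2 left at 0, and the End of Ring bit and memory barriers are not handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit positions from Table 118. */
#define TDES0_OWN (1u << 31)
#define TDES0_IC  (1u << 30)
#define TDES0_LS  (1u << 29)
#define TDES0_FS  (1u << 28)
#define TDES1_TBS1_MASK 0x1FFFu /* TDES1[12:0] */

struct gmac_tx_desc {
    volatile uint32_t des[4]; /* TDES0..TDES3 */
};

/* Describe one frame spread over n buffers, one buffer per descriptor. */
static void gmac_tx_prepare_frame(struct gmac_tx_desc *d,
                                  const uint32_t *buf_addr,
                                  const uint32_t *buf_len, size_t n)
{
    if (n == 0)
        return;
    for (size_t i = 0; i < n; i++) {
        uint32_t ctrl = 0;
        if (i == 0)
            ctrl |= TDES0_FS;            /* first segment */
        if (i == n - 1)
            ctrl |= TDES0_LS | TDES0_IC; /* last segment, interrupt on completion */
        if (i != 0)
            ctrl |= TDES0_OWN;           /* later descriptors are handed over now */
        d[i].des[1] = buf_len[i] & TDES1_TBS1_MASK; /* TBS1; TBS2 stays 0 */
        d[i].des[2] = buf_addr[i];                  /* Buffer 1 address */
        d[i].des[0] = ctrl;
    }
    /* On real hardware a write barrier would typically be needed here. */
    d[0].des[0] |= TDES0_OWN; /* first descriptor is handed over last */
}
```

Because the DMA stops at a descriptor it does not own, withholding OWN on the first descriptor keeps it from starting the frame before all of its segments are described.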
Table 119. Transmit descriptor words 6 and 7 (TDES6 and TDES7)
Bit
Description
TDES6
31:0
TTSL: Transmit Frame Timestamp Low
This field is updated by DMA with the least significant 32 bits of the timestamp captured for the corresponding
transmit frame. This field has the timestamp only if the Last Segment bit (LS) in the descriptor is set and
Timestamp status (TTSS) bit is set.
TDES7
31:0
TTSH: Transmit Frame Timestamp High
This field is updated by DMA with the most significant 32 bits of the timestamp captured for the
corresponding transmit frame. This field has the timestamp only if the Last Segment bit (LS) in the descriptor
is set and Timestamp status (TTSS) bit is set.
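Reading back a transmit timestamp from a closed descriptor can be sketched as below. The bit positions follow Table 118 and the word indices follow Table 119; the function name is hypothetical, and packing the two words into one 64-bit value is a convenience of this sketch, since the unit of each half is not stated in this section.

```c
#include <stdbool.h>
#include <stdint.h>

#define TDES0_OWN  (1u << 31)
#define TDES0_LS   (1u << 29)
#define TDES0_TTSS (1u << 17)

/* 'des' points to the 8 words of an extended transmit descriptor.
 * Returns true and stores the timestamp if the descriptor is closed,
 * is the last segment, and reports a captured timestamp. */
static bool gmac_tx_timestamp(const uint32_t des[8], uint64_t *ts)
{
    uint32_t status = des[0];

    if (status & TDES0_OWN)
        return false; /* still owned by the DMA */
    if (!(status & TDES0_LS) || !(status & TDES0_TTSS))
        return false; /* TDES6/TDES7 do not hold a timestamp */

    *ts = ((uint64_t)des[7] << 32) | des[6]; /* TTSH : TTSL */
    return true;
}
```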
Receive descriptors
Figure 95 shows the structure of the receive descriptor. It has 32 bytes of descriptor
data (8 DWORDs) for the Advanced Timestamp or IPC Full Offload feature.
Note:
For each of these features, the software should set the DMA Bus Mode register[7] so that
the DMA operates with extended descriptor size. When this control bit is reset, RDES0[7]
and RDES0[0] are always cleared and the RDES4-RDES7 descriptor space is not valid.
Figure 95. Receive descriptor fields - alternate (enhanced) format
RDES0: OWN [31] | Status [30:0]
RDES1: CTRL [31] | RES [30:29] | Buffer 2 Byte Count [28:16] | CTRL [15:14] | RES [13] | Buffer 1 Byte Count [12:0]
RDES2: Buffer 1 Address [31:0]
RDES3: Buffer 2 Address [31:0] or Next Descriptor Address [31:0]
RDES4: Extended Status [31:0]
RDES5: Reserved
RDES6: Receive Time Stamp Low [31:0]
RDES7: Receive Time Stamp High [31:0]
●
Table 120 describes RDES0 through RDES3.
●
The extended status is written as shown in Table 121. The extended status is written
only when there is status related to IPC or timestamp available. The availability of
extended status is indicated by bit-0 of RDES0. This status is available for the Advanced
Timestamp or IPC Full Offload features.
●
RDES6 and RDES7 contain a snapshot of the time-stamp. The availability of that
snapshot is indicated by bit-7 in the RDES0 descriptor. Table 122 lists contents of
RDES6 and RDES7.
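The availability checks described in the list above can be sketched as follows. It assumes that extended descriptors are enabled and that the Advanced Timestamp feature is in use, so that RDES0 bit 7 means "timestamp available"; the Last Descriptor bit (RDES0[8]) is taken from Table 120. The structure and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define RDES0_OWN (1u << 31)
#define RDES0_LS  (1u << 8) /* Last Descriptor */
#define RDES0_TSA (1u << 7) /* timestamp available (assumed configuration) */
#define RDES0_ESA (1u << 0) /* extended status available */

struct gmac_rx_ext {
    bool     ext_status_valid;
    uint32_t ext_status; /* RDES4 */
    bool     timestamp_valid;
    uint64_t timestamp;  /* RDES7 : RDES6 */
};

/* 'des' points to the 8 words of an extended receive descriptor.
 * Returns false if the descriptor is still owned by the DMA or is not
 * the last descriptor of a frame. */
static bool gmac_rx_read_ext(const uint32_t des[8], struct gmac_rx_ext *out)
{
    uint32_t status = des[0];

    if ((status & RDES0_OWN) || !(status & RDES0_LS))
        return false;

    out->ext_status_valid = (status & RDES0_ESA) != 0;
    out->ext_status = out->ext_status_valid ? des[4] : 0;
    out->timestamp_valid = (status & RDES0_TSA) != 0;
    out->timestamp = out->timestamp_valid
                   ? (((uint64_t)des[7] << 32) | des[6]) : 0;
    return true;
}
```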
Table 120. Receive descriptor fields (RDES0 through RDES3)
Bit
Description
RDES0
31
OWN: Own Bit
When set, this bit indicates that the descriptor is owned by the DMA of the GMAC Subsystem. When this bit
is reset, this bit indicates that the descriptor is owned by the Host. The DMA clears this bit either when it
completes the frame reception or when the buffers that are associated with this descriptor are full.
30
AFM: Destination Address Filter Fail
When set, this bit indicates a frame that failed in the DA Filter in the GMAC Core.
29:16
FL: Frame Length
These bits indicate the byte length of the received frame that was transferred to host memory (including
CRC). This field is valid when Last Descriptor (RDES0[8]) is set and either the Descriptor Error
(RDES0[14]) or Overflow Error bits are reset. The frame length also includes the two bytes appended to the
Ethernet frame when IP checksum calculation (Type 1) is enabled and the received frame is not a MAC
control frame.
When the Last Descriptor and Error Summary bits are not set, this field indicates the accumulated number
of bytes that have been transferred for the current frame.
15
ES: Error Summary
Indicates the logical OR of the following bits:
– RDES0[1]: CRC Error
– RDES0[3]: Receive Error
– RDES0[4]: Watchdog Timeout
– RDES0[6]: Late Collision
– RDES0[7]: Giant Frame
– RDES4[4:3]: IP Header/Payload Error
– RDES0[11]: Overflow Error
– RDES0[14]: Descriptor Error
This field is valid only when the Last Descriptor (RDES0[8]) is set.
14
DE: Descriptor Error
When set, this bit indicates a frame truncation caused by a frame that does not fit within the current
descriptor buffers, and that the DMA does not own the Next Descriptor. The frame is truncated. This field is
valid only when the Last Descriptor (RDES0[8]) is set.
13
SAF: Source Address Filter Fail
When set, this bit indicates that the SA field of frame failed the SA Filter in the GMAC Core.
12
LE: Length Error
When set, this bit indicates that the actual length of the received frame does not match the Length/Type
field. This bit is valid only when the Frame Type (RDES0[5]) bit is reset.
11
OE: Overflow Error
When set, this bit indicates that the received frame was damaged due to buffer overflow in MTL.
10
VLAN: VLAN Tag
When set, this bit indicates that the frame pointed to by this descriptor is a VLAN frame tagged by the
GMAC Core.
9
FS: First Descriptor
When set, this bit indicates that this descriptor contains the first buffer of the frame. If the size of the first
buffer is 0, the second buffer contains the beginning of the frame. If the size of the second buffer is also 0,
the next Descriptor contains the beginning of the frame.
8
LS: Last Descriptor
When set, this bit indicates that the buffers pointed to by this descriptor are the last buffers of the frame.
7
Timestamp Available/IP Checksum Error (Type1) / Giant Frame
When Advanced Timestamp feature is present:
When set, this bit indicates that a snapshot of the Timestamp is written in descriptor words 6 (RDES6) and
7 (RDES7). This is valid only when the Last Descriptor bit (RDES0[8]) is set.
When IP Checksum Engine (Type 1) is selected:
When set, this bit indicates that the 16-bit IPv4 Header checksum calculated by the core did not match the
received checksum bytes.
Otherwise:
When set, this bit indicates the Giant Frame status. Giant frames are normal frames larger than 1,518 bytes
(1,522 bytes for VLAN) and, when Jumbo Frame processing is enabled, jumbo frames larger than
9,018 bytes (9,022 bytes for VLAN).
6
LC: Late Collision
When set, this bit indicates that a late collision has occurred while receiving the frame in Half-Duplex mode.
5
FT: Frame Type
When set, this bit indicates that the Receive Frame is an Ethernet-type frame (the LT field is greater than or
equal to 16’h0600). When this bit is reset, it indicates that the received frame is an IEEE802.3 frame. This
bit is not valid for Runt frames less than 14 bytes.
4
RWT: Receive Watchdog Timeout
When set, this bit indicates that the Receive Watchdog Timer has expired while receiving the current frame
and the current frame is truncated after the Watchdog Timeout.
3
RE: Receive Error
When set, this bit indicates that the gmii_rxer_i signal is asserted while gmii_rxdv_i is asserted during
frame reception. This error also includes carrier extension error in GMII and Half-duplex mode. Error can
be of less/no extension, or error (rxd ≠ 0f) during extension.
2
DE: Dribble Bit Error
When set, this bit indicates that the received frame has a non-integer multiple of bytes (odd nibbles). This
bit is valid only in MII Mode.
1
CE: CRC Error
When set, this bit indicates that a Cyclic Redundancy Check (CRC) Error occurred on the received frame.
This field is valid only when the Last Descriptor (RDES0[8]) is set.
0
Extended Status Available/Rx MAC Address
When either Advanced Timestamp or IP Checksum Offload (Type 2) is present, this bit, when set, indicates
that the extended status is available in descriptor word 4 (RDES4). This is valid only when the Last
Descriptor bit (RDES0[8]) is set.
When neither the Advanced Timestamp feature nor IPC Full Offload is selected, this bit indicates Rx MAC
Address status. When set, this bit indicates that one of the Rx MAC Address registers (1 to 31) matched the
frame’s DA field. When reset, this bit indicates that the Rx MAC Address Register 0 value matched the DA field.
RDES1
31
DIC: Disable Interrupt on Completion
When set, this bit prevents setting the Status Register’s RI bit (CSR5[6]) for the received frame ending in
the buffer indicated by this descriptor. This, in turn, disables the assertion of the interrupt to Host due to RI
for that frame.
30:29 Reserved
28:16
RBS2: Receive Buffer 2 Size
These bits indicate the second data buffer size, in bytes. The buffer size must be a multiple of 4, 8, or 16,
depending on the bus widths (32, 64, or 128, respectively), even if the value of RDES3 (buffer2 address
pointer) is not aligned to bus width. If the buffer size is not an appropriate multiple of 4, 8, or 16, the
resulting behavior is undefined. This field is not valid if RDES1[14] is set.
15
RER: Receive End of Ring
When set, this bit indicates that the descriptor list reached its final descriptor. The DMA returns to the base
address of the list, creating a descriptor ring.
14
RCH: Second Address Chained
When set, this bit indicates that the second address in the descriptor is the Next Descriptor address rather
than the second buffer address. When this bit is set, RBS2 (RDES1[28:16]) is a “don’t care” value.
RDES1[15] takes precedence over RDES1[14].
13
Reserved
12:0
RBS1: Receive Buffer 1 Size
Indicates the first data buffer size in bytes. The buffer size must be a multiple of 4, 8, or 16, depending upon
the bus widths (32, 64, or 128), even if the value of RDES2 (buffer1 address pointer) is not aligned. When
the buffer size is not a multiple of 4, 8, or 16, the resulting behavior is undefined. If this field is 0, the DMA
ignores this buffer and uses Buffer 2 or next descriptor depending on the value of RCH (Bit 14).
RDES2
31:0
Buffer 1 Address Pointer
These bits indicate the physical address of Buffer 1. There are no limitations on the buffer address
alignment except for the following condition: The DMA uses the configured value for its address generation
when the RDES2 value is used to store the start of frame. Note that the DMA performs a write operation
with the RDES2[3/2/1:0] bits as 0 during the transfer of the start of frame but the frame data is shifted as per
the actual Buffer address pointer. The DMA ignores RDES2[3/2/1:0] (corresponding to bus width of
128/64/32) if the address pointer is to a buffer where the middle or last part of the frame is stored.
RDES3
31:0
Buffer 2 Address Pointer (Next Descriptor Address)
These bits indicate the physical address of Buffer 2 when a descriptor ring structure is used. If the Second
Address Chained (RDES1[14]) bit is set, this address contains the pointer to the physical memory where
the Next Descriptor is present.
If RDES1[14] is set, the buffer (Next Descriptor) address pointer must be bus width-aligned (RDES3[3, 2, or
1:0] = 0, corresponding to a bus width of 128, 64, or 32; the LSBs are ignored internally). However, when
RDES1[14] is reset, there are no limitations on the RDES3 value, except for the following condition: The
DMA uses the configured value for its buffer address generation when the RDES3 value is used to store
the start of frame. The DMA ignores RDES3[3, 2, or 1:0] (corresponding to a bus width of 128, 64, or 32) if
the address pointer is to a buffer where the middle or last part of the frame is stored.
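As an illustration of the RDES1 layout in Table 120, the following C sketch packs the control word of a
receive descriptor. The macro and function names are hypothetical (they do not come from this manual or
from any ST driver), and returning 0 for a rejected size is a convention chosen for this example only.

```c
#include <stdint.h>

/* RDES1 field positions from Table 120. All names are illustrative. */
#define RDES1_DIC        (1u << 31)   /* Disable Interrupt on Completion */
#define RDES1_RBS2_SHIFT 16           /* RBS2 occupies bits 28:16 */
#define RDES1_RER        (1u << 15)   /* Receive End of Ring */
#define RDES1_RCH        (1u << 14)   /* Second Address Chained */
#define RDES1_RBS_MASK   0x1FFFu      /* 13-bit buffer size field */

/*
 * Build RDES1. bus_bytes is 4, 8 or 16 (bus width of 32, 64 or 128 bits).
 * Returns 0 when a size does not fit in 13 bits or is not a multiple of
 * bus_bytes, because the table describes that case as undefined behavior.
 */
static uint32_t rdes1_build(uint32_t buf1_size, uint32_t buf2_size,
                            int end_of_ring, int chained, uint32_t bus_bytes)
{
    uint32_t w = 0;

    if (buf1_size > RDES1_RBS_MASK || buf1_size % bus_bytes != 0)
        return 0;
    w |= buf1_size;                   /* RBS1, bits 12:0 */

    if (chained) {
        w |= RDES1_RCH;               /* RBS2 is "don't care" when RCH is set */
    } else {
        if (buf2_size > RDES1_RBS_MASK || buf2_size % bus_bytes != 0)
            return 0;
        w |= buf2_size << RDES1_RBS2_SHIFT;
    }
    if (end_of_ring)
        w |= RDES1_RER;
    return w;
}
```

For example, rdes1_build(2048, 0, 0, 1, 4) describes a chained descriptor with one 2048-byte buffer on a
32-bit bus.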
Table 121. Extended status — receive descriptor fields 4 (RDES4)
Bit
Description
31:21
Reserved
20:18
VLAN Tag Priority Value
These bits give the VLAN tag’s user value in the received packet. These bits are valid only when the
RDES4 bits 16 and 17 are set.
17
AV Tagged Packet Received
When set, this bit indicates that an AV tagged packet is received. Otherwise, this bit indicates that an
untagged AV packet is received. This bit is valid when bit 16 (AV Packet Received) is set.
16
AV Packet Received
When set, this bit indicates that an AV packet is received.
15
Reserved
14
Timestamp Dropped
When set, this bit indicates that the timestamp was captured for this frame but got dropped in the MTL
RxFIFO because of overflow. This bit is available only when you select the Advanced Timestamp feature.
Otherwise, this bit is reserved.
13
PTP Version
When set, this bit indicates that the received PTP message has the IEEE 1588 version 2 format.
When reset, it has the version 1 format. This is valid only if the message type is non-zero. This bit is
available only if the Advanced Timestamp feature is selected; otherwise, it is reserved.
12
PTP Frame Type
When set, this bit indicates that the PTP message is sent directly over Ethernet. When this bit is not set
and the message type is non-zero, it indicates that the PTP message is sent over UDP-IPv4 or UDP-IPv6.
The information on IPv4 or IPv6 can be obtained from bits 6 and 7. This bit is available only if Advanced
Timestamp feature is selected.
11:8
Message Type
These bits are encoded to give the type of the message received.
0000: No PTP message received
0001: SYNC (all clock types)
0010: Follow_Up (all clock types)
0011: Delay_Req (all clock types)
0100: Delay_Resp (all clock types)
0101: Pdelay_Req (in peer-to-peer transparent clock)
0110: Pdelay_Resp (in peer-to-peer transparent clock)
0111: Pdelay_Resp_Follow_Up (in peer-to-peer transparent clock)
1000: Announce
1001: Management
1010: Signaling
1011-1110: Reserved
7
IPv6 Packet Received
When set, this bit indicates that the received packet is an IPv6 packet.
6
IPv4 Packet Received
When set, this bit indicates that the received packet is an IPv4 packet.
5
IP Checksum Bypassed
When set, this bit indicates that the checksum offload engine is bypassed.
Table 121. Extended status — receive descriptor fields 4 (RDES4) (continued)
Bit
Description
4
IP Payload Error
When set, this bit indicates that the 16-bit IP payload checksum (that is, the TCP, UDP, or ICMP checksum)
that the core calculated does not match the corresponding checksum field in the received segment. It is
also set when the TCP, UDP, or ICMP segment length does not match the payload length value in the IP
Header field.
3
IP Header Error
When set, this bit indicates either that the 16-bit IPv4 header checksum calculated by the core does not
match the received checksum bytes, or that the IP datagram version is not consistent with the Ethernet
Type value.
2:0
IP Payload Type
These bits indicate the type of payload encapsulated in the IP datagram processed by the Receive
Checksum Offload Engine (COE). The COE also sets these bits to 3'b000 if it does not process the IP
datagram’s payload due to an IP header error or fragmented IP.
3'b000: Unknown or did not process IP payload
3'b001: UDP
3'b010: TCP
3'b011: ICMP
3’b1xx: Reserved
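The extended status word can be unpacked in software along the following lines. This is a minimal sketch
that only applies the bit positions listed in Table 121; the structure and function names are hypothetical.

```c
#include <stdint.h>

/* Decoded view of RDES4. Bit positions follow Table 121. */
struct rdes4_status {
    unsigned vlan_prio;      /* bits 20:18, valid only if bits 17 and 16 are set */
    unsigned av_tagged;      /* bit 17 */
    unsigned av_packet;      /* bit 16 */
    unsigned ts_dropped;     /* bit 14 */
    unsigned ptp_v2;         /* bit 13, meaningful if ptp_msg_type != 0 */
    unsigned ptp_over_eth;   /* bit 12, 0 means UDP-IPv4 or UDP-IPv6 */
    unsigned ptp_msg_type;   /* bits 11:8, 0 means no PTP message */
    unsigned ipv6;           /* bit 7 */
    unsigned ipv4;           /* bit 6 */
    unsigned csum_bypassed;  /* bit 5 */
    unsigned payload_err;    /* bit 4 */
    unsigned header_err;     /* bit 3 */
    unsigned payload_type;   /* bits 2:0: 0 unknown, 1 UDP, 2 TCP, 3 ICMP */
};

static struct rdes4_status rdes4_decode(uint32_t rdes4)
{
    struct rdes4_status s;

    s.vlan_prio     = (rdes4 >> 18) & 0x7u;
    s.av_tagged     = (rdes4 >> 17) & 0x1u;
    s.av_packet     = (rdes4 >> 16) & 0x1u;
    s.ts_dropped    = (rdes4 >> 14) & 0x1u;
    s.ptp_v2        = (rdes4 >> 13) & 0x1u;
    s.ptp_over_eth  = (rdes4 >> 12) & 0x1u;
    s.ptp_msg_type  = (rdes4 >> 8)  & 0xFu;
    s.ipv6          = (rdes4 >> 7)  & 0x1u;
    s.ipv4          = (rdes4 >> 6)  & 0x1u;
    s.csum_bypassed = (rdes4 >> 5)  & 0x1u;
    s.payload_err   = (rdes4 >> 4)  & 0x1u;
    s.header_err    = (rdes4 >> 3)  & 0x1u;
    s.payload_type  = rdes4 & 0x7u;
    return s;
}
```

A caller would normally check RDES0 bit 0 and the Last Descriptor bit (RDES0[8]) before trusting RDES4,
as described for RDES0 in Table 120.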
Table 122. Time-stamp snapshot — receive descriptor fields 6 and 7 (RDES6 & RDES7)
Bit
Description
RDES6
31:0
RTSL: Receive Frame Timestamp Low
This field is updated by DMA with the least significant 32 bits of the timestamp captured for the corresponding
receive frame. This field is updated by DMA only for the last descriptor of the receive frame which is indicated
by Last Descriptor status bit (RDES0[8]).
RDES7
31:0
RTSH: Receive Frame Timestamp High
This field is updated by DMA with the most significant 32 bits of the timestamp captured for the corresponding
receive frame. This field is updated by DMA only for the last descriptor of the receive frame which is indicated
by Last Descriptor status bit (RDES0[8]).
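The two snapshot words can be merged into one value in software. The sketch below assumes that the
upper word (RTSH) counts seconds and the lower word (RTSL) counts nanoseconds, which is how this chapter
describes the 64-bit reference time; the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Combine RDES7 (RTSH) and RDES6 (RTSL) into a single nanosecond count.
 * Assumption: RTSH holds seconds and RTSL holds nanoseconds below 10^9.
 */
static uint64_t rx_timestamp_ns(uint32_t rdes6_rtsl, uint32_t rdes7_rtsh)
{
    return (uint64_t)rdes7_rtsh * 1000000000ull + rdes6_rtsl;
}
```

With this convention, RTSH = 2 and RTSL = 500 give 2 000 000 500 ns.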
19.4.2
Precision Time Protocol (PTP)
The IEEE 1588-2002 standard defines a protocol, Precision Time Protocol (PTP), which
enables precise synchronization of clocks in measurement and control systems
implemented with technologies such as network communication, local computing, and
distributed objects. The PTP applies to systems communicating by local area networks
supporting multicast messaging, including Ethernet. This protocol enables heterogeneous
systems and supports system-wide synchronization accuracy in the sub-microsecond range
with minimal network and local clock computing resources. The PTP is transported over
UDP/IP. The system or network is classified into Master and Slave nodes for distributing the
timing and clock information. Figure 96 shows the process that PTP uses for
synchronizing a slave node to a master node by exchanging PTP messages.
Figure 96. Networked time synchronization
[Figure: timelines of the master clock and the slave clock exchanging the Sync, Follow_Up, Delay_Req, and
Delay_Resp messages; the data held at the slave clock grows from t2 to t1, t2, t3, t4.]
Figure 96 shows the PTP process:
1.
The master broadcasts the PTP Sync messages to all its nodes. The Sync message
contains the master's reference time information. The time at which this message
leaves the master's system is t1. This time must be captured, for Ethernet ports, at
GMII or MII.
2.
The slave receives the Sync message and also captures the exact time, t2, using its
timing reference.
3.
The master sends a Follow_up message to the slave, which contains t1 information for
later use.
4.
The slave sends a Delay_Req message to the master, noting the exact time, t3, at
which this frame leaves the GMII/MII.
5.
The master receives the message, capturing the exact time, t4, at which it enters its
system.
6.
The master sends the t4 information to the slave in the Delay_Resp message.
7.
The slave uses the four values of t1, t2, t3, and t4 to synchronize its local timing
reference to the master's timing reference. Most of the PTP implementation is done in
the software above the UDP layer. However, the hardware support is required to
capture the exact time when specific PTP packets enter or leave the Ethernet port at
the GMII/MII.
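This manual does not spell out how the slave combines the four values. The sketch below uses the usual
delay request-response arithmetic and assumes that the path delay is the same in both directions; the type
and function names are hypothetical, and all times are taken to be in nanoseconds.

```c
#include <stdint.h>

struct ptp_result {
    int64_t offset_ns;   /* slave clock minus master clock */
    int64_t delay_ns;    /* one-way path delay */
};

/*
 * t1: Sync leaves the master      (master time)
 * t2: Sync reaches the slave      (slave time)
 * t3: Delay_Req leaves the slave  (slave time)
 * t4: Delay_Req reaches the master (master time)
 * Assumption: symmetric path delay.
 */
static struct ptp_result ptp_offset_delay(int64_t t1, int64_t t2,
                                          int64_t t3, int64_t t4)
{
    int64_t master_to_slave = t2 - t1;   /* delay + offset */
    int64_t slave_to_master = t4 - t3;   /* delay - offset */
    struct ptp_result r;

    r.offset_ns = (master_to_slave - slave_to_master) / 2;
    r.delay_ns  = (master_to_slave + slave_to_master) / 2;
    return r;
}
```

The slave would then subtract offset_ns from its local time, for instance through the coarse or fine
correction methods of the system time module.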
To get a snapshot of the time, the MAC requires a reference time in 64-bit format. The
GMAC provides the following two options for using the reference timing source in a node:
●
External Timestamp Input Option, which takes an external 64-bit timing reference and its
clock as inputs; the clock is used to synchronize the timing reference to the MAC clock domain. The
64-bit timing reference is split into two 32-bit signals: the upper 32 bits (providing the time in
seconds) and the lower 32 bits (providing the time in nanoseconds).
●
Internal Reference Time Option that takes only the reference clock input and uses it to
generate the Reference time (also called the System Time) internally and capture
timestamps. The generation, update, and modification of the System Time are
described in the next paragraph.
System time register module
The System Time Generator module is optional and is not available if external time updating
is enabled. The 64-bit time is maintained and updated using the input reference clock
(clk_ptp_ref_i). This time is the source for taking snapshots (timestamps) of Ethernet frames
being transmitted or received at the GMII. The System Time counter can be initialized or
corrected using the coarse correction method. In this method, the initial value or the offset
value is written to the Timestamp Update registers (see RM0089, Reference
manual, SPEAr1340 address map and registers). For initialization, the System Time counter
is written with the value in the Timestamp Update registers, while for system time correction,
the offset value is added to or subtracted from the system time. In the fine correction
method, the frequency drift of a slave clock (clk_ptp_ref_i) with respect to the master clock is
corrected over a period of time rather than in a single clock cycle, as in coarse correction. In this
method, an accumulator sums up the contents of the Addend register, as shown in Figure 97. The
arithmetic carry that the accumulator generates is used as a pulse to increment the system
time counter (both the accumulator and the addend are 32-bit registers).
Figure 97. System time update using fine method
[Figure: block diagram with the Addend register (inputs addend_val[31:0] and addend_updt), an adder feeding
the Accumulator register, a constant value added to the Sub-second register on incr_sub_sec_reg, and the
Second register incremented on incr_sec_reg.]
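The fine correction method can be modeled in a few lines of C. The sketch below assumes that the
accumulator adds the addend once per clk_ptp_ref_i cycle, so that carries occur at the rate
f_ref * addend / 2^32; both function names are hypothetical, and the formula is a consequence of that
assumption rather than a statement from this manual.

```c
#include <stdint.h>

/*
 * Addend that makes the 32-bit accumulator overflow f_update_hz times per
 * second when clocked at f_ref_hz. Requires 0 < f_update_hz < f_ref_hz.
 */
static uint32_t ptp_addend(uint32_t f_ref_hz, uint32_t f_update_hz)
{
    return (uint32_t)(((uint64_t)f_update_hz << 32) / f_ref_hz);
}

/* Software model: count accumulator carries over a number of clock cycles. */
static uint32_t ptp_count_carries(uint32_t addend, uint32_t cycles)
{
    uint32_t acc = 0;
    uint32_t pulses = 0;
    uint32_t i;

    for (i = 0; i < cycles; i++) {
        uint32_t next = acc + addend;   /* wraps modulo 2^32 */
        if (next < acc)                 /* wrap-around is the arithmetic carry */
            pulses++;
        acc = next;
    }
    return pulses;
}
```

With a 100 MHz reference clock and a 50 MHz update rate, the addend is 0x80000000 and the model produces
one carry every two cycles. Nudging the addend slightly up or down is what speeds up or slows down the
system time to track the master.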
Transmit path functions
The MAC captures a timestamp when the Start Frame Delimiter (SFD) of a frame is sent on
GMII/MII. Each transmit frame can be marked to indicate whether a timestamp should be
captured for that frame. The user must specify the frames for which a timestamp is to be
captured, because the MAC does not process the transmitted frames to identify the
PTP frames.
The MAC returns the timestamp to the software inside the corresponding transmit descriptor
in the TDES2 and TDES3 fields. The TDES2 field holds the 32 least significant bits of the
timestamp.
In case of alternate (enhanced) descriptor, the MAC writes the 64-bit timestamp in TDES6
and TDES7, respectively.
Receive path functions
The MAC captures the timestamp of all frames received on the GMII or MII interface and
does not process the received frames to identify the PTP frames in the default mode, that is,
when the Advanced Timestamp feature is not selected. The DMA returns the timestamp to
the software in the corresponding receive descriptor, using the RDES2 and RDES3 fields.
The RDES2 holds the 32 least significant bits of the timestamp, except as mentioned in
"Receive Timestamp" on page 500. The timestamp is written only to that receive descriptor
for which the Last Descriptor status field has been set to 1 (the EOF marker). When the
timestamp is not available an all-ones pattern is written to the descriptors (RDES2 and
RDES3), indicating that timestamp is not correct. If the software uses a control register bit to
disable timestamping, the DMA does not alter RDES2 or RDES3.
In case of alternate (enhanced) descriptor, the MAC writes the 64-bit timestamp in RDES6
and RDES7, respectively.
The RDES0[7] field indicates whether the timestamp is updated in RDES6 and RDES7 or
not.
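The two validity rules above can be checked as follows. This is a minimal sketch with hypothetical
function names; it only encodes the all-ones convention for RDES2 and RDES3 and the RDES0[7] flag for the
alternate descriptor.

```c
#include <stdint.h>

#define RDES0_TS_AVAILABLE (1u << 7)   /* RDES0[7], alternate descriptor */

/* Basic descriptor: all ones in RDES2 and RDES3 means no valid timestamp. */
static int rx_basic_ts_valid(uint32_t rdes2, uint32_t rdes3)
{
    return !(rdes2 == 0xFFFFFFFFu && rdes3 == 0xFFFFFFFFu);
}

/* Alternate descriptor: RDES0[7] tells whether RDES6 and RDES7 were updated. */
static int rx_alt_ts_valid(uint32_t rdes0)
{
    return (rdes0 & RDES0_TS_AVAILABLE) != 0;
}
```

Both checks only make sense on the descriptor that has the Last Descriptor status bit set, since that is
the only descriptor to which the timestamp is written.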
Timestamp error margin
As mentioned in the previous paragraphs, the timestamp must be captured at the SFD of the
transmitted and received frames at the GMII or MII interface. Because the reference timing
source (the PTP clock, clk_ptp_ref_i) is different from the GMII or MII clocks, a
small error margin is introduced by the transfer of information across
asynchronous clock domains. In the transmit path, the captured and reported timestamp
has a maximum error margin of 2 PTP clocks. This means that the captured timestamp holds
the value that the reference timing source had within 2 PTP clocks after the SFD was
transmitted on the GMII. Similarly, in the receive path, the error margin is 3 GMII or MII
clocks, plus up to 2 PTP clocks. The error margin due to the three
GMII or MII clocks can be ignored by assuming that this constant delay is present in the system before the
SFD data reaches the GMII or MII interface of the MAC.
Frequency range of reference timing clock
The timestamp information is transferred across asynchronous clock domains, from the MAC
clock domain to the application clock domain. Therefore, a minimum delay is required between
two consecutive timestamp captures. This delay is 4 clock cycles of GMII or MII and 3 clock
cycles of the PTP clock. If the delay between two timestamp captures is less than this delay,
the MAC does not take a timestamp snapshot for the second frame. The maximum PTP
clock frequency is limited by the maximum resolution of the reference time.
The minimum PTP clock frequency depends on the time required between two consecutive
SFD bytes.
19.4.3
Advanced Timestamps
In addition to the basic timestamp features, the GMAC supports the following advanced
timestamp features:
●
Supports the IEEE 1588-2008 (version 2) timestamp format.
●
Provides an option to take snapshot of all frames or only PTP type frames and event
messages
●
Provides an option to take the snapshot based on the clock type: ordinary, boundary,
end-to-end, and peer-to-peer.
●
Provides an option to select the node to be a Master or Slave for ordinary and
boundary clock.
●
Identifies the PTP message type, version, and PTP payload in frames sent directly over
Ethernet and sends the status.
●
Provides an option to measure sub-second time in digital or binary format.
Clock types
The GMAC supports the following clock types defined in the IEEE 1588-2008 standard:
●
Ordinary Clock
●
Boundary Clock
●
End-to-End Transparent Clock
●
Peer-to-Peer Transparent Clock
Ordinary clock
The ordinary clock in a domain supports a single copy of the protocol and has a single PTP
state and a single physical port. It can be a grandmaster or a slave clock and supports the
following features:
●
Sends and receives PTP messages.
●
Maintains the data sets such as timestamp values.
Boundary clock
The boundary clock is similar to the ordinary except for the following features:
●
The clock data sets are common to all ports of the boundary clock
●
The local clock is common to all ports of the boundary clock.
End-to-end transparent clock
The end-to-end transparent clock supports the end-to-end delay measurement mechanism
between slave clocks and the master clock. The end-to-end transparent clock forwards all
messages like a normal bridge, router, or repeater. The residence time of a PTP packet is the
time taken by the PTP packet from the Ingress port to the Egress port. The residence time of
a SYNC packet inside the end-to-end transparent clock is updated in the correction field of
the associated Follow_Up PTP packet before it is transmitted. Similarly, the residence time
of a Delay_Req packet inside the end-to-end transparent clock is updated in the correction
field of the associated Delay_Resp PTP packet before it is transmitted.
Peer-to-peer transparent clock
The peer-to-peer transparent clock differs from the end-to-end transparent clock in the way it
corrects and handles the PTP timing messages. In all other aspects, it is identical to the
end-to-end transparent clock.
Reference timing source
The MAC supports the following reference timing source features
●
48 bit seconds field
●
Fixed Pulse-Per-Second Output
48 bit seconds field
The MAC supports 80-bit timestamp with the following fields:
●
UInteger48 secondsField
The seconds field is the integer portion of the timestamp in units of seconds and is 48 bits wide.
●
UInteger32 nanosecondsField
The nanoseconds field is the fractional portion of the timestamp in units of
nanoseconds. The nanoseconds field supports the following two modes:
–
Digital rollover mode: the maximum value in the nanoseconds field is
0x3B9A_C9FF, that is, (10^9 - 1) nanoseconds.
–
Binary rollover mode: the nanoseconds field rolls over and
increments the seconds field after the value 0x7FFF_FFFF.
You can set these modes by using Bit 9 (TSCTRLSSR) of the Timestamp Control Register.
When the advanced timestamp feature is selected, the timestamp maintained in the MAC is
still 64 bits wide.
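Software that reads the sub-second value has to know which rollover mode is active. The sketch below
assumes that in binary rollover mode one count represents 1/2^31 of a second, which follows from the
rollover after 0x7FFF_FFFF; the function name and the digital_rollover parameter are hypothetical and do not
reflect the polarity of the TSCTRLSSR bit.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

/*
 * Convert a sub-second register value to nanoseconds.
 * digital_rollover != 0: the value is already in nanoseconds (0 to 10^9 - 1).
 * digital_rollover == 0: the value counts units of 1/2^31 s (assumption).
 */
static uint32_t subsec_to_ns(uint32_t subsec, int digital_rollover)
{
    if (digital_rollover)
        return subsec;
    return (uint32_t)(((uint64_t)(subsec & 0x7FFFFFFFu) * NSEC_PER_SEC) >> 31);
}
```

For example, 0x4000_0000 in binary rollover mode is half a second, that is, 500 000 000 ns.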
Fixed pulse-per-second output
The GMAC supports the pulse-per-second (PPS) output that is given to indicate 1 second
interval (default). The frequency of the PPS output can be changed by setting Bits[3:0],
PPSCTRL in PPS Control Register.
PPS start or stop time
The start time can be initially programmed in the Target Time registers.
The start or stop time should be programmed to a value ahead of the current system time to ensure proper
PPS signal output. If the application programs a start or stop time that has already elapsed,
then the MAC sets an error status bit indicating the programming error. If enabled, the MAC
also sets the Target Time Reached interrupt event. The application can cancel the start or
stop request only if the corresponding start or stop time has not elapsed. If the time has
elapsed, the cancel command has no effect.
PPS width and interval
The PPS width and interval are programmed in terms of number of the units of sub-second
increment value.
Transmit path functions
The structure of the descriptor changes when you enable the advanced timestamp feature.
The advanced timestamp feature is supported only through Alternate (Enhanced)
descriptors format. The descriptor is 32-bytes long (8 DWORDS) and the snapshot of the
timestamp is written in descriptor TDES6 and TDES7.
Receive path functions
When the advanced timestamp feature is selected, the MAC processes the received frames
to identify valid PTP frames. The DMA returns the timestamp to the software inside the
corresponding Transmit and Receive Descriptor. The advanced timestamp feature is
supported only with the 32-bytes long Alternate (Enhanced) descriptor. The extended
status, containing the timestamp message status and the IPC status, is written in descriptor
RDES4 and the snapshot of the timestamp is written in descriptors RDES6 and RDES7.
19.4.4
AV feature
The Audio Video (AV) feature enables transmission of time-sensitive traffic over bridged
local area networks (LANs). The GMAC supports the AV data transfer in 100 Mbps and
1000 Mbps modes only in full duplex mode. A single master interface is connected to two
DMA channels (channel 0, channel 1). A DMA arbiter helps in arbitration of all the paths
(transmit and receive) in channel 0 and channel 1. Each channel has a separate Control and
Status register (CSR) for managing the transmit and receive functions, descriptor handling,
and interrupt handling.
Transmit path functions
The transmit path of channel 0 supports strict priority algorithm and is used for best-effort
traffic. For a channel, the strict priority algorithm determines that a frame is available for
transmission if the channel contains one or more frames. When the threshold mode for MTL
Tx FIFO is enabled, the strict priority algorithm determines that a frame is available for
transmission if the channel contains a partial frame of size equal to the programmed
threshold limit.
The transmit path of channel 1 supports traffic management by using the credit-based
shaper algorithm. For a channel, the credit-based shaper algorithm determines that a frame
is available for transmission if the following conditions are true:
●
The channel contains one or more frames.
●
The credit for the channel is positive as per the algorithm.
You can disable the credit-based shaper algorithm for channel 1. When disabling the credit-based
shaper algorithm for a channel, the channel uses the default strict priority algorithm.
Each transmit DMA has a separate descriptor chain for fetching the transmit data. The
transmit channel that gets the access to the system bus depends on the DMA arbiter. The
transmit path has separate FIFOs (MTL layer) for each channel. The data fetched by the
DMA is put in the respective FIFO. The traffic management and scheduler unit (TMS)
controls which FIFO data is transmitted by the MAC.
If the credit-based shaper algorithm is enabled for channel 1, then the corresponding
channel is selected for transmission if the following conditions are true:
●
The frame is available in the channel and has a positive or zero credit.
●
The higher priority channel has no frame waiting in the FIFO.
If the credit-based shaper algorithm is disabled for channel 1, then the frame to be
transmitted from a channel is selected based on the following priority scheme: channel 0
at priority 0 (low) and channel 1 at priority 1 (high).
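The selection rules above can be summarized in a small software model. This is only a sketch of the
decision as described in this section, not a description of the TMS hardware; the structure and function
names are hypothetical, and with two channels the "higher priority channel" condition reduces to
channel 1 being checked first.

```c
#include <stdbool.h>
#include <stdint.h>

struct tx_chan {
    bool    frame_ready;   /* a frame is waiting in the channel FIFO */
    int32_t credit;        /* credit-based shaper credit (channel 1 only) */
};

/*
 * Returns the channel to transmit from (0 or 1), or -1 if no frame is
 * eligible. cbs_enabled selects the credit-based shaper for channel 1;
 * otherwise channel 1 simply has strict priority over channel 0.
 */
static int tms_select(const struct tx_chan *ch0, const struct tx_chan *ch1,
                      bool cbs_enabled)
{
    if (ch1->frame_ready && (!cbs_enabled || ch1->credit >= 0))
        return 1;
    if (ch0->frame_ready)
        return 0;
    return -1;
}
```

With the shaper enabled, a channel 1 frame with negative credit lets best-effort traffic on channel 0
through; with the shaper disabled, channel 1 always wins when it has a frame.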
Receive path functions
The receive paths of channels 0 and 1 are enabled by default. The AV packets can be of the
following two types:
●
AV data packets: The AV data packets are always tagged. The tagged AV data
packets are received based on the programmed priority value. You can program bits
[18:16], AVP, in register 462 (AV MAC Control Register) to specify the channel to which
an AV packet with a given priority must be sent.
●
AV control packets: The AV control packets can be either tagged or untagged. The
untagged AV control packets are received on Channel 0 by default. To receive these
packets on Channel 1, you can program bits [25:24], AVCH, of register 462 (AV MAC
Control Register) (offset 0x0738). Similar to the AV data packets, the tagged AV control
packets are received based on the programmed priority value.
19.4.5
Energy efficient ethernet
Energy Efficient Ethernet (EEE) is an optional operational mode that enables the IEEE
802.3 Media Access Control (MAC) sublayer along with a family of physical layers to operate
in the Low-Power Idle (LPI) mode. The EEE operational mode supports the IEEE 802.3
MAC operation at 100 Mbps, 1000 Mbps, and 10 Gbps.
The LPI mode allows power saving by switching off parts of the communication device
functionality when there is no data to be transmitted and received. The systems on both
sides of the link can disable some functionalities and save power during the periods of low
link utilization. The MAC controls whether the system should enter or exit the LPI mode and
communicates this to the PHY.
The EEE specifies the capabilities negotiation methods that the link partners can use to
determine whether EEE is supported and then select the set of parameters that are common to
both devices.
Note:
1
Even if the MAC supports multiple PHY interfaces, you should activate the EEE mode only
when the MAC is operating with GMII and MII interface.
2
According to the Energy Efficient Ethernet standard (802.3az), the LPI mode is supported
only in the full-duplex mode. Therefore, you should not enable the LPI mode when the MAC
Transmitter is configured for the half-duplex mode.
19.5
Programming
This section describes how to initialize the DMA/GMAC registers in the proper sequence.
●
Initializing DMA on page 311
●
Initializing GMAC on page 312
●
Performing normal receive and transmit operation on page 313
●
Stopping and starting transmission on page 313
●
GMII link transitions on page 313
●
IEEE 1588 time stamping on page 314
●
AV feature initialization steps on page 315
●
Energy efficient ethernet initialization steps on page 316
19.5.1
Initializing DMA
Perform the following steps to initialize the DMA.
1.
Provide a software reset to reset all GMAC internal registers and logic (Bus Mode
Register, bit 0).
2.
Wait for the completion of the reset process. Poll bit 0 of the Bus Mode Register, which
is only cleared after the reset operation is completed.
3.
Program the following fields to initialize the Bus Mode Register by setting the values in
Bus Mode Register:
a) Mixed Burst and AAL
b) Fixed burst or undefined burst
c) Burst length values and burst mode values
d) Descriptor Length (only valid if Ring Mode is used)
e) Tx and Rx DMA Arbitration scheme and two-level priority weight for the channel
4.
Create a proper descriptor chain for transmit and receive. In addition, ensure that the
DMA owns the receive descriptors by setting the bit 31 of the descriptor. When OSF
mode is used, at least two descriptors are required.
5.
Make sure that your software creates three or more different transmit or receive
descriptors in the chain before reusing any of the descriptors.
6.
Initialize receive and transmit descriptor list address with the base address of the
transmit and receive descriptor (Receive Descriptor List Address Register and Transmit
Descriptor List Address Register respectively).
7.
Program the following fields to initialize the mode of operation by setting the values in
DMA Operation Mode Register:
a) Receive and Transmit Store And Forward
b) Receive and Transmit Threshold Control (RTC and TTC)
c) Error Frame and undersized good frame forwarding enable
d) OSF Mode
8.
Clear the interrupt requests, by writing to those bits of the status register (interrupt bits
only) that are set. For example, writing 1 into bit 16, the normal interrupt summary,
clears this bit (Status Register).
9.
Enable the interrupts by programming the Interrupt Enable Register.
10. Repeat steps 3 through 9 for channel 1 dedicated to AV feature.
11. Program the CBS control register, idleSlope, sendSlope, hiCredit, and loCredit
registers of channel 1.
12. Start the Receive and Transmit DMA by setting SR (bit 1) and ST (bit 13) of the control
registers for all channels.
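Steps 1, 2 and 12 of the sequence above can be sketched in C as follows. The register addresses are
deliberately left to the caller (they are documented in RM0089), the function and macro names are
hypothetical, and the poll limit is an arbitrary safeguard chosen for this example.

```c
#include <stdint.h>

#define DMA_BUS_MODE_SWR (1u << 0)    /* Bus Mode Register, software reset */
#define DMA_OP_MODE_SR   (1u << 1)    /* start/stop receive */
#define DMA_OP_MODE_ST   (1u << 13)   /* start/stop transmission */

/*
 * Steps 1 and 2: request a software reset, then poll bit 0 until the
 * hardware clears it. Returns 0 on success, -1 if the bit is still set
 * after max_polls reads.
 */
static int gmac_dma_soft_reset(volatile uint32_t *bus_mode, unsigned max_polls)
{
    *bus_mode |= DMA_BUS_MODE_SWR;
    while (max_polls--) {
        if ((*bus_mode & DMA_BUS_MODE_SWR) == 0)
            return 0;
    }
    return -1;
}

/* Step 12: start the receive and transmit DMA of one channel. */
static void gmac_dma_start(volatile uint32_t *op_mode)
{
    *op_mode |= DMA_OP_MODE_SR | DMA_OP_MODE_ST;
}
```

A driver would call gmac_dma_soft_reset() first, program the Bus Mode, descriptor list and Operation Mode
registers as in steps 3 to 11, and call gmac_dma_start() last for each channel.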
19.5.2
Initializing GMAC
The following GMAC initialization operations can be performed after DMA initialization. If the
MAC initialization is done before the DMA is set up, enable the MAC receiver (last step
below) only after the DMA is active. Otherwise, received frames fill the Rx FIFO and cause it
to overflow. Note that step 1 differs depending on whether the RTBI PHY interface is
enabled.
1.
If the RTBI PHY interface is enabled:
a)
Program the GMAC AN Control Register to enable Auto-negotiation ANE (bit-12).
Setting ELE (bit-14) of this register enables the PHY to loop back the transmit data
and RAN (bit-9) can be set to restart Auto-negotiation.
b)
Check the GMAC AN Status Register for completion of the Auto-negotiation
process. ANC (bit-5) should be set. The link status (bit-2), when set, indicates that
the link is up.
If the RTBI PHY interface is not enabled:
a)
Program the GMII Address Register for controlling the management
cycles for the external PHY. For example, Physical Layer Address PA (bits 15-11). In
addition, set bit 0 (GMII Busy) for writing into PHY and reading from PHY.
b)
Read the 16-bit data of GMII Data Register from the PHY for link up, speed of
operation, and mode of operation, by specifying the appropriate address value in
bits 15-11 of GMII Address Register.
2.
Provide the MAC address registers (MAC Address0 High Register and MAC Address0
Low Register). Additional MAC addresses must be programmed appropriately.
3.
Program the Hash Table High and Hash Table Low Registers.
4.
Program the following fields to set the appropriate filters for the incoming frames in
MAC Frame Filter:
a)
Receive All
b)
Promiscuous mode
c)
Hash or Perfect Filter
d)
Unicast, multicast, broadcast, and control frames filter settings
5.
Program the following fields for proper flow control in Flow Control Register:
a)
Pause time and other pause frame control bits
b)
Receive and Transmit Flow control bits
c)
Flow Control Busy/Backpressure Activate
6.
Program the Interrupt Mask register bits, as required, and if applicable, for your
configuration.
7.
Program the appropriate fields in the MAC Configuration Register, for example, the inter-frame
gap during transmission and jabber disable. Based on the Auto-negotiation result, you
can set the Duplex mode (bit 11) or port select (bit 15).
8.
Set the bits Transmit enable (TE bit-3) and Receive Enable (RE bit-2) in MAC
Configuration Register.
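Steps 7 and 8 amount to a read-modify-write of the MAC Configuration Register. The sketch below works on
a plain value so that it stays independent of the register address; the macro and function names are
hypothetical, and the meaning of each port select value is not assumed here (see RM0089).

```c
#include <stdint.h>

#define MAC_CFG_RE (1u << 2)    /* Receiver Enable */
#define MAC_CFG_TE (1u << 3)    /* Transmitter Enable */
#define MAC_CFG_DM (1u << 11)   /* Duplex mode */
#define MAC_CFG_PS (1u << 15)   /* Port select */

/* Step 7: apply the link settings obtained from auto-negotiation. */
static uint32_t mac_cfg_set_link(uint32_t cfg, int duplex_bit, int port_select_bit)
{
    cfg &= ~(MAC_CFG_DM | MAC_CFG_PS);
    if (duplex_bit)
        cfg |= MAC_CFG_DM;
    if (port_select_bit)
        cfg |= MAC_CFG_PS;
    return cfg;
}

/* Step 8: enable the transmitter and the receiver. */
static uint32_t mac_cfg_enable(uint32_t cfg)
{
    return cfg | MAC_CFG_TE | MAC_CFG_RE;
}
```

Keeping the enable step separate makes it easy to defer it until the DMA is active, as recommended at the
start of this section.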
19.5.3
Performing normal receive and transmit operation
For normal operation, perform the following steps:
1.
For normal transmit and receive interrupts, read the interrupt status. Then, poll the
descriptors, reading the status of the descriptor owned by the Host (either transmit or
receive).
2.
Set appropriate values for the descriptors, ensuring that transmit and receive
descriptors are owned by the DMA to resume the transmission and reception of data.
3.
If the descriptors are not owned by the DMA (or no descriptor is available), the DMA
goes into SUSPEND state. The transmission or reception can be resumed by freeing
the descriptors and issuing a poll demand by writing 0 into the Tx/Rx poll demand
register (Transmit Poll Demand Register and Receive Poll Demand Register).
4.
The values of the current host transmitter or receiver descriptor address pointer can be
read for the debug process (Current Host Transmit Descriptor Register and Current
Host Receive Descriptor Register).
5.
The values of the current host transmit buffer address pointer and receive buffer
address pointer can be read for the debug process (Current Host Transmit Buffer
Address Register and Current Host Receive Buffer Address Register).
19.5.4
Stopping and starting transmission
Perform the following steps to pause the transmission for some time:
1. Disable the Transmit DMA (if applicable), by clearing bit 13 (ST: Start/Stop
Transmission Command) of Operation Mode Register.
2. Wait for any previous frame transmissions to complete. You can check this by reading
the appropriate bits of Debug Register.
3. Disable the MAC transmitter and MAC receiver by clearing bit 3 (TE: Transmitter
Enable) and bit 2 (RE: Receiver Enable) in MAC Configuration Register.
4. Disable the Receive DMA (if applicable), after making sure that the data in the Rx FIFO
is transferred to the system memory (by reading Debug Register).
5. Make sure that both Tx FIFO and Rx FIFO are empty.
6. To restart the operation, first start the DMAs, and then enable the MAC Transmitter and
Receiver.
19.5.5 GMII link transitions
Transmit and receive clocks are running when the link is down
Perform the following steps when the link is down but the Transmit and Receive clocks are
running:
1. Disable the Transmit DMA (if applicable), by clearing bit 13 (ST) of Operation Mode
Register.
2. Disable the MAC receiver by clearing bit 2 (RE) of MAC Configuration Register.
3. Wait for any previous frame transmissions to complete from the Tx FIFO. You can do
this by reading the appropriate bits of Debug Register.
-or-
Flush the Tx FIFO for faster empty operation.
4. Disable the MAC transmitter by clearing bit 3 (TE) in MAC Configuration Register.
5. After the link is up, read the PHY registers to determine the latest configuration, and
program the MAC registers accordingly.
6. Restart the operation by starting the Tx DMA, and then enabling the MAC Transmitter
and Receiver.
You do not need to disable the Rx DMA. Because the receiver is disabled, the Rx FIFO does
not receive any data.
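The link-down sequence above can be sketched as follows. This is only an illustration under stated assumptions: `struct gmac_regs` is a hypothetical in-memory image of the two registers (real code would use volatile accesses at the addresses given in RM0089), and the Debug Register check of step 3 is represented by a caller-supplied predicate because its bit layout is not described here. The two stub predicates exist only for host-side experiments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DMA_OPMODE_ST (1u << 13) /* Start/Stop Transmission Command */
#define GMAC_CFG_RE   (1u << 2)  /* Receiver Enable */
#define GMAC_CFG_TE   (1u << 3)  /* Transmitter Enable */

/* Hypothetical register image, not the real register map. */
struct gmac_regs {
    uint32_t mac_config; /* MAC Configuration Register */
    uint32_t dma_opmode; /* Operation Mode Register */
};

/* Stands in for "read the appropriate bits of Debug Register". */
typedef bool (*gmac_tx_idle_fn)(const struct gmac_regs *regs);

/* Steps 1 to 4. Returns false if the Tx path did not drain within the
 * allowed number of polls; TE is left set in that case. */
bool gmac_link_down_quiesce(struct gmac_regs *regs, gmac_tx_idle_fn tx_idle,
                            unsigned max_polls)
{
    assert(regs != 0 && tx_idle != 0);
    regs->dma_opmode &= ~DMA_OPMODE_ST; /* step 1: stop the Tx DMA */
    regs->mac_config &= ~GMAC_CFG_RE;   /* step 2: disable the receiver */
    while (!tx_idle(regs)) {            /* step 3: wait for the Tx FIFO */
        if (max_polls-- == 0)
            return false;
    }
    regs->mac_config &= ~GMAC_CFG_TE;   /* step 4: disable the transmitter */
    return true;
}

/* Steps 5 and 6. new_mac_config is the value derived from the PHY registers. */
void gmac_link_up_restart(struct gmac_regs *regs, uint32_t new_mac_config)
{
    assert(regs != 0);
    regs->mac_config = new_mac_config & ~(GMAC_CFG_TE | GMAC_CFG_RE);
    regs->dma_opmode |= DMA_OPMODE_ST;             /* Tx DMA first */
    regs->mac_config |= GMAC_CFG_TE | GMAC_CFG_RE; /* then MAC Tx and Rx */
}

/* Stubs for host-side experiments only. */
bool gmac_stub_tx_idle(const struct gmac_regs *regs) { (void)regs; return true; }
bool gmac_stub_tx_busy(const struct gmac_regs *regs) { (void)regs; return false; }
```

The bounded poll is a design choice of this sketch, not a requirement stated in the manual; it avoids hanging the caller if the Tx FIFO never drains.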
Transmit and receive clocks are stopped when the link is down
1. Wait until the link is up and the Transmit and Receive clocks are active.
When the Transmit and Receive clocks are stopped, disabling the transmit or
receive operations does not have any effect. Therefore, the software must wait until the
link is up again.
2. Disable the Transmit DMA (if applicable), by clearing bit 13 (ST) of the Operation Mode
Register.
3. Disable the MAC receiver by clearing bit 2 (RE) of MAC Configuration Register.
4. Wait for any previous frame transmissions to complete from the Tx FIFO. You can do
this by reading the appropriate bits of Debug Register.
-or-
Flush the Tx FIFO for faster empty operation.
5. Disable the MAC transmitter by clearing bit 3 (TE) in MAC Configuration Register.
6. After the link is up, read the PHY registers to determine the latest configuration, and
program the MAC registers accordingly.
7. Restart the operation by starting the Tx DMA, and then enabling the MAC Transmitter
and Receiver.
19.5.6 IEEE 1588 time stamping
Initializing system time generation
You can enable the timestamp feature by setting bit 0 of the Timestamp Control Register.
However, the timestamp counter must be initialized after this bit is set.
Perform the following steps during GMAC core initialization:
1. Mask the Timestamp Trigger interrupt by setting bit 9 of Interrupt Mask Register.
2. Set bit 0 in Timestamp Control Register to enable time stamping.
3. Program the Sub-Second Increment Register based on the PTP clock frequency.
4. If you are using the Fine Correction approach, program the Timestamp Addend
Register and set bit 5 of Timestamp Control Register.
5. Poll the Timestamp Control Register until bit 5 is cleared.
6. Program bit 1 of the Timestamp Control Register to select the Fine Update method (if
required).
7. Program the System Time - Seconds Update Register and System Time - Nanoseconds
Update Register with the appropriate time value.
8. Set bit 2 in Timestamp Control Register.
The Timestamp counter starts operation as soon as it is initialized with the value written
in the Timestamp Update registers.
9. Enable the MAC receiver and transmitter for proper time stamping.
Note: If timestamp operation is disabled by clearing bit 0 of Timestamp Control Register, you need
to repeat all these steps to restart the timestamp operation.
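The initialization steps above can be sketched as follows, using only the bit positions quoted in the steps. `struct gmac_ts_regs`, `struct gmac_ts_config` and all function names are hypothetical; the structure is an in-memory image rather than the real register map, and the poll hook models whatever happens between two reads of the Timestamp Control Register (on hardware it could be a short delay).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GMAC_INT_MASK_TS   (1u << 9) /* Timestamp Trigger interrupt mask */
#define TS_CTRL_ENABLE     (1u << 0) /* enable time stamping */
#define TS_CTRL_FINE       (1u << 1) /* select the Fine Update method */
#define TS_CTRL_INIT       (1u << 2) /* initialize the system time */
#define TS_CTRL_ADDEND_UPD (1u << 5) /* load the Timestamp Addend Register */

/* Hypothetical register image, not the real register map. */
struct gmac_ts_regs {
    uint32_t int_mask;
    uint32_t ts_control;
    uint32_t sub_sec_incr;
    uint32_t ts_addend;
    uint32_t sec_update;
    uint32_t nsec_update;
};

/* Hypothetical configuration; the values depend on the PTP clock. */
struct gmac_ts_config {
    uint32_t sub_sec_incr;
    bool     fine;    /* use the Fine Correction approach */
    uint32_t addend;  /* only used when fine is true */
    uint32_t seconds;
    uint32_t nanoseconds;
};

typedef void (*gmac_poll_hook)(struct gmac_ts_regs *regs);

/* Steps 1 to 8. Step 9 (enabling the MAC receiver and transmitter) is left
 * to the caller. Returns false if bit 5 never clears. */
bool gmac_ts_init(struct gmac_ts_regs *r, const struct gmac_ts_config *c,
                  gmac_poll_hook hook, unsigned max_polls)
{
    assert(r != 0 && c != 0 && hook != 0);
    r->int_mask |= GMAC_INT_MASK_TS;                 /* step 1 */
    r->ts_control |= TS_CTRL_ENABLE;                 /* step 2 */
    r->sub_sec_incr = c->sub_sec_incr;               /* step 3 */
    if (c->fine) {
        r->ts_addend = c->addend;                    /* step 4 */
        r->ts_control |= TS_CTRL_ADDEND_UPD;
        while (r->ts_control & TS_CTRL_ADDEND_UPD) { /* step 5 */
            if (max_polls-- == 0)
                return false;
            hook(r);
        }
        r->ts_control |= TS_CTRL_FINE;               /* step 6 */
    }
    r->sec_update = c->seconds;                      /* step 7 */
    r->nsec_update = c->nanoseconds;
    r->ts_control |= TS_CTRL_INIT;                   /* step 8 */
    return true;
}

/* Host-side stub that models the hardware clearing bit 5. */
void gmac_ts_stub_hw(struct gmac_ts_regs *r)
{
    r->ts_control &= ~TS_CTRL_ADDEND_UPD;
}
```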
System time correction
Use the following steps to synchronize or update the system time in one process (coarse
correction method):
1. Set the offset (positive or negative) in the Timestamp Update registers.
2. Set bit 3 (TSUPDT) of the Timestamp Control Register.
3. The value in the Timestamp Update registers is added to or subtracted from the system
time when the TSUPDT bit is cleared.
Use the following steps to synchronize or update the system time to reduce system-time
jitter (fine correction method):
1. Calculate the rate by which you want to make the system time increments slower or
faster.
2. Update the Timestamp Addend Register with the new value and set bit 5 of the
Timestamp Control Register.
3. Wait for the time for which you want the new value of the Addend register to be active.
You can do this by using the Timestamp Trigger interrupt, which is raised when the system
time reaches the target value (see steps 4 to 7).
4. Program the required target time in Target Time Seconds Register and Target Time
Nanoseconds Register.
5. Unmask the Timestamp interrupt by clearing bit 9 of Interrupt Mask Register.
6. Set bit 4 in Timestamp Control Register.
7. When this trigger causes an interrupt, read the Interrupt Status Register.
8. Reprogram the Timestamp Addend Register with the old value and set bit 5 again.
19.5.7 AV feature initialization steps
Enabling slot number checking
You can use the slot number check feature to specify the intervals at which the channel 1
DMA fetches the frames from the AXI system bus. This feature is useful for a uniform and
periodic transfer of the AV traffic from the host memory. The feature is available only when
you enable time stamping and program the Sub-Second Increment Register. Perform the
following steps to enable the slot number checking:
Note: Perform these steps after Step 11 and before Step 12 of Section 19.5.1: Initializing DMA.
1. Enable time stamping by following the steps described in the section Initializing system
time generation.
2. Make sure that the SLOTNUM field (bits 6:3) of Transmit Descriptor Word 0 (TDES0)
contains a valid slot number. You can read the current reference slot number from the
Slot Function Control and Status register.
3. Set bit 0 (ESC: Enable Slot Comparison) of the Slot Function Control and Status
register of a channel to enable the slot number checking.
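The two fields used in steps 2 and 3 can be manipulated as in the following minimal sketch. The macro and function names are hypothetical; only the bit positions (SLOTNUM in bits 6:3 of TDES0, ESC in bit 0) come from the steps above, and the helpers operate on plain values rather than on descriptors or registers in memory.

```c
#include <assert.h>
#include <stdint.h>

#define TDES0_SLOTNUM_SHIFT 3
#define TDES0_SLOTNUM_MASK  (0xFu << TDES0_SLOTNUM_SHIFT) /* bits 6:3 */
#define SLOT_FN_ESC         (1u << 0) /* Enable Slot Comparison */

/* Store a slot number (0 to 15) in a TDES0 image. */
uint32_t tdes0_set_slotnum(uint32_t tdes0, unsigned slot)
{
    assert(slot <= 0xFu);
    return (tdes0 & ~TDES0_SLOTNUM_MASK) |
           ((uint32_t)slot << TDES0_SLOTNUM_SHIFT);
}

/* Read the slot number back from a TDES0 image. */
unsigned tdes0_get_slotnum(uint32_t tdes0)
{
    return (unsigned)((tdes0 & TDES0_SLOTNUM_MASK) >> TDES0_SLOTNUM_SHIFT);
}

/* Step 3: set ESC in a Slot Function Control and Status register image. */
uint32_t slot_fn_enable_check(uint32_t reg)
{
    return reg | SLOT_FN_ESC;
}
```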
Enabling average bits per slot reporting
The CBS Status register of the additional AV channels (channel 1 and channel 2) provides
information about the average bits that are transmitted in a slot. The software can
asynchronously read this register to retrieve information about the average bits transmitted
per slot. Perform the following steps to enable average bits per slot reporting:
1. Enable time stamping by following the steps described in the section Initializing system
time generation.
2. Program bits 6:4 (SLC: Slot Count) of the CBS Control register of a channel with the
number of slots over which the average transmitted bits per slot need to be computed.
3. Set bit 17 (ABPSSIE: Average Bits Per Slot Interrupt Enable) of the CBS Control
register of a channel to generate the average bits per slot interrupt.
Note: The frequency of this interrupt depends on the value programmed in Step 2. For example,
when you program value 0 in the SLC field, the interrupt is generated every 125
microseconds. When it is not required, you can disable this interrupt to avoid interrupt flooding.
4. Read bits 16:0 (ABS: Average Bits per Slot) from the CBS Status register of a
channel on each interrupt.
Note: The software can read the ABS bits in polling mode even if the ABPSSIE bit is not enabled.
When high, bit 17 (ABSU: ABS Updated) of the CBS Status register indicates that a new
value is updated in the ABS field.
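As a sketch of steps 2 to 4, the helpers below encode the CBS Control fields and decode the CBS Status fields from the bit positions given above. All names are hypothetical and the functions work on register images only; how a given SLC code maps to a number of slots is not restated here and must be taken from RM0089.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CBS_CTRL_SLC_SHIFT  4
#define CBS_CTRL_SLC_MASK   (0x7u << CBS_CTRL_SLC_SHIFT) /* bits 6:4 */
#define CBS_CTRL_ABPSSIE    (1u << 17) /* Average Bits Per Slot Interrupt Enable */
#define CBS_STATUS_ABS_MASK 0x1FFFFu   /* bits 16:0 */
#define CBS_STATUS_ABSU     (1u << 17) /* ABS Updated */

/* Steps 2 and 3: program the SLC code and the interrupt enable. */
uint32_t cbs_ctrl_configure(uint32_t ctrl, unsigned slc_code, bool irq_enable)
{
    assert(slc_code <= 0x7u);
    ctrl &= ~(CBS_CTRL_SLC_MASK | CBS_CTRL_ABPSSIE);
    ctrl |= (uint32_t)slc_code << CBS_CTRL_SLC_SHIFT;
    if (irq_enable)
        ctrl |= CBS_CTRL_ABPSSIE;
    return ctrl;
}

/* Step 4 (or polling mode): returns true and stores the ABS field only when
 * ABSU reports a freshly updated value. */
bool cbs_status_read_abs(uint32_t status, uint32_t *abs_out)
{
    assert(abs_out != 0);
    if ((status & CBS_STATUS_ABSU) == 0)
        return false;
    *abs_out = status & CBS_STATUS_ABS_MASK;
    return true;
}
```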
19.5.8 Energy efficient ethernet initialization steps
You can configure the Energy Efficient Ethernet (EEE) feature in coreConsultant. Perform
the following steps during GMAC core initialization:
1. Read the PHY register through the MDIO interface, check whether the remote end has the
EEE capability, and then negotiate the timer values.
2. Program the PHY registers through the MDIO interface (including the
RX_CLK_stoppable bit, which indicates to the PHY whether to stop the RX clock in LPI
mode).
3. Program bits [25:16] (LIT: LPI LS TIMER) and bits [15:0] (TWT: LPI TW TIMER) in
LPI Timers Control Register.
4. Read the link status of the PHY chip by using the MDIO interface and update bit 17
(PLS) of Register 12 (LPI Control and Status Register) accordingly. This update should
be done whenever the link status in the PHY chip changes.
5. Set bit 16 (LPIEN: LPI Enable) of LPI Control and Status Register to make the MAC
enter the LPI state.
The MAC enters the LPI mode after completing the transmission in progress and sets
bit 0 (TLPIEN: Transmit LPI Entry).
Note: If you want to make the MAC enter the LPI state only after it completes the transmission of
all queued frames in the TxFIFO, you should stop the DMA before setting the LPIEN bit. For
information about how to stop the DMA, see steps 1 and 2 in Section 19.5.4: Stopping and
starting transmission.
If you want to switch off the CSR clock, GMII transmit clock, or power to the rest of the
system during the LPI state, you should wait for the TLPIEN interrupt of LPI Control and
Status Register to be generated. Restore the clocks before performing Step 6 when you
want to come out of the LPI state.
6. Clear bit 16 (LPIEN: LPI Enable) of LPI Control and Status Register to bring the
MAC out of the LPI state.
The MAC waits for the time programmed in bits [15:0] (TWT: LPI TW TIMER) before
setting the TLPIEX interrupt status bit and resuming the transmission.
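The register values used in steps 3 to 6 can be composed as in the sketch below. The names are hypothetical and the helpers work on register images. The TW timer, PLS and LPIEN positions are those quoted in the steps; the LS timer field is taken here to occupy bits 25:16, which is an assumption to be checked against RM0089.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LPI_CTRL_LPIEN     (1u << 16) /* LPI Enable */
#define LPI_CTRL_PLS       (1u << 17) /* PHY link status */
#define LPI_TIMER_TW_MAX   0xFFFFu    /* TW timer, bits 15:0 */
#define LPI_TIMER_LS_SHIFT 16
#define LPI_TIMER_LS_MAX   0x3FFu     /* LS timer, bits 25:16 (assumed width) */

/* Step 3: build the LPI Timers Control Register value. */
uint32_t lpi_timers_value(unsigned ls_timer, unsigned tw_timer)
{
    assert(ls_timer <= LPI_TIMER_LS_MAX && tw_timer <= LPI_TIMER_TW_MAX);
    return ((uint32_t)ls_timer << LPI_TIMER_LS_SHIFT) | (uint32_t)tw_timer;
}

/* Step 4: mirror the PHY link status into the PLS bit. */
uint32_t lpi_ctrl_set_link(uint32_t ctrl, bool link_up)
{
    return link_up ? (ctrl | LPI_CTRL_PLS) : (ctrl & ~LPI_CTRL_PLS);
}

/* Steps 5 and 6: set LPIEN to request LPI entry, clear it to leave LPI. */
uint32_t lpi_ctrl_request(uint32_t ctrl, bool enter_lpi)
{
    return enter_lpi ? (ctrl | LPI_CTRL_LPIEN) : (ctrl & ~LPI_CTRL_LPIEN);
}
```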
20 USB 2.0 host controllers (UHC)
This chapter focuses on UHC functionality and operation.
For the UHC feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
20.1 Overview
The SPEAr1340 device integrates two USB Host interfaces, identified as UHC0 and UHC1.
Each interface provides a high-speed Host controller (EHCI0, EHCI1) and a
full-speed/low-speed Host controller (OHCI0, OHCI1).
Figure 98. UHC block diagram
[Block diagram not reproduced. Blocks shown: AMBA AHB, AHB BIU, USB 2.0 EHCI controller
(EHCI operation registers, list processor, SOF generator, packet buffer, root hub), two
USB 1.1 OHCI controllers, and the UTMI+ PHY with Port0 and Port1 to the external pads.]
20.2 Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
20.3 Clocks
See Chapter 5: Reset and clock generator (RCG).
20.4 Functional description
AHB bus interface unit (BIU)
USB 2.0 Host access to the AHB bus is granted by the AHB bus interface unit (BIU), which
consists of a master module and a slave module.
The AHB BIU slave module acts as a slave on the AHB and responds to all EHCI/OHCI
operational register accesses from an AHB master. In particular, this module allows
read/write access to its operational registers through the AHB bus.
Note:
There is only a single AHB slave port in AHB BIU slave module for both EHCI and OHCI
host controller registers access.
The AHB BIU master module, acting as a master on the AHB, receives requests from the list
processor block within the EHCI Host controller, and transfers data with system memory
through the AHB bus. The AHB BIU Master supports 8-, 16-, and 32-bit data transfers, and
32-bit address transfers.
Enhanced Host controller interface (EHCI)
The EHCI Host controller, compliant with the EHCI specification (version 1.0), is embedded
within the UHC to support the 480 Mbps high-speed (HS) transactions of USB 2.0 HS devices.
The main EHCI blocks are:
● List processor
The list processor is the main block of the EHCI Host controller. The list processor is
implemented with multiple state machines to perform the list service flow, which is set
up by the host controller driver (HCD) according to the priority set in the operational
registers.
In addition, the list processor consists of a controller that interfaces with all the other
EHCI Host controller blocks, such as the AHB BIU (master module), the packet buffer,
the EHCI operational registers, the SOF generators and the root hub.
● Operational registers
This block stores the implemented EHCI capability and operational registers as defined
in the USB EHCI specification. In addition, some implementation-specific registers are
located in this block, to allow programming of parameters such as the packet buffer
depth, break memory transfer, and frame length.
The operational registers block interfaces with the AHB BIU (slave module), the list
processor, and the root hub.
● Start-of-frame (SOF) generator
The SOF generator block implements the counter which generates the start-of-frame
packets to supply micro-SOFs for each microframe. The SOF counter runs in the PHY
clock domain.
Microframe duration is derived from the EHCI frame length adjustment (FLADJ)
register value. This ensures that the Host microframe duration and per-port microframe
duration remain the same.
This block interfaces with the list processor only.
● Packet buffer
The packet buffer (PBUF) block provides storage and control for IN/OUT data
transactions, with a configured size of 1024 bytes (256 x 32 bits = 1024 bytes).
The PBUF block interfaces with both the list processor and the root hub. Specifically,
during an OUT transaction, the list processor fetches data from the system memory and
writes it to the PBUF. During an IN transaction, the data is written to the PBUF by the
root hub.
The required packet buffer size depends on the system latency and bandwidth allocated to
the EHCI Host controller. For example, if the PBUF size is programmed to 64 bytes, a
1024-byte IN transfer requires 1024/64 = 16 data transfers on the AHB bus. If the
system cannot guarantee the EHCI access to the AHB bus for these 16 transfers without
breaks, a buffer overrun occurs. In this case, to avoid buffer overrun or underrun, the
PBUF size can be set to 1024 bytes.
● Root hub
The root hub (RH) block interfaces between the list processor and the USB PHY. It
propagates reset and resume signals to downstream ports, and handles port
connections and disconnections.
The RH operates both on the local PHY clock (a free-running 30/60 MHz clock) and on
the clock source from each physical port (30 MHz with a 16-bit interface).
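The packet buffer sizing example above (1024/64 = 16 transfers) reduces to a simple division. The sketch below reproduces it; the function name is hypothetical, and rounding up for transfer sizes that are not a multiple of the PBUF size is an assumption of this sketch rather than a statement from the EHCI description.

```c
#include <assert.h>

/* Number of AHB data transfers needed to drain one USB transfer through a
 * packet buffer of the given size. Partial buffers count as one transfer
 * (assumption of this sketch). */
unsigned pbuf_ahb_transfers(unsigned transfer_bytes, unsigned pbuf_bytes)
{
    assert(pbuf_bytes != 0);
    return (transfer_bytes + pbuf_bytes - 1u) / pbuf_bytes;
}
```

With a 64-byte PBUF, a 1024-byte IN transfer needs 16 AHB transfers that must all be served in time; with a 1024-byte PBUF the whole transfer fits in the buffer.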
Open Host controller interface (OHCI)
The OHCI Host controller, compliant with the OHCI specification (version 1.0a), is integrated
in the UHC to support the 12 Mbps full-speed (FS) and the 1.5 Mbps low-speed (LS)
operation of USB 1.1. An FS/LS device connected to port0 is managed by OHCI0, and one
connected to port1 is managed by OHCI1.
The USB open Host controller is designed to be independent of the bus interface unit (BIU).
The host bus is assumed to be at least 32 bits wide, with adequate performance to support
the data rate of the particular implementation (100 Mbit/s or higher, plus overhead for DMA
structures) and with bounded latency so that the FIFOs can have a reasonable size. The
main blocks of the OHCI block are described below.
Figure 99. USB open Host controller block diagram
[Block diagram not reproduced. Blocks shown: HCI slave block, OHCI registers, HCI master
block, list processor block, ED and TD registers, USB state control, 64 x 8 FIFO, root hub
and host SIE (HSIE S/M, DPLL, clock MUX 12/1.5), root hub config block, and the port state
machines (Port S/M) with their USB transceivers (XVR). Signals labeled include
APB_SADR(6), APB_SData(32), APP_MData(32), HCI_Data(32), HCM_ADR/Data(32),
RCFG_RegData(32), ED/TD_Data(32), ED/TD_Status(32), RH_Data(8), HC_Data(8), DF_Data(8),
HCF_Data(8), FIFO_Data(8), Addr(6), TxEnL, TxDpls, TxDmns, RcvData, RcvDpls and RcvDmns.]
● HCI master block
The HCI master block is the interface between the HCI master interface logic block and
the HCI bus. It converts all the cycles initiated by the different blocks of the list
processor through the HCI master interface logic block into HCI bus cycles, according to
the protocol defined for the HCI bus. In addition, it implements a state machine to read
from and write to the DFIFO. When it is transferring the data returned by an endpoint, it
reads the data from the DFIFO, merges it into a DWORD, and then sends it to the application
internal FIFO. Similarly, when reading the endpoint data from the system memory, after
reading every DWORD from the application FIFO it splits the DWORD into 4 individual bytes
and then sends them to the DFIFO. It also implements byte-alignment logic: when a write
cycle is initiated by the FML block at an unaligned address (not on a DWORD boundary), it
ties the lower 2 bits of the address to 0, so that the application always writes at a
DWORD boundary, and manipulates the byte enables accordingly.
● HCI slave block
The HCI slave block is the slave on the HCI bus. It is basically an interface between the
OHCI operational registers internal to the Host controller and the application. It updates
the registers on writes and provides the register data on reads. All the slave accesses
should be DWORD aligned. Therefore, byte enables are not used in slave accesses.
● List processor block
The list processor block acts as the main controller of the entire Host controller. It has
multiple state machines to implement List Service Flow, List Priority, USB-States, ED, TD
Service, StatusWriteBack, TD Retirement, and so on, per the OHCI specification. In
addition, this block implements a controller that interfaces with HCI_master and hsie,
helping them in the data transfer from system memory to USB, and USB to system
memory.
The following submodules are included:
– USB states
– List service flow
– ED-TD block
– HCI master interface logic
– Data read write logic
● RootHub and HSIE blocks
Because implementation varies, most of the functionality of the RootHub is
implemented in the port configuration block. The logic in the RootHub and HSIE block is
common to any user configuration. It acts as a wrapper around the HSIE and interfaces with
the Host controller list processor, FIFO and OHCI registers. This block also implements the
control logic to synchronize the interface between HSIE and port S/M.
This block implements the following submodules:
– Reset_Resume
– DPLL
– HSIE
Digital PLL block (DPLL)
The function of the DPLL block is to extract the clock and data information from the
USB data received from the different transceivers. The digital PLL runs on a 48 MHz
user-provided clock to extract the clock information from the USB for both full-speed
and low-speed data. The two signals D+ and D- of the USB lines are passed through a
differential receiver (external to the UHOSTC controller), and NRZI-formatted data is
obtained at the output of the differential receiver. This output is then used by the
digital PLL to extract clock information. The PLL block also has SE0 detect logic to
detect the single-ended zero (SE0) in the data stream. The circuit in this module extracts
the clock from either full-speed data or low-speed data, as indicated by the SIE_Switch
HCLK input from the SIETx state machine.
HSIE functionality
The function of the Host serial interface engine (HSIE) is to receive and transmit
the USB data over the D+ and D- lines in accordance with the USB protocol. During the
reception of USB data, the D+ and D- signals are passed through the differential
receiver (which is external to the UHOSTC controller) to get a single-ended bit stream
that is passed through the PLL block to extract the clock and data information. The
clock and data are passed to the SIE block to identify the sync pattern and for NRZI-to-NRZ
conversion. This NRZ data is then passed through the bit stripper, which strips off
the inserted (stuffed) zeros. The resulting data stream is first passed through the PID
decoder and checker to identify the different PIDs. Depending on the type of PID, the
HSIE block handles the protocol accordingly.
● RootHub port configuration
The port configuration block implements part of the RootHub logic. This block is
separated from the main RootHub block to distinguish the logic that varies with design
requirements. In short, this block implements part of the OHCI registers that are
specific to RootHub and a state machine for every DownStreamPort to control the port
functional states.
This block has the following submodules:
– RootHub port registers
– Port S/M
– Port receive
– Port resume
– Port MUX
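The DWORD merge/split and byte-alignment behavior described for the HCI master block can be illustrated in software as follows. This is a behavioral sketch only, not the hardware implementation: the function names are hypothetical, and the byte-lane order (byte 0 in bits 7:0) is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Split one DWORD into 4 bytes, as done before writing to the DFIFO.
 * Assumed lane order: out[0] carries bits 7:0. */
void dword_to_bytes(uint32_t dword, uint8_t out[4])
{
    for (unsigned i = 0; i < 4; i++)
        out[i] = (uint8_t)(dword >> (8u * i));
}

/* Merge 4 bytes read from the DFIFO into one DWORD (same lane order). */
uint32_t bytes_to_dword(const uint8_t in[4])
{
    uint32_t dword = 0;
    for (unsigned i = 0; i < 4; i++)
        dword |= (uint32_t)in[i] << (8u * i);
    return dword;
}

/* Tie the lower 2 address bits to 0 so the access lands on a DWORD boundary. */
uint32_t dword_align(uint32_t addr)
{
    return addr & ~3u;
}

/* Byte enables for a write of len bytes starting at addr; the write must not
 * cross the DWORD boundary. Bit n enables byte lane n. */
unsigned byte_enables(uint32_t addr, unsigned len)
{
    unsigned first = (unsigned)(addr & 3u);
    assert(len >= 1u && first + len <= 4u);
    return ((1u << len) - 1u) << first;
}
```

For example, a 2-byte write at an address ending in binary 10 is issued at the aligned address with only the upper two byte lanes enabled.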
21 USB OTG controller (UOC)
This chapter focuses on UOC functionality and operation.
For the UOC feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
21.1 Overview
UOC supports both device and host functions and complies fully with the On-The-Go
Supplement to the USB 2.0 Specification, Revision 1.3a and Revision 2.0.
It can be configured as a host-only or device-only controller, fully compliant with the USB 2.0
Specification. It supports high-speed (HS, 480-Mbps) transfers.
UOC connects to the industry-standard AMBA High-Performance Bus (AHB) to
communicate with the application and system memory, and is fully compliant with the AMBA
Specification, Revision 2.0.
Figure 100. UOC module in SPEAr1340
[Block diagram not reproduced. Blocks shown at SPEAr top level: UOC, USB PHY, RCG, MISC,
PCM, OSCI, and the AHB interface (master and slave); an off-chip 5 V charge pump is shown
outside the device. Signals labeled: USB_UOC_DRVVBUS, USB_UOC_VBUS, USB_UOC_ID,
USB_UOC_DM, USB_UOC_DP, MCLK_XI, MCLK_XO, utmiotg_drvvbus, uoc_irq / ID[94],
otg_utmi_suspend_n, otg_utmi30_clk, otg_utmi_rst_n, otg_hclk, otg_hreset_n,
usbphy_clkcore, and the other UTMI+ signals (parallel 16-bit interface and UOC interface).]
21.2 Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
21.3 Clocks
● utmi_clk: this is the UTMI+ clock. It is functionally used only when a UTMI PHY is
selected, but always used as the PHY domain clock during DFT Scan mode. Select
utmi_clk as a test clock even when the core is configured for a non-UTMI PHY.
● hclk: this is the AHB clock. hclk is the scan clock for the core's AHB domain.
See also Chapter 5: Reset and clock generator (RCG).
22 PCI express controller (PCIe)
This chapter focuses on PCIe functionality and operation.
For the PCIe feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
22.1 Overview
The PCI Express (PCIe) core incorporates a dual mode (DM) core which can implement a
PCIe interface for a PCIe Root Complex (RC) or Endpoint (EP). The dual mode core can
operate in EP or RC port modes, depending on the value written in a register during PCIe
configuration. The DM core can be switched between modes at runtime by applying a
power-on reset.
The PCIe core is compliant with the PCI Express Base 2.0 specification and also with the
PCIe 1.1 specification.
The core features a proprietary user-configurable and high-performance application
interface for generating and receiving PCIe traffic. It is available with standard AMBA 3 AXI
interfaces.
The PCIe core implements the three PCI Express protocol layers (Transaction Layer, Data
Link Layer, and the MAC portion of the Physical Layer). It also implements the mode-specific
functionality of the PCI Express Transaction Layer (XADM/RADM) for packet transmission,
which sits between the application logic and the CXPL core. As shown in Figure 101, a
complete PCI Express Port solution includes the core, an analog PHY macro, and
application logic to source and sink data.
Figure 101. PCIe port system block diagram
[Block diagram not reproduced. From top to bottom: PCIe application (application logic,
application registers, CPU or EEPROM), application interfaces, core (application-dependent
part of the transaction layer, transaction layer, data link layer, physical layer MAC),
PHY interface (PIPE), PIPE-compliant PHY, PCI Express link.]
Figure 102. PCIe integration in SPEAr1340
[Block diagram not reproduced. Blocks shown at SPEAr top level: PCIe0, SATA0, demux,
single-lane MIPHY with PLL, RCG, and MISC. Interrupts pcie_p0_int, sata_p0_int and
pcie_sata_p0_int go to the A9SM interrupt controller. Interfaces: pcie_axi_dbi IF,
pcie_sata_axi_master IF, pcie_sata_axi_slave IF. Signals labeled: pcie_p0_power_up_rst_n,
pcie_p0_aux_clk_en, pcie_p0_device_present, pcie_p0_core_clken, pcie_aux_clk,
pcie_sata_0_aclk, pcie_sata_0_aresetn, pcie_miphy_p0_rst_phy_n, pcie_miphy_p0_clk_tx,
pcie_miphy_p0_clk_rx, pcie_miphy_p0_data_tx, pcie_miphy_p0_data_rx, p1_rst_phy_n,
p1_clk_auxi, p1_clk_rx, p1_clk_tx, p1_data_in, p1_data_out, MIPHY_S_0_TXp, MIPHY_S_0_TXn,
MIPHY_S_0_RXp, MIPHY_S_0_RXn, MIPHY_S_XTAL1, MIPHY_S_XTAL2, and Pcie_sata_sel[0]
(MIPHY_single PLL control signal coming from a MISC register).]
22.2 Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
22.3 Clocks
The PCIe controller operates in the following clock domains:
● Application clock (ACLK): this is the AXI bus interface unit clock. It is used for the AXI
master and the two AXI slave interfaces.
● RX clock (p1_clk_rx): PHY receive clock; recovered RX domain clock coming from the
PHY. It runs at 125 MHz or 250 MHz depending on the selected speed and is asynchronous
with ACLK.
● TX clock (p1_clk_tx): PHY transmit clock; this clock is generated by the PHY for
clocking the PCIe core transmit section. It runs at 125 MHz or 250 MHz depending on the
selected speed and is asynchronous with ACLK.
22.4 Resets
● ARESETn: AXI reset for the AXI master and the two AXI slave interfaces
● Reset_rx: PHY receive clock domain reset, asynchronous power-on reset input for the
RX clock domain
● Reset_tx: PHY transmit clock domain reset, asynchronous power-on reset input for the
TX clock domain
22.5 Interrupts
There is one interrupt output from the PCIe controller; it is the logical OR of the individual
interrupt status bits (CR6_Register) in the PCIe application control registers.
This register reports interrupt and error conditions internal to the PCIe controller
itself, as well as interrupt messages coming from the PCIe link.
See also Appendix A: Interrupts.
The application logic in a PCI express endpoint may use one of three methods to signal an
interrupt across the link:
● PCI legacy interrupt
PCI includes up to four virtual interrupt wires, referred to as INTA, INTB, INTC, and
INTD. These wires are shared by all the PCI devices in the system. The PCI Express
standard emulates this capability by providing Assert_INTx and Deassert_INTx
Message packets sent through the PCI Express serial Link.
● MSI
A PCI express endpoint may signal an MSI by sending a standard PCI Express Posted
Write packet towards the Root Port. The packet must contain a specific address and
one of up to 32 data values. The varying data values and the address value provide
more detailed identification of interrupt events than legacy interrupts.
● MSI-X
An MSI-X interrupt is identical to an MSI, except that an Endpoint may use one of up to
2048 address and data pairs in the MSI-X Posted Write packet. Endpoints with MSI-X
capability also include application logic to mask and hold pending interrupts, as well as
a memory table for the address and data pairs. The large number of address values
available to each Endpoint allows MSI-X Messages to be routed to different interrupt
consumers in a system, as compared to the single address available to MSI packets.
Upstream Switch Ports can send MSI-X packets; Root Ports cannot. In complex
systems, MSI-X packets could be routed to devices other than the RC, including other
Endpoints, based on the multiple address/data pairs available.
Only one of these capabilities is available at a time. When host software clears the MSI
Enable bit, you may only use legacy interrupts. When host software sets the MSI Enable bit,
you may only use MSI. If host software enables MSI or MSI-X, legacy interrupts are
automatically disabled. Functionality is undefined if both MSI and MSI-X are enabled.
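The "one of up to 32 data values" used by MSI can be illustrated with the sketch below. It follows the general PCI MSI convention, which is not restated in this manual: the host allocates a power-of-two number of vectors, and the endpoint selects a vector by replacing the low-order bits of the base message data. The function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Compute the MSI message data for a given vector.
 * base      : message data value programmed by host software
 * allocated : number of vectors granted (1, 2, 4, 8, 16 or 32)
 * vector    : vector to signal, 0 <= vector < allocated
 * Returns false if the arguments do not describe a valid MSI setup. */
bool msi_message_data(uint16_t base, unsigned allocated, unsigned vector,
                      uint16_t *out)
{
    assert(out != 0);
    if (allocated == 0u || allocated > 32u ||
        (allocated & (allocated - 1u)) != 0u)
        return false; /* must be a power of two, at most 32 */
    if (vector >= allocated)
        return false;
    *out = (uint16_t)((base & ~(allocated - 1u)) | vector);
    return true;
}
```

The address of the posted write stays the same for all vectors; only the data value changes, which is what distinguishes MSI from MSI-X with its per-vector address and data pairs.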
When the PCIe controller is set in RC mode, it accepts ASSERT_INTX and
DEASSERT_INTX messages from the downstream component. CR6_Register contains
eight bits dedicated to this kind of interrupt (4 related to assert and 4 to deassert). They
are set depending on the packets coming from the link.
When the PCIe controller is set in EP mode, it can generate ASSERT_INTA and
DEASSERT_INTA messages by setting the sys_int bit of CR0_Register.
22.6 Functional description
Main PCIe interfaces are shown in Figure 103, while the top-level structure of the PCIe core
is shown in Figure 104 (red line indicates the boundary of PCIe controller).
Figure 103. PCIe main interfaces
[Block diagram not reproduced. The PCIe controller is shown with an AXI Master IF, an AXI
Slave IF, an AXI Slave IF for registers RW, and RST/CLK inputs on the application side,
and with RX PIPE and TX PIPE interfaces to the PHY.]
Figure 104. DM core block diagram (with AHB/AXI bridge module)
[Block diagram not reproduced. Blocks shown: DM core containing the CXPL core, RADM, XADM,
CDM (core registers, MSI, MSI-X), LBC, PMC and MSG_GEN; AHB/AXI bridge modules;
application logic; application registers; optional system status/control; RAM; PHY logic.
Interfaces labeled: RX PIPE, TX PIPE, RTRGT, RCPL/RBYP, XALI, DBI, ELBI, SII, VMI, RAMI,
MSI, MSI-X, Rx vendor messages, Tx vendor messages, CLK/RST.]
Notes:
Optional internal (iATU) and external (xATU) address translation unit interfaces are shown.
Narrow arrows represent response signal paths to requests (broad arrows).
In RC mode, the ELBI pins are present but not operational.
The Common Xpress Port Logic (CXPL) module implements the basic functionality for the
PCI Express Physical, Link, and Transaction Layers. In addition to the CXPL, there are
several top-level modules that provide the configuration and mode-specific features:
● Transmit application-dependent module (XADM)
● Receive application-dependent module (RADM)
● Configuration-dependent module (CDM)
● Power management controller (PMC)
● Local bus controller (LBC)
● Message generation (MSG_GEN)
● Hot plug control (HOTPLUG_CTRL)
The following sections describe in detail the main interfaces of the PCIe controller.
22.6.1 AXI bridge interface
The AXI bridge module acts as a bridge between the standard AXI interfaces and the
Synopsys DesignWare PCIe core native interfaces. The bridge interconnects the AXI
interfaces within an AMBA-embedded system with a remote PCIe link, as either a root
complex port or as an endpoint port. The bridge supports three AXI interfaces, one for an
AXI master, one for an AXI slave, and one for DBI access to the native PCIe core. The AXI
master interface enables a remote PCIe device to read and write to an AXI slave connected
to the AXI bridge. The AXI slave interface enables an AXI master to read and write through
the AXI bridge to a remote PCIe device. The slave DBI enables an AXI master to read and
write to registers inside the native PCIe core, or the device-specific registers attached to the
PCIe native core's ELBI (see Section 22.6.7: Local bus controller (LBC) and data bus
interface (DBI)). Throughout this document, the terms inbound and outbound are defined
with respect to the AXI fabric. That is, inbound transactions are defined as the transactions
presented by the native PCIe core's AXI master interface. Outbound transactions are
defined as the transactions generated by an AXI master that targets a remote PCIe device.
Figure 105. System level view of the PCIe AXI core
[Figure: system-level view with OUTBOUND and INBOUND traffic directions. Core application side: application hardware and software (AXI master, AXI slave), AXI interconnect, AXI bridge (AXI slave, AXI slave for DBI, AXI master), application logic (SII interface), and native PCIe core. Core wire side: PHY and PCIe remote link partner.]
Figure 106 shows the PCIe AXI core top-level interfaces (red line indicates the boundary of
PCIe controller).
Figure 106. PCIe AXI core top-level interfaces
[Figure: the PCIe AXI core, containing the AXI bridge module and the native PCIe core. Labelled elements: application AXI master (outbound requests) with its AXI bridge slave, application AXI master (register accesses) with its AXI bridge slave, application AXI slave (inbound requests) with the AXI bridge master, application logic (SII), external RAM (RAMI), and PHY (PIPE). Labelled core interfaces: XALI, RCPL/RBYP, DBI and RTRGT.]
Features
● AXI master and slave interfaces for inbound and outbound PCI express requests
● 32-bit address width for AXI master and slave interfaces
● 64-bit data width for AXI master and AXI slave interface for inbound and outbound
requests
● 32-bit data width for AXI slave interface for register accesses (DBI)
● All types of PCI express transactions supported through the AXI bridge
● Little-endian operation
22.6.2 Common xpress port logic (CXPL)
The CXPL module implements a large portion of the transaction layer logic, all of the data
link layer logic, and the MAC portion of the physical layer, including the Link Training and
Status State Machine (LTSSM). The CXPL connects to the external PHY through the PIPE.
Important aspects of the CXPL and overall core implementation include:
● Layer 3 (transaction layer) functionality is split between the XADM, RADM, CDM, and
CXPL.
● Layer 1 (physical layer) is split across the PIPE such that the MAC functionality is in the
core and the PHY functionality is implemented in the PIPE-compliant PHY.
● Receive and transmit path functionality is decoupled except where communication
between the two is required (such as flow control and other low-level link management
functions).
● CXPL contains six modules, three for transmission and three for reception, as shown in
Figure 107.
– RTLH: Receive Transaction Layer Handler
– XTLH: Transmit Transaction Layer Handler
– RDLH: Receive Data Link Layer Handler
– XDLH: Transmit Data Link Layer Handler
– RMLH: Receive MAC Layer Handler
– XMLH: Transmit MAC Layer Handler
CXPL is compliant with the PCI express 2.0 specification with regard to the physical layer,
data link layer and transaction layer.
Figure 107. CXPL module block diagram
[Figure: CXPL module. Receive side: Rx PIPE, RMLH, RDLH, RTLH, and RTLI to the RADM. Transmit side: XTLI from the XADM, XTLH, XDLH, XMLH, and Tx PIPE, with the retry buffer control logic.]
22.6.3 Transmit application-dependent module (XADM)
The XADM sits between the application logic and the CXPL core and implements the mode-specific
functionality of the PCI express transaction layer for packet transmission.
Figure 108 is a block diagram of the XADM. Its functions include arbitration, TLP formation,
and credit checking. The transmit path uses a cut-through architecture. It does not
implement transmit buffering/queues (other than the retry buffer).
Figure 108. XADM block diagram
[Figure: XADM. Labelled elements: application clients (requester and completer) on the XALI interfaces, MSG_GEN (MSG), LBC (CPL), transmit arbitration, TLP formation, output MUX module, FC checking with Xmt FC credits, and the CXPL. A note in the figure marks one client and its XALI interface as optional.]
Arbitration
XADM provides the arbitration of TLP transmission between the following:
● The transmit client interfaces (XALI0, XALI1). (XALI2 shown in Figure 108 is not
present).
● Internally generated Messages from the MSG_GEN, triggered by PME, INTx (EP
mode), errors, or application logic
● Internally generated completions:
– EP mode: Internally generated completions are responses for type 0 configuration
read and write requests from upstream components, memory or I/O-mapped
application register space read and write requests, or responses to error
conditions (unsupported requests).
– RC mode: Internally generated completions are unsupported request or completer
abort, as required by the incoming request filtering function of the RADM.
In general, all internally generated TLP requests have higher priority than client interfaces.
Usage models for the client interfaces include:
● A master is connected to each client interface (EP mode only). The XADM arbitrates
among client interfaces. There is no guarantee that order will be preserved among
client interfaces. In some cases, a requester may consider implementing some
ordering rules in the application logic, for example, holding off a Memory Read
transaction until the Memory Write transaction is completed.
● A master is connected to Client1. A target completer is connected to Client0. The
XADM arbitrates among the client interfaces. There is no guarantee that order will be
preserved among client interfaces.
● A master for posted traffic is connected to Client0. A master for non-posted traffic is
connected to Client1. This is a model with one type of TLP per client (Posted,
Non-Posted, Completion).
Credit checking
The core checks that enough FC credits are available in the remote device for the specific
type of transaction (P, NP, CPL) before allowing a transmission of a TLP. TLPs that passed
the credit check are arbitrated according to the supported arbitration method. Internally
generated completions and messages are also gated by the arbitration logic, though at
highest priority, and must also pass the FC credit test before they are accepted for
transmission. If the application is using a single transmit client interface for more than one
Request type (for example, Posted and Non-Posted), and the current request (for example,
a Posted Request) is being blocked due to lack of available FC credits, then that client
interface is effectively blocked from sending other requests (for example, Non-Posted) even
though credits may be available for that type. To avoid this situation, the application can use
different transmit client interfaces for different request types.
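The blocking effect described above can be illustrated with a minimal Python sketch. It is a simplified model and not the core's logic: the `drain` function, the FIFO-per-client structure, and the one-credit-per-TLP accounting are assumptions made for illustration only.

```python
from collections import deque


def drain(clients, credits):
    """Send TLPs from each client's FIFO while FC credits allow.

    clients: list of deques holding TLP types ("P", "NP" or "CPL").
    credits: dict mapping TLP type to available FC credits
             (assumption: one credit per TLP, a simplification).
    Returns the list of TLP types sent, in order.
    """
    sent = []
    progress = True
    while progress:
        progress = False
        for fifo in clients:
            # Only the head of each client FIFO is eligible: a head that
            # lacks credits blocks everything queued behind it.
            if fifo and credits[fifo[0]] > 0:
                tlp = fifo.popleft()
                credits[tlp] -= 1
                sent.append(tlp)
                progress = True
    return sent


# One shared client: the Posted request at the head has no credits,
# so the Non-Posted requests behind it cannot be sent either.
shared = drain([deque(["P", "NP", "NP"])], {"P": 0, "NP": 2, "CPL": 0})

# One client per request type: Non-Posted traffic still flows.
split = drain([deque(["P"]), deque(["NP", "NP"])], {"P": 0, "NP": 2, "CPL": 0})
```

In this model the shared client sends nothing, while the split arrangement sends both Non-Posted requests even though the Posted request remains blocked.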
22.6.4 Receive application-dependent module (RADM)
The RADM sits between the application logic and the CXPL core and implements the mode-specific
functionality of the PCI Express Transaction Layer for TLP packet reception.
Figure 109 shows a block diagram of the RADM. The RADM provides four major
functions:
● Sort/filter received TLPs
● Maintain the completion lookup table (CLT), which is used for completion tracking and
completion timeout monitoring of transmitted Non-Posted requests
● Provide queuing (or bypass) of received TLPs
● Output received TLPs to the core's receive interface (demux function)
The filtering rules and routing options are configurable for all received TLPs.
Figure 109. RADM block diagram
[Figure: RADM. Labelled elements: filter (TLP filtering, Rcvd CPL processing, message processing), completion lookup table, queue, trash, DEMUX, Rcv FC update, LBC, CDM (CFG data), and the path to the XADM (LBC_CPL). Labelled interfaces: CXPL, RTRGT (requester), RCPL (completer), MSG, ERR, ELBI and DBI towards the application.]
Notes:
For RC, MSI/MSI-X are not available.
For RC and DM (RC mode), ELBI is not available.
Posted and non-posted request and completion TLP processing
The RADM filter passes the Posted and Non-Posted Request and completion transactions
(such as write transactions and memory reads) directly to the application through the
RTRGT1 interface or to RTRGT0 for internal modules, as determined by the filtering and
routing rules for the current operating mode. The RADM filter segregates Posted and Non-Posted TLPs into valid supported and valid unsupported Requests, and forwards them to
the queue. The filter processes each request and determines each TLP's destination along
with other controls that may be needed to generate TLPs. For Requests that the core
forwards to the RTRGT1 or Bypass interface, the application must process the request and
generate the completion. For requests that the core forwards to RTRGT0, the core
automatically generates the completion. The core automatically executes any required ELBI
access before generating the completion.
The RADM demux is designed to mux out a received TLP to the RTRGT1 and RCPL/RBYP
interfaces from single queue or multiple queue (DM/RC/EP) configurations. The filter
determines the destination and the action for each TLP, then sends this to the queue. The
demux decides whether to discard or forward the TLP onto the RTRGT1, RTRGT0, RCPL or
RBYP interfaces.
Received completion TLP processing
Received completions are filtered against the completion lookup table content before
presenting the completion to the queue. The RADM also implements a completion time-out
mechanism (via the completion lookup table) and notifies the application when an expected
completion, corresponding to a transmitted Non-Posted TLP, does not arrive within a
specified time.
Typically, infinite completion credits are advertised and the received completion is
configured in bypass mode which means that there is no queue in the core to store
completions. Completions can be configured in store and forward mode if the application
has chosen to do so. If a completion lookup has failed or other completion filtering has
failed, the core will assert an abort signal at the end of the transaction. If the core is
configured to have completions in bypass mode, it is the application's responsibility to roll
back any actions at the application's queue when an abort signal is asserted. If the core is
configured with completions enqueued, the completion will be discarded by the core and
flow control credits will be updated, as necessary, when an abort signal is detected.
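The completion tracking and time-out behaviour described above can be sketched as a small Python model. This is only an illustration: the `CompletionLookupTable` class, its method names, the use of the tag alone as the lookup key, and the abstract time units are all assumptions, not the core's implementation.

```python
class CompletionLookupTable:
    """Tracks outstanding Non-Posted requests by tag (illustrative model)."""

    def __init__(self, timeout):
        self.timeout = timeout   # allowed wait, in arbitrary time units
        self.pending = {}        # tag -> time at which the request was sent

    def request_sent(self, tag, now):
        """Record a transmitted Non-Posted request."""
        self.pending[tag] = now

    def completion_received(self, tag):
        """Return True if the completion matches a pending request.

        False models a failed lookup, for which the core asserts an abort.
        """
        return self.pending.pop(tag, None) is not None

    def expired(self, now):
        """Remove and return the tags whose completion did not arrive in time."""
        late = [t for t, sent in self.pending.items() if now - sent >= self.timeout]
        for t in late:
            del self.pending[t]
        return late


clt = CompletionLookupTable(timeout=100)
clt.request_sent(tag=1, now=0)
clt.request_sent(tag=2, now=10)
matched = clt.completion_received(1)      # expected completion
unexpected = clt.completion_received(7)   # no such request: lookup fails
timed_out = clt.expired(now=120)          # tag 2 never completed
```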
Message processing
The RADM filter provides a message interface (grouped as part of the SII) to handle the
message TLPs received from the upstream component. By default, the RADM filter
processes the message and decodes the header before sending it to the application logic
on the SII. You can also write to the Filter Mask registers to change this default and send the
entire message TLP to the application in addition to providing the decoded message on the
SII.
22.6.5 Configuration-dependent module (CDM)
The CDM implements the standard PCI express configuration space and the core-specific
register space. The CDM also requests the message generation module to send messages,
as required, including MSI and interrupts.
The specific PCI Express configuration structures implemented in the CDM include the
following:
● PCI-compatible configuration registers:
– RC mode: Type 1 header
– EP mode: Type 0 header
● PCI capability structures:
– PCI power management capability structure
– MSI capability structure
– MSI-X capability structure
– VPD (vital product data) capability
● PCI express capability structure
● PCI express extended capabilities:
– Advanced error reporting capability
– Virtual channel capability
– Device serial number capability
– Power budgeting extended capability
The configured device type (determined by the CR0_Register[28:25], see PCIe application
control registers) affects the behavior of the message generation engine, error reporting
mechanism, as well as some PCI express configuration space registers. The CDM
communicates with the application's host bus controller through the DBI.
22.6.6 Power management control (PMC)
The PMC module supports PCI software-compatible Power Management (PM) mechanisms
and the native PCI Express Active State Power Management (ASPM). The PMC is the only
module that must be powered by auxiliary power (Vaux) in the core when the core is in a
lower power state. It is also the only module containing contexts that are resettable only at
power-up. The following features are implemented in the PMC module:
● ASPM support: L0s and L1
● Control of the LTSSM to perform link power management: L0, L1, L2 and L3
● Software-controlled device PM states: D0, D1, and D3hot/cold
● Generation of PM Message transmission requests
● Control of beacon generation
● Side-band wake mechanism:
– Supports application wake-up (for example, WOL and WAKE# signal support from
the platform system)
– Generates wake to request system to restore power and clock
● Power management event (PME) generation
● Output of current power state status to the application
22.6.7 Local bus controller (LBC) and data bus interface (DBI)
The LBC module provides a mechanism for a link partner PCIe device (in EP mode only) or
a local CPU (through the DBI) to access:
● internal registers (in the CDM)
● external application registers connected externally to the ELBI.
Note: In RC mode:
The application can access CDM registers or ELBI through the DBI.
PCIe wire access (through RTRGT0) to the CDM registers or ELBI is not possible.
Figure 110 shows the location of the LBC within the PCIe core and its role in routing
transactions.
Figure 110. LBC context
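[Figure: location of the LBC within the PCIe core. Labelled blocks: RADM, XADM, CXPL core, CDM (core registers), LBC, application registers, PHY, and a CPU or EEPROM on the DBI. Labelled interfaces: RX PIPE, TX PIPE, RTLI, XTLI, RTRGT, RCPL/RBYP, DBI and ELBI. Note: for a downstream port, the ELBI pins are present but not operational.]
Request and response flow shown in the figure:
An incoming request is received from the PCIe remote link partner.
The request is filtered and routed by the RADM via RTRGT0 to the local bus controller (LBC).
The LBC forwards the request to external registers via the ELBI or to internal registers in the CDM.
The LBC forms a Cpl/CplD TLP with the response received from the ELBI or CDM.
The PCIe core transmits the response (Cpl/CplD) to the remote link partner.
The local CPU may also generate a register read/write request via the DBI, and the response from the ELBI or CDM is
returned to it immediately.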
The LBC provides a switched access function to internal registers (in the CDM) or external
registers (via ELBI) from the local application processor (CPU) via the DBI or the remote
application software (off the PCIe RX wire) via RTRGT0. Figure 111 illustrates the four
possible request paths through the LBC.
Figure 111. LBC switch
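[Figure: the LBC as a switch between two request sources (inbound request through RTRGT0, and the DBI) and two targets (CDM: access to the core's registers, and ELBI: external local bus interface).]
The four request paths shown in the figure:
Inbound PCIe request to RD/WR the PCIe core's internal config space registers
Inbound PCIe request to RD/WR external application-specific registers
Local CPU request (DBI) to RD/WR the PCIe core's internal config space registers
Local CPU request (DBI) to RD/WR external application-specific registers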
The LBC also generates PCIe completions for requests coming from the PCIe wire through
RTRGT0.
Simultaneous transactions
● The LBC is single-threaded; therefore, the DBI and RTRGT0 cannot use the LBC at
the same time. For example, during a RTRGT0 <-> ELBI transaction, a request on the DBI
is not accepted until both parts of that transaction, [1] the request and [2] the
response (completion generation), are completed. Therefore, it is not permissible to
use the ELBI to drive the DBI.
● If the DBI and RTRGT0 present a request at the same time (regardless of the
target/destination of each request), then the LBC grants access to RTRGT0.
Application registers are connected to the ELBI. These can be accessed by PCIe request
TLPs over the PCIe link or by the DBI.
The ELBI can only be accessed by CFG access (from DBI or PCIe wire).
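The two access rules in "Simultaneous transactions" can be summarized in a few lines of Python. This is a behavioural sketch only; the function name `lbc_grant` and its boolean arguments are hypothetical and do not correspond to signals of the core.

```python
def lbc_grant(busy, dbi_request, rtrgt0_request):
    """Return which requester the single-threaded LBC serves next, or None.

    busy: True while a request/response pair is still in progress.
    """
    if busy:
        return None        # nothing is accepted until the transaction completes
    if rtrgt0_request:
        return "RTRGT0"    # RTRGT0 wins when both request at the same time
    if dbi_request:
        return "DBI"
    return None
```

For example, `lbc_grant(False, True, True)` returns `"RTRGT0"`, and any call with `busy=True` returns `None`.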
CDM / ELBI register space layout
The core has 4096 bytes(2) of PCI Express configuration space per function distributed as
per Figure 112. This address space is fully accessible from the DBI without any restrictions.
In EP mode it can be accessed from the PCIe wire using CFG requests.
2. A CFG TLP has a 6-bit Register Number field and a 4-bit Extended Register Number field, allowing 1024
DWORDs (4096 bytes) to be accessed.
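The arithmetic behind the footnote can be checked with a short Python sketch. The constant and function names are chosen for illustration; the way the two fields are concatenated into a 10-bit DWORD index follows the general PCI Express definition of configuration requests.

```python
REGISTER_NUMBER_BITS = 6       # Register Number field of a CFG TLP
EXT_REGISTER_NUMBER_BITS = 4   # Extended Register Number field
BYTES_PER_DWORD = 4

# 2^(6+4) DWORDs are addressable, 4 bytes each.
dwords = 1 << (REGISTER_NUMBER_BITS + EXT_REGISTER_NUMBER_BITS)
size_bytes = dwords * BYTES_PER_DWORD


def cfg_byte_offset(ext_reg_num, reg_num):
    """Byte offset addressed by a CFG TLP (the fields form a 10-bit DWORD index)."""
    if not 0 <= ext_reg_num < (1 << EXT_REGISTER_NUMBER_BITS):
        raise ValueError("Extended Register Number is a 4-bit field")
    if not 0 <= reg_num < (1 << REGISTER_NUMBER_BITS):
        raise ValueError("Register Number is a 6-bit field")
    return ((ext_reg_num << REGISTER_NUMBER_BITS) | reg_num) * BYTES_PER_DWORD
```

Here `dwords` evaluates to 1024 and `size_bytes` to 4096; the highest addressable DWORD, `cfg_byte_offset(0xF, 0x3F)`, starts at byte offset 0xFFC.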
Figure 112. PCIe configuration space address map (per function)
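[Figure: address map of the configuration space, drawn against a DWORD address scale and a byte address scale, with CONFIG_LIMIT marked on both scales together with its default value. Regions, from the top down: customer application registers on the ELBI (optional); CDM port logic registers (optional); PCI express extended configuration space, holding the PCIe extended capability structures (AER, VC, SN, PB, ARI, SR-IOV, SPCIE); PCI configuration space, holding the PCI standard capability structures (PM, MSI, PCIE, MSI-X, VPD) located through CapPtr, and the PCI configuration header space at the bottom. The numeric sizes and addresses in the figure are not recoverable from the extracted text.]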
PCI configuration header and capability registers (in CDM)
The PCI Configuration Header and Capability registers in Figure 112 are PCIe core
configuration registers specified by the PCI express 2.0 specification. Access from the PCIe
wire is possible with CFG requests (in EP mode only). These registers are fully accessible
from the DBI without any restrictions.
Port logic (PL) registers (in CDM)
The port logic registers in Figure 112 are PCIe core configuration registers not specified by
the PCI express 2.0 specification, but are specific to the configuration and operation of the
PCIe controller integrated in SPEAr. In EP mode, access from the PCIe wire is with CFG
requests. There is no access from the PCIe wire in RC mode. These registers are fully
accessible from the DBI without any restrictions.
Customer application registers in ELBI
The customer application registers in Figure 112 are the customer's application registers,
specific to the operation of the customer's application IP. They are external to the core and
are connected to the ELBI. These registers cannot be accessed directly through the remote link.
These registers are fully accessible from the DBI without any restrictions.
Accessibility summary
● From the PCIe wire (through RTRGT0) in EP mode only:
– You can Memory-Map the Port Logic (PL) register space.
– You cannot Memory-Map the PCI and PCIe configuration register spaces.
– You must always access with a CFG request.
– You cannot access the customer application registers.
● In RC mode:
– PCIe wire access (through RTRGT0) to the CDM registers is not possible.
– PCIe wire access (through RTRGT0) to the application registers is not possible.
● From the DBI:
– You can access the port logic (PL) register space without any restriction.
– You can access the PCI and PCIe configuration register spaces without any restriction.
– You can access the application registers without any restriction.
– CONFIG_LIMIT is not used in DBI routing to CDM/ELBI.
Table 123. AXI bridge DBI -> CDM / ELBI access details
Access type | Address bits 31-14 | Bit 13 | Bit 12 | Bits 11-2 | Bit 1 | Bit 0
CDM | Not used (1) | 0 | CS2 (2) | 1 K-DWORD register access | 0 | 0
ELBI | Not used (1) | 1 | Not used | 1 K-DWORD register access | 0 | 0
1. But forced to 0 internally
2. This bit must be asserted to write "BAR Mask registers"
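The address split in Table 123 can be expressed as a small decoding routine. The following Python sketch is illustrative only: the function name `decode_dbi_address` and the returned tuple layout are assumptions, and rejecting unaligned addresses is a choice made for this example (the table only states that bits 1 and 0 are 0).

```python
def decode_dbi_address(addr):
    """Split a DBI slave address into the fields of Table 123.

    Returns (target, cs2, dword_index), where cs2 is None for ELBI accesses.
    """
    if addr & 0x3:
        # Assumption for this sketch: treat non-DWORD-aligned addresses as errors.
        raise ValueError("bits 1-0 must be 0")
    addr &= 0x3FFF                        # bits 31-14 are not used
    target = "ELBI" if addr & (1 << 13) else "CDM"
    # Bit 12 (CS2) is only meaningful for CDM accesses.
    cs2 = bool(addr & (1 << 12)) if target == "CDM" else None
    dword_index = (addr >> 2) & 0x3FF     # bits 11-2: 1 K-DWORD register index
    return target, cs2, dword_index
```

For example, `decode_dbi_address(0x1010)` selects the CDM with CS2 asserted at DWORD index 4, while `decode_dbi_address(0x2FFC)` selects the last DWORD of the ELBI range.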
Figure 113. DBI access to LBC
[Figure: DBI access to the LBC. On the AHB/AXI application side: application master (CPU) and application slave on the AHB/AXI bus. In the PCIe core: AHB/AXI bridge with master, slave and DBI slave ports, connected to RTRGT, XALI and, through the DBI, to the LBC.]
22.6.8 Message generation
The message generation module works on message generation and message processing.
22.6.9 Hot plug control (HOTPLUG_CTRL) module
In RC mode devices, the hot plug logic supports generation of hot plug interrupts on the
following hot plug events:
● Power fault detected
● MRL sensor changed
● Presence detect changed
● Command completed
● Attention button pressed
● Electromechanical interlock status changed
● Data link layer state changed
When MSI or MSI-X mode is enabled, the core notifies the application of hot plug events
using the hp_msi bit in CR6_Register. When INTx interrupt mode is enabled, the core
notifies the application of hot plug events using the hp_int bit in CR6_Register. If PME is
enabled, the hot plug logic generates a hot plug wake-up signal on hp_pme, triggered by the
above hot plug events. The RC Core does not check if the PM state is D1, D2, or D3hot. It is
up to the application to check the value on pm_dstate to make sure the device is in D1, D2,
or D3hot.
22.7 Operation
This section describes the operations of the PCI Express core. The topics for this section
are:
● Initialization
● Link establishment
● Transmit TLP processing
– Transmit TLP arbitration
– Transmit retry
– Transmit DLLP priorities
● Receive TLP processing
– Receive filtering
– Receive routing
– Receive queuing
● Error handling
● Messages
● Interrupts
● Address translation
● Gen2 5.0 GT/s operation
● Power management
● Completion timeout ranges
22.7.1 Initialization
Immediately after reset the DM core goes into either EP mode or RC mode depending on
the state of the device_type setting (CR0_Register[28:25], see PCIe application control
registers in RM0089, Reference manual, SPEAr1340 address map and registers). The
internal configuration registers in the CDM assume their default reset values as listed in the
PCIe core registers section.
The application must keep the app_ltssm_enable signal deasserted after reset until the
application is ready to establish a Link and start receiving and transmitting TLPs. If the
application needs to update configuration registers in the CDM as part of the initialization
process, then the application must keep app_ltssm_enable deasserted until it has
programmed all the necessary configuration registers through the DBI. After initializing the
necessary configuration registers, the application can assert app_ltssm_enable to allow the
LTSSM to begin Link establishment. The LTSSM begins link negotiation once reset is
deasserted, miphy initialization is complete, and the app_ltssm_enable bit
(CR0_Register[3]) is asserted.
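The ordering constraint described above can be sketched in Python on a plain integer model of CR0_Register. Only the bit positions ([28:25] for device_type, [3] for app_ltssm_enable) come from the text; the function names, the `program_cdm` callback, and the example register values are hypothetical, and real code would access the register through the memory map given in RM0089.

```python
DEVICE_TYPE_SHIFT = 25        # device_type field: CR0_Register[28:25]
DEVICE_TYPE_MASK = 0xF
APP_LTSSM_ENABLE = 1 << 3     # app_ltssm_enable: CR0_Register[3]


def device_type(cr0):
    """Extract the 4-bit device_type field from a CR0_Register value."""
    return (cr0 >> DEVICE_TYPE_SHIFT) & DEVICE_TYPE_MASK


def bring_up(cr0, program_cdm):
    """Model the initialization order on a plain 32-bit integer.

    program_cdm is a hypothetical callback standing in for the DBI writes
    that configure the CDM; it receives the CR0 value in effect at that time.
    """
    cr0 &= ~APP_LTSSM_ENABLE & 0xFFFFFFFF   # keep the LTSSM held off
    program_cdm(cr0)                        # program CDM registers via the DBI
    cr0 |= APP_LTSSM_ENABLE                 # now allow link establishment
    return cr0


seen = []
final = bring_up(0x08000008, seen.append)
```

In this example the callback observes a CR0 value with bit 3 cleared, and the returned value has bit 3 set again with the device_type field unchanged.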
22.7.2 Link establishment
The core and a PCI Express compliant PHY combine to provide a complete solution for
setting up and maintaining a compliant PCI express link. The core implements the LTSSM
function according to the PCI express 2.0 specification. In general, the process for
establishing a Link is as follows:
1. Upon power-up (or directly out of reset), it is assumed that the power supply becomes
stable and the ASIC/SoC and SerDes PLLs reach frequency lock before the devices
attempt to establish a valid Link. Once in a valid state, the SerDes either communicates
a ready status to the core or simply begins transmitting and receiving valid data.
2. Per the PCI express 2.0 specification, once bit and symbol synchronization are
complete, the core initiates the following sequence to establish a link (assuming a valid
and properly functioning link partner):
a) Receiver detection on available lanes for the port.
b) Exchange of training sequences to determine link configuration (for example, link
speed, number of lanes, and order).
c) Once both partners reach a valid negotiated state, the link state is set up and the
LTSSM is in L0.
3. Once link up is achieved, the data link modules take over to manage the link and
initialize flow control.
4. After flow control initialization is complete, the data link modules signal the transaction
layer modules that the link is ready to allow transmission/reception of TLP traffic.
5. During normal operation, the LTSSM and data link modules continue to manage the
underlying Link integrity while data traffic is communicated across the PCI express link.
22.7.3 Transmit TLP processing
Generally, all types of transmit TLPs (Posted, Non-Posted, and Completion) generated by
the application travel through the core in the following flow:
The application presents a transaction transmission request with header information and
payload (if applicable) on one of the transmit client interfaces (for example, XALI0).
1. The XADM forms the transaction into a TLP and checks the TLP against the current
Flow Control credit availability. If the TLP passes the flow control checks and wins the
arbitration with TLPs from the other client interfaces, then the TLP goes to the CXPL.
2. The XTLH module inserts an ECRC (if applicable) and snoops/stores the necessary
TLP information for completion lookup (for Non-Posted requests only).
3. The XDLH inserts the sequence number and LCRC into the TLP and the retry buffer
stores the TLP.
4. The XMLH inserts start and end delimiters and performs data scrambling.
5. The XMLH presents the packet to the PHY through the PIPE interface.
6. The PHY receives the packet, performs 8b10b encoding and serialization, then sends
the packet for transmission on the Link.
Transmit arbitration
The XADM arbitrates transmit TLPs using a round-robin method between the two transmit
client interfaces.
Regardless of the TLP transmit arbitration, messages (both internally generated and
messages requested through the VMI) always have the highest priority, followed by
internally generated completions. The priority order for all transmitted TLPs is:
1. Internally generated messages
2. Internally generated completions
3. Transmit TLPs from Client0 and Client1 according to the selected arbitration method
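The priority order above, combined with round-robin selection between the two clients, can be sketched as follows. This Python model is an illustration under stated assumptions: the `TransmitArbiter` class and its attribute names are hypothetical, flow control credit checks are left out, and the round-robin pointer policy is one common way to realize round-robin, not necessarily the core's.

```python
from collections import deque


class TransmitArbiter:
    """Fixed priority for internal traffic, round-robin between two clients."""

    def __init__(self):
        self.messages = deque()             # internally generated messages
        self.completions = deque()          # internally generated completions
        self.clients = [deque(), deque()]   # Client0, Client1
        self.next_client = 0                # round-robin pointer

    def pick(self):
        """Return the next TLP to transmit, or None if nothing is pending."""
        if self.messages:
            return self.messages.popleft()
        if self.completions:
            return self.completions.popleft()
        for offset in range(2):
            idx = (self.next_client + offset) % 2
            if self.clients[idx]:
                # The other client gets the first chance next time.
                self.next_client = (idx + 1) % 2
                return self.clients[idx].popleft()
        return None


arb = TransmitArbiter()
arb.clients[0].extend(["c0a", "c0b"])
arb.clients[1].append("c1a")
arb.completions.append("cpl")
arb.messages.append("msg")
order = [arb.pick() for _ in range(6)]
```

With these queues the model emits the message first, then the completion, then alternates between Client0 and Client1 until both are empty.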
Transmit retry
There is a Retry Buffer (RB) in the core that stores a copy of each transmitted TLP until an
Ack is received. The RB consists of two buffers: retry buffer and start-of-TLP (SOT) buffer.
The retry buffer is implemented with a single port RAM. The SOT buffer stores the starting
address of each unacknowledged TLP stored in the retry buffer. The SOT buffer is
implemented with a single port RAM and is indexed by the Sequence Number of the TLP
whose starting address is being stored or retrieved. When a Nak is received or the replay
timer times out, a replay is initiated.
A replay is terminated by either of two conditions:
● When the replay of all TLPs in the retry buffer is finished, or
● An Ack DLLP is received that acknowledges all TLPs in the retry buffer
The replay timer tracks the TLP replay time. It stays at 0 when every TLP has received an
Ack and starts to count when a TLP is transmitted and the LTSSM is not in the training state.
The replay timer is reset to 0 when an Ack or Nak is received that acknowledges a TLP that
is in the retry buffer.
Note: The retry buffer does not function as a transmit queue. The core transmits TLPs immediately
after they pass arbitration. The copy in the retry buffer is only sent in the event that the TLP
must be re-transmitted.
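The retry mechanism can be modelled with a short Python sketch. It is deliberately simplified and rests on several assumptions: the `RetryBuffer` class and its methods are hypothetical, acknowledgement is treated as cumulative (an Ack or Nak for a sequence number covers all earlier TLPs, as in the PCI Express data link layer), and sequence number wrap-around, the SOT buffer and the replay timer are not modelled.

```python
class RetryBuffer:
    """Keeps a copy of each transmitted TLP until it is acknowledged."""

    def __init__(self):
        self.unacked = {}   # sequence number -> TLP copy, in transmit order

    def transmit(self, seq, tlp):
        """Send a TLP immediately; the buffer only stores a copy."""
        self.unacked[seq] = tlp
        return tlp

    def ack(self, seq):
        """Ack DLLP: release every stored TLP up to and including seq."""
        for s in [s for s in self.unacked if s <= seq]:
            del self.unacked[s]

    def nak(self, seq):
        """Nak DLLP: release TLPs up to seq, then replay the rest in order."""
        self.ack(seq)
        return list(self.unacked.values())


rb = RetryBuffer()
for seq, tlp in enumerate(["A", "B", "C"]):
    rb.transmit(seq, tlp)
rb.ack(0)               # "A" is released
replayed = rb.nak(1)    # "B" is released, "C" is replayed
```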
Transmit DLLP priorities
The order of priority to transmit pending DLLPs is:
● High-priority DLLPs
● TLPs
● Low-priority DLLPs
22.7.4 Receive TLP processing
Generally, received transactions travel through the core in the following flow:
1. The PHY receives a stream of bits and aligns/forms them into 10-bit symbols.
2. The PHY decodes the 10b stream into an 8b stream.
3. The PHY crosses the clock domain from RX to TX and presents the stream to the
PIPE.
4. The RMLH descrambles and deskews the incoming data, checks for receiver errors,
then extracts packets.
5. The RDLH strips off the LCRC and sequence number.
6. The RTLH strips off the ECRC (if applicable), checks for a malformed TLP, and forms a
transaction across the RTLI interface to the RADM.
7. The RADM filters the transaction based on the transaction type (Posted, Non-Posted,
or completion) and the rules described in Receive filtering below.
8. Filtered transactions are sent to RADM queues.
9. Transactions residing in the RADM queues are presented to the application or locally
handled by the LBC module, depending upon the filter result.
Receive filtering
The core contains a filter module that is responsible for the following tasks:
● Determine the status of a received TLP using filtering rules.
● Determine the destination interfaces of a received TLP based on the status from
applying the filter rules.
● Signal the status of the received TLP to the application by driving signals such as
DLLP abort, TLP abort and ECRC error.
● Report errors to the Advanced Error Reporting registers (ADERR_STRUC address block)
based on filter results. If more than one type of error is detected, Section 6.2.3.2.3
“Error Pollution” of the PCI express 2.0 specification is followed.
The core filters and routes received TLPs according to a set of rules determined by the TLP
type based on the PCI express base 2.0 specification and user-configurable filtering options.
The filtering rules for a received TLP are affected by I/O signals and register values. The
application can mask some of the filtering and error handling rules by setting the
corresponding bits in Symbol Timer and Filter Mask register 1 (SYMB_T_R) and Filter Mask
register 2 (FL_MSK_R2). There are three types of filtering rules in the core:
● rules that are applicable to all received TLPs
● rules that depend on the type of the TLP, as defined by the PCIe specification
● rules that are not from the PCIe specification but are requested by specific applications.
Figure 114. Receive TLP processing flow
[Figure: receive TLP processing flow. Labelled elements: CXPL, filter (TLP filtering, Rcvd CPL processing, message processing), queue, trash, DEMUX, LBC, CDM (CFG data). Labelled interfaces: RTRGT, RCPL, ERR, MSG, ELBI and DBI.]
Filtering rules applicable for all TLPs received
The following general rules apply to all incoming TLPs:
● The core discards all incoming TLPs that have an invalid Type field. This TLP is treated
as a “TLP ABORT”.
● A request TLP with the poison bit set is considered an unsupported request (UR) only
when the UR poison rule mask bit is not set. Applications can control the end result of a
poisoned TLP filter through the corresponding filter mask bit. If the filter mask bit is not
set, all request TLPs with poison bit set will be discarded.
● A locally terminated TLP with ECRC error detected is discarded in store-and-forward
mode and an ECRC error reported only when the filter mask
CX_FLT_MASK_ECRC_DISCARD bit is not set.
● Filter rules have no effect on received TLPs when “DLLP ABORT” signal is asserted.
● If a completion of a non-posted request is not received within a completion timeout
period, this request will be treated as a completion timeout, and a non-advisory error
will be reported.
● For messages to be accepted and decoded, the incoming message must be one of the
valid Message types with the correct payload length based on PCIe 2.0 specification.
Valid Messages will be decoded and passed onto the SII interface as necessary.
See Section 22.7.5: Error handling for more details.
Filtering rules based on TLP type defined in PCIe specification
PCIe TLPs are categorized as requests and completions. The next table describes the
filtering rules for request and completion TLPs and the results of the core's filter.
If a received TLP passes all of the filter rules for request and completion TLPs, then it is
considered to have no errors, and the TLP will be routed to the destination that is
configured. Details on routing are provided in the Receive routing section.
Notation of filter results:
UR = Unsupported Request
CA = Completer Abort
CRS = Configuration Request Retry Status
SU = Successful
UC = Unexpected Completion
MLF = Malformed
"-" = Filtering rule does not apply to TLP type
MA = Master Abort
TA = Target Abort
EP mode filtering rules
Table 124. Result of filtering rules applied to request TLPs and completion (CPL)
TLPs: EP mode
TLP type
CFG
MSG
CPL with
UR/CA/CR
S status
CPL with
SU status
UR
SU
SU
UC
UC
UR
UR
-
-
-
-
TLP header poison bit is set and the
filter mask
CX_FLT_MASK_UR_POIS bit is not
set
UR
UR
UR
UR
SU
SU
Address within a BAR that is
configured to RTRGT0 and TLP DW
length > 1
CA
CA
-
-
-
-
MRd with lock and filter mask
CX_FLT_MASK_LOCKED_RD_AS
_UR bit is not set
UR
-
-
-
-
-
Filtering rule
MRd
MWr
IORd
IOWr
PowerState is not in D0
UR
Address is not within any configured
Memory BAR or IO BAR if it is an IO
request
Table 124. Result of filtering rules applied to request TLPs and completion (CPL)
TLPs: EP mode (continued)
TLP type
CFG
MSG
CPL with
UR/CA/CR
S status
CPL with
SU status
-
UR
-
-
-
-
-
UR
-
-
-
Application requests the core filter
to return CRS by asserting signal
app_req_retry_en
-
-
CRS
-
-
-
Not valid message for EP device
-
-
-
UR/MLF
-
-
Illegal payload length of a message
-
-
-
UR
-
-
Vendor MSG Type0 with filter mask
CX_FLT_MASK_VENMSG0_DROP
bit not set
-
-
-
UR
-
-
Vendor MSG Type1 with r[2:0] to
3'b010 and {Bus#, Dev#, Func#}
mismatch
-
-
-
UR
-
-
CA
CA
CA
-
-
-
Requester ID mismatch
-
-
-
-
MA/TA
MLF
Requester TAG mismatch
-
-
-
-
MA/TA
MLF
TAG error (non-pad zero for
reserved TAG bits)
-
-
-
-
MA/TA
MLF
Byte count mismatch (PCIe Gen2)
-
-
-
-
MA/TA
UC/MLF
Completion received with status of
UR
-
-
-
-
MA
-
Completion received with status of
CA
-
-
-
-
TA
-
Completion received with status of
CRS
-
-
-
-
CRS
-
Completion received with CRS
status and completion is not a
pending configuration request
-
-
-
-
MLF
-
Filtering rule
MRd
MWr
IORd
IOWr
The function number of a completer
ID within a CFG request does not
match an implemented function
within the receiver device and the
filter mask
CX_FLT_MASK_UR_FUNC_MISM
ATCH bit is not set
-
Configuration type1 TLP request
and the filter mask
CX_FLT_MASK_CFG_TYPE1_RE
Q_AS_UR is not set
TLP with ECRC error detected
A complete list of the filtering checks is given in the descriptions of the Symbol Timer
and Filter Mask register 1 (SYMB_T_R) and Filter Mask register 2 (FL_MSK_R2), in the PCIe
core registers section (Endpoint register bank) of RM0089, Reference manual, SPEAr1340
address map and registers.
RC mode filtering rules
Table 125. Result of filtering rules to request TLPs and completions (CPL) TLPs: RC mode
TLP type
IO
MSG
CPL
with
UR/CA
status
-
-
-
-
-
-
UR
-
UR
-
-
-
-
UR
UR
UR
UR
UR
-
-
-
UR
-
-
-
-
-
-
CFG Request received and the filter
mask
CX_FLT_MASK_RC_CFG_DISCARD
is not set
-
-
UR
-
-
-
-
-
IO Request received and the filter
mask
CX_FLT_MASK_RC_IO_DISCARD is
not set
-
-
-
UR
-
-
-
-
Filtering rule
MRd
MWr
Address does not satisfy any of the
following conditions:
1.
Within any configured memory
BAR.
2.
Outside of the memory range
AND prefetchable memory
range as determined by the
corresponding base and limit
fields in the Type-1 header.
3.
The filter mask
CX_FLT_MASK_UR_OUTSIDE_BAR bit is set,
which treats out-of-BAR TLPs as supported
requests and indicates a special
application requirement
UR
UR
Any address bit, above bit position
MASTER_BUS_ADDR_WIDTH-1 is
set to '1'
UR
TLP header poison bit is set and the
filter mask CX_FLT_MASK_UR_POIS
bit is not set
MRdLk request received and filter
mask
CX_FLT_MASK_LOCKED_RD_AS_UR
bit is set, which indicates that
the customer prefers to filter out the
MRdLk
(1)
CFG
CPL
with
CRS
status
CPL
with
SU
status
Table 125. Result of filtering rules to request TLPs and completions (CPL) TLPs: RC mode
TLP type
IO
MSG
CPL
with
UR/CA
status
-
-
UR
-
-
-
-
-
-
UR/
MLF
-
-
-
CA
CA
CA
CA
-
-
-
-
Requester ID mismatch
-
-
-
-
-
MA/TA
-
MLF
Requester TAG mismatch
-
-
-
-
-
MA/TA
-
MLF
TAG error (non-pad zero for reserved
TAG bits)
-
-
-
-
-
MA/TA
-
MLF
Byte count mismatch
-
-
-
-
-
MA/TA
-
MLF
Completion received with status of
UR
-
-
-
-
-
MA
-
-
Completion received with status of CA
-
-
-
-
-
TA
-
-
Completion received with CRS status
and completion is not a pending
configuration request
-
-
-
-
-
-
MLF
-
Filtering rule
MRd
MWr
Vendor MSG Type0 with filter mask
CX_FLT_MASK_VENMSG0_DROP
bit not set
-
-
Not valid message for RC device
-
TLP with ECRC error detected
(1)
CFG
CPL
with
CRS
status
CPL
with
SU
status
1. DM (in RC mode) should not expect to receive a CFG or IO request.
A complete list of the filtering checks is given in the descriptions of the Symbol Timer
and Filter Mask register 1 (SYMB_T_R) and Filter Mask register 2 (FL_MSK_R2), in the PCIe
core registers section (Endpoint register bank) of RM0089, Reference manual, SPEAr1340
address map and registers.
Filtering rules not defined in PCIe specification
There are additional filtering rules that are designed to provide enhanced filter support for
certain applications.
●
Core handling of received posted or non-posted requests with zero byte length
When a zero-byte request TLP (also called a "flush" command) is received, the core can
drop it: the core services the request internally but does not pass it to the
application. This supports applications that cannot handle a zero-byte request.
Applications can turn this rule on or off dynamically by programming the filter mask
bit CX_FLT_MASK_HANDLE_FLUSH. If the core is programmed to handle the flush, it is the
completer's task to return the completion status.
●
Core to detect oversize read request and return UR for the read request
Some applications may have a buffer limit and are not able to handle lengthy read
requests. The core over-size read request detection rule can be turned on when an
application can identify a maximum read request size that it can tolerate. This feature is
enabled when the PCIe AHB/AXI bridge is enabled.
Receive routing
●
EP Mode
The possible destinations of a posted or non-posted request TLP are RTRGT1
interface, RTRGT0 interface and core discard (dropped or terminated). By default:
–
CFG requests are routed to RTRGT0 and then to CDM via LBC.
–
BAR-matched MEM/IO requests are routed to RTRGT1.
–
MSG requests are decoded internally, signalled on the SII interface and then
terminated.
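Figure 115. Default request TLP routing (assuming no TLPs with CA/CRS/UR
completion status)
[Block diagram not reproduced. Inside the CXPL core, an address/type check (BAR,
CX_NFUNC) steers CFG requests to RTRGT0 and on to the core configuration data through
the LBC, MEM/IO requests to RTRGT1, and MSG requests to the SII.]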
The possible destinations of a completion TLP are RCPL interface, RBYP interface,
RTRGT1 interface, and Core Discard.
In general, a TLP type that is configured as bypass will be sent to either the RBYP interface,
or RCPL interface if it is a completion. A TLP type that is configured as a cut-through or
store-forward will be sent to RTRGT1 interface.
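The default EP-mode request routing listed above can be summarised as a small decision function. This is only an illustrative sketch: the function name, the string labels and the `bar_matched` flag are assumptions made for the example, and filter outcomes (UR/CA/CRS) are ignored, as in Figure 115.

```python
def route_request_ep(tlp_type, bar_matched=False):
    """Default destination of an error-free request TLP in EP mode (sketch)."""
    if tlp_type == "CFG":
        # CFG requests go to RTRGT0, then to the CDM via the LBC.
        return "RTRGT0"
    if tlp_type in ("MEM", "IO") and bar_matched:
        # BAR-matched MEM/IO requests are routed to RTRGT1.
        return "RTRGT1"
    if tlp_type == "MSG":
        # Messages are decoded, signalled on the SII, then terminated.
        return "SII"
    # Not covered by the defaults above; modelled here as a core discard
    # (an assumption of this sketch).
    return "DISCARD"
```

For example, `route_request_ep("MEM", bar_matched=True)` yields `"RTRGT1"`, while a CFG request always yields `"RTRGT0"`.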
●
RC mode
The possible destinations of a posted or non-posted Request TLP are RTRGT1
interface and core discard (dropped or terminated). By default:
–
MEM requests outside of the memory range AND pre-fetchable memory range as
determined by the corresponding base and limit fields in the Type-1 header, are
routed to RTRGT1.
–
MSG requests are decoded internally, signalled on the SII interface and then
terminated.
–
An RC does not expect to receive CFG or IO requests.
–
BARs should be disabled and not used.
The possible destinations of a completion TLP are RCPL interface, RBYP interface,
RTRGT1 interface, and core discard. In general, a TLP type that is configured as
bypass will be sent to the RBYP interface. A TLP type that is configured as cut-through or store-forward will be sent to the RTRGT1 interface.
Receive queuing
A segmented buffer queuing method is used: a memory pair (header and data) is used for
all TLP types and all VCs. The memory is divided into segments for Posted, Non Posted and
Completion queues for each VC. The depth of each segment can be controlled dynamically
by writing the buffer depth related registers in Port Logic registers (PRT_LOG_R).
Posted and non-posted TLPs use store-forward mode: each TLP is stored in the queue, and
its availability is advertised only after the entire TLP has been stored. These
requests are delivered on the RTRGT0 or RTRGT1 interface, depending on the BAR
setup and on whether the TLP is of CFG type. RTRGT1 is connected to the AXI/AHB bridge
master interface (request) channel.
Completion TLPs use bypass mode: there is no receive queue in this mode, so the
application must be able to accept all traffic, because back-pressure is disabled.
These completions are delivered on the RBYP interface, which is connected to the
AXI/AHB bridge slave interface (response) channel.
22.7.5
Error handling
Errors are classified into two levels:
●
Correctable error (CORR). The PCIe core can handle the error automatically, with no
loss of information. An example is a link CRC (LCRC) error, which is fixed by a
data link layer (DLL) replay.
●
Uncorrectable error (UNCORR). The PCIe core cannot fix these errors, which are
classified as:
–
Fatal error (FATAL). The link is not functioning correctly and may require a link
reset.
–
Non-fatal error (NONFATAL). The problem is not related to link operation.
The core implements the following types of error handling.
●
PCIe baseline capability. These reporting capabilities are a minimum set, and are
required of all PCI Express devices. Error notification takes two forms:
–
Messages sent to root complex (RC).
–
Completion status errors.
This also covers mapping of PCIe errors to legacy PCI generic error handling such as
PERR# and SERR#. Many of the PCIe errors are mapped into the Status register in the
PCI compatible configuration space header.
●
PCIe Advanced Error Reporting (AER) capability. Allows more sophisticated error
reporting, control, masking and logging using the PCIe extended AER capability
register structure (ADERR_STRUC address block).
The PCIe core supports advisory reporting for both the baseline and AER capabilities,
that is, the configurable withholding of reporting for non-fatal errors (NONFATAL).
For an RC port, most errors are reported internally to the root port and no external
error notifications are generated. One exception is the unsupported request (UR)
completion status.
PCIe baseline capability
Reporting of errors is achieved by sending a notification to the RC (a CPL with UR/CA/CRS
status for Non Posted requests, and optionally an error Msg).
Whether or not an error Msg is sent is controlled by a complex set of associated control
and status bits. The status is also logged in the Device status register (DEV_CAS[31:16]) for
the following errors: unsupported request (UR), FATAL, NONFATAL and CORR.
The flow diagram in Section 6.2.5, Sequence of Device Error Signaling and Logging
Operations of the PCI express specification shows the sequence of operations related to
signaling and logging of errors detected by a PCIe device.
Table 126 shows the format of the error message sent to the RC.
Table 126. Error message (Msg) format
Message code | Note
0011_00xx | ERR_COR, ERR_NONFATAL, ERR_FATAL are encoded using 30h, 31h, 33h
Completion status errors:
Completion status errors for non posted requests may be any of the following:
●
Unsupported request (UR)
●
Configuration request retry status (CRS)
●
Completion abort (CA)
Reporting through the Device control register (DEV_CAS[31:16]):
The PCI express capability register structure provides the following support for baseline
error reporting.
●
Enable/disable error reporting (Device control register, DEV_CAS[15:0]).
●
Provide error status (Device status register, DEV_CAS[31:16]) for:
–
UR
–
Correctable error (CORR).
–
Fatal error (FATAL).
–
Non-fatal error (NONFATAL).
●
A method for software to force link retraining (Device control register, DEV_CAS[15:0]).
Advanced error reporting (AER)
AER allows more sophisticated error reporting, control, masking and logging using the
optional extended AER capability register structure (ADERR_STRUC address block).
Advanced error reporting registers:
Each possible error can be enabled, masked and assigned a severity.
There are two sets of registers (one for correctable and one for uncorrectable errors),
built from the following register types:
●
Error enable register
●
Error severity register
●
Error mask register
The correctable set of registers handles (for example) errors arising from bad DLLPs or
TLPs.
The uncorrectable set of registers handles (for example) errors arising from UR, ECRC,
malformed TLPs, buffer overflow, UC, CA, completion timeout and poisoned TLP.
Severity programming:
The uncorrectable error severity register allows each uncorrectable error to be programmed
to fatal or non fatal.
The transmission of these error messages by class (correctable, non-fatal, fatal) is enabled
using the Reporting Enable fields of the Device control register, DEV_CAS[15:0] or the
SERR Enable bit in the PCI status and command register (PCI_CONFIG_HEADER
registers, address 0x04).
The Uncorrectable Error Mask register and Correctable Error Mask register allow each
error condition to be masked independently. If messages for a particular class of error are
not enabled by the combined settings in the Device control register (DEV_CAS[15:0]) and
the PCI status and command register (PCI_CONFIG_HEADER registers, address 0x04),
then no messages of that class will be sent regardless of the values for the corresponding
mask register. If an individual error is masked when it is detected, its error status bit is still
affected, but no error reporting message is sent to the root complex, and the header log and
first error pointer registers are unmodified.
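The interaction between the class enable settings and the per-error mask bits can be modelled as follows. This is a minimal sketch under stated assumptions: the `AerOutcome` type and `aer_outcome` function are hypothetical names, `class_enabled` stands for the combined Device control / SERR Enable setting for the error's class, and an unmasked error is assumed to be eligible for header logging.

```python
from dataclasses import dataclass


@dataclass
class AerOutcome:
    status_bit_set: bool   # error status bit is updated
    message_sent: bool     # error message is sent to the root complex
    header_logged: bool    # header log / first error pointer may be updated


def aer_outcome(class_enabled: bool, masked: bool) -> AerOutcome:
    """Model the reporting rules for one detected error (illustrative only)."""
    return AerOutcome(
        # The status bit is affected even when the error is masked.
        status_bit_set=True,
        # A message needs its class enabled AND the individual error unmasked.
        message_sent=class_enabled and not masked,
        # A masked error leaves the header log and first error pointer unmodified.
        header_logged=not masked,
    )
```

With the class enabled but the individual error masked, the model reports the status bit as set, no message sent and no header logged.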
Advisory non fatal messages
The PCIe core supports advisory reporting, which is the configurable withholding of
reporting for non-fatal errors.
●
During baseline error reporting, the core produces no error message.
●
During AER, the core can instead signal a non-fatal error with ERR_COR, which
serves as an advisory notification to software.
The core always signals a fatal error with ERR_FATAL.
UR/CA advisory
The PCIe core generally sends a CPL with UR/CA status to signal an uncorrectable error for
a non-posted request.
If the severity of the UR/CA error is non-fatal, the PCIe core handles this case as an
advisory non-fatal error. By default, the PCIe core signals the non-fatal error (if enabled)
by sending an ERR_COR message.
UC advisory
When the PCIe core receives a UC and the severity of the UC error is non-fatal, the PCIe
core handles this case as an advisory non-fatal error. By default, the PCIe core signals
the error (if enabled) by sending an ERR_COR message.
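The choice of error message described in this section can be sketched as a lookup. The function name and the severity strings are assumptions made for illustration, and the sketch assumes that reporting of the relevant class is enabled.

```python
from typing import Optional


def advisory_message(severity: str, advisory: bool, aer_enabled: bool) -> Optional[str]:
    """Return the error message to send, or None if no message is sent."""
    if severity == "FATAL":
        # Fatal errors are always signalled with ERR_FATAL.
        return "ERR_FATAL"
    if severity == "NONFATAL":
        if advisory:
            # Advisory case: ERR_COR under AER, nothing under baseline reporting.
            return "ERR_COR" if aer_enabled else None
        return "ERR_NONFATAL"
    if severity == "CORR":
        return "ERR_COR"
    raise ValueError("unknown severity: %r" % severity)
```

For instance, a non-fatal UR handled as advisory produces `ERR_COR` when AER is in use and no message under baseline reporting.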
Error source classification
The following table indicates how some of the more common low level errors are classified.
Table 127. Possible causes for typical errors
Error type
Possible cause
UR (unsupported request)
Poisoned TLP (EP=1)
No BAR match
MRd length > max read request size
UC (unexpected completion)
TAG mismatch
Requester ID (RID) mismatch
CPL timeout
Remote device hung
CA (completion abort)
ECRC
Malformed TLP
Bad TLP header caused by bad link
Buffer overflow
Credit miscalculation by some PCIe device
Bad DLLP
LCRC
For a full analysis of which error conditions contribute towards a UR or CA status, see
Receive filtering on page 347.
In many cases, the standard operation may be 'masked' or ignored by setting the
corresponding bit in the “Symbol Timer and Filter Mask Register 1”.
Error detection
Built into the core are all mandatory error detections, some optional error detections, and
the error report mechanism based on the PCI express specification. The core also has an
option for the application to turn off the filter rules and perform its own error checking.
The following general rules apply to all incoming TLPs:
●
The core discards all incoming TLPs that have an invalid type field. This TLP is treated
as a “TLP-ABORT”.
●
A locally terminated TLP in which an ECRC error is detected is discarded (in
store-and-forward mode) and an ECRC error is reported only when the filter mask bit
CX_FLT_MASK_ECRC_DISCARD is not set.
●
Filter rules have no effect on received TLPs when the "DLLP-ABORT" signal is asserted.
●
If a completion of a non-posted request is not received within a completion timeout
period, this request will be treated as a completion timeout, and a non-advisory error
will be reported. See Advanced error reporting (AER) on page 356 for more details.
●
“DLLP-ABORT” is asserted as a result of one of two conditions:
–
A data link layer error is detected (for example, LCRC). A retry from a remote
device will occur.
–
UC or completion with ECRC error is detected. This condition is valid only when
the application has configured the core with infinite credits. Because the
completion buffer of the core or application has limited resources defined for
expected completions, it is necessary to avoid overflowing the completion buffer by
unexpected completions. Therefore “DLLP-ABORT” is asserted to notify the core
completion buffer (if completion is in store-forward mode) or application's
completion buffer to rewind their buffer pointers when a completion with ECRC
error or unexpected completion is detected.
●
TLP-ABORT is asserted as a result of one of three conditions:
–
Malformed TLP
–
UC
–
ECRC
22.7.6
Messages
Similar to MWr, messages (Msg/MsgD) are posted transactions. The 8-bit “Message Code”
field defines what class of message the TLP is. Some examples of typical message classes
are given below:
Table 128. Message classes based on the message code
Message code [7:0] | Message class | TLP type | Note
0001_xxxx | Power management | Msg |
0010_0xxx | Legacy PCI interrupt | Msg | Assert/Deassert for each of INT A/B/C/D
0011_00xx | Error signaling | Msg | ERR_COR, ERR_NONFATAL, ERR_FATAL are encoded using 30h, 31h, 33h
0111_11xx | Vendor defined | Msg/MsgD |
Other classes used by the PCIe core include locked transaction and slot power limit.
Note: Message signalled interrupts (MSI/MSI-X) are not messages (Msg/MsgD) but MWr TLPs.
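The message code patterns of Table 128 can be turned into a small classifier. This is an illustrative sketch only: the function and dictionary names are assumptions, and only the four classes listed in the table are distinguished.

```python
# Error signaling codes from Table 128: 30h, 31h and 33h.
ERROR_MESSAGE_CODES = {0x30: "ERR_COR", 0x31: "ERR_NONFATAL", 0x33: "ERR_FATAL"}


def message_class(code: int) -> str:
    """Classify an 8-bit Message Code using the bit patterns of Table 128."""
    if code & 0xF0 == 0x10:      # 0001_xxxx
        return "Power management"
    if code & 0xF8 == 0x20:      # 0010_0xxx
        return "Legacy PCI interrupt"
    if code & 0xFC == 0x30:      # 0011_00xx
        return "Error signaling"
    if code & 0xFC == 0x7C:      # 0111_11xx
        return "Vendor defined"
    return "Other"
```

Each mask keeps the fixed bits of a pattern and clears the bits shown as `x` in the table.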
Message generation
Messages that are transmitted by the PCI express core can potentially be derived from the
following seven sources. Referring to the circled numbers in the following diagrams,
outbound messages can be created either by:
●
The core automatically as follows:
–
Power management messages.
–
Error signaling messages.
or
●
The customer application as follows:
–
Direct supply of message TLPs at AXI bridge master
–
Vendor defined messages through the Vendor Message Interface (VMI).
–
Locked transaction messages through the SII Message interface [RC mode].
–
Legacy PCI interrupt messages through the SII Interrupt interface.
–
Error signaling messages through the SII Transmit Control interface (app_err* I/O).
Figure 116. Message transmission: EP mode
[Block diagram not reproduced. Labels recoverable from the figure: PCIe core, CXPL
core, RADM, XADM, MSG_GEN, PMC, RX pipe, TX pipe, PHY, ATU, AHB/AXI bridge slave, XALI,
VMI, SII Transmit Control, SII Interrupt. Message paths shown: application generated
messages, vendor defined, application error signalling, error signaling, legacy PCI
interrupt.]
Figure 117. Message transmission: RC mode
[Block diagram not reproduced. Labels recoverable from the figure: PCIe core, CXPL
core, RADM, XADM, MSG_GEN, PMC, RX pipe, TX pipe, PHY, ATU, AHB/AXI bridge slave, XALI,
VMI, DBI, SII Transmit Control, SII Messages. Message paths shown: application generated
messages, vendor defined, application error signalling, locked transaction.]
Table 129. Message transmission
Index(1) | Message source (type) | EP mode | RC mode
1 | Power management controller in the core (Msg) | PM_PME(2) | PME_Turn_off(3)
2 | Error signaling inside the core (Msg) | ERR_COR / ERR_NONFATAL / ERR_FATAL. See Section 22.7.5: Error handling for more details. | n/a
3 | Direct supply of any class of message (Msg/MsgD) | Access through AXI interface
3a | Indirect supply of any class of message (Msg/MsgD) | See Section 22.7.7: Address translation for more details on generating Msg/MsgD from MWr/IOWr using address translation unit (ATU).
4 | Vendor defined (Msg(4)) | The core generates vendor defined messages in response to requests on the VMI (see application registers which manage this interface).
5 | Locked transaction (Msg) | n/a | Unlock message, triggered by the root complex by setting the app_unlock_msg bit in the application register
6 | Legacy PCI interrupt (Msg) | Setting sys_int bit in application register (see Section 22.5: Interrupts) | n/a
7 | Error signaling from the application (Msg) | n/a
1. The “Index” refers to the numbers in the previous graphics.
2. Triggered by your EP application through the outband_pwrup_cmd or apps_pm_xmt_pme bits in
CR1_Register (see RM0089, Reference manual, SPEAr1340 address map and registers).
3. Triggered by your RC application through the apps_pm_xmt_turnoff bits in CR1_Register (see RM0089,
Reference manual, SPEAr1340 address map and registers).
4. MsgD not possible on VMI.
Message reception
The PCI express core can receive the following types of messages. The index in the first
column refers to the circled numbers in the following diagrams.
Table 130. Message reception
Index(1) | Message source (type) | EP mode | RC mode
1 | Power management (Msg) | PME_Turn_Off | PM_PME, PME_TO_Ack
1a | Slot power limit (Msg) | Set_Slot_Power_Limit Support Message. | n/a
2 | Error signaling from downstream component (Msg) | n/a | ERR_COR / ERR_NONFATAL / ERR_FATAL
3 | Vendor defined (Msg/MsgD) | |
4 | Locked transaction (Msg) | Unlock message | n/a
5 | Legacy PCI interrupts from downstream devices (Msg) | n/a | See PCI legacy interrupt in Section 22.5: Interrupts.
1. The “Index” refers to the numbers in the following graphics.
Figure 118. Message reception: EP mode
[Block diagram not reproduced. Labels recoverable from the figure: PCIe core, CXPL
core, RADM, RTRGT, PMC, RX pipe, TX pipe, PHY, AHB/AXI bridge master, SII Messages.
Message paths shown: vendor defined, power management, some locked transaction.]
Figure 119. Message reception: RC mode
[Block diagram not reproduced. Labels recoverable from the figure: PCIe core, CXPL
core, RADM, RTRGT, PMC, RX pipe, TX pipe, PHY, AHB/AXI bridge master, SII Messages,
SII Interrupt. Message paths shown: vendor defined, power management, some locked
transaction, error signaling, legacy PCI interrupt.]
The RADM filter processes every received message and decodes the header before
sending it to the application logic on the System Information Interface (SII). In addition,
power management messages are processed by the PCIe core Power Management
Controller (PMC). By default, all received messages are dropped (serviced internally) and
not passed to the application (through the AXI bridge master). To also send all decoded
messages to the application interface, the register fields listed in Table 131 must be
set to “1”.
Table 131. Controlling the routing of received messages
Register | Bit | Function | Default value
Filter Mask register 1 | 29 | Mask the dropping of non-vendor messages. 0: Drop, 1: Do not drop | DEFAULT_FILTER_MASK_1[13]
Filter Mask register 2 | 0 | Mask the dropping of vendor type 0 messages. 0: Drop(1), 1: Do not drop | DEFAULT_FILTER_MASK_2[0] = 0
Filter Mask register 2 | 1 | Mask the dropping of non-vendor type 1 messages. 0: Drop, 1: Do not drop | DEFAULT_FILTER_MASK_2[1] = 0
1. Vendor TYPE0 messages are dropped with UR error reporting.
For the masking (of the dropping) of vendor messages, it is not possible to differentiate
between “Vendor Message without Payload (Msg)” and "Vendor Message with Payload
(MsgD)”.
Note:
See RM0089, Reference manual, SPEAr1340 address map and registers for full details of
the Filter Mask registers.
When a message request is filtered with UR/CA/CRS status, the TLP is always dropped.
Only message requests filtered with SU (successful) status can potentially be forwarded
to the application on the AXI bridge master.
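The bit positions of Table 131 can be applied with a read-modify-write of the two register values. The sketch below is illustrative: the constant and function names are assumptions, and the code only computes new 32-bit values, leaving the actual register access to the caller.

```python
# Bit positions taken from Table 131; the names are hypothetical.
MASK1_MSG_NO_DROP = 1 << 29      # Filter Mask register 1, bit 29
MASK2_BIT0_NO_DROP = 1 << 0      # Filter Mask register 2, bit 0
MASK2_BIT1_NO_DROP = 1 << 1      # Filter Mask register 2, bit 1


def forward_all_messages(mask1: int, mask2: int):
    """Return (mask1, mask2) with the three 'do not drop' bits set.

    All other bits are preserved, so the symbol timer field sharing the
    first register is left untouched.
    """
    new_mask1 = (mask1 | MASK1_MSG_NO_DROP) & 0xFFFFFFFF
    new_mask2 = (mask2 | MASK2_BIT0_NO_DROP | MASK2_BIT1_NO_DROP) & 0xFFFFFFFF
    return new_mask1, new_mask2
```

A caller would read both registers, pass the values through `forward_all_messages`, and write the results back.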
22.7.7
Address translation
Address translation is used for mapping different address ranges to different memory
spaces supported by the application. A typical example will map the AMBA memory space
to PCIe memory space.
It can be configured (by software) to implement a customer-defined address (and
TYPE/FORMAT) translation scheme without the need for additional external hardware.
Outbound (TX) features
●
Address match mode operation for MEM/IO/CFG/MSG TLPs. No address translation
for CPL.
●
Supports TYPE translation via TLP TYPE header field replacement for MEM types to
MSG/CFG types.
–
This includes translation from posted to non posted (for example, MWr to CfgWr0).
–
No TYPE translation from CPL TLPs.
●
Programmable TLP header per region for the following fields for TLP field replacement.
–
TYPE / TD / TC / AT / ATTR / MSG Code
–
Function Number (Physical and Virtual).
●
8 address regions based on programmable registers for location and size.
●
Programmable enable/disable per region.
●
Automatic format (FMT) field translation between 3 DW and 4 DW for 64-bit addresses.
●
Invert address matching mode to translate accesses outside of a successful address
match.
●
ECAM configuration shift mode to allow a 256 MB CFG space to be located anywhere
in the 64-bit address space.
●
Supports regions from 64 kB to 4 GB in size.
Inbound (RX) features
●
Address Match Mode operation for MEM/IO/CFG/MSG TLPs. No address translation
for CPL. Selectable BAR match mode operation for IO/MEM TLPs.
–
TLPs destined for RTRGT0 (internal CDM or ELBI) will not be translated.
–
TLPs that are not error-free (ECRC, malformed and so on) will not be translated.
●
Programmable TLP header per region for the following fields for matching.
–
TYPE / TD / TC / AT / ATTR / MSG code
–
Function number (physical and virtual).
●
8 address regions based on programmable registers for location and size.
●
Programmable enable/disable per region.
●
Automatic format (FMT) field translation between 3 DW and 4 DW for 64-bit addresses.
●
Invert address matching mode to translate accesses outside of a successful address
match.
●
Configuration shift mode. Optimizes the memory footprint of CFG accesses destined
for the AXI interface in multi-function devices.
●
Response Code defines the CPL completion status to return for accesses matching a
region.
●
Supports regions from 64 kB to 4 GB in size.
The iATU registers are in the PCIe core port logic register space (See Port logic (PL)
registers (in CDM) on page 342). This may be accessed locally via the DBI interface or via
PCIe configuration accesses. The following registers are used for programming the iATU.
Table 132. Registers used for programming the iATU
Byte offset | Description
+0x200 | iATU viewport register
+0x204 | iATU region control 1 register
+0x208 | iATU region control 2 register
+0x20C | iATU region lower base address register
+0x210 | iATU region upper base address register
+0x214 | iATU region limit address register
+0x218 | iATU region lower target address register
+0x21C | iATU region upper target address register
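The register offsets of Table 132 can be collected into constants, and the writes needed to describe one region can be built as a list. This is a sketch under stated assumptions: the constant and function names are hypothetical, the viewport and control values are passed in as raw 32-bit values because their bit layout is described in RM0089 rather than here, and the write order (viewport first, control registers last) is a choice made for the example.

```python
# Byte offsets from Table 132.
IATU_VIEWPORT = 0x200
IATU_CTRL1 = 0x204
IATU_CTRL2 = 0x208
IATU_LOWER_BASE = 0x20C
IATU_UPPER_BASE = 0x210
IATU_LIMIT = 0x214
IATU_LOWER_TARGET = 0x218
IATU_UPPER_TARGET = 0x21C


def region_writes(viewport, ctrl1, ctrl2, base, limit, target):
    """Return the (offset, value) writes describing one iATU region."""
    def lo(addr):
        return addr & 0xFFFFFFFF

    def hi(addr):
        return (addr >> 32) & 0xFFFFFFFF

    return [
        (IATU_VIEWPORT, viewport),          # select the region to program
        (IATU_LOWER_BASE, lo(base)),
        (IATU_UPPER_BASE, hi(base)),
        (IATU_LIMIT, lo(limit)),            # only a 32-bit limit register exists
        (IATU_LOWER_TARGET, lo(target)),
        (IATU_UPPER_TARGET, hi(target)),
        (IATU_CTRL1, ctrl1),
        (IATU_CTRL2, ctrl2),
    ]
```

The caller would apply each write through the DBI or a PCIe configuration access, adding the offset to the base address to which the offsets in Table 132 are relative.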
22.7.8
Outbound iATU operation: address match mode
The address field of each request MEM/IO TLP is checked to see if it falls into any of the
enabled(3) address regions defined by the 'Start' and 'End' addresses as defined in
Figure 120: iATU address region mapping: outbound and inbound (address match mode). If
an address match is found, then the TLP address field is modified as follows:
Address = Address - Base Address + Target Address
and the TYPE, TD, TC, AT and ATTR TLP header fields are replaced with the corresponding
fields in iATU Region Control 1 register.
If the application address field matches more than one of the eight address regions, then the
first (lowest of the numbers from 0 to 7) enabled region to be matched is used. If there is no
address match then the address is untranslated. Figure 120: iATU address region mapping:
outbound and inbound (address match mode) provides more details on this translation
process.
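The address match rule above can be expressed in a few lines. The following is a minimal sketch, not the core's implementation: the `Region` type and `translate` function are hypothetical names, and the limit address is assumed to be inclusive.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    enabled: bool
    base: int     # start address of the region
    limit: int    # end address of the region (assumed inclusive)
    target: int   # target address


def translate(addr: int, regions: List[Region]) -> int:
    """Apply Address = Address - Base Address + Target Address."""
    # Regions are scanned from index 0 upwards, so when several enabled
    # regions match, the lowest-numbered one is used.
    for region in regions:
        if region.enabled and region.base <= addr <= region.limit:
            return addr - region.base + region.target
    # No match: the address is left untranslated.
    return addr
```

For example, with region 0 disabled and regions 1 and 2 overlapping, an address inside both is translated by region 1, and an address outside all regions is returned unchanged.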
Figure 120. iATU address region mapping: outbound and inbound (address match mode)
[Diagram not reproduced. It shows region n in the untranslated address map, bounded by
the start address (lower base address register) and the end address (limit address
register), with region size = end address - start address, mapped to region n in the
translated address map, which starts at the address given by the upper and lower target
address registers. The resulting translated address space can be 32-bit or 64-bit. A
note in the figure refers to the log of CX_ATU_MIN_REGION_SIZE.]
The upper 32 bits of the target address register will always form the upper 32 bits of the
translated address because:
●
The maximum region size is 4 GB.
●
A region may not cross a 4 GB boundary.
22.7.9
Inbound iATU operations
The main difference between Inbound and Outbound iATU operation is that the TLP TYPE is
never changed in the inbound direction. Instead, the TYPE field is used for more precise
matching. Other fields may also be optionally used to further refine the matching process.
Another difference is that for MEM/IO TLPs, you can select between address matching (as
used in outbound operation) or BAR matching. Normally an endpoint (EP) will use BAR
match mode and a root complex (RC) will use address match mode, as an RC normally has no
BARs implemented.
3. If the region enable bit of the Region Control register is '0' then that region is not used for address matching.
Lastly, for CFG0 TLPs, you can select between routing ID matching or accept mode.
If there is no match then the address is untranslated. In addition,
●
TLPs destined for RTRGT0 (internal CDM or ELBI) will not be translated.
●
TLPs that are not error-free (ECRC, malformed and so on) will not be translated.
●
Address translation of all TLP types (MEM/IO/CFG/MSG) except CPL is supported in
Address match mode. In BAR match mode only translation of IO/MEM is supported.
IO/MEM match modes
Inbound address translation for IO/MEM TLPs will operate in one of two matching modes as
determined by the 'Inbound match mode' field in the iATU Region Control 2 register.
●
Address match mode
The operation is similar to that described in Section 22.7.8: Outbound iATU operation:
address match mode. The address field of each request TLP is checked to see if it falls
into any of the enabled address regions defined by the 'Start' and 'End' addresses as
defined in Figure 120: iATU address region mapping: outbound and inbound (address match
mode). If an address match is found, then the TLP address field is modified as follows:
Address = Address - Base Address + Target Address
If the TLP address field matches more than one of the eight address regions, then the first
(lowest of the numbers from 0 to 7) enabled region to be matched is used.
Address match mode should always be used to match MSG transactions as these will never
generate a match against a BAR.
●
BAR match mode
Looking for an address match is a two-step process.
The address field of MEM/IO (only) request TLPs is checked by the standard internal PCI
Express BAR matching mechanism to see if it falls into any address region defined by the
enabled BAR addresses and masks.
If a matched BAR was found, then that matched BAR ID is compared by the iATU to the
'BAR Number' field in the iATU Region Control 2 register for all enabled regions. Figure 121:
iATU address region mapping: inbound (bar match mode) provides more details on inbound
translation in BAR match mode. BAR match mode can only be used for MEM/IO
transactions.
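The second step of BAR match mode, comparing the matched BAR ID with the 'BAR Number' field of each enabled region, can be sketched as below. The function name and the `(enabled, bar_number)` tuple layout are assumptions for illustration, and scanning from the lowest region index is also an assumption carried over from address match mode.

```python
from typing import List, Optional, Tuple


def bar_match_region(matched_bar: Optional[int],
                     regions: List[Tuple[bool, int]]) -> Optional[int]:
    """Return the index of the iATU region selected in BAR match mode.

    matched_bar is the BAR ID found by the standard BAR matching step,
    or None if the MEM/IO request matched no BAR.
    """
    if matched_bar is None:
        return None
    for index, (enabled, bar_number) in enumerate(regions):
        if enabled and bar_number == matched_bar:
            return index
    return None
```

A request that matched BAR 2 would select the first enabled region whose 'BAR Number' field is 2; disabled regions are skipped.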
Normally an EP will use BAR match mode and an RC will use address match mode, as an
RC normally has no BARs implemented or at least must handle requests which do not
match any of its BARs. However, any mode may be implemented in a device. For example,
an EP device may use address match mode, but if the address does not fall within one of
the EP's BAR ranges, the device rejects the request with Unsupported Request (UR)
completion status and no translation occurs.
When the PCIe core is operating with 32-bit BARs, the operation is defined as in Figure 121:
iATU address region mapping: inbound (bar match mode).
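The address match rule described above (translate with Address - Base Address + Target Address, lowest-numbered enabled region wins) can be sketched in a few lines of Python. This is only an illustrative software model, not the hardware implementation; the `Region` tuple layout and the function name are assumptions made for the example, and the second region is invented to show the priority rule.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical region record: (enabled, base, limit, target).
Region = Tuple[bool, int, int, int]

def translate_inbound(address: int, regions: Sequence[Region]) -> Optional[int]:
    """Return the translated address, or None when no enabled region matches."""
    for enabled, base, limit, target in regions:  # region 0 is checked first
        if enabled and base <= address <= limit:
            return address - base + target
    return None

regions = [
    (True, 0x00010000, 0x0005FFFF, 0x1000000020000000),  # region 0
    (True, 0x00010000, 0x0001FFFF, 0x2000000000000000),  # region 1, overlaps region 0
]

print(hex(translate_inbound(0x00010004, regions)))
```

An address that falls in both regions is translated by region 0, because it is the lowest-numbered enabled match.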
Figure 121. iATU address region mapping: inbound (bar match mode)
[Figure: the untranslated address map (BAR x, start address) is mapped to the translated address map (Region x), where x is the matched BAR number. The region size is set by the BAR Mask of the matched BAR. The upper and lower target addresses come from the iATU Region x registers. The resulting translated address space can be 32-bit or 64-bit.]
22.7.10
Gen2 5.0GT/s operation
The PCIe core supports all of the non-optional Gen2 5.0 GT/s features defined in
the PCI express 2.0 specification. The core operates at 125 MHz at the Gen1 rate. When
operating at the Gen2 rate, the core's clock frequency is changed to 250 MHz.
Software configuration of Gen2 5.0 GT/s operation is available through the Gen2 Related
register.
If bit 17 “Directed Speed Change” of the Gen2 Related register is set to '1', then the LTSSM
will initiate a speed change after the link is initialized. The PCIe core changes the rate signal
and waits for a pulse on the phy_mac_phystatus signal to confirm that the PHY has
accepted the requested rate.
22.7.11
Power management
An architectural overview of the power management controller is given in Section 22.6.6:
Power management control (PMC). There are two types of power management operations:
●
Software controlled PCI power management operations
●
Active state power management operation (ASPM) for PCIe device only
The L0s link state is controlled by the ASPM L0s enter condition met state. The L1 link state
is controlled either by the ASPM L1 enter condition met state, or by the D-state (D1, D2, or
D3) of the PCIe device. The D-state of the PCIe device is programmable by software. The
L2/L3 ready state is controlled by D-state and power turn-off event. The power saving of
a link increases as the link state number gets larger. Figure 122:
Relationship of power down states between link partners shows the link states of PCIe
devices and the relationships of power down states between link partners.
Figure 122. Relationship of power down states between link partners
[Figure: timing diagrams of the link states of PCIe device A and PCIe device B. They show each link leaving the idle state and entering a low-power link state when the enter condition is met for device A or device B, the enter negotiation and received enter request between link partners, and the return to idle when an exit condition is met or an exit request is received. Legend: highest power, second highest power, third highest power, lowest power.]
L0s power down
L0s is a low power state enabled by Active State Power Management (ASPM). ASPM
enabled devices can only control L0s entrance of the transmitter. The receiver L0s is
controlled by the remote device.
To enter this state, all of the following conditions must be met:
●
ASPM L0s is enabled.
●
The L0s enter conditions defined by the PCI express specification are met for a duration of
time and no higher stage of power down is requested.
●
The timeout value is controlled by the DEFAULT_L0S_ENTR_LATENCY constant
which is set to 4 us.
To exit from this state, any of the following conditions must occur:
1.
Any DLLP or TLP pending to be sent.
2.
L1 enter condition met.
3.
The PCIe link partner requests entry into link recovery.
L1 power down
L1 is a power down state enabled either by ASPM or by the software controlled D1, D2 or
D3 state (which is programmed by the system power management unit). L1 state is a bidirectional link power down state. Both link partners must negotiate to go to L1 state.
There are three possible scenarios for entering the L1 state due to ASPM (within each scenario, all conditions must be met):
Scenario 1: L1 Idle timeout From L0s
1.
ASPM L1 and L0s are enabled.
2.
Link state is in L0s for both transmitter and receiver of the link, and bit 30 of the “Ack
Frequency and L0-L1 ASPM Control register” is set to 0 (default setting) OR Link state
is in L0s of transmitter and bit 30 of the “Ack Frequency and L0-L1 ASPM Control
register” is set to 1.
3.
The L1 enter conditions defined by the PCIe specification are met for a duration of time and
no higher stage of power down is requested.
4.
The timeout value is controlled by the DEFAULT_L1_ENTR_LATENCY constant which
is set to 8 us.
Scenario 2: L1 Idle timeout from L0
1.
ASPM L1 is enabled and L0s is not enabled.
2.
Link state is in L0.
3.
The L1 enter conditions defined by the PCIe specification are met for a duration of time and
no higher stage of power down is requested.
4.
The timeout value is controlled by the DEFAULT_L1_ENTR_LATENCY constant which
is set to 8 us.
Scenario 3: Application controlled
1.
ASPM L1 is enabled.
2.
The application requests entry to L1 by asserting the app_req_entr_l1 signal.
3.
The L1 enter conditions defined by the PCIe specification are met.
To enter the L1 state due to the D1/D2/D3 states, all of the following conditions must be met:
●
All functions are programmed to the D1, D2 or D3 state.
●
The L2/L3 PM turn-off negotiation has not yet been done (L1 is always entered in this case).
To exit from the L1 state, any of the following conditions must be met:
●
Software requests a higher stage of power down.
●
Any DLLP or TLP pending to be sent.
●
Application requesting exit of L1 by asserting signal app_req_exit_l1.
●
Link partner requesting exit of L1.
Once L1 has exited, another L1 entry will not be initiated for 10us if the enter L1 condition is
due to ASPM. If the enter L1 condition is due to lower power D-state, the core will enter L1
again after a wait time of cfg_cpl_sent_count cycles, as defined in the PL register. This wait time
ensures the exit conditions have been served.
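The three ASPM L1 entry scenarios can be summarized as a small decision function. The sketch below is a simplified software model written for illustration only: the function name, the string state names and the argument list are assumptions, the "no higher stage of power down requested" check is folded into the `l1_cond_met` flag, and checking the application request first is a choice made for the example, not a documented priority.

```python
L1_ENTRY_LATENCY_US = 8  # DEFAULT_L1_ENTR_LATENCY, as stated in the text

def aspm_l1_entry_scenario(l1_en, l0s_en, tx_state, rx_state, bit30,
                           idle_us, l1_cond_met, app_req_entr_l1):
    """Return 1, 2 or 3 for the matching ASPM L1 entry scenario, or None."""
    if not (l1_en and l1_cond_met):
        return None
    if app_req_entr_l1:                       # scenario 3: application controlled
        return 3
    timed_out = idle_us >= L1_ENTRY_LATENCY_US
    if l0s_en:                                # scenario 1: idle timeout from L0s
        # bit30 == 1 means only the transmitter has to be in L0s.
        in_l0s = tx_state == "L0s" and (bit30 == 1 or rx_state == "L0s")
        return 1 if in_l0s and timed_out else None
    # scenario 2: idle timeout from L0 (L0s not enabled)
    return 2 if tx_state == "L0" and timed_out else None
```

With bit 30 cleared, a link whose receiver is still in L0 does not qualify for scenario 1, even if the transmitter has been in L0s for longer than the timeout.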
L2/L3 power down
The core has control over the L2 or L3 ready link state. After the L2/L3 ready is entered, the
downstream device will begin preparation for the power and clock removal. After main power
has been removed, the link will transition to L2 if Vaux is provided, or it will transition to L3 if
no Vaux is provided. L2/L3 ready is a bi-directional link power down state.
To enter the L2/L3 state, all of the following conditions must be met:
●
The PME_Turn_Off/PME_TO_Ack handshake has been completed in any of the D0, D1, D2 or D3
states.
●
The application indicates that it is ready to be turned off by asserting the app_ready_entr_l23 signal.
To exit from the L2/L3 state, any of the following conditions must be met:
●
Device is programmed with capability to support PME and application requests wakeup
by asserting the apps_pm_xmt_pme signal or by triggering a native hot-plug event
when D-state is in D1, D2 or D3.
●
Link partner requesting exit of L2/L3.
The core supports beacon signaling by asserting signal pm_phy_beacongen or wake when
a wake-up event is initiated by a PCIe device.
Completion timeout ranges
Timeout ranges are supported as defined in the PCI express 2.0 Specification. The Device
Capabilities 2 register (offset 24h) shows support for all ranges. The Device Control 2
register (offset 28h) will have a reset value equal to the default value in the specification:
"0000b Default range: 50 us to 50 ms". If the default value is used then the timeout will be in
"Range B: 0101b: 16ms to 55ms." This range was chosen for the default because the PCI
express 2.0 specification states "It is strongly recommended that the Completion Timeout
mechanism not expire in less than 10 ms." The following table illustrates the specification
values versus the PCI Express core values for the ranges.
Table 133. PCIe core completion timeout ranges versus PCI express specification
Range    Encoding  Spec minimum  Spec maximum  PCIe core minimum  PCIe core maximum
Default  0000b     50 µs         50 ms         28 ms              44 ms
A        0001b     50 µs         100 µs        65 µs              99 µs
A        0010b     1 ms          10 ms         4.1 ms             6.2 ms
B        0101b     16 ms         55 ms         28 ms              44 ms
B        0110b     65 ms         210 ms        86 ms              131 ms
C        1001b     260 ms        900 ms        260 ms             390 ms
C        1010b     1 s           3.5 s         1.8 s              2.8 s
D        1101b     4 s           13 s          5.4 s              8.2 s
D        1110b     17 s          64 s          38 s               58 s
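For software that needs to reason about the actual timeout window, the PCIe core columns of Table 133 can be held in a lookup table keyed by the encoding. The following is a minimal sketch; the dictionary and function names are assumptions made for the example, the values are copied from the table (reading the 0010b maximum as 6.2 ms), and times are expressed in seconds.

```python
# encoding -> (PCIe core minimum, PCIe core maximum) in seconds, from Table 133
CORE_TIMEOUT_S = {
    0b0000: (28e-3, 44e-3),    # Default
    0b0001: (65e-6, 99e-6),    # Range A
    0b0010: (4.1e-3, 6.2e-3),  # Range A
    0b0101: (28e-3, 44e-3),    # Range B
    0b0110: (86e-3, 131e-3),   # Range B
    0b1001: (260e-3, 390e-3),  # Range C
    0b1010: (1.8, 2.8),        # Range C
    0b1101: (5.4, 8.2),        # Range D
    0b1110: (38.0, 58.0),      # Range D
}

def core_timeout_window(encoding: int):
    """Return (minimum, maximum) core timeout in seconds for a 4-bit encoding."""
    if encoding not in CORE_TIMEOUT_S:
        raise ValueError(f"encoding {encoding:04b}b is not listed in Table 133")
    return CORE_TIMEOUT_S[encoding]
```

The lookup makes the point of the preceding paragraph visible: the default encoding resolves to the same window as 0101b, which stays above the 10 ms lower bound recommended by the specification.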
22.8
Programming
The programming sequence to configure the PCIe controller is given below (refer to the MISC
registers section in RM0089, Reference manual, SPEAr1340 address map and registers):
●
Enable the AXI clock to PCIe (by writing PERIP1_CLK_ENB)
●
Release AXI reset to PCIe (by writing PERIP1_SW_RST)
●
Set the PCIE_SATA_CFG register to work with PCIe and to enable clock and release
power up reset (by writing PCIE_SATA_CFG)
●
Configure PCIe module as an endpoint, a legacy endpoint or root complex PCIe
module (by writing CR0[28:25] bits)
●
Set app_ltssm_enable bit to allow LTSSM to continue Link establishment (by writing
CR0[3] bit)
●
Wait for the end of the Link Up sequence by polling the xmlh_link_up bit of the CR3 register (CR3[6])
●
Wait for the LTSSM to reach the L0 state by polling the xmlh_ltssm_state bits of the CR3 register
(CR3[4:0] = 0x11 is the expected value)
After this sequence the link is up and ready to start communication. Address
translation can then be configured as shown in the following examples.
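The two polling steps at the end of the sequence can be expressed as a single readiness check on the CR3 value. The sketch below uses only the bit positions quoted above (CR3[6] and CR3[4:0] = 0x11); the constant names, the `read_cr3` callable and the polling limit are hypothetical and stand in for whatever register access method the platform provides.

```python
CR3_XMLH_LINK_UP = 1 << 6      # CR3[6]
CR3_LTSSM_STATE_MASK = 0x1F    # CR3[4:0]
LTSSM_STATE_L0 = 0x11          # expected value once the LTSSM is in L0

def link_is_ready(cr3: int) -> bool:
    """True when the link is up and the LTSSM reports the L0 state."""
    link_up = bool(cr3 & CR3_XMLH_LINK_UP)
    return link_up and (cr3 & CR3_LTSSM_STATE_MASK) == LTSSM_STATE_L0

def wait_for_link(read_cr3, max_polls: int = 1000) -> bool:
    """Poll CR3 through the caller-supplied read_cr3() until the link is ready."""
    for _ in range(max_polls):
        if link_is_ready(read_cr3()):
            return True
    return False
```

Bounding the number of polls is a design choice for the example: it lets the caller report a link training failure instead of spinning forever.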
22.8.1
Programming example 1
Define Outbound Region 1 as:
IO region from 0x80000000_d0000000 - 0x80000000_d000ffff (64k) mapped to 0x00010000
in PCIe IO space.
1.
Set up the viewport register
Write 0x00000001 to address { 0x700 + 0x200 } to set outbound region 1 as the current
region
2.
Set up the region base and limit address registers
–
Write 0xd0000000 to address {0x700 + 0x20C} to set the lower base address.
–
Write 0x80000000 to address {0x700 + 0x210} to set the upper base address.
–
Write 0xd000ffff to address {0x700 + 0x214} to set the limit address
3.
Set up the target address registers
–
Write 0x00010000 to address {0x700 + 0x218} to set the lower target address
–
Write 0x00000000 to address {0x700 + 0x21C} to set the upper target address
4.
Configure the region via the region control 1 register
Write 0x00000002 to address {0x700 + 0x204} to define the type of the region to be IO.
5.
Enable the region
Write 0x80000000 to address {0x700 + 0x208} to enable the region.
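The register writes of this example can be captured as an ordered table and replayed through a single write helper. This is a sketch under stated assumptions: `program_region` and the `write_reg(address, value)` callback are hypothetical names, and a plain dictionary stands in for the real register bus so that the sequence can be checked on a host machine.

```python
IATU_BASE = 0x700  # every address in the example is given as {0x700 + offset}

# (offset, value) pairs in the order listed in programming example 1
OUTBOUND_REGION1_IO = [
    (0x200, 0x00000001),  # viewport: outbound region 1
    (0x20C, 0xD0000000),  # lower base address
    (0x210, 0x80000000),  # upper base address
    (0x214, 0xD000FFFF),  # limit address
    (0x218, 0x00010000),  # lower target address
    (0x21C, 0x00000000),  # upper target address
    (0x204, 0x00000002),  # region control 1: IO type
    (0x208, 0x80000000),  # region control 2: enable the region
]

def program_region(write_reg, sequence):
    """Issue the writes in order; write_reg(address, value) is supplied by the caller."""
    for offset, value in sequence:
        write_reg(IATU_BASE + offset, value)

regs = {}                                   # stand-in for the register bus
program_region(regs.__setitem__, OUTBOUND_REGION1_IO)
```

Keeping the enable write last matches the order of the steps above, so the region only becomes active once its base, limit and target are in place.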
22.8.2
Programming example 2
Define Inbound region 2 as: MEM region matching BAR4 (BAR Match mode) mapping to
0x8000000020000000 in the application memory space.
1.
Set up the viewport register
Write 0x80000002 to address { 0x700 + 0x200 } to set inbound region 2 as the current
region
2.
Set up the target address registers
Write 0x20000000 to Address {0x700 + 0x218} to set the lower target address
Write 0x80000000 to Address {0x700 + 0x21C} to set the upper target address
3.
Configure the region via the region control 1 register
Write 0x00000000 to address {0x700 + 0x204} to define the type of the region to be
MEM.
4.
Enable the region for BAR match mode
Write 0xC0000400 to Address {0x700 + 0x208} to enable the region for BAR match
mode for BAR#4.
Define Inbound Region 0 as: MEM region matching TLPs with addresses in the range
0x00010000 - 0x0005ffff mapped to 0x1000000020000000 - 0x100000002004ffff in the
application memory space
1.
Set up the viewport register
Write 0x80000000 to address { 0x700 + 0x200 } to set inbound region 0 as the current
region
2.
Set up the region base and limit address registers
Write 0x00010000 to address {0x700 + 0x20C} to set the lower base address.
Write 0x00000000 to address {0x700 + 0x210} to set the upper base address.
Write 0x0005ffff to address {0x700 + 0x214} to set the limit address
3.
Set up the target address registers
Write 0x20000000 to address {0x700 + 0x218} to set the lower target address
Write 0x10000000 to address {0x700 + 0x21C} to set the upper target address
4.
Configure the region via the region control 1 register
Write 0x00000000 to address {0x700 + 0x204} to define the type of the region to be
MEM.
5.
Enable the region
Write 0x80000000 to address {0x700 + 0x208} to enable the region in address match
mode.
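The values written to the viewport and region control 2 registers in these examples can be composed from their fields. The bit positions below are inferred from the example values only (0x80000002, 0x80000000 and 0xC0000400) and should be confirmed against the register descriptions in RM0089; the function and constant names are assumptions made for this sketch.

```python
VIEWPORT_INBOUND = 1 << 31   # inferred: set in 0x80000002, clear in 0x00000001
REGION_ENABLE = 1 << 31      # inferred: set in both region control 2 examples
MATCH_MODE_BAR = 1 << 30     # inferred: extra bit set in the BAR match example
BAR_NUMBER_SHIFT = 8         # inferred: 0x400 == 4 << 8 for BAR 4

def viewport(region: int, inbound: bool) -> int:
    """Value selecting the current region, e.g. inbound region 2."""
    return (VIEWPORT_INBOUND if inbound else 0) | region

def region_ctrl2(bar=None) -> int:
    """Enable a region in address match mode, or in BAR match mode when bar is given."""
    value = REGION_ENABLE
    if bar is not None:
        if not 0 <= bar <= 5:   # a PCI type 0 header has BAR0 to BAR5
            raise ValueError("BAR number must be in the range 0 to 5")
        value |= MATCH_MODE_BAR | (bar << BAR_NUMBER_SHIFT)
    return value

print(hex(viewport(2, inbound=True)), hex(region_ctrl2(bar=4)))
```

Calling `region_ctrl2()` without a BAR number gives the address match value used for inbound region 0.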
23
Serial ATA controllers (SATA)
This chapter focuses on SATA functionality and operation.
For the SATA feature list, refer to the SPEAr1340 datasheet:
●
Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
●
RM0089, Reference manual, SPEAr1340 address map and registers
23.1
Overview
The Serial ATA controller implements the serial advanced technology attachment (SATA)
storage interface for physical storage devices. It is compliant with Serial ATA, AMBA and
AHCI standards:
●
The Serial ATA specifications can be found at the following website:
http://sata-io.org
●
The AMBA specification can be found at the following website:
http://www.arm.com/products/system-ip/amba/amba-open-specifications.php
●
The AHCI specification can be found at the following website:
http://www.intel.com/technology/serialata/ahci.htm
The SATA controller consists of three main blocks:
●
Bus interface unit (BIU)
●
Generic registers (GCSR)
●
Port
Figure 123. SATA block diagram
[Figure: block diagram showing the Port (link layer, transport layer, RX FIFO, TX FIFO, DS FIFO, Port DMA (PDMA), Port registers (PCSR), Port power control module), its PHY interfaces with the RX clock and TX clock, the DMA and register interfaces to the bus interface unit (BIU master and BIU slave, with master and slave interfaces), the generic registers (GCSR), the application clock and the keep-alive clock.]
23.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
23.3
Clocks
The Serial ATA controller operates in four clock domains:
●
Application clock (ACLK): this is the AXI bus interface unit clock, used for the AXI master
and slave interfaces.
●
RX clock (p1_clk_rx): PHY receive clock; synchronous clock used to receive data from
the MIPHY.
Note: According to SATA specifications, this clock must never exceed the TX clock
frequency by more than 350 ppm.
●
TX clock (p1_clk_tx): PHY transmit clock; this clock is generated by the PHY for
clocking the Port link and Transport layers (TX clock domain): 37.5 MHz, 75 MHz, and
150 MHz.
●
Power module keep-alive clock (ref_clk): this free-running clock is used by the link layer
power module to facilitate power management. This clock has an allowable range of
20–150 MHz.
23.4
Resets
●
ARESET: AXI BIU reset, AXI reset for AXI master and slave interfaces. The application
must reset the BIU interface when it is asserting a reset to the core.
●
Reset_rx: PHY receive clock domain reset, asynchronous power-on reset input for the
RX clock domain.
●
Reset_tx: PHY transmit clock domain reset, asynchronous power-on reset input for the
PHY TX clock domain.
●
p1_rst_phy_n: Power module keep-alive clock domain reset, asynchronous power-on
reset input for the power module clock domain.
23.5
Interrupts
See Appendix A: Interrupts.
23.6
Functional description
23.6.1
Bus interface unit (BIU)
The bus interface unit provides two AXI interfaces:
●
AXI Master: this interface enables the SATA AHCI DMA engine to read and write to an
AXI slave connected to the AXI BIU.
●
AXI Slave: this interface enables an AXI master to read and write through the AXI BIU
to the SATA AHCI registers.
The Port DMA (PDMA) module implements the following per-port functions:
●
Connection to the AXI BIU Master using AXI-specific DMA interface
●
PRD prefetch capability of the SATA core using small PRD FIFO
The PDMA PRD prefetch logic utilizes separate AXI bus IDs for PRD data and DMA data to
enhance performance.
Figure 124 shows a detailed block diagram of the BIU.
Figure 124. Bus interface unit block diagram
[Figure: block diagram of the BIU. On the slave side, the Slave AXI GS handles the write address, write data, write response, read address and read response channels, with the read response stall and error response handling modules, the BIU register read MUX and the generic registers (GCSR). On the master side, the Master AXI GM and the AXI GM data converter handle the write address, write data, write response, read address and read data channels, and the BIU DMA arbiter connects the request and response channels to the Port.]
The AXI BIU Module provides an interface between the DesignWare SATA AHCI IP’s
application interface and the AXI interconnect. It enables a SATA AHCI Host to be
connected to an AXI slave and AXI master, thus enabling SATA-compliant devices to be
connected to the system (when Host IP is combined with a SATA-compliant PHY). This
module includes the following:
●
AXI master and slave protocol handlers
●
Internal slave and master control for generic request and response interfaces
●
Data converters
●
Register read MUX
●
DMA request arbiter
The slave and master protocol handlers support the AXI protocol conversion between an
AXI transfer and a generic transfer within the BIU, which is converted to master and slave
requests and responses. The slave also requires a Read Response Stall module to break
an input/output timing throughpath, as well as an Error Response Handling module.
Feature limitations
The following list identifies the limitations when using the AXI bridge with the SATA core.
●
For a burst transaction, if an external AXI slave is going to respond with DECERR or
SLVERR for any data beats, it must respond with DECERR or SLVERR for all data
beats in the transfer.
●
The AXI slave interface returns register data in order, as data is immediately available.
●
Writes to the AXI Slave must be performed with awid and wid “in-order”. In other words,
the ID for address and data writes must contain the same ID.
●
No support for AXI exclusive transfer.
●
SATA AHCI will always perform DMA transactions with same request IDs for each of
PRD and Data type requests, so that completions of AXI master read requests are
always returned in-order.
●
AXI bus interleave is not supported.
Writes to the AXI Slave must be performed with awid and wid “in-order”. In other words,
the ID for address and data writes must contain the same ID, and the entire write data
must be written (until wlast is asserted) before another master can access the slave.
Read response data to the AXI Master must also not be interleaved. In other words,
once a read response for a particular request has begun, it must complete before rid
can be asserted for a different response.
●
Order enforcement in the AXI BIU slave for register response data is first come, first
served only, but data is available within a few clock cycles.
AHCI Core operations
Supported AXI transfer type
The AXI BIU Module is compliant with the AMBA 3.0 AXI specification.
Supported AXI burst operations
●
For AXI Slave transfers (accessing AHCI registers): The AXI BIU supports only the
incremental burst type (INCR) and not the WRAP and FIXED burst types. INCR is used
in conjunction with ARLEN and AWLEN to define any length of burst. If an attempt is made to
read or write slave addresses beyond the register boundary, an error response is
provided for the beats that are not legal. Addresses within the AHCI
address range that are not populated or defined return 0x0 response data. No
holes are supported in bursts, including zero-write bytes in any beat.
●
For AXI Master transfers (AHCI DMA transactions): The AXI BIU supports only the
incremental burst type (INCR) and not the WRAP and FIXED burst types. INCR is used
in conjunction with ARLEN and AWLEN to define any length of burst. The application
can configure the master maximum request size that the system slaves can take. If the
slave’s maximum request size is smaller than the AHCI maximum transfer size, then
the AXI BIU will split the DMA request into two or more requests. The responses of the
split requests are required to be returned in-order per the request ID being fixed for
each of PRD and Data type requests. Finally, the AXI Master performs any request that
does not start at an AXI data bus width address boundary, by issuing one request for
data up to the data bus width amount, in order to internally align and optimize all
subsequent requests.
Supported AXI transfer size
The AXI master performs precise DMA writes via write strobes to the intended write request
locations and supports all AXI transfer sizes. DMA requests are limited to a minimum 16-bit
transfer size (see Section 23.6.3: Port). The AXI Master in some cases performs reads that
extend beyond the actual data request. If an AXI bus request length does not end in a bus-aligned address, the request will be performed with full bus width beats and the last beat
may over-read up to the full bus data width to complete the request. This excess data is
discarded internally and not written to a connected disk.
The AXI slave supports all sizes and burst lengths of an incremental type, within the AHCI
address space. The AXI BIU slave supports all non-aligned starting addresses for both read
and write register access.
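The over-read behaviour of the AXI Master can be illustrated with a short calculation. The sketch below is not the hardware algorithm: it assumes a bus-aligned start address, takes the bus width in bytes as a parameter (the width is not stated in this section), and uses a hypothetical function name.

```python
def read_beats(start: int, length: int, bus_bytes: int):
    """Return (beats, over_read_bytes) for a read that starts on a bus-aligned address."""
    if start % bus_bytes:
        raise ValueError("this sketch assumes a bus-aligned start address")
    beats = -(-length // bus_bytes)            # ceiling division: full-width beats
    over_read = beats * bus_bytes - length     # excess bytes, discarded internally
    return beats, over_read

print(read_beats(0x1000, 18, 8))
```

For an 18-byte request on an 8-byte bus, three full beats are issued and the 6 surplus bytes of the last beat are the excess data that the text describes as discarded.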
DMA transaction order enforcement through the AXI BIU Master
Order enforcement through the AXI BIU DMA is handled differently for read and write
transactions.
AXI Master bus write transfers:
The AXI BIU master will use the same ID for all DMA write transactions, thus enforcing
correct order for the write and response. PRDs are not written to the AXI bus.
AXI Master bus read transfers:
The AXI BIU master will use the same ID for all DMA read transactions, and the same
ID for all PRD read transactions (per AHCI port), though they differ between the two.
This enforces correct order on the returned data. It is not possible for there to be an
ordering issue between PRDs and Data, because Data requests are the result of
particular PRD read requests.
AXI Slave bus transfers:
The AXI BIU slave will return all data and responses in order, responding with the
appropriate ID for all read and write transactions. All register data is available within a
few clock cycles, and is handled first come first served.
Note:
The AXI Master ID is an encoded version of the Port’s one-bit ID (Data/PRD) and the Port
number; its width is determined by the number of ports. The AXI Slave ID is set via the maximum
number of Masters in the system.
AXI write with data gaps (“Holes”)
The AXI Master will never perform write access with “holes” (writes only, reads N/A).
The AXI slave allows for non-contiguous write byte enables such as 4’b0101, as long as AXI
protocol is followed. Reads must always be performed on contiguous data, as the AXI does
not have control over individual bytes via strobes.
Maximum AXI transfer burst length
The SATA AHCI core is programmed to support certain maximum transfer sizes. The AXI
BIU supports sequential burst transfers with a maximum burst length determined by the AXI
interconnect. The AXI BIU can support more than the traditional 16-beat maximum burst
length, such that bursts up to 4KB may be supported. The maximum AXI transfer length is
limited by the data bus width, the request address, and the maximum AXI burst length, in
beats. The AXI BIU supports mismatches that occur when the AXI maximum transfer length
is different than the AHCI DMA maximum request size. The DMA engine automatically
accounts for the difference and limits the requests such that the maximum AXI request size
is not exceeded, along with making sure no 4K boundaries are crossed in any single
transfer. The application must define its system requirements such as maximum DMA
request size and maximum AXI burst length. There are resources allocated within the AXI
BIU which are set based on these system parameters.
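The two limits described above (a maximum AXI request size and no crossing of a 4K boundary in any single transfer) can be modelled as a simple splitter. This is an illustrative sketch only; the function name and arguments are assumptions, sizes are in bytes, and the real DMA engine also handles the initial bus-width alignment step, which is omitted here.

```python
def split_dma_request(address: int, length: int, max_request: int):
    """Yield (address, size) pieces that fit max_request and stay inside one 4 KB page."""
    if length <= 0 or max_request <= 0:
        raise ValueError("length and max_request must be positive")
    while length > 0:
        to_boundary = 0x1000 - (address % 0x1000)   # bytes left before the next 4K boundary
        size = min(length, max_request, to_boundary)
        yield address, size
        address += size
        length -= size

for addr, size in split_dma_request(0x0F00, 0x300, 0x200):
    print(hex(addr), hex(size))
```

In this example the first piece is shortened to 0x100 bytes so that it stops at the 4K boundary, and the remaining 0x200 bytes go out as one request of the maximum size.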
23.6.2
Generic registers (GCSR)
This module implements the generic registers. Refer to the companion document RM0089,
Reference manual, SPEAr1340 address map and registers, for a detailed register
description.
23.6.3
Port
The Port instantiates the following modules:
●
Port DMA
●
Port registers
●
Transport layer
●
Link layer
●
Port power control module
Port DMA
The port DMA (PDMA) module implements the following functions:
●
Monitors commands posted by system software using P#CI register. When any of the
command slots becomes active, PDMA downloads the corresponding Register FIS
from the Command List structure and passes it to the Transport Layer TxFIFO for
transmission to the Device.
●
Controls data transfer between the Transport Layer FIFOs and system memory using
Physical Region Descriptor Tables (PRDT).
●
During Data FIS reception, PDMA requests AMBA write transfer of P#DMACR.RXTS
size from the BIU Master when RxFIFO contains data of at least this size.
●
During Data FIS transmission, PDMA requests AMBA read transfer of
P#DMACR.TXTS size from the BIU Master when TxFIFO contains space of at least
this size.
Note:
PDMA requests read transactions of P#DMACR.TXTS size regardless of the TxFIFO space
up to the PRD or Data FIS limit (whichever is smaller). When the read data is returned by the
AXI BIU Master, the data flow is controlled by the PDMA-BIU interface.
●
Transfers non-Data FISes received from the device to system memory using Received
FIS Structure.
Most of the communication between the PDMA and software is done using two system
memory descriptors that are constructed by software prior to initiating the transfer: the FIS
descriptor, which contains FISes received from the device, and the Command
List, which contains a list of 1 to 32 commands available for the Port to execute and the
pointers for data transfers. Some additional communication is done via registers located in
the GCSR and PCSR modules.
System memory structures are described in the SATA AHCI specification.
The PDMA module operates in the application clock (aclk) domain and has a 32-bit-wide data
path.
Port registers
The port registers (PCSR) module implements all port-specific registers:
●
Command list and FIS base addresses
●
Interrupt status/ enable
●
Port command/ status
●
Task file data/ signature/ serial ATA
●
DMA status/control
●
PHY status/control
Transport layer
The transport layer functional block diagram is shown in Figure 125. The transport Layer
consists of the following five main modules:
●
Receive FIFO (RxFIFO)
●
Transmit FIFO (TxFIFO)
●
Transport check module (TCHK)
●
Transport state machine module (TSM)
●
Synchronization module (APP_ASIC)
Figure 125. Transport layer functional block diagram
[Figure: block diagram spanning the TX clock domain (clk_asic#) and the application clock domain (clk_app). The Link Layer (LL) passes RX Data[32:0], RX control and Link/PHY errors to the Transport Check (TCHK) module, which issues FIFO push requests to the Receive FIFO (RxFIFO). The Transport State Machine (TSM) pops the RxFIFO, exchanges DMA control, TX control and RX Data[31:0] with the Port DMA (PDMA), and reports Link/PHY/Transport errors through the transport layer interface. The Transmit FIFO (TxFIFO) carries TX Data[32:0] from the PDMA to the Link Layer. The Sync Module (SYN) carries PHY/power management signals between the two clock domains. FIFO flags, the FIFO almost full flag and FIFO push/pop requests are also shown.]
The transport layer operates in two clock domains: transmit and application. Transmit clock
is generated in the PHY and depends on the Link Layer data path width (valid frequency
values are: 37.5 MHz, 75 MHz, and 150 MHz). The application clock is sourced from the
system bus and depends on the software. Both transmit and receive data paths are 32-bit
wide.
The Transport Layer block provides FIS reception and transmission functions of the SATA
Transport Layer. During reception the Transport Layer receives a new FIS from the link layer
through the RxFIFO, decodes the FIS type, and instructs the PDMA to route the FIS payload
data to the appropriate location in system memory. During transmission the Transport Layer
instructs the PDMA to construct the appropriate FIS, and then passes it to the Link Layer
through the TxFIFO. The transport layer block receives all the PHY/Link errors from the Link
Layer, detects Transport errors, and passes them to the PCSR for setting the corresponding
error bits.
The Transport Layer processes one FIS at a time on the transmit side, meaning only one FIS
is allowed in the TxFIFO at a time. On the receive side, RxFIFO can potentially contain more
than one FIS at a time. For example, when the device transmits several DMA Data FISs
back-to-back with minimal delay, RxFIFO might still have the previous Data FIS while the
next FIS is being received.
Transport layer FIS reception
The FIS reception process is described as follows:
●
The Link Layer starts frame reception and passes FIS content to the transport layer
TCHK. RxFIFO “almost full” flag notifies the link layer to send HOLDp to the device to
prevent RxFIFO overflow. Upon detecting EOFp, the link layer asserts an “End status”
signal to indicate the end of the FIS. All link layer/PHY errors are valid at this time.
●
TCHK module checks for transport layer protocol errors, passes FIS data to the
RxFIFO, then appends “End status” DWORD at the end with all the link/PHY and
transport errors.
●
TSM module receives the FIS from the RxFIFO and passes it to the PDMA/PCSR.
When any of the Link/PHY/Transport errors is detected, then the FIS is either ignored
(when non-Data FIS) or the transfer is aborted (when Data FIS) and the corresponding
bits are set in the P#SERR register.
Transport layer FIS transmission
The FIS transmission process is described as follows:
●
The PDMA detects a request from the system software and notifies the TSM to enter a
transmit state. The DMA data transmission is activated by the TSM after it receives
DMA Activate FIS from the device.
●
The PDMA receives the appropriate FIS from BIU Master and pushes it into the
TxFIFO. The following FIS types are supported:
–
Register FIS - Control or Command type.
–
Data FIS - PIO or DMA type.
–
BIST Activate FIS
●
The link layer uses negation of the TxFIFO “empty” flag to generate SOFp and begin
frame transmission. Bit 32 of the TxFIFO is used to indicate the FIS “last DWORD” to
the Link Layer. When the Link Layer sees this bit valid, it closes the frame with CRC
and EOFp.
●
The TSM waits for either positive or negative frame transmission acknowledgement
from the Link Layer (Link Layer “handshake” error). Both of these conditions are
passed from Link to TSM in the “End Status” DWORD. Negative acknowledgement is
generated when the device detects an error during the frame reception and signals it to
the host Link Layer. In this case any non-data FIS is resent to the device using
Transport Layer retry logic. When the error is detected during Data FIS transmission,
then this transfer is aborted and the FIS is not resent.
Note:
When neither positive nor negative acknowledgement is received from the Link Layer
following frame transmission, host software times out and resets the interface.
Receive/Transmit FIFO (RX/TxFIFO)
Both receive and transmit FIFOs are used as temporary FIS buffers and for clock domain
crossing.
The RxFIFO width is 33 bits: 32 bits are used to transfer data and the 33rd bit is used to
indicate the End Status DWORD so the Transport Layer can detect the end of the previous
FIS and the start of the next FIS in the situation when more than one FIS is in the RxFIFO.
The TxFIFO is 33 bits wide: 32 bits are used to transfer data and 33rd bit is used to indicate
the last FIS DWORD to the Link Layer. Both FIFOs are reset on power-up either by the
system bus reset signal, by the software setting SControl register bit 0 (COMRESET), or by
the COMINIT condition.
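As an illustration of the 33-bit entry format, the sketch below tags FIFO entries in software. It is a model of the description above, not of the hardware implementation; the function names and the use of Python integers for FIFO entries are assumptions made for the example.

```python
FLAG_BIT_32 = 1 << 32  # 33rd bit of a FIFO entry


def to_txfifo_entries(fis_dwords):
    """Tag the final DWORD of a FIS with bit 32, as described for the TxFIFO."""
    if not fis_dwords:
        raise ValueError("a FIS contains at least one DWORD")
    entries = [dword & 0xFFFFFFFF for dword in fis_dwords]
    entries[-1] |= FLAG_BIT_32
    return entries


def split_rxfifo(entries):
    """Split a stream of RxFIFO entries into (data_dwords, end_status) tuples.

    An entry with bit 32 set is the End Status DWORD that closes the current FIS.
    """
    fises, current = [], []
    for entry in entries:
        if entry & FLAG_BIT_32:
            fises.append((current, entry & 0xFFFFFFFF))
            current = []
        else:
            current.append(entry)
    return fises
```

With this model, two FISes queued back to back in the RxFIFO are separated purely by the position of the flagged End Status entries.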
Based on the system bus software requirements, a value of 1024 (2048 DWORDS) was
selected for FIFO. An RxFIFO "almost full" flag is set to comply with the SATA HOLDp
latency requirement: the Link Layer sends HOLDp on the back channel when this flag is
asserted to prevent RxFIFO overflow.
Data is read from the RxFIFO or written into the TxFIFO by the PDMA when there is enough
data in the RxFIFO or room in the TxFIFO for a given DMA transaction size.
Transport check (TCHK)
The TCHK module provides the following functions:
●
Detects new FIS reception by the Link Layer based on the received control signals.
●
Decodes the FIS type located in the least-significant byte of the first DWORD and
checks its validity. The following FIS types are supported:
–
Register FIS
–
Set Device Bits FIS
–
PIO Setup FIS
–
DMA Activate FIS
–
DMA Setup FIS
–
Data FIS
–
BIST Activate FIS
–
Unknown FIS (length is less than or equal to 64 bytes)
●
Checks for all the Transport Layer errors (unrecognized FIS, protocol, transition, etc.).
●
Detects an “End Status” signal assertion indicating the end of the current FIS from the
Link Layer and passes all Link Layer/PHY/Transport Layer errors to the RxFIFO and to
the PCSR module.
●
The TCHK provides “Good FIS/Bad FIS” status acknowledgement to the Link Layer at
the end of the received FIS.
The TCHK module receives 32-bit FIS DWORD data from the Link Layer and adds one bit
(bit 32) before writing it to the RxFIFO. This bit indicates either FIS data, when cleared, or
the “End Status” DWORD, when set. The following Transport Layer errors are checked in the
TCHK (assuming no errors were detected in the Link/PHY):
1.
FIS length:
–
Non-data FIS according to the FIS type
–
Data FIS should be between 2 and 2049 DWORDs
–
Unknown FIS should be between 1 and 16 DWORDs
2.
PIO Setup FIS transfer count - should be non-zero and even byte count and not exceed
8192 bytes
3.
PIO Data FIS following the PIO Setup FIS with D=1 (PIO read) DWORD count - should
match the transfer count
4.
PIO read protocol FIS sequence - only a Data FIS, or End Status in case of error, is
expected after the PIO Setup FIS with D=1; any other FIS is negatively acknowledged to
the Link Layer
5.
DMA Setup FIS buffer offset - bits 0 and 1 should be cleared and transfer count should
be an even (not zero) number
6.
First Party DMA read protocol - DMA Setup FIS with D=1 is followed either by a Data FIS,
a Set Device Bits FIS, or End Status in case of error
7.
First Party DMA write protocol - DMA Setup FIS with D=0 is followed by DMA Activate
FIS (when A=0) or Set Device Bits FIS or end status (when A=1)
8.
BIST Activate FIS is supported type only
9.
RxFIFO push error for Data FIS - detected when Link has valid data and RxFIFO is “full”
(for example, device violates HOLD latency requirement)
The Transport Transition Error P#SERR.DIAG_T bit is set when errors 1 to 8 are detected. The
Unknown FIS P#SERR.DIAG_F bit is set when the Unknown FIS length does not exceed 64
bytes. The Protocol Error P#SERR.ERR_P bit is set on detection of error 9.
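Checks 1, 2 and 5 above lend themselves to a short software model. The sketch below is not the TCHK implementation. The FIS type codes and the fixed lengths of the non-data FIS types are taken from the Serial ATA specification rather than from this manual, and all function names are hypothetical.

```python
# FIS type codes and fixed lengths (in DWORDs) as defined by the Serial ATA specification.
FIS_REG_D2H, FIS_DMA_ACTIVATE, FIS_DMA_SETUP = 0x34, 0x39, 0x41
FIS_DATA, FIS_BIST_ACTIVATE, FIS_PIO_SETUP, FIS_SET_DEV_BITS = 0x46, 0x58, 0x5F, 0xA1

FIXED_LENGTH = {
    FIS_REG_D2H: 5, FIS_SET_DEV_BITS: 2, FIS_PIO_SETUP: 5,
    FIS_DMA_ACTIVATE: 1, FIS_DMA_SETUP: 7, FIS_BIST_ACTIVATE: 3,
}


def fis_type(first_dword):
    """The FIS type is the least-significant byte of the first DWORD."""
    return first_dword & 0xFF


def length_ok(first_dword, n_dwords):
    """Check 1: FIS length against the limits listed above."""
    kind = fis_type(first_dword)
    if kind == FIS_DATA:
        return 2 <= n_dwords <= 2049
    if kind in FIXED_LENGTH:
        return n_dwords == FIXED_LENGTH[kind]
    return 1 <= n_dwords <= 16  # Unknown FIS: at most 64 bytes


def pio_setup_count_ok(byte_count):
    """Check 2: non-zero, even, and not more than 8192 bytes."""
    return byte_count != 0 and byte_count % 2 == 0 and byte_count <= 8192


def dma_setup_ok(buffer_offset, transfer_count):
    """Check 5: buffer offset bits 0 and 1 cleared, transfer count even and non-zero."""
    return (buffer_offset & 0x3) == 0 and transfer_count != 0 and transfer_count % 2 == 0
```

A failed check of this kind corresponds to the P#SERR.DIAG_T condition described above; the protocol sequence checks (3, 4, 6 and 7) need state across FISes and are not modelled here.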
Transport state machine (TSM)
The TSM module provides the following functions:
●
Implements the host Transport Layer state machine according to the SATA spec with
the exception of the FIS checking and error handling functions.
●
Decodes the FIS type by reading the least-significant-byte of the first DWORD of the
FIS.
●
Detects the “End status” DWORD and checks for any Link Layer/PHY/Transport Layer
errors. When any of the errors is detected:
–
On a non-data FIS, the received FIS is discarded, the transmitted FIS is retried
indefinitely, and the corresponding P#SERR register ERR_I bit is set.
–
On a data FIS, it can be passed to the system memory before the final status is
reflected in the P#SERR register ERR_T bit.
●
Generates/receives the appropriate control signals to/from the PDMA based on the
received FIS and its state.
●
Handles transfer termination requests originated from the Link Layer or PDMA module.
Sync module (APP_ASIC)
This module is used to synchronize several control signals between the Link Layer and the
Transport Layer clock domains.
Link layer
The Link layer functional block diagram is shown in Figure 126.
Figure 126. Link layer functional block diagram
[Block diagram not reproduced. Receive path blocks: RX PHY control signal decode, 8b10b
decoding and data alignment, optional synchronization (clk_rbc#/clk_asic0), data conversion,
descrambler, repeat primitive drop, deframer, RX CRC check and output register, RX data
error results. Transmit path blocks: framer, CRC calculator, shared data and RPD scrambler,
data converter, 8b10b encoding. Control blocks: main Link state machine, PHY/Link
initialization state machine, optional RX OOB detection (SigDetect), optional TX OOB
generation, BIST data generation and BIST data checker.]
On power-up, system reset or device hot-plug, the following sequence occurs:
1.
The Link Layer transmits sequences of control data and ALIGN Primitives to the PHY.
2.
They are then forwarded to a device PHY as OOB signaling.
3.
In addition, the Link Layer detects OOB sequences.
These OOB sequences bring the host controller, PHY, and device to an initialized condition.
Once this occurs:
1.
The Link Layer passes a PHY Ready status to the Transport Layer and normal
communication begins.
2.
The Link Layer receives requests from the Transport Layer to transmit data, in the form
of a Frame Information Structure (FIS) comprised of DWORDs, to a device via the local
PHY.
3.
The Link Layer in turn transmits the FIS by inserting Primitives, scrambling and
optionally encoding the data, sending it to the PHY and waiting for status.
4.
When a status FIS is received, the Link Layer optionally decodes, aligns and descrambles the data, removes Primitives and forwards the data to the Transport Layer.
5.
The Link Layer then notifies the Transport Layer of the ending transfer status. The Link
Layer has no notion of the FIS content, other than its beginning and end points and
CRC.
6.
Data alignment is performed on received FIS data via ALIGN Primitives. Flow control is
also achieved on FIS going in either direction via HOLD Primitives.
7.
In addition, the Link Layer receives requests from the Transport and PHY Layers to go
into and out of power management modes.
Power management is achieved by notifying the PHY of a partial or slumber condition
and then disabling normal data transmission on PHY RX and TX interfaces until a
wake-up request from Transport Layer or remote device via the PHY is seen from the
Power Control Module. Power management is controlled via Partial and Slumber
requests as described in the SATA specifications.
The Initialization State Machine controls the Link Layer, PHY and device system
initialization. The main Link Layer State Machine controls FIS traffic, flow control and error
detection and status reporting. FIS traffic is generated and disassembled via Framer and
Deframer modules. The Link Layer also performs CRC calculations on FIS, as well as
scrambling and optionally encoding the data.
●
Optional decoding of received FIS is performed in the Rx clock domain due to the fact
that the incoming FIS is on an asynchronous, but frequency locked clock of the same
rate as the Tx clock domain.
●
8b/10b encoding and decoding are performed in the Link Layer.
The Link Layer receives data on either Rx clock, recovered from the incoming data stream
by the PHY, or on Tx clock. This single receive clock is then used in this module to decode
data and control signals from the PHY and pass it to the rest of the Link Layer. Data is
passed through a synchronizing Datastream FIFO. ALIGN Primitives are also detected and
dropped in the front end of the receiver as a means of guaranteeing no Datastream FIFO
overruns, when a Datastream FIFO is included. ALIGN Primitives are also used to
synchronize to the data stream in the PHY by triggering data realignment where necessary.
Finally, ALIGNs are required by the TX OOB initialization state machine to complete
initialization, following the SATA specifications. For this reason, the PHY must indicate the
presence of at least two ALIGNs after the Link Layer detects the release of COMWAKE.
Otherwise, the Link Layer is not able to complete initialization and begin normal operation.
This is required regardless of whether the PHY drops ALIGNs at any other time.
Note:
Even if the PHY drops ALIGNs, data indicating the comma character must be present on
phy_rx_data, in the corresponding phy_comma_det slot. This is required to invalidate
comma characters before they are stable.
Link layer features
The SATA Link Layer features are as follows:
●
Highly configurable PHY interface with selectable data widths
●
Optional RX Data Buffer for recovered clock systems
●
Optional OOB signaling and system Initialization
●
1.5 Gb/s and 3.0 Gb/s speed negotiation when TX OOB signaling is selected
●
Frame negotiation and arbitration
●
Envelope framing/deframing
●
CRC calculating, insertion and checking
●
Optional 8b/10b encoding/decoding
●
Flow control
●
Frame acknowledgement and status reporting
●
Data width conversions
●
Data scrambling/de-scrambling for EMI reduction
●
Repeat Primitive data transmission and reception handling
●
ALIGN Primitive detection, dropping and data alignment
●
Power management support
Configurable PHY interface
Many of the SATA features are detailed in the SATA specifications, and are not repeated
here.
RX and TX Data
The SATA PHY interface data width is 20 bits for both RX and TX (16+4 for the 8b/10b
encoding).
The Link Layer can receive data on a clock recovered from the incoming data stream (Rx
clock), or the data can already be synchronized into the Tx clock domain. When data is
presented to the Link Layer on a recovered clock, the Link Layer synchronizes data into the
Tx clock domain via a Datastream FIFO.
Port power control module
The port power control module (PCM) implements the following functions:
●
Monitors Transport, Link and PHY ready/not ready conditions, as well as Device and
Host power requests.
●
Systematically controls the Link and Transport Layer transitions into and out of offline
conditions (system reset, COMRESET and power modes).
●
Allows Tx clock and Rx clock to be stopped during Slumber and Partial power modes.
The PCM main function is to allow disabling Tx clock and Rx clock in SATA power down
modes.
Note:
If Tx clock or Rx clock are stopped, Near End Analog Loopback mode is not supported
when a device is connected to the system. Therefore, it is recommended to only stop clocks
in Slumber mode, in order to support Near End Analog Loopback mode when a device is
connected.
In order to support Host-initiated power modes where Rx clock and Tx clock are removed,
the PMACK received from the Device must be able to make it through the Rx clock domain,
synchronization, and the Link Layer Tx clock domain RX Data path to the Link state
machine, before the clocks can be removed. The SATA specifications allow a Device to
transmit 4 to 16 PMACKs before going into power down. While 16 PMACKs are enough to
guarantee receipt by the Link state machine, 4 are not. In the cases where a Device does
not send enough PMACKs, the clocks will need to be kept running long enough for the Link
state machine to detect the PMACK, or the Host will not go completely into power down
mode and a Host COMRESET would be required to exit the failed power mode.
Figure 127 shows a high-level state diagram of the power control module.
Figure 127. Port power control module diagram
[Diagram not reproduced. Clock domains shown: Rx clock domain (RX data from PHY, RX front
end), Rx clock/Tx clock synchronization data stream FIFOs, Rx OOB detection clock domain
(sigdet, RX OOB detect), Tx clock domain (Link layer, TX data to PHY) and the power control
module in the always-alive clock domain. Signals shown at the power control module:
COMWAKE/COMINIT, Tx clock wake-up, Tx clock partial/slumber, Tx clock power mode request,
PHY Slumber, power mode request and enable from the Transport Layer, wake-up from the
Transport Layer, and Partial/Slumber to the Transport Layer. The power control module
systematically controls power-down and wake-up, allowing the PHY to remove clk_asic; once
PHY Slumber is asserted, the PHY can remove Tx clock.]
Note: Regardless of whether OOB detection is in the PHY or SATA, the power control module
always uses COMWAKE and COMINIT.
Note: There are clock-crossing synchronizers on all power control module I/O.
The power control module resides in the “always alive” domain, which is clocked by the
power module keep-alive clock. This clock must always be present and must never
change frequency. All signals into and out of the PCM are synchronized between the power
module keep-alive clock, Tx clock, Rx clock, and the application clock (aclk) domains
with one or more synchronizers. The Power Control Module ensures that all SATA
Layers and the PHY move correctly between inactive and active states in unison.
Note:
Within the core there is no difference between going into and out of Partial and Slumber
power modes, even if a system disables Tx clock and Rx clock in one mode, but not the
other. Clocks do not have to be removed in either mode.
23.7
Programming
23.7.1
Software initialization
The SATA software initialization consists of two independent phases: a firmware phase
(platform BIOS) and a system software phase. This section contains the following topics:
●
Firmware specific initialization
●
System software specific initialization
Firmware specific initialization
The firmware initialization is done on power-up. The following registers should be initialized
to values that reflect the capabilities supported by the platform:
●
CAP.SSS= support for staggered spin-up
●
CAP.SMPS= support for mechanical presence switches
●
PI= ports implemented
●
P#CMD.HPCP= whether the Port is hot plug capable. The P#CMD.HPCP should be set
to 1 when P#CMD.MPSP or P#CMD.CPD is set to 1 for the Port.
●
P#CMD.MPSP= whether mechanical presence switch is attached to the Port.
●
P#CMD.CPD= whether cold presence detect logic is attached to the Port.
Firmware should initialize the HPCP, MPSP, and CPD bits for each port implemented on the
platform as defined by the PI register.
After firmware has initialized the above mentioned registers, it should then perform the
following steps to complete the staggered spin-up process (when applicable to the platform)
on each port implemented (as indicated by the PI register):
1.
Ensure that P#CMD.ST=0, P#CMD.CR=0, P#CMD.FRE=0, P#CMD.FR=0, and
P#SCTL.DET=0.
2.
Allocate memory for the command list and the FIS receive area. Set P#CLB and
P#CLBU to the physical address of the allocated command list. Set P#FB and P#FBU
to the physical address of the allocated FIS receive area. Then set P#CMD.FRE to 1.
3.
Initiate a spin-up of the SATA drive attached to the Port by setting P#CMD.SUD to 1.
4.
Wait for a positive indication that a device is attached to the Port (the maximum time to
wait for presence indication is specified in the Serial ATA specification). This is done by
polling P#SSTS.DET.
When P#SSTS.DET reads 1h or 3h, the firmware should continue to the next step.
Otherwise, when the polling process times out, the firmware moves to the next
implemented Port and returns to Step 1.
5.
Clear the P#SERR register by writing ones to each implemented bit location.
6.
Wait for indication that SATA drive is ready. This is determined through examination of
P#TFD.STS. When P#TFD.STS.BSY, P#TFD.STS.DRQ, and P#TFD.STS.ERR are all
0, prior to the maximum allowed time as specified in the ATA/ATAPI-7 specification, the
device is ready.
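Steps 1 to 6 can be sketched as follows. This is an illustrative model only, not ST firmware: `regs` is a hypothetical dict-like view of one Port's registers keyed by short names, the bit positions are assumed from the AHCI specification (check them against RM0089), and the time-out values are placeholders for the limits given in the Serial ATA and ATA/ATAPI-7 specifications.

```python
import time

# Bit positions assumed from the AHCI specification; confirm against RM0089.
CMD_ST, CMD_SUD, CMD_FRE, CMD_FR, CMD_CR = 1 << 0, 1 << 1, 1 << 4, 1 << 14, 1 << 15
TFD_ERR, TFD_DRQ, TFD_BSY = 1 << 0, 1 << 3, 1 << 7


def poll(condition, timeout_s, interval_s=0.001):
    """Call condition() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def spin_up_port(regs, clb, fb, detect_timeout_s, ready_timeout_s):
    """Run steps 1 to 6 on one Port and return a short status string."""
    # Step 1: this sketch only checks the idle condition; it does not force it.
    if regs["CMD"] & (CMD_ST | CMD_CR | CMD_FRE | CMD_FR) or regs["SCTL"] & 0xF:
        return "not-idle"
    # Step 2: program the command list and FIS receive area, then enable FIS receive.
    regs["CLB"], regs["CLBU"] = clb & 0xFFFFFFFF, clb >> 32
    regs["FB"], regs["FBU"] = fb & 0xFFFFFFFF, fb >> 32
    regs["CMD"] |= CMD_FRE
    # Step 3: request spin-up.
    regs["CMD"] |= CMD_SUD
    # Step 4: wait for device presence (P#SSTS.DET = 1h or 3h).
    if not poll(lambda: (regs["SSTS"] & 0xF) in (0x1, 0x3), detect_timeout_s):
        return "no-device"
    # Step 5: clear P#SERR by writing ones (all 32 bits here; real code may mask).
    regs["SERR"] = 0xFFFFFFFF
    # Step 6: the device is ready when BSY, DRQ and ERR all read 0.
    if not poll(lambda: (regs["TFD"] & (TFD_BSY | TFD_DRQ | TFD_ERR)) == 0, ready_timeout_s):
        return "not-ready"
    return "ready"
```

When the function returns `"no-device"`, the caller would move on to the next implemented Port, as described in step 4.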
System software specific initialization
Software may perform the SATA global reset prior to initializing by setting GHC.HR to 1
when desired. When firmware (BIOS) already allocated memory and initialized the
appropriate registers for the command list and FIS receive area, the software may skip this
step in the process.
Following is the list of steps for system software to place the SATA into a minimally initialized
state:
1.
Determine which ports are implemented by the SATA, by reading the PI register. This
bit map value aids the software to determine how many ports are available and which
Port registers need to be initialized.
2.
Ensure that the SATA is not in the running state by reading and examining each
implemented Port's P#CMD register. When P#CMD.ST, P#CMD.CR, P#CMD.FRE and
P#CMD.FR are all cleared, the Port is in an idle state. Otherwise, the Port is not idle
and should be placed in the idle state prior to manipulating the SATA global and Port
specific registers. System software places a Port into the idle state by clearing
P#CMD.ST and waiting for P#CMD.CR to return 0 when read. Software should wait at
least 500ms for this to occur. When P#CMD.FRE is set to 1, software should clear it to
0 and wait at least 500ms for P#CMD.FR to return 0 when read. When P#CMD.CR or
P#CMD.FR do not clear to 0 correctly, then software may attempt a Port reset or a
global reset to recover.
3.
Determine how many command slots the HBA supports, by reading CAP.NCS.
4.
For each implemented Port, system software should allocate memory for and program:
–
P#CLB and P#CLBU (when CAP.S64A is set to 1)
–
P#FB and P#FBU (when CAP.S64A is set to 1)
It is good practice for system software to zero-out the memory allocated and referenced by
P#CLB and P#FB. After setting P#FB and P#FBU to the physical address of the FIS receive
area, system software should set P#CMD.FRE to 1.
5.
For each implemented Port, clear the P#SERR register, by writing ones to each
implemented bit location.
6.
Determine which events should cause an interrupt, and set each implemented Port's
P#IE register with the appropriate enables. To enable the SATA to generate interrupts,
system software must also set GHC.IE to 1.
Due to the multi-tiered nature of the SATA interrupt architecture, system software must
always ensure that the P#IS (clear this first) and IS.IPS (clear this second) registers are
cleared to ‘0’ before programming the P#IE and GHC.IE registers. This prevents any
residual bits set in these registers from causing an interrupt to be asserted.
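The ordering constraint above can be made explicit in code. The following is a sketch under stated assumptions: `LoggedRegs` is a hypothetical stand-in for memory-mapped register access that only records write order, P#IS and IS.IPS are assumed to be write-1-to-clear, and the GHC.IE bit position is assumed from the AHCI specification.

```python
GHC_IE = 1 << 1  # bit position assumed from the AHCI specification


class LoggedRegs(dict):
    """Dict-like stand-in for a register block that records the order of writes."""

    def __init__(self, name, log, **values):
        super().__init__(**values)
        self.name, self.log = name, log

    def __setitem__(self, key, value):
        self.log.append(f"{self.name}.{key}")
        super().__setitem__(key, value)


def enable_interrupts(hba, port, port_index, enables):
    """Clear stale status first, then program the enables."""
    port["IS"] = port["IS"]            # 1. P#IS: write back the set bits (write-1-to-clear assumed)
    hba["IS"] = 1 << port_index        # 2. IS.IPS: clear this Port's pending bit
    port["IE"] = enables               # 3. per-Port interrupt enables
    hba["GHC"] = hba["GHC"] | GHC_IE   # 4. global interrupt enable
```

Because `LoggedRegs` is a plain dictionary, it does not emulate the clearing behaviour itself; it exists only to show the write sequence.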
Software should not set P#CMD.ST to 1 until it is determined that a functional device is
present on the Port as determined by P#TFD.STS.BSY, P#TFD.STS.DRQ,
P#TFD.STS.ERR bits all cleared, and P#SSTS.DET=3h. To enable the P#TFD register to be
updated with the initial Register FIS for a Port, the P#SERR.DIAG_X bit must be cleared to
0.
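The procedure for placing a Port into the idle state (step 2 above) can be sketched as shown below. The register view `regs`, the helper names and the polling interval are assumptions for illustration; bit positions are assumed from the AHCI specification, and the 500 ms limit is the one quoted in step 2.

```python
import time

# Bit positions assumed from the AHCI specification; confirm against RM0089.
CMD_ST, CMD_FRE, CMD_FR, CMD_CR = 1 << 0, 1 << 4, 1 << 14, 1 << 15


def wait_clear(regs, name, mask, timeout_s):
    """Poll regs[name] until all bits in mask read 0; False on time-out."""
    deadline = time.monotonic() + timeout_s
    while regs[name] & mask:
        if time.monotonic() >= deadline:
            return False
        time.sleep(0.001)
    return True


def place_port_idle(regs, timeout_s=0.5):
    """Return True once ST, CR, FRE and FR all read 0.

    False means P#CMD.CR or P#CMD.FR did not clear, in which case software
    may attempt a Port reset or a global reset.
    """
    if regs["CMD"] & CMD_ST:
        regs["CMD"] &= ~CMD_ST
    if not wait_clear(regs, "CMD", CMD_CR, timeout_s):
        return False
    if regs["CMD"] & CMD_FRE:
        regs["CMD"] &= ~CMD_FRE
    return wait_clear(regs, "CMD", CMD_FR, timeout_s)
```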
23.7.2
Software manipulation of Port DMA
This section contains the following topics:
●
Start (P#CMD.ST)
●
FIS Receive Enable (P#CMD.FRE)
Start (P#CMD.ST)
When P#CMD.ST is set to 1, software is not allowed to perform the following actions:
●
Manipulate P#CMD.POD to power on or off a device through cold presence detect logic
(when supported by the platform and enabled in the SATA);
●
Manipulate P#SCTL.DET to change the PHY state;
●
Manipulate P#CMD.SUD to spin-up the device (when supported by the platform)
The above actions are only allowed while the Port is in the Not Running state, indicated by
both P#CMD.ST and P#CMD.CR being 0.
Software should set P#CMD.ST only after the following conditions become true:
●
P#CMD.CR is verified to be cleared to 0 and P#CMD.FRE has been set to 1;
●
A functional device is present on the Port (as determined by P#TFD.STS.BSY=0,
P#TFD.STS.DRQ=0, and P#SSTS.DET=3h) and P#CLB/P#CLBU are programmed to
valid values.
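The two conditions above amount to a simple predicate on the Port registers. The sketch below assumes AHCI bit positions (to be confirmed against RM0089); the function name and the `clb_programmed` flag are hypothetical.

```python
# Bit positions assumed from the AHCI specification.
CMD_FRE, CMD_CR = 1 << 4, 1 << 15
TFD_DRQ, TFD_BSY = 1 << 3, 1 << 7


def may_set_start(cmd, tfd, ssts, clb_programmed):
    """Return True when software may set P#CMD.ST, given raw register values."""
    dma_idle = (cmd & CMD_CR) == 0
    fis_receive_enabled = (cmd & CMD_FRE) != 0
    device_ready = (tfd & (TFD_BSY | TFD_DRQ)) == 0
    device_present = (ssts & 0xF) == 0x3
    return (dma_idle and fis_receive_enabled and device_ready
            and device_present and bool(clb_programmed))
```

For instance, a Port with FRE set, a task file status of 50h and P#SSTS.DET = 3h passes the check once P#CLB/P#CLBU hold valid values.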
FIS Receive Enable (P#CMD.FRE)
When P#CMD.FRE is set (causing P#CMD.FR to be set to 1), the Port receives FISes from
the devices and copies them into system memory. When P#CMD.FRE is cleared (causing
P#CMD.FR to be cleared to 0), received FISes are held in the RxFIFO, and when it is full,
further FIS reception is blocked.
Software is allowed to manipulate P#CMD.FRE so that it may move the FIS receive area to
a new location. When this bit is cleared to 0, software must first wait for P#CMD.FR to clear
to 0, indicating that the Port DMA engine for FIS reception is in an idle condition. When
P#CMD.FR and P#CMD.FRE are both cleared to 0, software may update the values of
P#FB and P#FBU. Prior to setting P#CMD.FRE to 1, software should ensure that P#FB and
P#FBU are set to valid values. Software should not write P#FB/P#FBU while P#CMD.FRE is
set to 1.
Software should set P#CMD.FRE to 1 prior to setting P#CMD.ST to 1. Software should not
clear P#CMD.FRE while P#CMD.ST or P#CMD.CR is set to 1.
Upon global or Port reset, the P#CMD.FRE bit is cleared. The D2H Register FIS containing
the device signature is accepted by the Port, and the signature field is updated.
When the SATA Port stops running due to an error (e.g., P#IS.IFS is set to 1), FISes may not
be posted until the P#CMD.ST bit is cleared to 0 to recover from the error.
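Moving the FIS receive area, as described above, follows a fixed sequence: clear P#CMD.FRE, wait for P#CMD.FR to read 0, update P#FB/P#FBU, then set P#CMD.FRE again. A minimal sketch follows; the dict-like `regs`, the function name and the reuse of the 500 ms limit from the initialization procedure are assumptions, and bit positions are assumed from the AHCI specification.

```python
import time

# Bit positions assumed from the AHCI specification; confirm against RM0089.
CMD_ST, CMD_FRE, CMD_FR, CMD_CR = 1 << 0, 1 << 4, 1 << 14, 1 << 15


def move_fis_receive_area(regs, new_fb, timeout_s=0.5):
    """Relocate the FIS receive area; return False if P#CMD.FR never clears."""
    # FRE must not be cleared while the Port is running.
    if regs["CMD"] & (CMD_ST | CMD_CR):
        raise RuntimeError("clear P#CMD.ST and wait for P#CMD.CR = 0 first")
    regs["CMD"] &= ~CMD_FRE
    deadline = time.monotonic() + timeout_s
    while regs["CMD"] & CMD_FR:
        if time.monotonic() >= deadline:
            return False
        time.sleep(0.001)
    # FRE and FR are both 0 here, so P#FB and P#FBU may be written.
    regs["FB"], regs["FBU"] = new_fb & 0xFFFFFFFF, new_fb >> 32
    regs["CMD"] |= CMD_FRE
    return True
```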
24
SATA/PCIe physical interface (MiPHY)
This chapter focuses on MiPHY functionality and operation.
For the MiPHY feature list, refer to the SPEAr1340 datasheet:
●
Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
●
RM0089, Reference manual, SPEAr1340 address map and registers
24.1
Overview
The MiPHY macrocell implements the lower (physical) layer protocols providing data
transmission and reception over a dual differential pair cable. The TX (transmit) and RX
(receive) serial channels operate plesiochronously (NRZ). The macrocell can be used in
Host or Device applications.
Figure 128. MiPHY application diagram
[Diagram not reproduced. System 1 (Host) and System 2 (Device) each contain a bus, a
controller and a MiPHY connected to the controller over a 20-bit interface, each with its
own system clock. The two MiPHYs exchange serial data over copper (cable, PCB).]
24.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
24.3
Clocks
The macrocell has an embedded PLL that can be configured to use either internal or
external reference clocks. The PLL internal divider can be programmed through dedicated
registers in the miscellaneous module.
●
ref_clk: reference clock for the internal PLL
●
p1_clk_tx: Serializer output clock. It feeds PCIe or SATA controller tx path, depending
on pcie_sata_sel value (see miscellaneous register PCIE_SATA_CFG). This clock is
asynchronous with respect to ACLK clock (PCIe/SATA AXI interface clock).
●
p1_clk_rx: Deserializer output clock. It is asynchronous with respect to ACLK clock.
24.3.1
Reference clock configuration
The reference clock has a direct impact on the frequency accuracy. Its precision must be in
line with the respective specifications (SATA/PCI Express). The frequency of the reference
clock also affects the calibration time. The reference clock may be configured as described
in Table 134.
Whichever PLL reference clock is selected, the macrocell provides it on the p1_clk_osc pin.
Figure 129 shows the reference clock selection circuitry. To select the reference clock, you
must configure the miscellaneous register PCIE_MIPHY_CFG.
Table 134. p1_clk_osc selection truth table
p1_osc_bypass  p1_osc_force_ext  osc_ext_sel  pll_ref_div[1:0]  p1_clk_osc
0              0                 X            00                Crystal reference clock
0              0                 X            01                Crystal reference clock divided by 2
0              0                 X            10                Crystal reference clock divided by 4
0              0                 X            11                Crystal reference clock divided by 8
1              0                 X            00                Differential external reference clock
1              0                 X            01                Differential external reference clock divided by 2
1              0                 X            10                Differential external reference clock divided by 4
1              0                 X            11                Differential external reference clock divided by 6
X              1                 0            00                Internal SoC reference clock (clk_pll_ref_zi)
X              1                 0            01                Internal SoC reference clock (clk_pll_ref_zi) divided by 2
X              1                 0            10                Internal SoC reference clock (clk_pll_ref_zi) divided by 4
X              1                 0            11                Internal SoC reference clock (clk_pll_ref_zi) divided by 6
X              1                 1            00                Internal SoC reference clock (clk_pll_ref_2V5_zi)
X              1                 1            01                Internal SoC reference clock (clk_pll_ref_2V5_zi) divided by 2
X              1                 1            10                Internal SoC reference clock (clk_pll_ref_2V5_zi) divided by 4
X              1                 1            11                Internal SoC reference clock (clk_pll_ref_2V5_zi) divided by 6
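The truth table can be expressed as a small lookup function. This sketch assumes that the four clock sources pair with the select signals in the order in which they are printed in Table 134 (crystal, differential external, clk_pll_ref_zi, clk_pll_ref_2V5_zi); the divider values are copied as printed, including the divide-by-8 entry for the crystal source. The function and dictionary names are hypothetical.

```python
# Divider applied for pll_ref_div[1:0] = 00, 01, 10, 11, as printed in Table 134.
DIVIDERS = {
    "crystal": (1, 2, 4, 8),
    "differential external": (1, 2, 4, 6),
    "clk_pll_ref_zi": (1, 2, 4, 6),
    "clk_pll_ref_2V5_zi": (1, 2, 4, 6),
}


def p1_clk_osc_source(p1_osc_bypass, p1_osc_force_ext, osc_ext_sel, pll_ref_div):
    """Return (source name, divider) for one combination of select signals."""
    if p1_osc_force_ext:
        # p1_osc_bypass is a don't-care when an internal SoC reference is forced.
        source = "clk_pll_ref_2V5_zi" if osc_ext_sel else "clk_pll_ref_zi"
    else:
        # osc_ext_sel is a don't-care for the crystal and differential sources.
        source = "differential external" if p1_osc_bypass else "crystal"
    return source, DIVIDERS[source][pll_ref_div & 0x3]
```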
Figure 129. Reference clock selection circuitry
[Diagram not reproduced. Elements shown: the macrocell oscillator circuit on xtal1/xtal2
(crystal or differential clock) with p1_osc_bypass; the SoC reference inputs
clk_pll_ref_zi/clk_pll_ref_nzi and clk_pll_ref_2v5_zi/clk_pll_ref_2v5_nzi; the select
signals p1_osc_force_ext and osc_ext_sel; the pll_ref_div[1:0] divider producing fref; the
PLL (VCO and /x divider) producing the PLL high frequency clock; and the outputs
p1_clk_osc, clk_osc_zo/clk_osc_nzo (with clk_osc_zo_en) and clk_osc_2v5_zo/clk_osc_2v5_nzo.]
24.3.2
Recommended clock frequencies
Default configuration:
●
PCIe selected
●
PLL reference clock @100 MHz
●
PLL output clock @ 2.5 GHz
●
p1_clk_tx and p1_clk_rx @125 MHz when in gen1, and @250 MHz in gen2
Configuration for SATA selection:
●
SATA selected (by configuring pcie_sata_sel)
●
PLL reference clock @25MHz
●
PLL ratio set to 0x78 (PCIE_MIPHY_CFG[7:0]), PLL output clock @ 3 GHz
●
p1_clk_tx and p1_clk_rx @75 MHz when in gen1, and @150 MHz in gen2
–or–
●
SATA selected (by configuring pcie_sata_sel)
●
PLL reference clock @100MHz
●
PLL ratio set to 0x3C (PCIE_MIPHY_CFG[7:0]), PLL output clock @ 3 GHz
●
p1_clk_tx and p1_clk_rx @75 MHz when in gen1, and @150 MHz in gen2.
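The p1_clk_tx/p1_clk_rx figures above follow from the 20-bit PHY interface width: the parallel clock is the serial line rate divided by 20. The sketch below checks this arithmetic, together with the PLL output frequency under the assumption (not stated in this manual) that the PLL output equals the reference frequency, after the pll_ref_div divider, multiplied by the PLL ratio. On that assumption, a 100 MHz reference with a ratio of 0x3C reaches 3 GHz only when the reference is first divided by 2.

```python
PHY_INTERFACE_BITS = 20  # 16 data bits + 4 bits of 8b/10b overhead


def parallel_clock_mhz(line_rate_gbps):
    """Clock needed to carry a serial stream over the 20-bit interface."""
    return line_rate_gbps * 1000 / PHY_INTERFACE_BITS


def pll_output_mhz(ref_mhz, ratio, ref_divider=1):
    """Assumed relationship: output = (reference / divider) * ratio."""
    return ref_mhz / ref_divider * ratio
```

For example, `parallel_clock_mhz(1.5)` gives 75.0 (SATA gen1) and `parallel_clock_mhz(2.5)` gives 125.0 (PCIe gen1), matching the values listed above.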
24.3.3
SerDes clocks
The SerDes generates the px_clk_tx and px_clk_rx clocks from the high frequency PLL
clock.
The px_tx_spdsel, px_rx_spdsel, px_tx_lspd and px_power_mode[2:0] signals are directly
managed by the PCIe or SATA controllers.
Note:
Regarding the pin naming convention px_yyy:
–
p is the port.
–
x is the macrocell port number (x is always 1)
–
yyy is the pin name.
Figure 130. SerDes clocks
[Diagram not reproduced. Elements shown in the SerDes: clk/clkb from PLL (high frequency),
clk_ref from PLL (low frequency), clock recovery, and /5 or 10 and /2, 4 or 8 dividers on
each of the two clock paths. Signals shown: fref_ready, px_tck, px_rx_lspd, px_rx_spdsel,
px_tx_lspd, px_tx_spdsel, px_power_mode[2:0], px_clk_rx, px_clk_tx and p1_clk_osc.]
24.4
Resets
●
p1_rst_phy_n: global reset for the macrocell (internal PLL included). When this reset is
asserted the macrocell is in minimum power mode (everything is off).
●
p1_rst_tx: serializer data path reset
●
p1_rst_rx: deserializer data path reset
24.5
Functional description
As shown in Figure 131, the macrocell contains the following blocks:
●
PLL: provides the high-speed reference clock for transmit and receive channels.
●
SerDes: includes the standard-compliant transmit and receive functions:
–
SER: transmitter module
–
DES: clock and data recovery module
–
I-DLL: oversampling clocks generator module
–
PMC: power management controller module
●
Compensation: performs TX and RX buffer 100-ohm impedance compensation
Figure 131. MiPHY functional block diagram
[Diagram not reproduced. One-port macrocell containing a sigma delta PLL (PLL reference
clock input), a SerDes with SER (txp/txn), DES (rxp/rxn), I-DLL and PMC modules, and a
Compensation block (reference resistor, DOC, DIC).]
24.5.1
PLL description
A sigma delta PLL provides the macrocell with the bit stream clock to all SerDes modules
through a propagation line. The PLL has the following properties:
●
harmonic PLL
●
differential generated clock provided to all SerDes through a propagation line
●
frequency range from 2.5 GHz to 3.0 GHz
●
fractional PLL with 1 ppm frequency precision
●
SSC modulation feature included
The PLL is set to a dedicated frequency for each standard:
●
SATA: 3 GHz
●
PCI Express: 2.5 GHz
The PLL is controlled by only one SerDes (the first on the PLL right side), but it
provides its status (PLL locked flag signal) to all SerDes.
24.5.2
SerDes description
One SerDes module has all the circuitry needed to support one port of SATA or PCI Express.
SER module
●
differential output signal TXP/TXN with programmable
–
swing
–
pre-emphasis
–
slew-rate
●
detection of a peer transceiver on TXP/TXN pads (PCI Express feature only with low
swing TX buffer)
●
de-emphasis depending on TX buffer choice
●
8b10b encoder
I-DLL module
This module generates the multiple clocks from the single PLL clock that allow the
deserializer to sample the incoming data stream.
DES module
●
equalization of RXP/RXN input
●
signal detection circuitry on RXP/RXN input (both synchronous and asynchronous for
OOB sequence and wake-up circuitry)
●
clock and data recovery (CDR)
●
8b10b decoder with error detection
PMC module
This module manages the following:
●
macrocell wake up procedure
●
macrocell power modes.
24.5.3
Compensation module (COMPENS) description
This block compensates the RX buffer input impedance, TX buffer output impedance, and
TX buffer output slew rate over process, voltage and temperature variations, using an
external reference resistor.
24.6
Operation
Figure 132 shows how the MiPHY is integrated in the SPEAr1340 device.
To choose whether the multiplexer (MUX) connects the PCIe or the SATA controller to the
MiPHY, configure bit 0 (pcie_sata_sel) of the miscellaneous register PCIE_SATA_CFG.
●
Set pcie_sata_sel = 0 to select the PCIe block.
●
Set pcie_sata_sel = 1 to select the SATA block.
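Since pcie_sata_sel is a single bit in a register that carries other configuration fields, software would normally change it with a read-modify-write. The sketch below shows only the bit manipulation on a 32-bit value; the function name is hypothetical, and the register address and the meaning of the other bits must be taken from RM0089.

```python
PCIE_SATA_SEL = 1 << 0  # bit 0 of PCIE_SATA_CFG


def select_controller(cfg_value, use_sata):
    """Return the new PCIE_SATA_CFG value with only bit 0 changed."""
    if use_sata:
        return (cfg_value | PCIE_SATA_SEL) & 0xFFFFFFFF
    return cfg_value & ~PCIE_SATA_SEL & 0xFFFFFFFF
```

The caller reads the current register value, passes it through `select_controller`, and writes the result back.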
Figure 132. MiPHY module in SPEAr1340
[Diagram not reproduced. At the SPEAr top level, a MUX controlled by pcie_sata_sel[0]
(from MISC) selects between the PCIe0 signals and the SATA0 signals for the single-lane
MIPHY. MIPHY interface signals shown: p1_rst_phy_n, p1_clk_auxi, p1_clk_rx, p1_clk_tx,
p1_data_in and p1_data_out. Pads shown: MIPHY_S_0_TXp, MIPHY_S_0_TXn, MIPHY_S_0_RXp,
MIPHY_S_0_RXn, MIPHY_S_XTAL1 and MIPHY_S_XTAL2. The RCG and the PLL are also shown; the
MIPHY single PLL control signal comes from a MISC register.]
25
Asynchronous serial ports (UART)
This chapter focuses on UART functionality and operation.
For the UART feature list, refer to the SPEAr1340 datasheet:
●
Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
●
RM0089, Reference manual, SPEAr1340 address map and registers
25.1
Overview
The SPEAr1340 device integrates 2 instances of an asynchronous serial port digital block,
identified as UART0 and UART1.
Asynchronous serial ports (commonly referred to as UARTs) perform the main task in
computer serial communication: they convert outgoing parallel data into serial data that
can be sent on a communication line connected to an external peripheral device, and
convert incoming serial data back into parallel data.
Typical UART use cases include the connection of SPEAr-based platforms to debugging
consoles, the communication with modems, and the interfacing of Bluetooth, DECT or
ZigBee chipsets.
UART ports usually do not directly generate or receive the external signals sent between
different pieces of equipment. External interface devices convert the logic level signals of
the UART to and from the external signal levels. External signals can take many different
forms, such as RS-232, infrared, and wireless radio. In particular, the SPEAr1340 UART
interfaces directly support (by software selection) the IrDA-compliant SIR (Serial InfraRed)
protocol.
The SPEAr1340 UART features offer functionality similar to the industry-standard 16C650
UART device.
The UART supports standard asynchronous communication bits (start, stop, and parity),
which are added prior to transmission and removed on reception.
Figure 133. UART block diagram
[Block diagram: APB interface and register block (PCLK, PRESETn, PSEL, PENABLE, PWRITE, PADDR[11:2], PWDATA[15:0], PRDATA[15:0]); baud rate generator (UARTCLK, nUARTRST, Baud16); 16x8 transmit FIFO and transmitter (UARTx_TXD, SIROUT (1)); 16x12 receive FIFO and receiver (UARTx_RXD, SIRIN); FIFO status and interrupt generation (UARTTXINTR, UARTRXINTR, UARTMSINTR, UARTRTINTR, UARTEINTR, UARTINTR); DMA interface (UARTRXDMASREQ, UARTRXDMABREQ, UARTRXDMACLR, UARTTXDMASREQ, UARTTXDMABREQ, UARTTXDMACLR); modem lines (UART0_CTSn, UART0_DSRn, UART0_DCDn, UART0_RIn, UART0_RTSn, UART0_DTRn, nUARTOut1, nUARTOut2).]
Note: (1) For more information on this signal, refer to Section 25.5.4: IrDA SIR ENDEC.
25.2 Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
25.3 Clocks
UART uses the PCLK clock for APB bus transactions and the reference clock UARTCLK for
internal operations as well as for baud rate generation.
UARTCLK has certain constraints with regard to PCLK. See Section 25.5.5: Baud rate
generation and transmit logic for more information.
Note: For details on UART clock configuration, see also Chapter 5: Reset and clock generator
(RCG).
25.4 Interrupts
See also: Appendix A: Interrupts.
Table 135 shows a summary of the 11 maskable interrupts generated within the UART.
These interrupts are combined to form five individual interrupt outputs and one which is the
logical OR of the individual outputs. Any individual interrupt can be enabled or disabled by
changing the corresponding mask bit in the UARTIMSC register. The status of the individual
interrupt sources can be read either from the UARTRIS register for raw status, or from the
UARTMIS register for the masked status.
Table 135. UART interrupt summary with combined outputs
Name           Source                                    Combined output
UARTRXINTR     Receive FIFO                              UARTRXINTR
UARTTXINTR     Transmit FIFO                             UARTTXINTR
UARTRTINTR     Receive time-out in receive FIFO          UARTRTINTR
UARTCTSINTR    Clear to send                             UARTMSINTR
UARTDCDINTR    Data carrier detect                       UARTMSINTR
UARTDSRINTR    Data set ready                            UARTMSINTR
UARTRIINTR     Ring indicator modem status               UARTMSINTR
UARTOEINTR     Overrun error                             UARTEINTR
UARTBEINTR     Break error (in reception)                UARTEINTR
UARTPEINTR     Parity error in the received character    UARTEINTR
UARTFEINTR     Framing error in the received character   UARTEINTR
UARTINTR is the logical OR of all the combined outputs listed above.
UARTRXINTR
This interrupt is asserted when one of the following events occurs:
● If the FIFOs are enabled and the receive FIFO reaches the programmed trigger level.
To clear this interrupt, either read data from the receive FIFO until its fill level drops below
the trigger level, or write 1'b1 to the corresponding bit of the UARTICR register.
● If the FIFOs are disabled and data is received, thereby filling the single location.
To clear this interrupt, either perform a single read of the receive FIFO, or write 1'b1 to
the corresponding bit of the UARTICR register.
UARTTXINTR
This interrupt is asserted when one of the following events occurs:
● If the FIFOs are enabled (FEN bit set to 1'b1 in UARTLCR_H register) and the transmit
FIFO reaches the programmed trigger level (TXIFLSEL in UARTIFLS register).
To clear this interrupt, either write data to the transmit FIFO until its fill level rises above
the trigger level, or write 1'b1 to the corresponding bit of the UARTICR register.
● If the FIFOs are disabled and there is no data in the transmitter single location.
To clear this interrupt, either perform a single write to the transmit FIFO, or write 1'b1 to
the corresponding bit of the UARTICR register.
UARTRTINTR
This interrupt is asserted when the receive FIFO is not empty, and no further data is
received over a 32-bit period. This interrupt clears either when the receive FIFO becomes
empty through reading all the data (or by reading the holding register), or when 1'b1 is
written to the corresponding bit of the UARTICR register.
UARTMSINTR
This interrupt is asserted if any of the modem status lines changes:
● UARTRIINTR, because of a change in the nUARTRI modem status.
● UARTCTSINTR, because of a change in the nUARTCTS modem status.
● UARTDCDINTR, because of a change in the nUARTDCD modem status.
● UARTDSRINTR, because of a change in the nUARTDSR modem status.
UARTEINTR
This error interrupt is triggered when there is an error in the reception of the data. The
interrupt can be caused by a number of different error conditions, such as overrun, break,
parity and framing.
UARTINTR
This is the OR logical function of all the individual masked interrupt sources. This interrupt is
asserted if any of the individual interrupts are asserted and enabled.
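The way the 11 sources collapse into the combined outputs can be sketched in a few lines of Python. This is a minimal model, not the device implementation: the bit positions in SOURCE_BITS are an assumption (a PL011-style layout) and must be checked against RM0089, and the function name combined_outputs is hypothetical. The model also assumes that a mask bit set to 1 in UARTIMSC enables the corresponding interrupt.

```python
# Assumed bit positions of the 11 sources in UARTRIS/UARTIMSC/UARTMIS.
# These are NOT taken from this manual; verify them against RM0089.
SOURCE_BITS = {
    "UARTRIINTR": 0, "UARTCTSINTR": 1, "UARTDCDINTR": 2, "UARTDSRINTR": 3,
    "UARTRXINTR": 4, "UARTTXINTR": 5, "UARTRTINTR": 6,
    "UARTFEINTR": 7, "UARTPEINTR": 8, "UARTBEINTR": 9, "UARTOEINTR": 10,
}

# Grouping of sources into combined outputs, as listed in Table 135.
COMBINED = {
    "UARTRXINTR": ["UARTRXINTR"],
    "UARTTXINTR": ["UARTTXINTR"],
    "UARTRTINTR": ["UARTRTINTR"],
    "UARTMSINTR": ["UARTRIINTR", "UARTCTSINTR", "UARTDCDINTR", "UARTDSRINTR"],
    "UARTEINTR": ["UARTOEINTR", "UARTBEINTR", "UARTPEINTR", "UARTFEINTR"],
}


def combined_outputs(uartris, uartimsc):
    """Model the interrupt outputs from raw status and mask values."""
    uartmis = uartris & uartimsc  # masked status: raw status gated by the mask
    outputs = {
        name: any((uartmis >> SOURCE_BITS[src]) & 1 for src in sources)
        for name, sources in COMBINED.items()
    }
    # UARTINTR is the logical OR of the individual outputs.
    outputs["UARTINTR"] = any(outputs.values())
    return outputs
```

For instance, a raw CTS change with its mask bit set raises UARTMSINTR and UARTINTR, whereas a raw overrun error with a cleared mask bit raises nothing.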
25.5 Functional description
25.5.1 Main interfaces
APB interface
The APB interface block generates read and write decodes for accesses to control and
status registers (CSRs) as well as to transmit/receive FIFO memories.
Register block
The register block stores data written, or to be read, across the APB interface.
Baud rate generator
The baud rate generator contains free-running counters that generate the internal x16
clocks and the Baud16 signal.
Baud16 provides timing information for UART transmit and receive control. It consists of a
stream of pulses with a width of one UARTCLK clock period and a frequency of 16 times the
baud rate.
Transmit FIFO
The transmit FIFO is an 8-bit wide, 16-location deep FIFO memory buffer. CPU data written
across the APB interface is stored in this FIFO until read out by the transmit logic.
Note: The transmit FIFO block can be disabled to act like a one-byte holding register.
Receive FIFO
The receive FIFO is a 12-bit wide, 16-location deep FIFO memory buffer. Received data and
corresponding error bits are stored in the receive FIFO by the receive logic until read out by
the CPU across the APB interface.
Note: The receive FIFO block can be disabled to act like a one-byte holding register.
Transmit logic
The transmit logic performs parallel-to-serial conversion on the data read from the transmit
FIFO. The control logic outputs the serial bit stream beginning with a start bit, followed by
the data bits (LSB first), the parity bit and the stop bits, according to the configuration
programmed in the control registers.
See also: Section 25.5.5: Baud rate generation and transmit logic.
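The bit order produced by the transmit logic can be illustrated with a short Python sketch. It is a behavioural model only; the function name uart_frame and its parameters are hypothetical and do not correspond to register fields. It assumes the usual UART line coding, in which the start bit is logic 0 and stop bits are logic 1.

```python
def uart_frame(value, data_bits=8, parity=None, stop_bits=1):
    """Return the bits of one UART frame in transmission order."""
    # Data bits are sent least significant bit first.
    data = [(value >> i) & 1 for i in range(data_bits)]
    frame = [0] + data  # start bit (logic 0), then the data bits
    if parity in ("even", "odd"):
        ones = sum(data)
        # Even parity makes the total number of 1s (data + parity) even.
        pbit = ones % 2 if parity == "even" else 1 - ones % 2
        frame.append(pbit)
    frame += [1] * stop_bits  # stop bits (logic 1)
    return frame
```

Calling uart_frame(0x15) yields the start bit, the data bits 1 0 1 0 1 0 0 0 and one stop bit.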
Receive logic
The receive logic performs serial-to-parallel conversion on the received serial bit stream
after a valid start pulse has been detected. The receive logic also performs overrun, parity
and framing error checking as well as line break detection, and the resulting status
accompanies the data that is written to the receive FIFO.
See also: Section 25.5.5: Baud rate generation and transmit logic.
Interrupt generation logic
UART generates individual maskable active HIGH interrupts. A combined interrupt output is
also generated as an OR function of the individual interrupt requests.
A single combined interrupt can be used in a system interrupt controller that provides
another level of masking on a per-peripheral basis. This enables the use of modular device
drivers that always know where to find the interrupt source control register bits.
See also: Section 25.4: Interrupts
DMA interface
The UART provides a DMA interface to connect to a DMA controller. The DMA operation of
the UART is controlled through the UART DMA control register—UARTDMACR.
The DMA interface includes the following signals:
● For receive:
– UARTRXDMASREQ: Single character DMA transfer request, asserted by the
UART. For receive, one character consists of up to 12 bits. This signal is asserted
when the receive FIFO contains at least one character.
– UARTRXDMABREQ: Burst DMA transfer request, asserted by the UART. This
signal is asserted when the receive FIFO contains more characters than the
programmed watermark level. You can program the watermark level for each FIFO
using the Interrupt FIFO Level Select Register, UARTIFLS.
– UARTRXDMACLR: DMA request clear, asserted by a DMA controller to clear the
receive request signals. If DMA burst transfer is requested, the clear signal is
asserted during the transfer of the last data in the burst.
● For transmit:
– UARTTXDMASREQ: Single character DMA transfer request, asserted by the
UART. For transmit, one character consists of up to eight bits. This signal is
asserted when there is at least one empty location in the transmit FIFO.
– UARTTXDMABREQ: Burst DMA transfer request, asserted by the UART. This
signal is asserted when the transmit FIFO contains fewer characters than the
watermark level. You can program the watermark level for each FIFO using the
Interrupt FIFO Level Select Register, UARTIFLS.
– UARTTXDMACLR: DMA request clear, asserted by a DMA controller to clear the
transmit request signals. If DMA burst transfer is requested, the clear signal is
asserted during the transfer of the last data in the burst.
The burst transfer and single transfer request signals are not mutually exclusive, so they can
both be asserted at the same time. When the UART is in the FIFO disabled mode (where
both FIFOs act like a one-byte holding register), only the DMA single transfer mode can
operate, because only one character can be transferred to or from the FIFO at any time.
When the UART is in the FIFO enabled mode, data transfers can be made by either single
or burst transfers depending on the programmed watermark level and the amount of data in
the FIFO.
In addition, the DMAONERR bit in the DMA Control Register (UARTDMACR) supports the
use of the receive error interrupt, UARTEINTR. It enables the DMA receive request outputs,
UARTRXDMASREQ or UARTRXDMABREQ, to be masked out when the UART error
interrupt, UARTEINTR, is asserted. The DMA receive request outputs remain inactive until
the UARTEINTR is cleared. The DMA transmit request outputs are unaffected.
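The conditions under which the single and burst requests are raised can be summarized in a small Python model. This is a sketch under stated assumptions: the function names and parameters are hypothetical, the watermark is passed as a number of characters rather than as the UARTIFLS encoding, and the clear handshake (UARTRXDMACLR/UARTTXDMACLR) is not modelled.

```python
FIFO_DEPTH = 16  # both UART FIFOs are 16 locations deep


def rx_dma_requests(level, watermark, fifo_enabled=True,
                    dmaonerr=False, uarteintr=False):
    """Return (UARTRXDMASREQ, UARTRXDMABREQ) for a receive fill level."""
    # With DMAONERR set, an active error interrupt masks both receive requests.
    if dmaonerr and uarteintr:
        return (False, False)
    single = level >= 1                          # at least one character
    burst = fifo_enabled and level > watermark   # more than the watermark
    return (single, burst)


def tx_dma_requests(level, watermark, fifo_enabled=True):
    """Return (UARTTXDMASREQ, UARTTXDMABREQ) for a transmit fill level."""
    # With FIFOs disabled, the FIFO acts as a one-byte holding register.
    depth = FIFO_DEPTH if fifo_enabled else 1
    single = level < depth                       # at least one empty location
    burst = fifo_enabled and level < watermark   # fewer than the watermark
    return (single, burst)
```

With the FIFOs disabled, only the single request can be raised, which matches the statement that only the DMA single transfer mode can operate in that case.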
Note: The two UART receive and transmit DMA interfaces, as shown in Table 49: DMAC MUX
selecting the peripheral, are called UARTx_RX (comprising UARTRXDMABREQ,
UARTRXDMASREQ and UARTRXDMACLR) and UARTx_TX (comprising
UARTTXDMABREQ, UARTTXDMASREQ and UARTTXDMACLR), where x is the instance number.
Synchronization registers and logic
Because the UART supports both asynchronous and synchronous operation of the clocks
PCLK and UARTCLK, synchronization registers and handshaking logic have been
implemented and are active at all times. Synchronization of control signals is performed in
both directions of data flow.
25.5.2 Modem operation
You can use the UART to support the data terminal equipment (DTE) mode operation.
Figure 133: UART block diagram shows the modem signals in the DTE mode, while the
following table shows the meaning of the signals.
Table 136. Meaning of modem input/output in DTE and DCE modes
Port name    Meaning in DTE mode    Meaning in DCE mode
nUARTCTS     Clear to send          Request to send
nUARTDSR     Data set ready         Data terminal ready
nUARTDCD     Data carrier detect    –
nUARTRI      Ring indicator         –
nUARTRTS     Request to send        Clear to send
nUARTDTR     Data terminal ready    Data set ready
25.5.3 Hardware flow control
The hardware flow control feature is fully selectable, and enables you to control the serial
data flow by using the nUARTRTS output and nUARTCTS input signals. Figure 134 shows
how two devices can communicate using hardware flow control.
Figure 134. Hardware flow control between two similar devices
[Diagram: UART 1 and UART 2, each containing an "RX FIFO and flow control" block and a "TX FIFO and flow control" block, connected through their nUARTRTS and nUARTCTS signals.]
When the RTS flow control is enabled, the nUARTRTS signal is asserted until the receive
FIFO is filled up to the programmed watermark level. When the CTS flow control is enabled,
the transmitter can only transmit data when the nUARTCTS signal is asserted.
The hardware flow control is selectable through bits 14 (RTSEn) and 15 (CTSEn) of the
UART control register (UARTCR). Table 137 lists how bits must be set to enable RTS and
CTS flow control both simultaneously, and independently.
When RTS flow control is enabled, the software cannot control the nUARTRTS line through
bit 11 of the UART control register.
Figure 135. Hardware flow control transfer diagram (start of transfer)
[Timing diagram: UART1 RX side (nUARTRTS, UARTx_RXD) and UART2 TX side (nUARTCTS); each frame consists of a start bit, data bits B0 to B7, Pbit and stop bit.]
Figure 136. Hardware flow control transfer diagram (end of transfer)
[Timing diagram: UART1 RX side (nUARTRTS) and UART2 TX side (nUARTCTS); each frame consists of a start bit, data bits B0 to B7, Pbit and stop bit.]
No further frame is transferred after the frame shown above, because CTS is high. UART2
completes the ongoing frame and then stops transmitting, starting from the next frame.
Table 137. Control bits to enable and disable hardware flow control
UARTCR bit 15 (CTSEn)    UARTCR bit 14 (RTSEn)    Description
1                        1                        Both RTS and CTS flow control enabled
1                        0                        Only CTS flow control enabled
0                        1                        Only RTS flow control enabled
0                        0                        Both RTS and CTS flow control disabled
RTS flow control
The RTS flow control logic is linked to the programmable receive FIFO watermark levels.
When RTS flow control is enabled, the nUARTRTS is asserted until the receive FIFO is filled
up to the watermark level. When the receive FIFO watermark level is reached, the
nUARTRTS signal is deasserted, indicating that there is no more room to receive any more
data. The transmission of data is expected to cease after the current character has been
transmitted.
The nUARTRTS signal is reasserted when data has been read out of the receive FIFO so
that it is filled to less than the watermark level. If RTS flow control is disabled and the UART
is still enabled, then data is received until the receive FIFO is full, or no more data is
transmitted to it.
CTS flow control
If CTS flow control is enabled, the transmitter checks the nUARTCTS signal before
transmitting the next byte. If the nUARTCTS signal is asserted, the byte is transmitted,
otherwise transmission does not occur. Data continues to be transmitted while nUARTCTS
is asserted and the transmit FIFO is not empty. If the transmit FIFO is empty and the
nUARTCTS signal is asserted, no data is transmitted.
If the nUARTCTS signal is deasserted and CTS flow control is enabled, the current
character transmission is completed before stopping. If CTS flow control is disabled and the
UART is enabled, the data continues to be transmitted until the transmit FIFO is empty.
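The RTS and CTS rules described in this section can be captured in a short behavioural model. The sketch below is illustrative only: the function names are hypothetical, the watermark is expressed as a number of characters, and "asserted" is represented by True regardless of the electrical polarity of the active-low pins. Only the UARTCR bit positions 14 (RTSEn) and 15 (CTSEn) are taken from the text.

```python
RTSEN_BIT = 14  # UARTCR bit 14 (RTSEn)
CTSEN_BIT = 15  # UARTCR bit 15 (CTSEn)


def flow_control_mode(uartcr):
    """Decode which hardware flow control directions are enabled."""
    return {
        "cts": bool((uartcr >> CTSEN_BIT) & 1),
        "rts": bool((uartcr >> RTSEN_BIT) & 1),
    }


def nuartrts_asserted(rx_level, watermark):
    """RTS flow control: asserted while the receive FIFO is below the watermark."""
    return rx_level < watermark


def may_send_next_character(tx_level, cts_enabled, nuartcts_asserted):
    """Decide whether the transmitter may start the next character."""
    if tx_level == 0:
        return False  # nothing to send
    # With CTS flow control disabled, the nUARTCTS input is ignored.
    return nuartcts_asserted or not cts_enabled
```

A character already in progress is not covered by may_send_next_character: as stated above, it is always completed before transmission stops.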
25.5.4 IrDA SIR ENDEC
The IrDA SIR ENDEC comprises:
● an IrDA SIR transmit encoder and
● an IrDA SIR receive decoder
The transmit encoder modulates the non return-to-zero (NRZ) transmit bit stream output
from the UART. The IrDA SIR physical layer specifies use of a return to zero inverted (RZI)
modulation scheme that represents logic 0 as an infrared light pulse. The modulated output
pulse stream is transmitted to an external output driver and infrared light emitting diode
(LED).
In normal mode the transmitted pulse width is specified as three times the period of the
internal x16 clock (Baud16), that is, 3/16 of a bit period.
In low-power mode the transmit pulse width is specified as 3/16 of a 115.2 Kbits/s bit period.
This is implemented as three times the period of a nominal 1.8432 MHz clock (IrLPBaud16)
derived from dividing down of UARTCLK clock.
The frequency of IrLPBaud16 is set up by writing the appropriate divisor value to
UARTILPR.
The active low encoder output is normally LOW for the marking state (no light pulse). The
encoder outputs a high pulse to generate an infrared light pulse representing a logic 0 or
spacing state.
In normal and low power IrDA modes, when the fractional baud rate divider is used, the
transmitted SIR pulse stream includes an increased amount of jitter. This jitter is because
the Baud16 pulses cannot be generated at regular intervals when fractional division is used.
That is, the Baud16 cycles have a different number of UARTCLK cycles. It can be shown
that the worst case jitter in the SIR pulse stream can be up to three UARTCLK cycles. This
is within the limits of the SIR IrDA Specification where the maximum amount of jitter allowed
is 13%, as long as the UARTCLK is > 3.6864 MHz and the maximum baud rate used for
normal mode SIR is <= 115.2 kbps. Under these conditions, the jitter is less than 9%.
The receive decoder demodulates the return-to-zero bit stream from the infrared detector
and outputs the received NRZ serial bit stream to the UART received data input. The
decoder input is normally HIGH (marking state) in the idle state.
The output polarity of the transmit encoder is opposite that of the decoder input. A start bit is
detected when the decoder input is LOW. Regardless of the power mode (normal or
low-power), a start bit is deemed valid if the decoder input is still LOW one period of
IrLPBaud16 after the LOW was first detected. This enables a normal-mode UART to receive
data from a low-power mode UART that can transmit pulses as small as 1.41 µs.
IrDA operation
The IrDA SIR block (see Figure 137) contains an IrDA SIR protocol ENDEC. The SIR
protocol ENDEC can be enabled for serial communication through signals nSIROUT and
SIRIN to an infrared transducer, instead of using the UART signals UARTTXD and
UARTRXD.
Figure 137. UART/IrDA block diagram
[Block diagram showing: UART (TXD, RXD, SIREN), SIR transmit encoder (nSIROUT), SIR receive decoder (nSIRIN), a MUX controlled by UART_SIR_SEL, the APB, and the wrapper pins UART0_TXD and UART0_RXD.]
To enable the SIR interface, you must configure the miscellaneous register PERIP_CFG, bit
uart*_sir_uart_sel. If the SIR protocol is enabled, the UARTTXD line is held in the passive
state (HIGH) and transitions of the modem status or of the UARTRXD line have no effect; this
protocol can receive and transmit, but it is half-duplex only.
The IrDA SIR ENDEC provides functionality that converts between an asynchronous UART
data stream and a half-duplex serial SIR interface. No analog processing is performed
on-chip. The role of the SIR ENDEC is to provide a digital encoded output, and decoded input
to the UART. There are two modes of operation:
● In normal IrDA mode, a logic zero is transmitted as a high pulse lasting 3/16 of the
selected baud rate bit period on the nSIROUT signal, while logic one levels are
transmitted as a static LOW signal. These levels control the driver of an infrared
transmitter, sending a pulse of light for each zero. On the reception side, the incoming
light pulses energize the photo transistor base of the receiver, pulling its output LOW.
This drives the SIRIN signal LOW.
● In low-power IrDA mode, the width of the transmitted infrared pulse is set to three
times the period of the internally generated IrLPBaud16 signal (1.63 µs, assuming a
nominal 1.8432 MHz frequency) by changing the appropriate bit in UARTCR.
In both normal and low-power IrDA modes:
● during transmission, the UART data bit is used as the base for encoding
● during reception, the decoded bits are transferred to the UART receive logic
The IrDA SIR physical layer specifies a half-duplex communication link, with a minimum
10 ms delay between transmission and reception. This delay must be generated by software
because it is not supported by the UART. The delay is required because the infrared
receiver electronics might become biased, or even saturated from the optical power coupled
from the adjacent transmitter LED. This delay is known as latency, or receiver setup time.
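The normal-mode 3/16 encoding can be pictured with a short Python sketch that expands each UART bit into 16 Baud16 slots. This is an illustration under assumptions: the function name sir_encode is hypothetical, and the position of the pulse inside the bit period (the pulse_start parameter) is chosen arbitrarily here because this section only specifies the pulse width.

```python
def sir_encode(bits, pulse_start=7):
    """Expand NRZ bits into 16 slots per bit; 1 marks an infrared pulse."""
    samples = []
    for bit in bits:
        slot = [0] * 16  # one bit period = 16 Baud16 cycles
        if bit == 0:
            # A logic 0 is sent as a pulse three Baud16 cycles wide.
            slot[pulse_start:pulse_start + 3] = [1, 1, 1]
        samples.extend(slot)  # a logic 1 produces no pulse
    return samples
```

Encoding the bits 0, 1, 0 therefore gives 48 slots with two pulses of three slots each, one per transmitted zero.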
The IrLPBaud16 signal is generated by dividing down the UARTCLK signal according to the
low-power divisor value written to UARTILPR. The low-power divisor value is calculated as
follows:
Low-power divisor = (FUARTCLK / FIrLPBaud16)
Where FIrLPBaud16 is nominally 1.8432 MHz.
The divisor must be chosen so that 1.42 MHz < FIrLPBaud16 < 2.12 MHz.
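As a worked illustration of the low-power divisor formula, the following Python sketch picks a divisor for a given UARTCLK and checks the resulting IrLPBaud16 frequency against the stated limits. The function name low_power_divisor is hypothetical, and rounding the divisor to the nearest integer is an assumption; the text only gives the ratio and the allowed frequency window.

```python
NOMINAL_IRLPBAUD16_HZ = 1_843_200  # nominal 1.8432 MHz


def low_power_divisor(uartclk_hz):
    """Return (divisor, IrLPBaud16 frequency in Hz, pulse width in µs)."""
    # Assumption: round to the nearest integer divisor, with a minimum of 1.
    divisor = max(1, round(uartclk_hz / NOMINAL_IRLPBAUD16_HZ))
    f_irlpbaud16 = uartclk_hz / divisor
    # The divisor must keep IrLPBaud16 between 1.42 MHz and 2.12 MHz.
    if not 1.42e6 < f_irlpbaud16 < 2.12e6:
        raise ValueError("no valid low-power divisor for this UARTCLK")
    pulse_width_us = 3 / f_irlpbaud16 * 1e6  # pulse = 3 IrLPBaud16 periods
    return divisor, f_irlpbaud16, pulse_width_us
```

With UARTCLK = 48 MHz this gives a divisor of 26, an IrLPBaud16 frequency of about 1.846 MHz and a pulse width of about 1.625 µs, close to the nominal 1.63 µs.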
IrDA data modulation
Figure 138 shows the effect of IrDA 3/16 data modulation:
Figure 138. IrDA data modulation (3/16)
[Timing diagram: TXD and nSIROUT waveforms for one frame (start bit, data bits, stop bit), where each nSIROUT pulse lasts 3/16 of the bit period, followed by the corresponding SIRIN and RXD waveforms.]
25.5.5 Baud rate generation and transmit logic
UART character frame
This is the frame format which is transmitted and received by UART from UARTx_TXD and
UARTx_RXD pins respectively.
Figure 139. UART character frame
[Frame on UARTx_TXD/UARTx_RXD: Start bit, B0, B1, ..., B7, Pbit, Stop bit]
Legend:
B: bits 0-7 (number depends on configuration)
Pbit: Parity bit if parity is enabled
Stop bits: 1 or 2 (number depends on configuration)
UART transmission
UART transmits and receives the frame shown in Figure 139. The following example assumes
a baud rate of 19200 bps.
A data value of 0x15 (binary 00010101) is received at the RXFIFO in the following form:
Figure 140. RXFIFO payload
[Bit sequence, LSB first: 1 0 1 0 1 0 0 0]
The above is the 8-bit RXFIFO payload, where the bit rate is 19200 bps. According to the
standard protocol, the UART needs to generate a bit width of 52.08 µs (1/19200 s).
Figure 141. UART transfer bit diagram
The UART generates bits with an error of up to 1.56 %. See the example in the following
section.
Baud rate generation
The baud rate divisor is a 22-bit number consisting of a 16-bit integer and a 6-bit fractional
part. This is used by the baud rate generator to determine the bit period. The fractional baud
rate divider enables the use of any clock with a frequency > 3.6864 MHz to act as
UARTCLK, while it is still possible to generate all the standard baud rates.
The baud rate divisor (BRD) has the following relationship to UARTCLK in MHz:
Formula 1
$$\text{BRD} = \frac{\text{UARTCLK} \times 10^{6}}{16 \times \text{Baud rate}} = \text{BRDI} + \text{BRDF}$$
Where BRDI is the integer part and BRDF is the fractional part separated by a decimal point
as shown in the next figure:
Figure 142. Baud rate divisor
[16-bit integer part | 6-bit fractional part]
You can calculate the 6-bit number (m) by taking the fractional part of the required baud rate
divisor, multiplying it by 64 (that is, 2^n, where n is the width of the UARTFBRD register)
and adding 0.5 to account for rounding errors:
m = integer(BRDF × 2^n + 0.5)
Note: 1 The contents of the integer and fractional value registers (UARTIBRD and UARTFBRD)
are not updated until the current frame is transferred.
2 An integer value of 0 is invalid, and the fractional value is ignored in this case.
3 If the integer value is 0xFFFF, the fractional value must not be greater than zero.
Example: how to calculate BRD values
Assume that UARTCLK = 48 MHz and that the required baud rate is 115.2 kbps.
Using Formula 1 above:
BRD = (48 × 10^6) / (16 × 115.2 × 10^3) = 26.0417, so BRDI = 26 and BRDF = 0.0417
m = integer((0.0417 × 64) + 0.5) = integer(3.1688) = 3
Generated baud rate divisor = BRDI + m / 2^n = 26 + 3/64 = 26.0469
Generated baud rate = (48 × 10^6) / (16 × 26.0469) = 115176.85
Error = (115200 − 115176.85) / 115200 × 100 = 0.02 %
The maximum error using a 6-bit UARTFBRD register = 1/64 × 100 = 1.56 %.
This occurs when m = 1, and the error is cumulative over 64 clock ticks.
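The calculation above can be reproduced with a few lines of Python. This is a sketch rather than driver code: the function name baud_divisor is hypothetical, and the carry handling when m rounds up to 64 is an assumption added for completeness. The range checks follow the notes on UARTIBRD and UARTFBRD given earlier in this section.

```python
def baud_divisor(uartclk_hz, baud, frac_bits=6):
    """Return (BRDI, m, generated baud rate, error in percent)."""
    brd = uartclk_hz / (16 * baud)
    brdi = int(brd)                               # integer part
    m = int((brd - brdi) * 2**frac_bits + 0.5)    # rounded fractional part
    if m == 2**frac_bits:                         # assumed carry into the integer part
        brdi, m = brdi + 1, 0
    # An integer value of 0 is invalid; with 0xFFFF the fraction must be zero.
    if brdi < 1 or brdi > 0xFFFF or (brdi == 0xFFFF and m > 0):
        raise ValueError("baud rate not reachable with this UARTCLK")
    generated = uartclk_hz / (16 * (brdi + m / 2**frac_bits))
    error_pct = (baud - generated) / baud * 100
    return brdi, m, generated, error_pct
```

For UARTCLK = 48 MHz and 115200 baud this returns BRDI = 26 and m = 3. Because the code keeps the exact divisor 26.046875 instead of the rounded 26.0469, the generated rate comes out at about 115176.96 baud, with an error of about 0.02 %.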
Frequency and baud rate constraints
UART has certain constraints regarding frequency range and baud rates.
The UARTCLK frequency must be selected to accommodate the required range of baud rates:
FUARTCLK (min) ≥ 16 × baud_rate(max)
FUARTCLK (max) ≤ 16 × 65535 × baud_rate(min)
For instance, for a range of baud rates from 110 baud to 460800 baud, the UARTCLK
frequency must be between 7.3728 MHz and 115.34 MHz. If the required baud rate is high,
the minimum UARTCLK frequency must be high enough, as given by the formulas above.
Another constraint is the frequency of PCLK in relation to UARTCLK. The
frequency of UARTCLK must be no more than 5/3 times the frequency of PCLK:
FUARTCLK ≤ 5/3 × FPCLK
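These constraints are easy to check programmatically. The Python sketch below uses hypothetical function names and simply restates the three inequalities, with all frequencies in Hz.

```python
def uartclk_limits(baud_min, baud_max):
    """Return the (minimum, maximum) UARTCLK frequency in Hz."""
    f_min = 16 * baud_max          # FUARTCLK(min) >= 16 x baud_rate(max)
    f_max = 16 * 65535 * baud_min  # FUARTCLK(max) <= 16 x 65535 x baud_rate(min)
    return f_min, f_max


def uartclk_ok(uartclk_hz, pclk_hz, baud_min, baud_max):
    """Check UARTCLK against the baud rate range and the PCLK ratio."""
    f_min, f_max = uartclk_limits(baud_min, baud_max)
    # FUARTCLK <= 5/3 x FPCLK, written with integers to avoid rounding.
    ratio_ok = 3 * uartclk_hz <= 5 * pclk_hz
    return f_min <= uartclk_hz <= f_max and ratio_ok
```

For the 110 to 460800 baud range, uartclk_limits returns 7372800 Hz and 115341600 Hz, which are the 7.3728 MHz and 115.34 MHz figures quoted above.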
26 Synchronous serial port (SSP)
This chapter focuses on SSP functionality and operation.
For the SSP feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
26.1 Overview
The synchronous serial port (SSP) block includes a master or slave interface to enable
synchronous serial communication with slave or master peripherals.
Figure 143. SSP block diagram
[Block diagram: AMBA APB interface, register block, Tx FIFO (16x8), Rx FIFO (16x8), clock prescaler, transmit/receive logic, FIFO status and interrupt generation (SSPINTR), DMA interface; clocks PCLK and SSPCLK; pins SSPTXD, SSPRXD, SSPCLKOUT, SSPCLKIN.]
26.2 Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
26.3 Clocks
The SSP clocks are:
● PCLK: the APB clock
● SSP0_SCK (external), which is:
– SSPCLKOUT when SSP works as a master
– SSPCLKIN when SSP works as a slave
● SSPCLK: main SSP clock input (internal)
See also Chapter 5: Reset and clock generator (RCG).
26.4 Functional description
26.4.1 Main interfaces
APB slave interface
The AMBA APB interface generates read and write decodes for accesses to status and
control registers, and transmit and receive FIFO memories.
The AMBA APB is a local secondary bus that provides a low-power extension to the higher
bandwidth AMBA advanced high-performance bus (AHB) within the AMBA system
hierarchy. The AMBA APB groups narrow-bus peripherals to avoid loading the system bus
and provides an interface using memory-mapped registers, which are accessed under
programmed control.
Register block
The register block stores data written or to be read across the AMBA APB interface.
Clock prescaler
When configured as a master, an internal prescaler, comprising two free-running,
reloadable, serially linked counters, is used to provide the serial output clock SSPCLKOUT.
You can program the clock prescaler, through the SSPCPSR register, to divide SSPCLK by a
factor of 2 to 254 in steps of two. Because the least significant bit of the SSPCPSR
register is not used, division by an odd number is not possible, which ensures that a
symmetrical (equal mark-space ratio) clock is generated.
The output of the prescaler is further divided by a factor of 1 to 256, through the
programming of the SSPCR0 control register, to give the final master output clock SSPCLKOUT.
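The two-stage division can be sketched in Python as follows. The names CPSDVSR (the even prescale value in SSPCPSR) and SCR (where the second division factor is 1 + SCR) are taken from Section 26.5.1; the function names and the brute-force search strategy are hypothetical and only meant as an illustration.

```python
def ssp_bit_rate(sspclk_hz, cpsdvsr, scr):
    """Master output clock for a prescale value and a serial clock rate value."""
    if cpsdvsr % 2 or not 2 <= cpsdvsr <= 254:
        raise ValueError("CPSDVSR must be an even value from 2 to 254")
    if not 0 <= scr <= 255:
        raise ValueError("SCR must be in the range 0 to 255")
    # First stage divides by CPSDVSR, second stage by (1 + SCR).
    return sspclk_hz / (cpsdvsr * (1 + scr))


def find_dividers(sspclk_hz, target_hz):
    """Return (CPSDVSR, SCR, rate) with the highest rate not above target_hz."""
    best = None
    for cpsdvsr in range(2, 255, 2):
        for scr in range(256):
            rate = ssp_bit_rate(sspclk_hz, cpsdvsr, scr)
            if rate <= target_hz and (best is None or rate > best[2]):
                best = (cpsdvsr, scr, rate)
    return best  # None if even the largest division is still too fast
```

For a 48 MHz SSPCLK and a 1 MHz target, the search settles on CPSDVSR = 2 and SCR = 23, that is, a total division by 48. Several register pairs can give the same rate; this sketch keeps the first one found.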
Transmit FIFO
The common transmit FIFO is a 16-bit wide, 8-location deep, first-in, first-out memory buffer.
CPU data written across the AMBA APB interface are stored in the buffer until read out by
the transmit logic.
When configured as a master or a slave, parallel data is written into the transmit FIFO prior
to serial conversion and transmission to the attached slave or master respectively, through
the SSPTXD pin.
Receive FIFO
The common receive FIFO is a 16-bit wide, 8-location deep, first-in, first-out memory buffer.
Received data from the serial interface are stored in the buffer until read out by the CPU
across the AMBA APB interface.
When configured as a master or slave, serial data received through the SSPRXD pin is
registered prior to parallel loading into the attached slave or master receive FIFO
respectively.
Transmit and receive logic
When configured as a master, the clock to the attached slaves is derived from a divided
down version of SSPCLK through the prescaler operations described previously. The master
transmit logic successively reads a value from its transmit FIFO and performs parallel to
serial conversion on it. Then the serial data stream and frame control signal, synchronized
to SSPCLKOUT, are output through the SSPTXD pin to the attached slaves. The master receive
logic performs serial to parallel conversion on the incoming synchronous SSPRXD data stream,
extracting and storing values into its receive FIFO, for subsequent reading through the APB
interface.
When configured as a slave, the SSPCLKIN clock is provided by an attached master and
used to time its transmission and reception sequences. The slave transmit logic, under
control of the master clock, successively reads a value from its transmit FIFO, performs
parallel to serial conversion, and then outputs the serial data stream and frame control signal
through the slave SSPTXD pin. The slave receive logic performs serial to parallel conversion
on the incoming SSPRXD data stream, extracting and storing values into its receive FIFO,
for subsequent reading through the APB interface.
Interrupt generation logic
The SSP generates four individual maskable, active HIGH interrupts.
A combined interrupt output is also generated as an OR function of the individual interrupt
requests.
You can use the single combined interrupt with a system interrupt controller that provides
another level of masking on a per-peripheral basis. This allows use of modular device
drivers that always know where to find the interrupt source control register bits.
The individual interrupt requests could also be used with a system interrupt controller that
provides masking for the outputs of each peripheral. In this way, a global interrupt controller
service routine would be able to read the entire set of sources from one wide register in the
system interrupt controller. This is attractive where the time to read from the peripheral
registers is significant compared to the CPU clock speed in a real-time system.
The peripheral supports both the above methods.
The transmit and receive dynamic data-flow interrupts, TXINTR and RXINTR, are separated
from the status interrupts so that data can be read or written in response to the FIFO trigger
levels.
DMA interface
This block manages the DMA interface. It can work in single transfer mode or in burst
transfer mode. The DMA operation of the PrimeCell SSP is controlled through the DMA
control register, SSPDMACR.
The DMA interface includes the following signals:
● For receive:
– SSPRXDMASREQ: Single-character DMA transfer request, asserted by the SSP.
This signal is asserted when the receive FIFO contains at least one character.
– SSPRXDMABREQ: Burst DMA transfer request, asserted by the SSP. This signal
is asserted when the receive FIFO contains four or more characters.
– SSPRXDMACLR: DMA request clear, asserted by the DMA controller to clear the
receive request signals. If DMA burst transfer is requested, the clear signal is
asserted during the transfer of the last data in the burst.
● For transmit:
– SSPTXDMASREQ: Single-character DMA transfer request, asserted by the SSP.
This signal is asserted when there is at least one empty location in the transmit
FIFO.
– SSPTXDMABREQ: Burst DMA transfer request, asserted by the SSP. This signal
is asserted when the transmit FIFO contains four or fewer characters.
– SSPTXDMACLR: DMA request clear, asserted by the DMA controller to clear the
transmit request signals. If DMA burst transfer is requested, the clear signal is
asserted during the transfer of the last data in the burst.
The burst transfer and single transfer request signals are not mutually exclusive. They can
both be asserted at the same time. For example, when there is more data than the
watermark level of four in the receive FIFO, the burst transfer request and the single transfer
request are asserted. When the amount of data left in the receive FIFO is less than the
watermark level, the single request only is asserted. Each request signal remains asserted
until the relevant DMA clear signal is asserted.
After the request clear signal is deasserted, a request signal can become active again,
depending on the conditions described above.
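The request conditions described above can be sketched as a small software model. This is an illustration only, not a description of the hardware implementation: the function names are hypothetical, the FIFO depth of eight entries is an assumption taken from the "up to eight 16-bit values" wording in the programming section, and the latching of a request until the DMA clear signal is asserted is not modeled.

```python
FIFO_DEPTH = 8   # assumed depth (see lead-in); check the register manual
WATERMARK = 4    # watermark level of four characters, as stated in the text


def rx_dma_requests(rx_level):
    """Return the receive request levels for a given receive FIFO fill level."""
    return {
        "single": rx_level >= 1,         # SSPRXDMASREQ: at least one character
        "burst": rx_level >= WATERMARK,  # SSPRXDMABREQ: four or more characters
    }


def tx_dma_requests(tx_level):
    """Return the transmit request levels for a given transmit FIFO fill level."""
    return {
        "single": tx_level < FIFO_DEPTH,  # SSPTXDMASREQ: at least one empty location
        "burst": tx_level <= WATERMARK,   # SSPTXDMABREQ: four or fewer characters
    }
```

With five characters in the receive FIFO both receive requests are active; with two characters only the single request is active, which matches the example given in the text.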
Note:
The SSP receive and transmit DMA interfaces, as shown in Table 50: DMAC MUX selecting the
peripheral, are called SSPn_RX (consisting of SSPRXDMABREQ, SSPRXDMASREQ and
SSPRXDMACLR) and SSPn_TX (consisting of SSPTXDMABREQ, SSPTXDMASREQ and
SSPTXDMACLR).
For more detail on this interface, refer to Chapter 12: Direct memory access controllers
(DMAC).
Synchronizing registers and logic
The SSP supports both asynchronous and synchronous operation of the clocks, PCLK and
SSPCLK. Synchronization registers and handshaking logic have been implemented, and
are active at all times. This has a minimal impact on performance or area. Synchronization
of control signals is performed on both directions of data flow, which is from the PCLK to the
SSPCLK domain and from the SSPCLK to the PCLK domain.
26.5
Operation
This section describes the operation of the SSP block.
After reset, the SSP logic is disabled and must be configured. The SSP can be configured
as master or slave (see Section 26.6.3: Configuring SSP as master or slave).
The bit rate, derived from the APB clock (PCLK), requires the programming of the clock
prescale register SSPCPSR (refer to RM0089, Reference manual, SPEAr1340 address
map and registers, MISC registers, for the PCLK frequency).
26.5.1
Bit rate generation
Dividing down the input clock SSPCLK derives the serial bit rate. The clock is first divided by
an even prescale value CPSDVSR from 2 to 254, which is programmed in SSPCPSR. The
clock is further divided by a value from 1 to 256, which is 1 + SCR, where SCR is the value
programmed in SSPCR0.
The frequency of the output signal bit clock SSPCLKOUT is:
$$F_{SSPCLKOUT} = \frac{F_{SSPCLK}}{CPSDVSR \times (1 + SCR)}$$
For example, if SSPCLK is 3.6864 MHz, and CPSDVSR = 2, then SSPCLKOUT has a
frequency range from 7.2 kHz to 1.8432 MHz.
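The example above can be checked numerically. The following is a minimal sketch of the bit rate formula; the function name is hypothetical and the range checks simply mirror the CPSDVSR and SCR limits stated in this section.

```python
def ssp_bit_rate(f_sspclk_hz, cpsdvsr, scr):
    """Return the SSPCLKOUT frequency in Hz for the given divider settings."""
    if cpsdvsr % 2 != 0 or not 2 <= cpsdvsr <= 254:
        raise ValueError("CPSDVSR must be an even value from 2 to 254")
    if not 0 <= scr <= 255:
        raise ValueError("SCR must be in the range 0 to 255")
    return f_sspclk_hz / (cpsdvsr * (1 + scr))


# SSPCLK = 3.6864 MHz and CPSDVSR = 2, as in the example above
fastest = ssp_bit_rate(3_686_400, 2, 0)    # SCR = 0
slowest = ssp_bit_rate(3_686_400, 2, 255)  # SCR = 255
```

Here `fastest` evaluates to 1 843 200 Hz and `slowest` to 7 200 Hz, the two ends of the range quoted above.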
26.5.2
Frame format
Each data frame is between 4 and 16 bits long, depending on the size of data programmed,
and is transmitted starting with the MSB. There are three basic frame types that can be
selected:
●
Texas Instruments synchronous serial
●
Motorola SPI
●
National Semiconductor Microwire
For all three formats, the serial clock (SSPCLKOUT) is held inactive while the SSP is idle,
and transitions at the programmed frequency only during active transmission or reception of
data. The idle state of SSPCLKOUT is utilized to provide a receive timeout indication that
occurs when the receive FIFO still contains data after a timeout period.
For Motorola SPI and National Semiconductor Microwire frame formats, the serial frame
(SSPFSSOUT) pin is active LOW, and is asserted (pulled down) during the entire
transmission of the frame.
For Texas Instruments synchronous serial frame format, the SSPFSSOUT pin is pulsed for
one serial clock period starting at its rising edge, prior to the transmission of each frame. For
this frame format, both the SSP and the off-chip slave device drive their output data on the
rising edge of SSPCLKOUT, and latch data from the other device on the falling edge.
Unlike the full-duplex transmission of the other two frame formats, the National
Semiconductor Microwire format uses a special master-slave messaging technique, which
operates at half-duplex. In this mode, when a frame begins, an 8-bit control message is
transmitted to the off-chip slave. During this transmission no incoming data is received by
the SSP. After the message has been sent, the off-chip slave decodes it and, after waiting
one serial clock after the last bit of the 8-bit control message has been sent, responds with
the requested data. The returned data can be 4 to 16 bits in length, making the total frame
length anywhere from 13 to 25 bits.
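The total Microwire frame length follows directly from the description above: an 8-bit control message, one serial clock of waiting, and the returned data. A minimal sketch of that arithmetic (the function name is hypothetical):

```python
CONTROL_BITS = 8   # 8-bit control message sent by the master
TURNAROUND_BITS = 1  # one serial clock before the slave responds


def microwire_frame_bits(data_bits):
    """Return the total Microwire frame length in serial clocks."""
    if not 4 <= data_bits <= 16:
        raise ValueError("returned data must be 4 to 16 bits long")
    return CONTROL_BITS + TURNAROUND_BITS + data_bits
```

For 4-bit and 16-bit data this gives 13 and 25 clocks respectively.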
26.6
Programming
26.6.1
Defining the chip select
Four chip select lines are available, subject to the real availability of the external signals;
however, only one can be operational at a time. To select the active one, program the
ssp_cs_en field of the PERIPH_CFG miscellaneous register. The chip select can also be driven
by software through hs_ssp_sw_cs; this feature is enabled by hs_ssp_en. Driving the chip
select by software is mandatory when the prescaler is set to 2, that is, when SCR = 0 and
CPSDVSR = 2.
All the bits mentioned above (ssp_cs_en, hs_ssp_sw_cs and hs_ssp_en) belong to the
PERIPH_CFG miscellaneous register.
Table 138. External CS selection
ssp_cs_en [0,1] (from MISC)    CS
00                             SSPFSSOUT_0
01                             SSPFSSOUT_1
10                             SSPFSSOUT_2
11                             SSPFSSOUT_3
26.6.2
Enabling SSP operation
To enable SSP, you can either:
●
prime the transmit FIFO, by writing up to eight 16-bit values when the SSP is disabled,
-or-
●
allow the transmit FIFO service request to interrupt the CPU.
Once enabled, transmission or reception of data begins on the transmit (SSPTXD) and
receive (SSPRXD) pins respectively.
Clock ratios
There is a constraint on the ratio of the frequencies of PCLK to SSPCLK. The frequency of
SSPCLK must be less than or equal to that of PCLK. This ensures that control signals from
the SSPCLK domain to the PCLK domain are certain to get synchronized before one frame
duration:
FSSPCLK <= FPCLK.
In the slave operation, the SSPCLKIN signal from the external master is double
synchronized and then delayed to detect an edge. It takes three SSPCLKs to detect an edge
on SSPCLKIN. SSPTXD has less setup time to the falling edge of SSPCLKIN on which the
master is sampling the line. The setup and hold times on SSPRXD with reference to
SSPCLKIN must be more conservative to ensure that it is at the right value when the actual
sampling occurs within the SSPMS. To ensure correct device operation, SSPCLK must be at
least 12 times faster than the maximum expected frequency of SSPCLKIN.
The frequency selected for SSPCLK must accommodate the desired range of bit clock
rates. The ratio of minimum SSPCLK frequency to SSPCLKOUT maximum frequency in the
case of the slave mode is 12 and for the master mode it is two.
To generate a maximum bit rate of 1.8432 Mbps in the master mode, the frequency of
SSPCLK must be at least 3.6864 MHz. With an SSPCLK frequency of 3.6864 MHz, the
SSPCPSR register must be programmed with a value of two and the SCR[7:0] field in the
SSPCR0 register needs to be programmed as zero.
To work with a maximum bit rate of 1.8432 Mbps in the slave mode, the frequency of
SSPCLK must be at least 22.12 MHz. With an SSPCLK frequency of 22.12 MHz, the
SSPCPSR register can be programmed with a value of 12 and the SCR[7:0] field in the
SSPCR0 register can be programmed as zero. Similarly the ratio of SSPCLK maximum
frequency to SSPCLKOUT minimum frequency is 254 x 256.
The minimum frequency of SSPCLK is governed by the following equations, both of which
have to be satisfied:
FSSPCLK(min) >= 2 x FSSPCLKOUT(max) [for master mode]
FSSPCLK(min) >= 12 x FSSPCLKIN(max) [for slave mode]
The maximum frequency of SSPCLK is governed by the following equations, both of which
have to be satisfied:
FSSPCLK(max) <= 254 x 256 x FSSPCLKOUT(min) [for master mode]
FSSPCLK(max) <= 254 x 256 x FSSPCLKIN(min) [for slave mode]
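As an illustration of these constraints, the sketch below computes the minimum SSPCLK frequency for a given bit rate and searches for a CPSDVSR/SCR pair that produces a target rate. The function names and the exhaustive search strategy are assumptions made for this example; they are not part of the SSP or of any ST software.

```python
def min_sspclk_hz(max_bit_rate_hz, slave_mode):
    """Minimum SSPCLK frequency: 2x the bit rate as master, 12x as slave."""
    return (12 if slave_mode else 2) * max_bit_rate_hz


def find_ssp_dividers(f_sspclk_hz, target_hz):
    """Return (cpsdvsr, scr, actual_hz) giving the highest rate not above target_hz."""
    best = None
    for cpsdvsr in range(2, 255, 2):          # even prescale values 2..254
        for scr in range(256):                # second divider is 1 + SCR
            actual = f_sspclk_hz / (cpsdvsr * (1 + scr))
            if actual > target_hz:
                continue
            if best is None or actual > best[2]:
                best = (cpsdvsr, scr, actual)
            break  # a larger SCR only lowers the rate for this prescaler
    if best is None:
        raise ValueError("target rate is below SSPCLK / (254 x 256)")
    return best
```

For SSPCLK = 22.1184 MHz and a 1.8432 Mbps target, the search returns an overall divider of 12. It may express that divider differently from the CPSDVSR = 12, SCR = 0 pair quoted in the text (for example as CPSDVSR = 2, SCR = 5); both produce the same bit rate.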
26.6.3
Configuring SSP as master or slave
Through the control registers SSPCR0 and SSPCR1, you can configure the peripheral as a
master or slave operating under one of the following protocols:
●
Motorola SPI
●
Texas Instruments SSI
●
National Semiconductor Microwire
Programming the SSPCR0 control register
The SSPCR0 register is used to:
●
program the serial clock rate
●
select one of the three protocols
●
select the data word size (where applicable)
The serial clock rate (SCR) value, in conjunction with the SSPCPSR clock prescale divisor
value (CPSDVSR), is used to derive the SSP transmit and receive bit rate from the external
SSPCLK.
The frame format is programmed through the FRF bits and the data word size through the
DSS bits.
Bit phase and polarity, applicable to Motorola SPI format only, are programmed through the
SPH and SPO bits.
Programming the SSPCR1 control register
The SSPCR1 register is used to:
●
select master or slave mode
●
enable a loop back test feature
●
enable the SSP peripheral
To configure the SSP as a master, clear the SSPCR1 register master or slave selection bit
(MS) to 0, which is the default value on reset.
To configure the SSP as a slave, set the SSPCR1 register MS bit to 1. In this configuration,
to enable or disable the SSP SSPTXD signal, use the SSPCR1 slave mode SSPTXD output
disable bit (SOD). This can be used in some multislave environments where masters might
parallel broadcast.
To enable the operation of the SSP, set the synchronous serial port enable (SSE) bit to 1.
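The following sketch shows how the SSPCR0 and SSPCR1 values described above could be composed. The bit positions are not given in this manual; they are assumed here from the generic ARM PrimeCell SSP (PL022) register layout (SSPCR0: DSS in bits 3:0, FRF in bits 5:4, SPO in bit 6, SPH in bit 7, SCR in bits 15:8; SSPCR1: LBM in bit 0, SSE in bit 1, MS in bit 2, SOD in bit 3) and must be checked against RM0089 before use. The function and constant names are hypothetical.

```python
# Frame format encodings, assumed from the generic PrimeCell SSP layout
FRF_SPI, FRF_TI, FRF_MICROWIRE = 0, 1, 2


def sspcr0_value(data_bits, frf, scr, spo=0, sph=0):
    """Compose an SSPCR0 value (assumed field positions, see lead-in)."""
    if not 4 <= data_bits <= 16:
        raise ValueError("data size must be 4 to 16 bits")
    if frf not in (FRF_SPI, FRF_TI, FRF_MICROWIRE):
        raise ValueError("unknown frame format")
    if not 0 <= scr <= 255:
        raise ValueError("SCR must be in the range 0 to 255")
    dss = data_bits - 1  # assumed encoding: DSS = data size minus one
    return (scr << 8) | ((sph & 1) << 7) | ((spo & 1) << 6) | (frf << 4) | dss


def sspcr1_value(slave=False, sod=False, loopback=False, enable=False):
    """Compose an SSPCR1 value (assumed field positions, see lead-in)."""
    return (int(sod) << 3) | (int(slave) << 2) | (int(enable) << 1) | int(loopback)
```

A typical sequence would write SSPCR0 and SSPCPSR first, then SSPCR1 with MS set as required, and set SSE last, so that the port is enabled only once it is fully configured.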
27
I2C bus controllers (I2C)
This chapter focuses on I2C functionality and operation.
For the I2C feature list, refer to the SPEAr1340 datasheet:
●
Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
●
RM0089, Reference manual, SPEAr1340 address map and registers
27.1
Overview
The SPEAr1340 device integrates one instance of an I2C controller, identified as I2C0. The
I2C controller acts as an APB slave interface to the two-wire serial I2C bus.
Figure 144. I2C block diagram
[Block diagram of the I2C controller: AMBA bus interface unit, register file, slave state
machine, master state machine, clock generator, Rx shift, Tx shift, Rx filter, toggle,
synchronizer, DMA interface, interrupt controller, RX FIFO and TX FIFO]
27.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
27.3
Clocks
See Chapter 5: Reset and clock generator (RCG).
27.4
Functional description
The I2C bus is a two-wire serial interface, consisting of a serial data line (SDA) and a serial
clock (SCL). These wires carry information between the devices connected to the bus. Each
device is recognized by a unique address and can operate as either a “transmitter” or
“receiver,” depending on the function of the device. Devices can also be considered as
masters or slaves when performing data transfers. A master is a device that initiates a data
transfer on the bus and generates the clock signals to permit that transfer. At that time, any
device addressed is considered a slave.
Note:
The I2C must be programmed to operate in either master or slave mode only.
Operating as a master and slave simultaneously is not supported.
The I2C module can operate in standard mode (with data rates up to 100 Kb/s), fast mode
(with data rates up to 400 Kb/s), and high-speed mode (with data rates up to 3.4 Mb/s). The
I2C can communicate only with devices of these modes, as long as they are attached to the
bus. Additionally, high-speed mode and fast mode devices are downward compatible. For
instance, high-speed mode devices can communicate with fast mode and standard mode
devices in a mixed-speed bus system; fast mode devices can communicate with standard
mode devices in a 0 to 100 Kb/s I2C bus system. However, standard mode devices are not
upward compatible and should not be incorporated in a fast-mode I2C bus system, as they
cannot follow the higher transfer rate and unpredictable states would occur.
Examples of high-speed mode devices are LCD displays, high-bit count ADCs, and high
capacity EEPROMs. These devices typically need to transfer large amounts of data. Most
maintenance and control applications, the common use for the I2C bus, typically operate at
100 kHz (in standard and fast modes). Any I2C device can be attached to an I2C bus and
every device can talk with any master, passing information back and forth. There needs to
be at least one master (such as a microcontroller or DSP) on the bus, but there can be
multiple masters, which requires them to arbitrate for ownership. Multiple masters and
arbitration are explained later in this chapter.
27.4.1
Main interfaces
The I2C is made up of an AMBA APB slave interface, an I2C interface, and FIFO logic to
maintain coherency between the two interfaces. A simplified block diagram of the
component is illustrated in Figure 144.
The following list defines the main functions of the I2C blocks.
●
AMBA bus interface unit: takes the APB interface signals and translates them into a
common generic interface that allows the register file to be bus protocol-agnostic.
●
Register file: contains configuration registers and is the interface with software.
●
Slave state machine: follows the protocol for a slave and monitors bus for address
match.
●
Master state machine: generates the I2C protocol for the master transfers.
●
Clock generator: calculates the required timing to do the following:
–
generate the SCL clock when configured as a master
–
check for bus idle
–
generate a START and a STOP
–
set up the data and hold the data
●
Rx shift: takes data into the design and extracts it in byte format.
●
Tx shift: presents data supplied by CPU for transfer on the I2C bus.
●
Rx filter: detects the events in the bus; for example, start, stop and arbitration lost.
●
Toggle: generates pulses on both sides and toggles to transfer signals across clock
domains.
●
Synchronizer: transfers signals from one clock domain to another.
●
DMA interface: generates the handshaking signals to the central DMA controller in
order to automate the data transfer without CPU intervention.
●
Interrupt controller: generates the raw interrupt and interrupt flags, allowing them to be
set and cleared.
●
RX FIFO/TX FIFO: holds the RX FIFO and TX FIFO register banks and controllers,
along with their status levels.
27.4.2
I2C terminology
The following terms are used throughout this manual and are defined as follows.
I2C bus terms
The following terms relate to the role of the I2C device and how it interacts with other
I2C devices on the bus.
●
Transmitter: the device that sends data to the bus. A transmitter can either be a device
that initiates the data transmission to the bus (a master-transmitter) or responds to a
request from the master to send data to the bus (a slave-transmitter).
●
Receiver: the device that receives data from the bus. A receiver can either be a device
that receives data on its own request (a master-receiver) or in response to a request
from the master (a slave-receiver).
●
Master: the component that initializes a transfer (START command), generates the
clock (SCL) signal and terminates the transfer (STOP command). A master can be
either a transmitter or a receiver.
●
Slave: the device addressed by the master. A slave can be either receiver or
transmitter.
These concepts are illustrated in Figure 145.
Figure 145. Master/slave and transmitter/receiver relationships
[Diagram: a master and a slave connected by the SDA and SCL lines, shown in both
transmitter and receiver roles]
●
Multi-master: the ability for more than one master to co-exist on the bus at the same
time without collision or data loss.
●
Arbitration: the predefined procedure that authorizes only one master at a time to take
control of the bus. For more information about this behavior, refer to Section 27.4.5:
Multiple master arbitration.
●
Synchronization: the predefined procedure that synchronizes the clock signals
provided by two or more masters. For more information about this feature, refer to
Section 27.4.6: Clock synchronization.
●
SDA: data signal line (Serial DAta)
●
SCL: clock signal line (Serial CLock)
Bus transfer terms
The following terms are specific to data transfers that occur to/from the I2C bus.
●
START (RESTART): data transfer begins with a START or RESTART condition. The
level of the SDA data line changes from high to low, while the SCL clock line remains
high. When this occurs, the bus becomes busy.
Note:
START and RESTART conditions are functionally identical.
●
STOP: data transfer is terminated by a STOP condition. This occurs when the level on
the SDA data line passes from the low state to the high state, while the SCL clock line
remains high. When the data transfer has been terminated, the bus is free or idle once
again. The bus stays busy if a RESTART is generated instead of a STOP condition.
27.4.3
I2C behavior
The I2C can be controlled via software to be either:
●
An I2C master only, communicating with other I2C slaves; OR
●
An I2C slave only, communicating with one or more I2C masters.
The master is responsible for generating the clock and controlling the transfer of data. The
slave is responsible for either transmitting or receiving data to/from the master. The
acknowledgement of data is sent by the device that is receiving data, which can be either a
master or a slave. As mentioned previously, the I2C protocol also allows multiple masters to
reside on the I2C bus and uses an arbitration procedure to determine bus ownership.
Each slave has a unique address that is determined by the system designer. When a master
wants to communicate with a slave, the master transmits a START/RESTART condition that
is then followed by the slave’s address and a control bit (R/W) to determine if the master
wants to transmit data or receive data from the slave. The slave then sends an acknowledge
(ACK) pulse after the address.
If the master (master-transmitter) is writing to the slave (slave-receiver), the receiver gets
one byte of data. This transaction continues until the master terminates the transmission
with a STOP condition. If the master is reading from a slave (master-receiver), the slave
transmits (slave-transmitter) a byte of data to the master, and the master then acknowledges
the transaction with the ACK pulse. This transaction continues until the master terminates
the transmission by not acknowledging (NACK) the transaction after the last byte is
received, and then the master issues a STOP condition or addresses another slave after
issuing a RESTART condition. This behavior is illustrated in Figure 146.
Figure 146. Data transfer on the I2C bus
[Timing diagram: SDA and SCL from the START or RESTART condition (S or R), through data
bits MSB to LSB on clocks 1 to 8 and the ACK from the slave or receiver on clock 9, with SCL
held low while the slave services the byte complete interrupt, to the STOP or RESTART
condition (P or R)]
The I2C is a synchronous serial interface. The SDA line is a bidirectional signal and
changes only while the SCL line is low, except for STOP, START, and RESTART conditions.
The output drivers are open-drain or open-collector to perform wire-AND functions on the
bus. The maximum number of devices on the bus is limited by only the maximum
capacitance specification of 400 pF. Data is transmitted in byte packages.
Putting data into the FIFO generates a START, and emptying the FIFO generates a STOP.
For more information, refer to START and STOP generation.
The I2C protocols implemented in I2C are described in more detail in Section 27.4.4: I2C
protocols.
START and STOP generation
When operating as an I2C master, putting data into the transmit FIFO causes the I2C to
generate a START condition on the I2C bus. Allowing the transmit FIFO to empty causes the
I2C to generate a STOP condition on the I2C bus.
When operating as a slave, the I2C does not generate START and STOP conditions, as per
the protocol. However, if a read request is made to the I2C, it holds the SCL line low until
read data has been supplied to it. This stalls the I2C bus until read data is provided to the
slave I2C, or the I2C slave is disabled by writing a 0 to IC_ENABLE.
Combined formats
The I2C supports mixed read and write combined format transactions in both 7-bit and
10-bit addressing modes.
The I2C does not support combined format transactions that mix address formats, that is, a
7-bit address transaction followed by a 10-bit address transaction or vice versa.
To initiate combined format transfers, IC_CON.IC_RESTART_EN should be set to 1. With
this value set and operating as a master, when the I2C completes an I2C transfer, it checks
the transmit FIFO and executes the next transfer. If the direction of this transfer differs from
the previous transfer, the combined format is used to issue the transfer. If the transmit FIFO
is empty when the current I2C transfer completes, a STOP is issued and the next transfer is
issued following a START condition.
27.4.4
I2C protocols
The I2C has the protocols discussed in this section.
START and STOP conditions
When the bus is idle, both the SCL and SDA signals are pulled high through external pull-up
resistors on the bus. When the master wants to start a transmission on the bus, the master
issues a START condition. This is defined to be a high-to-low transition of the SDA signal
while SCL is 1. When the master wants to terminate the transmission, the master issues a
STOP condition. This is defined to be a low-to-high transition of the SDA line while SCL is 1.
Figure 147 shows the timing of the START and STOP conditions. When data is being
transmitted on the bus, the SDA line must be stable when SCL is 1.
Figure 147. START and STOP condition
[Timing diagram: SDA and SCL showing the start condition (S), the interval where the data line
is stable and data is valid, the intervals where a change of data is allowed, and the stop
condition (P)]
Note:
The signal transitions for the START/STOP conditions, as depicted in Figure 147, reflect
those observed at the output signals of the Master driving the I2C bus. Care should be taken
when observing the SDA/SCL signals at the input signals of the Slave(s), because unequal
line delays may result in an incorrect SDA/SCL timing relationship.
Addressing slave protocol
There are two address formats: the 7-bit address format and the 10-bit address format.
7-bit address format
During the 7-bit address format, the first seven bits (bits 7:1) of the first byte set the slave
address and the LSB bit (bit 0) is the R/W bit as shown in Figure 148. When bit 0 (R/W) is
set to 0, the master writes to the slave. When bit 0 (R/W) is set to 1, the master reads from
the slave.
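The byte layout described above can be expressed in a few lines. This is a minimal sketch with a hypothetical function name; the slave address 0x50 used in the usage note is an arbitrary illustrative value.

```python
def address_byte_7bit(slave_address, read):
    """Build the first byte of a 7-bit addressed transfer.

    Bits 7:1 carry the slave address and bit 0 is the R/W bit
    (0 = master writes to the slave, 1 = master reads from the slave).
    """
    if not 0 <= slave_address <= 0x7F:
        raise ValueError("7-bit slave address must be in the range 0x00 to 0x7F")
    return (slave_address << 1) | (1 if read else 0)
```

For a slave at address 0x50, a write transfer starts with the byte 0xA0 and a read transfer with 0xA1.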
Figure 148. 7-bit address format
S | A6 A5 A4 A3 A2 A1 A0 | R/W | ACK
A6 (MSB) to A0 = slave address; R/W is the LSB of the byte; ACK is sent by the slave
S = START condition
R/W = Read/Write pulse
ACK = Acknowledge
10-bit address format
During 10-bit addressing, two bytes are transferred to set the 10-bit address. The transfer of
the first byte contains the following bit definition. The first five bits (bits 7:3) notify the slaves
that this is a 10-bit transfer, followed by the next two bits (bits 2:1), which set the slave
address bits 9:8, and the LSB bit (bit 0) is the R/W bit. The second byte transferred sets bits
7:0 of the slave address. Figure 149 shows the 10-bit address format, and Table 139 defines
the special purpose and reserved first byte addresses.
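The two address bytes can be derived from a 10-bit slave address as sketched below. The function name is hypothetical, and the address 0x2A5 in the usage note is an arbitrary illustrative value.

```python
def address_bytes_10bit(slave_address, read):
    """Build the two address bytes of a 10-bit addressed transfer.

    First byte:  bits 7:3 = 11110 (10-bit transfer marker),
                 bits 2:1 = slave address bits 9:8,
                 bit 0    = R/W bit.
    Second byte: slave address bits 7:0.
    """
    if not 0 <= slave_address <= 0x3FF:
        raise ValueError("10-bit slave address must be in the range 0x000 to 0x3FF")
    first = 0b11110000 | ((slave_address >> 8) << 1) | (1 if read else 0)
    second = slave_address & 0xFF
    return first, second
```

For the address 0x2A5 with the R/W bit set to 0, the bytes are 0xF4 followed by 0xA5.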
Figure 149. 10-bit address format
S | ‘1’ ‘1’ ‘1’ ‘1’ ‘0’ | A9 A8 | R/W | ACK | A7 A6 A5 A4 A3 A2 A1 A0 | ACK
‘1’ ‘1’ ‘1’ ‘1’ ‘0’ = reserved for 10-bit address; each ACK is sent by the slave
S = START condition
R/W = Read/Write pulse
ACK = Acknowledge
Table 139. I2C definition of bits in first byte
Slave address    R/W bit    Description
0000 000         0          General call address. I2C places the data in the receive buffer and
                            issues a general call interrupt.
0000 000         1          START byte. For more information, refer to START BYTE transfer
                            protocol on page 429.
0000 001         X          CBUS address. I2C ignores these accesses.
0000 010         X          Reserved
0000 011         X          Reserved
0000 1xx         X          High-speed master code. For more information, refer to
                            Section 27.4.5: Multiple master arbitration.
1111 1xx         X          Reserved
1111 0xx         X          10-bit slave addressing
I2C does not restrict you from using these reserved addresses. However, if you use these
reserved addresses, you may run into incompatibilities with other I2C components.
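To avoid picking a reserved value by accident, software can classify a first byte against Table 139 before using it. The sketch below is an illustration only; the function name and the returned strings are hypothetical.

```python
def classify_first_byte(byte):
    """Classify the first byte of a transfer according to Table 139."""
    addr7 = (byte >> 1) & 0x7F  # bits 7:1, the slave address field
    rw = byte & 1               # bit 0, the R/W bit
    if addr7 == 0b0000000:
        return "START byte" if rw else "general call address"
    if addr7 == 0b0000001:
        return "CBUS address"
    if addr7 in (0b0000010, 0b0000011):
        return "reserved"
    if addr7 >> 2 == 0b00001:
        return "high-speed master code"
    if addr7 >> 2 == 0b11111:
        return "reserved"
    if addr7 >> 2 == 0b11110:
        return "10-bit slave addressing"
    return "7-bit slave address"
```

Any byte that the function reports as "7-bit slave address" falls outside the special purpose and reserved ranges of the table.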
Transmitting and receiving protocol
The master can initiate data transmission and reception to/from the bus, acting as either a
master-transmitter or master-receiver. A slave responds to requests from the master to
either transmit data or receive data to/from the bus, acting as either a slave-transmitter or
slave-receiver, respectively.
Master-transmitter and slave-receiver
All data is transmitted in byte format, with no limit on the number of bytes transferred per
data transfer. After the master sends the address and R/W bit or the master transmits a byte
of data to the slave, the slave-receiver must respond with the acknowledge signal (ACK).
When a slave-receiver does not respond with an ACK pulse, the master aborts the transfer
by issuing a STOP condition. The slave must leave the SDA line high so that the master can
abort the transfer.
If the master-transmitter is transmitting data as shown in Figure 150, then the slave-receiver
responds to the master-transmitter with an acknowledge pulse after every byte of data is
received.
Figure 150. Master-transmitter protocol
For 7-bit address:
S | Slave Address | R/W ‘0’ (write) | A | DATA | A | DATA | A or /A | P
For 10-bit address:
S | Slave Address First 7 bits ‘11110xxx’ | R/W ‘0’ (write) | A | Slave Address Second Byte |
A | DATA | A or /A | P
The acknowledge fields (A, /A) are sent from slave to master; all other fields are sent from
master to slave.
A = Acknowledge (SDA low)
/A = No acknowledge (SDA high)
S = START condition
P = STOP condition
Master-receiver and slave-transmitter
If the master is receiving data as shown in Figure 151, then the master responds to the
slave-transmitter with an acknowledge pulse after a byte of data has been received, except
for the last byte. This is the way the master-receiver notifies the slave-transmitter that this is
the last byte. The slave-transmitter relinquishes the SDA line after detecting the No
Acknowledge (NACK) so that the master can issue a STOP condition.
When a master does not want to relinquish the bus with a STOP condition, the master can
issue a RESTART condition. This is identical to a START condition except it occurs after the
ACK pulse. Operating in master mode, the I2C can then communicate with the same slave
using a transfer of a different direction. For a description of the combined format
transactions that the I2C supports, refer to Combined formats on page 425.
Note:
If I2C_DYNAMIC_TAR_UPDATE = 1, the I2C must be inactive on the serial port before
the target slave address register (IC_TAR) can be reprogrammed.
Figure 151. Master-receiver protocol
For 7-bit address:
S | Slave Address | R/W ‘1’ (read) | A | DATA | A | DATA | /A | P
For 10-bit address:
S | Slave Address First 7 bits ‘11110xxx’ | R/W ‘0’ (write) | A | Slave Address Second Byte |
A | Sr | Slave Address First 7 bits ‘11110xxx’ | R/W ‘1’ (read) | A | DATA | /A | P
The acknowledges that follow the address bytes and the DATA bytes are sent from slave to
master; all other fields, including the acknowledges that follow DATA, are sent from master to
slave.
A = Acknowledge (SDA low)
/A = No acknowledge (SDA high)
S = START condition
Sr = RESTART condition
P = STOP condition
START BYTE transfer protocol
The START BYTE transfer protocol is set up for systems that do not have an on-board
dedicated I2C hardware module. When the I2C is addressed as a slave, it always samples
the I2C bus at the highest speed supported so that it never requires a START BYTE transfer.
However, when I2C is a master, it supports the generation of START BYTE transfers at the
beginning of every transfer in case a slave device requires it. This protocol consists of seven
zeros being transmitted followed by a 1, as illustrated in Figure 152. This allows the
processor that is polling the bus to under-sample the address phase until 0 is detected.
Once the microcontroller detects a 0, it switches from the under sampling rate to the correct
rate of the master.
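Figure 152. START BYTE transfer
[Timing diagram: SDA and SCL showing the START condition (S), the start byte 00000001 on
clocks 1 to 8, the dummy acknowledge (HIGH) on clock 9 (ACK), and the RESTART
condition (Sr)]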
The START BYTE procedure is as follows:
1.
Master generates a START condition.
2.
Master transmits the START byte (0000 0001).
3.
Master transmits the ACK clock pulse. (Present only to conform with the byte handling
format used on the bus)
4.
No slave sets the ACK signal to 0.
5.
Master generates a RESTART (R) condition.
A hardware receiver does not respond to the START BYTE because it is a reserved address
and resets after the RESTART condition is generated.
27.4.5
Multiple master arbitration
The I2C bus protocol allows multiple masters to reside on the same bus. If there are two
masters on the same I²C-bus, there is an arbitration procedure if both try to take control of
the bus at the same time by generating a START condition at the same time. Once a master
(for example, a microcontroller) has control of the bus, no other master can take control until
the first master sends a STOP condition and places the bus in an idle state.
Arbitration takes place on the SDA line, while the SCL line is 1. The master, which transmits
a 1 while the other master transmits 0, loses arbitration and turns off its data output stage.
The master that lost arbitration can continue to generate clocks until the end of the byte
transfer. If both masters are addressing the same slave device, the arbitration could go into
the data phase.
Upon detecting that it has lost arbitration to another master, the I2C will stop generating SCL
(ic_clk_oe).
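The arbitration rule can be illustrated with a small bit-level model of the wired-AND SDA line. This is a simplified sketch, not a description of the controller logic: the function name is hypothetical, both masters are assumed to start at the same time, and each list holds the bits a master intends to transmit, MSB first.

```python
def arbitrate(bits_a, bits_b):
    """Return (winner, bit_index) for two masters driving SDA bit by bit.

    The bus level is the wired-AND of both outputs, so a master that
    transmits 1 while the other transmits 0 sees 0 on SDA and loses.
    Returns (None, None) if no difference occurs in the compared bits.
    """
    for index, (a, b) in enumerate(zip(bits_a, bits_b)):
        sda = a & b  # wired-AND: any master driving 0 pulls the line low
        if a != sda:
            return "B", index  # master A sent 1 but read back 0
        if b != sda:
            return "A", index  # master B sent 1 but read back 0
    return None, None
```

For example, with `bits_a = [1, 0, 1, 1, 0]` and `bits_b = [1, 0, 1, 0, 0]`, the first three bits match and master A loses at index 3, where it transmits a 1 against the 0 of master B.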
Figure 153 illustrates the timing of when two masters are arbitrating on the bus.
Figure 153. Multiple master arbitration
[Timing diagram: DATA1, DATA2, SDA and SCL from the START condition. The MSBs match
and SDA lines up with DATA1; when DATA1 is ‘1’ and DATA2 is ‘0’, DATA1 loses arbitration
and SDA mirrors DATA2]
For high-speed mode, the arbitration cannot go into the data phase because each master is
programmed with a unique high-speed master code. This 8-bit code is defined by the system
designer and is set by writing to the High Speed Master Mode Code Address Register,
IC_HS_MADDR. Because the codes are unique, only one master can win arbitration, which
occurs by the end of the transmission of the high-speed master code.
Control of the bus is determined by address or master code and data sent by competing
masters, so there is no central master nor any order of priority on the bus.
Arbitration is not allowed between the following conditions:
●
A RESTART condition and a data bit
●
A STOP condition and a data bit
●
A RESTART condition and a STOP condition
Slaves are not involved in the arbitration process.
27.4.6
Clock synchronization
When two or more masters try to transfer information on the bus at the same time, they must
arbitrate and synchronize the SCL clock. All masters generate their own clock to transfer
messages. Data is valid only during the high period of SCL clock. Clock synchronization is
performed using the wired-AND connection to the SCL signal. When the master transitions
the SCL clock to 0, the master starts counting the low time of the SCL clock and transitions
the SCL clock signal to 1 at the beginning of the next clock period. However, if another
master is holding the SCL line to 0, then the master goes into a HIGH wait state until the
SCL clock line transitions to 1.
All masters then count off their high time, and the master with the shortest high time
transitions the SCL line to 0. The masters then count out their low time and the one with the
longest low time forces the other master into a HIGH wait state. Therefore, a synchronized
SCL clock is generated, which is illustrated in Figure 154. Optionally, slaves may hold the
SCL line low to slow down the timing on the I2C bus.
Figure 154. Multi-master clock synchronization
Signals shown: CLKA, CLKB, SCL. Annotations: wait state; start counting HIGH period; an SCL LOW transition resets all CLKs to start counting their LOW periods; SCL transitions HIGH when all CLKs are in the HIGH state.
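The synchronization rule described above can be illustrated with a small tick-based model. This is a simplified sketch, not a description of the hardware: the function name `synchronized_scl`, the tick granularity, and the example counts are assumptions made for illustration only. Each master drives SCL low for its own low count, the wired-AND keeps SCL low until every master has released it, and the first master to finish its high count pulls SCL low again.

```python
def synchronized_scl(masters, ticks):
    """Return the wired-AND SCL level (0 or 1) for each tick.

    masters: list of (low_count, high_count) tuples, one per master
             (illustrative values, in arbitrary clock ticks).
    """
    low_left = [low for low, _ in masters]   # all masters start a LOW period together
    high_left = [0] * len(masters)
    trace = []
    for _ in range(ticks):
        # A master drives 0 while its low count is running; otherwise it releases SCL.
        outputs = [0 if left > 0 else 1 for left in low_left]
        scl = min(outputs)                   # wired-AND: any 0 forces SCL to 0
        trace.append(scl)
        if scl == 0:
            low_left = [max(left - 1, 0) for left in low_left]
            if all(left == 0 for left in low_left):
                # Last master released SCL: every master starts its HIGH count.
                high_left = [high for _, high in masters]
        else:
            high_left = [left - 1 for left in high_left]
            if min(high_left) == 0:
                # Shortest high time expired: SCL falls and all low counts restart.
                low_left = [low for low, _ in masters]
    return trace


# Master A: low 3, high 2. Master B: low 5, high 4 (hypothetical values).
trace = synchronized_scl([(3, 2), (5, 4)], 14)
print(trace)
```

In this model the resulting SCL has a low period of 5 ticks (the longest low count) and a high period of 2 ticks (the shortest high count); master A spends 2 ticks of every low period in the wait state.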
27.4.7
IC_CLK frequency configuration
When the I2C is configured as a master, the *CNT registers must be set before any I2C bus
transaction can take place in order to ensure proper I/O timing. The *CNT registers are:
●
IC_SS_SCL_HCNT
●
IC_SS_SCL_LCNT
●
IC_FS_SCL_HCNT
●
IC_FS_SCL_LCNT
●
IC_HS_SCL_HCNT
●
IC_HS_SCL_LCNT
Note:
It is not necessary to program any of the *CNT registers if the I2C is enabled to operate only
as an I2C slave, since these registers are used only to determine the SCL timing
requirements for operation as an I2C master.
Minimum high and low counts
When the I2C operates as an I2C master, in both transmit and receive transfers:
●
Minimum value that can be programmed in the *_LCNT registers is 8
●
Minimum value allowed for the *_HCNT registers is 6
The minimum value of 8 for the *_LCNT registers is due to the time required for the I2C to
drive SDA after a negative edge of SCL.
The minimum value of 6 for the *_HCNT register is due to the time required for the I2C to
sample SDA during the high period of SCL.
The I2C adds one cycle to the programmed *_LCNT value in order to generate the low
period of the SCL clock. This is due to the counting logic for SCL low counting to
(*_LCNT+1).
The I2C adds eight cycles to the programmed *_HCNT value in order to generate the high
period of the SCL clock. This is due to the following factors:
●
The counting logic for SCL high counts to (*_HCNT+1).
●
The digital filtering applied to the SCL line incurs a delay of four ic_clk cycles. This
filtering includes metastability removal and a 2-out-of-3 majority vote processing on
SDA and SCL edges.
●
Whenever SCL is driven from 1 to 0 by the I2C (that is, when completing the SCL high time), an
internal logic latency of three ic_clk cycles is incurred.
Consequently, the minimum SCL low time of which the I2C is capable is nine (9) ic_clk
periods (8+1), while the minimum SCL high time is fourteen (14) ic_clk periods (6+1+4+3).
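The relationship between the programmed values and the generated SCL timing can be summarized in a short sketch. The function and constant names below are hypothetical helpers, not part of any driver; the offsets (+1 for low, +1+4+3 for high) and the minimum program values are taken from the text above.

```python
LCNT_MIN = 8  # minimum value that can be programmed in the *_LCNT registers
HCNT_MIN = 6  # minimum value allowed for the *_HCNT registers


def scl_low_cycles(lcnt):
    """ic_clk cycles of SCL low time for a programmed *_LCNT value."""
    if lcnt < LCNT_MIN:
        raise ValueError("*_LCNT must be at least 8")
    return lcnt + 1  # counting logic counts to (*_LCNT + 1)


def scl_high_cycles(hcnt):
    """ic_clk cycles of SCL high time for a programmed *_HCNT value."""
    if hcnt < HCNT_MIN:
        raise ValueError("*_HCNT must be at least 6")
    # (*_HCNT + 1) counting, 4 cycles of input filtering, 3 cycles of logic latency
    return (hcnt + 1) + 4 + 3


print(scl_low_cycles(8), scl_high_cycles(6))
```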
Minimum IC_CLK frequency
This section describes the minimum ic_clk frequencies that the I2C supports for each speed
mode, and the associated high and low count values. It should be noted that these limits
apply to the I2C in both master and slave modes. The limits for slave mode are required so
that the I2C does not violate the maximum tHD;DAT timing requirement of the I2C protocol.
Standard and fast modes
This section details how to derive a minimum ic_clk value for standard and fast modes of the
I2C. Although the following method shows how to do fast mode calculations, you can also
use the same method in order to do calculations for standard mode.
Given conditions and calculations for the minimum I2C ic_clk value in fast mode:
●
Fast mode has a data rate of 400 kb/s, which implies an SCL period of 1/400 kHz = 2.5 μs
●
Minimum hcnt value of 14 as a seed value; IC_HCNT_FS = 14
●
Protocol minimum SCL high and low times:
–
MIN_SCL_LOWtime_FS = 1300 ns
–
MIN_SCL_HIGHtime_FS = 600 ns
Derived equations:
\[ \text{IC\_CLK\_PERIOD} = \frac{\text{SCL\_PERIOD\_FS}}{\text{IC\_HCNT\_FS} + \text{IC\_LCNT\_FS}} \]
\[ \text{IC\_LCNT\_FS} \times \text{IC\_CLK\_PERIOD} = \text{MIN\_SCL\_LOWtime\_FS} \]
Combined, the previous equations produce the following:
\[ \text{IC\_LCNT\_FS} \times \frac{\text{SCL\_PERIOD\_FS}}{\text{IC\_LCNT\_FS} + \text{IC\_HCNT\_FS}} = \text{MIN\_SCL\_LOWtime\_FS} \]
Solving for IC_LCNT_FS:
\[ \text{IC\_LCNT\_FS} \times \frac{2.5\ \mu\text{s}}{\text{IC\_LCNT\_FS} + 14} = 1.3\ \mu\text{s} \]
The previous equation gives:
\[ \text{IC\_LCNT\_FS} = \text{roundup}(15.166) = 16 \]
These calculations produce IC_LCNT_FS = 16 and IC_HCNT_FS = 14, giving an ic_clk
value of:
\[ \frac{2.5\ \mu\text{s}}{16 + 14} = 83.3\ \text{ns} \;\Rightarrow\; 12\ \text{MHz} \]
Testing these results shows that protocol requirements are satisfied.
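The derivation above can be reproduced numerically. The following is a minimal sketch under the stated conditions (seed high count of 14, times in nanoseconds); the function name `min_ic_clk_period_ns` is an assumption introduced here for illustration.

```python
def min_ic_clk_period_ns(scl_period_ns, min_low_ns, hcnt=14):
    """Return (lcnt, ic_clk period in ns) for the given SCL period.

    Solves lcnt * scl_period / (lcnt + hcnt) >= min_low for the
    smallest integer lcnt, then derives the ic_clk period.
    """
    # lcnt >= min_low * hcnt / (scl_period - min_low), rounded up
    lcnt = -(-(min_low_ns * hcnt) // (scl_period_ns - min_low_ns))
    return lcnt, scl_period_ns / (lcnt + hcnt)


# Fast mode: SCL period 2.5 us, minimum SCL low time 1300 ns
lcnt, period_ns = min_ic_clk_period_ns(2500, 1300)
low_ns = lcnt * period_ns    # resulting SCL low time
high_ns = 14 * period_ns     # resulting SCL high time
print(lcnt, round(period_ns, 1), round(low_ns), round(high_ns))
```

With these inputs the low count is 16 and the ic_clk period is about 83.3 ns; the resulting low and high times (about 1333 ns and 1167 ns) both exceed the protocol minimums of 1300 ns and 600 ns.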
High-speed modes
The method used for standard and fast modes is not sufficient to derive correct ic_clk values
for the high-speed modes. For example, given a high-speed mode with a 100 pF bus loading,
using the standard and fast modes method produces the following:
●
IC_LCNT_HS = 17
●
IC_HCNT_HS = 14
●
ic_clk = 105.4 MHz
Depending on glitch suppression, the I2C can take up to nine ic_clk cycles to drive SDA
after a negative edge of SCL; however, the protocol requires a maximum tHD;DAT of 70 ns for
this mode. For example:
●
105.4 MHz => IC_CLK_PERIOD = 9.48 ns
●
9.48 ns * 9 = 85.32 ns
●
85.32 ns violates the 70 ns maximum tHD;DAT
Thus, these values cannot be used.
To satisfy this rule, IC_CLK_PERIOD can be derived as follows:
\[ \text{IC\_CLK\_PERIOD} = \frac{70\ \text{ns}}{9} = 7.77\ \text{ns} \]
From this value, high and low count values can be derived:
IC_LCNT_HS × IC_CLK_PERIOD ≥ MIN_SCL_LOWtime_HS
IC_LCNT_HS × 7.77 ns ≥ 160 ns
IC_LCNT_HS ≥ 21
The minimum value of 14 for IC_HCNT_HS easily accommodates the
MIN_SCL_HIGHtime_HS requirement of 60 ns for this mode. Therefore:
IC_HCNT_HS = 14
IC_LCNT_HS = 21
This derivation gives a baud rate higher than the allowed 3.4 Mb/s, but the high or low count
can be scaled up to give the desired baud rate.
Given:
SCL_PERIOD = 1/3.4 MHz = 294 ns
IC_CLK_PERIOD = 7.77 ns
Required:
roundup(294/7.77) = 38 ic_clk periods for a baud rate of 3.4 Mb/s
To achieve this, the low count must be scaled up by 3 to give:
IC_HCNT_HS = 14
IC_LCNT_HS = 24
The values for HS mode with a bus loading of 400 pF can be derived in the same way.
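The high-speed derivation can be checked with a few lines of arithmetic. This is a sketch under the values quoted in the text for a 100 pF bus loading (70 ns maximum tHD;DAT, nine ic_clk cycles to drive SDA, 160 ns minimum low time, 294 ns SCL period); the function name `hs_mode_counts` and its parameter names are hypothetical.

```python
import math


def hs_mode_counts(thd_dat_max_ns=70.0, drive_cycles=9, min_low_ns=160.0,
                   scl_period_ns=294.0, hcnt=14):
    """Return (ic_clk period in ns, minimum low count, scaled low count)."""
    # The ic_clk period is bounded by the tHD;DAT maximum (about 7.78 ns;
    # the text truncates this to 7.77 ns).
    ic_clk_period_ns = thd_dat_max_ns / drive_cycles
    # Smallest low count that satisfies the minimum SCL low time.
    lcnt_min = math.ceil(min_low_ns / ic_clk_period_ns)
    # Total ic_clk periods needed so that the baud rate does not exceed 3.4 Mb/s.
    total = math.ceil(scl_period_ns / ic_clk_period_ns)
    # Scale the low count up until high + low covers the whole SCL period.
    lcnt = max(lcnt_min, total - hcnt)
    return ic_clk_period_ns, lcnt_min, lcnt


period_ns, lcnt_min, lcnt = hs_mode_counts()
print(round(period_ns, 2), lcnt_min, lcnt)
```

The sketch yields a minimum low count of 21 and a scaled low count of 24, with the high count left at 14.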
Table 140 lists the minimum ic_clk values for all modes with high and low count values.
Table 140. ic_clk in relation to high and low counts
| Speed mode | ic_clk freq (MHz) | SCL low count | SCL low program value | SCL low time | SCL high count | SCL high program value | SCL high time |
|---|---|---|---|---|---|---|---|
| SS | 2.7 | 13 | 12 | 4.7 μs | 14 | 6 | 5.2 μs |
| FS | 12.0 | 16 | 15 | 1.33 μs | 14 | 6 | 1.16 μs |
| HS (400 pF) | 60.2 | 22 | 21 | 365 ns | 14 | 6 | 232 ns |
| HS (100 pF) | 128.5 | 24 | 23 | 186 ns | 14 | 6 | 108 ns |
Note:
The IC_*_SCL_LCNT and IC_*_SCL_HCNT registers are programmed using the SCL low
and high program values in Table 140, which are calculated using SCL low count minus 1,
and SCL high counts minus 8, respectively.
Calculating high and low counts
The calculations below show how to calculate SCL high and low counts for each speed
mode in the I2C. For the calculations to work, the ic_clk frequencies used must not be less
than the minimum ic_clk frequencies specified in Table 140.
The I2C coreConsultant GUI can automatically calculate SCL high and low count values. By
specifying an integer ic_clk period value in nanoseconds for the IC_CLK_PERIOD
parameter, SCL high and low count values are automatically calculated for each speed
mode. The ic_clk period must not specify a clock of a lower frequency than required for all
supported speed modes. It is possible that the automatically calculated values may result in
a baud rate higher than the maximum rate specified by the protocol. If this happens, either
the low or high count values can be scaled up to reduce the baud rate.
The minimum IC_CLK calculations for high-speed mode show how to do this; for details,
refer to High-speed modes on page 433.
The equation to calculate the number of ic_clk cycles required to set the SCL clock
high and low times is as follows:
IC_xCNT = (ROUNDUP(MIN_SCL_xxxtime*OSCFREQ,0))
ROUNDUP is an explicit Excel function call that is used to round a
real number up to the next integer.
MIN_SCL_HIGHtime = Minimum High Period
MIN_SCL_HIGHtime = 4000 ns for 100 kbps
                   600 ns for 400 kbps
                   60 ns for 3.4 Mbps, bus loading = 100 pF
                   120 ns for 3.4 Mbps, bus loading = 400 pF
MIN_SCL_LOWtime = Minimum Low Period
MIN_SCL_LOWtime = 4700 ns for 100 kbps
                  1300 ns for 400 kbps
                  160 ns for 3.4 Mbps, bus loading = 100 pF
                  320 ns for 3.4 Mbps, bus loading = 400 pF
OSCFREQ = ic_clk Clock Frequency (Hz).
For example:
OSCFREQ = 100 MHz
I2Cmode = fast, 400 kbit/s
MIN_SCL_HIGHtime = 600 ns.
MIN_SCL_LOWtime = 1300 ns.
IC_xCNT = (ROUNDUP(MIN_SCL_xxxtime*OSCFREQ,0))
IC_HCNT = (ROUNDUP(600 ns * 100 MHz,0))
IC_HCNT = 60
IC_LCNT = (ROUNDUP(1300 ns * 100 MHz,0))
IC_LCNT = 130
Actual MIN_SCL_HIGHtime = 60*(1/100 MHz) = 600 ns
Actual MIN_SCL_LOWtime = 130*(1/100 MHz) = 1300 ns
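The ROUNDUP calculation above can be expressed without a spreadsheet. The sketch below uses integer arithmetic to avoid floating-point rounding surprises; the function name `ic_xcnt` is a hypothetical helper, and times are given in nanoseconds and the clock in hertz.

```python
def ic_xcnt(min_time_ns, ic_clk_hz):
    """Number of ic_clk cycles covering min_time_ns, rounded up."""
    product = min_time_ns * ic_clk_hz          # ns * Hz = cycles * 1e9
    return -(-product // 1_000_000_000)        # integer ceiling division


# Fast mode at OSCFREQ = 100 MHz, as in the example above
ic_hcnt = ic_xcnt(600, 100_000_000)
ic_lcnt = ic_xcnt(1300, 100_000_000)
print(ic_hcnt, ic_lcnt)
```

When the product is not an exact multiple of the clock period, the count rounds up; for instance, 600 ns at 12 MHz corresponds to 7.2 cycles and yields 8.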
27.4.8
SDA hold time
The I2C protocol specification requires 300 ns of hold time on the SDA signal (tHD;DAT) in
standard and fast speed modes and, in high-speed mode, a hold time long enough to bridge
the undefined part between logic 1 and logic 0 of the falling edge of SCL.
Board delays on the SCL and SDA signals can mean that the hold-time requirement is met
at the I2C master, but not at the I2C slave (or vice-versa). As each application will encounter
differing board delays, the I2C contains a software programmable register (IC_SDA_HOLD)
to enable dynamic adjustment of the SDA hold-time.
The IC_SDA_HOLD register can be used to alter the timing of the generated SDA
(ic_data_oe) signal by the I2C. Each value in the IC_SDA_HOLD register represents a unit
of one ic_clk period.
When the I2C is operating in master mode, the minimum tHD;DAT timing is one ic_clk
period. Therefore, even when IC_SDA_HOLD has a value of zero, the I2C will drive SDA
(ic_data_oe) one ic_clk cycle after driving SCL (ic_clk_oe) to logic 0. For all other values of
IC_SDA_HOLD, the following is true:
●
Drive on SDA (ic_data_oe) will occur IC_SDA_HOLD ic_clk cycles after driving SCL
(ic_clk_oe) to logic 0
When the I2C operates in slave mode, the minimum tHD;DAT timing is eight ic_clk periods.
This delay is to allow for synchronization and filtering on the SCL (ic_clk_in) sample.
Therefore, even when IC_SDA_HOLD has a value less than 8, the I2C will drive SDA
(ic_data_oe) eight ic_clk cycles after SCL (ic_clk_in) has transitioned to logic 0. For all other
values of IC_SDA_HOLD, the following is true:
●
Drive on SDA (ic_data_oe) will occur IC_SDA_HOLD ic_clk cycles after SCL (ic_clk_in)
has transitioned to logic 0
If different SDA hold times are required for different speed modes, the IC_SDA_HOLD
register must be reprogrammed when the speed mode is being changed. The
IC_SDA_HOLD register can be programmed only when the I2C is disabled (IC_ENABLE =
0).
The reset value of the IC_SDA_HOLD register can be set via the coreConsultant parameter
IC_DEFAULT_SDA_HOLD.
Figure 155 shows the tHD;DAT timing generated by the I2C operating in master mode when
IC_SDA_HOLD = 3.
Figure 155. I2C master implementing tHD;DAT when IC_SDA_HOLD = 3
Signals shown: ic_clk, ic_data_oe, ic_clk_oe. Annotation: IC_SDA_HOLD = 3.
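The hold-time rules for master and slave mode can be condensed into a small helper. This is an illustrative sketch only: the function names are hypothetical, and the minimums (one ic_clk cycle in master mode, eight in slave mode) are those stated in the text above.

```python
def effective_sda_hold_cycles(ic_sda_hold, slave_mode):
    """ic_clk cycles between the SCL falling edge and the SDA drive."""
    minimum = 8 if slave_mode else 1   # slave mode needs sync and filtering time
    return max(ic_sda_hold, minimum)


def effective_sda_hold_ns(ic_sda_hold, ic_clk_hz, slave_mode=False):
    """Same delay expressed in nanoseconds for a given ic_clk frequency."""
    cycles = effective_sda_hold_cycles(ic_sda_hold, slave_mode)
    return cycles * 1_000_000_000 / ic_clk_hz


print(effective_sda_hold_cycles(0, slave_mode=False))   # master, register = 0
print(effective_sda_hold_cycles(3, slave_mode=True))    # slave, register below 8
print(effective_sda_hold_ns(3, 100_000_000))            # master at an assumed 100 MHz
```

The 100 MHz clock in the last call is an assumed example value, not a device requirement.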
27.4.9
DMA controller interface
The I2C has an optional built-in DMA capability that can be selected at configuration time; it
has a handshaking interface to a DMA controller to request and control transfers. The APB
bus is used to perform the data transfer to or from the DMA. Although the I2C DMA operation is
designed in a generic way to fit any DMA controller as easily as possible, it works best
with the DMA controller of this device, the DMAC. The settings of the
DMAC that are relevant to the operation of the I2C are discussed here, mainly bit fields in
the DMAC channel control register, CTLx, where x is the channel number.
When the I2C interfaces to the DMAC, the DMAC is always a flow controller; that is, it
controls the block size. This must be programmed by software in the DMAC. The DMAC
always transfers data using DMA burst transactions if possible, for efficiency. For more
information, refer to Chapter 12: Direct memory access controllers (DMAC). Other DMA
controllers act in a similar manner.
The DMA output dma_finish is a status signal that indicates that the DMA block transfer is
complete. The I2C does not use this status signal, which therefore does not appear in the I/O
port list.
The I2Cn has two DMA handshaking interfaces, called I2Cn_TX and I2Cn_RX, as shown in
Table 49: DMAC MUX - selecting the peripheral.
●
The I2Cn_TX is composed of the following signals: dma_tx_req (burst request),
dma_tx_single (single request) and dma_tx_ack (clear).
●
The I2Cn_RX is composed of the following signals: dma_rx_req (burst request),
dma_rx_single (single request) and dma_rx_ack (clear).
The relevant DMA settings are discussed in the following sections.
Enabling the DMA controller interface
To enable the DMA Controller interface on the I2C, you must write the DMA Control Register
(IC_DMA_CR). Writing a 1 into the TDMAE bit field of IC_DMA_CR register enables the I2C
transmit handshaking interface. Writing a 1 into the RDMAE bit field of the IC_DMA_CR
register enables the I2C receive handshaking interface.
Overview of operation
As a block flow control device, the DMA Controller is programmed by the processor with the
number of data items (block size) that are to be transmitted or received by I2C; this is
programmed into the BLOCK_TS field of the DMAC CTLx register.
The block is broken into a number of transactions, each initiated by a request from the I2C.
The DMA Controller must also be programmed with the number of data items (in this case,
I2C FIFO entries) to be transferred for each DMA request. This is also known as the burst
transaction length and is programmed into the SRC_MSIZE/DEST_MSIZE fields of the
DMAC CTLx register for source and destination, respectively.
Figure 156 shows a single block transfer, where the block size programmed into the DMA
Controller is 12 and the burst transaction length is set to 4. In this case, the block size is a
multiple of the burst transaction length. Therefore, the DMA block transfer consists of a
series of burst transactions. If the I2C makes a transmit request to this channel, four data
items are written to the I2C TX FIFO. Similarly, if the I2C makes a receive request to this
channel, four data items are read from the I2C RX FIFO. Three separate requests must be
made to this DMA channel before all 12 data items are written or read.
Figure 156. Breakdown of DMA transfer into burst transactions
The 12 data items of the block (shown at the DMA multi-block transfer level and at the DMA block level) are split into DMA burst transactions 1 to 3 of 4 data items each.
Block size : DMA.CTLx.BLOCK_TS=12
Number of data items per source burst transaction : DMA.CTLx.SRC_MSIZE = 4
I2C receive FIFO watermark level: I2C.IC_DMA_RDLR + 1 = DMA.CTLx.SRC_MSIZE = 4
When the block size programmed into the DMA Controller is not a multiple of the burst
transaction length, as shown in Figure 157, a series of burst transactions followed by single
transactions are needed to complete the block transfer.
Figure 157. Breakdown of DMA transfer into single and burst transactions
The 15 data items of the block (shown at the DMA multi-block transfer level and at the DMA block level) are split into DMA burst transactions 1 to 3 of 4 data items each, followed by DMA single transactions 1 to 3 of 1 data item each.
Block size : DMA.CTLx.BLOCK_TS=15
Number of data items per burst transaction : DMA.CTLx.DEST_MSIZE = 4
I2C transmit FIFO watermark level: I2C.IC_DMA_TDLR = DMA.CTLx.DEST_MSIZE = 4
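The breakdown shown in Figure 156 and Figure 157 is a quotient-and-remainder split of the block size by the burst transaction length. A minimal sketch, with `split_block` as a hypothetical helper name:

```python
def split_block(block_ts, msize):
    """Return (burst transactions, single transactions) for one DMA block."""
    bursts, singles = divmod(block_ts, msize)
    return bursts, singles


print(split_block(12, 4))   # block size is a multiple of the burst length
print(split_block(15, 4))   # remainder is completed with single transactions
```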
Transmit watermark level and transmit FIFO underflow
During I2C serial transfers, transmit FIFO requests are made to the DMAC whenever the
number of entries in the transmit FIFO is less than or equal to the DMA Transmit Data Level
Register (IC_DMA_TDLR) value; this is known as the watermark level. The DMAC responds
by writing a burst of data to the transmit FIFO buffer, of length CTLx.DEST_MSIZE.
Data should be fetched from the DMA often enough for the transmit FIFO to perform serial
transfers continuously; that is, when the FIFO begins to empty, another DMA request should
be triggered. Otherwise, the FIFO runs out of data, causing a STOP to be inserted on the
I2C bus. To prevent this condition, the user must set the watermark level correctly.
Choosing the transmit watermark level
Consider an example in which the following assumption is made:
DMA.CTLx.DEST_MSIZE = FIFO_DEPTH - I2C.IC_DMA_TDLR
Here the number of data items to be transferred in a DMA burst is equal to the empty space
in the Transmit FIFO. Consider two different watermark level settings.
Case 1: IC_DMA_TDLR = 2
●
Transmit FIFO watermark level = I2C.IC_DMA_TDLR = 2
●
DMA.CTLx.DEST_MSIZE = FIFO_DEPTH - I2C.IC_DMA_TDLR = 6
●
I2C transmit FIFO_DEPTH = 8
●
DMA.CTLx.BLOCK_TS = 30
Figure 158. Case 1 watermark levels
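The I2C transmit FIFO (FIFO_DEPTH = 8) receives Data In from the DMA controller and sends Data Out to the serial line. The transmit FIFO watermark level lies I2C.IC_DMA_TDLR = 2 entries above EMPTY, leaving FIFO_DEPTH - I2C.IC_DMA_TDLR = 6 entries up to FULL.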
Therefore, the number of burst transactions needed equals the block size divided by the
number of data items per burst:
DMA.CTLx.BLOCK_TS/DMA.CTLx.DEST_MSIZE = 30/6 = 5
The number of burst transactions in the DMA block transfer is 5. But the watermark level,
I2C.IC_DMA_TDLR, is quite low. Therefore, the probability of an I2C underflow is high:
the I2C serial transmit line needs to transmit data, but there is no data left in
the transmit FIFO. This occurs when the DMA has not had time to service the DMA
request before the transmit FIFO becomes empty.
Case 2: IC_DMA_TDLR = 6
●
Transmit FIFO watermark level = I2C.IC_DMA_TDLR = 6
●
DMA.CTLx.DEST_MSIZE = FIFO_DEPTH - I2C.IC_DMA_TDLR = 2
●
I2C transmit FIFO_DEPTH = 8
●
DMA.CTLx.BLOCK_TS = 30
Figure 159. Case 2 watermark levels
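The I2C transmit FIFO (FIFO_DEPTH = 8) receives Data In from the DMA controller and sends Data Out to the serial line. The transmit FIFO watermark level lies I2C.IC_DMA_TDLR = 6 entries above EMPTY, leaving FIFO_DEPTH - I2C.IC_DMA_TDLR = 2 entries up to FULL.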
Number of burst transactions in Block:
DMA.CTLx.BLOCK_TS/DMA.CTLx.DEST_MSIZE = 30/2 = 15
In this block transfer, there are 15 destination burst transactions. But
the watermark level, I2C.IC_DMA_TDLR, is high. Therefore, the probability of an I2C
underflow is low because the DMA controller has plenty of time to service the destination
burst transaction request before the I2C transmit FIFO becomes empty.
Thus, the second case has a lower probability of underflow at the expense of more burst
transactions per block. This potentially results in a greater number of AMBA bursts per block
and worse bus utilization than in the former case.
Therefore, the goal in choosing a watermark level is to minimize the number of transactions
per block, while at the same time keeping the probability of an underflow condition to an
acceptable level. In practice, this is a function of the ratio of the rate at which the I2C
transmits data to the rate at which the DMA can respond to destination burst requests.
For example, promoting the channel to the highest priority channel in the DMA, and
promoting the DMA master interface to the highest priority master in the AMBA layer,
increases the rate at which the DMA controller can respond to burst transaction requests.
This in turn allows the user to decrease the watermark level, which improves bus utilization
without compromising the probability of an underflow occurring.
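The trade-off between the two cases can be put into numbers. The sketch below is an estimate under loudly stated assumptions: the helper name `tx_watermark_tradeoff` is hypothetical, the burst length is taken as FIFO_DEPTH - IC_DMA_TDLR as in the examples above, and the service margin assumes that each FIFO entry takes nine SCL periods on the bus (8 data bits plus the acknowledge bit), ignoring START, STOP, and address overhead.

```python
def tx_watermark_tradeoff(fifo_depth, tdlr, block_ts, scl_hz):
    """Return (DEST_MSIZE, bursts per block, service margin in microseconds)."""
    dest_msize = fifo_depth - tdlr          # empty space when the request fires
    bursts = block_ts // dest_msize         # whole burst transactions per block
    # Rough time until the FIFO drains after the request: tdlr entries left,
    # each assumed to occupy 9 SCL periods on the bus.
    margin_us = tdlr * 9 * 1_000_000 / scl_hz
    return dest_msize, bursts, margin_us


case1 = tx_watermark_tradeoff(8, 2, 30, 400_000)   # Case 1: IC_DMA_TDLR = 2
case2 = tx_watermark_tradeoff(8, 6, 30, 400_000)   # Case 2: IC_DMA_TDLR = 6
print(case1, case2)
```

Under these assumptions, Case 1 needs 5 bursts but leaves the DMA about 45 μs to respond at 400 kb/s, whereas Case 2 needs 15 bursts and leaves about 135 μs.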
Selecting DEST_MSIZE and transmit FIFO overflow
As can be seen from Figure 159, programming DMA.CTLx.DEST_MSIZE to a value greater
than the watermark level that triggers the DMA request may cause overflow when there is
not enough space in the I2C transmit FIFO to service the destination burst request.
Therefore, the following equation must be adhered to in order to avoid overflow:
DMA.CTLx.DEST_MSIZE <= I2C.FIFO_DEPTH - I2C.IC_DMA_TDLR (1)
In Case 2: IC_DMA_TDLR = 6, the amount of space in the transmit FIFO at the time the
burst request is made is equal to the destination burst length, DMA.CTLx.DEST_MSIZE.
Thus, the transmit FIFO may be full, but not overflowed, at the completion of the burst
transaction.
Therefore, for optimal operation, DMA.CTLx.DEST_MSIZE should be set at the FIFO level
that triggers a transmit DMA request; that is:
DMA.CTLx.DEST_MSIZE = I2C.FIFO_DEPTH - I2C.IC_DMA_TDLR (2)
This is the setting used in Figure 157.
Adhering to equation (2) reduces the number of DMA bursts needed for a block transfer, and
this in turn improves AMBA bus utilization.
Note:
The transmit FIFO will not be full at the end of a DMA burst transfer if the I2C has
successfully transmitted one data item or more on the I2C serial transmit line during the
transfer.
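Equations (1) and (2) can be turned into a simple configuration check. This is an illustrative sketch; the function name `classify_dest_msize` and the returned strings are hypothetical and do not correspond to any driver API.

```python
def classify_dest_msize(fifo_depth, tdlr, dest_msize):
    """Classify a DEST_MSIZE setting against equations (1) and (2)."""
    free_at_request = fifo_depth - tdlr     # guaranteed space when the request fires
    if dest_msize > free_at_request:
        return "overflow risk"              # violates equation (1)
    if dest_msize == free_at_request:
        return "optimal"                    # satisfies equation (2)
    return "safe, more bursts than necessary"


print(classify_dest_msize(8, 6, 2))
print(classify_dest_msize(8, 6, 4))
print(classify_dest_msize(8, 2, 4))
```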
Receive watermark level and receive FIFO overflow
During I2C serial transfers, receive FIFO requests are made to the DMAC whenever the
number of entries in the receive FIFO is at or above the value of the DMA Receive Data Level
Register plus one; that is, IC_DMA_RDLR+1. This is known as the watermark level. The DMAC
responds by reading a burst of data of length CTLx.SRC_MSIZE from the receive FIFO buffer.
Data should be fetched by the DMA often enough for the receive FIFO to accept serial
transfers continuously; that is, when the FIFO begins to fill, another DMA transfer is
requested. Otherwise, the FIFO will fill with data (overflow). To prevent this condition, the
user must correctly set the watermark level.
Choosing the receive watermark level
Similar to choosing the transmit watermark level described earlier, the receive watermark
level, IC_DMA_RDLR+1, should be set to minimize the probability of overflow, as shown in
Figure 160. It is a trade-off between the number of DMA burst transactions required per
block versus the probability of an overflow occurring.
Selecting SRC_MSIZE and Receive FIFO Underflow
As can be seen in Figure 160, programming a source burst transaction length greater than
the watermark level may cause underflow when there is not enough data to service the
source burst request. Therefore, equation (3) below must be adhered to in order to avoid underflow.
If the number of data items in the receive FIFO is equal to the source burst length at the time
the burst request is made – DMA.CTLx.SRC_MSIZE – the receive FIFO may be emptied,
but not underflowed, at the completion of the burst transaction. For optimal operation,
DMA.CTLx.SRC_MSIZE should be set at the watermark level; that is:
DMA.CTLx.SRC_MSIZE = I2C.IC_DMA_RDLR + 1 (3)
Adhering to equation (3) above reduces the number of DMA bursts in a block transfer, which
in turn can avoid underflow and improve AMBA bus utilization.
Note:
The receive FIFO will not be empty at the end of the source burst transaction if the I2C has
successfully received one data item or more on the I2C serial receive line during the burst.
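The receive-side settings follow from equation (3). The sketch below derives the burst length from IC_DMA_RDLR and reports how many further entries the FIFO can absorb before overflowing while the request is pending. The helper name `rx_watermark` is hypothetical, and the receive FIFO depth of 8 is an assumption borrowed from the transmit examples; the text does not state the receive FIFO depth.

```python
def rx_watermark(fifo_depth, rdlr):
    """Return (SRC_MSIZE per equation (3), free entries left at request time)."""
    watermark = rdlr + 1                 # entries present when dma_rx_req fires
    if watermark > fifo_depth:
        raise ValueError("watermark exceeds FIFO depth")
    src_msize = watermark                # equation (3)
    headroom = fifo_depth - watermark    # entries that can still arrive safely
    return src_msize, headroom


print(rx_watermark(8, 3))   # assumed depth of 8, IC_DMA_RDLR = 3
print(rx_watermark(8, 6))   # higher watermark: fewer bursts, less headroom
```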
Figure 160. I2C Receive FIFO
The I2C receive FIFO receives Data In from the serial line and sends Data Out to the DMA controller. The receive FIFO watermark level lies I2C.IC_DMA_RDLR + 1 entries above EMPTY, below FULL.
Handshaking interface operation
The following sections discuss the handshaking interface.
Note:
For I2C0, dma_tx_req = I2C0_TX and dma_rx_req = I2C0_RX
dma_tx_req, dma_rx_req
The request signals for source and destination, dma_tx_req and dma_rx_req, are activated
when their corresponding FIFOs reach the watermark levels as discussed earlier.
The DMAC uses rising-edge detection of the dma_tx_req/dma_rx_req signal to identify a
request on the channel. Upon reception of the dma_tx_ack/dma_rx_ack signal from the
DMAC to indicate that the burst transaction is complete, the I2C de-asserts the burst request
signal, dma_tx_req/dma_rx_req, until dma_tx_ack/dma_rx_ack is de-asserted by the
DMAC.
When the I2C samples that dma_tx_ack/dma_rx_ack is de-asserted, it can re-assert the
dma_tx_req/dma_rx_req request line if the corresponding FIFO has reached its
watermark level (back-to-back burst transaction). If this is not the case, the DMA request
lines remain de-asserted.
Figure 161 shows a timing diagram of a burst transaction where pclk = hclk. Figure 162
shows two back-to-back burst transactions where the hclk frequency is twice the pclk
frequency.
Figure 161. Burst transaction – pclk = hclk
Signals shown: pclk, hclk, dma_tx_req (burst transaction request), dma_tx_ack (burst transaction complete), dma_tx_single (not sampled by the DW_ahb_dmac for burst transactions).
Figure 162. Back-to-back burst transaction – hclk = 2*pclk
Signals shown: hclk, pclk, dma_rx_req (two burst transaction requests), dma_rx_ack (two burst transaction completions), dma_rx_single (not sampled by the DW_ahb_dmac for burst transactions).
The handshaking loop is as follows:
–
dma_tx_req/dma_rx_req asserted by I2C
–
dma_tx_ack/dma_rx_ack asserted by DMAC
–
dma_tx_req/dma_rx_req de-asserted by I2C
–
dma_tx_ack/dma_rx_ack de-asserted by DMAC
–
dma_tx_req/dma_rx_req reasserted by I2C, if back-to-back transaction is required
Note:
The burst transaction request signals, dma_tx_req and dma_rx_req, are generated in the
I2C off pclk and sampled in the DMAC by hclk. The acknowledge signals, dma_tx_ack and
dma_rx_ack, are generated in the DMAC off hclk and sampled in the I2C by pclk. The
handshaking mechanism between the DMAC and the I2C supports quasi-synchronous
clocks; that is, hclk and pclk must be phase-aligned, and the hclk frequency must be a
multiple of the pclk frequency.
Two things to note here:
1.
The burst request lines, dma_tx_req/dma_rx_req, once asserted, remain
asserted until their corresponding dma_tx_ack/dma_rx_ack signal is received, even if
the respective FIFOs drop below their watermark levels during the burst transaction.
2.
The dma_tx_req/dma_rx_req signals are de-asserted when their corresponding
dma_tx_ack/dma_rx_ack signals are asserted, even if the respective FIFOs exceed
their watermark levels.
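The burst handshaking loop described above is a four-phase request/acknowledge handshake. The following sketch lists the (request, acknowledge) levels for back-to-back burst transactions and counts the rising edges of the request line, which is what the DMAC detects. The function names are hypothetical, and clock-domain timing is deliberately left out of this model.

```python
def burst_handshake_trace(bursts):
    """Return successive (dma_req, dma_ack) levels for back-to-back bursts."""
    trace = [(0, 0)]                 # idle: request and acknowledge de-asserted
    for _ in range(bursts):
        trace.append((1, 0))         # I2C asserts dma_tx_req/dma_rx_req
        trace.append((1, 1))         # DMAC asserts ack: burst transaction complete
        trace.append((0, 1))         # I2C de-asserts the request
        trace.append((0, 0))         # DMAC de-asserts ack; a new request may follow
    return trace


def count_requests(trace):
    """Count rising edges of the request line, as seen by the DMAC."""
    reqs = [req for req, _ in trace]
    return sum(1 for prev, cur in zip(reqs, reqs[1:]) if prev == 0 and cur == 1)


trace = burst_handshake_trace(2)
print(trace)
print(count_requests(trace))
```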
dma_tx_single, dma_rx_single
The dma_tx_single signal is a status signal. It is asserted when there is at least one free
entry in the transmit FIFO and cleared when the transmit FIFO is full. The dma_rx_single
signal is a status signal. It is asserted when there is at least one valid data entry in the
receive FIFO and cleared when the receive FIFO is empty.
The DMAC needs these signals only in the case where the block size,
CTLx.BLOCK_TS, that is programmed into the DMAC is not a multiple of the burst
transaction length, CTLx.SRC_MSIZE or CTLx.DEST_MSIZE, as shown in Figure 157. In this
case, the DMA single outputs inform the DMAC that it is still possible to perform single data
item transfers, so it can access all data items in the transmit/receive FIFO and complete the
DMA block transfer. The DMA single outputs from the I2C are not sampled by the DMAC
otherwise. This is illustrated in the following example.
Consider first an example where the receive FIFO channel of the I2C is as follows:
DMA.CTLx.SRC_MSIZE = I2C.IC_DMA_RDLR + 1 = 4
DMA.CTLx.BLOCK_TS = 12
For the example in Figure 156, with the block size set to 12, the dma_rx_req signal is
asserted when four data items are present in the receive FIFO. The dma_rx_req signal is
asserted three times during the I2C serial transfer, ensuring that all 12 data items are read
by the DMAC. Each DMA request reads a burst of four data items, and no single DMA
transactions are required. This block transfer is made up of three burst transactions.
Now, for the following block transfer:
DMA.CTLx.SRC_MSIZE = I2C.IC_DMA_RDLR + 1 = 4
DMA.CTLx.BLOCK_TS = 15
The first 12 data items are transferred as already described using three burst transactions.
But when the last three data frames enter the receive FIFO, the dma_rx_req signal is not
activated because the FIFO level is below the watermark level. The DMAC samples
dma_rx_single and completes the DMA block transfer using three single transactions. The
block transfer is made up of three burst transactions followed by three single transactions.
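The two receive examples can be replayed with a simplified model. This is only a sketch: the function name `receive_block` is hypothetical, one data item is assumed to arrive per step, and the DMAC is assumed to service every request immediately. A burst is taken when the FIFO level reaches the watermark (dma_rx_req); once fewer items than one burst remain in the block, each item is taken as a single transaction (dma_rx_single).

```python
def receive_block(block_ts, src_msize):
    """Return the sequence of DMA transactions used to read one block."""
    watermark = src_msize            # IC_DMA_RDLR + 1, per equation (3)
    fifo_level = 0
    remaining = block_ts             # data items still to be read by the DMAC
    log = []
    for _ in range(block_ts):
        fifo_level += 1              # one data item enters the receive FIFO
        if remaining >= src_msize:
            if fifo_level >= watermark:      # dma_rx_req asserted
                fifo_level -= src_msize
                remaining -= src_msize
                log.append("burst")
        else:
            # Below one burst: the DMAC samples dma_rx_single instead.
            fifo_level -= 1
            remaining -= 1
            log.append("single")
    return log


print(receive_block(12, 4))
print(receive_block(15, 4))
```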
Figure 163 shows a single transaction. The handshaking loop is as follows:
–
dma_tx_single/dma_rx_single asserted by I2C
–
dma_tx_ack/dma_rx_ack asserted by DMAC
–
dma_tx_single/dma_rx_single de-asserted by I2C
–
dma_tx_ack/dma_rx_ack de-asserted by DMAC.
Figure 163. Single transaction
Signals shown: pclk, hclk, dma_rx_req, dma_rx_ack (single transaction complete), dma_rx_single. Clock edges are labeled m0 to m2 and n0 to n4.
Figure 164 shows a burst transaction, followed by three back-to-back single transactions,
where the hclk frequency is twice the pclk frequency.
Figure 164. Burst transaction + 3 back-to-back singles – hclk = 2*pclk
Signals shown: hclk, pclk, dma_tx_req (burst transaction request), dma_tx_ack (one burst transaction complete, followed by three single transaction completions), dma_tx_single.
Note:
The single transaction request signals, dma_tx_single and dma_rx_single, are generated in
the I2C on the pclk edge and sampled in the DMAC on hclk. The acknowledge signals,
dma_tx_ack and dma_rx_ack, are generated in the DMAC on the hclk edge and
sampled in the I2C on pclk. The handshaking mechanism between the DMAC and the I2C
supports quasi-synchronous clocks; that is, hclk and pclk must be phase-aligned and the
hclk frequency must be a multiple of the pclk frequency.
27.5
Programming
Note:
The I2C should be set to operate either as an I2C master or as an I2C
slave, but not both simultaneously. This is achieved by ensuring that bit 6
(IC_SLAVE_DISABLE) and bit 0 (IC_MASTER_MODE) of the IC_CON register are never set
to 0 and 1, respectively.
27.5.1
Slave mode operation
This section discusses slave mode procedures.
Initial configuration
To use the I2C as a slave, perform the following steps:
1.
Disable the I2C by writing a ‘0’ to bit 0 of the IC_ENABLE register.
2.
Write to the IC_SAR register (bits 9:0) to set the slave address. This is the address to
which the I2C responds.
3.
Write to the IC_CON register to specify which type of addressing is supported (7- or
10-bit, by setting bit 3). Enable the I2C in slave-only mode by writing a ‘0’ into bit 6
(IC_SLAVE_DISABLE) and a ‘0’ to bit 0 (MASTER_MODE).
Note:
Slaves and masters do not have to be programmed with the same type of addressing (7- or
10-bit). For instance, a slave can be programmed with 7-bit addressing and a master
with 10-bit addressing, and vice versa.
4.
Enable the I2C by writing a ‘1’ in bit 0 of the IC_ENABLE register.
Note:
Depending on the reset values chosen, steps 2 and 3 may not be necessary because the
reset values can be configured. For instance, if the device is only going to be a master, there
would be no need to set the slave address because you can configure I2C to have the slave
disabled after reset and to enable the master after reset. The values stored are static and do
not need to be reprogrammed if the I2C is disabled.
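The configuration steps above can be sketched as an ordered list of register writes. This is a model for illustration, not driver code: the function name, the constant names, and the representation of a write as a (register name, value) pair are assumptions, and register addresses are deliberately omitted. The bit positions (IC_CON bits 0, 3 and 6, IC_ENABLE bit 0) are those given in the steps; writing IC_ENABLE as a whole value of 0 or 1 assumes that its other bits are zero.

```python
IC_CON_MASTER_MODE = 1 << 0       # bit 0: MASTER_MODE
IC_CON_10BIT_SLAVE = 1 << 3       # bit 3: 10-bit slave addressing (name hypothetical)
IC_CON_SLAVE_DISABLE = 1 << 6     # bit 6: IC_SLAVE_DISABLE


def slave_init_writes(slave_addr, ten_bit=False, ic_con_current=0):
    """Return the ordered (register, value) writes for slave-only operation."""
    limit = 0x3FF if ten_bit else 0x7F
    if not 0 <= slave_addr <= limit:
        raise ValueError("slave address out of range")
    # Clear master mode and slave disable, then select the addressing type.
    ic_con = ic_con_current & ~(IC_CON_MASTER_MODE
                                | IC_CON_SLAVE_DISABLE
                                | IC_CON_10BIT_SLAVE)
    if ten_bit:
        ic_con |= IC_CON_10BIT_SLAVE
    return [
        ("IC_ENABLE", 0),          # step 1: disable the I2C
        ("IC_SAR", slave_addr),    # step 2: set the slave address
        ("IC_CON", ic_con),        # step 3: addressing type, slave-only mode
        ("IC_ENABLE", 1),          # step 4: enable the I2C
    ]


for register, value in slave_init_writes(0x55):
    print(register, hex(value))
```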
Slave-transmitter operation for a single byte
When another I2C master device on the bus addresses the I2C and requests data, the I2C
acts as a slave-transmitter and the following steps occur:
1.
The other I2C master device initiates an I2C transfer with an address that matches the
slave address in the IC_SAR register of the I2C.
2.
The I2C acknowledges the sent address and recognizes the direction of the transfer to
indicate that it is acting as a slave-transmitter.
3.
The I2C asserts the RD_REQ interrupt (bit 5 of the IC_RAW_INTR_STAT register) and
holds the SCL line low. It is in a wait state until software responds. If the RD_REQ
interrupt has been masked, due to IC_INTR_MASK[5] register (M_RD_REQ bit field)
being set to 0, then it is recommended that a hardware and/or software timing routine
be used to instruct the CPU to perform periodic reads of the IC_RAW_INTR_STAT
register.
a)
Reads that indicate IC_RAW_INTR_STAT[5] (R_RD_REQ bit field) being set to 1
must be treated as the equivalent of the RD_REQ interrupt being asserted.
b)
Software must then act to satisfy the I2C transfer.
c)
The timing interval used should be in the order of 10 times the fastest SCL clock
period the I2C can handle. For example, for 400 kb/s, the timing interval is 25 us.
Note:
The value of 10 is recommended here because this is approximately the amount of time
required for a single byte of data transferred on the I2C bus.
4.
If there is any data remaining in the TX FIFO before receiving the read request, then
the I2C asserts a TX_ABRT interrupt (bit 6 of the IC_RAW_INTR_STAT register) to
flush the old data from the TX FIFO.
Note:
Because the I2C’s TX FIFO is forced into a flushed/reset state whenever a TX_ABRT event
occurs, it is necessary for software to release the I2C from this state by reading the
IC_CLR_TX_ABRT register before attempting to write into the TX FIFO. See register
IC_RAW_INTR_STAT for more details.
If the TX_ABRT interrupt has been masked, due to the IC_INTR_MASK[6] register bit
(M_TX_ABRT bit field) being set to 0, then it is recommended that the timing
routine described in the previous step, or a similar one, be used to read the
IC_RAW_INTR_STAT register.
a)
Reads that indicate bit 6 (R_TX_ABRT) being set to 1 must be treated as the
equivalent of the TX_ABRT interrupt being asserted.
b)
There is no further action required from software.
c)
The timing interval used should be similar to that described in the previous step for
the IC_RAW_INTR_STAT[5] register.
5.
Software writes to the IC_DATA_CMD register with the data to be written (by writing a
‘0’ in bit 8).
6.
Software must clear the RD_REQ and TX_ABRT interrupts (bits 5 and 6, respectively)
of the IC_RAW_INTR_STAT register before proceeding.
If the RD_REQ and/or TX_ABRT interrupts have been masked, then clearing of the
IC_RAW_INTR_STAT register will have already been performed when either the
R_RD_REQ or R_TX_ABRT bit has been read as 1.
7.
The I2C releases the SCL and transmits the byte.
8.
The master may hold the I2C bus by issuing a RESTART condition or release the bus
by issuing a STOP condition.
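For a system that polls instead of using the RD_REQ interrupt, the slave-transmitter steps above could be sketched as follows. The struct layout and function name are hypothetical, and the sketch assumes a DesignWare-style register set in which RD_REQ is cleared by reading an IC_CLR_RD_REQ register, in the same way that TX_ABRT is cleared by reading IC_CLR_TX_ABRT; check RM0089 for the actual registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register view for illustration; real offsets are in RM0089. */
struct i2c_regs {
    volatile uint32_t ic_raw_intr_stat;
    volatile uint32_t ic_clr_rd_req;    /* assumed read-to-clear register */
    volatile uint32_t ic_clr_tx_abrt;   /* read releases the flushed TX FIFO */
    volatile uint32_t ic_data_cmd;
};

#define RAW_RD_REQ  (1u << 5)
#define RAW_TX_ABRT (1u << 6)

/*
 * Poll once for a pending read request and answer it with one byte.
 * Returns true if a byte was queued for transmission.
 */
bool i2c_slave_tx_poll(struct i2c_regs *r, uint8_t byte)
{
    uint32_t raw = r->ic_raw_intr_stat;

    if (!(raw & RAW_RD_REQ))
        return false;                /* no master is waiting for data */

    if (raw & RAW_TX_ABRT)
        (void)r->ic_clr_tx_abrt;     /* must precede any write to the TX FIFO */

    r->ic_data_cmd = byte;           /* bit 8 = 0: data to transmit */
    (void)r->ic_clr_rd_req;          /* clear RD_REQ before proceeding */
    return true;
}
```

A timer firing roughly every 10 SCL periods (25 us at 400 kb/s) could call this function, as recommended in step 3.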
Slave-receiver operation for a single byte
When another I2C master device on the bus addresses the I2C and is sending data, the I2C
acts as a slave-receiver and the following steps occur:
1.
The other I2C master device initiates an I2C transfer with an address that matches the
I2C’s slave address in the IC_SAR register.
2.
The I2C acknowledges the sent address and recognizes the direction of the transfer to
indicate that the I2C is acting as a slave-receiver.
3.
I2C receives the transmitted byte and places it in the receive buffer.
Note:
If the RX FIFO is completely filled with data when a byte is pushed, then an overflow occurs
and the I2C continues with subsequent I2C transfers. Because a NACK is not generated,
software must recognize the overflow when indicated by the I2C (by the R_RX_OVER bit in
the IC_INTR_STAT register) and take appropriate actions to recover from lost data. Hence,
there is a real-time constraint on software to service the RX FIFO before it overflows,
as there is no way to apply back-pressure to the remote transmitting master. You must select
an RX FIFO depth that is deep enough to satisfy the interrupt service interval of your system.
4.
I2C asserts the RX_FULL interrupt (IC_RAW_INTR_STAT[2] register).
If the RX_FULL interrupt has been masked, due to setting the IC_INTR_MASK[2] register bit
to 0 or setting IC_RX_TL to a value larger than 0, then it is recommended that a timing
routine (described in Slave-transmitter operation for a single byte) be
implemented for periodic reads of the IC_STATUS register. Reads of the IC_STATUS
register with bit 3 (RFNE) set to 1 must then be treated by software as the equivalent
of the RX_FULL interrupt being asserted.
5.
Software may read the byte from the IC_DATA_CMD register (bits 7:0).
6.
The other master device may hold the I2C bus by issuing a RESTART condition or
release the bus by issuing a STOP condition.
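When RX_FULL is masked, the polled alternative described in step 4 could look like the sketch below. The struct layout and function name are hypothetical; only the RFNE flag (IC_STATUS bit 3) and the data field of IC_DATA_CMD (bits 7:0) come from the text above.

```c
#include <stdint.h>

/* Hypothetical register view for illustration; real offsets are in RM0089. */
struct i2c_regs {
    volatile uint32_t ic_status;
    volatile uint32_t ic_data_cmd;
};

#define IC_STATUS_RFNE (1u << 3)   /* receive FIFO not empty */

/* Read bytes while the RX FIFO is not empty; returns the number of bytes read. */
unsigned i2c_slave_rx_drain(struct i2c_regs *r, uint8_t *buf, unsigned max)
{
    unsigned n = 0;

    while (n < max && (r->ic_status & IC_STATUS_RFNE))
        buf[n++] = (uint8_t)(r->ic_data_cmd & 0xFFu);   /* data is in bits 7:0 */

    return n;
}
```

Draining every available byte on each poll helps to meet the real-time constraint on servicing the RX FIFO before it overflows.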
Slave-transfer operation for bulk transfers
In the standard I2C protocol, all transactions are single byte transactions and the
programmer responds to a remote master read request by writing one byte into the slave’s
TX FIFO. When a slave (slave-transmitter) is issued with a read request (RD_REQ) from the
remote master (master-receiver), at a minimum there should be at least one entry placed
into the slave-transmitter’s TX FIFO. The I2C is designed to handle more data in the TX FIFO
so that subsequent read requests can take that data without raising an interrupt to get more
data. This avoids the latency that would be incurred by raising an interrupt for every byte
if only one entry could be placed in the TX FIFO.
This mode only occurs when I2C is acting as a slave-transmitter. If the remote master
acknowledges the data sent by the slave-transmitter and there is no data in the slave’s TX
FIFO, the I2C holds the I2C SCL line low while it raises the read request interrupt
(RD_REQ) and waits for data to be written into the TX FIFO before it can be sent to the
remote master.
If the RD_REQ interrupt is masked, due to bit 5 (M_RD_REQ) of the IC_INTR_MASK
register being set to 0, then it is recommended that a timing routine be used to activate
periodic reads of the IC_RAW_INTR_STAT register. Reads of IC_RAW_INTR_STAT that
return bit 5 (R_RD_REQ) set to 1 must be treated as the equivalent of the RD_REQ
interrupt referred to in this section. This timing routine is similar to that described in
Slave-transmitter operation for a single byte.
The RD_REQ interrupt is raised upon a read request and, like other interrupts, must be
cleared when exiting the interrupt service routine (ISR). The ISR allows you to write either
1 byte or more than 1 byte into the TX FIFO. If the master acknowledges the last of these
bytes during transmission, the slave must raise RD_REQ again because the master is
requesting more data.
If the programmer knows in advance that the remote master is requesting a packet of n
bytes, then when another master addresses I2C and requests data, the TX FIFO could be
written with n bytes and the remote master receives them as a continuous stream of
data. For example, the I2C slave continues to send data to the remote master as long as the
remote master is acknowledging the data sent and there is data available in the TX FIFO.
There is no need to hold the SCL line low or to issue RD_REQ again.
If the remote master is to receive n bytes from the I2C but the programmer wrote a number
of bytes larger than n to the TX FIFO, then when the slave finishes sending the requested n
bytes, it clears the TX FIFO and ignores any excess bytes.
The I2C generates a transmit abort (TX_ABRT) event to indicate the clearing of the TX FIFO
in this example. At the time an ACK/NACK is expected, if a NACK is received, then the
remote master has all the data it wants. At this time, a flag is raised within the slave’s state
machine to clear the leftover data in the TX FIFO. This flag is transferred to the processor
bus clock domain where the FIFO exists, and the contents of the TX FIFO are cleared at that
time.
27.5.2
Master mode operation
This section discusses master mode procedures.
Initial configuration
The initial configuration procedure for Master Mode Operation depends on the configuration
parameter I2C_DYNAMIC_TAR_UPDATE. When set to “Yes” (1), the target address and
address format can be changed dynamically without having to disable I2C. This parameter
only applies to when I2C is acting as a master because the slave requires the component to
be disabled before any changes can be made to the address.
The procedures are very similar and are only different with regard to where the
IC_10BITADDR_MASTER bit is set (either bit 4 of IC_CON register or bit 12 of IC_TAR
register).
I2C_DYNAMIC_TAR_UPDATE = 1
To use the I2C as a master when the I2C_DYNAMIC_TAR_UPDATE configuration
parameter is set to “Yes” (1), perform the following steps:
1.
Disable the I2C by writing 0 to the IC_ENABLE register.
2.
Write to the IC_CON register to set the maximum speed mode supported for slave
operation (bits 2:1) and to specify whether the I2C starts its transfers in 7/10 bit
addressing mode when the device is a slave (bit 3).
3.
Write the address of the I2C device to be addressed to the IC_TAR register. This register
also indicates whether a General Call or a START BYTE command is going to be performed
by the I2C. The addressing format of the I2C master-initiated transfers, either 7-bit or
10-bit, is controlled by the IC_10BITADDR_MASTER bit field (bit 12).
4.
Only applicable for high-speed mode transfers. Write to the IC_HS_MADDR register
the desired master code for the I2C. The master code is programmer-defined.
5.
Enable the I2C by writing a 1 in the IC_ENABLE register.
6.
Now write the transfer direction and data to be sent to the IC_DATA_CMD register. If
the IC_DATA_CMD register is written before the I2C is enabled, the data and
commands are lost as the buffers are kept cleared when I2C is not enabled.
Note:
For multiple I2C transfers, perform additional writes to the TX FIFO such that the TX FIFO
does not become empty during the I2C transaction. If the TX FIFO is completely emptied at
any stage, then further writes to the TX FIFO result in an independent I2C transaction.
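The master configuration sequence for I2C_DYNAMIC_TAR_UPDATE = 1 can be sketched as below. The struct layout, function name and speed parameter are hypothetical; the speed value is passed through as the raw 2-bit field for IC_CON bits 2:1 because its encoding is defined in RM0089, and setting MASTER_MODE (IC_CON bit 0) here is an assumption based on the bit description in the slave procedure. The high-speed master code (IC_HS_MADDR) step is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register view for illustration; real offsets are in RM0089. */
struct i2c_regs {
    volatile uint32_t ic_con;
    volatile uint32_t ic_tar;
    volatile uint32_t ic_enable;
};

#define IC_CON_MASTER_MODE      (1u << 0)
#define IC_CON_SPEED_SHIFT      1u         /* speed field, bits 2:1 */
#define IC_TAR_10BITADDR_MASTER (1u << 12)

/* Prepare the controller to address one target device as a master. */
void i2c_master_setup(struct i2c_regs *r, uint16_t target,
                      unsigned speed_field, bool ten_bit)
{
    uint32_t con;

    r->ic_enable = 0;                                  /* step 1 */

    con = r->ic_con;                                   /* step 2 */
    con &= ~(3u << IC_CON_SPEED_SHIFT);
    con |= (speed_field & 3u) << IC_CON_SPEED_SHIFT;
    con |= IC_CON_MASTER_MODE;                         /* assumption, see lead-in */
    r->ic_con = con;

    r->ic_tar = (target & 0x3FFu) |                    /* step 3 */
                (ten_bit ? IC_TAR_10BITADDR_MASTER : 0u);

    r->ic_enable = 1;                                  /* step 5 */
}
```

Data and commands are written to IC_DATA_CMD only after this function returns, because writes made while the I2C is disabled are lost.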
Dynamic IC_TAR or IC_10BITADDR_MASTER update
The I2C supports dynamic updating of the IC_TAR (bits 9:0) and IC_10BITADDR_MASTER
(bit 12) bit fields of the IC_TAR register. In order to perform a dynamic update of the IC_TAR
register, the I2C_DYNAMIC_TAR_UPDATE configuration parameter must be set to “Yes”
(1). You can dynamically write to the IC_TAR register provided that either of the following
conditions is met:
1.
I2C is not enabled (IC_ENABLE=0); OR
2.
I2C is enabled (IC_ENABLE=1); AND
I2C is NOT engaged in any Master (tx, rx) operation (IC_STATUS[5]=0); AND
I2C is enabled to operate in Master mode (IC_CON[0]=1); AND
there are NO entries in the TX FIFO (IC_STATUS[2]=1)
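The conditions for a dynamic IC_TAR update can be expressed as a single predicate. The function below is an illustrative sketch with a hypothetical name; it takes register values that the caller has already read.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when IC_TAR may be written, given current register values. */
bool i2c_tar_update_allowed(uint32_t ic_enable, uint32_t ic_status, uint32_t ic_con)
{
    bool enabled       = (ic_enable & 1u) != 0;
    bool master_busy   = (ic_status & (1u << 5)) != 0;  /* master tx/rx activity */
    bool tx_fifo_empty = (ic_status & (1u << 2)) != 0;
    bool master_mode   = (ic_con & 1u) != 0;

    if (!enabled)
        return true;                                    /* condition 1 */

    return !master_busy && master_mode && tx_fifo_empty; /* condition 2 */
}
```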
Master transmit and master receive
The I2C supports switching back and forth between reading and writing dynamically. To
transmit data, write the data to be written to the lower byte of the I2C Rx/Tx Data Buffer and
Command Register (IC_DATA_CMD). The CMD bit [8] should be written to 0 for I2C write
operations. Subsequently, a read command may be issued by writing “don’t cares” to the
lower byte of the IC_DATA_CMD register, and a 1 should be written to the CMD bit. The I2C
master continues to initiate transfers as long as there are commands present in the transmit
FIFO. If the transmit FIFO becomes empty, the I2C inserts a STOP condition after
completing the current transfer.
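As an illustration of mixing write and read commands, the sketch below builds the IC_DATA_CMD words for a common pattern: write one byte (for example a register index in the target device), then read a number of bytes back. The function names and the pattern itself are illustrative assumptions; only the CMD bit position and the use of the lower byte for data come from the text above.

```c
#include <stdint.h>

#define IC_DATA_CMD_READ (1u << 8)   /* CMD bit: 1 = read, 0 = write */

/* Command word that transmits one data byte. */
uint32_t i2c_cmd_write(uint8_t data)
{
    return data;                     /* CMD bit left at 0 */
}

/* Command word that requests one byte; the lower byte is a don't-care. */
uint32_t i2c_cmd_read(void)
{
    return IC_DATA_CMD_READ;
}

/*
 * Fill cmds[] with one write of reg followed by count reads.
 * Returns the number of words produced, or 0 if they do not fit in max.
 */
unsigned i2c_build_reg_read(uint8_t reg, unsigned count, uint32_t *cmds, unsigned max)
{
    unsigned i;

    if (max == 0 || count > max - 1u)
        return 0;

    cmds[0] = i2c_cmd_write(reg);
    for (i = 0; i < count; i++)
        cmds[1u + i] = i2c_cmd_read();

    return count + 1u;
}
```

Writing the resulting words to IC_DATA_CMD without letting the transmit FIFO run empty keeps them in one transaction; if the FIFO empties, the I2C inserts a STOP condition.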
27.5.3
Disabling I2C
The IC_ENABLE_STATUS register allows software to unambiguously determine
when the hardware has completely shut down in response to the IC_ENABLE register being
changed from 1 to 0. Only one register needs to be monitored, as opposed to monitoring two
registers (IC_STATUS and IC_RAW_INTR_STAT), which is a requirement for I2C versions
1.05a or earlier.
Procedure
1.
Define a timer interval (ti2c_poll) equal to 10 times the signaling period for the
highest I2C transfer speed used in the system and supported by the I2C. For example, if
the highest I2C transfer mode is 400 kb/s, then ti2c_poll is 25 us.
2.
Define a maximum time-out parameter, MAX_T_POLL_COUNT, such that if any
repeated polling operation exceeds this maximum value, an error is reported.
3.
Execute a blocking thread/process/function that prevents any further I2C master
transactions from being started by software, but allows any pending transfers to be
completed.
Note:
This step can be ignored if I2C is programmed to operate as an I2C slave only.
4.
The variable POLL_COUNT is initialized to zero.
5.
Set IC_ENABLE to 0.
6.
Read the IC_ENABLE_STATUS register and test the IC_EN bit (bit 0). Increment
POLL_COUNT by one. If POLL_COUNT >= MAX_T_POLL_COUNT, exit with the
relevant error code.
7.
If IC_ENABLE_STATUS[0] is 1, then sleep for ti2c_poll and proceed to the previous
step. Otherwise, exit with a relevant success code.
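Steps 1 and 4 to 7 of the procedure can be sketched in C as below. The struct layout, function names, return codes and the optional sleep callback are hypothetical. The loop follows the step order literally, so the time-out check is made before the IC_EN bit is tested.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register view for illustration; real offsets are in RM0089. */
struct i2c_regs {
    volatile uint32_t ic_enable;
    volatile uint32_t ic_enable_status;
};

/* ti2c_poll in nanoseconds: 10 signaling periods at the highest speed used.
 * max_speed_hz must be non-zero. */
uint32_t i2c_poll_interval_ns(uint32_t max_speed_hz)
{
    return (uint32_t)(10ull * 1000000000ull / max_speed_hz);
}

/*
 * Disable the I2C and wait until the hardware reports it is shut down.
 * sleep_ns may be NULL, in which case the loop polls back to back.
 * Returns 0 on success, -1 if max_poll_count reads were reached.
 */
int i2c_disable(struct i2c_regs *r, unsigned max_poll_count,
                void (*sleep_ns)(uint32_t), uint32_t poll_ns)
{
    unsigned poll_count = 0;                    /* step 4 */

    r->ic_enable = 0;                           /* step 5 */

    for (;;) {
        uint32_t status = r->ic_enable_status;  /* step 6 */

        poll_count++;
        if (poll_count >= max_poll_count)
            return -1;                          /* time-out */

        if (!(status & 1u))                     /* step 7: IC_EN cleared */
            return 0;

        if (sleep_ns != NULL)
            sleep_ns(poll_ns);
    }
}
```

For a 400 kb/s bus, i2c_poll_interval_ns(400000) gives the 25 us interval quoted in step 1.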
28
General purpose I/O (GPIOA-B)
This chapter focuses on GPIO functionality and operation.
For the GPIO feature list, refer to the SPEAr1340 datasheet:
●
Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
●
RM0089, Reference manual, SPEAr1340 address map and registers
28.1
Overview
The SPEAr1340 device integrates 2 instances of a general purpose I/O digital block,
identified as GPIOA and GPIOB.
The GPIO block provides 8 programmable inputs or outputs. Each input/output can be
controlled through an APB interface.
Figure 165. GPIOA and GPIOB block diagram
[Block diagram: the GPIOA/B block with its APB slave interface. Signals shown: PCLK,
PRESETn, GPIOINTR, nGPEN[7:0], GPOUT[7:0] and GPIN[7:0], interfaced with GPIO_A/B[7:0].]
See also: Figure 166: GPIO detailed block diagram
28.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
28.3
Clocks
The GPIO block uses PCLK, the APB clock.
See also: Chapter 5: Reset and clock generator (RCG).
28.4
Resets
The APB reset, PRESETn, is used to reset the GPIO block. All block registers are cleared
during power-on-reset (LOW). This disables the output drivers for the GPIO lines, so that the
pins are configured as inputs.
28.5
Interrupts
GPIO sends a single interrupt, GPIOINTR, to the interrupt controller.
See also: Section 28.6.2: Interrupt detection logic
28.6
Functional description
Figure 166 shows a block diagram of the GPIO block with its main interfaces.
Figure 166. GPIO detailed block diagram
[Block diagram: the GPIO block with its APB interface, register block, ID, input/output
control, input/output multiplexor, interrupt control and interrupt detection logic.
Signals shown: PCLK, PRESETn, PSEL, PENABLE, PWRITE, PADDR[11:2], PWDATA[7:0],
PRDATA[7:0], nGPIODIR[7:0], GPIODATA[7:0], GPINSync2[7:0], nGPEN[7:0], GPOUT[7:0],
GPIN[7:0], GPIO_A/B[7:0] and GPIOINTR.]
28.6.1
APB interface
The APB interface generates read and write decodes for accesses to control, interrupt, and
data registers. A read-only decode is provided to access the ID codes.
28.6.2
Interrupt detection logic
The interrupt section of the GPIO is controlled by a set of seven registers, each controlling a
different feature or condition of the interrupt triggering chain. You can select the source of
the interrupt, its polarity, and edge properties.
GPIO has the ability to generate mask-programmable interrupts based on the level, or
transitional value of any of its GPIO lines.
When one or more GPIO lines cause an interrupt, a single interrupt output GPIOINTR is
sent to the interrupt controller. Refer to Appendix A: Interrupts for the GPIO interrupt lines.
You can configure interrupts so that they are generated either on a level or on an edge of
the GPIO line. The edge and level on which the interrupt is generated are
programmable. The set of seven registers in the APB interface allows the following
functionality:
●
interrupt generation either on a change in the level, one edge, or both edges of the
GPIO line
●
reading raw and masked interrupt status
●
reading from and writing to the interrupt enable
●
interrupt clear (write-only).
Each input/output line has a corresponding masked interrupt output line. Setting the
appropriate mask bit HIGH enables the interrupt.
For edge-triggered interrupts, software must clear the interrupt to enable any further
interrupts. For a level case, it is assumed that the external source holds the level constant
for the interrupt to be recognized by the processor.
Three registers are required to define the edge or sense that causes an interrupt:
●
GPIOIS (Interrupt sense register)
●
GPIOIBE (Interrupt both edges register)
●
GPIOIEV (Interrupt event register)
Figure 167 shows how the bits of the three registers combine to select an interrupt source
event.
Note:
Each bit of the interrupt registers corresponds to a GPIO pin.
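Figure 167. GPIO interrupt registers
[Flowchart: a GPIOIE bit value of 0 means the interrupt is masked. When the GPIOIE bit is 1,
GPIOIS selects edge (0) or level (1) detection. For edge detection, GPIOIBE selects both
edges (1) or a single edge (0); for a single edge, GPIOIEV selects the rising (1) or falling (0)
edge. For level detection, GPIOIEV selects the HIGH (1) or LOW (0) level.]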
Registers to be programmed
Table 141 shows how an interrupt is triggered by a rising edge detected on input pin 2.
Table 141. Triggering an interrupt from pin 2
Register | Desired trigger                                                     | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
GPIOIS   | 0 = edge, 1 = level                                                 | x | x | x | x | x | 0 | x | x
GPIOIBE  | 0 = single edge, 1 = both edges                                     | x | x | x | x | x | 0 | x | x
GPIOIEV  | 0 = LOW level or negative edge, 1 = HIGH level or positive edge     | x | x | x | x | x | 1 | x | x
GPIOIE   | 0 = masked, 1 = not masked                                          | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0
If any GPIOIE register bit is 0, the interrupt triggering on the associated line is disabled. In
Table 141 an x indicates that the value of the associated bit is irrelevant, a consequence of
the bit being masked by the GPIOIE register setting.
You must perform programming of the interrupt control registers when the respective
interrupts are not enabled. Writing to interrupt control registers can generate spurious
interrupts if the corresponding bits are enabled.
See also: Recommendations in Section 28.7.1: Interrupt configuration.
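The way the GPIOIS, GPIOIBE, GPIOIEV and GPIOIE bits combine can be modelled with a small pure function. This is an illustrative software model with hypothetical names, not a description of the hardware implementation: it compares two successive samples of the pins and does not model the latching of edge events until they are cleared.

```c
#include <stdint.h>

/* Software copy of the four selection registers, one bit per GPIO line. */
struct gpio_irq_cfg {
    uint8_t is;    /* GPIOIS:  0 = edge, 1 = level */
    uint8_t ibe;   /* GPIOIBE: 0 = single edge, 1 = both edges */
    uint8_t iev;   /* GPIOIEV: 0 = LOW/falling, 1 = HIGH/rising */
    uint8_t ie;    /* GPIOIE:  0 = masked, 1 = not masked */
};

/* Bit mask of lines that trigger, given the previous and current pin samples. */
uint8_t gpio_irq_events(const struct gpio_irq_cfg *c, uint8_t prev, uint8_t now)
{
    uint8_t rising  = (uint8_t)(~prev & now);
    uint8_t falling = (uint8_t)(prev & ~now);
    uint8_t level   = (uint8_t)~(now ^ c->iev);   /* pin level equals GPIOIEV */
    uint8_t single  = (uint8_t)((rising & c->iev) | (falling & ~c->iev));
    uint8_t edge    = (uint8_t)((c->ibe & (rising | falling)) | (~c->ibe & single));
    uint8_t raw     = (uint8_t)((c->is & level) | (~c->is & edge));

    return (uint8_t)(raw & c->ie);
}
```

With the Table 141 settings (GPIOIS bit 2 = 0, GPIOIBE bit 2 = 0, GPIOIEV bit 2 = 1, GPIOIE = 0x04), only a 0-to-1 transition on pin 2 yields a non-zero result.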
28.7
Operation
28.7.1
Interrupt configuration
On application of PRESETn as LOW:
●
interrupts in the desired line are disabled by clearing the corresponding bit in GPIOIE
●
all registers are cleared to zero
●
input and output pins are configured as inputs
●
interrupts to the external world are all masked as disabled
●
raw interrupts are cleared to zero
●
edge triggered interrupts are selected as source
Recommendations
If you want to generate edge-triggered interrupts you must perform the following initialization
sequence to avoid spurious interrupts being interpreted by the system.
1.
Program GPIOIBE appropriately for single-edge or both-edge detection
2.
Program GPIOIEV, if you selected single-edge detection in the previous step
3.
Program GPIOIS to select edge-triggered path
4.
Apply three clock pulses to clean interrupt pipeline (wait for three PCLK periods)
5.
Ensure GPIN[7:0] bus remains stable throughout this operation
6.
Clear all interrupts by writing 0xFF to GPIOIC
7.
Program GPIOIE to enable interrupts
For example, to detect an interrupt on a rising edge of the signal on pin 2, you should
configure the GPIO registers as follows:
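GPIOIBE &= 0xFB;   // pin 2: single-edge detection
GPIOIEV |= 0x4;    // pin 2: rising edge
GPIOIS &= 0xFB;    // pin 2: edge-triggered
// Wait for 3 PCLK cycles, assuming the ARM core clock is 4 times PCLK.
// The counter is volatile so that the compiler does not remove the delay loop.
for (volatile int i = 0; i < 12; i++);
GPIOIC = 0x4;      // clear any pending interrupt on pin 2
GPIOIE |= 0x4;     // enable the interrupt on pin 2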
28.7.2
Operation of the input/output lines (I/O read/write)
The GPIO block comprises eight programmable input/output lines. Data and control for
these lines are provided by the data register GPIODATA and the data direction register
GPIODIR.
On reads, the data register contains the current status of the GPIO pins, whether they are
configured as input or output. Writing to the data register only affects the pins that are
configured as outputs.
Data register (GPIODATA)
The address bus is used as a mask on read/write operations of the data register GPIODATA.
The eight address lines used as a mask are PADDR[9:2]. Therefore, the GPIODATA register
effectively covers 256 locations in the address space.
Data direction register (GPIODIR)
The data direction register operates in the following manner:
●
0 indicates the corresponding I/O pin is defined as an input
●
1 indicates the corresponding I/O pin is defined as an output
Write operation
1.
Set desired bits of GPIODIR to '1', making IO pin an output.
2.
Choose the write address for the GPIODATA register so that the corresponding bits of
PADDR[9:2] are set to '1', unmasking the write operation to the desired IO pins.
3.
Write to the address chosen in step 2.
During a write, PADDR[9:2] bits behave as follows:
●
If the address bit associated with the data bit is HIGH (unmasked), the value of the
associated GPIODATA register bit is altered.
●
If it is LOW (masked), the associated GPIODATA register bit is left unchanged.
For example:
PADDR[9:2] = 'b00000000 -> all bits of GPIODATA are masked.
PADDR[9:2] = 'b11111111 -> all bits of GPIODATA are unmasked.
PADDR[9:2] = 'b00001110 -> bits 1, 2 and 3 of GPIODATA are unmasked.
If bits 1, 2 and 5 of GPIODATA need to be written, leaving the remaining bits (0, 3, 4, 6 and 7)
unchanged, the PADDR[9:2] bits (used as mask) should be 'b00100110.
PADDR[9:0] = 'b0010011000 (bits 1:0 are appended as "00").
Therefore, the address which should be accessed for the write operation is:
GPIODATA + 'b0010011000 (or GPIODATA + 0x098)
When a value of 0xFB is written to the address 0x098 then:
●
bits 5, and 1 of the GPIO pins are set to 1, and bit 2 is set to 0
●
the other bits are not changed.
Figure 168 shows the above effect of the address value of 0x098 operating on the data
value of 0xFB.
Figure 168. Example to write to address 0x098
PADDR[9:2] bit | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2
0x098          | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0
0xFB           | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1
GPIODATA       | u | u | 1 | u | u | 0 | 1 | u
GPIODATA bit   | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
Note:
In Figure 168 “u” indicates that the bit value is unchanged.
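The address-as-mask write can be reproduced with two small helpers. These are illustrative functions with hypothetical names: the first computes the byte offset from the GPIODATA base for a given bit mask, the second models the value GPIODATA holds after the masked write.

```c
#include <stdint.h>

/* Byte offset from the GPIODATA base that unmasks the bits set in mask
 * (the mask occupies PADDR[9:2], so it is shifted left by two). */
uint32_t gpiodata_offset(uint8_t mask)
{
    return (uint32_t)mask << 2;
}

/* Model of GPIODATA after writing value through the given mask:
 * unmasked bits take the written value, masked bits keep the old one. */
uint8_t gpiodata_after_write(uint8_t old, uint8_t value, uint8_t mask)
{
    return (uint8_t)((old & ~mask) | (value & mask));
}
```

For the example above, gpiodata_offset(0x26) gives 0x098, and writing 0xFB through that mask sets bits 5 and 1, clears bit 2, and leaves every other bit as it was.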
Read operation
During a read, PADDR[9:2] bits behave as follows:
●
If the bit is HIGH, the associated data bit value is read.
●
If the bit is LOW, the data bit is read as '0'.
For example:
If bits 0, 4 and 5 need to be read, PADDR[9:2] = 'b00110001.
Therefore, the address which should be accessed for the read is GPIODATA + 'b0011000100
(or GPIODATA + 0x0C4).
When reading from 0x0C4 then:
●
bits 5, 4, and 0 of the GPIO pins are returned
●
the value of bits 7, 6, 3, 2, and 1 are returned as zero, regardless of their state.
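The masked read can be modelled in the same way. The helpers below are an illustrative sketch with hypothetical names: one computes the read offset for a bit mask, the other models the value returned on PRDATA[7:0] for a given pin state.

```c
#include <stdint.h>

/* Byte offset from the GPIODATA base that selects the bits set in mask. */
uint32_t gpiodata_read_offset(uint8_t mask)
{
    return (uint32_t)mask << 2;      /* mask occupies PADDR[9:2] */
}

/* Model of the value read back: masked-out bits are returned as 0. */
uint8_t gpiodata_masked_read(uint8_t pins, uint8_t mask)
{
    return (uint8_t)(pins & mask);
}
```

With the mask 'b00110001 the offset is 0x0C4, and a pin state of 0xFE reads back as 0x30.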
Figure 169 shows a read from the address 0x0C4 and the output on the PRDATA[7:0] lines.
Figure 169. Example to read from address 0x0C4
PADDR[9:2] bit | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2
0x0C4          | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1
GPIN[7:0]      | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0
PRDATA[7:0]    | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0
Data bit       | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
29
Extended general purpose I/O (XGPIO)
This chapter focuses on XGPIO functionality and operation.
For the XGPIO feature list, refer to the SPEAr1340 datasheet:
●
Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
●
RM0089, Reference manual, SPEAr1340 address map and registers
29.1
Overview
Extended general purpose I/Os are individually programmable input/output pins (output by
default) through an AHB slave interface.
Figure 170. XGPIO block diagram
[Block diagram: the XGPIO block with its AHB slave interface. Signals shown: HCLK,
HRESETn, GPIO_INT, GPIO_EN[249:0], GPIO_OUT[249:0] and GPIO_IN[249:0].]
29.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
29.3
Clocks
The XGPIO module uses the system AHB clock, HCLK.
See also: Chapter 5: Reset and clock generator (RCG).
29.4
Interrupts
The signal rising or falling edge from all IO pins is recorded and ORed to provide an
interrupt. Refer to Appendix A: Interrupts for the XGPIO interrupt line.
29.5
Functional description
The XGPIO module contains a set of registers mapped to external general purpose IOs.
The registers are accessible through an AHB slave interface.
The module can also generate interrupts on the falling or rising edge (programmable) of the
signal from any of the IOs. For reading, writing, or generating an interrupt from the IOs, the
pins must be enabled as XGPIOs.
As shown in the example of Figure 171, to enable SPEAr1340 pads as XGPIOs, you must
configure the PAD_FUNCTION_EN_* miscellaneous registers. GPIO_IN*, GPIO_OUT* and
GPIO_EN* are XGPIO register names.
All these registers are documented in the RM0089, Reference manual, SPEAr1340 address
map and registers.
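Figure 171. Mapping of XGPIO40 pad to XGPIO registers
[Diagram: the XGPIO40 pad (IO) connected through MUX logic to the IN, OUT and EN signals,
which map to bit 8 of the 32-bit GPIO_IN1, GPIO_OUT1 and GPIO_EN1 registers respectively.
The MUX logic is controlled by PAD_FUNCTION_EN_2[3].]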
29.5.1
XGPIO IN read
When enabled as XGPIO, the status of the signal on IO pins is always available in
GPIO_IN0 to GPIO_IN7 registers.
An AHB read sent to GPIO_IN0 through GPIO_IN7 provides the status of the signals on the
XGPIOs.
For example, a read from register GPIO_IN1 will provide the status of XGPIO32 to
XGPIO63 in the bits 0 to 31 respectively.
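The mapping from an XGPIO number to its register and bit follows from each GPIO_INx register covering 32 pins. The sketch below uses hypothetical names and operates on an array holding values already read from GPIO_IN0 to GPIO_IN7; the caller is expected to pass an XGPIO number between 0 and 249.

```c
#include <stdbool.h>
#include <stdint.h>

/* Position of an XGPIO inside the GPIO_INx/GPIO_OUTx/GPIO_ENx register sets. */
struct xgpio_pos {
    unsigned reg;   /* register index x */
    unsigned bit;   /* bit number inside that register */
};

/* Locate XGPIO n: 32 pins per register. */
struct xgpio_pos xgpio_locate(unsigned n)
{
    struct xgpio_pos p;

    p.reg = n / 32u;
    p.bit = n % 32u;
    return p;
}

/* Return the sampled state of XGPIO n from values read from GPIO_IN0..GPIO_IN7. */
bool xgpio_read(const uint32_t gpio_in[8], unsigned n)
{
    struct xgpio_pos p = xgpio_locate(n);

    return ((gpio_in[p.reg] >> p.bit) & 1u) != 0;
}
```

For XGPIO40 this gives register index 1 and bit 8, which matches the GPIO_IN1 example above.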
29.5.2
XGPIO OUT write
To drive a signal on the IO, the XGPIO must be enabled as an output: in the GPIO_ENx
registers, clear the bit that corresponds to the desired IO. For example, to set XGPIO40 as an
output, bit 8 of the GPIO_EN1 register should be reset to '0'.
By default, all of the XGPIOs are enabled as outputs (GPIO_ENx registers default to 0x0).
Once the XGPIO is programmed as an output, the values of the bits from the GPIO_OUTx
registers are present on the XGPIO.
The XGPIO can then be driven as required by writing to the GPIO_OUTx register.
If an XGPIO is in input mode, the value of the corresponding bit from GPIO_OUTx register
will not have any effect on it.
The following example illustrates driving XGPIO40 out to '1':
PAD_FUNCTION_EN_2 &= 0xFFFFFDFF;
GPIO_EN1 &= 0xFFFFFEFF;
GPIO_OUT1 |= 0x100;
29.5.3
Using an XGPIO pin as an interrupt
An interrupt can be generated on the rising or falling edge of the signal on an XGPIO.
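Figure 172. Interrupt detection logic on XGPIOs
[Logic diagram: GPIO_IN, qualified by GPIO_EN, passes through two registers (REG) into an
edge detector (EDGE DET) configured by IRQ_EDGE. The detected edge is recorded in GPIO_IRQ
and gated by GPIO_IRQ_MASK to produce GPIO_INT.]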
Enabling interrupt generation
1.
Initialize interrupt ID[139] (GPIO_IRQx)
2.
Enable as XGPIOs the pins on which the interrupt is to be received
(PAD_FUNCTION_EN_x).
3.
Enable the XGPIOs in step 2 as inputs (GPIO_ENx).
4.
Program the desired edge for interrupt captures (rising or falling edge); by default,
falling edge is captured as interrupt (register IRQ_EDGEx).
5.
Disable the interrupt masks on the enabled XGPIOs (GPIO_IRQ_MASKx).
The following example illustrates setting XGPIO40 for interrupt detection on rising-edge of
the signal:
PAD_FUNCTION_EN_2 &= 0xFFFFFDFF;
GPIO_EN1 |= 0x100;
IRQ_EDGE1 |= 0x100;
GPIO_IRQ_MASK1 &= 0xFFFFFEFF;
Checking XGPIO interrupt status
●
Use registers GPIO_IRQ0 through GPIO_IRQ7
For example, reading from GPIO_IRQ1 gives the interrupt status from XGPIO32 through
XGPIO63.
Clearing an interrupt
●
In the appropriate GPIO_IRQx register, write 0 to the interrupt bit to be cleared,
keeping the remaining bits 1.
For example, to clear interrupt from XGPIO40: GPIO_IRQ1 = 0xFFFFFEFF;
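The status check and the clear value for any XGPIO can be derived from its number. The helpers below are an illustrative sketch with hypothetical names; they compute values only and leave the register accesses to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Index x of the GPIO_IRQx register that holds the status of XGPIO n. */
unsigned xgpio_irq_reg_index(unsigned n)
{
    return n / 32u;
}

/* True if the value read from GPIO_IRQx shows an interrupt for XGPIO n. */
bool xgpio_irq_pending(uint32_t irq_value, unsigned n)
{
    return ((irq_value >> (n % 32u)) & 1u) != 0;
}

/* Value to write to GPIO_IRQx to clear only the interrupt of XGPIO n:
 * 0 in the bit to be cleared, 1 in all remaining bits. */
uint32_t xgpio_irq_clear_value(unsigned n)
{
    return ~(1u << (n % 32u));
}
```

For XGPIO40 the register index is 1 and the clear value is 0xFFFFFEFF, as in the example above.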
30
Keyboard controller (KBD)
This chapter focuses on KBD functionality and operation.
For the KBD feature list, refer to the SPEAr1340 datasheet:
●
Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
●
RM0089, Reference manual, SPEAr1340 address map and registers
30.1
Overview
The GPIO keyboard controller integrated in SPEAr1340 offers a 3-mode input and output
port. It provides a 12-bit GPIO, a 6x6 keyboard, or a 2x2 keyboard plus 8-bit GPIO, and
offers an interface to the industry-standard APB bus.
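Figure 173. Keyboard controller block diagram
[Block diagram: the APB interface and wrapper, with clock, reset and interrupt signals,
connected to a 20-bit output register (REG OUT, with DOUT, CLK, LOAD/LATCH and
Reg_Out_Enable) and a 20-bit input register (REG IN, with DIN, CLK and Reg_In_Enable).
The parallel/KBD port connects to the key switch matrix through the strobe signals
ST_1-ST_2 or ST_1-ST_6 and the input signals KBD_1-KBD_2 or KBD_1-KBD_6.]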
30.2
Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
30.3
Clocks
KBD is clocked by the APB clock, PCLK. The KBD scan rate is derived from PCLK through
programmable dividers. The PCLK frequency value must be programmed to match the input
clock frequency of KBD: if the clock frequency is X MHz, the programmed PCLK frequency
value should be X. KBD uses this value to measure 1 microsecond: when its counter reaches
the value X, 1 microsecond has elapsed. The programmed value must therefore be aligned
with the input clock.
See also: Chapter 5: Reset and clock generator (RCG).
30.4
Interrupts
An interrupt is produced when a pressed key is scanned and properly validated (described
in Section 30.5.1: Operating modes). The software interrupt service routine handles this
interrupt by reading and writing STATUSREG. Writing zero (0) to this register is required to
drive the interrupt line low.
30.5
Functional description
The keyboard controller provides an APB bus interface with a test wrapper, intra-chip
signals, and 20 programmable I/O pins, of which only 12 are exposed to the SoC interface.
The wrapper converts internal chip signals into AMBA-compatible signals (non-AMBA
signals are clock, reset, and interrupts).
Control, status and data signals are all accessible through the APB bus interface.
Table 142 cross-references block signals and external interconnections.
Note:
Different modes require that different pins be connected to the key switch matrix.
Section 30.5.1: Operating modes provides a description of each operating mode.
Table 142. Block signals and external interconnection cross reference
PORT PIN | GPIO   | KEYBOARD6x6             | KEYBOARD2x2
ROW0     | GPIO0  | Keyboard Output (ROW0)  | Keyboard Output (ROW0)
ROW1     | GPIO1  | Keyboard Output (ROW1)  | Keyboard Output (ROW1)
ROW2     | GPIO2  | Keyboard Output (ROW2)  | GPIO2
ROW3     | GPIO3  | Keyboard Output (ROW3)  | GPIO3
ROW4     | GPIO4  | Keyboard Output (ROW4)  | GPIO4
ROW5     | GPIO5  | Keyboard Output (ROW5)  | GPIO5
COL0     | GPIO9  | Keyboard Input (COL0)   | Keyboard Input (COL0)
COL1     | GPIO10 | Keyboard Input (COL1)   | Keyboard Input (COL1)
COL2     | GPIO11 | Keyboard Input (COL2)   | GPIO11
COL3     | GPIO12 | Keyboard Input (COL3)   | GPIO12
COL4     | GPIO13 | Keyboard Input (COL4)   | GPIO13
COL5     | GPIO14 | Keyboard Input (COL5)   | GPIO14
30.5.1
Operating modes
General purpose input output interface mode (GPIO)
At power-on, all pins are inputs by default. In GPIO mode, each of the available 12 signals
can be individually programmed to be an output through the APB bus. Once programmed,
each pin maintains its identity as an input or an output.
To enable the GPIO mode, set the mode control bits in the mode control register to [01].
The ARM may read or write to the data register at any time. Writing to a pin that has been
programmed as an input has no effect. Reading this register provides the status/values on
all of the pins, inputs, and outputs.
Keyboard interface mode
In keyboard mode, the value of an externally connected keyboard (scanned at a
programmed rate) can be read from the APB bus.
If the key number select bits in the mode control register are set to [01], the keyboard
contains up to 36 keys. Twelve port pins provide a 6x6 scanning matrix; six of the pins are
strobes, and six of the pins are inputs. If the key number select bits in the mode control
register are set to [10], the keyboard contains up to four keys. Four port pins provide a 2x2
scanning matrix; two of the pins are strobes, and two of the pins are inputs. In this case the
remaining 8 pins can be used as GPIO.
The circuitry scans the keys every 10, 20, 40 or 80 ms, as selected by software: control
register bits b3 and b2 determine the keyboard scanning rate. Each time the timer expires,
the keyboard is scanned. Each strobe is active for 60 µs, so a 6x6 keyboard is scanned in
360 µs and a 2x2 keyboard is scanned in 120 µs.
Two successive scan cycles are required to validate a key, and only one key is allowed down
in a scan cycle. If only one key down is detected and it is the same key as on the previous
scan, a bit is set in the Status register indicating new key data, the code for the key is
written to the Keyboard value register, and an interrupt is set. Once a key has been
validated as being down, the “no key down” condition must also be validated for two
complete cycles when the key is released. Key release is signaled only once.
The key value is coded on eight bits: the lower nibble gives the column number
(0, 1, 2 … 8) and the higher nibble gives the row number (0, 1, 2 … 8) of the pressed key.
The keypad encoder initialization is made one time when the application starts (prescaler
load value, keyboard enable, scan rate, keyboard operation mode), and then the software
handles the interrupt line in order to process keyboard interrupts.
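The key-code layout and the scan timing described above can be illustrated with a short Python sketch. It is not driver code: the function names `decode_key` and `scan_time_us` are hypothetical, and the only facts used are the nibble layout of the key value and the 60 µs strobe time given in the text.

```python
STROBE_US = 60  # each row strobe is active for 60 microseconds (from the text)


def decode_key(code):
    """Split an 8-bit key value into (row, column)."""
    if not 0 <= code <= 0xFF:
        raise ValueError("key value must fit in eight bits")
    row = (code >> 4) & 0xF  # higher nibble: row number
    col = code & 0xF         # lower nibble: column number
    return row, col


def scan_time_us(rows):
    """Time needed to strobe every row of the matrix once."""
    return rows * STROBE_US


print(decode_key(0x35))  # key at Row(3), COL(5) in Table 143
print(scan_time_us(6))   # 6x6 keyboard
print(scan_time_us(2))   # 2x2 keyboard
```

An interrupt handler would typically apply a decode of this kind to the value read from the Keyboard value register.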
Table 143. Key-code table (hex values)

          COL(0)   COL(1)   COL(2)   COL(3)   COL(4)   COL(5)
Row(0)    0x00     0x01     0x02     0x03     0x04     0x05
Row(1)    0x10     0x11     0x12     0x13     0x14     0x15
Row(2)    0x20     0x21     0x22     0x23     0x24     0x25
Row(3)    0x30     0x31     0x32     0x33     0x34     0x35
Row(4)    0x40     0x41     0x42     0x43     0x44     0x45
Row(5)    0x50     0x51     0x52     0x53     0x54     0x55

Note: 1  For the value above, if the PCLK frequency does not exactly equal an integer, use the
         formula below to calculate the new value:
         New value = {Pclk_frequency}*{original value}/{int(Pclk_frequency)}
      2  In KBD mode, IOs used as ROWs (such as strobe signals) are connected as OPEN DRAIN,
         and IOs used as COLs (such as key pressed) are connected as normal bidirectional. In
         GPIO mode, all IOs are connected as normal bidirectional.
      3  PARDATAREG bits are not directly one to one mapped with external IOs. Mapping between
         IOs and PARDATAREG bits is shown in Table 144.
Table 144. Mapping between external pins and PARDATAREG bits(1)

IOs     PARDATAREG bits    IOs     PARDATAREG bits
ROW0    PARDATAREG[0]      COL0    PARDATAREG[9]
ROW1    PARDATAREG[1]      COL1    PARDATAREG[10]
ROW2    PARDATAREG[2]      COL2    PARDATAREG[11]
ROW3    PARDATAREG[3]      COL3    PARDATAREG[12]
ROW4    PARDATAREG[4]      COL4    PARDATAREG[13]
ROW5    PARDATAREG[5]      COL5    PARDATAREG[14]

1. When KBD is used in 2x2 keyboard configuration, the IOs {ROW0, ROW1} and {COL0, COL1} are
   connected to the 2x2 keyboard matrix. The remaining IOs {ROW2, ROW3, ROW4, ROW5} and {COL2,
   COL3, COL4, COL5} can be used in GPIO mode.
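Because the PARDATAREG bits are not contiguous, software usually goes through a lookup table rather than shifting by the pin index. The following Python sketch shows one way to do this; the bit positions are those listed in Table 144, while the helper names `pin_level` and `pins_high` are hypothetical and the register value is passed in as a plain integer.

```python
# Bit positions taken from Table 144.
PARDATAREG_BIT = {
    "ROW0": 0, "ROW1": 1, "ROW2": 2, "ROW3": 3, "ROW4": 4, "ROW5": 5,
    "COL0": 9, "COL1": 10, "COL2": 11, "COL3": 12, "COL4": 13, "COL5": 14,
}


def pin_level(pardatareg_value, pin):
    """Return the level (0 or 1) of one IO from a PARDATAREG value."""
    return (pardatareg_value >> PARDATAREG_BIT[pin]) & 1


def pins_high(pardatareg_value):
    """List the IOs whose bit is set, in table order."""
    return [pin for pin, bit in PARDATAREG_BIT.items()
            if (pardatareg_value >> bit) & 1]


value = (1 << 0) | (1 << 14)
print(pins_high(value))
```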
31 A/D converter (ADC)
This chapter focuses on ADC functionality and operation.
For the ADC feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
31.1 Overview
SPEAr1340 integrates a 10-bit resolution analog-to-digital converter. The resolution can be
extended up to 17 bits by programming the controller.
31.2 Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
31.3 Clocks
The ADC uses the following clocks:
● PCLK: the APB clock, 83 MHz
● CLK: test clock, 2.5 MHz to 20 MHz
● ADC_CLK: used to program the ADC clock frequency. The duty cycle is set by the
  ratio of the ADC_CLK_H and ADC_CLK_L values, while the frequency is the APB clock
  frequency divided by the sum of these values. The maximum frequency of CLK_ADC is
  20 MHz while the minimum is 2.5 MHz; with an 83 MHz APB clock this implies:
  5 ≤ (ADC_CLK_H + ADC_CLK_L) ≤ 33
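The divider constraint can be checked numerically. The Python sketch below is an illustration only: the function name `adc_clock` is hypothetical, and the duty cycle is computed as ADC_CLK_H / (ADC_CLK_H + ADC_CLK_L), which assumes that ADC_CLK_H counts the high phase of the clock.

```python
PCLK_HZ = 83_000_000  # APB clock frequency given in the text


def adc_clock(adc_clk_h, adc_clk_l, pclk_hz=PCLK_HZ):
    """Return (frequency in Hz, duty cycle) for a pair of divider values."""
    total = adc_clk_h + adc_clk_l
    if not 5 <= total <= 33:
        raise ValueError("ADC_CLK_H + ADC_CLK_L must be between 5 and 33")
    freq_hz = pclk_hz / total
    duty = adc_clk_h / total  # assumption: ADC_CLK_H is the high phase
    return freq_hz, duty


print(adc_clock(3, 2))    # fastest setting, 16.6 MHz
print(adc_clock(17, 16))  # slowest setting, about 2.515 MHz
```

The two bounds follow from the frequency limits: a sum of 4 would give 20.75 MHz, above the 20 MHz maximum, and a sum of 34 would give about 2.44 MHz, below the 2.5 MHz minimum.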
31.4 Interrupts
ADC generates a conversion ready interrupt to ARM which indicates that:
● In normal mode, the single conversion on the selected channel is finished
● In enhanced mode, the conversions on all the enabled channels are finished
31.5 Functional description
The ADC requires 50 µs to enter functional mode from the power-down state. To enable the
ADC, set the POWERUP bit (bit 4) of the ADC_STATUS register to 1.
The conversion starts when the ENABLE bit (bit 0) of the ADC_STATUS register is set to 1.
When the conversion is completed, an interrupt signal is generated (bit 8 of the ADC_STATUS
register is set to 1). At this point, the data can be read. When the reading is finished, the
interrupt is cleared.
– When POWERUP = 0, the ADC is inactive and the output latches contain the last
  conversion.
– When POWERUP is set to 1, the ADC enters functional mode after 50 µs.
To start the conversion, set ENABLE = 1. A dedicated circuit controls the internal
start signal START (see Figure 174). When START = 1, the end-of-conversion signal
(EOC) is reset to 0, the conversion data field of the AVERAGE register is reset to
0x0, and the acquisition occurs. Then the finite-state machine (FSM) switches START to 0
and the conversion phase takes place. A conversion takes 13 clock cycles to complete.
After the end of conversion and the read, if the POWERUP bit is 1, the EN signal to the ADC
is kept at 1. In this way, the next conversion needs only a write to ADC_STATUS, and the FSM
does not wait for the start-up time. If the POWERUP bit is 0, the FSM switches off the ADC
and the next conversion again requires the start-up time.
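The effect of the POWERUP bit on latency can be estimated with a small Python sketch. It is a rough model only: the function name `conversion_time_us` is hypothetical, and it assumes that the 13 cycles are counted at the ADC clock frequency and that no other overhead (register access, acquisition outside the 13 cycles) is involved.

```python
STARTUP_US = 50.0           # start-up time from power-down (from the text)
CYCLES_PER_CONVERSION = 13  # clock cycles per conversion (from the text)


def conversion_time_us(adc_clk_hz, already_powered_up):
    """Estimate the time from ENABLE = 1 to end of conversion, in microseconds."""
    convert_us = CYCLES_PER_CONVERSION / adc_clk_hz * 1e6
    if already_powered_up:
        return convert_us
    return STARTUP_US + convert_us


print(conversion_time_us(20_000_000, True))   # ADC kept powered
print(conversion_time_us(20_000_000, False))  # ADC was switched off
```

Under these assumptions the start-up time dominates, which is why keeping POWERUP at 1 pays off when conversions are frequent.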
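Figure 174. Timing diagram of ADC conversions
[Timing diagram showing the signals CLK, EN, START, EOC and DATA over two conversions of
13 cycles each; DATA carries the data of the last conversion, then DATA VALID after each
conversion.]
31.5.1 Enhanced mode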
If the ENM bit (bit 10) of the ADC_STATUS register is set to 1, conversions are performed
continuously on the selected channels. The start of conversions may be external
(EXTSCANRATE bit = 1) or internal. In the first case, an external signal is needed to start
the conversions; in the second case, configure the SCAN_RATE register to set
the number of APB clock cycles between the start of two consecutive scan conversions.
To read the conversion results, read the CHx_DATA registers. Bit 17 is the VALID
bit. This bit is 1 when the read data is valid, and 0 in the following cases:
● ENM = 0
● Bit 0 of CHx_CTRL = 0
● The controller is writing a result to the register.
Starting from channel 0, it is possible to select a request to DMA (DmaEn = 1) in order to
transfer the converted data on channels from 0 to DmaLastCh, which is a programmable
value (ADC_STATUS register). When the conversion on DmaLastCh is completed, the
controller performs a burst request to DMA to start transferring the converted data. In the
meantime, ADC continues converting the remaining channels. At the next conversion on
channel #0, ADC checks if the last DMA transfer is completed. If it is completed, the
controller starts a new conversion on all the channels in the programmed range. If the DMA
transfer is not completed yet, the ADC continues the conversion starting from the
DmaLastCh+1.
Example:
If the range of the enabled channels is [0, 1, 2, 3, 6, 7] and DmaLastCh = 3, at the end of the
conversion on channel #3 the ADC sends a burst request to DMA. While DMA is transferring
the converted data from channels #0 to #3, the ADC continues the conversion on the
remaining channels: #6 and #7. At the end of the conversion on channel #7, the ADC should
restart from channel #0. However, if the DMA is still transferring data, it restarts from
channel #6 (the first enabled channel after DmaLastCh).
Note:
If all the enabled channels are selected for DMA transfers, the conversions will stall.
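The channel selection rule can be modeled in a few lines of Python. This is a behavioral sketch of the description above, not of the hardware implementation; the function name `next_scan` and its arguments are hypothetical.

```python
def next_scan(enabled_channels, dma_last_ch, dma_busy):
    """Return the channels converted in the next scan.

    If the previous DMA transfer is complete, all enabled channels are
    converted. Otherwise only the channels after DmaLastCh are converted,
    so that the data still being transferred is not overwritten.
    """
    channels = sorted(enabled_channels)
    if dma_busy:
        return [ch for ch in channels if ch > dma_last_ch]
    return channels


print(next_scan([0, 1, 2, 3, 6, 7], 3, dma_busy=False))
print(next_scan([0, 1, 2, 3, 6, 7], 3, dma_busy=True))
print(next_scan([0, 1, 2, 3], 3, dma_busy=True))
```

The last call returns an empty list: when every enabled channel is covered by the DMA range, there is nothing left to convert while the DMA is busy, which corresponds to the stall mentioned in the note.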
31.5.2 Touchscreen mode
Each CHx_CTRL register has a TOUCHSCREEN bit (bit 4) which indicates that the related
channel is used for the touchscreen feature. By enabling this bit, you can dedicate 2 or 4
channels to a single or a double touchscreen.
When selecting 2 channels, the 1st channel converts the X value and the 2nd one the Y
value; when selecting 4 channels, the 1st and 2nd channels convert the 2 X values and the
3rd and 4th channels convert the 2 Y values. In both cases, a signal (XY_SEL) is generated to
switch between the X and Y axes: it is high when converting the X values and low when
converting the Y values.
To enable the touchscreen, it is also mandatory to enable the channel, by setting both
bit 0 and bit 4 of the CHx_CTRL register. You must select exactly 2 or 4 channels: any other
number causes incorrect behavior of the ADC controller.
31.5.3 High-resolution mode
The resolution of the ADC analog cell is 10 bits, but it can be extended in high-resolution
mode. High-resolution mode is enabled by setting the HIGHRESOLUTION bit in
the ADC_STATUS register.
In high-resolution mode, the ADC performs oversampling. The number of samples is
programmable via the NSAMPLES bits (bits 7:5) in the ADC_STATUS register.
The sum of the converted results can be read from the AVERAGE register.
By reading the sum of the conversion results, software can use decimation or interpolation
averaging methods to obtain higher resolution (> 10 bits).
Oversampling 4 times increases the resolution by 1 bit, oversampling 16 times increases
it by 2 bits, and so on: n additional bits require 4^n samples. Instead of dividing the sum
of the converted results by NSAMPLES as in normal averaging, right-shift the sum read from
the AVERAGE register by n bits, where n is the number of additional bits of resolution.
In enhanced mode (ENM = 1) with high-resolution mode enabled
(HIGHRESOLUTION = 1), the number of samples can be defined individually for each
channel in the CHx_CTRL registers, and the sum of the converted results can be read from
the CHx_DATA registers.
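The oversampling-and-decimation step can be written out as a short Python sketch. The function names are hypothetical; the sum passed in stands for the value that software would read from the AVERAGE (or CHx_DATA) register after 4^n samples of the 10-bit converter.

```python
BASE_BITS = 10  # resolution of the ADC analog cell


def samples_needed(extra_bits):
    """Number of samples to accumulate for `extra_bits` additional bits."""
    return 4 ** extra_bits


def high_resolution_result(sample_sum, extra_bits):
    """Decimate the accumulated sum to a (10 + extra_bits)-bit result."""
    return sample_sum >> extra_bits


# Worked example: 2 extra bits need 16 samples. With every sample at the
# 10-bit full-scale code 1023, the sum is 16368 and the 12-bit result is 4092.
n = 2
total = samples_needed(n) * 1023
print(samples_needed(n), high_resolution_result(total, n))
```

Shifting by n instead of dividing by 4^n keeps the n low-order bits of the sum, which is where the additional resolution comes from.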
31.5.4 DMA handshaking interface
The ADC has a DMA handshaking interface called ADC_TX, as shown in Table 49: DMAC
MUX - selecting the peripheral. It is composed of only 2 signals: DMA_SREQ (single
request) and DMA_CLR (clear).
The ADC uses this interface to transfer the converted digital data to the DMAC, so it is
always used in peripheral-to-memory mode and the flow controller is always the DMAC.
32 PWM generators (PWM)
This chapter focuses on PWM functionality and operation.
For the PWM feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
32.1 Overview
The PWM module has three configuration registers for each of the four independent
channels (PWM1, PWM2, PWM3, PWM4), and one additional register that has the master
enable bit for synchronous channel operation.
Figure 175. PWM block diagram
[Block diagram: an APB interface (PCLK, PRESETn) and the configuration registers, with four
prescaler and pulse generator pairs (1 to 4) producing the outputs PWM1 to PWM4.]
32.1.1 Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
32.2 Clocks
The PWM module uses the system APB clock, PCLK.
See also: Chapter 5: Reset and clock generator (RCG).
32.3 Functional description
32.3.1 Prescaler
The prescaler contains a 14-bit counter which generates an enable signal for the pulse
generators. When the corresponding PWM channel is enabled, the prescale counter is
compared against the programmed prescale value (bits 15:2 of Control_Regx). If the
counter matches the programmed value, the PWM generator’s enable signal becomes
HIGH, otherwise it is kept LOW.
If the programmed value is ‘0’, the PWM generator’s enable signal always remains HIGH;
this allows the PWMs to operate at the maximum frequency of PCLK (83 MHz).
32.3.2 Pulse generator
The pulse generator consists of a 16-bit counter which is clocked by PCLK, but is incremented
only when enabled by the prescaler.
The counter value is compared to the programmed duty and period values for generating
PWM waves. The counter is incremented periodically, and is reset to ‘0’ every time it
reaches the programmed period value. The PWM output is driven ‘1’ for counter values
ranging from 0 to (duty - 1), and ‘0’ for values ranging from duty to period.
The duty and period values are configured through the registers: Duty_Regx and
Period_Regx respectively.
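The counter behavior described above can be reproduced with a small Python model. It is a behavioral sketch only, stepping once per prescaled clock tick; the function name `pwm_waveform` is hypothetical.

```python
def pwm_waveform(duty, period, ticks):
    """Return the PWM output level for each of `ticks` prescaled clock ticks."""
    levels = []
    counter = 0
    for _ in range(ticks):
        # Output is 1 for counter values 0 .. duty - 1, and 0 from duty to period.
        levels.append(1 if counter < duty else 0)
        # The counter is reset to 0 after it reaches the programmed period value.
        counter = 0 if counter == period else counter + 1
    return levels


print(pwm_waveform(duty=3, period=7, ticks=8))
```

With Duty = 3 and Period = 7, one full cycle lasts Period + 1 = 8 ticks, of which 3 are high. If the duty value exceeds the period value, the comparison `counter < duty` is always true and the modeled output stays high.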
32.4 Programming
32.4.1 Configuring a channel
Each channel of the PWM module is configured through three configuration registers:
● Control_Regx
● Duty_Regx
● Period_Regx
To configure a channel:
1. Set a prescale value (Control_Regx)
   If required, prescale the input clock to the PWM counter (see Note 1 following this
   procedure). The clock can be scaled down from PCLK (83 MHz).
2. Set the desired period (Period_Regx)
   This value corresponds to the number of prescaled clock cycles.
3. Set the desired duty cycle (Duty_Regx)
   This value also corresponds to the number of prescaled clock cycles. Make the duty
   value less than or equal to the period (set in step 2), or the output will remain high. For
   more information, see the notes following this procedure.
4. Enable the output channel(s) (Control_Regx bit 0 and Master_Ctrl bit 0)
   If the channels must work synchronously:
   a) Enable bit 0 of all of the required Control_Regx.
   b) Enable Master_Ctrl bit 0.
   If no synchronous operation is required:
   a) Enable Master_Ctrl bit 0.
   b) Configure and enable the desired channels.
See also: Figure 176: Output pulse generation example (Duty = 3, Period = 7)
Note: 1  Without any prescaling, the minimum output pulse frequency is
         83000 kHz / 2^16 ~= 1.266 kHz.
         For an output pulse below this frequency, the input clock to the PWM counter must be
         prescaled; any combination of prescaling and period factors that results in the
         desired output frequency can be used.
         Example of generating a 4 kHz pulse with a 50% duty cycle:
         Prescale = 0x0;  Period = 0x510D; Duty = 0x2887
         or
         Prescale = 0x19; Period = 0x33D;  Duty = 0x19F
      2  If the duty value is greater than the period value, the corresponding channel output
         remains high.
      3  The required minimum value of the duty register is 1 (0 = LOW output). Because of this,
         the maximum output frequency is PCLK/2 with a duty cycle of 50%, achieved by
         setting Duty = 1, Period = 1, and Prescaler = 0.
         The maximum PCLK frequency = 83 MHz.
         Duty cycle = Duty/(Period + 1) * 100 %, where the duty setting is less than or equal to
         the period setting.
         Duty cycle resolution = 1/(Period + 1) (duty cycle is defined in terms of prescaler
         output clock pulses).
         Minimum duty cycle = 100/(Period + 1) %
         = 100/2^16 % ~= 0.0015 % (for max period setting of 0xFFFF).
         Maximum duty cycle = 100 % (HIGH output, no pulse, duty value greater than period
         value).
Figure 176. Output pulse generation example (Duty = 3, Period = 7)
[Waveform: PWMx O/P against the prescaled counter clock; the output is high for PWM_duty = 3
clocks (Duty Reg) out of PWM_period = 8 clocks (Period Reg + 1).]
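The figures in the notes can be cross-checked with a short Python sketch. The duty-cycle formula is the one given in Note 3. The prescaler handling is an assumption: a prescale value of 0 is treated as no division, and a non-zero value N is treated as division by N, which is what the second 4 kHz example implies. The function names are hypothetical.

```python
PCLK_HZ = 83_000_000  # maximum PCLK frequency given in the text


def pwm_frequency_hz(period, prescale=0, pclk_hz=PCLK_HZ):
    """Output frequency for the given Period and Prescale register values."""
    # Assumption: prescale 0 means no division; a value N divides by N.
    divider = prescale if prescale else 1
    return pclk_hz / (divider * (period + 1))


def duty_cycle_percent(duty, period):
    """Duty cycle in percent; the output stays high when duty > period."""
    if duty > period:
        return 100.0
    return 100.0 * duty / (period + 1)


print(pwm_frequency_hz(0x510D), duty_cycle_percent(0x2887, 0x510D))
print(pwm_frequency_hz(0x33D, prescale=0x19), duty_cycle_percent(0x19F, 0x33D))
print(pwm_frequency_hz(1), duty_cycle_percent(1, 1))  # PCLK/2 at 50 %
```

Both register combinations from Note 1 evaluate to 4 kHz at 50 % under this model.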
33 HDMI CEC interfaces (CEC)
This chapter focuses on CEC functionality and operation.
For the CEC feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
33.1 Overview
CEC (consumer electronics control) is a protocol that provides high-level control functions
among the various audiovisual products in a user’s environment. CEC
operates at low speeds, with minimal processing and memory overhead.
Figure 177. CEC block diagram
33.2 Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
33.3 Clocks
The CEC is operated with a single AMBA clock, HCLK.
See also: Chapter 5: Reset and clock generator (RCG)
33.4 Interrupts
CEC generates the interrupt cec_it, the status of which can be found by reading register
CEC_CTL. This interrupt clears when the corresponding bit is cleared in register CEC_CTL.
33.5 Functional description
The CEC interface handles complete messages, but requires that the host CPU provides or
unloads the data bytes one-by-one.
33.5.1 Control logic
The CEC interface assumes one of the following states:
● STANDBY
● IDLE
● RX
● RX_ERROR
● TX
● TX_ERROR
Figure 178. CEC control logic
STANDBY
STANDBY is entered on a chip reset, or on a reset of CEC_CFG.P_EN, and exited by
setting the P_EN bit. In STANDBY:
● Any on-going transmission or reception operation is not interrupted and completes
  normally. The interface is actually in STANDBY mode when the P_EN bit is read back
  as 0.
● Activity on the CEC line is ignored, and the clock prescaler is stopped for minimum
  power consumption.
IDLE
IDLE is entered whenever a message has been transmitted or received successfully, or
whenever an error has been processed.
In IDLE, the CEC interface looks for either a transmit request (the TSOM bit is set) or a start
bit.
RX
RX is entered when a start bit is detected while no message is pending for transmission.
Once the header is received, the destination address is compared with the value
programmed in the own address register CEC_OAR. If a match is not found and the address
is not the broadcast address 0xF, the block is not acknowledged and the controller reverts
back to the IDLE state. Otherwise, the controller remains in the RX state where the host
CPU is requested to retrieve all message bytes from the RX buffer one-by-one.
The RBTF bit set in the control register CEC_CTL signals an available byte. The host CPU
can become aware of this either by polling the latter register or by enabling interrupts in the
configuration register CEC_CFG. If the RBTF bit is not cleared by the time a new block is
received, the new block is not acknowledged, forcing the initiator to restart the message
transmission, giving the host CPU another chance to retrieve all message bytes on time.
Note:
It is the responsibility of the software driver to ignore messages where the number of
operands is less than the number specified for that opcode.
Figure 179. Example: a complete message reception
Note:
Because a message may have been queued for transmission but arbitration lost, two
different values can be read from the control register CEC_CTL.
RX_ERROR
The interface enters the RX_ERROR state when a condition listed in Table 145 occurs.
The RX_ERROR state is not left until the receive error flag RX_ERR is cleared. When
the error state is then left depends on the selected error resync mode:
● Default mode waits for an inter-frame spacing of at least five bit times
● Advanced mode leaves immediately

Table 145. RX_ERROR conditions, types, and actions

Error condition                              Error type    Action
A broadcast message is negatively            Acknowledge   No specific action is taken
acknowledged
The RBTF bit is not cleared while a new      RBTF          Directly-addressed messages are not
byte is ready to be written to the RX                      acknowledged, and broadcast messages are
buffer                                                     negatively acknowledged
A start bit is detected before the           Start bit     No specific action is taken
end-of-message flag
A rising edge on the CEC line is detected    Bit timing    The CEC line is pulled low for 70 time quanta
outside the applicable window
A falling edge on the CEC line is detected   Bit period    The CEC line is pulled low for 70 time quanta
outside the applicable window
Figure 180. Example: RX_ERROR
Note:
Because a message may have been queued for transmission but arbitration lost, two
different values can be read from the control register CEC_CTL.
TX
The interface enters the TX state when the TSOM bit in the control register CEC_CTL is set.
Once in this state, the interface ensures that the required signal-free time has elapsed
before generating a start bit by waiting for the quanta counter of the bit timing logic to
exceed the value listed in Table 146, unless another device emits a start bit (in which case
the arbitration phase begins and lasts until the initiator address is fully transmitted).
Table 146. Wait loop

Previous state              Wait value
TX_ERROR                    192
The device was receiving    288
Any other                   384

Note: It is the responsibility of the software driver to send an initiator address consistent
with the logical address programmed in the own address register CEC_OAR. The arbitration is
lost if the received initiator address, contained in the least significant nibble of the shift
register, differs from the initiator address still present in the TX buffer. In this case, the
controller switches to the RX state immediately, but continues to try to transmit after the
receive phase until it is granted ownership of the bus.
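The wait values of Table 146 are expressed in time quanta. The Python sketch below converts them to time, using the 0.05 ms (50 µs) quantum of the bit timing logic; the state labels and function names are hypothetical and serve only to index the table.

```python
TIME_QUANTUM_US = 50  # 0.05 ms time quantum of the bit timing logic

# Wait values from Table 146, indexed by a hypothetical label for the
# previous state of the interface.
WAIT_QUANTA = {"TX_ERROR": 192, "RX": 288}
DEFAULT_WAIT_QUANTA = 384  # "Any other" previous state


def wait_quanta(previous_state):
    """Quanta counter value to exceed before a start bit may be generated."""
    return WAIT_QUANTA.get(previous_state, DEFAULT_WAIT_QUANTA)


def wait_us(previous_state):
    """Same wait expressed in microseconds."""
    return wait_quanta(previous_state) * TIME_QUANTUM_US


for state in ("TX_ERROR", "RX", "IDLE"):
    print(state, wait_quanta(state), wait_us(state))
```

The three values correspond to 9.6 ms, 14.4 ms and 19.2 ms respectively.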
If arbitration is not lost, a new byte is requested to be written to the TX buffer each time the
TBTF bit is set in the control register. The host CPU can become aware of this by polling the
control register or by enabling interrupts in the configuration register CEC_CFG. If the host
CPU does not write the byte in time, the transmit error flag TX_ERR is set. The message is
transmitted successfully when the TEOM bit is set, but it should be considered lost as soon
as the TX_ERR bit is set.
Figure 181. Example: a complete message transmission
TX_ERROR
The interface enters the TX_ERROR state when a condition listed in Table 147 occurs.
The TX_ERROR state is not left until the transmit error flag TX_ERR is cleared. When
the error state is then left depends on the selected error resync mode:
● The default mode waits for an inter-frame spacing of at least three bit times
● The advanced mode leaves immediately

Table 147. TX_ERROR conditions, types, and actions

Error condition                               Error type
A directly-addressed message block is not     Acknowledge
acknowledged, or a broadcast message block
is negatively acknowledged
The TBTF bit is not cleared when the          TBTF
requested byte must be transmitted
The bit timing logic senses an unexpected     Line
bit

Action (common to the error types above): Because no error signaling mechanism is specified
for the initiator, no specific action is undertaken apart from aborting the current message
and clearing the transmit request flag TSOM. The error handler decides whether
retransmission is possible, depending on whether transmission has already failed six times
or not, and sets the transmit request flag if required.
Figure 182. Example: a TX_ERROR
33.5.2 Bit timing logic
The bit timing logic (BTL) is in charge of extracting valid bits from the CEC line and of
signaling line errors. It operates at a 0.05 ms time quantum, because the bit timings in the
specification are expressed with this level of precision.
The Rx data is resynchronized on the system clock and a 2/3 majority voter removes high
frequency spikes before processing at the time-quantum rate. Also, to improve immunity to
transition bounces and positive spikes, transitions are ignored for one time quantum period
following a valid edge.
On a valid Rx falling edge, the quanta counter is captured and reset. If the captured value is
outside valid bounds (see Figure 183), a bit period error has been detected and is signaled
by pulling the line low for 70 time quanta.
On a valid Rx rising edge, the quanta counter is captured and compared against valid
windows. If the edge is found to be outside, a line error is signaled unless the device has
been programmed not to report such violations.
Note:
If a line error occurs while a start bit is expected, the whole message is ignored and no error
is reported.
In the absence of a rising edge, the quanta counter is left counting up to 511.
Retransmission is allowed when the counter value is above 192. A new initiator may transmit
when the counter is above 288, but the same initiator must wait until the counter reaches
384.
Figure 183. Quanta counter timing
33.5.3 Bit shaping logic (BSL)
The bit shaping logic generates the proper line waveform to signal a start bit, a logical 1 data
bit, or an error bit. The same time quantum is used as for the bit timing logic.
Figure 184. Bit shaping logic timing
33.5.4 Prescaler
The prescaler defines the time quantum for the bit timing logic and the bit shaping logic, and
provides a time quantum reference for complying with the required signal-free time.
A 12-bit counter provides the necessary 50 µs timebase, allowing for system clocks up to
82 MHz. The counter resets at the beginning of every bit to enable the bit timing logic to
operate with maximum precision.
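The relationship between the 12-bit counter, the 50 µs timebase and the 82 MHz limit can be verified with a short Python sketch. It assumes that the counter must count one full time quantum of system clock cycles and can count at most 2^12 of them; the function names are hypothetical, and the value actually programmed into the prescaler register may be encoded differently (see RM0089).

```python
TIME_QUANTUM_US = 50  # 50 microsecond timebase
COUNTER_BITS = 12     # width of the prescaler counter


def cycles_per_quantum(clock_hz):
    """System clock cycles in one time quantum."""
    cycles = round(clock_hz * TIME_QUANTUM_US / 1_000_000)
    if cycles > 2 ** COUNTER_BITS:
        raise ValueError("clock too fast for a 12-bit prescaler counter")
    return cycles


def max_clock_hz():
    """Highest clock for which one quantum still fits in the counter."""
    return (2 ** COUNTER_BITS) * 1_000_000 // TIME_QUANTUM_US


print(cycles_per_quantum(48_000_000))
print(max_clock_hz())
```

The computed limit is 81.92 MHz, which rounds to the 82 MHz quoted above.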
33.5.5 Normal functional behavior
Message description
All transactions on the CEC line consist of an initiator and one or more followers. The
initiator sends the message structure and the data. The follower receives any data
and sets any acknowledgement bits.
A message is conveyed in a single frame, which consists of a start bit followed by a header
block and, optionally, an opcode and a variable number of operand blocks. All these blocks
are made of an 8-bit payload (with the most significant bit transmitted first) followed by an
end-of-message (EOM) bit and an acknowledge (ACK) bit.
The EOM bit is set in the last block of a message and kept reset in all others. If a message
contains additional blocks after an EOM is indicated, those additional blocks should be
ignored. The EOM bit may be set in the header block to ‘ping’ other devices, to ascertain
whether they are active.
The acknowledge bit is always set to high impedance by the initiator so that it can be driven
low either by the follower that has read its own address in the header, or by a follower
that needs to reject a broadcast message.
The header comprises the source logical address field and the destination logical address
field. The special address 0xF is used for broadcast messages.
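The block structure can be illustrated with a short Python sketch. The nibble order of the header (initiator address in the upper four bits, destination address in the lower four bits) follows the standard CEC header layout and is not spelled out in the text above; the function names are hypothetical. The ACK bit is left out because it is driven by the follower.

```python
BROADCAST_ADDRESS = 0xF  # special destination address for broadcast messages


def header_byte(initiator, destination):
    """Build the 8-bit header payload from two 4-bit logical addresses."""
    for address in (initiator, destination):
        if not 0 <= address <= 0xF:
            raise ValueError("logical addresses are 4 bits wide")
    # Assumption (standard CEC layout): initiator first, i.e. upper nibble.
    return (initiator << 4) | destination


def block_bits(payload, eom):
    """Bits sent by the initiator for one block: payload MSB first, then EOM."""
    bits = [(payload >> shift) & 1 for shift in range(7, -1, -1)]
    bits.append(1 if eom else 0)
    return bits


ping = header_byte(0x4, BROADCAST_ADDRESS)
print(hex(ping), block_bits(ping, eom=True))
```

A header block sent with EOM set and no further blocks is the 'ping' case described above.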
Figure 185. Message description
Bit timing
The format of the start bit is unique and identifies the start of message. It should be
validated by its low duration and by its total duration.
All the remaining data bits in the message, after the start bit, have consistent timing. The
high-to-low transition at the end of the data bit is the start of the next data bit, except for the
final bit where the CEC line remains high.
Figure 186. Bit timing
Line use
Devices that wish to transmit or retransmit a message onto the CEC line must ensure that
the CEC line has been inactive for a number of bit periods. This signal-free time is defined
as the time since the final bit of the previous frame, and depends on the initiating device and
on the current status as shown in Figure 187.
Figure 187. Signal-free time
Because only one initiator is allowed at any one time, an arbitration mechanism is provided
to avoid conflict when more than one initiator begins transmitting at the same time.
CEC line arbitration starts with the leading edge of the start bit and continues until the end of
the initiator address bits within the header block. During this period, the initiator monitors the
CEC line and if, while driving the line to high impedance, it reads it back to 0, the initiator
assumes it has lost arbitration and stops transmitting. It then becomes a follower.
Figure 188. Arbitration phase
33.5.6 Error conditions and error handling
Bit error
A data bit (excluding the start bit) is considered invalid if the period between falling edges is
less than the minimum bit period. A follower is expected to note such errors by generating a
low bit period on the CEC line of 1.4 to 1.6 times the normal data bit period (nominally
3.6 ms).
Figure 189. Bit error
Message error
A message is considered lost, and may therefore be retransmitted, if:
● A block of a directly-addressed message is not acknowledged
● A block of a broadcast message is negatively acknowledged
● A low impedance is detected on the CEC line when this condition is not expected (line
  error)
Attempt retransmission at least once, and up to five times.
34 Display controller (CLCD)
This chapter focuses on CLCD functionality and operation.
For the CLCD feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
34.1 Overview
The TFT LCD controller provides all of the necessary control signals to interface directly to a
variety of TFT LCD panels.
The following figure shows the display controller block diagram.
Figure 190. LCD controller block diagram
[Block diagram. Blocks shown: AHB slave interface; processor status & control registers; LCD
timing and pixel clock generation; LCD timing & control; pulse width modulation generator
(LCD PWM); DMA controller; AXI master interface; input FIFO (2048x64); pixel unpack; palette
(256 x 16); output FIFO (16 words x 18/24-bit); output formatter (LCD data); interrupt status
& mask registers (Interrupt).]
34.2 Pins
For a complete pin description, refer to Doc ID 023063, Data sheet, SPEAr1340, Dual-core
Cortex A9 HMI embedded MPU.
34.3 Clocks
The LCD controller core has the following clock domains:
● Bus clock (HCLK) domain
  – AXI Master and AHB slave interfaces
  – Control and status registers
  – DMA controller
  – Write side of the palette two-port RAM
  – Write side of the input FIFO
  – Interrupt controller
● Pixel clock (PCLK) domain
  – Read side of the input FIFO
  – Read side of the palette two-port RAM
  – Pixel unpack
  – Timing & control unit
  – Output formatter
Within the pixel clock domain, there are two versions of PCLK: the internal pixel clock,
PCLK, which serves as the on-chip clock for the LCD pixel pipeline logic; and the external
pixel clock, lcd_pclk, which serves as the off-chip clock to the LCD panel pixel clock input.
The clock generator derives PCLK and lcd_pclk from either HCLK or pclk_in. HCLK is the
slave bus clock input; pclk_in is a separate LCD input clock. The clock generator
outputs are determined by the programming parameters of the pixel clock timing register
(PCTR). The generator can derive PCLK from HCLK by dividing it by a factor from 1 (bypass)
to 128.
Because it might not be convenient to derive PCLK from HCLK, a separate clock input,
pclk_in, is provided. The pclk_in can then be set to the exact value required by the LCD
panel using MISC registers.
Finally, note that lcd_pclk differs from PCLK in that lcd_pclk can be held inactive while
the LCE bit of control register 1 (CR1) is inactive. This facilitates the power
sequencing requirements of the panel.
See also: Chapter 5: Reset and clock generator (RCG).
34.4
Interrupts
There are three coordinated interrupt registers: the interrupt status register (ISR), the
interrupt mask register (IMR), and the interrupt vector register (IVR). The ISR and IMR are
both read/write registers while the IVR is read-only.
Any internally generated interrupt sets its corresponding bit in the ISR. If the interrupt's
corresponding mask bit is set in the IMR, the corresponding bit in the IVR is also
set, generating an interrupt to the processor. The processor interrupt handler can respond
by reading the IVR to determine which interrupt to process. At the end of the interrupt
response, software can clear the interrupt by writing logic 1 to the
corresponding bit in the ISR. Through the IMR, software has full
control over which interrupts are enabled.
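The interaction between the three registers can be sketched as a small behavioral model. This is an illustration only: it assumes that the IVR is the bitwise AND of ISR and IMR, and the bit positions used are hypothetical (the real bit assignments are in RM0089).

```python
class ClcdInterruptModel:
    """Behavioral model of the ISR/IMR/IVR trio (assumed: IVR = ISR & IMR)."""

    def __init__(self):
        self.isr = 0  # raw status, set by hardware events
        self.imr = 0  # mask: a set bit enables the interrupt

    def raise_source(self, bit):
        """Hardware event: latch the status bit regardless of the mask."""
        self.isr |= 1 << bit

    @property
    def ivr(self):
        """Read-only view of pending interrupts that are enabled."""
        return self.isr & self.imr

    def irq_asserted(self):
        return self.ivr != 0

    def write_isr(self, value):
        """Software write: each 1 bit clears the matching status bit."""
        self.isr &= ~value


if __name__ == "__main__":
    clcd = ClcdInterruptModel()
    clcd.imr = 0b01            # enable hypothetical source 0 only
    clcd.raise_source(1)       # masked source: latched but no IRQ
    clcd.raise_source(0)       # enabled source: IRQ asserted
    pending = clcd.ivr         # handler reads IVR
    clcd.write_isr(pending)    # then clears what it serviced
    print(bin(clcd.isr), clcd.irq_asserted())
```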
34.5
Functional description
34.5.1
LCD controller core
The LCD controller is first initialized by the processor through the AHB slave bus interface. This
interface is a read/write interface on which the LCD controller can only respond to bus
transactions, not initiate them. The minimal setup consists of programming the timing registers
for the horizontal and vertical timing signals (registers HTR, VTR1, VTR2 and HVTER) and the
DMA base address register (register DBAR). After that, the control bit LCE in the control
register must be set; the LCD controller then runs, accessing frame buffer memory and processing and
piping the data through to the display.
If you use the palette, you must first load it. The palette load method is selected by the PSS bit of
CR1. There are two options for loading the palette:
● Statically: by the processor via the AHB slave interface, or
● Dynamically: with each frame, via the AXI master interface and the frame buffer.
The start of each frame begins with an internal start sync pulse from the timing and control
unit (see Section 34.5.3: Timing and control unit), coincident with the vertical
synchronization signal. This start sync pulse initiates the DMA controller to start
accessing data from frame buffer memory via the master interface; the start sync pulse
also initiates the pixel unpack to start accepting data from its side of the input FIFO.
The master interface initiates read transactions on the bus. Read burst lengths of 4, 8, or
16 words can be programmed to improve bus utilization. Received frame data
is written into the input FIFO. The FIFO bridges the two clock domains.
The pixel unpack extracts 1, 2, 4, 8, 16, 18, or 24 bit-per-pixel (bpp) pixels from each frame
buffer word. Depending on the bpp programming and whether a palette is used, the pixel
data is sent to either the palette or the output formatter. The output formatter contains an
output FIFO which queues ready pixels for synchronization with the LCD panel timing
signals.
The LCD controller supports two-port interfaces through the addition of a second pixel processing pipeline.
The additional pipeline consists of a palette, output FIFO, and output formatter. Control bit
LPS in control register 1 (register CR1) directs the pixel flow within the pixel unpack module
either to the port/link 1 pipeline only, or to both pipelines, unpacking adjacent pixels and
presenting them in parallel to the port/link 1 and port/link 2 pipelines.
34.5.2
Master and slave bus interfaces
The LCD controller supports the following bus interfaces:
● AMBA 2.0 AHB slave interface, that connects the processor to the LCD controller’s
control & status registers, including the palette RAM. It is characterized by:
  – 32-bit data interface
  – SINGLE word burst
  – OKAY response only
● AMBA 3.0 AXI master, that connects the frame buffer memory to the LCD controller’s
DMA controller and input FIFO. It is characterized by:
  – 64-bit data interface
  – 4, 8, 16 word bursts
  – incrementing-address burst (INCR) only
  – aligned transfers only
  – outstanding read is supported (maximum number of outstanding memory read
    requests is 4, depending on the MRR register)
  – Three error monitors, generating a maskable interrupt: Read burst length error,
    Return ID error, Response signal error
  – Overlap read burst is not supported
  – Out-of order transaction completion is not supported
  – No FIXED or WRAP burst types allowed
34.5.3
Timing and control unit
The timing and control unit uses the horizontal timing register (HTR), the vertical timing
registers 1 & 2 (VTR1, VTR2) and the horizontal/vertical timing extension register (HVTER)
to generate timing signals lcd_vsync, lcd_hsync, and lcd_de to the LCD panel. The timing
unit remains inactive until control bit LCE in control register 1 (CR1) goes active. From that point
the timing & control unit runs until LCE is de-asserted. The timing unit then keeps
running until the end of the current frame and shuts down in an orderly manner. The timing unit can be
reactivated by re-asserting LCE, but often power to the display must be cycled. This can
be accomplished with control bit LPE in control register 1 (CR1), connected as an enable to an
external power source for the LCD panel.
Control bit LCE also plays a role in power sequencing. On startup, while LCE is inactive,
timing signals lcd_pclk, lcd_hsync, lcd_vsync, lcd_de and data signals lcd_r[7:0], lcd_g[7:0],
lcd_b[7:0] are held at logic zero. On LCE shutdown, after the current frame being displayed
completes and the timing unit halts, these same signals are forced to logic zero. At that point
power can safely be removed from the LCD panel.
The timing unit provides interrupt VCT which triggers on one-of-four timing trigger points
during the vertical scan period. The point of triggering is programmable via the interrupt
scan compare register (ISCR).
34.5.4
DMA controller & memory interface
The DMA controller is initialized by the internal frame start pulse, which transfers the DMA
base address register (DBAR) to the DMA current address register (DCAR) and
starts the first memory transfer transaction. The number of words in a burst
is programmed by FDW in control register 1 (CR1). Based on FDW and the number of
empty words in the FIFO, a service request by the DMA controller to the master interface
initiates a frame buffer read. The DMA controller keeps a running total of the number of words
per frame that are fetched from frame buffer memory. If the PSS bit in CR1 is set, indicating
palette load from the frame buffer, the DMA load is divided into two segments. First, the
palette is loaded directly from frame buffer memory (bypassing the input FIFO), based on
the number of words indicated by bits-per-pixel control bits BPP in CR1. Second, after the
palette is loaded, the DMA controller loads the appropriate number of frame buffer words for
each frame through the input FIFO.
If, on the other hand, the PSS bit in CR1 is not set, indicating palette load from the
processor via the slave interface, or if there is no palette required (for 16, 18, 24 bpp), the
DMA controller only loads the appropriate number of frame buffer words for each frame
through the input FIFO.
The software must program the DMA end address register (DEAR) with the frame buffer end
address. The DMA controller will keep reading frame buffer words until the current address
in DCAR equals DEAR. At that point, the DMA Controller will halt until the next frame, when
it reads from the frame buffer starting with the address in DBAR.
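As a worked example of sizing the region between DBAR and DEAR, the sketch below computes the number of bytes of pixel data in one frame. It rests on assumptions that are not stated in this section: pixels are packed without per-line padding, 18 and 24 bpp pixels each occupy one 32-bit word (as the pixel unpack tables suggest), palette words are not counted, and the relationship `DBAR + size` for the end address is hypothetical. Check the DEAR description in RM0089 before relying on it.

```python
# Storage bits consumed per pixel in the frame buffer (assumption drawn from
# the pixel unpack tables: 18 and 24 bpp pixels occupy a full 32-bit word).
STORAGE_BITS = {1: 1, 2: 2, 4: 4, 8: 8, 16: 16, 18: 32, 24: 32}


def frame_buffer_bytes(width, height, bpp):
    """Bytes of pixel data per frame, rounded up to whole 32-bit words."""
    if bpp not in STORAGE_BITS:
        raise ValueError(f"unsupported bpp: {bpp}")
    total_bits = width * height * STORAGE_BITS[bpp]
    words = -(-total_bits // 32)  # ceiling division
    return words * 4


def candidate_end_address(dbar, width, height, bpp):
    """Hypothetical end address: base plus frame size (not confirmed by the manual)."""
    return dbar + frame_buffer_bytes(width, height, bpp)


if __name__ == "__main__":
    size = frame_buffer_bytes(800, 480, 16)
    print(size, hex(candidate_end_address(0x1000_0000, 800, 480, 16)))
```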
In case of outstanding memory read requests (i.e. MRR register equal to 01 or 10), the
register DEAR_MRR overrides the DEAR register. When MRR = 01 or 10, DEAR_MRR serves
as a lookahead address, preventing the DMAC from generating a memory read request while
there are overlapping read requests outstanding.
34.5.5
Frame buffer organization
The frame buffer memory is not included in the LCD controller core. The frame buffer
attached to the master interface provides encoded or unencoded pixels for display on the
LCD panel. If the PSS bit in CR1 is set, indicating palette load from the frame buffer, the
lowest memory locations of the frame buffer must contain the contents for load into the
palette by the DMA controller. Table 148 lists the number of the frame buffer memory words
that need to be allocated for each palette based on the bits-per-pixel control bits BPP in
CR1.
Table 148. Frame buffer support for palette load (PSS = 1)
Frame buffer bits-per-pixel (bpp) | Palette size required | Number of required 32-bit frame buffer words
1 | 2 entries by 16-bit | 1
2 | 4 entries by 16-bit | 2
4 | 16 entries by 16-bit | 8
8 | 256 entries by 16-bit | 128
In both the 1-port and 2-port CLCD configurations, both palettes have to be loaded from the frame
buffer (even though one palette is not used in the 1-port case). The number of frame buffer
memory words that need to be allocated is therefore doubled.
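The word counts in Table 148, and the doubling for two palettes, follow directly from the palette geometry (2^bpp entries of 16 bits each, packed two per 32-bit word). A minimal sketch, with a hypothetical function name:

```python
def palette_words(bpp, palettes=2):
    """32-bit frame buffer words needed to hold the palette data.

    A palette has 2**bpp entries of 16 bits, two entries per 32-bit word.
    With palettes=2 the count is doubled, as required when loading from
    the frame buffer.
    """
    if bpp not in (1, 2, 4, 8):
        raise ValueError("a palette is only used for 1, 2, 4 or 8 bpp")
    entries = 1 << bpp
    words_per_palette = (entries * 16) // 32
    return words_per_palette * palettes


if __name__ == "__main__":
    for bpp in (1, 2, 4, 8):
        print(bpp, palette_words(bpp, palettes=1), palette_words(bpp))
```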
Frame buffer format for no palette load from frame buffer memory, or for bits-per-pixel of 16,
18, 24, which require no palette, is shown in Table 149.
Table 149. Frame buffer organization, PSS = 0 or BPP = 16, 18, 24 bpp
Frame buffer base address offset | Frame buffer contents
0x0 | Start of pixel data
Frame buffer format for palette load from frame buffer memory, with bits-per-pixel of 1, is
shown in Table 150. Addresses from 0x08 to 0x1C do not contain any specific data.
Table 150. Frame buffer organization, PSS = 1, BPP = 1 bpp
FB base address offset | Frame buffer contents
0x0 | Palette1 Entry 1 | Palette1 Entry 0
0x4 | Palette2 Entry 1 | Palette2 Entry 0
... | ... | ...
0x20 | Start of encoded pixel data
Frame buffer format for palette load from frame buffer memory, with bits-per-pixel of 2, is
shown in Table 151. Addresses from 0x10 to 0x1C do not contain any specific data.
Table 151. Frame buffer organization, PSS = 1, BPP = 2 bpp
FB base address offset | Frame buffer contents
0x0 | Palette1 Entry 1 | Palette1 Entry 0
0x4 | Palette2 Entry 1 | Palette2 Entry 0
0x08 | Palette1 Entry 3 | Palette1 Entry 2
0x0C | Palette2 Entry 3 | Palette2 Entry 2
... | ... | ...
0x20 | Start of encoded pixel data
Frame buffer format for palette load from frame buffer memory, with bits-per-pixel of 4 is
shown in Table 152.
Table 152. Frame buffer organization, PSS = 1, BPP = 4 bpp
FB base address offset | Frame buffer contents
0x00 | Palette1 Entry 1 | Palette1 Entry 0
0x04 | Palette2 Entry 1 | Palette2 Entry 0
0x08 | Palette1 Entry 3 | Palette1 Entry 2
0x0C | Palette2 Entry 3 | Palette2 Entry 2
0x10 | Palette1 Entry 5 | Palette1 Entry 4
0x14 | Palette2 Entry 5 | Palette2 Entry 4
0x18 | Palette1 Entry 7 | Palette1 Entry 6
0x1C | Palette2 Entry 7 | Palette2 Entry 6
0x20 | Palette1 Entry 9 | Palette1 Entry 8
0x24 | Palette2 Entry 9 | Palette2 Entry 8
0x28 | Palette1 Entry 11 | Palette1 Entry 10
0x2C | Palette2 Entry 11 | Palette2 Entry 10
0x30 | Palette1 Entry 13 | Palette1 Entry 12
0x34 | Palette2 Entry 13 | Palette2 Entry 12
0x38 | Palette1 Entry 15 | Palette1 Entry 14
0x3C | Palette2 Entry 15 | Palette2 Entry 14
0x40 | Start of encoded pixel data
Frame buffer format for palette load from frame buffer memory, with bits-per-pixel of 8, is
shown in Table 153.
Table 153. Frame buffer organization, PSS = 1, BPP = 8 bpp
FB base address offset | Frame buffer contents
0x000 | Palette1 Entry 1 | Palette1 Entry 0
0x004 | Palette2 Entry 1 | Palette2 Entry 0
0x008 | Palette1 Entry 3 | Palette1 Entry 2
0x00C | Palette2 Entry 3 | Palette2 Entry 2
0x010 | Palette1 Entry 5 | Palette1 Entry 4
0x014 | Palette2 Entry 5 | Palette2 Entry 4
0x018 | Palette1 Entry 7 | Palette1 Entry 6
0x01C | Palette2 Entry 7 | Palette2 Entry 6
0x020 | Palette1 Entry 9 | Palette1 Entry 8
0x024 | Palette2 Entry 9 | Palette2 Entry 8
0x028 | Palette1 Entry 11 | Palette1 Entry 10
0x02C | Palette2 Entry 11 | Palette2 Entry 10
0x030 | Palette1 Entry 13 | Palette1 Entry 12
0x034 | Palette2 Entry 13 | Palette2 Entry 12
0x038 | Palette1 Entry 15 | Palette1 Entry 14
0x03C | Palette2 Entry 15 | Palette2 Entry 14
... | ... | ...
0x3F8 | Palette1 Entry 255 | Palette1 Entry 254
0x3FC | Palette2 Entry 255 | Palette2 Entry 254
0x400 | Start of encoded pixel data
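The layouts in Tables 150 to 153 follow one pattern: the two palettes are interleaved word by word, and encoded pixel data never starts below offset 0x20. The sketch below expresses that pattern. The function names are hypothetical, and the assignment of the first listed entry to the upper half-word (bits [31:16]) is an assumption based on the column order of the tables.

```python
def palette_entry_location(palette, entry):
    """Return (byte_offset, half) of a palette entry in the frame buffer.

    palette is 1 or 2. Each 32-bit word holds two consecutive entries of one
    palette; words for palette 1 and palette 2 alternate.
    Assumption: the odd entry sits in the upper half-word, the even entry
    in the lower half-word.
    """
    if palette not in (1, 2):
        raise ValueError("palette must be 1 or 2")
    offset = (entry // 2) * 8 + (palette - 1) * 4
    half = "upper" if entry % 2 else "lower"
    return offset, half


def pixel_data_offset(bpp):
    """Offset of the first encoded pixel word, as listed in the tables."""
    if bpp not in (1, 2, 4, 8):
        raise ValueError("a palette is only used for 1, 2, 4 or 8 bpp")
    palette_bytes = 2 * ((1 << bpp) * 2)  # two palettes, 2 bytes per entry
    return max(0x20, palette_bytes)       # tables show a 0x20 minimum


if __name__ == "__main__":
    print(palette_entry_location(2, 255), hex(pixel_data_offset(8)))
```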
34.5.6
Input FIFOs
There are 3 input FIFOs: one for base screen resolution (always active) and two for overlay
screen (if overlays are enabled). Each FIFO has a 2k (2048)-word depth by 64-bit width
memory size.
The DMA controller and master interface control the write side on the bus clock (ACLK)
domain, while the pixel unpack controls the read side on the pixel clock (PCLK) domain.
Gray-encoded address pointers are used for the FIFO empty and full flag calculations, as well as
to detect when 4, 8 or 16 word locations are empty.
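Gray encoding is used for pointers that cross clock domains because consecutive values differ in exactly one bit, so a pointer sampled during a transition is off by at most one position. The sketch below shows the standard binary/Gray conversions; the 12-bit pointer width (one wrap bit above the 11 address bits of a 2048-word FIFO) is an assumption for illustration, not a statement about the actual implementation.

```python
def bin_to_gray(n):
    """Convert a binary count to its reflected Gray code."""
    return n ^ (n >> 1)


def gray_to_bin(g):
    """Convert a reflected Gray code back to binary."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


if __name__ == "__main__":
    POINTER_BITS = 12  # assumed: 11 address bits + 1 wrap bit
    size = 1 << POINTER_BITS
    for i in (0, 1, 2, 3, size - 1):
        nxt = (i + 1) % size
        changed = bin_to_gray(i) ^ bin_to_gray(nxt)
        print(i, format(bin_to_gray(i), "012b"), bin(changed).count("1"))
```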
Based on the FDW programming bits in control register 1 (CR1) and the number of empty
FIFO locations, a service request for 4, 8, or 16-word bursts from memory is issued by the
DMA controller to the master interface. Note that 16-word bursts should only be used for
optional FIFO implementations larger than N=16 words, so as not to unnecessarily starve the
pixel unpack of data.
Interrupts IFO (Input FIFO – Overrun) and IFU (Input FIFO – Underrun) trigger whenever
there is a FIFO write with no empty locations or read with no valid data. The write side by
design cannot overrun the FIFO. While tested, this protection remains for potential error
analysis. The read side unpack logic can cause an underrun, but this is due to insufficient
Master Bus bandwidth or frame buffer memory response, causing the input FIFO to go
empty while there is a request for data by the unpack logic.
34.5.7
Pixel unpack
The pixel unpack reads 32-bit data from the input FIFO and extracts 1, 2, 4, 8, 16, 18, or 24
bits-per-pixel data depending on the BPP programming bits in CR1. Note that 1, 2, 4, 8 bpp
are encoded pixels that index an entry into the palette while 16, 18, 24 bpp are unencoded
pixels that directly drive the LCD panel via the output formatter. The CLCD supports big-endian, little-endian, and Windows CE data formats. With each frame, the internal start sync
pulse from the timing & control unit initializes the pixel unpack to start de-queuing words
from the Input FIFO as they appear on the read side.
The following tables list the structure of the data in each frame buffer word in the input FIFO
corresponding to the endian and BPP programming combinations. For each of the three
supported data formats, the pixel unpack extracts the appropriate display pixel from the data
word.
The following are the three data types, with assigned mnemonics:
● LEB_LEP: EBO = 0, EPO = 0. Little endian frame buffer byte, placed in little endian pixel
byte
● BEB_BEP: EBO = 1, EPO = X. Big endian frame buffer byte, placed in big endian pixel
byte
● LEB_BEP (1, 2, 4 bpp only): EBO = 0, EPO = 1. Little endian frame buffer byte, placed in
big endian pixel byte (Windows CE format)
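The three formats differ only in where pixel n sits inside the 32-bit word. The sketch below extracts an encoded pixel (1, 2, 4 or 8 bpp) for each mnemonic, following the bit positions given in Tables 154 to 159. The function name and the string format selectors are hypothetical; this is a software illustration, not a description of the hardware implementation.

```python
def extract_pixel(word, n, bpp, fmt="LEB_LEP"):
    """Return encoded pixel n from a 32-bit frame buffer word.

    fmt is one of "LEB_LEP", "BEB_BEP" or "LEB_BEP" (Windows CE format).
    """
    if bpp not in (1, 2, 4, 8):
        raise ValueError("encoded pixels are 1, 2, 4 or 8 bpp")
    if not 0 <= n < 32 // bpp:
        raise ValueError("pixel index out of range for this bpp")
    if fmt == "LEB_LEP":
        # Pixel 0 in the least significant bits.
        shift = n * bpp
    elif fmt == "BEB_BEP":
        # Pixel 0 in the most significant bits.
        shift = 32 - (n + 1) * bpp
    elif fmt == "LEB_BEP":
        # Bytes in little endian order, pixels within a byte from the top down.
        per_byte = 8 // bpp
        byte, idx = divmod(n, per_byte)
        shift = byte * 8 + 8 - (idx + 1) * bpp
    else:
        raise ValueError(f"unknown format: {fmt}")
    return (word >> shift) & ((1 << bpp) - 1)


if __name__ == "__main__":
    word = 0x12345678
    for fmt in ("LEB_LEP", "BEB_BEP", "LEB_BEP"):
        print(fmt, [hex(extract_pixel(word, n, 4, fmt)) for n in range(8)])
```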
Table 154. LEB_LEP, Input FIFO Read Side bits [31:16]
BPP | Input FIFO read side output bits [31:16] (pixel MSB in the highest bit of each field)
1 | [31] P31, [30] P30, [29] P29, [28] P28, [27] P27, [26] P26, [25] P25, [24] P24, [23] P23, [22] P22, [21] P21, [20] P20, [19] P19, [18] P18, [17] P17, [16] P16
2 | [31:30] P15, [29:28] P14, [27:26] P13, [25:24] P12, [23:22] P11, [21:20] P10, [19:18] P9, [17:16] P8
4 | [31:28] P7, [27:24] P6, [23:20] P5, [19:16] P4
8 | [31:24] P3, [23:16] P2
16 | [31:16] P1
18 | [31:18] unused, [17:16] P0 bits 17:16
24 | [31:24] unused, [23:16] P0 bits 23:16
Table 155. LEB_LEP, Input FIFO Read Side bits [15:0]
BPP | Input FIFO read side output bits [15:0] (pixel MSB in the highest bit of each field)
1 | [15] P15, [14] P14, [13] P13, [12] P12, [11] P11, [10] P10, [9] P9, [8] P8, [7] P7, [6] P6, [5] P5, [4] P4, [3] P3, [2] P2, [1] P1, [0] P0
2 | [15:14] P7, [13:12] P6, [11:10] P5, [9:8] P4, [7:6] P3, [5:4] P2, [3:2] P1, [1:0] P0
4 | [15:12] P3, [11:8] P2, [7:4] P1, [3:0] P0
8 | [15:8] P1, [7:0] P0
16 | [15:0] P0
18 | [15:0] P0 bits 15:0
24 | [15:0] P0 bits 15:0
Table 156. BEB_BEP, Input FIFO Read Side bits [31:16]
BPP | Input FIFO read side output bits [31:16] (pixel MSB in the highest bit of each field)
1 | [31] P0, [30] P1, [29] P2, [28] P3, [27] P4, [26] P5, [25] P6, [24] P7, [23] P8, [22] P9, [21] P10, [20] P11, [19] P12, [18] P13, [17] P14, [16] P15
2 | [31:30] P0, [29:28] P1, [27:26] P2, [25:24] P3, [23:22] P4, [21:20] P5, [19:18] P6, [17:16] P7
4 | [31:28] P0, [27:24] P1, [23:20] P2, [19:16] P3
8 | [31:24] P0, [23:16] P1
16 | [31:16] P0
18 | [31:18] unused, [17:16] P0 bits 17:16
24 | [31:24] unused, [23:16] P0 bits 23:16
Table 157. BEB_BEP, Input FIFO Read Side bits [15:0]
BPP | Input FIFO read side output bits [15:0] (pixel MSB in the highest bit of each field)
1 | [15] P16, [14] P17, [13] P18, [12] P19, [11] P20, [10] P21, [9] P22, [8] P23, [7] P24, [6] P25, [5] P26, [4] P27, [3] P28, [2] P29, [1] P30, [0] P31
2 | [15:14] P8, [13:12] P9, [11:10] P10, [9:8] P11, [7:6] P12, [5:4] P13, [3:2] P14, [1:0] P15
4 | [15:12] P4, [11:8] P5, [7:4] P6, [3:0] P7
8 | [15:8] P2, [7:0] P3
16 | [15:0] P1
18 | [15:0] P0 bits 15:0
24 | [15:0] P0 bits 15:0
Table 158. LEB_BEP, Input FIFO Read Side bits [31:16]
BPP | Input FIFO read side output bits [31:16] (pixel MSB in the highest bit of each field)
1 | [31] P24, [30] P25, [29] P26, [28] P27, [27] P28, [26] P29, [25] P30, [24] P31, [23] P16, [22] P17, [21] P18, [20] P19, [19] P20, [18] P21, [17] P22, [16] P23
2 | [31:30] P12, [29:28] P13, [27:26] P14, [25:24] P15, [23:22] P8, [21:20] P9, [19:18] P10, [17:16] P11
4 | [31:28] P6, [27:24] P7, [23:20] P4, [19:16] P5
8 | [31:24] P3, [23:16] P2
16 | [31:16] P1
18 | [31:18] unused, [17:16] P0 bits 17:16
24 | [31:24] unused, [23:16] P0 bits 23:16
Table 159. LEB_BEP, Input FIFO Read Side bits [15:0]
BPP | Input FIFO read side output bits [15:0] (pixel MSB in the highest bit of each field)
1 | [15] P8, [14] P9, [13] P10, [12] P11, [11] P12, [10] P13, [9] P14, [8] P15, [7] P0, [6] P1, [5] P2, [4] P3, [3] P4, [2] P5, [1] P6, [0] P7
2 | [15:14] P4, [13:12] P5, [11:10] P6, [9:8] P7, [7:6] P0, [5:4] P1, [3:2] P2, [1:0] P3
4 | [15:12] P2, [11:8] P3, [7:4] P0, [3:0] P1
8 | [15:8] P1, [7:0] P0
16 | [15:0] P0
18 | [15:0] P0 bits 15:0
24 | [15:0] P0 bits 15:0
34.5.8
Palette lookup table
The palette is a 256-entry by 16-bit lookup table implemented as a two-port 128-entry by
32-bit RAM. One port ties in with the bus clock (HCLK) domain. Based on control bit
PSS in control register 1 (CR1), the palette is filled either by the processor via the slave
interface, or by the DMA controller from frame buffer memory via the master interface. Regardless
of the PSS setting, the processor can always read the palette RAM's
contents via the slave interface. The second port ties in with the pixel clock (PCLK) domain, enabling the palette
RAM's contents to be indexed by the encoded pixel output of the pixel unpack. The palette’s output
flows to the output formatter.
The endian setting and the least significant bit of the indexing encoded pixel determine which
16-bit half of the 32-bit palette entry is selected. Control bit EBO in CR1
determines the endian setting. In little endian mode, when the least significant bit of the
encoded pixel index is zero, the lower 16-bit palette entry is selected. In big endian mode,
when that bit is zero, the upper 16-bit palette entry
is selected.
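The half-word selection rule can be sketched as follows. The model assumes that the remaining upper bits of the encoded pixel address the 32-bit RAM word, and that a least significant bit of one selects the opposite half to the one named in the text; both points are inferences, and the function name is hypothetical.

```python
def palette_lookup(palette_ram, pixel, big_endian=False):
    """Return the 16-bit palette entry for an encoded pixel value.

    palette_ram is a sequence of 32-bit words (up to 128 of them).
    Assumption: pixel >> 1 selects the RAM word and the least significant
    bit selects the half-word according to the endian setting.
    """
    word = palette_ram[pixel >> 1]
    lsb = pixel & 1
    # Little endian: LSB 0 -> lower half. Big endian: LSB 0 -> upper half.
    use_upper = (lsb == 0) if big_endian else (lsb == 1)
    return (word >> 16) & 0xFFFF if use_upper else word & 0xFFFF


if __name__ == "__main__":
    ram = [0xAAAA5555, 0x12340FED]
    print(hex(palette_lookup(ram, 0)), hex(palette_lookup(ram, 0, big_endian=True)))
```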
34.5.9
Output FIFO and formatter
The output formatter contains an output FIFO that consists of a 16-word by 24-bit
memory. Depending on the bits-per-pixel programming, the incoming selection is either the
pixel unpack (16, 18, 24 bpp) or the palette (1, 2, 4, 8 bpp).
The output FIFO is slave to the pixel unpack, which drives pixels to it either directly or
through the palette. The output FIFO provides back-pressure pipeline freeze capability
when it cannot accept another pixel for queuing. This allows the LCD controller to prefetch
frame buffer data at the start of a frame, filling up both input and output FIFOs and then
freezing until the first line is ready to display. Once the first data enable (lcd_de) signal from
the timing & control unit is active, the output FIFO read side continuously reads for the
remainder of each active horizontal line period. These reads in turn reactivate the pixel
unpack and subsequently the DMA controller to access frame buffer data on a demand basis.
The output formatter interprets the pixel read from the output FIFO according to control bits
BPP, OPS and RGB defined in CR1.
Both the write side and the read side of the output FIFO are on the pixel clock (PCLK)
domain. Gray-encoded address pointers are used for FIFO empty and full flag calculations
as well as the lookahead pipeline freeze signal. The treatment of the output FIFO in this way
allows for the same design for both the input and output FIFOs, and enables the output FIFO
to read on a different clock domain in future designs.
Interrupts OFO (Output FIFO – Overrun) and OFU (Output FIFO – Underrun) trigger
whenever there is a FIFO write with no empty locations or read with no valid data. The write
side logic by design cannot overrun (because of the back-pressure pipeline freeze
capability). While tested, this protection remains for potential error analysis. The read side
can cause an OFU interrupt, and the cause of this could be inadequate bus bandwidth in
accessing frame buffer data.
34.5.10
Power sequencing
The LCD controller provides the following power-up sequencing support:
1. Power is applied to the VLSI device containing the LCD controller core and the LCD
panel. Internally the LCD controller core holds the following signals to logic zero:
  – lcd_vsync
  – lcd_hsync
  – lcd_de
  – lcd_pclk
  – lcd_r[7:0]
  – lcd_g[7:0]
  – lcd_b[7:0]
2. After a pre-determined amount of time specified by the LCD panel and controlled by a
processor timer, the control bit LCE in control register 1 is set to on. With LCE on, the
signals to the LCD panel listed in step 1 are free to drive to their programmed active
levels.
The LCD controller provides the following power-down sequencing support:
a) Control bit LCE in control register 1 is set to off.
b) After the current frame being displayed completes, the signals to the LCD panel
listed above are forced to zero.
c) At the time the signals to the LCD panel are forced to zero, interrupt LDD is
generated, signaling frame completion. After a pre-determined amount of time
specified by the LCD panel, power to the display can be removed.
Note: The control bit LPE in CR1, connected as an enable to an external power source for the
LCD panel, can be used for enabling and disabling power to the LCD panel.
34.5.11
Pulse-width modulation
To support TFT LCD panels with LED backlighting, a pulse-width modulation
(PWM) module is added to the CLCD. Typically, a DC-DC converter provides the constant
current to the LEDs, and the converter contains a brightness input. Modulating the
brightness input with a PWM signal trades off power consumed by the panel against
brightness.
The PWM module has two sources for a clock: the slave bus HCLK or the pclk_in. The
selected clock is pre-scaled to the desired PWM frequency, which in turn is modulated in
pulse width by the PWM duty cycle register (PWMDCR).
34.5.12
Overlay windows
The CLCD supports up to 2 overlay windows. These overlay windows overlay the
background graphics screen. A key feature of an overlay window is that the window is read
from a separate section of memory in substitution of the background window. Thus, there is
no increase of the master bus bandwidth required when activating an overlay window.
Each overlay window contains the following register definitions:
● A control bit in register overlay window enable register (OWER), which enables or
disables the overlay window.
● Overlay window X-Coordinates X_START, X_END in register overlay window X start /
end register x (OWXSER_x). Note that there are 2 of these registers (x = 0 to 1), one
for each potential overlay window.
● Overlay window Y-Coordinates Y_START, Y_END in register overlay window Y start /
end register x (OWYSER_x). Note that there are 2 of these registers (x = 0 to 1), one
for each potential overlay window.
● Start address of frame buffer memory for overlay window in register overlay window
DMA base address register x (OWDBAR_x). The contents for the overlay window are
located in a separate frame buffer memory section, pointed to by OWDBAR_x.
● The current address of overlay window x within the DMA controller is in register overlay
window DMA current address register x (OWDCAR_x).
● End address of frame buffer memory for overlay window in register overlay window
DMA end address register x (OWDEAR_x). It is compared to register OWDCAR_x to
determine last frame buffer memory address to read from for overlay window x.
Note: There are 2 register sets (x = 0 to 1), one set for each potential overlay window, with each
set consisting of the following registers: OWXSER_x, OWYSER_x, OWDBAR_x,
OWDEAR_x and OWDCAR_x.
When an overlay window is properly programmed (when its registers OWXSER_x,
OWYSER_x, OWDBAR_x, OWDEAR_x and OWDCAR_x are written and the corresponding
OWE bit in register OWER is set to 1), the bandwidth requirement increases for the CLCD
master in the following way:
● For the base window only: the bandwidth is for the base window only
● For the base window + 1 overlay: the bandwidth is for the base window + the size of the 1st
overlay window
● For the base window + 2 overlays: the bandwidth is for the base window + the size of the
1st overlay window + the size of the 2nd overlay window
A single overlay window over a background graphics window (Figure 191 below) depicts an
overlay window with its X_START, Y_START, X_END, Y_END coordinates. The origin is
defined as the upper-left point in the screen.
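The window coordinates can be illustrated with a small geometric model that tests whether a screen pixel falls inside an overlay window and counts the pixels a window covers. It assumes that X_END and Y_END are inclusive, which this section does not state; the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OverlayWindow:
    """Overlay rectangle in screen coordinates, origin at the upper-left."""
    x_start: int
    x_end: int
    y_start: int
    y_end: int

    def contains(self, x, y):
        # Assumption: end coordinates are inclusive.
        return (self.x_start <= x <= self.x_end
                and self.y_start <= y <= self.y_end)

    def pixel_count(self):
        return ((self.x_end - self.x_start + 1)
                * (self.y_end - self.y_start + 1))


if __name__ == "__main__":
    win = OverlayWindow(x_start=100, x_end=299, y_start=50, y_end=149)
    print(win.contains(100, 50), win.contains(300, 50), win.pixel_count())
```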
Figure 191. A single overlay window over a background graphics window
[Figure: an overlay window inside the background window, bounded by X start, X end, Y start and Y end]
35
Graphics processing unit (GPU)
This chapter focuses on GPU functionality and operation.
For the GPU feature list, refer to the SPEAr1340 datasheet:
● Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
● RM0089, Reference manual, SPEAr1340 address map and registers
35.1
Overview
The GPU is a complex accelerator. It is mainly intended to be used through the binary
device driver and high-level 3D libraries (OpenGL) that are made available by the IP vendor
(ARM Ltd) and can be obtained from STMicroelectronics.
Figure 192 shows a typical graphics system.
Figure 193 on page 501 shows the functional blocks within the GPU.
Figure 192. GPU top level block diagram
[Figure: block diagram showing the geometry processor, pixel processor, memory management unit (MMU), system bus, display controller and display, with the following callouts]
● Geometry processor: Geometry processing refers to the various tasks the geometry
processor performs. During geometry processing, the geometry processor converts
geometric descriptions of each object to be drawn into a list of polygons for rendering, and
passes this list to the pixel processor. See: Functional description on page 501
● Pixel processor: Produces a final image from the list of primitives generated by the
geometry processor. Collectively, the tasks that the pixel processor performs are referred
to as rendering. See: Pixel processor on page 503
● Memory management unit (MMU): Enables access checking and translation for all pixel
and geometry processor memory accesses. All memory accesses from the pixel and
geometry processor use the MMU for access checking and translation. See: Memory
management unit (MMU) on page 505
35.2
Clocks
● PCLK is the system APB clock. This clock is primarily used to program GPU registers.
● MALI_SUBSYS_AXI_m_aclk is the system AXI clock. This clock is primarily used by the
GPU’s internal DMA to read from and write to external memory.
● MALI_200Mhz_clk is the main GPU clock. All GPU logic other than interface logic (such
as APB or AXI) runs on this clock. Synthesizer SYNTH3 generates this
clock.
See also: Chapter 5: Reset and clock generator (RCG).
35.3
Interrupts
The GPU provides the following interrupt request signals:
● IRQ_m200 for the pixel processor
● IRQ_mgp2 for the geometry processor
● IRQ_mmu for the MMU
In addition to the physical interrupt lines listed above, each unit has several logical
interrupts.
See also: Appendix A: Interrupts
35.4
Resets
● PRESET_n is the APB reset.
● MALI_SUBSYS_AXI_m_RESETNn is the AXI reset.
● MALI_200Mhz_rstn is generated by the synthesizers, and is primarily used to reset
MALI internal logic, including the pixel processor and geometry processor.
35.5
Functional description
Figure 193. GPU functional block diagram
[Figure: GPU functional block diagram. Block labels: system bus, system bus interface, on-chip bus, configuration registers, memory management unit (MMU), vertex shader command processor, vertex shader core (vertex loader, vertex shader, vertex storer), PLBU command processor, polygon list builder unit (PLBU), pixel processor, polygon list reader, vertex loader, RSW, triangle setup unit, rasterizer, fragment shader, blending unit, tile buffers, tile writeback unit]
35.5.1
Geometry processor
The following are the geometry processor’s primary tasks:
● Transform and Lighting (T&L). The input to the geometry processor is raw geometric
descriptions of every object to be drawn in a scene. During T&L, the geometry
processor scales, rotates, and positions the geometry of objects in the scene, and also
calculates and assigns values to the vertices. These values are called varyings, and
are required for rendering. The most common varyings are texture co-ordinates and
colors.
● Primitive assembly. Primitive assembly involves the PLBU linking vertices together to
form different primitives. Primitive assembly can be implicit by the order of the vertices,
or explicit by an additional index array. When the vertices are specified for a triangle
primitive, the rotation order of the vertices is important because this implicitly specifies
a front and back face of the primitive.
● Automatic back face culling. After primitive assembly, back face culling removes all the
primitives on the back side of the object that would not be visible because only the back
face would be visible from the viewing plane.
● Primitive list assembly (optional; can be enabled or disabled). Because the pixel
processor is a tile-based renderer, the geometry processor must prepare a list of all the
primitives required for the pixel processor to render a tile. For each primitive, the PLBU
writes a list entry for each tile in which part of the primitive might be visible.
Note:
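The link between vertex rotation order and front/back facing can be illustrated with the signed area of the projected triangle. This is a generic sketch of the technique, not a description of the GPU hardware: it assumes a y-up coordinate system and, as in the OpenGL default, that counter-clockwise triangles are front-facing. The function names are hypothetical.

```python
def signed_area_2d(v0, v1, v2):
    """Signed area of a projected triangle; positive for counter-clockwise
    vertex order in a y-up coordinate system."""
    return 0.5 * ((v1[0] - v0[0]) * (v2[1] - v0[1])
                  - (v2[0] - v0[0]) * (v1[1] - v0[1]))


def is_back_facing(v0, v1, v2, front_is_ccw=True):
    """True if the triangle would be removed by back face culling.

    Assumption: counter-clockwise winding marks the front face unless
    front_is_ccw is False. Zero-area triangles are not reported as back-facing.
    """
    area = signed_area_2d(v0, v1, v2)
    return area < 0 if front_is_ccw else area > 0


if __name__ == "__main__":
    tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))      # counter-clockwise
    print(is_back_facing(*tri), is_back_facing(*reversed(tri)))
```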
When using the geometry processor, a user-specified program called a vertex shader runs
on every vertex in a frame. The vertex shader performs:
● geometry transformations
● projection correction
● lighting calculations
● other per-vertex calculations.
Vertex shader command processor
The vertex shader command processor reads and executes commands from a command
list stored in memory. The command list is a list of commands intended to set up and
configure execution of the vertex shader core. This enables the vertex shader core to
execute multiple jobs without CPU intervention.
Vertex shader core
The vertex loader is a DMA unit that loads per-vertex data for processing. It can accept data
from up to 16 distinct streams, each corresponding to one of 16 input registers in the vertex
shader.
The vertex shader is the most important single unit of the geometry processor. This unit
performs most of the required calculations for each vertex. The vertex shader runs a
program on each vertex of a 3D model, typically performing transform and lighting (T&L)
for the model. The vertex storer writes data from the output registers of the vertex shader
to memory. The vertex storer can export data as integer or floating-point numbers of
different sizes.
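As an illustration of the transform part of a typical vertex shader program, the sketch below takes one model-space position to screen co-ordinates. It is written in Python for readability rather than as shader code, and it assumes a row-major 4x4 combined model-view-projection matrix and a viewport with its origin at (0, 0). All names are hypothetical.

```python
def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def to_screen(position, mvp, width, height):
    """Transform a model-space (x, y, z) position to screen co-ordinates."""
    x, y, z, w = mat_vec(mvp, [position[0], position[1], position[2], 1.0])
    # The perspective divide gives normalized device co-ordinates in [-1, 1].
    ndc_x, ndc_y, ndc_z = x / w, y / w, z / w
    # The viewport transform maps them to pixel co-ordinates.
    screen_x = (ndc_x + 1.0) * 0.5 * width
    screen_y = (ndc_y + 1.0) * 0.5 * height
    return screen_x, screen_y, ndc_z
```

With an identity matrix, the model-space origin maps to the centre of the viewport.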
Polygon list builder unit (PLBU)
The PLBU creates lists of the polygons that the pixel processor must draw. For each
polygon in a scene, the list builder decides which tiles the polygon covers, and adds the
polygon to the lists that draw those tiles. The PLBU only adds a polygon to lists where the
polygon might have to be drawn, reducing the work involved when the pixel processor
renders the scene. The PLBU discards polygons that are certain not to be visible.
The PLBU can handle up to 300 lists to support the tile-based rendering mode of the pixel
processor efficiently.
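The list-building step can be sketched with a conservative bounding-box test. This is an illustration under stated assumptions, not the PLBU algorithm: it uses the 16x16 pixel tile size given for the pixel processor, takes vertices in screen co-ordinates, and uses hypothetical function names.

```python
TILE_SIZE = 16  # pixels per tile edge, as stated for the pixel processor


def tiles_for_triangle(v0, v1, v2, width, height):
    """Return the (tile_x, tile_y) pairs that a triangle's bounding box touches.

    The test is conservative: a tile is listed when the triangle might be
    visible in it, so some listed tiles may contain no covered pixel.
    """
    xs = (v0[0], v1[0], v2[0])
    ys = (v0[1], v1[1], v2[1])
    # Discard triangles that lie completely outside the frame.
    if max(xs) < 0 or max(ys) < 0 or min(xs) >= width or min(ys) >= height:
        return []
    # Clamp the bounding box to the frame, then convert to tile indices.
    x0 = int(max(min(xs), 0)) // TILE_SIZE
    y0 = int(max(min(ys), 0)) // TILE_SIZE
    x1 = int(min(max(xs), width - 1)) // TILE_SIZE
    y1 = int(min(max(ys), height - 1)) // TILE_SIZE
    return [(tx, ty) for ty in range(y0, y1 + 1) for tx in range(x0, x1 + 1)]


def build_polygon_lists(triangles, width, height):
    """Map each tile to the indices of the triangles that may touch it."""
    lists = {}
    for index, (v0, v1, v2) in enumerate(triangles):
        for tile in tiles_for_triangle(v0, v1, v2, width, height):
            lists.setdefault(tile, []).append(index)
    return lists
```

A triangle that is entirely off-screen produces no list entries, which corresponds to the PLBU discarding polygons that are certain not to be visible.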
PLBU command processor
The PLBU command processor reads and executes commands from a command list stored
in memory. The command list sets up and configures execution of the PLBU. This enables
the PLBU to execute multiple jobs without CPU intervention.
In most cases, the PLBU is used on vertices produced by the vertex shader.
35.5.2
Pixel processor
Rendering is the term that describes the various tasks that the pixel processor performs.
During rendering, the pixel processor uses the information from the polygon list to produce a
final framebuffer image.
The pixel processor renders the scene by processing each tile individually. One tile is a
16x16 pixel section of the rendered frame. The processor renders each tile completely
before rendering the next tile.
The pixel processor performs the following rendering operations:
Triangle setup. This prepares the primitive for rendering by calculating various data that is
required to rasterize and shade the primitive.
Rasterization. The primitive is divided into independent fragments. In general, a fragment is
a pixel-sized piece of primitive that the shader pipeline processes, and that might become a
pixel or part of a pixel in the final framebuffer. Fragments that might be visible proceed to the
fragment shading stage, and fragments that are certain not to be visible are discarded.
Fragment shading. This stage determines how the fragment actually looks. In general, the
processor calculates a color for the fragment. The fragment shader takes varying variables
as input and uses them to interpolate data across the primitive. The fragment shader
typically performs texture lookups to calculate the color of the fragment.
Blending. The fragment is blended into the framebuffer to produce the final image. The
fragment is only included in the framebuffer after thorough testing. During blending, various
options enable you to decide how the fragment is combined with the existing framebuffer.
For example, you can make the fragment partially translucent so that the final color of the
fragment is a combination of its color and the existing color in the framebuffer. You can also
include anti-aliasing as an optional stage of the blending process.
Producing the framebuffer contents. After blending, the fragment becomes a pixel at a
certain position in the tile buffer. If no other fragment overwrites that position, the fragment
becomes a pixel in the final framebuffer. Multi-sampling techniques to obtain sharper final
images can be applied to the pixel at this stage. When the internal tile buffer is completely
rendered, it is written to the framebuffer in main memory.
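The partially translucent case mentioned under blending can be written as a weighted sum of the fragment color and the existing framebuffer color. The sketch below shows one common blend function (source-over with a single alpha value); it is an assumption chosen for illustration, and the blend operation the hardware actually applies is selected by the render state.

```python
def blend_source_over(src_rgb, src_alpha, dst_rgb):
    """Blend a fragment color over the existing framebuffer color.

    Colors are (r, g, b) tuples of floats in [0, 1]; src_alpha is in [0, 1].
    """
    return tuple(src_alpha * s + (1.0 - src_alpha) * d
                 for s, d in zip(src_rgb, dst_rgb))
```

An alpha of 1.0 replaces the framebuffer color with the fragment color, and an alpha of 0.0 leaves the framebuffer color unchanged.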
Polygon list reader
The polygon list reader reads the polygon lists from main memory and executes commands
from the lists. Each primitive in the polygon list contains a pointer to the corresponding
render state word (RSW) and vertex data for that primitive. The polygon list reader passes on
information about the primitives and controls the operation of the GPU.
RSW behavior
The RSW is a data structure in main memory that contains the render state of polygons.
This render state conforms to the definition in the OpenGL ES API. The RSW defines how to
rasterize and render the polygon. The GPU keeps a local cache of RSWs for immediate
processing. The different pipeline stages in the renderer each reference the RSWs to
determine how to process the primitives. Therefore RSW data must be available to the
renderer for all the primitives currently in the pipeline. Because the GPU permits RSW data
for many primitives to be active at the same time there is no requirement to stall or flush the
pipeline for a change of renderer state.
Vertex loader
For each primitive in the polygon list, the vertex loader fetches the required vertices from
memory. The vertices must be fully transformed to screen co-ordinates, typically by running
a vertex shader program in the geometry processor. When all the vertices required by a
primitive are available, the full vertex set is sent to the triangle setup unit.
Triangle setup unit
The triangle setup unit takes data from the vertex loader and polygon list reader and uses
vertex data to compute coefficients for edge equations and varying interpolation equations.
The unit passes the results of its computation to the rasterizer.
Rasterizer
The rasterizer takes coefficients and equations from the triangle setup unit and uses these
to divide polygons into fragments. The rasterizer generates fragments that align with pixels
in the tile and passes the fragments on to the fragment shader and then to the blending unit.
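The use of edge equations to divide a polygon into pixel-aligned fragments can be sketched as follows. This is a simplified model, not the hardware rasterizer: it assumes counter-clockwise vertices in screen co-ordinates, samples each pixel at its centre, and ignores fill-rule tie-breaking on shared edges. The function names are hypothetical.

```python
def edge(a, b, p):
    """Edge function: positive when p lies to the left of the edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize_in_tile(v0, v1, v2, tile_x, tile_y, tile_size=16):
    """Return the (x, y) pixels of one tile that the triangle covers.

    Assumes counter-clockwise vertex order, so all three edge functions
    are non-negative inside the triangle.
    """
    covered = []
    for y in range(tile_y * tile_size, (tile_y + 1) * tile_size):
        for x in range(tile_x * tile_size, (tile_x + 1) * tile_size):
            p = (x + 0.5, y + 0.5)  # sample at the pixel centre
            if (edge(v0, v1, p) >= 0 and edge(v1, v2, p) >= 0
                    and edge(v2, v0, p) >= 0):
                covered.append((x, y))
    return covered
```

Each covered pixel corresponds to one fragment that would be passed on for shading.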
Fragment shader
The fragment shader is a programmable unit that calculates how each fragment of a
primitive looks. The fragment shader program specified in the RSW for the primitive is
executed for each fragment produced by the rasterizer. The fragment shader program
consists of very long instruction words (VLIW), and can use all of the functional units of the
fragment shader core in a single instruction.
Blending unit
When a fragment successfully exits the fragment shader, the blending unit blends the
calculated fragment value into the current framebuffer value at that position. The current
RSW selects the blend operation to use.
Tile buffers
The tile buffers take inputs from the fragment shader and perform various tests on the
fragments, such as Z tests and stencil tests. When the tile is fully rendered it is written to the
framebuffer. Four subpixel values are stored for each visible pixel, to support 4x anti-aliasing
without performance degradation. The tile buffers include:
●
an 8-bit stencil buffer that stores stencil values
●
a 24-bit Z buffer that stores depth values
●
a 32-bit color buffer
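The storage implied by these figures can be estimated with simple arithmetic. The sketch below assumes that every one of the four subpixel samples holds its own stencil, Z and color value, and that a tile is 16x16 pixels; the real on-chip organization is not described here, so the result is indicative only.

```python
TILE_EDGE = 16          # pixels per tile edge, from the pixel processor description
SAMPLES_PER_PIXEL = 4   # subpixel values stored per pixel
BITS_PER_SAMPLE = {"stencil": 8, "z": 24, "color": 32}


def tile_buffer_bytes():
    """Estimate the storage for one tile, per buffer and in total."""
    samples = TILE_EDGE * TILE_EDGE * SAMPLES_PER_PIXEL
    sizes = {name: samples * bits // 8 for name, bits in BITS_PER_SAMPLE.items()}
    sizes["total"] = sum(sizes.values())
    return sizes
```

Under these assumptions one tile needs 1024 samples of 64 bits each.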
Tile writeback unit
The writeback unit writes the content of the tile buffer to system memory after the tile has
been completely rendered.
35.5.3
Memory management unit (MMU)
The MMU controls and translates memory accesses initiated by the GPU. The MMU
controls data through data structures based on pages and tables.
The MMU connects to the bus infrastructure.
35.6
Operation
Figure 194. The GPU software architecture
[Figure: software stack made up of the graphics application, EGL driver, graphics drivers, graphics standards shared layer, base driver, operating system, device driver, and GPU hardware.]
35.6.1
3D system level operation
Figure 195. Typical 3D graphics flow
1. Start scene.
2. Graphics application makes API calls to start the process and initialize context.
3. Base driver allocates memory structures for geometry processing.
4. Graphics driver fills in the memory structures for geometry processing. See also Figure 196: Geometry processor data structure on page 507, and Figure 197: Pixel processor data structure on page 508.
5. Graphics driver submits its rendering job to the device driver.
6. Device driver starts geometry processing in hardware. For each vertex, the hardware processes geometry data: vertex shading, primitive assembly, tile list generation.
7. Base driver allocates memory structures for pixel processing.
8. Graphics driver fills in the memory structures for pixel processing.
9. Graphics driver submits its rendering job to the device driver.
10. Device driver configures and starts the pixel processor. For each fragment, the hardware processes polygons and writes to the framebuffer: rasterization, texturing and fragment shading, blending, write to framebuffer. The pixel processor reads in the polygon lists, render states, and textures that the geometry processor has defined. From this information, the final graphical image is created. The pixel processor writes the image to the framebuffer. (Pixel processor on page 503)
11. Operating system updates display.
12. End scene.
Figure 196. Geometry processor data structure
[Figure: Vertex shading reads vertex data blocks 1 to n from the input vertex data block memory, under control of VS commands 1 to n in the vertex shader command list memory, and writes vertex data blocks 1 to n to the output vertex data block memory. Polygon list building reads vertex indices 1 to n from the vertex list memory, under control of PLB commands 1 to n in the polygon list builder command list memory, and writes polygon list commands 1 to n for each of tiles 1 to n to the polygon list memory.]
Figure 197. Pixel processor data structure
[Figure: Memory structures used by the pixel processor. Polygon list memory holds one polygon list per tile (tile 1, tile 2, ...), each made up of polygon list commands 1 to n and polygon list primitives. Vertex data block memory holds vertex data blocks 1 to n. Render state word memory holds render state words 1 to n. Shader program memory holds shader programs 1 to n. Remap table memory holds the uniform remap table, associated with the uniforms, and the texture descriptor remap table, associated with the textures.]
35.6.2
2D system level operation
Figure 198. 2D graphics process flow
1. Start scene.
2. Graphics application makes API calls to start the process and initialize context. See Initial graphics API calls.
3. Graphics driver processes input and creates geometry data: stroked and filled path geometry generation, then transformation.
4. Base driver allocates memory structures for geometry processing.
5. Graphics driver fills in the memory structures for geometry processing.
6. Graphics driver submits its rendering job to the device driver.
7. Device driver starts geometry processing in hardware. Hardware processes geometry data: cached geometry transformation, then tile list generation.
8. Base driver allocates memory structures for pixel processing.
9. Graphics driver fills in the memory structures for pixel processing.
10. Graphics driver submits its rendering job to the device driver.
11. Device driver configures and starts the pixel processor. Hardware processes pixel data in the following stages:
• Rasterization and clipping. The pixel processor rasterizes triangles that are inside the drawing area and within the clipping rectangles.
• Paint generation. The GPU draws interior fills for the rasterized geometry. The fill style is determined by the specified drawing style and paint objects.
• Image interpolation (only if more image processing is needed). The GPU samples image colors and combines them with the generated paint, depending on the image mode. See also 2D filter processing.
• Blending. The GPU blends the color with the destination color using the blend function specified by the graphics driver.
• Masking and anti-aliasing. The result from the blending stage is merged with the destination framebuffer for display, using the alpha mask as the blend factor.
• Write to framebuffer.
12. Operating system updates display.
13. End scene.
Initial graphics API calls
●
drawing style
●
transformations
●
paints
●
paths
●
images
●
mask buffer and scissor rectangles initialization
Stroked and filled path geometry generation
●
Transformed path divided into line-loops
●
Line-loops tessellated into sets of triangles that represent the filled path
●
Triangles generated that represent the stroke. The driver generates widened stroke
geometry from the path data and stroke style settings.
Transformation
●
Path transformed from user space to surface space
●
Image transformed from user space to surface space
2D filter processing
The filtered image is processed at the image interpolation stage shown in Figure 198 on
page 509, using the destination image format.
Figure 199. GPU image filter process flow
1. Source image.
2. Source image normalized. The source image is converted to a format compatible with the destination image.
3. Filtering. The GPU performs an image filter operation in the software driver on the normalized source image.
4. Conversion to destination format. The result from the filtering operation is converted to the destination image format.
5. Destination image.
Example: applying a blur filter to an image and creating a fade between the original image and the blurred
version of the image:
1. Access the source image, then blur the image using the software driver to produce a destination image.
2. Draw the source image in the framebuffer. See also Figure 198 on page 509.
3. Draw the blurred image with a specific blend function enabled. See also Figure 198 on page 509.
4. Repeat the steps above as needed.
35.6.3
Graphics pipeline level operation
Figure 200. Typical graphics pipeline flow
1. Start processing.
2. Initial processing. The API level drivers create data structures for the GPU and configure the hardware for each scene. The software generates data structures for render state words (RSWs) and texture descriptors.
3. Modeling transformation, viewing transformation, per-vertex lighting, and projection transformation. The geometry processor runs a vertex shader program for each vertex. This shader program can perform transform, lighting, viewport transformation, and perspective transformation. Vertices are then assembled into graphics primitives, and polygon lists are built for the pixel processor.
The functional blocks used: Vertex shader command processor; Vertex shader core; Polygon list builder unit (PLBU); PLBU command processor.
4. Rasterization and fragment shading. The pixel processor:
• Reads in polygon list data and commands from the polygon list. The polygon list entries point to the appropriate RSWs.
• Reads in the RSWs to internal memory.
• Reads in vertices for each primitive. When all required vertices are read, coefficients and equations for rasterization are calculated in a process called triangle setup.
• Rasterizes the polygons, and runs fragment shaders. The rasterizer takes the coefficients and equations from the triangle setup unit and creates fragments. A fragment shader program is then run on each fragment to calculate the color of the fragment.
The functional blocks used: RSW behavior; Polygon list reader; Vertex loader; Triangle setup unit; Rasterizer; Fragment shader.
5. Blending and write to framebuffer. To produce the final display data for the frame buffer, the pixel processor:
• Creates blended fragments in a blending unit. The blending unit takes configuration information from the RSW and applies the corresponding blending functions to the fragments. The blending unit blends the fragments with the color already present at the corresponding location in the frame buffer.
• Tests the fragments and updates the frame buffer. The pixel processor stores fragments in tile buffers. The tile buffer calculates which fragments are visible and which are hidden and passes the visible fragments to the frame buffer.
• Writes the content of the tile buffer to system memory after the tile has been completely rendered.
The functional blocks used: Blending unit; Tile buffers; Tile writeback unit.
6. End processing.
36
Video decoder (VDEC)
This chapter focuses on VDEC functionality and operation.
For the VDEC feature list, refer to the SPEAr1340 datasheet:
●
Doc ID 023063, Data sheet, SPEAr1340, Dual-core Cortex A9 HMI embedded MPU
For technical details about the programmable registers, refer to the following companion
document:
●
RM0089, Reference manual, SPEAr1340 address map and registers
36.1
Overview
The VDEC functional block is a complex subsystem. It is mainly intended to be used through
the binary device driver and low-level software layers that can be obtained from
STMicroelectronics.
Figure 202 shows the video decoder block diagram.
Figure 201. Decoder functional block diagrams
[Figure: The decoder control software comprises the application programming interface, the stream header decode blocks (MPEG-2, MPEG-4, H.264, VC-1, RV, JPEG, VP6/7/8 and AVS), and the hardware drivers. The control software and external memory connect over the system bus to the decoder and post-processor hardware, which comprises the bus interface and the following blocks: entropy decode, RLC decode, MV decode, AC/DC prediction, inter / intra prediction, inverse transform, deblocking filter, deinterlace, scaling, rotation, RGB conversion, dithering, and alpha blending.]
Supported standards, profiles and levels
Table 160. Supported standards, profiles and levels
Standard
Decoder support
H.264
– Baseline Profile, levels 1 - 4.2
– Main Profile, levels 1 - 4.2
– High Profile, levels 1 - 4.2
– Image size up to 1080p at level 4.2
SVC
– Scalable Baseline Profile, base layer only
– Scalable High Profile, base layer only
MPEG-4
– Simple Profile, levels 0 - 6
– Advanced Simple Profile, levels 0 - 5
MPEG-2
– Main Profile, low, medium and high levels
MPEG-1
– Main Profile, low, medium and high levels
H.263
– Profile 0, levels 10-70. Image size up to 720x576
Sorenson Spark
– Bitstream version 0 and 1
VC-1
– Simple Profile, low, medium and high levels
– Main Profile, low, medium and high levels
– Advanced Profile, levels 0-3
JPEG
– Baseline interleaved
RV
– RV8
– RV9
– RV10
VP6
– VP6.0 (Simple Profile)
– VP6.1
– VP6.2 (Advanced Profile)
VP7
– VP7 versions 0-3
VP8
– VP8 version 2 (WebM)
AVS
– P2 Jizhun Profile, level 6.0 and 6.2
DivX
– DivX Home Theater Profile Qualified TM
– DivX3/4/5/6
Possible deviations from the tools specified by these levels, and other points to notice are
listed in Table 161.
Table 161. Deviations from the supported profiles and levels
Standard | Tool | Decoder support
AVS | 4:2:2 sampling | Not supported
H.263 | Time code extensions | Not supported
H.264 | Slice groups (FMO) | If more than one slice group is used, software performs entropy decoding.
H.264 | Arbitrary slice order | Supported, software performs entropy decoding.
H.264 | Redundant slices | Supported, but not utilized; redundant slices are skipped by software.
H.264 | Image cropping | Not performed by the decoder, cropping parameters are returned to the application.
SVC | Enhancement layers | Not supported
MPEG-4 | Data partitioning | Supported, software performs entropy decoding.
MPEG-4 | Global motion compensation | Not supported
VC-1 | Multi-resolution | Supported, upscaling will be performed by the post-processor.
VC-1 | Range mapping | Supported, range mapping will be performed by the post-processor.
JPEG | Non-interleaved data order | Not supported
36.2
Clocks
See also: Chapter 5: Reset and clock generator (RCG)
The following clocks are used inside the decoder wrapper:
●
ACLK(4) is the system AXI clock. It is used by the asynchronous bridge to interface the
decoder AXI Master with the AXI bus.
●
HCLK(4) is the system AHB clock. It is used by the asynchronous bridge to interface
the decoder AHB Slave with the AHB bus.
●
DCLK is the decoder’s core clock (235 MHz). It is sourced from clock synthesizer
SYNT0. The decoder AXI Master and AHB Slave both use this clock.
4. ACLK and HCLK are the same clock; they are connected to the system bus clock AHCLK.
36.3
Interrupts
The decoder and post-processor share a common interrupt line (XINTDec, interrupt line
ID[113]) for all the interrupts generated.
36.3.1
Decoder interrupts
When the decoder hardware requires software attention, it sets the interrupt bit high, and
one of the status flags indicates the reason for the interrupt. When the software has handled
the interrupt, it must reset all status flags to zero. The interrupt bit stays high until software
resets it.
The interrupt method can be set to interrupting or polling.
Table 162. Decoder interrupt register (swreg1, offset 0x4)
Bit | Name | Function
31:25 | - | Not used
24 | sw_dec_pic_inf | B slice detected. This signal is driven high during the picture ready interrupt if a B-type slice is found. This bit does not launch an interrupt but is used to inform software about the H.264 tools in use. DivX3: this bit gives the value of the extension header flag (called FLIPFLOP).
23:19 | - | Not used
18 | sw_dec_timeout | Interrupt status bit decoder timeout. When high, the decoder has made no bus transactions in 2^18-1 clock cycles and has not set an interrupt. Possible only if timeout interrupting is enabled. This should be treated as an error in the input stream. Note: Post-processor transactions affect this feature; running stand-alone post-processing while decoding may prevent decoder timeout interrupts.
17 | sw_dec_slice_int | Interrupt status bit dec_slice_decoded. When high, software must set new base addresses for sw_dec_out_base and sw_jpg_ch_out_base before resetting this status bit. Used for JPEG and VP8 WebP modes.
16 | sw_dec_error_int | Interrupt status bit input stream error. When high, an error has been found while decoding the input data stream, and software must perform error concealment. Hardware will self-reset.
15 | sw_dec_aso_int | Interrupt status bit ASO (Arbitrary Slice Ordering) detected. When high, hardware has encountered the Arbitrary Slice Order tool in the input H.264 stream data, and software must perform entropy decoding. Hardware will self-reset.
14 | sw_dec_buffer_int | Interrupt status bit input buffer empty. When high, the input stream buffer is empty but the picture is not ready. Software must provide a new stream pointer to hardware. Hardware will not self-reset.
13 | sw_dec_bus_int | Interrupt status bit - Error response from bus. When high, hardware has received an error response from the bus while accessing external memory. This is a fatal error possibly caused by the incorrect allocation of decoder linear memory. Hardware will self-reset.
12 | sw_dec_rdy_int | Interrupt status bit decoder. When this bit is high, the decoder has decoded a picture. Hardware will self-reset.
11:9 | - | Not used
8 | sw_dec_irq | Decoder IRQ. This bit drives the interrupt line, OR gated with the post-processor interrupt bit. Software resets this bit after the interrupt is handled. The interrupt line is not used for the decoder if the interrupt disable bit for the decoder is high.
7:5 | - | Not used
4 | sw_dec_irq_dis | Decoder IRQ disable. When high, hardware raises no interrupts for the decoder. Polling must be used to see the interrupt status.
3:1 | - | Not used
0 | sw_dec_e | Decoder enable. Setting this bit high starts the decoding operation. Hardware resets this bit when the picture is processed, when ASO or a stream error is detected, or when a bus error or timeout interrupt is given.
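The bit assignments in Table 162 can be exercised with a small software model. The sketch below operates on a plain 32-bit integer that is assumed to have been read from swreg1; it does not access hardware, and the helper names (pending_status, acknowledge, timeout_seconds) are hypothetical. The acknowledge step follows the statement that software resets the status flags and the IRQ bit to zero. The timeout estimate assumes that the 2^18-1 cycles are counted on the 235 MHz DCLK, which the table does not state.

```python
SWREG1_OFFSET = 0x4

# Interrupt status bit positions, taken from Table 162.
STATUS_BITS = {
    "sw_dec_timeout": 18,
    "sw_dec_slice_int": 17,
    "sw_dec_error_int": 16,
    "sw_dec_aso_int": 15,
    "sw_dec_buffer_int": 14,
    "sw_dec_bus_int": 13,
    "sw_dec_rdy_int": 12,
}
SW_DEC_IRQ = 8
SW_DEC_IRQ_DIS = 4
SW_DEC_E = 0


def bit(value, position):
    """Extract one bit from a register value."""
    return (value >> position) & 1


def pending_status(value):
    """Names of the interrupt status flags that are set in a swreg1 value."""
    return [name for name, pos in STATUS_BITS.items() if bit(value, pos)]


def acknowledge(value):
    """Value to write back after handling: status flags and IRQ bit cleared."""
    mask = 1 << SW_DEC_IRQ
    for pos in STATUS_BITS.values():
        mask |= 1 << pos
    return value & ~mask & 0xFFFFFFFF


def timeout_seconds(clock_hz=235_000_000):
    """Timeout period in seconds, assuming the cycles are DCLK cycles."""
    return (2 ** 18 - 1) / clock_hz
```

In polling mode (sw_dec_irq_dis high), the same decoding applies to a value that software reads periodically instead of in an interrupt handler.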
36.3.2
Post-processor interrupts
The post-processing interrupt register contains information for the post-processor.
Table 163. Post-processing interrupt register (swreg60, offset 0xf0)
Bit | Name | Function
13 | sw_pp_bus_int | Interrupt status bit - Error response from bus. When high, hardware has received an error response from the bus while accessing external memory. This is a fatal error possibly caused by the incorrect allocation of post-processor linear memory. Hardware will self-reset. In pipeline mode this bit is not used.
12 | sw_pp_rdy_int | Interrupt status bit pp. When this bit is high, the post-processor has processed a picture in external mode. In pipeline mode this bit is not used.
11:9 | - | Not used
8 | sw_pp_irq | Post-processor IRQ. This bit drives the interrupt line, OR gated with the decoder interrupt bit. Software resets this bit after the interrupt is handled. The interrupt line is not used if the interrupt disable bit for the post-processor is high.
7:5 | - | Not used
4 | sw_pp_irq_dis | Post-processor IRQ disable. When high, hardware raises no interrupts for post-processing. Polling must be used to see the interrupt status.
3:2 | - | Not used
1 | sw_pp_pipeline_e | Decoder – post-processing pipeline enable: 0 = Post-processor is processing a different picture than the decoder, or is disabled. 1 = Post-processing is performed in pipeline with the decoder.
0 | sw_pp_e | External mode post-processing enable. Setting this bit starts the post-processing operation. Not to be used if the post-processor is in pipeline with the decoder (sw_pp_pipeline_e = 1). Hardware resets this bit when the picture is post-processed.
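The way the decoder and post-processor share one interrupt line can be modelled as two gated IRQ bits that are ORed together. The sketch below is an interpretation of the register descriptions, not a description of the actual gating logic; it takes plain integers assumed to have been read from swreg1 and swreg60, and the function names and the mode labels are hypothetical.

```python
DEC_IRQ, DEC_IRQ_DIS = 8, 4   # swreg1 bits (decoder interrupt register)
PP_IRQ, PP_IRQ_DIS = 8, 4     # swreg60 bits (post-processing interrupt register)
PP_PIPELINE_E, PP_E = 1, 0    # swreg60 mode bits


def bit(value, position):
    """Extract one bit from a register value."""
    return (value >> position) & 1


def interrupt_line(swreg1, swreg60):
    """Model the shared line: each IRQ bit is gated by its disable bit, then ORed."""
    dec = bit(swreg1, DEC_IRQ) and not bit(swreg1, DEC_IRQ_DIS)
    pp = bit(swreg60, PP_IRQ) and not bit(swreg60, PP_IRQ_DIS)
    return bool(dec or pp)


def pp_mode(swreg60):
    """Describe the post-processor mode selected by bits 1 and 0 of swreg60."""
    if bit(swreg60, PP_PIPELINE_E):
        return "pipeline"
    return "external" if bit(swreg60, PP_E) else "disabled"
```

Because the line is shared, a handler would read both registers to find out which block raised the interrupt.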
36.4
Functional description
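Figure 202. Video decoder detailed block diagram
[Figure: The axiahbg1dec wrapper contains the hwg1core decoder core, which runs in the DCLK (235 MHz) domain and includes the blocks streamd, busifd, scd, Mvd, bsd, filterd, transd, refbufferd, pred, ppd, acdcd, clkctrld, romd (6 instances), ramd, Hwg1swr (swrdec and swrpp), the decoder fuse and the post-processor fuse. The AXI master interface (axiwmfid) connects to the AXI bus through an X2X bridge in the ACLK (166 MHz) domain. The AHB slave interface (ahbwsifd) connects to the AHB bus through an H2HA sync bridge in the HCLK (166 MHz) domain. XINT is the interrupt output.]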
Decoders are operated using the application programming interface (API).
●
H.264 decoder on page 519
●
MPEG-4 / H.263 / Sorenson Spark decoder on page 521
●
MPEG-2 / MPEG-1 decoder on page 523
●
JPEG decoder on page 525
●
VC-1 decoder on page 526
●
RV decoder on page 528
●
VP6 decoder on page 530
●
VP7/VP8 decoder on page 532
●
AVS decoder on page 534
●
DivX decoder on page 536
36.4.1
H.264 decoder
Table 164. H.264 / SVC decoder base layer features
Feature | Decoder support
Input data format | H.264 byte or NAL unit stream / SVC stream
Decoding scheme | Frame by frame (or field by field); slice by slice
Output data format | YCbCr 4:2:0 semi-planar format(1); YCbCr 4:0:0 (monochrome)
Supported image size | 48 x 48 to 1920 x 1088(2); step size 16 pixels(3)
Maximum frame rate | 30 fps at 1080p(4)
Maximum bit rate | As specified by H.264 HP level 4.2
Error detection and concealment | Supported
1. In semi-planar format, the Cb and Cr components are interleaved pixel by pixel in a separate plane. This
allows more efficient bus usage compared to the planar YCbCr format, because longer bursts can be
used when transferring chrominance data.
2. The maximum decoder output size is configurable up to 1920 x 1088. Internal memory size is affected by
the selected configuration.
3. The decoder crops video fields that are a multiple of eight pixels in the vertical direction.
4. The achievable resolution and frame rate depend on the specific stream content and system load.
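The image size limits and the semi-planar output format translate into a simple buffer layout calculation. The sketch below is an estimate under stated assumptions: 8-bit samples, picture dimensions rounded up to the 16-pixel step size, and an interleaved CbCr plane placed directly after the luma plane. The actual buffer placement is decided by the driver, and the function names are hypothetical.

```python
STEP = 16  # step size in pixels, from the supported image size entry


def align_up(value, step=STEP):
    """Round value up to the next multiple of step."""
    return (value + step - 1) // step * step


def semi_planar_420_layout(width, height):
    """Plane offsets and sizes for an 8-bit YCbCr 4:2:0 semi-planar picture."""
    w, h = align_up(width), align_up(height)
    if not (48 <= w <= 1920 and 48 <= h <= 1088):
        raise ValueError("picture size outside 48 x 48 to 1920 x 1088")
    luma_bytes = w * h
    # One interleaved plane: Cb and Cr each hold (w / 2) x (h / 2) samples.
    chroma_bytes = 2 * (w // 2) * (h // 2)
    return {
        "width": w,
        "height": h,
        "luma_offset": 0,
        "chroma_offset": luma_bytes,  # assumes the planes are contiguous
        "total_bytes": luma_bytes + chroma_bytes,
    }
```

A 1080p picture is held as 1920 x 1088, which is why the maximum size in the table is quoted with 1088 rows.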
Figure 203. H.264 decoder initialization
[Figure: message sequence between the H.264 application and the H.264 decoder.]
1. Initialize H.264 decoder. The application calls H264DecInit(&decInst, 0, 0, 0). The decoder returns H264DEC_OK.
2. Receive H.264 stream start. The application calls H264DecDecode(decInst, &decInput, &decOutput). The decoder decodes the H.264 parameter sets and returns H264DEC_STRM_PROCESSED.
3. Receive first H.264 coded data slice. The application calls H264DecDecode(decInst, &decInput, &decOutput). The decoder activates the parameter sets based on information contained in the first picture slice (IDR picture) and returns H264DEC_HDRS_RDY.
4. The application calls H264DecGetInfo(decInst, &decInfo). The decoder returns H264DEC_OK. Call H264DecGetInfo to get information about the decoded stream, such as picture dimensions and cropping information.
519/590
Video decoder (VDEC)
RM0078
Figure 204. H.264 / SVC decoder basic process
36.4.2
MPEG-4 / H.263 / Sorenson Spark decoder
Table 165. MPEG-4 / H.263 / Sorenson Spark decoder features
Feature | Decoder support
Input data format | MPEG-4 / H.263 / Sorenson Spark elementary video stream
Decoding scheme | Frame by frame (or field by field); video packet by video packet
Output data format | YCbCr 4:2:0 semi-planar
Supported image size | 48 x 48 to 1920 x 1088 (MPEG-4, Sorenson Spark)(1); 48 x 48 to 720 x 576 (H.263); step size 16 pixels(2)
Maximum frame rate | 30 fps at 1080p(3)
Maximum bit rate | As specified by MPEG-4 ASP level 5
Error detection and concealment | Supported
1. The maximum decoder output size is configurable up to 1920 x 1088. Internal memory size is affected by
the selected configuration.
2. The decoder crops video fields that are a multiple of eight pixels in the vertical direction.
3. The achievable resolution and frame rate depend on the specific stream content and system load.
Figure 205. MPEG-4 / H.263 / Sorenson Spark decoder initialization
Figure 206. MPEG-4 / H.263 / Sorenson Spark decoder basic process
36.4.3
MPEG-2 / MPEG-1 decoder
Table 166. MPEG-2 / MPEG-1 features
Feature | Decoder support
Input data format | MPEG-2 / MPEG-1 elementary video stream
Decoding scheme | Frame by frame (or field by field); video packet by video packet
Output data format | YCbCr 4:2:0 semi-planar format
Supported image size | 48 x 48 to 1920 x 1088(1); step size 16 pixels
Maximum frame rate | 30 fps at 1080p(2)
Maximum bit rate | As specified by MPEG-2 MP high level
Error detection and concealment | Supported
1. The maximum decoder output size is configurable up to 1920 x 1088. Internal memory size is affected by
the selected configuration.
2. The achievable resolution and frame rate depend on the specific stream content and system load.
Figure 207. MPEG-2 / MPEG-1 decoder initialization
Figure 208. MPEG-2 / MPEG-1 decoder basic process
36.4.4
JPEG decoder
Table 167. JPEG decoder features
Feature | Decoder support
Input data format | JFIF file format 1.02; YCbCr 4:0:0, 4:2:0, 4:2:2, 4:4:0, 4:1:1 and 4:4:4 video frame storage formats
Decoding scheme | Input: buffer by buffer, from 5 kB to 8 MB at a time(1); output: from one macroblock (MB) row to 16 Mpixels at a time(2)
Output data format | YCbCr 4:0:0, 4:2:0, 4:2:2, 4:4:0, 4:1:1 and 4:4:4 semi-planar
Supported image size | 48 x 48 to 8176 x 8176 (66.8 Mpixels); step size 8 pixels(3)
Maximum data rate | Up to 76 million pixels per second(4)
Thumbnail decoding | JPEG compressed thumbnails supported
Error detection | Supported
1. Programmable buffer size for optimizing performance and memory consumption. An interrupt is issued when
the buffer runs empty, and the control software loads more stream data to external memory.
2. Programmable output slice size for optimizing performance and memory consumption. An interrupt is issued
when the requested area is decoded. The control software can be used to switch the decoder output
picture base address each time.
3. Resolutions that are not divisible by 16 x 16 are filled to the 16-pixel boundary.
4. Actual maximum data rate depends on the logic clock frequency and the JPEG compression rate. The
given figure applies to high-quality JPEG with logic running at 200 MHz.
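The figures in Table 167 can be related to each other with a short calculation. The sketch below pads a picture to the 16-pixel boundary described in footnote 3 and derives a lower bound on decode time from the quoted peak rate. The peak rate applies only under the conditions of footnote 4, so the result is an optimistic estimate, and the function names are hypothetical.

```python
MAX_EDGE = 8176            # maximum width and height, from Table 167
PEAK_PIXELS_PER_S = 76e6   # quoted peak rate (conditions in footnote 4)


def pad_to_boundary(value, boundary=16):
    """Round a dimension up to the next multiple of the boundary."""
    return (value + boundary - 1) // boundary * boundary


def padded_size(width, height):
    """Picture size after filling to the 16-pixel boundary (footnote 3)."""
    return pad_to_boundary(width), pad_to_boundary(height)


def min_decode_seconds(width, height):
    """Lower bound on decode time at the quoted peak pixel rate."""
    w, h = padded_size(width, height)
    return (w * h) / PEAK_PIXELS_PER_S
```

The maximum picture, 8176 x 8176, contains 66 846 976 pixels, which matches the 66.8 Mpixels quoted in the table.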
Figure 209. JPEG decoder basic process
36.4.5
VC-1 decoder
Table 168. VC-1 decoder features
Feature | Decoder support
Input data format | VC-1 stream
Decoding scheme | Frame by frame (or field by field); slice by slice
Output data format | YCbCr 4:2:0 semi-planar format
Supported image size | 48 x 48 to 1920 x 1088(1); step size 16 pixels(2)
Maximum frame rate | 30 fps at 1080p(3)
Maximum bit rate | As specified by VC-1 AP level 3
Error detection and concealment | Supported
1. The maximum decoder output size is configurable up to 1920 x 1088. Internal memory size is affected by
the selected configuration. For interlaced sequences, field size must be at least 48 x 48.
2. The decoder crops video fields that are a multiple of eight pixels in the vertical direction.
3. The achievable resolution and frame rate depend on the specific stream content and system load.
Figure 210. VC-1 decoder initialization
Figure 211. VC-1 decoder basic process
36.4.6
RV decoder
Table 169. RV decoder features
Feature | Decoder support
Input data format | RV8, RV9, or RV10 stream
Decoding scheme | Frame by frame; slice by slice
Output data format | YCbCr 4:2:0 semi-planar format
Supported image size | 48 x 48 to 1920 x 1088(1); step size 16 pixels
Maximum frame rate | 30 fps at 1080p(2)
Maximum bit rate | As specified by RV specification
Error detection and concealment | Supported
1. The maximum decoder output size is configurable up to 1920 x 1088. Internal memory size is affected by
the selected configuration.
2. Achievable resolution and frame rate depending on specifi