UG0574: RTG4 FPGA Fabric User Guide

RTG4 FPGA Fabric
UG0574 User Guide
Revision 2
Table of Contents
About this Guide . . . 5
  Purpose . . . 5
  Contents . . . 5
  Additional Documentation . . . 5
1 Fabric Architecture . . . 7
  Introduction . . . 7
  Fabric Resources . . . 8
  Architecture Overview . . . 9
    Logic Cluster . . . 9
    Interface Cluster . . . 12
    I/O Cluster . . . 13
    FPGA Routing Architecture . . . 15
  Fabric Array Coordinate System . . . 16
2 Large SRAM (LSRAM) . . . 18
  Introduction . . . 18
    Features . . . 18
  LSRAM Resources Table . . . 19
  Functional Description . . . 19
    Architecture Overview . . . 19
    Port List . . . 20
    Port Descriptions . . . 21
  Memory Modes . . . 25
    Dual-Port Mode . . . 25
    Two-Port Mode . . . 27
  Operating Modes . . . 28
    Read Operation . . . 28
    Write Operation . . . 30
    ECC . . . 33
    Reset Operation . . . 34
    Block Select Operation . . . 35
    Read Enable . . . 36
    Collision . . . 37
3 Micro SRAM (uSRAM) . . . 38
  Introduction . . . 38
    Features . . . 38
  uSRAM Resource Table . . . 39
  Functional Description . . . 39
    Architecture Overview . . . 39
    Port List . . . 40
    Port Description . . . 41
  Operating Modes . . . 44
    Read Operation . . . 45
    Write Operation . . . 51
    ECC . . . 52
    Reset Operation . . . 53
    Collision . . . 55
4 uPROM . . . 56
  Introduction . . . 56
  Features . . . 56
  uPROM Resource Table . . . 56
  Functional Description . . . 57
    Architecture Overview . . . 57
    Port List . . . 57
    Operational Modes . . . 57
5 Mathblocks . . . 59
  Introduction . . . 59
    Features . . . 59
  Mathblock Resource Table . . . 60
  Functional Description . . . 60
    Architecture Overview . . . 60
  How to Use Mathblocks . . . 64
    Design Flow . . . 64
    Mathblock Use Models . . . 70
    Coding Style Examples . . . 75
6 I/Os . . . 83
  Introduction . . . 83
  Functional Description . . . 83
    MSIO, MSIOD, and DDRIO . . . 83
    Transmit Buffer . . . 85
    Receive Buffer . . . 85
    Input Programming Delay . . . 85
    On-Die Termination . . . 85
  Radiation Hardening . . . 86
  I/O Banks . . . 87
  Supported I/O Standards . . . 88
    Single-Ended Standards . . . 89
    Voltage-Referenced Standards . . . 90
    Differential Standards . . . 90
  I/O Programmable Features . . . 91
    Programmable Input Delay . . . 92
    Pre-Emphasis . . . 93
    Programmable Slew Rate Control . . . 93
    Programmable Weak Pull-Up/Pull-Down . . . 93
    Programmable Schmitt Trigger Input and Receiver . . . 94
    Programmable Output Drive Strength . . . 95
    Configurable ODT and Driver Impedance . . . 96
  Cold Sparing . . . 100
    5 V Input Tolerance and Output Driving Compatibility (only MSIO) . . . 101
    Temperature Sensing . . . 103
  I/Os in Shared By Fabric and FDDR . . . 103
    DDRIOs with FDDR . . . 103
    DDRIOs with Fabric . . . 103
    MSIOs/MSIODs with Fabric . . . 103
  JTAG I/O . . . 103
  Dedicated I/Os . . . 105
    Device Reset I/O . . . 105
    SERDES I/O . . . 106
  Dedicated Global I/Os . . . 106
A Glossary . . . 107
  Acronyms . . . 107
  Terminology . . . 109
B List of Changes . . . 111
C Product Support . . . 112
  Customer Service . . . 112
  Customer Technical Support Center . . . 112
  Technical Support . . . 112
  Website . . . 112
  Contacting the Customer Technical Support Center . . . 112
    Email . . . 112
    My Cases . . . 113
    Outside the U.S. . . . 113
  ITAR Technical Support . . . 113
About this Guide
Purpose
The RTG4™ field programmable gate array (FPGA) device integrates fourth-generation flash-based
FPGA fabric with radiation tolerance. The RTG4 architecture is designed to be radiation-tolerant
at the silicon level. The FPGA fabric, which is the digital logic section of the RTG4 device, is
composed of 4-input look-up table (LUT) logic elements and includes embedded memories and
mathblocks for digital signal processing (DSP). This document describes the RTG4 FPGA
fabric architecture, embedded memories, mathblocks, fabric routing, and input/output (I/O).
Contents
This user guide contains the following chapters:
• Chapter 1 - Fabric Architecture
• Chapter 2 - Large SRAM (LSRAM)
• Chapter 3 - Micro SRAM (uSRAM)
• Chapter 4 - uPROM
• Chapter 5 - Mathblocks
• Chapter 6 - I/Os
Additional Documentation
Table 1-1 lists additional documentation that is available for RTG4 FPGAs. Refer to the RTG4
Documentation web page for a complete, up-to-date listing.
Table 1-1 • RTG4 Additional Documents
Document
    Description
RTG4 FPGA Product Brief
    Provides an overview of the RTG4 FPGA family of devices, features and benefits,
    and ordering information.
RTG4 FPGA Datasheet
    Provides details about RTG4 AC characteristics, DC characteristics, switching
    characteristics, and general specifications.
RTG4 FPGA Pin Descriptions
    Contains RTG4 pin descriptions, bank location diagrams, packaging
    information, and links to pin assignment tables.
RTG4 FPGA High Speed DDR Interfaces User Guide
    Describes the high-speed memory interfaces in the RTG4 FPGA devices. The
    functionalities of FDDR subsystems and configurations are also described.
RTG4 FPGA High Speed Serial Interfaces User Guide
    Provides details about high-speed serial interfaces (SERDES) and integrated
    functionality support for multiple protocols within the RTG4 FPGA.
RTG4 FPGA Clocking Resources User Guide
    Describes the clocking resources of the RTG4 FPGA devices, which include the fully
    SET-hardened fabric global network, clock conditioning circuits (CCCs) with
    dedicated radiation-hardened phase-locked loops (PLLs), and a radiation-hardened
    50 MHz RC oscillator.
RTG4 FPGA System Controller User Guide
    Describes the System Controller that manages programming, initialization, and
    configuration of the RTG4 FPGA devices, and also the subsystems and
    interfaces available in the System Controller.
RTG4 FPGA Programming User Guide
    Describes the programming modes that the RTG4 FPGAs support and
    provides details about implementation of programming modes that are
    validated in the RTG4 devices. The RTG4 device programming security,
    debugging features and methods are not discussed in this document.
RTG4 Debugging User Guide
RTG4 Board Layout User Guide
RTG4 Board Design User Guide
Libero SoC User Guide
    Describes the usage of the Libero® System-on-Chip (SoC) software and the
    design flow.
1 – Fabric Architecture
Introduction
The RTG4 FPGA fabric comprises an array of flash-based, radiation-tolerant logic elements
and embedded hard ASIC blocks such as large static random access memory (LSRAM) and micro SRAM
(uSRAM) for data storage, and mathblocks for DSP. These elements are arranged in rows inside
the fabric and interconnected by the clustered routing architecture. Each element in the fabric has a
distinct logical coordinate assigned to it. The registers in the embedded hard blocks have an option to
mitigate single-event transients (SETs), and the memories have built-in error detection and correction
(EDAC) with 1-bit error correction and 2-bit error detection. Because the fabric is flash-based, the RTG4
configuration is nonvolatile, and the logic elements do not need to be reprogrammed at each device power-up.
Figure 1-1 on page 8 shows a simple layout of the RTG4 FPGA fabric architecture.
Three types of resources make up most of the fabric logic:
• Logic Element
• Interface Logic Element
• I/O Module
Logic elements: The logic element is the basic element used for implementing combinatorial
circuits, arithmetic functions, and sequential circuits inside the fabric. Each logic element consists of a
4-input LUT, a self-corrected triple module redundancy (STMR) flip-flop, and a dedicated carry chain.
The STMR flip-flops have an option to mitigate single-event transients.
Interface logic elements: The interface logic element connects the embedded hard blocks to the
fabric, making each embedded hard block accessible through the fabric routing. It is structurally
similar to the basic logic element but has no dedicated carry chain. If the design does not use the
associated embedded hard block, the interface logic element can be used to implement combinatorial
and sequential circuits.
I/O modules: The I/O module forms the digital part of the fabric user I/Os, also called multi-standard
inputs/outputs (MSIOs). The I/O module enables the user I/Os to be connected to the fabric routing.
The RTG4 fabric uses a clustered routing architecture to interconnect the various elements inside the
fabric. In the clustered architecture, various logic elements are grouped together to form the clusters.
There are three types of clusters in the RTG4 FPGA fabric:
• Logic clusters
• Interface clusters
• I/O clusters
The logic cluster is composed of 12 logic elements, the interface cluster is composed of 12 interface logic
elements, and I/O clusters are composed of 3 I/O modules that are distributed on all four sides of the
device, as shown in Figure 1-1 on page 8 (north, south, east, and west I/O clusters).
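The cluster sizes above give a quick lower bound on how many clusters a design occupies. The following Python sketch is illustrative only: the function name and the design sizes are hypothetical, and the actual placement tool may use more clusters than this minimum because clusters are not always fully packed.

```python
import math

# Cluster sizes as stated in the text above.
LOGIC_ELEMENTS_PER_LOGIC_CLUSTER = 12
INTERFACE_ELEMENTS_PER_INTERFACE_CLUSTER = 12
IO_MODULES_PER_IO_CLUSTER = 3


def min_clusters(element_count: int, elements_per_cluster: int) -> int:
    """Return the smallest number of clusters that can hold element_count elements."""
    if element_count < 0:
        raise ValueError("element_count must be non-negative")
    return math.ceil(element_count / elements_per_cluster)


# Hypothetical design sizes, chosen only for illustration.
logic_clusters = min_clusters(100, LOGIC_ELEMENTS_PER_LOGIC_CLUSTER)
io_clusters = min_clusters(10, IO_MODULES_PER_IO_CLUSTER)
print(logic_clusters, io_clusters)
```

Under these assumptions, a design with 100 logic elements needs at least 9 logic clusters, and 10 user I/Os need at least 4 I/O clusters.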
Fabric Resources
Table 1-1 shows the fabric resources available on RTG4 devices.
Table 1-1 • Fabric Resources for RTG4 Devices
Fabric Resource                              RT4G075    RT4G150
Logic elements (4-input LUT + TMR/SET FF)    77,712     151,824
LSRAM 24.5 Kbit blocks                       111        209
uSRAM 1.5 Kbit blocks                        112        210
uPROM                                        254        381
Mathblocks                                   224        462
PLLs and CCCs (Rad Tolerant)                 8          8
Figure 1-1 • RTG4 Simple Layout
[Figure: fabric layout showing logic clusters (one logic cluster made up of 12 logic elements), LSRAMs, uSRAM, uPROM, mathblocks, PLL and CCC, and SERDES + hardened IP (PCI Express).]
Architecture Overview
The RTG4 FPGA fabric has rows composed of the following:
• Logic cluster
• Interface cluster
• I/O cluster
• LSRAM
• uSRAM
• Mathblocks
• Global clock distribution stripes
Logic Cluster
The logic cluster is a combination of 12 logic elements with a dedicated hardwired carry chain
implemented for all 12 logic elements. The logic clusters contain routing MUXes. Each routed signal is
driven by a unique logic element output or by a routing MUX. All the logic elements are interconnected
with feedback from outputs to inputs. The intra-routing inside the logic clusters has very low propagation
delay compared to the routing outside the logic clusters.
Each LUT, D-flip-flop, and the carry-circuit in the logic cluster has an individual X-Y logical coordinate
assigned, and this makes them independently addressable. Figure 1-2 shows the top-level logic cluster
layout diagram.
Figure 1-2 • Top-Level Logic Cluster Layout
[Figure: dedicated carry chain with cluster carry in and cluster carry out, logic elements, intracluster routing, routing muxes, and buffers.]
Logic Element
The logic element is the base element of a logic cluster and consists of:
• Combinational logic element (CLE) - 4-input LUT with carry chain
• Sequential logic element (SLE) - STMR flip-flop
Figure 1-3 shows the functional block diagram of the logic element with a carry chain.
Figure 1-3 • Functional Block Diagram of Logic Element
[Figure: logic module containing a LUT with carry chain (inputs A, B, C, D, and Cin; outputs Y, SUM, and Cout) and an STMR flip-flop (inputs D, EN, CLK, SL_N, and AL_N; output Q), connected through routing MUXes.]
Combinational Logic Element
Each CLE consists of:
• A 4-input LUT
• A dedicated carry chain based on the carry look-ahead technique
The 4-input LUT can be configured to implement any 4-input combinatorial function or an arithmetic
function, where the LUT output is XORed with the carry input (Cin) to generate the sum (S) output. The
sum output, S, is typically used as the output for arithmetic functions, but it can also be used as an
output for logical functions, along with the other output, Y, when the LUT implements combinatorial
functions.
Each logic element has a dedicated 3-bit look-ahead carry implementation that is used to implement a
dedicated carry chain between the logic elements when the LUT is used to implement arithmetic
operations. Each cluster has one carry initialization bit and four look-ahead circuits.
The carry chain has hardwired routing nets running between the logic elements, which reduces the carry
propagation delay through the carry chain, and hence gives better performance.
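The relationship between the LUT output, the carry input, and the sum output can be illustrated with a small behavioral model. The Python sketch below is not a description of the silicon: the truth-table bit ordering, the function names, and the carry-out expression (the textbook full-adder carry rather than the device's 3-bit look-ahead circuit) are assumptions made for illustration. Only the relation S = Y XOR Cin comes from the text above.

```python
def build_truth_table(func) -> int:
    """Pack a 4-input Boolean function into a 16-bit LUT mask (assumed bit order: d c b a)."""
    table = 0
    for index in range(16):
        a = index & 1
        b = (index >> 1) & 1
        c = (index >> 2) & 1
        d = (index >> 3) & 1
        if func(a, b, c, d):
            table |= 1 << index
    return table


def lut4(table: int, a: int, b: int, c: int, d: int) -> int:
    """Evaluate a 4-input LUT by indexing into its 16-bit mask."""
    index = a | (b << 1) | (c << 2) | (d << 3)
    return (table >> index) & 1


# LUT configured as A XOR B, the propagate term of an adder bit.
PROPAGATE = build_truth_table(lambda a, b, c, d: a ^ b)


def adder_bit(a: int, b: int, cin: int):
    """One adder bit: returns (S, Cout)."""
    y = lut4(PROPAGATE, a, b, 0, 0)   # LUT output Y
    s = y ^ cin                       # sum output: S = Y XOR Cin
    cout = (a & b) | (y & cin)        # textbook carry, an assumption of this model
    return s, cout


def add(x: int, y: int, width: int = 4):
    """Chain adder bits through the carry, least significant bit first."""
    total, carry = 0, 0
    for bit in range(width):
        s, carry = adder_bit((x >> bit) & 1, (y >> bit) & 1, carry)
        total |= s << bit
    return total, carry
```

In this model, each call to adder_bit stands in for one logic element in arithmetic mode, and the carry variable passed between iterations plays the role of the hardwired carry net between neighboring logic elements.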
Sequential Logic Element
Each logic element has a SET-mitigated, asynchronous self-corrected TMR D flip-flop (STMR), which can
be used as a sequential logic element. The STMR flip-flop can be configured as a register.
Figure 1-4 shows the functional block diagram of the STMR flip-flop. Each STMR flip-flop has asynchronous
majority voter logic that ensures SEU immunity within the timeline of an SET pulse width. Its triple
module redundancy mitigates single-event upset (SEU) errors.
The STMR flip-flop has hardened asynchronous load (AL_n), synchronous load (SL_n), and clock enable (EN)
inputs. AL_n can be used as a single global asynchronous set or reset signal shared by all fabric STMR
flip-flops. SL_n can be used as a synchronous set or reset signal for each fabric STMR flip-flop. In both
cases, the configuration determines whether the signal sets or resets the register. The data input of
the STMR flip-flop can be fed from the direct input or from the output of the 4-input LUT inside the logic
element. The data input (D) has a programmable delay circuit that derives a delayed copy of the data for
SET mitigation. The delay value determines the maximum SET glitch width that can be filtered out.
STMR flip-flops support SET-mitigated and non-mitigated modes, which are selected in the Libero SoC
tool. Refer to the Libero SoC User Guide for details on how to set the mitigation. Non-mitigated timing
is significantly faster than mitigated timing. Setting the fabric flip-flops in critical timing paths to
non-mitigated mode improves the application speed significantly
while reducing the radiation tolerance nominally.
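The effect of triple redundancy with majority voting can be shown with a short behavioral model. The Python sketch below is a simplification under stated assumptions: the class and method names are hypothetical, the scrub step stands in for the self-correction that the hardware performs asynchronously, and the SET delay filter, AL_n, and SL_n are not modeled.

```python
def majority(a: int, b: int, c: int) -> int:
    """Two-out-of-three majority vote on single bits."""
    return (a & b) | (b & c) | (a & c)


class StmrFlipFlop:
    """Simplified model of a triple-redundant flip-flop with a majority voter."""

    def __init__(self, init: int = 0):
        self.copies = [init, init, init]

    def clock(self, d: int, en: int = 1) -> None:
        """On a clock edge, load d into all three copies when EN is high."""
        if en:
            self.copies = [d, d, d]

    def upset(self, index: int) -> None:
        """Flip one stored copy to imitate a single-event upset."""
        self.copies[index] ^= 1

    def scrub(self) -> None:
        """Rewrite every copy with the voted value (models self-correction)."""
        voted = self.q
        self.copies = [voted, voted, voted]

    @property
    def q(self) -> int:
        """Voted output of the three copies."""
        return majority(*self.copies)


ff = StmrFlipFlop()
ff.clock(1)
ff.upset(0)    # one copy is corrupted, the voted output still reads 1
ff.scrub()     # the corrupted copy is restored from the voted value
```

The model also shows the limit of the scheme: if two copies are upset before correction takes place, the voter follows the corrupted majority.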
Figure 1-4 • Functional Block Diagram of STMR Flip-Flop
[Figure: data input d, with a delay element controlled by delay_en and delay_sel, feeding three flip-flops (FF) that share clk and control signals from the control logic (CLK, SL_n, and AL_n inputs); the three q outputs drive a majority voter that produces the STMR output.]
Interface Cluster
The interface cluster is similar to the logic cluster except that it is a combination of 12 interface logic
elements. These clusters are used to interface the inputs and outputs of the embedded hard blocks
(LSRAM, uSRAM, mathblocks, and CCCs) to the fabric routing. Each embedded hard block is spanned
by three interface clusters, as shown in Figure 1-5. The interface logic element can be used as a normal
logic element (without a carry chain) when the design does not use the associated embedded hard block.
Figure 1-5 • IP Interface Cluster
[Figure: embedded IPs (SRAMs, uSRAMs, and mathblocks) spanning several clusters in width, each IP interface cluster containing interface logic elements (LUT and FF) connected to routing.]
Interface Logic Element
The embedded hard IP blocks (LSRAM, uSRAM, and mathblocks) contain dedicated interface logic
elements. The embedded hard blocks are connected to the fabric routing structure through LUTs and
STMR-flip-flops on their inputs and outputs, and these together form the interface logic element.
Each embedded hard block is associated with 36 interface logic elements. The interface logic element is
structurally similar to a logic element: it has a 4-input LUT and an STMR flip-flop, but no dedicated
carry chain. Interface logic elements are TMR'd and have the same SET mitigation as SLEs. If an embedded
hard block is used by the target design, the interface logic elements connect the I/Os of the
embedded hard block to the fabric routing. If an embedded hard block is not used by the design, its
interface logic elements are available as normal logic elements for implementing combinatorial and
sequential circuits. These are in addition to the logic elements available in the fabric.
I/O Cluster
I/O clusters are combinations of I/O modules and the associated routing interfaces. Each I/O cluster
contains three I/O modules.
I/O Module
The I/O module includes the I/O digital (IOD) circuitry and the associated routing interface. Each user I/O
pad is connected to its own dedicated I/O module. The I/O module interfaces the user I/Os with the fabric
routing and enables the routing of external signals coming in through the I/Os to reach all the logic
elements. The I/O modules also enable the internal signals to reach the I/Os.
Figure 1-6 on page 14 shows the functional diagram of the complete I/O Module with the IOD and I/O
analog (IOA) sections. The IOD circuitry consists of the following:
• Input registers: Register the inputs received from the I/Os. The input registers capture the input signals and synchronize them to the design clock.
• Output registers: Register the output signals at the I/Os for better design performance. The output register provides the registered version of the output signals to the I/Os.
• Output enable registers: Register the output enable control signal, which is used when the I/O is configured as a tristate or bidirectional I/O.
• Routing multiplexers (MUXes): Used to connect to the logic elements.
All the registers in the I/O modules are similar to the STMR flip-flop available in the logic element. For
a signal bus, these registers ensure that all the bus bits are synchronized to the clock signal when
sent out through the I/Os. For more information on IOA, refer to "I/Os" on page 83.
[Figure: I/O module block diagram with IOD and IOA sections. IOD: output data registers (outreg, clocked by OCLK) for DO_P and DO_N, output enable registers (outreg) for OE_P and OE_N, each with a 0/1 bypass mux, and input registers (inreg, clocked by ICLK) providing registered and non-registered input data on DI_P and DI_N. IOA: TX and RX buffers for PAD_P and PAD_N, weak pull-up/pull-down resistor control, ODT and differential ODT, VREF, DIFF_IN, and DIFF_OUT.]
Figure 1-6 • I/O Module Functional Block Diagram
FPGA Routing Architecture
The RTG4 FPGA fabric has a clustered routing architecture. Clustering is the hierarchical grouping of fabric
resources, which allows a more area-efficient implementation of designs while maintaining
performance. It also helps to reduce the run-time of the place-and-route software.
Routing Structure
Each routing interface includes multiple muxes and routing buffers. Each routed signal is driven by a
unique logic element output or routing MUX. The routing of a design is completed automatically by the
software, so the utilization of the routing resources is transparent to the user. The
selection among routing resources by the place-and-route software is influenced by the design
constraints provided. Refer to the RTG4 SmartTime, I/O Editor and ChipPlanner User Guide in the
Libero SoC software for details on how to apply constraints.
Timing-driven constraints and placement constraints can be used to guide the placement of
user logic. Knowledge of the routing architecture and functional modules is required to provide
effective design constraints to the software and to achieve an optimal design implementation on the
RTG4 fabric.
In the RTG4 device, there are two types of fabric routing:
• Inter-cluster routing
• Intra-cluster routing
Figure 1-7 shows the fabric routing structure for the RTG4 device.
[Figure: a cluster of logic elements surrounded by intra-cluster routing (levels of routing muxes) and output MUXes, with connections from and to adjacent clusters, and inter-cluster routing from and to other clusters.]
Figure 1-7 • Fabric Routing Structure
Inter-cluster routing spans the clusters and connects them. The inter-cluster routing resource is common
to all the clusters inside the fabric and is uniform across the clusters.
Intra-cluster routing spans the modules that constitute a cluster. Intra-cluster routing varies from cluster
to cluster, depending on the functionality of the cluster. For example, the intra-cluster routing for an
interface cluster is different from that of a logic cluster. The routing also differs among interface
clusters, depending on the embedded hard block to which they interface.
Inter-cluster routing is different from intra-cluster routing. Inter-cluster routing never drives the inputs of
the functional modules (logic elements, interface logic elements, or I/O modules) directly, and the outputs
of the functional modules do not drive the inter-cluster routing directly. Inter-cluster routing has to pass
through the intra-cluster routing to reach the functional modules. This makes RTG4 routing a fully clustered
routing architecture.
The global network can also drive intra-cluster routing through special routing MUXes. These global
routing MUXes bring in STMR flip-flop control signals such as clock, enable, and sets/resets. There are also a
few short routing lines between adjacent clusters and the inter-cluster and intra-cluster routing
MUXes. These short paths give better performance to the signals routed through them.
Fabric Array Coordinate System
All elements in the RTG4 FPGA fabric have individual logical X-Y coordinates in the fabric
array coordinate system. These logical coordinates are used by the place-and-route software when
implementing the design using the fabric elements. Constraints can be set in the place-and-route software
to place design components at specific locations inside the fabric using this coordinate system.
Regions can be created inside the fabric, and a particular part of the design can be assigned to a
region using the floorplanner in Libero SoC.
The boundaries of these regions can be specified using the array coordinates. Similarly, the embedded
hard blocks are also addressable through the fabric coordinate system.
The array coordinates are measured from the bottom left corner to the top right corner of the FPGA
fabric. Table 1-2 on page 17 provides the array coordinates of logical modules and embedded hard
blocks of the RTG4 devices. For more information on how to use array coordinates for region/placement
constraints, refer to the Libero SoC User Guide or the online help (available in the software) for the RTG4 Libero
SoC tools.
[Figure: fabric floorplan showing the rows of LSRAM, mathblocks, and uSRAM blocks.]
Figure 1-8 • RT4G150 Fabric Logical Coordinates
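Table 1-2 • Fabric Array Coordinate Systems*

| Device | Logic Elements Min X | Logic Elements Min Y | Logic Elements Max X | Logic Elements Max Y | uSRAM Bottom (X,Y) | uSRAM Middle (X,Y) | uSRAM Top (X,Y) | LSRAM Bottom (X,Y) | LSRAM Middle (X,Y) | LSRAM Top (X,Y) | Mathblocks Bottom (X,Y) | Mathblocks Middle (X,Y) | Mathblocks Top (X,Y) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RT4G075 | – | – | – | – | – | – | – | – | – | – | – | – | – |
| RT4G150 | – | – | – | – | – | – | – | – | – | – | – | – | – |

Note: *Coordinates will be filled when new devices are added.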
2 – Large SRAM (LSRAM)
Introduction
The RTG4 FPGA fabric has embedded 24 Kbit SRAM blocks used for storing data. These LSRAMs are
arranged in multiple rows within the FPGA fabric and can be accessed through the fabric routing
architecture. The number of available LSRAM blocks depends on the specific RTG4 device, as shown in
Table 2-1 on page 19. For example, in the RT4G150 devices there are 209 LSRAM blocks available,
which are spread across three rows inside the fabric.
Features
RTG4 LSRAM blocks have the following features:
• Each LSRAM block can store up to 24,576 bits of data and can be configured in any of the following depth × width combinations: 512 × 36, 1K × 18, 2K × 12, or 2K × 9. Only the ×12 port width accesses the entire 24,576-bit address space. In the ×9, ×18, and ×36 widths, the address space is limited to 18,432 bits.
• The registers in the LSRAM block are similar to the STMR flip-flop in the fabric and have an option to mitigate single-event transients.
• Each LSRAM block contains two independent data ports: Port A and Port B.
• The LSRAM block is synchronous for both read and write operations. These operations are triggered on the rising edge of the clock.
• The LSRAM block has built-in error detection and correction (EDAC) with 1-bit error correction and 2-bit error detection for the ×18 and ×36 modes, but not for the ×9 and ×12 modes. EDAC is referred to as ECC in the description, ports, and timing diagrams.
• When ECC is enabled, each port of the LSRAM block can raise flags to indicate single-bit-correct and double-bit-detect.
• LSRAM can be operated in dual-port mode or two-port mode.
• LSRAM supports pipelined read and non-pipelined (flow-through) read operations.
• LSRAM supports three types of write operations:
  – Simple write
  – Feed-Through write (write-bypass write)
  – Read before write
• LSRAM has a read-enable control in both dual-port and two-port modes.
• The address, data, block-port select, write-enable, and read-enable inputs are registered.
• An optional pipeline register with a separate enable and synchronous reset is available at the read-data port to improve the clock-to-out delay.
• A write operation requires one clock cycle.
• A read operation requires one clock cycle in Non-pipelined mode. In Pipelined mode, the output data appears in the next cycle.
• Reading from both ports at the same location is allowed.
• Reading and writing the same location at the same time is not allowed. The LSRAM has no built-in collision prevention or detection circuit.
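The capacity figures in the first feature above follow directly from depth × width. The short check below is only an arithmetic illustration; the `MODES` table and `capacity_bits` helper are hypothetical names, not part of any tool.

```python
# Depth x width configurations listed in the feature list.
MODES = {
    "512 x 36": (512, 36),
    "1K x 18": (1024, 18),
    "2K x 12": (2048, 12),
    "2K x 9": (2048, 9),
}


def capacity_bits(depth: int, width: int) -> int:
    """Number of bits addressable in a given depth x width configuration."""
    return depth * width


for name, (depth, width) in MODES.items():
    print(f"{name}: {capacity_bits(depth, width)} bits")
```

Only the 2K × 12 configuration reaches 24,576 bits; the other three address 18,432 bits.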
LSRAM Resources Table
Table 2-1 shows the LSRAM 24.5 Kb blocks available in the RTG4 devices.
Table 2-1 • RTG4 LSRAM (24.5 Kb Blocks) Resource Table

| Blocks | RT4G075 | RT4G150 |
|---|---|---|
| LSRAM 24.5 K Blocks | 111 | 209 |

Note: All numbers given above are per device.
Functional Description
This section provides the detailed description of the following:
• Architecture Overview
• Port List
• Port Descriptions
Architecture Overview
The RTG4 LSRAM embedded memory includes the RAM1Kx24 macro. Figure 2-1 shows a simplified
block diagram of the LSRAM memory block and Table 2-2 on page 20 provides the port descriptions.
Figure 2-1 displays two independent data ports, the pipeline registers for read data delay, and the
Feed-Through multiplexers that enable immediate access to the write data.
[Figure: LSRAM block diagram. A shared memory array sits between Port A and Port B. Each port has a data input (A_DIN[], B_DIN[]) with ECC logic and a Feed-Through mux, row decode and write control, column decode, output ECC logic, and a pipeline register driving A_DOUT[] or B_DOUT[] together with the SB_CORRECT and DB_DETECT flags. Per-port control inputs shown: ADDR[], WEN[], BLK[], REN, SRST_N, CLK, DOUT_EN, WMODE[], and ECC_EN.]
Figure 2-1 • Simplified Functional Block Diagram for LSRAM
Port List
Table 2-2 • Port List for LSRAM Macro (RAM1KX18)

| Port Name | Direction | Type (Note 1) | Description | Polarity |
|---|---|---|---|---|
| PORT A | | | | |
| A_WIDTH[1:0] | Input | Static | Port A Width/depth mode select | – |
| A_WEN[1:0] (Note 2) | Input | Dynamic | Port A Write enable | High |
| A_REN | Input | Dynamic | Port A Read enable | High |
| A_ADDR[10:0] | Input | Dynamic | Port A Address input | – |
| A_DIN[17:0] | Input | Dynamic | Port A Data input | – |
| A_DOUT[17:0] | Output | Dynamic | Port A Data output | – |
| A_BLK[2:0] | Input | Dynamic | Port A Block select | High |
| A_WMODE[1:0] | Input | Static | Port A Feed-Through write select | High |
| A_CLK | Input | Dynamic | Port A Clock | Rising |
| A_DOUT_SRST_N | Input | Dynamic | Port A Pipeline Synchronous reset | Low |
| A_DOUT_EN | Input | Dynamic | Port A Pipeline register enable | High |
| A_DOUT_BYPASS | Input | Static | Port A output pipeline bypass mode | Active High |
| A_SB_CORRECT | Output | Dynamic | Port A 1-bit error correction flag | High |
| A_DB_DETECT | Output | Dynamic | Port A 2-bit error detection flag | High |
| PORT B | | | | |
| B_WIDTH[1:0] | Input | Static | Port B Width/depth mode select | – |
| B_WEN[1:0] (Note 2) | Input | Dynamic | Port B Write enable | High |
| B_REN | Input | Dynamic | Port B Read enable | High |
| B_ADDR[10:0] | Input | Dynamic | Port B Address input | – |
| B_DIN[17:0] | Input | Dynamic | Port B Data input | – |
| B_DOUT[17:0] | Output | Dynamic | Port B Data output | – |
| B_BLK[2:0] | Input | Dynamic | Port B Block select | High |
| B_WMODE[1:0] | Input | Static | Port B Feed-Through write select | High |
| B_CLK | Input | Dynamic | Port B Clock | Rising |
| B_DOUT_SRST_N | Input | Dynamic | Port B Pipeline Synchronous reset | Low |
| B_DOUT_EN | Input | Dynamic | Port B Pipeline register enable | High |
| B_DOUT_BYPASS | Input | Static | Port B output pipeline bypass mode | Active High |
| B_SB_CORRECT | Output | Dynamic | Port B 1-bit error correction flag | High |
| B_DB_DETECT | Output | Dynamic | Port B 2-bit error detection flag | High |
| Common Signals | | | | |
| ECC | Input | Static | Error correction code (ECC) enable; turns on the ECC encoders, decoders, and registers | High |
| ECC_BYPASS | Input | Static | ECC pipeline bypass | High |
| DELEN | Input | Static | SET mitigation | High |
| ARST_N | Input | Global | Pipeline registers asynchronous reset | Active Low |
| BUSY | Output | Dynamic | Busy signal from SII | Active High |
| SECURITY | Input | Static | Lock access to SII | Active High |

Notes:
1. Static inputs are defined at design time and need to be tied to 0 or 1.
2. If the LSRAM block is configured in Two-port mode with a write data width of x36 and a read data width of x36, both
bits of A_WEN and B_WEN must be tied to logic 1 and must not be dynamically changed.
Port Descriptions
A_WIDTH[1:0] and B_WIDTH[1:0]
These signals are the depth × width mode selections for each port. Table 2-3 shows the depth × width
for each port width selection.
Table 2-3 • Depth/Width Mode Selection

| A_WIDTH/B_WIDTH | Depth × Width |
|---|---|
| 00 | 2K × 12, 2K × 9 |
| 01 | 1K × 18 |
| 1x | 512 × 36 (Two-port) |
A_WEN[1:0] and B_WEN[1:0]
These signals are the write enables for each port and select between read and write operations. Table 2-4
shows the operation selected by the port write enable for each depth × width configuration.
Table 2-4 • Read/Write Operation Selection (Notes 1, 2)

| Depth × Width | A_WEN/B_WEN | Operation |
|---|---|---|
| 1K × 18, 2K × 9, 2K × 12 | 00 | Read operation |
| 2K × 12, 2K × 9 | 11 | Write operation |
| 1K × 18 | 01 | Write [8:0] |
| 1K × 18 | 10 | Write [17:9] |
| 1K × 18 | 11 | Write [17:0] |
| 512 × 36 (Two-port write - Port B) | A_WEN[1:0] = 11, B_WEN[1:0] = 11 | Write [35:0] |

Notes:
1. In Dual-port mode, every port reads when the corresponding write enable (A_WEN/B_WEN) is 00 and the
corresponding port select (A_BLK/B_BLK) is active.
2. In Two-port mode, the read port (Port A) reads in every clock cycle if A_BLK is active.
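The ×18 rows of Table 2-4 behave like two 9-bit write lanes. The sketch below models that lane selection in software, purely as an illustration; `LANE_MASKS` and `write_x18` are hypothetical names and the model ignores clocking, block select, and ECC.

```python
# Bit masks for the x18 lanes selected by the 2-bit write enable (Table 2-4).
LANE_MASKS = {
    0b00: 0x00000,  # read operation, nothing written
    0b01: 0x001FF,  # write [8:0]
    0b10: 0x3FE00,  # write [17:9]
    0b11: 0x3FFFF,  # write [17:0]
}


def write_x18(stored: int, din: int, wen: int) -> int:
    """Return the 18-bit word stored after a write with the given enable."""
    mask = LANE_MASKS[wen]
    return (stored & ~mask & 0x3FFFF) | (din & mask)


# Writing only the lower lane leaves bits [17:9] untouched.
print(hex(write_x18(0x3FFFF, 0x00000, 0b01)))
```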
A_ADDR[10:0] and B_ADDR[10:0]
These signals are the address buses for the two ports. In ×12 mode and ×9 mode 11 bits are used to
address the 2048 independent locations. In wider modes (×18, ×36) fewer address bits are used. The
used address bits are the most significant bits (MSBs). The unused bits are the least significant bits
(LSBs) and they must be grounded. Table 2-5 shows the address bus used and unused bits for
depth × width selections.
Table 2-5 • Address Bus Used and Unused Bits

| Depth × Width | A_ADDR/B_ADDR Used Bits | A_ADDR/B_ADDR Unused Bits (to be grounded) |
|---|---|---|
| 2K × 12, 2K × 9 | [10:0] | None |
| 1K × 18 | [10:1] | [0] |
| 512 × 36 (Two-port) | [10:2] | [1:0] |
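Because the used address bits are the MSBs, a word index in a wider mode has to be shifted left before it is applied to the 11-bit address bus. The sketch below shows that mapping under the assumption that the word index maps directly onto the used bits of Table 2-5; `UNUSED_LSBS` and `addr_bus_value` are hypothetical names.

```python
# Number of grounded LSBs on the 11-bit address bus for each data width (Table 2-5).
UNUSED_LSBS = {9: 0, 12: 0, 18: 1, 36: 2}


def addr_bus_value(word_index: int, width: int) -> int:
    """Map a word index to the 11-bit A_ADDR/B_ADDR value for the given width."""
    shift = UNUSED_LSBS[width]
    depth = 2048 >> shift  # 2048, 1024, or 512 locations
    if not 0 <= word_index < depth:
        raise ValueError(f"word index {word_index} out of range for x{width}")
    return word_index << shift  # unused LSBs stay at 0 (grounded)


print(addr_bus_value(5, 12))   # x12: all 11 bits used
print(addr_bus_value(5, 18))   # x18: bit [0] grounded
print(addr_bus_value(5, 36))   # x36: bits [1:0] grounded
```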
A_DIN[17:0] and B_DIN[17:0]
These signals are the data input buses for the two ports. In Dual-port mode, the data width can be 9 bits,
12 bits, or 18 bits. In Two-port mode, Port B becomes the write-only port. For a write data width of 36 bits,
A_DIN[17:0] becomes write data[35:18] and B_DIN[17:0] becomes write data[17:0]. The used bits for
any mode are LSB justified in the data bus and the unused MSB bits must be grounded. Table 2-6 shows
the data input buses used and unused bits for depth × width selections.
Table 2-6 • Data Input Buses Used and Unused Bits

| Depth × Width | A_DIN/B_DIN Used Bits | A_DIN/B_DIN Unused Bits (to be grounded) |
|---|---|---|
| 2K × 12 | [11:0] | [17:12] |
| 2K × 9 | [8:0] | [17:9] |
| 1K × 18 | [17:0] | None |
| 512 × 36 (Two-port Write) | A_DIN[17:0] is [35:18]; B_DIN[17:0] is [17:0] | None |
A_DOUT[17:0] and B_DOUT[17:0]
These signals are the data output buses for the two ports. In Dual-port mode, the data width can be
9 bits, 12 bits, or 18 bits. In Two-port mode, Port A becomes the read-only port and Port B becomes the
write-only port. For a read data width of 36 bits, A_DOUT[17:0] becomes read data[35:18] and
B_DOUT[17:0] becomes read data[17:0]. The used bits for any mode are LSB justified in the data bus
and the unused MSBs must be grounded. Table 2-7 shows the data output buses used and unused bits
for depth × width selections.
Table 2-7 • Data Output Buses Used and Unused Bits

| Depth × Width | A_DOUT/B_DOUT Used Bits | A_DOUT/B_DOUT Unused Bits (to be grounded) |
|---|---|---|
| 2K × 12 | [11:0] | [17:12] |
| 2K × 9 | [8:0] | [17:9] |
| 1K × 18 | [17:0] | None |
| 512 × 36 (Two-port) | A_DOUT[17:0] is [35:18]; B_DOUT[17:0] is [17:0] | None |
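The ×36 mapping described for the data input and data output buses amounts to splitting a 36-bit word into two 18-bit halves, with Port A carrying bits [35:18] and Port B carrying bits [17:0]. A minimal sketch of that bit arithmetic follows; the helper names `split_write_data` and `join_read_data` are hypothetical.

```python
MASK18 = (1 << 18) - 1


def split_write_data(word36: int):
    """Split a 36-bit write word into (A_DIN[17:0], B_DIN[17:0])."""
    if not 0 <= word36 < (1 << 36):
        raise ValueError("write data must fit in 36 bits")
    return (word36 >> 18) & MASK18, word36 & MASK18


def join_read_data(a_dout: int, b_dout: int) -> int:
    """Combine A_DOUT[17:0] (MSBs) and B_DOUT[17:0] (LSBs) into a 36-bit word."""
    return ((a_dout & MASK18) << 18) | (b_dout & MASK18)


a_din, b_din = split_write_data(0x123456789)
print(hex(a_din), hex(b_din))
print(hex(join_read_data(a_din, b_din)))
```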
A_BLK[2:0] and B_BLK[2:0]
These signals are the port select control signals for each port block. Table 2-8 shows operations (Read,
Write, and No operation) based on the selection of port select control signals.
Table 2-8 • Block Select Control Signals

| Port Select Signal | Value | Result |
|---|---|---|
| A_BLK[2:0] | 111 | Perform read or write operation on Port A. In 36 width mode, perform a read operation from both ports A and B. |
| A_BLK[2:0] | 000, 001, 010, 011, 100, 101, 110 | No operation in memory from Port A. Port A read-data will be forced to logic 0. In 36 width mode, the read-data from both ports A and B are forced to 0. |
| B_BLK[2:0] | 111 | Perform read or write operation on Port B. In 36 width mode, perform a write operation to both ports A and B. |
| B_BLK[2:0] | 000, 001, 010, 011, 100, 101, 110 | No operation in memory from Port B. Port B read-data is forced to 0, unless it is a 36 width mode and write operation to both ports A and B is gated. |
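In other words, a port is selected only when all three block-select bits are 1. The following sketch restates Table 2-8 for the non-×36 case as code; `port_selected` and `read_data` are hypothetical helper names, and the ×36 special cases are deliberately left out.

```python
def port_selected(blk: int) -> bool:
    """A port is active only when its 3-bit block select is 111."""
    return (blk & 0b111) == 0b111


def read_data(blk: int, memory_value: int) -> int:
    """Read data seen on the port: the memory value if selected, else forced to 0."""
    return memory_value if port_selected(blk) else 0


print(read_data(0b111, 0x155))  # selected: memory value passes through
print(read_data(0b101, 0x155))  # not selected: forced to 0
```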
A_WMODE[1:0] and B_WMODE[1:0]
These signals represent the Write mode control signals for Port A and Port B.
Table 2-9 • Write Mode Selection

| A_WMODE/B_WMODE | Write Mode |
|---|---|
| 00 | Simple Write |
| 01 | Feed-Through; write data appears on the corresponding output data port. In Two-port mode, Feed-Through write is not supported. |
| 10 | Read before write mode. In Two-port mode, Read before write mode is not supported. |
| 11 | No operation. |
A_CLK and B_CLK
These signals are the synchronous clock inputs for Port A and Port B. All inputs must be set up before
the rising edge of the clock. The read or write operation begins with the rising edge.
A_DOUT_SRST_N and B_DOUT_SRST_N
These signals are Active Low, synchronous reset inputs for the output pipeline registers for Port A and
Port B. Assertion of these reset signals forces the data output to logic 0. This does not reset the ECC
pipeline registers.
A_DOUT_EN and B_DOUT_EN
These signals are Active High enable inputs for the output pipeline registers for Port A and Port B.
• Logic 1: Normal register operation
• Logic 0: Register holds previous data
ECC
This signal is an Active High enable for ECC logic on Port A and Port B.
• Logic 1: ECC logic enable
• Logic 0: ECC logic disable
ECC_BYPASS
The ECC pipeline registers have a bypass mode for slower operations.
• Logic 0: Pipelined operation
• Logic 1: Non-pipelined operation
A_SB_CORRECT, B_SB_CORRECT
These are Error Correction Code flags for Port A and Port B. The flag going High indicates that a single
bit error has been detected by that port and corrected in the data output. This flag also goes High when a
double bit error is detected. Flags for each port are independent of the opposite port, even in x36 width.
A_DB_DETECT, B_DB_DETECT
These are Error Detection Code flags for Port A and Port B. The flag going High indicates that multiple bit
errors have been detected by that port, but have not been corrected. Flags for each port are independent
of the opposite port, even in x36 width.
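Since SB_CORRECT also goes High on a double-bit error, user logic has to look at both flags to tell a corrected read from an uncorrectable one. The sketch below shows one way to interpret the flag pair; the function name `ecc_status` and the status strings are hypothetical.

```python
def ecc_status(sb_correct: bool, db_detect: bool) -> str:
    """Classify a read from the two ECC flags of one port."""
    if db_detect:
        # Multiple-bit error: detected but not corrected; SB_CORRECT may also be High.
        return "uncorrectable"
    if sb_correct:
        # Single-bit error: detected and corrected in the data output.
        return "corrected"
    return "clean"


print(ecc_status(False, False))
print(ecc_status(True, False))
print(ecc_status(True, True))
```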
DELEN
This signal enables single-event transient mitigation. When this signal is driven High, the delay for the
glitch filters is turned ON. LSRAM supports a maximum frequency of up to 250 MHz with the glitch filter and
300 MHz without the glitch filter.
ARST_N
This signal is the global asynchronous reset. When this signal is driven Low, the output registers,
outputs, and write enables are reset.
A_REN and B_REN
These are the read enable signals for A and B ports. If the read enable is Low, the outputs retain their
previous values and there will be no dynamic read power consumed.
A_DOUT_BYPASS and B_DOUT_BYPASS
The pipeline registers have bypass mode inputs for each port.
• Logic 0: Pipelined operation
• Logic 1: Non-pipelined operation
BUSY
This output indicates that the LSRAM is being accessed by the SII.
SECURITY
This is a control signal for security. When this signal is driven High, the entire LSRAM memory gets
locked and cannot be accessed by the SII.
Memory Modes
LSRAM can be configured as a dual-port SRAM or two-port SRAM. The easiest way to configure LSRAM
is to use the Libero SoC tool.
Dual-Port Mode
The LSRAM block configured as dual-port SRAM provides a data storage capability of 24 Kbits with two
independent access ports: Port A and Port B (Figure 2-2 on page 26). Read and write operations can be
performed from both the ports independently at any location as long as there is no collision
(simultaneous access to the same address).
In Dual-port mode, the maximum data width is x18 for either port. Each port of
the LSRAM can be configured in the following depth × width configurations:
• 1K × 18
• 2K × 9
• 2K × 12
Figure 2-2 shows the data path for the dual-port SRAM (DPSRAM).
[Figure: dual-port data path. Port A and Port B each have DIN, WEN, BLK, CLK, WIDTH, REN, and ADDR inputs, a Data In and Data Out path, and a pipeline register (Pipeline Register A, Pipeline Register B) driving A_DOUT and B_DOUT. The legend distinguishes static signals from dynamic signals.]
Figure 2-2 • Data Path for Dual-Port Mode
Data can be written to either or both ports and also can be read from either or both ports. Each port has
its own address, data in, data out, clock, block select, and write enable. The read and write operations
are synchronous and require a clock edge.
There is no collision detection or prevention circuit built into LSRAM. Simultaneous write operations from
both ports to the same address location result in data uncertainty. A simultaneous read and write
from the two ports to the same address location results in correct data being written into the
memory but garbage values being read out.
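Because the hardware does not detect collisions, a testbench or controller model may want to flag them itself. The sketch below classifies a same-cycle access pair according to the rules above; it assumes both ports are active in the same clock cycle and use the same data width so that addresses compare directly, and the function name `collision` is hypothetical.

```python
def collision(a_addr: int, a_write: bool, b_addr: int, b_write: bool) -> str:
    """Classify a simultaneous Port A / Port B access in Dual-port mode."""
    if a_addr != b_addr:
        return "none"
    if a_write and b_write:
        return "write-write: stored data is uncertain"
    if a_write or b_write:
        return "read-write: write succeeds, read data is invalid"
    return "none"  # reading the same location from both ports is allowed


print(collision(0x10, True, 0x10, True))
print(collision(0x10, True, 0x10, False))
print(collision(0x10, False, 0x10, False))
```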
The read operation requires one clock cycle in Non-pipelined mode. In Pipelined mode, the output data
appears in the next cycle. The write operation requires one clock cycle.
Table 2-10 shows the data width configurations that are supported by the LSRAM block configured in
Dual-port mode.
Table 2-10 • Data Width Configurations for LSRAM in Dual-Port Mode

| Port A Data Width | Port B Data Width |
|---|---|
| x9 | x9, x18 |
| x12 | x12 |
| x18 | x9, x18 |
Table 2-11 shows the modes of operation and the data input and output pins used for a
simultaneous read from Port A and write to Port B. Writing to Port A and reading from Port B at the same
time is also valid. Simultaneous write and read is supported except to the same address.
Table 2-11 • Dual-port Mode of Operation

| Mode of Operation | Pins Used |
|---|---|
| R18/W18 | A_DOUT[17:0], B_DIN[17:0] |
| R18/W9 | A_DOUT[17:0], B_DIN[8:0] |
| R12/W12 | A_DOUT[11:0], B_DIN[11:0] |
| R9/W18 | A_DOUT[8:0], B_DIN[17:0] |
| R9/W9 | A_DOUT[8:0], B_DIN[8:0] |
Two-Port Mode
The LSRAM block configured as two-port SRAM provides a data storage of 24 Kbits, with Port A
dedicated to read operations and Port B dedicated to write operations, as shown in Figure 2-3. In Two-port
mode, the data width of the read port (Port A), the write port (Port B), or both is x36.
[Figure: two-port data path. Port A and Port B each have DIN, BLK, CLK, WIDTH, and ADDR inputs, a Data In and Data Out path, and a pipeline register (Pipeline Register A, Pipeline Register B) driving A_DOUT and B_DOUT. The legend distinguishes dynamic signals from static signals.]
Figure 2-3 • Data Path for Two-Port Mode
When a port data width is configured as x36:
• For a x36 read port, output data pins are borrowed from Port B, with Port A forming the MSB and Port B forming the LSB.
• For a x36 write port, input data pins are borrowed from Port A, with Port A forming the MSB and Port B forming the LSB.
The read operation requires one clock cycle in Non-pipelined mode. In Pipelined mode, the output data
appears in the next cycle. The write operation requires one clock cycle.
There is no collision detection or prevention circuit built into LSRAM. Simultaneous read operations from
Port A and write operations from Port B for the same address location must be avoided. This situation
results in correct values being written into the memory, but garbage values will be read out from the
memory.
Table 2-12 shows the data width configurations supported by LSRAM configured in Two-port mode.
Table 2-12 • Data Width Configurations for LSRAM in Two-Port Mode

| Read Port - Port A | Write Port - Port B | Data Input | Data Output | Address |
|---|---|---|---|---|
| x9 | x36 | A_DIN[17:0], B_DIN[17:0] | A_DOUT[8:0] | A_ADDR[10:0], B_ADDR[10:2] |
| x18 | x36 | A_DIN[17:0], B_DIN[17:0] | A_DOUT[17:0] | A_ADDR[10:1], B_ADDR[10:2] |
| x36 | x9 | A_DIN[8:0] | A_DOUT[17:0], B_DOUT[17:0] | A_ADDR[10:2], B_ADDR[10:0] |
| x36 | x18 | B_DIN[17:0] | A_DOUT[17:0], B_DOUT[17:0] | A_ADDR[10:2], B_ADDR[10:1] |
| x36 | x36 | A_DIN[17:0], B_DIN[17:0] | A_DOUT[17:0], B_DOUT[17:0] | A_ADDR[10:2], B_ADDR[10:2] |

Note: In Two-port mode, if the write data width is x36 and the read data width is x36, both bits of A_WEN and B_WEN
must be tied to logic 1 and must not be dynamically changed.
Operating Modes
Read Operation
LSRAM supports two types of read operations for both Dual-port and Two-port RAM configurations.
• Pipelined read
• Non-pipelined read (Flow-through read)
Table 2-13 shows the settings of read enable, block select, and width for a simple read on Port A.
The same settings apply to Port B.
Table 2-13 • Read Enable Settings

| A_BLK[2:0] | A_REN | A_WIDTH[1:0] | A_WEN[1:0] | Result |
|---|---|---|---|---|
| 111 | 1 | 00 | 00 | Read the data in x12 or x9 mode. A_DOUT[11:0] is used for ×12 mode and A_DOUT[8:0] is used for ×9 mode data output. |
| 111 | 1 | 01 | 00 | Read the data in x18 mode. A_DOUT[17:0] is used for data output. |
Pipelined Read
In a pipelined read operation, the output data is registered at the pipeline registers, and the data
appears on the corresponding output in the next clock cycle. In Pipeline mode, the pipeline clock input and
the LSRAM clock input must be synchronized and fed from a single clock source.
Non-pipelined Read
Flow-through mode is a non-pipelined read operation in which the pipeline registers are bypassed
and the data appears on the corresponding output in the same clock cycle. During a flow-through read
operation, the LSRAM block can generate glitches on the data output buses. Microsemi® recommends
using LSRAM with pipeline registers to avoid these read glitches.
Timing Diagram: Flow-Through Read and Pipeline Read
• The addresses (A_ADDR, B_ADDR), BLK enables (A_BLK, B_BLK), and read enables (A_WEN, B_WEN = 0) must be set up before the rising edge of the clock (A_CLK, B_CLK).
• For non-pipelined read operations, data appears on the output bus (A_DOUT, B_DOUT) after a delay of tCLK2Q (read access time without pipeline register) in the same cycle.
• For pipelined read operations, the data appears on the output in the next clock cycle.
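The one-cycle difference between the two read types can be seen in a small cycle-level model. This is a sketch for intuition only: it ignores the ECC pipeline, block select, and analog access time, assumes the registers start at 0, and the class name `ReadPort` is hypothetical.

```python
class ReadPort:
    """Cycle-level model of one LSRAM read port (no ECC, no timing delays)."""

    def __init__(self, memory, pipelined: bool):
        self.memory = memory
        self.pipelined = pipelined
        self.array_out = 0  # value read from the array at the last clock edge
        self.dout = 0       # value visible on the data output

    def clock(self, addr: int, ren: bool = True) -> int:
        """Apply one rising clock edge and return the data output after it."""
        if self.pipelined:
            # The pipeline register captures what the array produced one edge earlier.
            self.dout = self.array_out
        if ren:
            self.array_out = self.memory[addr]
        if not self.pipelined:
            # Flow-through: the array output is visible in the same cycle.
            self.dout = self.array_out
        return self.dout


mem = [10, 20, 30]
flow, pipe = ReadPort(mem, pipelined=False), ReadPort(mem, pipelined=True)
for address in (1, 2, 0):
    print(address, flow.clock(address), pipe.clock(address))
```

In the printed trace, the pipelined output lags the flow-through output by exactly one clock edge.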
Figure 2-4 shows the timing diagram for a read operation performed on LSRAM.
[Figure: read timing waveforms. Shows the clock (A_CLK, B_CLK) with tCY, tCH, and tCL; address, block select, write enable, and read enable inputs with their setup and hold times (tADDRSU, tADDRHD, tBLKSU, tBLKHD, tWESU, tWEHD, tRDSU, tRDHD); and valid data on A_DOUT[] and B_DOUT[] for non-pipeline mode (tCDOUT, tCEDOUT), for pipeline access or non-pipeline access with pipeline ECC (tFDOUT), for pipeline access with ECC pipeline bypass (tFEDOUT), and for pipeline access with ECC pipeline (tFDOUT).]
Figure 2-4 • Read Operation Timing Waveforms
Table 2-14 shows the read operation timing parameters.
Table 2-14 • Read Operation Timing Parameters

| Parameter | Description |
|---|---|
| tCY | Clock period |
| tCH | Clock minimum pulse width High |
| tCL | Clock minimum pulse width Low |
| tADDRSU | Address setup time |
| tADDRHD | Address hold time |
| tBLKSU | Block select setup time (with pipeline register enabled) |
| tBLKHD | Block select hold time (with pipeline register enabled) |
| tRDESU | Read enable setup time (A_WEN, B_WEN = 0) |
| tRDEHD | Read enable hold time (A_WEN, B_WEN = 0) |
| tCDOUT | Flow-through read access time (non-pipeline mode) |
| tFDOUT | Pipeline read access time |
| tCEDOUT | Non-pipeline read access time with non-pipeline ECC |
| tFEDOUT | Non-pipeline read access time with pipeline ECC |

Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.
Write Operation
LSRAM supports three types of write operations.
• Simple Write for both Dual-port and Two-port memory configurations
• Feed-Through Write (write-bypass write) for Dual-port memory only
• Read before Write for Dual-port memory only
Simple Write
The Simple write mode supports both Dual-port and Two-port memory configurations. It
is selected by setting A_WMODE/B_WMODE to 00. In Simple write mode, the data out changes only
on a read cycle. As the new data is delayed by one clock cycle, the data out in Simple write mode
cannot be read out until the third cycle after the initial write clock cycle, and it is delayed by an additional
clock cycle for the ECC pipeline on the output side.
Table 2-15 shows the settings of write enable, read enable, block select, and width for a simple write on
Port A. The same settings apply to Port B.
Table 2-15 • Read and Write Enable Settings

| A_BLK[2:0] | A_REN | A_WIDTH[1:0] | A_WEN[1:0] | Result |
|---|---|---|---|---|
| 111 | x | 00 | 11 | Write the data in x12 or x9 mode. A_DIN[11:0] is used for x12 mode and A_DIN[8:0] is used for x9 mode input data. |
| 111 | x | 01 | 01 | Write the data in x18 mode. A_DIN[8:0] is used for input data. Invalid for x12/x9 mode. |
| 111 | x | 01 | 10 | Write the data in x18 mode. A_DIN[17:9] is used for input data. Invalid for x12/x9 mode. |
| 111 | x | 01 | 11 | Write the data in x18 mode. A_DIN[17:0] is used for input data. |
Feed-Through Write (write-bypass Mode)
The Feed-Through write mode is selected by A_WMODE or B_WMODE equal to "01" and write enable is
High. The Feed-Through write option is not supported when the LSRAM is configured in Two-port mode.
In Feed-Through write operation, the data written into the memory array is displayed immediately on the
corresponding data output for Non-pipeline operation. For Pipeline operation, data output displays in the
next clock cycle. For a Feed-Through write operation in ECC pipeline mode, the pipeline enable holds
the data and will be clocked through the ECC pipe, one cycle after the data is written through the memory
and available at the ECC.
Read Before Write Mode
The read before write mode is selected by A_WMODE or B_WMODE equal to "10" and write enable is High.
The read before write option is not supported when the LSRAM is configured in Two-port mode.
In read before write operation, the data output is updated with the content of the write address before
the write. During read before write and feed-through write modes, the data out from the address written is
available after the third clock cycle: one cycle to register the address, one cycle to write/read the data,
and one cycle for the ECC data out pipeline. An additional clock cycle is required if the data out pipeline
is also selected. ECC flags are valid only in the same clock cycle as the data out.
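The three write modes and their Two-port restriction can be summarized as a small validity check. This is a minimal sketch based only on the WMODE encodings given in this section; the function name `write_mode` and the error messages are hypothetical.

```python
# Illustrative decode of A_WMODE/B_WMODE as described in this section.
# Names are hypothetical; encodings not described here are rejected.
WMODES = {
    "00": "simple write",
    "01": "feed-through write",
    "10": "read before write",
}


def write_mode(wmode: str, two_port: bool) -> str:
    """Return the write mode name, or raise ValueError if it is not usable."""
    if wmode not in WMODES:
        raise ValueError(f"WMODE {wmode} is not described in this section")
    # Only simple write is supported when the LSRAM is in Two-port mode.
    if two_port and wmode != "00":
        raise ValueError(f"{WMODES[wmode]} is not supported in Two-port mode")
    return WMODES[wmode]
```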
Timing Diagram: Simple Write, Feed-Through Write, and Read Before Write
•
The addresses (A_ADDR, B_ADDR), BLK enables (A_BLK, B_BLK), and write enables (A_WEN,
B_WEN = 1) must be set up before the rising edge of the clock (A_CLK, B_CLK).
•
For a Feed-Through write, the written data is displayed on the output (A_DOUT, B_DOUT) after a
delay of tCEDOUT in the same clock cycle.
•
For a simple write, the written data is displayed on the output only when a read operation is
performed on the same address.
•
If ECC is in Pipeline mode, the actual write to memory is delayed by one clock cycle. In simple
write mode, the data out changes only on a read cycle. As the new data is delayed by one clock
cycle, the data out in Simple-write mode cannot be read out until the third cycle after the initial
write clock cycle and will be delayed an additional clock cycle for the ECC pipeline on the output
side.
•
In RBW and WFT modes, the data out from the address written is also not available until the third
clock cycle: one cycle to register the address, one cycle to write/read the data and one cycle for
the ECC data out pipeline. The pipeline enable will only hold the data at the output pipeline and
not the input data pipeline and will be effective for the clock when the data would be expected to
be clocked through the pipeline.
•
An additional clock cycle is required if the data out pipeline is also selected. ECC flags will only be
valid the same clock cycle as the data out.
•
ECC flags are reset to zero, but are valid only on the same cycle as the corresponding data out. If
Pipeline modes are enabled, the ECC flags will be unknown values on subsequent invalid clock
cycles until a valid data out clock cycle.
•
The pipeline enables only hold the data at the output pipelines, including the ECC data out
pipeline, but not the input data pipeline. It is effective for the clock when the data is expected to be
clocked through the pipeline. For a Feed-Through Write operation in ECC pipeline mode, the
pipeline enable will not be captured during the write cycle and will only hold the data when it is
expected to be clocked through the ECC pipe, one cycle after the data is written through the
memory and available at the ECC.
Figure 2-5 shows the timing diagram for a write operation performed on the LSRAM block.
[Timing waveform: A_CLK/B_CLK (tCY, tCH, tCL); A_ADDR/B_ADDR (tADDRSU, tADDRHD); A_BLK/B_BLK (tBLKSU, tBLKHD); A_WEN/B_WEN (tWESU, tWEHD); A_DIN/B_DIN (tDSU, tDHD); A_DOUT/B_DOUT valid data for Feed-Through Write or Read Before Write in five configurations: Pipeline Bypass without ECC; Pipelined without ECC; Pipelined with ECC Pipeline Bypass; Pipeline Bypass with ECC Pipelined; Pipelined with ECC Pipelined.]
Figure 2-5 • Write Operation Timing Waveforms
Table 2-16 shows the write operation timing parameters.
Table 2-16 • Write Operation Timing Parameters
Parameters | Description
tCY | Clock period
tCH | Clock minimum pulse width High
tCL | Clock minimum pulse width Low
tADDRSU | Address setup time
tADDRHD | Address hold time
tBLKSU | Block select setup time (with pipeline register enabled)
tBLKHD | Block select hold time (with pipeline register enabled)
tWESU | Write enable setup time (A_WEN, B_WEN = 1)
tWEHD | Write enable hold time (A_WEN, B_WEN = 1)
tDSU | Data setup time
tDHD | Data hold time
tCEDOUT | Read access time with non-pipelined Feed-Through write timing, ECC bypass
tCDOUT | Read access time with pipelined Feed-Through write timing
tFEDOUT | Read access time with pipeline bypass and ECC pipeline
tFDOUT | Read access time with pipeline and ECC pipeline
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.
ECC
The LSRAM block has error detection and correction logic circuitry (1-bit error correction, 2-bit error
detection) and it is available for the x18 and x36 modes; but not for the x9 and x12 modes. Setting the
ECC enable ECC_EN to High turns ON the ECC circuitry and ECC pipeline stages. Table 2-17 shows the
ECC availability for different modes.
Table 2-17 • ECC available modes
Port Widths A/B | Write Bypass | Read Before Write | Write Enables (byte write) | Output Pipeline | Pipeline Bypass
x9/x9 | No ECC | No ECC | No ECC | No ECC | No ECC
x9/x18 | No ECC | No ECC | No ECC | No ECC | No ECC
x9/x36 | N/A | N/A | No ECC | No ECC | No ECC
x12/x12 | No ECC | No ECC | No ECC | No ECC | No ECC
x18/x9 | No ECC | No ECC | No ECC | No ECC | No ECC
x18/x18 | ECC Available | ECC Available | No ECC | ECC Available | ECC Available
x18/x36 | N/A | N/A | No ECC | ECC Available | ECC Available
x36/x9 | N/A | N/A | No ECC | No ECC | No ECC
x36/x18 | N/A | N/A | No ECC | ECC Available | ECC Available
x36/x36 | N/A | N/A | No ECC | ECC Available | ECC Available
The ECC encoder provides 24 bits of data for x18 mode or 48 bits of data for x36 mode. The ECC
decoder reads the same amount of bits (24 or 48) from the array and provides the expected number of
corrected bits (18 or 36) on the outputs.
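The encoded word sizes quoted above imply how many extra bits are stored per word. The short Python sketch below only performs that subtraction; the dictionary and function names are hypothetical, and the check-bit counts are derived arithmetic rather than figures stated in this guide.

```python
# Data width -> encoded word width, as stated for the LSRAM ECC encoder.
ECC_WORD_BITS = {18: 24, 36: 48}


def extra_bits(data_width: int) -> int:
    """Bits stored in the array beyond the user data for a given width."""
    return ECC_WORD_BITS[data_width] - data_width


for width in (18, 36):
    print(f"x{width}: {ECC_WORD_BITS[width]} bits stored, {extra_bits(width)} extra")
```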
If the ECC has detected an error (A_DB_DETECT, B_DB_DETECT), you need to correct the data in the
LSRAM block. The writing of the correct data is called 'Scrubbing'. Scrubbing is not available inside the
LSRAM. All scrubbing must be done in the fabric design.
Both the ECC encoder and ECC decoder contain their own pipeline registers, which add a clock cycle of
latency to each of the read and write operations. These pipeline registers may be bypassed for slower
operation.
The ECC decoder generates two flags per port: an error correction flag (A_SB_CORRECT,
B_SB_CORRECT) that is set High when a single bit in a word has been corrected, and an error
detection flag (A_DB_DETECT, B_DB_DETECT) that is set High when two or more bit errors in a
word have been detected, but not corrected. These flags are set to match the output data of the port
where the error was detected, even in x36 width.
On a single-bit error, the status flags are set to:
A/B_SB_CORRECT = 1'b1
A/B_DB_DETECT = 1'b0
On a double-bit error, the status flags are set to:
A/B_SB_CORRECT = 1'b1
A/B_DB_DETECT = 1'b1
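Fabric logic that performs scrubbing has to interpret the two flags together, because SB_CORRECT is High in both error cases. The following Python sketch shows one way to classify the flag pair; the function name and the returned strings are hypothetical, and the combination SB_CORRECT = 0 with DB_DETECT = 1 is not described in this guide, so it is reported as such.

```python
def classify_ecc_flags(sb_correct: int, db_detect: int) -> str:
    """Interpret the per-port ECC status flags for one valid data-out cycle."""
    if sb_correct and db_detect:
        return "double-bit error detected, data not corrected"
    if sb_correct:
        return "single-bit error corrected"
    if db_detect:
        # This combination is not listed in the guide.
        return "combination not described"
    return "no error"
```

Only the first case requires the design to treat the word as lost; in the second case the data out is already corrected, and the fabric may write it back to scrub the array.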
Reset Operation
The global reset signal (ARST_N) is an asynchronous Active Low signal. For any normal operation of
LSRAM, this reset signal must be kept High. To reset the LSRAM block, the reset signal must be set to
Low.
When the reset signal is asserted (ARST_N forced Low), the LSRAM block behaves as follows during
read and write operations:
1. Read operation: If the reset signal is asserted when the read operation is in process, the data
output port is forced to Low after a certain amount of delay. If the clock is set to High and the reset
signal is asserted and then deasserted in the same High clock phase or Low clock phase, the
data output stays Low until the next cycle. The data output changes its state only if a read
operation or write operation in Bypass mode is performed on the LSRAM block. In a simple write
operation, the data output stays Low.
2. Write operation: If the reset signal is asserted during a write operation, corrupted data is written
into the memory. Microsemi recommends not asserting the reset signal during a write operation.
All data stored in the array is lost during a global reset. The content of the array must be considered
unknown until a valid write operation.
Timing Diagram: Asynchronous Reset Operation
Figure 2-6 shows the timing diagram of an asynchronous reset operation.
[Timing waveform: A_CLK/B_CLK (tCY, tCH, tCL); ARST_N; A_DOUT/B_DOUT forced Low after tR2Q.]
Figure 2-6 • Asynchronous Reset Operation
Table 2-18 shows the asynchronous reset timing parameters.
Table 2-18 • Asynchronous Reset Timing Parameters
Parameters | Description
tCY | Clock period
tCH | Clock minimum pulse width High
tCL | Clock minimum pulse width Low
tR2Q | Asynchronous reset to output propagation delay
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.
Block Select Operation
The block select in LSRAM works like a chip select. When the block select (A_BLK and B_BLK) is High,
the LSRAM block is active and read and write operations can be performed.
If the block select is Low, the LSRAM does not perform any read or write operations and drives logic 0
on the data output pins until the next read cycle or write operation in Bypass mode. Refer to "A_BLK[2:0]
and B_BLK[2:0]" section on page 23. When the pipeline registers are used, the block select effect at the
output is delayed by one pipeline clock cycle (the pipeline registers are independent of block select). In
Two-port mode, A_BLK[2:0] controls the entire read port and B_BLK[2:0] controls the entire write port,
which is important when the port is x36. In Two-port mode, the block select of Port A can be
independent of the block select of Port B.
Figure 2-7 shows the timing diagram for block select inputs for LSRAM.
[Timing waveform: A_CLK/B_CLK (tCY); A_BLK/B_BLK (tBLKMPW, tBLKSU, tBLKHD); A_DOUT/B_DOUT in Non-Pipeline mode, data output Low after tBLK2Q; A_DOUT/B_DOUT with Pipeline access, data output Low after tCLK2Q.]
Figure 2-7 • Block Select Timings
Table 2-19 shows the block select timing parameters.
Table 2-19 • Block Selection Timing Parameters
Parameters | Description
tCY | Clock period
tCH | Clock minimum pulse width High
tCL | Clock minimum pulse width Low
tBLKSU | Block select setup time (with pipeline register enabled)
tBLKHD | Block select hold time (with pipeline register enabled)
tBLKMPW | Block select minimum pulse width
tBLK2Q | Block select to out disable time (when pipeline registers are disabled)
tCLK2Q | Read access time without pipeline register
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.
Figure 2-6 on page 34 shows the timing diagram for asynchronous reset operation performed on
LSRAM.
Read Enable
The Read enable pin controls each port. It can be used to conserve power while retaining previously read
data out. When the read enable is set to Low, the data outputs retain their previous state and no
dynamic read power is consumed on that port. When the read enable is set to High, normal read
operation resumes. Table 2-20 shows the Read enable functionality for Port A and Port B.
Table 2-20 • Read enable functionality for Port A and Port B
Function | A_WEN/B_WEN | A_REN/B_REN | A_BLK/B_BLK | A_DOUT/B_DOUT | Power | Comment
Deselect LSRAM | x | x | Any 0 | All zero | Low | No read or write operations. Refer to "A_BLK[2:0] and B_BLK[2:0]" section on page 23.
Write to Port A | 11 or 10 | x | 111 | Previous data | Write power | Simple write mode. A_WEN/B_WEN = 11 is the only valid active write setting for x12/x9.
Read Port A | 00 | 1 | 111 | New data | Read power | Read Operation
Standby mode | 00 | 0 | 111 | Previous data | Low | No read or write operations
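The rows of Table 2-20 can be expressed as a priority decode: block select first, then write enable, then read enable. The Python sketch below is a hedged illustration of that ordering; the function name and return strings are hypothetical, and a write-enable value that the table does not list (01) is reported as not listed rather than guessed.

```python
def port_function(wen: str, ren: int, blk: str) -> str:
    """Classify a port's activity from its WEN[1:0], REN, and BLK[2:0] inputs."""
    if "0" in blk:                 # "Any 0" on block select deselects the LSRAM
        return "deselect"
    if wen in ("11", "10"):        # write-enable values listed in Table 2-20
        return "write"
    if wen == "00":
        return "read" if ren else "standby"
    return "not listed in Table 2-20"
```

For example, `port_function("00", 0, "111")` returns "standby", the row in which the outputs hold their previous data and power is low.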
Collision
Collision scenarios arise between both ports of the LSRAM block when a read operation is requested
from one port and a write operation is requested from the other port simultaneously on the same address
location, or when a write operation occurs at the same location at the same time from both ports.
Table 2-21 shows the behavior of the LSRAM block during the various cases of collisions.
Table 2-21 • Collision Scenarios
Operation | Description
Simultaneous read from Port A and Port B at the same location | Operation is allowed without any restrictions and data is available on the output ports after the specified time, as described in the read timing diagrams in Figure 2-4 on page 29.
Simultaneous read from Port A and write from Port B at the same location | Not allowed. The new data may be written into the address location but the read data out will be a garbage value.
Simultaneous read from Port B and write from Port A at the same location | Not allowed. The new data may be written into the address location but the read data out will be a garbage value.
Simultaneous write from Port A and Port B at the same location | Not allowed. If the data to be written is the same on both ports, the data is successfully written. If the data is different, the LSRAM cell has an undetermined state.
Note: There are no collision prevention or detection techniques available in LSRAM. The last three
operations mentioned in Table 2-21 are not allowed on LSRAM and must be avoided.
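Because the LSRAM provides no collision detection, a testbench or arbitration model has to flag these cases itself. The following Python sketch mirrors Table 2-21 under a simplifying assumption that both ports use the same width, so equal address values mean the same location; the function name and result strings are hypothetical.

```python
from typing import Optional


def collision_result(op_a: Optional[str], addr_a: int,
                     op_b: Optional[str], addr_b: int) -> str:
    """Classify one clock cycle; each op is 'read', 'write', or None (idle)."""
    if op_a is None or op_b is None or addr_a != addr_b:
        return "no collision"
    if op_a == "read" and op_b == "read":
        return "allowed"
    if op_a == "write" and op_b == "write":
        return "not allowed: cell undetermined unless both ports write the same data"
    # One port reads while the other writes the same location.
    return "not allowed: read data out is a garbage value"
```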
3 – Micro SRAM (uSRAM)
Introduction
The RTG4 FPGA fabric has embedded 1.5 Kbits uSRAM blocks used for storing data. These uSRAMs
are arranged in multiple rows within the FPGA fabric and can be accessed through the fabric routing
architecture. The number of uSRAM present varies between different RTG4 devices. Table 3-1 on
page 39 shows the number of uSRAM present in each RTG4 device.
Features
RTG4 uSRAM blocks have the following features:
•
Each uSRAM block stores up to 1.5 Kbits (1,536 bits) of data and can be configured in any of the
following depth × width combinations: 64 × 18, 128 × 12, and 128 × 9. Only the x12 port width
accesses the entire 1,536-bit address space. The x9 and x18 address space is limited to
1,152 bits.
•
Each uSRAM block has two read data ports (Port A and Port B) and one write data port (Port C).
•
Each uSRAM block has built-in EDAC with 1-bit error correction, 2-bit error detection for the x18
mode but not for the x9 and x12 modes. EDAC is referred to as ECC in the description, ports, and
timing diagrams.
•
The registers in uSRAM block are similar to STMR flip-flop in fabric and have an option to mitigate
single-event transients.
•
Read operations can be performed in both Synchronous and Asynchronous modes. The write
operation is always performed in Synchronous mode.
•
The two read ports have address/block select registers for enabling Synchronous mode
operation.
•
In Pipelined mode, the two read ports have output registers with independent clocks. These
Output pipeline registers can also be configured as transparent for Asynchronous mode
operation.
•
Due to the availability of separate input address and output pipeline registers, read operations
through Port A and Port B in uSRAM can be performed in four different modes:
–
Synchronous read mode without pipeline registers (Synchronous-Asynchronous mode)
–
Synchronous read mode with pipeline registers (Synchronous-Synchronous mode)
–
Asynchronous read mode without pipeline registers (Asynchronous-Asynchronous mode)
–
Asynchronous read mode with pipeline registers (Asynchronous-Synchronous mode)
•
Separate synchronous resets are provided for the input address select registers. These resets
can be used to initialize the read ports.
•
The output pipeline registers have separate synchronous resets, which provide independent
control to these registers.
•
uSRAM can operate up to 300 MHz with SET mitigation disabled and up to 250 MHz with SET
mitigation enabled.
•
The two read ports are independent of each other and simultaneous read operations can be
performed from both ports at the same address location.
•
Simultaneous read and write operations at the same location are not allowed.
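The capacity figures in the first feature follow directly from depth × width. The Python sketch below reproduces that arithmetic; the dictionary name is hypothetical.

```python
# Depth and width of each uSRAM configuration listed in the features.
USRAM_CONFIGS = {"x18": (64, 18), "x12": (128, 12), "x9": (128, 9)}

# Addressable bits per configuration: depth multiplied by width.
usable_bits = {name: depth * width for name, (depth, width) in USRAM_CONFIGS.items()}

for name, bits in usable_bits.items():
    print(f"{name}: {bits} bits")
```

Only the x12 configuration reaches 1,536 bits; the x9 and x18 configurations each address 1,152 bits.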
uSRAM Resource Table
Table 3-1 shows uSRAM blocks available for the RTG4 devices.
Table 3-1 • RTG4 uSRAM (1.5 Kb Blocks) Resource Table
Blocks | RT4G075 | RT4G150
uSRAM 1.5 Kbit Blocks | 112 | 210
Note: All numbers given above are per device.
Functional Description
This section provides detailed description of the following:
•
Architecture Overview
•
Port List
•
Port Description
Architecture Overview
The RTG4 uSRAM embedded memory includes the RAM64×24 macro available in the Libero SoC
software. Figure 3-1 shows a simplified block diagram of the uSRAM memory block with two read data
ports, one write data port, and pipeline registers at read port. Table 3-2 on page 40 shows the port
descriptions.
[Block diagram: Port A read decode with inputs A_ADDR[], A_BLK[], A_ADDR_EN, A_ADDR_BYPASS, and A_CLK; Port B read decode with inputs B_ADDR[], B_BLK[], B_ADDR_EN, B_ADDR_BYPASS, and B_CLK; Port C write control with inputs C_DIN[], C_ADDR[], C_BLK[], C_WEN, and C_CLK; memory array; ECC logic (ECC_EN) and a pipeline register (A_DOUT_EN, B_DOUT_EN) on each read path; outputs A_DOUT[], A_SB_CORRECT, A_DB_DETECT, B_DOUT[], B_SB_CORRECT, and B_DB_DETECT.]
Figure 3-1 • Simplified Functional Block Diagram of uSRAM
Port List
Table 3-2 • Port List for uSRAM
Port Name | Direction | Type* | Descriptions | Polarity
Port A
A_ADDR[6:0] | Input | Dynamic | Port A read address input | –
A_BLK[1:0] | Input | Dynamic | Port A block select | Active High
A_WIDTH | Input | Static | Port A depth × width mode selection | –
A_DOUT[17:0] | Output | Dynamic | Port A read data | –
A_CLK | Input | Dynamic | Port A clock input | Rising
A_DOUT_EN | Input | Dynamic | Port A read-data pipeline register enable | Active High
A_DOUT_SRST_N | Input | Dynamic | Port A read-data pipeline register synchronous reset | Active Low
A_DOUT_BYPASS | Input | Static | Port A read-data pipeline register select | Active High
A_ADDR_BYPASS | Input | Static | Port A read-address pipeline register select | Active High
A_ADDR_EN | Input | Dynamic | Port A read-address register enable | Active High
A_ADDR_SRST_N | Input | Dynamic | Port A read-address register synchronous reset | Active Low
A_SB_CORRECT | Output | Dynamic | Port A 1-bit error correction flag | Active High
A_DB_DETECT | Output | Dynamic | Port A 2-bit error detection flag | Active High
Port B
B_ADDR[6:0] | Input | Dynamic | Port B read address input | –
B_BLK[1:0] | Input | Dynamic | Port B block select | Active High
B_WIDTH | Input | Static | Port B depth × width mode selection | –
B_DOUT[17:0] | Output | Dynamic | Port B read data | –
B_CLK | Input | Dynamic | Port B clock input | Rising
B_DOUT_EN | Input | Dynamic | Port B read-data pipeline register enable | Active High
B_DOUT_SRST_N | Input | Dynamic | Port B read-data pipeline register synchronous reset | Active Low
B_DOUT_BYPASS | Input | Static | Port B read-data pipeline register select | Active High
B_ADDR_BYPASS | Input | Static | Port B read-address register select | Active High
B_ADDR_EN | Input | Dynamic | Port B read-address register enable | Active High
B_ADDR_SRST_N | Input | Dynamic | Port B read-address register synchronous reset | Active Low
B_SB_CORRECT | Output | Dynamic | Port B 1-bit error correction flag | Active High
B_DB_DETECT | Output | Dynamic | Port B 2-bit error detection flag | Active High
Port C
C_ADDR[6:0] | Input | Dynamic | Port C write address input | –
C_BLK[1:0] | Input | Dynamic | Port C block select | Active High
C_WIDTH | Input | Static | Port C depth × width mode selection | –
C_DIN[17:0] | Input | Dynamic | Port C write data input | –
C_CLK | Input | Dynamic | Port C clock input | Rising
C_WEN | Input | Dynamic | Port C write enable | Active High
Common Signals
ECC | Input | Static | ECC enable | Active High
ECC_DOUT_BYPASS | Input | Static | ECC pipeline register select | Active High
ARST_N | Input | Global | Read-address and read-data pipeline registers asynchronous reset | Active Low
DELEN | Input | Static | Enable SET mitigation | Active High
BUSY | Output | Dynamic | Busy signal from SII | Active High
SECURITY | Input | Static | Lock access to SII | Active High
Note: *Static inputs are defined at design time and need to be tied to 0 or 1.
Port Description
A_WIDTH, B_WIDTH, and C_WIDTH
These signals are the depth × width mode selections for each port. Table 3-3 shows the depth × width
based on ports width selection.
Table 3-3 • Width/Depth Mode Selection
A_WIDTH / B_WIDTH / C_WIDTH | Depth × Width
0 | 128 × 12, 128 × 9
1 | 64 × 18
A_ADDR[6:0], B_ADDR[6:0], and C_ADDR[6:0]
These signals are the address buses for the three ports (two read and one write). In x12 and x9 modes,
all 7 bits are used to address the 128 independent locations. In the wider x18 mode, fewer address bits
are used. The used address bits are the most significant bits (MSBs). The unused bits are the least
significant bits (LSBs) and they must be grounded. Table 3-4 shows the address bus used and unused
bits for depth × width selections.
Table 3-4 • Address Bus Used and Unused Bits
Depth × Width | A_ADDR/B_ADDR/C_ADDR Used Bits | Unused Bits (to be grounded)
128 × 9 | [6:0] | None
128 × 12 | [6:0] | None
64 × 18 | [6:1] | [0]
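Since the used address bits are MSB justified, a word index in 64 × 18 mode has to be placed on bits [6:1] with bit 0 grounded. The Python sketch below shows that mapping; the function name is hypothetical, and `width_mode` follows the WIDTH encoding of Table 3-3 (0 for the 128-deep modes, 1 for 64 × 18).

```python
def addr_bus_value(location: int, width_mode: int) -> int:
    """Return the 7-bit ADDR[6:0] value for a word index in the given mode."""
    depth = 64 if width_mode == 1 else 128
    if not 0 <= location < depth:
        raise ValueError(f"location {location} is outside 0..{depth - 1}")
    if width_mode == 1:
        # 64 x 18: index occupies ADDR[6:1]; ADDR[0] is unused and grounded.
        return location << 1
    # 128 x 12 and 128 x 9: all seven bits are used.
    return location
```

For example, word 63 in 64 × 18 mode drives 0b1111110 on the address bus.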
C_DIN[17:0]
This signal is the data input bus for the write Port C. The used bits for any mode are LSB justified in the
data bus and the unused MSB bits must be grounded. Table 3-5 shows the data input bus used and
unused bits for depth × width selections.
Table 3-5 • Data Input Buses Used and Unused Bits
Depth × Width | C_DIN Used Bits | Unused Bits (to be grounded)
64 × 18 | [17:0] | None
128 × 12 | [11:0] | [17:12]
128 × 9 | [8:0] | [17:9]
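LSB justification means a narrower data word is driven on the low bits of C_DIN with zeros above it. The following Python sketch formats an 18-bit bus value for each width; the function name is hypothetical and the string output is only for illustration.

```python
def c_din_bus(value: int, width: int) -> str:
    """Return the 18-bit C_DIN[17:0] pattern (MSB first) for a data word."""
    if width not in (9, 12, 18):
        raise ValueError("width must be 9, 12, or 18")
    if not 0 <= value < (1 << width):
        raise ValueError(f"value does not fit in {width} bits")
    # The data occupies the low `width` bits; unused MSBs are grounded (0).
    return format(value, "018b")
```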
A_DOUT[17:0] and B_DOUT[17:0]
These signals are the data output buses for the two ports (Port A and Port B). The used bits for any mode
are LSB justified in the data bus. Table 3-6 shows the data output bus used and unused bits for different
depth × width selections.
Table 3-6 • Data Output Buses Used and Unused Bits
Depth × Width | A_DOUT/B_DOUT Used Bits | Unused Bits
64 × 18 | [17:0] | None
128 × 12 | [11:0] | [17:12]
128 × 9 | [8:0] | [17:9]
A_BLK[1:0], B_BLK [1:0], and C_BLK [1:0]
These signals are the port select control signal for each port. Table 3-7 shows the operations (Read,
write, and no operation) based on the selection of port select control signals.
Table 3-7 • Port Select Control Signals
Port Select Signal | Value | Operation
A_BLK[1:0] | 11 | Perform read operation on Port A.
A_BLK[1:0] | 00, 01, 10 | Port A is not selected and its read data is logic 0.
B_BLK[1:0] | 11 | Perform read operation on Port B.
B_BLK[1:0] | 00, 01, 10 | Port B is not selected and its read data is logic 0.
C_BLK[1:0] | 11 | Perform write operation on Port C.
C_BLK[1:0] | 00, 01, 10 | Port C is not selected.
A_CLK, B_CLK, C_CLK
These signals are the clock inputs for Port A, Port B, and Port C. Ensure that all inputs are set up before the
first rising clock edge. The read/write operation starts at the rising edge of the corresponding clock signal.
C_WEN
This signal is the write enable for Port C. If both C_BLK bits and C_WEN are 1, a write occurs on
Port C.
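The port select and write enable conditions can be captured in two small predicates. This is a minimal Python sketch of the behavior described in Table 3-7 and the C_WEN description; the function names are hypothetical.

```python
def port_selected(blk: str) -> bool:
    """A uSRAM port is selected only when both BLK[1:0] bits are 1."""
    return blk == "11"


def read_data(blk: str, stored_word: int) -> int:
    """Read ports return logic 0 when the port is not selected."""
    return stored_word if port_selected(blk) else 0


def write_occurs(c_blk: str, c_wen: int) -> bool:
    """Port C writes only when it is selected and C_WEN is 1."""
    return port_selected(c_blk) and c_wen == 1
```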
A_ADDR_SRST_N and B_ADDR_SRST_N
These signals are Active Low, synchronous reset inputs for the input address/block select registers for
Port A and Port B. Assertion of these reset signals forces the address input registers and block select
registers to logic 0, which in turn forces the data output to logic 0. When the registers are configured as
transparent, these inputs must be tied to logic 1.
A_DOUT_SRST_N and B_DOUT_SRST_N
These signals are Active Low, synchronous reset inputs for the output pipeline registers for Port A and
Port B. Assertion of these reset signals forces the data output to logic 0. In Non-pipelined mode of
operation, tie these inputs to logic 1.
A_ADDR_EN and B_ADDR_EN
These signals are Active High enable inputs for the input address/block select registers for Port A and
Port B. When logic 0 is applied on these inputs, the input registers hold the previous input address. When
logic 1 is applied on these inputs, the input registers behave as normal D flip-flops. When the registers
are configured as transparent, these inputs should be tied to logic 1.
A_DOUT_EN and B_DOUT_EN
These signals are Active High enable inputs for the output pipeline registers for Port A and Port B. When
logic 0 is applied on these inputs, the pipeline registers hold the previously read data out. In Non-pipelined mode, tie these inputs to logic 1.
ARST_N
This signal is the Global reset. It connects the read-address and read-data pipeline registers to the global
asynchronous reset signal.
ECC
This signal is Active High and enables ECC logic for Port A, Port B, and Port C.
•
Logic 1: ECC logic enable
•
Logic 0: ECC logic disable
ECC_DOUT_BYPASS
The ECC pipeline registers have Bypass mode for slow operations.
•
Logic 0: Pipe-lined operation
•
Logic 1: Non-pipelined operation
DELEN
This signal enables the SET mitigation. When this signal is driven High, the delay for SET filters is turned
ON. uSRAM supports a maximum frequency of up to 250 MHz with SET mitigation enabled and 300 MHz
with SET mitigation disabled.
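The two frequency limits translate into minimum clock periods, which is the form most timing constraints use. The Python sketch below only performs that unit conversion; the function name is hypothetical, and actual timing values should come from the datasheet.

```python
def min_clock_period_ns(delen: int) -> float:
    """Minimum uSRAM clock period implied by the maximum frequency."""
    f_max_mhz = 250.0 if delen else 300.0  # DELEN = 1 enables SET mitigation
    return 1000.0 / f_max_mhz              # period in ns = 1000 / f in MHz
```

With SET mitigation enabled the period is 4 ns; with it disabled the period is approximately 3.33 ns.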
A_DOUT_BYPASS and B_DOUT_BYPASS
The output pipeline registers have a Bypass mode for each port.
•
Logic-0 = pipe-lined operation
•
Logic-1= non-pipelined operation
A_ADDR_BYPASS and B_ADDR_BYPASS
The input pipeline registers have a Bypass mode for each port.
•
Logic-0 = pipe-lined operation
•
Logic-1= non-pipelined operation
A_SB_CORRECT, B_SB_CORRECT
These are Error Correction Code flags for Port A and Port B. When the flag goes High by itself, it
indicates that a single bit error is detected by that port and corrected in the data output. This flag also
goes High for a double bit error.
A_DB_DETECT, B_DB_DETECT
These are Error Detection Code flags for Port A and Port B. When the flag goes High, it indicates that
multiple bit errors are detected by that port, but not corrected.
BUSY
This output indicates that the uSRAM is being accessed by the SII.
SECURITY
This control signal, when set to 1, locks the entire uSRAM memory from being accessed by the SII.
Operating Modes
This section describes the following operation modes:
•
Read Operation
•
Write Operation
•
ECC
•
Reset Operation
•
Collision
Read Operation
uSRAM blocks are read through two ports: Port A and Port B. There are four modes for read operations:
•
Synchronous read mode without pipeline registers (Synchronous-Asynchronous mode)
•
Synchronous read mode with pipeline registers (Synchronous-Synchronous mode)
•
Asynchronous read mode without pipeline registers (Asynchronous-Asynchronous mode)
•
Asynchronous read mode with pipeline registers (Asynchronous-Synchronous mode)
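The four read modes are the combinations of two independent choices: whether the input address register is bypassed and whether the output pipeline register is bypassed. The Python sketch below names the mode from those two settings. It uses the port-list names A_ADDR_BYPASS and A_DOUT_BYPASS, and assumes the address bypass is the same control that the settings lists in this section call A_IN_BYPASS; the function name is hypothetical.

```python
def read_mode(addr_bypass: int, dout_bypass: int) -> str:
    """Name the read mode as <input side>-<output side>."""
    # Bypass = 1 makes the register transparent (asynchronous path).
    input_side = "Asynchronous" if addr_bypass else "Synchronous"
    output_side = "Asynchronous" if dout_bypass else "Synchronous"
    return f"{input_side}-{output_side}"
```

For example, a registered address with a transparent output register, `read_mode(0, 1)`, gives "Synchronous-Asynchronous".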
Synchronous Read Mode
Synchronous read mode requires that the input registers for the address and block select inputs are
configured in STMR flip-flop mode (A_IN_BYPASS or B_IN_BYPASS = 0). Similarly, on the output side,
the pipeline registers can be configured as registered or asynchronous.
When the pipeline registers are enabled, the clock inputs of both the input and output registers must be
synchronous to each other and fed with a single clock source. Microsemi recommends configuring the
registers as pipeline registers during read operation to avoid glitches on the read output data lines.
In Synchronous read mode, the address (A_ADDR or B_ADDR) and block select (A_BLK or B_BLK)
inputs must satisfy the setup and hold timing with respect to the input clocks (A_ CLK or B_ CLK).
Synchronous Read Mode without Pipeline Registers (Synchronous-Asynchronous
Read Mode)
•
The input registers are configured in Synchronous read mode.
•
The output pipeline registers are configured as transparent.
•
This mode is achieved by configuring the following settings:
–
A_DOUT_BYPASS = 1 or B_DOUT_BYPASS = 1
–
A_IN_BYPASS or B_IN_BYPASS = 0
–
A_DOUT_SRST_N = 1 or B_DOUT_SRST_N = 1
–
A_DOUT_EN or B_DOUT_EN = 1
–
A_BLK = 1, B_BLK = 1
Figure 3-2 on page 46 shows the synchronous asynchronous operation with data output behavior
when block select inputs are deasserted (any bit forced to logic 0).
•
The output data is displayed immediately in the same clock cycle in which the address and block
select inputs were registered.
•
The uSRAM block can generate glitches on the output buses when used without the pipeline
registers.
Figure 3-2 shows the timing waveforms for synchronous-asynchronous read operation without pipeline
registers.
[Timing waveform: A_CLK/B_CLK (tCH, tCL, tCY); A_ADDR/B_ADDR (tADDRSU, tADDRHD); A_BLK/B_BLK (tBLKSU, tBLKHD); A_DOUT/B_DOUT in synchronous-asynchronous mode without ECC registers (tCLK2QH, tCLK2QR, tBLK2Q); A_DOUT/B_DOUT in synchronous-asynchronous mode with ECC registers (tCLK2QE).]
Figure 3-2 • Synchronous-Asynchronous Read Operation Waveform without Pipeline Registers
Table 3-8 shows the timing parameter values for Synchronous read mode without pipeline registers.
Table 3-8 • Timing Parameters for Synchronous-Asynchronous Read Operation
Parameter | Description
tCY | Read clock period
tCH | Read clock minimum pulse width High time
tCL | Read clock minimum pulse width Low time
tADDRSU | Read address setup time in Synchronous mode
tADDRHD | Read address hold time in Synchronous mode
tBLKSU | Read block select setup time (when pipeline registers enabled)
tBLKHD | Read block select hold time (when pipeline registers enabled)
tCLK2QH | Data output read hold time
tCLK2QR | Data output read access time
tBLK2Q | Block select to dout disable/enable time
tCLK2QE | Data output read access time with ECC registers
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.
Synchronous Read Mode with Pipeline Registers (Synchronous-Synchronous Read
Mode)
•
The input registers are configured in Synchronous read mode.
•
The output pipeline registers are configured as edge-triggered registers (Pipelined mode).
•
Pipelined mode is achieved by making the following settings:
–
A_DOUT_BYPASS or B_DOUT_BYPASS = 0
–
A_IN_BYPASS or B_IN_BYPASS = 0
–
A_DOUT_SRST_N = 1 or B_DOUT_SRST_N = 1
–
A_DOUT_EN or B_DOUT_EN = 1
–
A_BLK = 1, B_BLK = 1
•
The input register clock and pipeline register clock must be synchronous to each other; hence
they must be sourced from the same clock input.
•
The output data appears on the output bus in the next clock cycle.
Figure 3-3 shows the timing waveforms for synchronous-synchronous read operation with pipeline
registers.
[Timing waveform: A_CLK/B_CLK (tCH, tCL, tCY); A_ADDR/B_ADDR (tADDRSU, tADDRHD); A_BLK/B_BLK (tBLKSU, tBLKHD); A_DOUT/B_DOUT in synchronous-synchronous mode without ECC registers (tCLK2QP); A_DOUT/B_DOUT in synchronous-synchronous mode with ECC registers (tCLK2QP).]
Figure 3-3 • Synchronous-Synchronous Read Operation Waveform with Pipeline Registers
Table 3-9 shows the timing parameter values for Synchronous read mode with pipeline registers.
Table 3-9 • Timing Parameters for Synchronous-Synchronous Read Operation
Parameter | Description
tCY | Read clock period
tCH | Read clock minimum pulse width High time
tCL | Read clock minimum pulse width Low time
tADDRSU | Read address setup time in Synchronous mode
tADDRHD | Read address hold time in Synchronous mode
tBLKSU | Read block select setup time (when pipeline registers enabled)
tBLKHD | Read block select hold time (when pipeline registers enabled)
tCLK2QP | Pipeline read access time
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.
Asynchronous Read Mode
Asynchronous read mode requires that the input registers for the address and block-select inputs are
configured in asynchronous mode by configuring the following settings:
• A_IN_BYPASS or B_IN_BYPASS = 1
• A_ADDR_SRST_N or B_ADDR_SRST_N = 1
• A_BLK = 1, B_BLK = 1
Asynchronous Read Mode Without Pipeline Registers (Asynchronous-Asynchronous
Mode)
• The input registers are configured in Asynchronous read mode.
• The output pipeline registers are configured as transparent (non-pipelined operation).
• The pipeline registers can be made transparent by making the following settings:
  – A_DOUT_BYPASS or B_DOUT_BYPASS = 1
  – A_IN_BYPASS or B_IN_BYPASS = 1
  – A_DOUT_SRST_N = 1 or B_DOUT_SRST_N = 1
  – A_DOUT_EN or B_DOUT_EN = 1
• After the input address is provided, the output data is displayed on the output data bus after a tCLK2Q delay (see Figure 3-4).
• The uSRAM block can generate glitches on the data output bus when used without the pipeline register.
Figure 3-4 shows the timing diagram for Asynchronous-Asynchronous read mode for uSRAM.
[Waveform: A_ADDR[]/B_ADDR[], A_BLK/B_BLK, A_CLK/B_CLK, and A_DOUT[]/B_DOUT[], with the output shown for the asynchronous-asynchronous mode without ECC registers and with ECC registers; annotated with tADDR, tADDRSU, tADDRHD, tBLKMPW, tBLKSU, tBLKHD, tADDR2QH, tADDR2QR, tBLK2Q, tCH, tCL, tCY, and tCLK2Q.]
Figure 3-4 • Read Operations with Asynchronous Inputs Without Pipeline Registers Waveform
Table 3-10 shows the timing parameter values for the asynchronous read mode without pipeline
registers.
Table 3-10 • Timing Parameters of the Asynchronous Read Mode Without Pipeline Registers
Parameter | Description
tADDR | Address cycle time; 300 MHz (250 MHz with SET mitigation enabled)
tBLKMPW | Block select cycle time
tADDR2QH | Data output read hold time
tADDR2QR | Data output read access time
tBLK2Q | Block select to dout disable/enable time
tCY | Pipeline clock period; 300 MHz (250 MHz with SET mitigation enabled)
tCH | Clock high time
tCL | Clock low time
tADDRSU | Address setup time
tADDRHD | Address hold time
tBLKSU | Block select setup time
tBLKHD | Block select hold time
tCLK2Q | Pipeline read access time
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.
Asynchronous Read Mode with Pipeline Registers (Asynchronous-Synchronous Mode)
• The input registers are configured in Asynchronous read mode.
• The output pipeline registers are configured as registers (Pipelined mode).
• Pipelined mode is achieved by configuring the following settings:
  – A_DOUT_BYPASS or B_DOUT_BYPASS = 0
  – A_IN_BYPASS or B_IN_BYPASS = 1
  – A_DOUT_SRST_N = 1 or B_DOUT_SRST_N = 1
  – A_DOUT_EN or B_DOUT_EN = 1
  – A_BLK = 1, B_BLK = 1
• After the input address is provided, the output data is displayed on the output data bus after the next rising edge of the pipeline register input clock.
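The settings listed for the synchronous-synchronous, asynchronous-asynchronous, and asynchronous-synchronous read modes differ only in the two bypass controls. As a minimal sketch, the lookup below collects them per port; the function name, the mode strings, and the dictionary layout are hypothetical conveniences, not part of any Libero SoC interface.

```python
# Bypass settings per read mode, as listed in this section.
READ_MODES = {
    "sync-sync":   {"IN_BYPASS": 0, "DOUT_BYPASS": 0},
    "async-async": {"IN_BYPASS": 1, "DOUT_BYPASS": 1},
    "async-sync":  {"IN_BYPASS": 1, "DOUT_BYPASS": 0},
}
# Settings shared by all three modes.
COMMON = {"DOUT_SRST_N": 1, "DOUT_EN": 1, "BLK": 1}


def read_mode_settings(port, mode):
    """Return the signal settings for read port 'A' or 'B' in the given mode."""
    if port not in ("A", "B"):
        raise ValueError("port must be 'A' or 'B'")
    settings = {**READ_MODES[mode], **COMMON}
    if settings["IN_BYPASS"] == 1:
        # Asynchronous input mode also requires the address reset held inactive.
        settings["ADDR_SRST_N"] = 1
    return {f"{port}_{name}": value for name, value in settings.items()}


async_sync_a = read_mode_settings("A", "async-sync")
```

Keeping the mode table separate from the shared settings makes the difference between modes visible at a glance.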
Figure 3-5 shows the timing diagrams for Asynchronous-Synchronous read mode for uSRAM.
[Waveform: A_ADDR[]/B_ADDR[], A_BLK/B_BLK, A_CLK/B_CLK, and A_DOUT[]/B_DOUT[], with the output shown for the asynchronous-synchronous mode without ECC registers and with ECC registers; annotated with tADDRSU, tADDRHD, tBLKSU, tBLKHD, tCH, tCL, tCY, and tCLK2Q.]
Figure 3-5 • Read Operations with Asynchronous Inputs with Pipeline Registers Waveform
Table 3-11 shows the timing parameter values of the asynchronous read mode with pipeline registers.
Table 3-11 • Timing Parameters of the Asynchronous Read Mode with Pipeline Registers
Parameter | Description
tCY | Pipeline clock period; 300 MHz (250 MHz with SET mitigation enabled)
tCH | Clock high time
tCL | Clock low time
tADDRSU | Address setup time
tADDRHD | Address hold time
tBLKSU | Block select setup time
tBLKHD | Block select hold time
tCLK2Q | Pipeline read access time
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.
Write Operation
• Port C is the only port through which a write operation can be performed on uSRAM.
• The write operation is purely synchronous and all operations are synchronized to the rising edge of the Port C clock input (C_CLK).
• The write inputs, C_ADDR, C_BLK, C_WEN, and C_DIN, have to satisfy the setup and hold timings with respect to the rising edge of the C_CLK input for a successful write operation.
• If all the inputs meet the required timing parameters, the input data is written into uSRAM in one clock cycle.
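The write behavior described above can be sketched as a toy model in which the array changes only on a C_CLK rising edge. The class name, the default depth of 64 words, and the treatment of C_BLK and C_WEN as simple active-High conditions are assumptions made for illustration; setup and hold timing is not modeled.

```python
class MicroSramWritePort:
    """Toy model: the array is written only through Port C, on a C_CLK rising edge."""

    def __init__(self, depth=64):   # depth is an assumed value for illustration
        self.mem = [0] * depth

    def c_clk_rising_edge(self, c_addr, c_blk_selected, c_wen, c_din):
        # The word is committed in this clock cycle only when the block is
        # selected and the write enable is active.
        if c_blk_selected and c_wen:
            self.mem[c_addr] = c_din


sram = MicroSramWritePort()
sram.c_clk_rising_edge(c_addr=3, c_blk_selected=True, c_wen=True, c_din=0x2A5)
sram.c_clk_rising_edge(c_addr=3, c_blk_selected=True, c_wen=False, c_din=0x000)
```

After these two edges, address 3 still holds the first value because the second edge had the write enable inactive.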
Figure 3-6 shows the timing waveforms for a Port C write operation.
[Waveform: C_CLK, C_ADDR, C_BLK[], C_WEN, and C_DIN, with the data written in SRAM shown without ECC registers and with ECC registers; annotated with tCH, tCL, tCY, tADDRSU, tADDRHD, tBLKSU, tBLKHD, tWESU, tWEHD, tDINSU, and tDINHD.]
Figure 3-6 • Timing Waveforms for the Write Operation
Table 3-12 shows the timing parameters of the write operation.
Table 3-12 • Timing Parameters of the Write Operation
Parameter | Description
tCY | Write clock period
tCH | Write clock minimum pulse width High
tCL | Write clock minimum pulse width Low
tADDRCSU | Write address setup time
tADDRCHD | Write address hold time
tBLKCSU | Write block setup time
tBLKCHD | Write block hold time
tWESU | Write enable setup time
tWEHD | Write enable hold time
tDINSU | Write input data setup time
tDINHD | Write input data hold time
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.
ECC
uSRAM has error detection and correction logic (1-bit error correction, 2-bit error detection), which is available in x18 mode only. Setting the ECC enable input (ECC_EN) to High turns on the ECC circuitry and the ECC pipeline stages.
The ECC encoder provides 24 bits of data for x18 mode. The ECC decoder reads 24 bits from the array
and provides 18 corrected bits on the output.
If the ECC logic detects an error, you can choose to correct the data in the uSRAM block. Writing the corrected data back to the array is called scrubbing. The uSRAM does not scrub itself; all scrubbing must be implemented in the fabric design.
Both the ECC encoder and ECC decoder contain their own pipeline registers, which add a clock cycle of latency to each of the read and write operations. These pipeline registers may be bypassed for slower operation. If pipeline modes are enabled, the ECC flags hold unknown values on clock cycles with invalid output data, until a clock cycle with valid data out.
The ECC encoder generates two flags per port, an error correction flag (A_SB_CORRECT,
B_SB_CORRECT) that is set to High when a single bit in a word is corrected and an error detection flag
(A_DB_DETECT, B_DB_DETECT) that is set to High when two or more bit errors in a word are detected,
but not corrected.
On a single bit error, the status flags will be set to:
A/B_SB_CORRECT = 1'b1
A/B_DB_DETECT = 1'b0
On a double bit error, the status flags will be set to:
A/B_SB_CORRECT = 1'b1
A/B_DB_DETECT = 1'b1
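Because scrubbing has to be done in the fabric design, it may help to see the decision logic spelled out. The sketch below is a software illustration only: `classify_ecc`, `scrub`, and the `read_word`/`write_word` callbacks are hypothetical names, and treating both flags Low as "no error" is an assumption that the text above does not state explicitly.

```python
def classify_ecc(sb_correct, db_detect):
    """Interpret the two per-port ECC flags using the encodings listed above."""
    if db_detect:
        return "uncorrectable"   # two or more bit errors: detected, not corrected
    if sb_correct:
        return "corrected"       # single-bit error corrected on the output
    return "clean"               # assumption: both flags Low means no error


def scrub(read_word, write_word, depth):
    """Walk the array, rewrite corrected words, and report uncorrectable addresses.

    read_word(addr) returns (data, sb_correct, db_detect);
    write_word(addr, data) writes through the write port.
    """
    uncorrectable = []
    for addr in range(depth):
        data, sb_correct, db_detect = read_word(addr)
        status = classify_ecc(sb_correct, db_detect)
        if status == "corrected":
            write_word(addr, data)       # write the corrected data back
        elif status == "uncorrectable":
            uncorrectable.append(addr)   # left for the application to handle
    return uncorrectable


# Example with a fake three-word memory: address 1 has a single-bit error,
# address 2 has a double-bit error.
fake_reads = {0: (5, 0, 0), 1: (7, 1, 0), 2: (9, 1, 1)}
written = []
bad = scrub(lambda a: fake_reads[a], lambda a, d: written.append((a, d)), 3)
```

Checking DB_DETECT first matters because, per the encodings above, a double-bit error also sets SB_CORRECT.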
Reset Operation
The global reset signal (ARST_N) is an asynchronous Active Low signal. For normal operation of uSRAM, the reset signal must be set to High. To reset the uSRAM block, the reset signal must be set to Low.
When reset is asserted (ARST_N forced Low), the uSRAM behaves as follows during read and write
operations:
1. Read operation: If reset is asserted when the read operation is in process, the data output port is
forced Low after a certain amount of delay. If the clock is High and the reset signal is asserted and
then deasserted in the same High clock phase or Low clock phase, the data output stays Low
until the next cycle. The data output changes its state only if a read operation or write operation in
Bypass mode is performed on the uSRAM block. In a simple write operation, the data output
stays Low.
2. Write operation: If reset is asserted during a write operation, corrupted data is written into the memory. Microsemi recommends that you avoid asserting the reset signal during a write operation.
All data stored in the array is lost during a global reset. The contents of the array must be considered
unknown until a valid write operation.
Timing Diagram: Asynchronous Reset Operation
Figure 3-7 shows the asynchronous reset operation.
[Waveform: A_CLK/B_CLK, ARST_N, and A_DOUT/B_DOUT; annotated with tCY, tCH, tCL, and tR2Q.]
Figure 3-7 • Asynchronous Reset Operation
Table 3-13 • Asynchronous Reset Timing Parameters
Parameter | Description
tCY | Clock period
tCH | Clock minimum pulse width High
tCL | Clock minimum pulse width Low
tR2Q | Asynchronous reset to output propagation delay
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.
The reset signals (A_ADDR_SRST_N, B_ADDR_SRST_N) are synchronous Active Low signals for the
address and block select input registers for Port A and Port B. The assertion of these reset signals forces
the address and block select input registers to logic 0, which in turn forces the data output to logic 0.
Figure 3-8 shows the timing waveform for synchronous reset.
[Waveform: A_ADDR_CLK/B_ADDR_CLK, A_ADDR_SRST_N/B_ADDR_SRST_N, and A_DOUT/B_DOUT; annotated with tCLKMPWL, tCLKMPWH, tCY, tSRSTSU, tSRSTHD, and tCLK2Q.]
Figure 3-8 • Timing Waveforms for Synchronous Reset
Table 3-14 shows the timing parameters of the synchronous reset.
Table 3-14 • Timing Parameters of the Synchronous Reset
Parameter | Description
tCY | Read clock period
tCLKMPWH | Read clock minimum pulse width High
tCLKMPWL | Read clock minimum pulse width Low
tSRSTSU | Read synchronous reset setup time
tSRSTHD | Read synchronous reset hold time
tCLK2Q | Read synchronous reset to output propagation delay
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.
Collision
Collision between ports occurs when the read and write operations are requested from two or all three
ports at the same time and the same address location. Table 3-15 shows different scenarios for collision.
Table 3-15 • Collision Scenarios
Operation | Comments
Simultaneous read from Port A and read from Port B to the same address location | Allowed as the read ports are independent of each other. Both read ports deliver correct read data.
Simultaneous read from Port A and write to Port C to the same address location | Collision occurs. The write operation works correctly but the read operation from Port A generates ambiguous data output unless the clock cycle is long enough to allow the newly written data to be read.
Simultaneous read from Port B and write to Port C to the same address location | Collision occurs. The write operation works correctly but the read operation from Port B generates ambiguous data output unless the clock cycle is long enough to allow the newly written data to be read.
Simultaneous read from Port A, read from Port B, and write to Port C to the same address location | Collision occurs. The write operation works correctly but the read operation from both the ports generates ambiguous data output unless the clock cycle is long enough to allow the newly written data to be read.
Note: There is no collision prevention or detection implemented in the uSRAM architecture, so the
designer must take measures to avoid the last three scenarios in designs.
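Since the uSRAM does not detect collisions itself, one option is to check for them in a testbench or in design-level assertions. The following is a minimal sketch of such a check; the function name and the convention of passing `None` for an idle port are hypothetical.

```python
def find_collisions(a_read_addr, b_read_addr, c_write_addr):
    """Return the read ports whose address collides with the Port C write.

    Each argument is an address, or None when that port is idle in this cycle.
    Two reads from the same address are allowed and are never reported.
    """
    if c_write_addr is None:
        return []
    return [
        port
        for port, addr in (("A", a_read_addr), ("B", b_read_addr))
        if addr is not None and addr == c_write_addr
    ]


same_address_reads = find_collisions(5, 5, None)   # reads only: no collision
a_collides = find_collisions(5, 6, 5)              # Port A read vs Port C write
both_collide = find_collisions(5, 5, 5)            # both reads vs Port C write
```

A check like this covers the three collision scenarios in Table 3-15 and leaves the read-read case unflagged.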
4 – uPROM
Introduction
The RTG4 FPGA fabric has embedded micro programmable read only memory (uPROM) blocks used
for storing program data such as initialization data for SERDES, LSRAM, and uSRAM blocks. These
uPROMs are arranged in a single row at the bottom of FPGA fabric and can be accessed through the
System Controller or fabric interface. The number of uPROMs present depends on the device. Table 4-1
shows the numbers of uPROMs present in each RTG4 device.
Features
RTG4 uPROM blocks have the following features:
• Each uPROM block stores up to 18,144 bits (504x36) of data.
• Write operation (erase/program) is performed at the same time as FPGA programming.
• Only the read operation is supported during normal operation.
• The read operation can be performed through the System Controller or the fabric interface.
• The read operation is supported at 50 MHz.
• The read operation is synchronous.
• Each uPROM block has an option to register all inputs and outputs.
• The registers at the read port of the uPROM block are similar to STMR flip-flops and have an option to mitigate single-event transients.
uPROM Resource Table
Table 4-1 shows uPROM blocks available for RTG4 devices.
Table 4-1 • RTG4 uPROM Resource Table
Blocks | RT4G075 | RT4G150
uPROM Blocks | 254 | 381
Note: All numbers given above are per device.
Functional Description
This section provides the detailed description of the following:
• Architecture Overview
• Port List
• Operational Modes
Architecture Overview
The RTG4 uPROM embedded read only memory includes the uPROM macro available in the Libero SoC software. Figure 4-1 shows a simplified block diagram of the uPROM memory block with one read data port and pipeline registers at the read port. Table 4-2 shows the port descriptions.
[Block diagram: RDEN, CLK, and ADDR[] enter the uPROM interface; a read decode stage addresses the memory array; a pipeline register drives DATAR[]; BUSY is an output.]
Figure 4-1 • Simplified Functional Block Diagram of uPROM
Port List
Table 4-2 lists the ports of the uPROM block.
Table 4-2 • Port List for uPROM
Port Name | Direction | Type | Polarity | Description
ADDR[13:0] | Input | Dynamic | – | Address input
CLK | Input | Dynamic | Rising | Clock input
RDEN | Input | Dynamic | Active High | Read Enable
DATAR[35:0] | Output | Dynamic | – | Data output
BUSY | Output | Dynamic | Active High | Busy signal from SII
Operational Modes
In the RTG4 uPROM block, the write operation (program/erase) is performed during the FPGA program or erase operation. The following two modes of read operation are available to access the uPROM data:
• Mode 1: Read Operation through System Controller
• Mode 2: Read Operation through Fabric Interface
Mode 1: Read Operation through System Controller
During the power-up sequence, the System Controller reads the data from uPROM to initialize the LSRAM, uSRAM, FDDR, or SERDES block registers. Refer to the RTG4 FPGA System Controller User Guide for more information on uPROM access through the System Controller.
Mode 2: Read Operation through Fabric Interface
During normal operation, you can read the uPROM block through the fabric interface using the uPROM macro (to be released) available in the Libero Macro library. The read timing diagram for the uPROM is shown in Figure 4-2.
[Waveform: CLK, ADDR[], and DATA[], with the data output shown in panels labeled "Without registered" and "With registered"; annotated with tCH, tCL, tCY, tADDRSU, tADDRHD, tSU, tCQ, and tDR.]
Figure 4-2 • Timing Waveforms for Read Operation
Table 4-3 • Timing Parameters for Synchronous-Synchronous Read Operation
Parameter | Description
tCY | Read clock period (50 MHz)
tCH | Read clock minimum pulse width High time
tCL | Read clock minimum pulse width Low time
tSU | Read setup time
tCQ | Read Clock to Q delay
tDR | Read Data delay
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.
5 – Mathblocks
Introduction
The RTG4 FPGA device implements a custom 18×18 multiply and accumulate block (18×18 MACC) for efficient implementation of complex DSP algorithms such as finite impulse response (FIR) filters, infinite impulse response (IIR) filters, and fast Fourier transforms (FFT) for filtering and image processing applications.
The RTG4 mathblock has a built-in multiplier and adder, which minimizes the fabric logic required to
implement multiplication, multiply-add, and multiply-accumulate (MACC) functions. Implementation of
these arithmetic functions results in efficient resource usage and improved performance for DSP
applications. In addition to the basic MACC function, DSP algorithms typically need small amounts of
RAM for coefficients and larger RAMs for data storage. RTG4 micro RAMs (uSRAMs) are ideally suited
to serve the needs of coefficient storage while the large RAMs are used for data storage. The number of
available mathblocks varies depending on the size of the device, as shown in Table 5-1.
Features
Each mathblock has the following features:
• High-performance and power-optimized multiplication operations.
• Supports 18 × 18 signed multiplication natively.
• Supports 17 × 17 unsigned multiplications.
• Supports dot product: the multiplier computes (A[8:0] × B[17:9] + A[17:9] × B[8:0]) × 2^9.
• Built-in addition, subtraction, and accumulation units to combine multiplication results efficiently.
• Independent third input C with data width 44 bits completely registered.
• Single-bit input, CARRYIN, from fabric routing.
• Supports both registered and unregistered inputs and outputs.
• All the input and output registers are STMR flip-flops.
• Supports signed and unsigned operations.
• Internal cascade signals (44-bit CDIN and CDOUT) enable cascading of the mathblocks to support larger accumulator, adder, and subtractor without extra logic.
• Supports loopback capability.
• Adder support: (A × B) + C or (A × B) + D or (A × B) + C + D.
• Clock-gated input and output registers for power optimizations.
• Width of adder and accumulator can be extended by implementing extra adders in the FPGA fabric.
• Mathblocks can operate up to 300 MHz with SET mitigation disabled and up to 250 MHz with SET mitigation enabled.
• Supports transparent mode.
• Asynchronous load - limited to reset.
• Global reset can be ignored, if required.
• Mathblock flip-flops always reset during power-up.
Mathblock Resource Table
Table 5-1 lists the mathblocks available for RTG4 devices.
Table 5-1 • RTG4 Mathblocks Resource Table
Blocks | RT4G075 | RT4G150
Mathblocks (18-bit × 18-bit) | 224 | 462
Note: All numbers given above are per device.
Functional Description
This section provides the detailed description of the architecture of mathblock.
Architecture Overview
RTG4 devices can have one to three rows of mathblocks in the FPGA fabric, as given in Table 5-1.
Mathblocks can be accessed through the FPGA routing architecture and cascaded in a chain, starting
from the left-most block to the right-most block.
Each mathblock consists of the following:
• Multiplier
• Adder or Subtractor
• I/O and Control Registers
Figure 5-1 shows the functional block diagram of the mathblock.
[Block diagram: data inputs A[], B[], C[], CARRYIN, and CDIN[], with input registers (inreg) and their _BYPASS, _SRST_N, and _EN controls; control registers (cntlreg) for SUB, ARSHFT17, CDSEL, and FDBKSEL with their _AL_N, _SL_N, _EN, _AD, _SD_N, and _BYPASS controls; the DOTP and OVFL_CARRYOUT_SEL inputs; a right-shift (>>) element on operand D; and an output register (outreg, with P_BYPASS, P_SRST_N, and P_EN) driving P[], CDOUT[], and OVFL_CARRYOUT. All registers are clocked by CLK.]
Figure 5-1 • Functional Block Diagram of the Mathblock
Multiplier
The RTG4 mathblock can be used as a multiplier, which accepts two 18-bit inputs (A and B), and
generates a 36-bit output. The mathblock multiplier can be configured in two different operating modes:
• Normal Mode
• DOTP Mode
Normal Mode
In Normal mode, the mathblock implements a single 18 × 18 signed multiplier. The mathblock accepts
the inputs, A [17:0] and B [17:0], and generates A*B with a 36-bit wide result. Figure 5-2 shows the
functional block diagram of the mathblock in Normal mode.
[Block diagram, Normal mode: inputs A[], B[], C[], D[], SUB, and CARRYIN; output P[].]
Figure 5-2 • Functional Block Diagram of the Mathblock in Normal Mode
DOTP Mode
Dot Product (DOTP) mode uses two independent 9-bit × 9-bit multipliers followed by an adder; the sum of the two products is stored in the upper 36 bits of the 44-bit register. In DOTP mode, the mathblock implements the following equation:
(A[8:0] × B[17:9] + A[17:9] × B[8:0]) × 2^9    (EQ 1)
DOTP mode can be used to implement 9 × 9 complex multiplications.
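As a worked illustration of EQ 1 (taking its scale factor to be 2 to the power 9), the sketch below evaluates the dot product in software and shows how it yields the imaginary part of a 9 × 9 complex multiplication. The helper names are hypothetical, and treating each 9-bit half as a signed two's complement value is an assumption.

```python
def signed_field(value, hi, lo):
    """Extract bits [hi:lo] of value and interpret them as two's complement."""
    width = hi - lo + 1
    field = (value >> lo) & ((1 << width) - 1)
    return field - (1 << width) if field >> (width - 1) else field


def pack(hi, lo):
    """Pack two signed 9-bit values into one 18-bit word as {hi, lo}."""
    return ((hi & 0x1FF) << 9) | (lo & 0x1FF)


def dotp(a, b):
    """(A[8:0] x B[17:9] + A[17:9] x B[8:0]) x 2**9 for raw 18-bit inputs."""
    a_lo, a_hi = signed_field(a, 8, 0), signed_field(a, 17, 9)
    b_lo, b_hi = signed_field(b, 8, 0), signed_field(b, 17, 9)
    return (a_lo * b_hi + a_hi * b_lo) * (1 << 9)


# Imaginary part of (3 - 4j) * (5 + 6j): real parts go in the upper halves,
# imaginary parts in the lower halves.
ar, ai, br, bi = 3, -4, 5, 6
imag_scaled = dotp(pack(ar, ai), pack(br, bi))
```

With this packing, the result equals (ar × bi + ai × br) × 2^9; the real part needs a second pass with one operand rearranged and negated.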
Figure 5-3 shows the functional block diagram of the mathblock in DOTP mode.
[Block diagram, Dot Product mode: inputs A[] and B[] (each split into two halves), C[], D[], SUB, and CARRYIN; output P[].]
Figure 5-3 • Functional Block Diagram of the Mathblock in DOTP Mode
Adder or Subtractor
The adder sums the output from the multiplier, the C input, CARRYIN, and the D input. The final output (P) of the adder is ((A[17:0] × B[17:0]) + C[43:0] + D[43:0] + CARRYIN).
The mathblock can be configured as a 2-input or 3-input adder.
• As a 2-input adder, the mathblock computes A × B + C or A × B + D.
• As a 3-input adder, the mathblock computes A × B + C + D.
If the adder is configured as a subtractor, the adder output is ((C[43:0] + D[43:0] + CARRYIN) - (A[17:0] × B[17:0])).
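The adder and subtractor expressions above can be checked with a short arithmetic sketch. The function names are hypothetical; the model assumes signed two's complement operands, a 44-bit result that wraps on overflow, and an overflow flag that is set when the exact result falls outside the signed 44-bit range.

```python
WIDTH = 44
MAX_P = (1 << (WIDTH - 1)) - 1     # most positive 44-bit signed value
MIN_P = -(1 << (WIDTH - 1))        # most negative 44-bit signed value


def wrap44(value):
    """Keep the low 44 bits of value and reinterpret them as signed."""
    value &= (1 << WIDTH) - 1
    return value - (1 << WIDTH) if value >> (WIDTH - 1) else value


def macc(a, b, c=0, d=0, carryin=0, sub=0):
    """Return (P, overflow) for P = C + D + CARRYIN +/- (A x B)."""
    product = a * b
    exact = c + d + carryin - product if sub else c + d + carryin + product
    overflow = int(exact > MAX_P or exact < MIN_P)
    return wrap44(exact), overflow


add_result = macc(3, 4, c=10)            # 3 x 4 + 10
sub_result = macc(3, 4, c=10, sub=1)     # 10 - 3 x 4
wrapped = macc(1, 1, c=MAX_P)            # one past the positive limit
```

Leaving D at 0 gives the 2-input form A × B + C; supplying both C and D gives the 3-input form.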
I/O and Control Registers
Mathblocks have built-in registers on data inputs (A, B, C), data output (P), and control signals. If
required, these registers can be bypassed. All the registers in the mathblock have clock gating capability
to reduce the power consumption. These register flip-flops are STMR.
Mathblocks do not have a pipeline register at the cascade input (CDIN), so pipeline registers can be
added from the fabric when multiple mathblocks are cascaded to implement higher bit-width
multiplications.
C Input
The C input port allows the formation of many 3-input mathematical functions, such as 3-input addition or
2-input multiplication with an addition. The CARRYIN signal is the carry input of the adder or
accumulator. The C input can also be used as a dynamic input achieving the following functionalities:
• Wrapping-around the cascade chain of mathblocks from one row to the next row through the fabric.
• Rounding of multiplication outputs.
• Trimming of lower order bits of the final sum, partial sum or the product.
Cascaded Input, Output, and Selection
Higher level DSP functions are supported by cascading individual mathblocks in a row. The two data
signals, CDIN [43:0] and CDOUT [43:0], provide the cascading capability with a cascade select input
(CDSEL). Table 5-2 shows the selection of CDSEL for propagating CDIN to the D input of the adder. To
cascade mathblocks, the CDOUT of one block must feed the CDIN of another block. CDOUT to CDIN is
a hardwired connection between the blocks within a row.
Two different rows can be cascaded using the fabric routing between the two rows. Extra pipeline
registers may be needed to compensate for the extra delays added due to the fabric routing, which in
turn increases the latency of the chain.
The ability to cascade mathblocks is useful in filter designs. For example, an FIR filter design can use
cascading inputs to arrange a series of input data samples and cascading outputs to arrange a series of
partial output results. The ability to cascade provides a high-performance and low power implementation
of DSP filter functions because the general routing in the fabric is not used.
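The FIR example above can be sketched as a chain of multiply-add stages in which each stage adds its product to the value arriving from the previous stage, in the way CDOUT feeds CDIN. This is an arithmetic illustration only: the function name is hypothetical, and pipeline latency, register placement, and the 44-bit width limit are all ignored.

```python
def fir_cascade(coeffs, samples):
    """Compute FIR outputs with one multiply-add stage per coefficient."""
    outputs = []
    history = [0] * len(coeffs)            # delayed input samples, newest first
    for x in samples:
        history = [x] + history[:-1]
        cascade = 0                        # first stage: operand D is 0
        for coeff, sample in zip(coeffs, history):
            cascade = coeff * sample + cascade   # this stage's P, passed on as CDOUT
        outputs.append(cascade)
    return outputs


impulse_response = fir_cascade([1, 2, 3], [1, 0, 0, 0])
moving_sum = fir_cascade([1, 1], [1, 2, 3])
```

Feeding an impulse through the chain returns the coefficients in order, which is a quick sanity check for the tap ordering.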
Overflow Output
Each mathblock has an overflow signal, OVFL_CARRYOUT. This signal indicates any overflow from the addition operation performed by the adder. This signal is also used to extend the adder data widths from the existing 44 bits using the fabric. The overflow signal is also used for the implementation of
saturation capabilities. Saturation refers to catching an overflow condition and replacing the output with
either the maximum (most positive) or minimum (most negative) value that can be represented. In RTG4
mathblocks, this capability is implemented using the adder's output sign bit (MSB [43] bit of the P output)
and the overflow signal.
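One way fabric logic could combine the overflow signal with the sign bit of P is sketched below. It assumes that P holds the wrapped low 44 bits of the exact result and that the exact result exceeds the signed 44-bit range by at most one wrap; under those assumptions a negative-looking P on overflow means the true result was too large, and a non-negative P means it was too small. The function name is hypothetical.

```python
WIDTH = 44
MAX_P = (1 << (WIDTH - 1)) - 1     # most positive representable value
MIN_P = -(1 << (WIDTH - 1))        # most negative representable value


def saturate(p, ovfl):
    """Clamp a wrapped 44-bit signed result using the overflow flag.

    p is the signed value of P[43:0]; ovfl is the OVFL_CARRYOUT bit.
    """
    if not ovfl:
        return p
    # Sign bit set after overflow: the exact sum was above MAX_P.
    # Sign bit clear after overflow: the exact sum was below MIN_P.
    return MAX_P if p < 0 else MIN_P


no_overflow = saturate(5, 0)
positive_overflow = saturate(MIN_P, 1)   # MAX_P + 1 wraps to MIN_P
negative_overflow = saturate(MAX_P, 1)   # MIN_P - 1 wraps to MAX_P
```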
Shift Input
For multi-precision arithmetic, mathblocks provide a right-wire-shift by 17 which is controlled by the
ARSHFT17 input. Thus, a partial product from one mathblock can be shifted to the right and added to the
next partial product computed in an adjacent mathblock. Using this technique, mathblocks can be used
to build larger multipliers.
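The arithmetic behind building a larger multiplier from 18 × 18 blocks can be sketched as follows for a signed 35 × 35 multiplication. Each operand is split into a signed 18-bit upper part and an unsigned 17-bit lower part, so every partial product fits one signed 18 × 18 multiplier. The function names are hypothetical, and the sketch aligns the partial products with left shifts for clarity; a hardware chain obtains the same alignment by shifting the running sum right by 17 between stages.

```python
def split35(value):
    """Split a signed 35-bit value into (signed 18-bit high, unsigned 17-bit low)."""
    low = value & 0x1FFFF      # lower 17 bits, always non-negative
    high = value >> 17         # arithmetic shift keeps the sign
    return high, low


def mul35(a, b):
    """Multiply two signed 35-bit values using four 18 x 18 partial products."""
    a_hi, a_lo = split35(a)
    b_hi, b_lo = split35(b)
    return (
        a_lo * b_lo
        + ((a_hi * b_lo + a_lo * b_hi) << 17)
        + ((a_hi * b_hi) << 34)
    )


a, b = 12345678901, -9876543210
product = mul35(a, b)
```

The weights of 17 and 34 bits between partial products are what make a 17-bit shift the natural step between adjacent blocks.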
Feedback Select Input
For accumulation operations, the mathblock output must loop back to the D input of the adder block.
Selection of the D input is controlled by the feedback select (FDBKSEL) input. Table 5-2 shows the
selection of FDBKSEL for loopback.
Table 5-2 • Truth Table for Propagating Operand D of the Adder or Accumulator
CDSEL | FDBKSEL | ARSHFT17 | Operand D
0 | 0 | 0 | 0
0 | 0 | 1 | 0
1 | X | 0 | CDIN[43:0]
1 | X | 1 | {{17{CDIN[43]}}, CDIN[43:17]}
0 | 1 | 0 | P[43:0]
0 | 1 | 1 | {{17{P[43]}}, P[43:17]}
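The operand D selection can be expressed as a small function, which may be useful when checking a design against the truth table. The sketch treats CDIN and P as signed integers and models ARSHFT17 as a 17-bit arithmetic right shift, which is how the ARSHFT17 pin is described; the function name is hypothetical.

```python
def operand_d(cdsel, fdbksel, arshft17, cdin=0, p=0):
    """Select operand D of the adder from CDIN, the P feedback, or zero."""
    if cdsel:
        source = cdin          # CDSEL has priority; FDBKSEL is a don't-care
    elif fdbksel:
        source = p             # feedback from the P register
    else:
        return 0               # neither selected: operand D is 0
    # Python's >> on integers is an arithmetic (sign-preserving) shift.
    return source >> 17 if arshft17 else source


zero_case = operand_d(0, 0, 1, cdin=5, p=7)
cascade_case = operand_d(1, 0, 0, cdin=5, p=7)
shifted_cascade = operand_d(1, 1, 1, cdin=-(1 << 17), p=7)
shifted_feedback = operand_d(0, 1, 1, p=1 << 20)
```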
Mathblock Interface to Fabric Routing
Mathblocks can access the fabric routing through interface logic routing clusters. These clusters are composed of 12 flip-flops and 12 4-input LUTs. When mathblocks are used, these flip-flops and LUTs act as an interface to the fabric routing. When mathblocks are not used, these flip-flops and LUTs can be utilized as normal flip-flops and LUTs. The interface logic clusters do not have carry chain support.
How to Use Mathblocks
The following sections describe how to use Mathblock in an application:
• Design Flow
• Mathblock Use Models
• Coding Style Examples
Design Flow
Mathblocks can be used in two ways: through inference or by using the mathblock primitive. Inference is
done during the synthesis stage of an RTL design. Alternately, the mathblock primitive is available in the
Libero SoC IP catalog as a component that can be used directly in the HDL file or instantiated in
SmartDesign.
Using a Mathblock Through Inference
Synplify Pro can infer mathblocks and can configure them into appropriate modes automatically, if the
RTL contains any specific multiply, multiply-accumulate, multiply-add, or multiply-subtract functions. In
this case, the synthesis tool takes care of all the signal connections of the mathblock to the rest of the
design and provides the correct values for the static signals to configure the appropriate operational
mode. The tool ties unused dynamic input signals to ground and provides default values to unused static
signals.
The synthesis tool maps any multiplication function with input widths of three or greater to mathblocks.
However, the mapping of multiplication functions with input widths less than three, which are
implemented in FPGA logic by default, can be controlled by the synthesis attribute (syn_multstyle). The
tool also has the capability to cascade multiple mathblocks, if the function crosses the limits of a single
mathblock. For example, if an RTL function has a 35 × 35 multiplication, the synthesis tool implements
this using four mathblocks cascaded in a chain. It also has the capability to place the input and output
registers inside the mathblock boundary, provided they are driven by the same clock. If the registers have
different clocks, the clock that drives the output register has priority, and all registers driven by that clock
are placed into the mathblock. If the outputs are unregistered and the inputs are registered with different
clocks, the input registers with the larger input have priority and are placed into the mathblock.
The synthesis tool supports inference of mathblock components across hierarchical boundaries, which
means even if the multipliers, input registers, output registers, and subtractor/adders are present in
different hierarchies, they can be placed into the same mathblock.
For more information on mathblock inference by Synplify Pro, refer to the Synopsys application note on
inferring Microsemi RTG4 MACC Blocks (to be released).
Using the Mathblock Primitive
The mathblock primitive available in the Libero SoC IP Catalog is called MACC. Figure 5-4 shows the MACC primitive with its input and output ports and the bit width of each port.
The MACC primitive can be used in designs by SmartDesign for schematic-based design entry or by
directly instantiating the MACC wrapper in an HDL file as a component. For the MACC primitive, the
inputs and outputs must be connected manually to the design signals. Proper values to the static signals
must be provided to ensure that the mathblock is configured in the correct operational mode. For
example, to configure the mathblock in DOTP mode, the DOTP signal must be tied to logic 1.
Unused active high dynamic signals must be connected to ground, unused active low dynamic signals
must be set to High, and unused static signals must be in default state.
Figure 5-4 • Mathblock Macro
Table 5-3 provides the port list and definitions.
Table 5-3 • Mathblock Pin Descriptions
Pin Name | Direction | Type | Polarity | Description
CLK | Input | Dynamic | Rising edge | Input clock. One clock is used in the entire mathblock: CLK is the clock for the A[17:0], B[17:0], P[43:0], OVFL, SHFTSEL, CDSEL, FDBKSEL, and SUB registers.
ARST_N | Input | Global | Low | Global asynchronous reset
Port A (to Multiplier)
A[17:0] | Input | Dynamic | – | Input data
A_EN | Input | Dynamic | High | Enable for data registers. A_EN is for A[17:0]. When not registered, connect A_EN to logic 1.
A_SRST_N | Input | Dynamic | Low | Synchronous reset. A_SRST_N is for A[17:0]. When not registered, connect A_SRST_N to logic 1.
A_BYPASS | Input | Dynamic | Low | Port A register select
Port B (to Multiplier)
B[17:0] | Input | Dynamic | – | Input data
B_SRST_N | Input | Dynamic | Low | Synchronous reset. B_SRST_N is for B[17:0]. When not registered, connect B_SRST_N to logic 1.
B_EN | Input | Dynamic | High | Enable for data registers. B_EN is for B[17:0]. When not registered, connect B_EN to logic 1.
B_BYPASS | Input | Dynamic | Low | Port B register select
Port C (to Adder)
C[43:0] | Input | Dynamic | – | Input data
CARRYIN | Input | Dynamic | – | Adder/accumulator's carry input
C_SRST_N | Input | Dynamic | Low | Synchronous reset. C_SRST_N is for C[43:0]. When not registered, connect C_SRST_N to logic 1.
C_EN | Input | Dynamic | High | Enable for data registers. C_EN is for C[43:0]. When not registered, connect C_EN to logic 1.
C_BYPASS | Input | Dynamic | Low | Port C register select
Other Inputs
CDIN[43:0] | Input | Cascade | – | Cascaded input for operand D of the adder/accumulator. The entire CDIN will be driven by another mathblock's CDOUT.
Note: Asynchronous load input has higher priority than the synchronous load input.
Table 5-3 • Mathblock Pin Descriptions (continued)
Pin Name | Direction | Type | Polarity | Description
DOTP | Input | Static | High | Dot product mode. When DOTP = 1, the mathblock performs (A[8:0] × B[17:9] + A[17:9] × B[8:0]) × 2^9. When DOTP = 0, the mathblock performs normal 18 × 18 multiplication operations.
SUB | Input | Dynamic | High | Subtract operation. When SUB = 1, perform 2's complement subtraction to get P = C + D + CARRYIN - (A × B). When SUB = 0, perform 2's complement addition to get P = C + D + CARRYIN + (A × B).
SUB_SL_N | Input | Dynamic | Low | Synchronous reset input for SUB input control register.
SUB_EN | Input | Dynamic | High | Enable input for SUB input control register.
SUB_SD | Input | Static | Low | Synchronous load data for the SUB input control register.
SUB_BYPASS | Input | Dynamic | Low | SUB register select
ARSHFT17 | Input | Dynamic | High | Arithmetic right-shift for operand D. When asserted, a 17-bit arithmetic right-shift is performed on operand D of the adder/accumulator.
ARSHFT17_SL_N | Input | Dynamic | Low | Synchronous reset input for ARSHFT17 input control register.
ARSHFT17_EN | Input | Dynamic | High | Enable input for ARSHFT17 input control register.
ARSHFT17_SD | Input | Static | Low | Synchronous load data for the ARSHFT17 input control register.
ARSHFT17_BYPASS | Input | Dynamic | Low | ARSHFT17 register select
CDSEL | Input | Dynamic | High | Selects CDIN for operand D of the adder/accumulator input. When CDSEL = 1, CDIN is propagated to the operand D. When CDSEL = 0, either logic 0 or feedback from output P is routed to the operand D depending upon the FDBKSEL.
CDSEL_SL_N | Input | Dynamic | Low | Synchronous reset input for CDSEL input control register.
CDSEL_EN | Input | Dynamic | High | Enable input for CDSEL input control register.
CDSEL_SD | Input | Static | Low | Synchronous load data for the CDSEL input control register.
CDSEL_BYPASS | Input | Dynamic | Low | CDSEL register select
Table 5-3 • Mathblock Pin Descriptions (continued)
Pin Name | Direction | Type | Polarity | Description
FDBKSEL | Input | Dynamic | High | Selects the feedback from P for operand D of the adder or accumulator. When FDBKSEL = 1, the current value of the result P register is propagated; ensure P_BYPASS = 0 and CDSEL = 0. When FDBKSEL = 0, logic 0 is propagated; ensure CDSEL = 0.
FDBKSEL_SL_N | Input | Dynamic | Low | Synchronous reset input for FDBKSEL input control register.
FDBKSEL_EN | Input | Dynamic | High | Enable input for FDBKSEL input control register.
FDBKSEL_SD | Input | Static | Low | Synchronous load data for the FDBKSEL input control register.
FDBKSEL_BYPASS | Input | Dynamic | Low | FDBKSEL register select
Output Port
P[43:0] | Output | | | Result data out. Normal mode: P = C + D + CARRYIN + (A × B) when SUB = 0; P = C + D + CARRYIN - (A × B) when SUB = 1. DOTP mode: P = C + D + CARRYIN + ((A[8:0] × B[17:9] + A[17:9] × B[8:0]) × 2^9) when SUB = 0; P = C + D + CARRYIN - ((A[8:0] × B[17:9] + A[17:9] × B[8:0]) × 2^9) when SUB = 1.
OVFL_CARRYOUT | Output | | | Overflow output. Normal mode: if C + D + CARRYIN +/- (A × B) > (2^43 - 1), then OVFL_CARRYOUT = 1; if C + D + CARRYIN +/- (A × B) < -(2^43), then OVFL_CARRYOUT = 1; else OVFL_CARRYOUT = 0. DOTP mode: if C + D + CARRYIN +/- ((A[8:0] × B[17:9] + A[17:9] × B[8:0]) × 2^9) > (2^43 - 1), then OVFL_CARRYOUT = 1; if C + D + CARRYIN +/- ((A[8:0] × B[17:9] + A[17:9] × B[8:0]) × 2^9) < -(2^43), then OVFL_CARRYOUT = 1; else OVFL_CARRYOUT = 0.
Table 5-3 • Mathblock Pin Descriptions (continued)
Pin Name | Direction | Type | Polarity | Description
OVFL_CARRYOUT_SEL | Input | Static | High | Input to the adder for generating the overflow bit or an external bit, which finally comes as an output on the OVFL_CARRYOUT port. The overflow bit indicates the overflow generated in the addition process. The external bit is generated to extend the adder into the fabric. In this case, P[43], C[43], and D[43] do not represent the sign bit. When OVFL_CARRYOUT_SEL = 1, OVFL_CARRYOUT is the external bit for fabric extension. Otherwise, OVFL_CARRYOUT is the overflow output.
CDOUT[43:0] | Output | | | Cascade output of result P. CDOUT is the same as P. It is used to drive CDIN of another mathblock.
P_SRST_N | Input | Dynamic | Low | Synchronous reset input for P and OVFL_CARRYOUT control registers. P_SRST_N is for OVFL_CARRYOUT and P[43:0]. When not registered, connect P_SRST_N to logic 1.
P_EN[1:0] | Input | Dynamic | High | Enable input for P and OVFL_CARRYOUT control registers. P_EN[1] is for OVFL_CARRYOUT and P[43:18]. P_EN[0] is for P[17:0]. When not registered, connect P_EN[1:0] to logic 1. In Normal mode, ensure P_EN[1] = P_EN[0].
P_BYPASS | Input | Dynamic | Low | Output Port P register select
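The arithmetic described by the pin descriptions in Table 5-3 can be summarized as a behavioral sketch. The VHDL below is an illustrative simulation model only, not a vendor primitive. The entity name mathblock_model, the port p_fb (standing in for the current value of the P register), and the treatment of the 9-bit operand halves as signed values in DOTP mode are assumptions. Registers, bypass controls, and OVFL_CARRYOUT are omitted.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical behavioral model of the mathblock datapath (combinational only).
entity mathblock_model is
  port (
    a, b     : in  signed(17 downto 0);
    c        : in  signed(43 downto 0);
    cdin     : in  signed(43 downto 0);  -- cascade input from another mathblock
    p_fb     : in  signed(43 downto 0);  -- assumed: current value of the P register
    carryin  : in  std_logic;
    dotp     : in  std_logic;
    sub      : in  std_logic;
    arshft17 : in  std_logic;
    cdsel    : in  std_logic;
    fdbksel  : in  std_logic;
    p        : out signed(43 downto 0)
  );
end mathblock_model;

architecture behav of mathblock_model is
  signal prod  : signed(43 downto 0);
  signal d_sel : signed(43 downto 0);
  signal d     : signed(43 downto 0);
  signal cin   : signed(43 downto 0);
begin
  -- Multiplier: normal 18 x 18 mode or dot-product mode.
  mult : process (a, b, dotp)
    variable dp : signed(18 downto 0);
  begin
    if dotp = '1' then
      dp := resize(a(8 downto 0) * b(17 downto 9), 19)
          + resize(a(17 downto 9) * b(8 downto 0), 19);
      prod <= shift_left(resize(dp, 44), 9);  -- multiply by 2^9
    else
      prod <= resize(a * b, 44);
    end if;
  end process mult;

  -- Operand D: CDIN when CDSEL = 1, otherwise P feedback or logic 0 per FDBKSEL.
  d_sel <= cdin when cdsel = '1' else
           p_fb when fdbksel = '1' else
           (others => '0');

  -- Optional 17-bit arithmetic right shift of operand D.
  d <= shift_right(d_sel, 17) when arshft17 = '1' else d_sel;

  -- CARRYIN is added at the least-significant bit position.
  cin <= (0 => carryin, others => '0');

  -- P = C + D + CARRYIN +/- product, selected by SUB.
  p <= c + d + cin - prod when sub = '1' else
       c + d + cin + prod;
end behav;
```

A model like this can be useful in a testbench for checking expected P values before mapping a design onto real mathblocks.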
Mathblock Use Models
This section describes a few use models for RTG4 mathblocks.
Use Model 1: Non-Pipelined Implementation of the 35 × 35 Multiplier
35 × 35 multipliers are useful for applications that require more than 18-bit precision. A non-pipelined
implementation is typically used for low-speed applications. A 35 × 35 multiplier can be constructed using
four mathblocks in a single row, connected in a cascade. Figure 5-5 shows a typical implementation of a
non-pipelined 35 × 35 multiplier.
The inputs are assumed to be A[34:0] and B[34:0] with a product of P[69:0].
Figure 5-5 • Non-Pipelined 35 × 35 Multiplier
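The decomposition behind a four-mathblock 35 × 35 multiplier can be sketched in VHDL. This is an arithmetic illustration under stated assumptions, not the exact wiring of Figure 5-5: the entity name mult35x35_split and the internal signal names are hypothetical. Each operand is split into a signed upper 18-bit part and an unsigned lower 17-bit part (zero-extended to 18 bits), so that every partial product fits an 18 × 18 signed multiplier, and the partial products are combined with shifts of 17 and 34 bits.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical sketch: 35 x 35 signed multiply built from four 18 x 18 products.
entity mult35x35_split is
  port (
    a : in  signed(34 downto 0);
    b : in  signed(34 downto 0);
    p : out signed(69 downto 0)
  );
end mult35x35_split;

architecture behav of mult35x35_split is
  signal ah, bh         : signed(17 downto 0);  -- signed upper parts, bits 34..17
  signal al, bl         : signed(17 downto 0);  -- lower 17 bits, zero-extended
  signal hh, hl, lh, ll : signed(35 downto 0);  -- partial products
begin
  ah <= a(34 downto 17);
  bh <= b(34 downto 17);
  al <= '0' & a(16 downto 0);
  bl <= '0' & b(16 downto 0);

  hh <= ah * bh;
  hl <= ah * bl;
  lh <= al * bh;
  ll <= al * bl;

  -- a*b = hh*2^34 + (hl + lh)*2^17 + ll
  p <= shift_left(resize(hh, 70), 34)
     + shift_left(resize(hl, 70) + resize(lh, 70), 17)
     + resize(ll, 70);
end behav;
```

The 17-bit weighting between partial products is what the ARSHFT17 control supports when the partial sums are passed along a cascade chain.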
Use Model 2: Pipelined Implementation of the 35 × 35 Multiplier
RTG4 mathblocks have built-in registers on all input and output ports. To implement high-speed
multipliers, extra registers are added to the input or output side of the mathblocks to balance the pipeline
latency. These extra registers are implemented in the fabric.
Figure 5-6 shows a typical 35 × 35 multiplier implementation with fabric pipeline registers.
Figure 5-6 • Pipeline 35 × 35 Multiplier
Use Model 3: Implementation of 9-Bit Complex Multiplication
Complex multiplication implemented using a mathblock in DOTP mode requires additional 2's
complement logic in the fabric for negating the Q input. The DOTP implementation in Figure 5-7 shows
an optimized way of implementing the 2's complement with minimal logic in the fabric.
For two complex numbers X + jY and P + jQ, the complex multiplication is shown in EQ 2:
Multiplication Result = Real Part + j (Imaginary Part) = (PX - QY) + j (PY + QX)
EQ 2
In EQ 2, the real part (PX - QY) requires -Q. Because -Q = ~Q + 1, the term -QY can be computed by
multiplying the one's complement of Q by Y and then adding Y through the C input:
(~Q)*Y + Y = (~Q + 1)*Y = -Q*Y.
Imaginary part = P*Y + Q*X
EQ 3
Real part = P*X + (~Q)*Y + Y
EQ 4
Figure 5-7 shows the implementation of 9 × 9 complex multiplication using a mathblock configured in
DOTP mode.
Figure 5-7 • 9-Bit Complex Multiplication Using DOTP Mode
(Diagram labels: the imaginary-part mathblock, in Dot Product Mode, has AL = Y, BH = P, BL = Q, AH = X, and C = zeroes, producing PY + QX. The real-part mathblock, in Dot Product Mode, has AL = Y, BH = 1's complement of Q, BL = P, AH = X, and C carrying Y with the remaining bits zero, producing PX - QY.)
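EQ 3 and EQ 4 can be checked with a short behavioral sketch. The VHDL below is a minimal illustration, assuming 9-bit signed operands; the entity name cmult9 and all port names are hypothetical. It states only the arithmetic and makes no claim about whether a synthesis tool maps it onto mathblocks in DOTP mode.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical sketch of (X + jY) * (P + jQ) with 9-bit signed operands.
entity cmult9 is
  port (
    x, y : in  signed(8 downto 0);
    p, q : in  signed(8 downto 0);
    re   : out signed(18 downto 0);  -- PX - QY
    im   : out signed(18 downto 0)   -- PY + QX
  );
end cmult9;

architecture behav of cmult9 is
  signal q_n : signed(8 downto 0);
begin
  -- One's complement of Q; the missing +1 is restored by adding Y (EQ 4).
  q_n <= not q;

  -- EQ 3: imaginary part = P*Y + Q*X
  im <= resize(p * y, 19) + resize(q * x, 19);

  -- EQ 4: real part = P*X + (~Q)*Y + Y
  re <= resize(p * x, 19) + resize(q_n * y, 19) + resize(y, 19);
end behav;
```

Only an inverter on Q is needed in the fabric because the remaining Y term enters through the adder, which is the role of the C input in Figure 5-7.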
Use Model 4: Multi-Threading and Multi-Channeling
Mathblocks support a multi-threading option where the same mathblock can be used for performing more
than one computation by time multiplexing. Time multiplexing can be done easily for designs with low
sample rates.
The multi-threading capability, if implemented for a chain of mathblocks, is called multi-channeling.
Multi-channeling can be used to implement multi-channel FIR filters, where the same mathblock chain
processes multiple input channels by time multiplexing. Multi-channel filtering
is used in applications such as wireless communications, image processing, and multimedia
applications. The mathblock uses its C input for multi-threading and multi-channeling, but fabric registers
are also required for implementation.
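As a rough illustration of time multiplexing, the following VHDL sketch shares one multiply-add between two channels. It is an assumption-laden example rather than a documented implementation: the entity name mac_2ch, the channel-select input ch, and the choice of two channels are hypothetical. Per-channel running sums are held in separate registers (standing in for the fabric registers mentioned above) and the selected sum is fed to the adder, which corresponds to the role of the C input.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical sketch: one multiply-add shared by two time-multiplexed channels.
entity mac_2ch is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;                 -- synchronous reset, active low
    ch   : in  std_logic;                 -- channel of the current sample
    a, b : in  signed(17 downto 0);
    acc0 : out signed(43 downto 0);
    acc1 : out signed(43 downto 0)
  );
end mac_2ch;

architecture behav of mac_2ch is
  signal r0, r1 : signed(43 downto 0) := (others => '0');  -- per-channel sums
  signal c_in   : signed(43 downto 0);
  signal sum    : signed(43 downto 0);
begin
  -- Select the running sum of the active channel as the adder operand.
  c_in <= r0 when ch = '0' else r1;
  sum  <= c_in + resize(a * b, 44);

  process (clk)
  begin
    if rising_edge(clk) then
      if rstn = '0' then
        r0 <= (others => '0');
        r1 <= (others => '0');
      elsif ch = '0' then
        r0 <= sum;
      else
        r1 <= sum;
      end if;
    end if;
  end process;

  acc0 <= r0;
  acc1 <= r1;
end behav;
```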
Use Model 5: Rounding and Trimming
Rounding
Rounding can be computed by adding a fixed term and a variable term to the input value to be rounded,
and then truncating. The fixed term can be fed through the C input of the mathblock, and its value
depends on the number of fractional bits required after rounding. The variable term is always a single bit
in the least-significant position whose value is determined from the input value based on the type of
rounding.
Types of rounding are:
• Round to the adjacent even integer: The variable term is determined from the 2^0 bit (the least-significant integer bit) of the input value.
• Round towards zero: The variable term is determined from the sign bit of the input value. For example, 1.5 rounds to 1 and -1.5 rounds to -1.
Table 5-4 shows examples for 6-bit values including three fraction bits.
Table 5-4 • Rounding Examples
Input Value (Decimal) | Input Value (Binary) | Fixed Term (C-Input) | Round To Even: Variable Term | Round To Even: Sum | Round To Even: Truncated Sum | Round To Even: Decimal | Round Toward Zero: Variable Term | Round Toward Zero: Sum | Round Toward Zero: Truncated Sum | Round Toward Zero: Decimal
2.5 | 010.100 | 0.011 | 000.000 | 010.111 | 010 | 2 | 000.000 | 010.111 | 010 | 2
1.5 | 001.100 | 0.011 | 000.001 | 010.000 | 010 | 2 | 000.000 | 001.111 | 001 | 1
-1.5 | 110.100 | 0.011 | 000.000 | 110.111 | 110 | -2 | 000.001 | 111.000 | 111 | -1
-2.5 | 101.100 | 0.011 | 000.001 | 110.000 | 110 | -2 | 000.001 | 110.000 | 110 | -2
Figure 5-8 • Rounding Using C-Input and CARRYIN
(Diagram labels: inputs A and B, the fixed term on the C input, the variable term on CARRYIN, and output P.)
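The fixed-term and variable-term scheme can be reproduced for the 6-bit values of Table 5-4 with a small behavioral sketch. The VHDL below is illustrative only: the entity name round_demo and its ports are hypothetical, and the 6-bit adder is an assumption chosen to match the table. In the mathblock, the fixed term would enter through the C input and the variable term through CARRYIN.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical sketch: rounding a 6-bit value with 3 fraction bits to an integer.
entity round_demo is
  port (
    din     : in  signed(5 downto 0);  -- 3 integer bits, 3 fraction bits
    to_even : out signed(2 downto 0);
    to_zero : out signed(2 downto 0)
  );
end round_demo;

architecture behav of round_demo is
  constant FIXED_TERM : signed(5 downto 0) := "000011";  -- 0.011
  signal var_even, var_zero : signed(5 downto 0);
  signal sum_even, sum_zero : signed(5 downto 0);
begin
  -- Variable term: a single bit in the least-significant position.
  var_even <= (0 => din(3), others => '0');  -- 2^0 bit of the input
  var_zero <= (0 => din(5), others => '0');  -- sign bit of the input

  sum_even <= din + FIXED_TERM + var_even;
  sum_zero <= din + FIXED_TERM + var_zero;

  -- Truncate the three fraction bits.
  to_even <= sum_even(5 downto 3);
  to_zero <= sum_zero(5 downto 3);
end behav;
```

For din = "001100" (1.5), this gives to_even = "010" and to_zero = "001", matching the 1.5 row of Table 5-4. Inputs near the top of the 6-bit range can wrap in this narrow sketch.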
Trimming
Trimming of the Final Sum: Applications such as IIR and FFT often require rounding and trimming
of the final result, for example, the last output of a cascade chain or the final value read from an
accumulator. The rounding terms can be added as shown in Figure 5-9, and the final result
can be trimmed in the fabric.
Figure 5-9 • Rounding and Trimming of the Final Sum
(Diagram labels: variable term, fixed term, multiplier inputs A and B for each stage, and output P.)
Trimming of Grouped Sums: When computing very large dot products (for example, a large, fully-enumerated FIR), it is good practice to avoid overflow by breaking the sum into a few groups, trimming the sum
for each group, and only then combining the sums of the groups into a final result. The rounding of each
group's sum can be done as shown in Figure 5-9. The trimming of each group's sum and the summation of
the final result can be done in the fabric. Trimming can be done between the output of each cascade and
the final fabric adder.
Trimming of Products: Figure 5-10 shows the implementation of rounding all products towards zero and
then trimming the least significant m bits of the product. As long as there are no additive terms other than
the products, it is possible to equivalently trim the partial sums instead of the products. Rounding towards
zero uses the sign bit of the product (A*B), which can be derived from the sign bits of the incoming factors A and B
using an XOR.
Figure 5-10 • Rounding and Trimming of the Final Sum
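The product-trimming idea can be sketched as follows. This VHDL is a minimal illustration under stated assumptions, not the circuit of Figure 5-10: the entity name mult_round_trim and the generic M (the number of trimmed bits, assumed to be between 1 and 30) are hypothetical. It follows the rounding scheme of Table 5-4, with a fixed term of 2^(M-1) - 1 and a variable term taken from the XOR of the operand sign bits.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical sketch: round a product toward zero, then trim M LSBs.
entity mult_round_trim is
  generic (M : positive := 4);  -- number of LSBs to trim (assumed 1 to 30)
  port (
    a, b : in  signed(17 downto 0);
    p    : out signed(35 - M downto 0)
  );
end mult_round_trim;

architecture behav of mult_round_trim is
  signal fixed_term : signed(35 downto 0);
  signal var_term   : signed(35 downto 0);
  signal sum        : signed(35 downto 0);
begin
  -- Fixed term: just under half of the trimmed weight.
  fixed_term <= to_signed(2**(M - 1) - 1, 36);

  -- Variable term: predicted sign of the product from the operand signs.
  var_term <= (0 => a(17) xor b(17), others => '0');

  sum <= (a * b) + fixed_term + var_term;

  -- Drop the M least-significant bits.
  p <= sum(35 downto M);
end behav;
```

When one operand is zero and the other is negative, the XOR predicts a negative sign for a zero product. The extra bit is still removed by the trim, so the result remains zero.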
Coding Style Examples
The following code examples illustrate coding styles from which the synthesis tool can infer and
implement RTG4 mathblocks.
Note: Examples provided are only in VHDL. Verilog examples are provided on request.
Example 1: 18 × 18 Signed Multiplication – Non-Registered
The following code is for an 18 × 18-bit signed multiplier. The input and output registers are configured in
Transparent mode. The synthesis tool maps the code into one mathblock.
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

entity sign18x18_mult is
  port(
    in1  : in  signed(17 downto 0);
    in2  : in  signed(17 downto 0);
    out1 : out signed(35 downto 0)
  );
end sign18x18_mult;

architecture behav of sign18x18_mult is
begin
  -- Combinational 18 x 18 signed multiply; no registers are described.
  out1 <= in1 * in2;
end behav;
Example 2: 18 × 18 Signed Multiplication – Registered
The following code is for an 18 × 18 signed multiplier. The inputs and outputs are registered, with an
asynchronous active-low reset signal (rstn). The synthesis tool maps the code into one mathblock.
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

entity sign18x18_mult_reg is
  port(
    clk  : in  std_logic;
    rstn : in  std_logic;
    in1  : in  signed(17 downto 0);
    in2  : in  signed(17 downto 0);
    out1 : out signed(35 downto 0)
  );
end sign18x18_mult_reg;

architecture behav of sign18x18_mult_reg is
  signal in1_reg : signed(17 downto 0);
  signal in2_reg : signed(17 downto 0);
begin
  -- Input and output registers share one clock and one active-low reset.
  process(clk, rstn)
  begin
    if (rstn = '0') then
      in1_reg <= (others => '0');
      in2_reg <= (others => '0');
      out1    <= (others => '0');
    elsif (rising_edge(clk)) then
      in1_reg <= in1;
      in2_reg <= in2;
      out1    <= in1_reg * in2_reg;
    end if;
  end process;
end behav;
Example 3: 17 × 17-Bit Unsigned Multiplier with Different Resets
The following code is for a 17 × 17-bit unsigned multiplier, which has input and output registers with
different asynchronous resets. The synthesis tool maps the code into one RTG4 mathblock.
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity mult_17x17unsign is
  port(
    clk   : in  std_logic;
    rstn1 : in  std_logic;
    rstn2 : in  std_logic;
    in1   : in  std_logic_vector(16 downto 0);
    in2   : in  std_logic_vector(16 downto 0);
    out1  : out std_logic_vector(33 downto 0)
  );
end mult_17x17unsign;

architecture behav of mult_17x17unsign is
  signal in1_reg : std_logic_vector(16 downto 0);
  signal in2_reg : std_logic_vector(16 downto 0);
begin
  -- Input registers, asynchronously reset by rstn1.
  process(clk, rstn1)
  begin
    if (rstn1 = '0') then
      in1_reg <= (others => '0');
      in2_reg <= (others => '0');
    elsif (rising_edge(clk)) then
      in1_reg <= in1;
      in2_reg <= in2;
    end if;
  end process;

  -- Output register, asynchronously reset by rstn2.
  process(clk, rstn2)
  begin
    if (rstn2 = '0') then
      out1 <= (others => '0');
    elsif (rising_edge(clk)) then
      out1 <= in1_reg * in2_reg;
    end if;
  end process;
end behav;
Example 4: 17 × 17-Bit Unsigned Multiplier with Different Clocks
This example shows an unsigned multiplier with inputs and outputs that are registered with different
clocks: clock1 and clock2. In this case, the synthesis tool places only the output registers and the
multiplier into the RTG4 mathblock. The input registers are implemented in the FPGA logic outside the
mathblock.
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity mult_17x17unsign is
  port(
    clk1 : in  std_logic;
    clk2 : in  std_logic;
    in1  : in  std_logic_vector(16 downto 0);
    in2  : in  std_logic_vector(16 downto 0);
    out1 : out std_logic_vector(33 downto 0)
  );
end mult_17x17unsign;

architecture behav of mult_17x17unsign is
  signal in1_reg : std_logic_vector(16 downto 0);
  signal in2_reg : std_logic_vector(16 downto 0);
begin
  -- Input registers, clocked by clk1.
  process(clk1)
  begin
    if (rising_edge(clk1)) then
      in1_reg <= in1;
      in2_reg <= in2;
    end if;
  end process;

  -- Multiplier and output register, clocked by clk2.
  process(clk2)
  begin
    if (rising_edge(clk2)) then
      out1 <= in1_reg * in2_reg;
    end if;
  end process;
end behav;
Example 5: Multiplier-Adder
In the code below, the output of a multiplier is added to another input. Inputs and outputs are registered
and have enables and synchronous resets. The synthesis tool maps the code into one RTG4 mathblock.
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity mult_add is
  port (
    clk  : in  std_logic;
    rst  : in  std_logic;
    en   : in  std_logic;
    in1  : in  std_logic_vector(16 downto 0);
    in2  : in  std_logic_vector(16 downto 0);
    in3  : in  std_logic_vector(33 downto 0);
    out1 : out std_logic_vector(34 downto 0)
  );
end mult_add;

architecture behav of mult_add is
  signal in1_reg, in2_reg : std_logic_vector(16 downto 0);
  signal mult_out         : std_logic_vector(33 downto 0);
begin
  process(clk)
  begin
    if (rising_edge(clk)) then
      if (rst = '0') then
        -- Synchronous reset, active low.
        in1_reg <= (others => '0');
        in2_reg <= (others => '0');
        out1    <= (others => '0');
      elsif (en = '1') then
        in1_reg <= in1;
        in2_reg <= in2;
        -- Operands are zero-extended by one bit to keep the carry.
        out1    <= ('0' & mult_out) + ('0' & in3);
      end if;
    end if;
  end process;

  mult_out <= in1_reg * in2_reg;
end behav;
Example 6: Multiplier-Subtractor
There are two ways to implement multiply-and-subtract logic. The synthesis tool places the logic
differently, depending on how it is implemented.
• Subtract the result of the multiplier from an input value (P = Cin – mult_out). The synthesis tool places all logic in the mathblock.
• Subtract a value from the result of the multiplier (P = mult_out – Cin). The synthesis tool places only the multiplier in the mathblock. The subtractor is implemented in FPGA logic outside the mathblock.
– Unsigned MultSub Example (P = Cin – Mult_out) - Implemented in a single mathblock
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity mult_sub is
  port (
    clk  : in  std_logic;
    rst  : in  std_logic;
    in1  : in  std_logic_vector(16 downto 0);
    in2  : in  std_logic_vector(16 downto 0);
    in3  : in  std_logic_vector(33 downto 0);
    out1 : out std_logic_vector(33 downto 0)
  );
end mult_sub;

architecture behav of mult_sub is
  signal in1_reg, in2_reg : std_logic_vector(16 downto 0);
begin
  process(clk)
  begin
    if (rising_edge(clk)) then
      if (rst = '0') then
        -- Synchronous reset, active low.
        in1_reg <= (others => '0');
        in2_reg <= (others => '0');
        out1    <= (others => '0');
      else
        in1_reg <= in1;
        in2_reg <= in2;
        -- P = Cin - (A x B)
        out1    <= in3 - (in1_reg * in2_reg);
      end if;
    end if;
  end process;
end behav;
– Unsigned MultSub Example (P = Mult - Cin) - Multiplier is implemented in the mathblock and subtractor in FPGA logic
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity mult_sub is
  port (
    clk  : in  std_logic;
    rst  : in  std_logic;
    in1  : in  std_logic_vector(16 downto 0);
    in2  : in  std_logic_vector(16 downto 0);
    in3  : in  std_logic_vector(33 downto 0);
    out1 : out std_logic_vector(33 downto 0)
  );
end mult_sub;

architecture behav of mult_sub is
  signal in1_reg, in2_reg : std_logic_vector(16 downto 0);
begin
  process(clk)
  begin
    if (rising_edge(clk)) then
      if (rst = '0') then
        -- Synchronous reset, active low.
        in1_reg <= (others => '0');
        in2_reg <= (others => '0');
        out1    <= (others => '0');
      else
        in1_reg <= in1;
        in2_reg <= in2;
        -- P = (A x B) - Cin
        out1    <= (in1_reg * in2_reg) - in3;
      end if;
    end if;
  end process;
end behav;
Example 7: Signed 35 × 35 Multiplication
The following code implements a signed 35 × 35 multiplication function. The synthesis tool uses four
cascaded mathblocks to implement this multiplication function.
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

entity sign35x35_mult is
  port (
    in1  : in  signed(34 downto 0);
    in2  : in  signed(34 downto 0);
    out1 : out signed(69 downto 0)
  );
end sign35x35_mult;

architecture behav of sign35x35_mult is
begin
  out1 <= in1 * in2;
end behav;
Example 8: Signed 35 × 35 Multiplication with Two Pipelined Register Stages
The following code implements a signed 35 × 35 multiplication function with two pipelined register
stages. The synthesis tool uses four cascaded mathblocks to implement this multiplication function. The
synthesis tool first infers pipeline registers at the input and output of the RTG4 mathblock and controls
pipeline latency by balancing the number of register stages. To balance the stages, the tool adds
registers at the input or output of the mathblock as required; these are implemented in the fabric logic.
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

entity sign35x35_mult is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;
    in1  : in  signed(34 downto 0);
    in2  : in  signed(34 downto 0);
    out1 : out signed(69 downto 0)
  );
end sign35x35_mult;

architecture behav of sign35x35_mult is
  signal in1_reg : signed(34 downto 0);
  signal in2_reg : signed(34 downto 0);
begin
  -- Two register stages: inputs, then the product.
  process(rstn, clk)
  begin
    if (rstn = '0') then
      in1_reg <= (others => '0');
      in2_reg <= (others => '0');
      out1    <= (others => '0');
    elsif (rising_edge(clk)) then
      in1_reg <= in1;
      in2_reg <= in2;
      out1    <= in1_reg * in2_reg;
    end if;
  end process;
end behav;
6 – I/Os
Introduction
RTG4 FPGA devices have different types of I/Os, such as MSIO and MSIOD, double data rate I/Os
(DDRIO), and dedicated I/Os based on functional usage. For more information on I/O naming
conventions and I/O description, refer to the RTG4 FPGA Pin Description.
The MSIO, MSIOD, and DDRIO provide programmable I/O features such as drive strength, slew rate,
input delay, weak pull-up, and weak pull-down for several voltages. The programmable I/O features are
explained in detail in the "I/O Programmable Features" section on page 91.
The DDRIO is an MSIO optimized for LPDDR/DDR2/DDR3 performance. In RTG4 devices, there is a
DDR subsystem that is used to control external DDR memory, called FDDR. DDRIOs can be connected
to the respective DDR subsystem PHYs or can be used as user I/Os. For more information on DDR
subsystem, refer to RTG4 High Speed DDR Interfaces User Guide.
The MSIO, MSIOD, and DDRIO can be configured as fabric I/Os, whereas dedicated I/Os can be used
for a single purpose such as serializer/deserializer (SERDES), device reset, and clock functions.
Dedicated I/Os cannot be used by any other circuits.
The MSIO, MSIOD, and DDRIO are configured at power-up by means of fabric-related flash bits, which
are used to initialize register blocks. This is automatically done using Libero SoC.
Functional Description
The RTG4 I/Os are classified into the following three categories depending on their functional usage:
• MSIO, MSIOD, and DDRIO
• JTAG I/O
• Dedicated I/Os
MSIO, MSIOD, and DDRIO
Figure 6-1 on page 84 shows the top-level view of I/O interconnection between fabric logic and FDDR.
The DDRIOs are shared between the fabric logic and FDDR. When the FDDR controller is used, the Libero
SoC software automatically assigns and configures the FDDR controller signals to the respective DDRIOs.
The SPIO_SEL signal (as shown in Figure 6-1 on page 84) determines whether the fabric logic or the FDDR
peripheral is connected to the corresponding I/O. This selection is set automatically by the Libero SoC software
during programming. When the FDDR controller is not used, the respective DDRIOs are available to fabric
logic as shown in Figure 6-1 on page 84.
In the case of MSIO and MSIOD, the I/O is directly connected to fabric logic. For fabric logic, each I/O port of
the design must be individually assigned to I/Os in the Libero SoC software.
Figure 6-1 • I/O Interconnection
An I/O consists of a highly featured bidirectional I/O buffer. The I/O is divided into two main sections, as
shown in Figure 6-1:
• Digital: IOD (fabric and FDDR)
• Analog: IOA
The digital (IOD) section generates output enable (OE), data out (DO), and data in (DIN) signals for both
P and N. Refer to the "Fabric Architecture" chapter on page 7 for more details on IOD.
Each pair of analog (IOA) blocks forms a differential pair as shown in Figure 6-2 on page 86. The
differential pair is used to support differential and pseudo-differential modes of operation. The differential
pair is composed of a true and a complement IOA. The true IOA is called P (with positive polarity relative
to the DO/DIN data signals of the P cell). The complement IOA is called ION (with negative polarity
relative to the DO/DIN data signals of the N cell).
The IOA blocks form a ring around the periphery of the device (excluding the SERDES channel edge).
On the top and bottom edges of the device, the IOA order starts with P on the left and N on the right. The left and
right edges use N on the top and P on the bottom. There is one IOD for each pair of IOAs.
To support a variety of differential standards, the RTG4 uses pairs of regular I/O cells: P
and N. These two I/O cells of MSIO, MSIOD, and DDRIO can be configured as separate single-ended I/Os
or as one differential I/O pair. In differential output mode, the output data signal is driven out on
both the P cell and the N cell as a differential pair, where the true signal is on the P pad and the complement
signal is on the N pad.
The P and N output signals are complementary, as required by the DDR1/DDR2/DDR3 standards for
CK and DQS signals. The P and N cells have to be laid out next to each other, as a pair, to
minimize the skew between the two output signals of the differential pair.
The analog block (IOA) section has transmitter and receiver buffers for the P and N pair. The main
circuits in the IOA are the transmit and receive buffers (as shown in Figure 6-2 on page 86), which support
various I/O standards. The IOA contains the following modules:
• Transmit Buffer
• Receive Buffer
• Input Programming Delay
• On-Die Termination
Transmit Buffer
Transmit and receive buffers transfer signals between the FPGA fabric and the IOA and also transfer
signals between FDDR and the IOA.
OE_P and OE_N control the direction of I/O buffers, as shown in Figure 6-2 on page 86. When an I/O is
operated as a single-ended I/O, OE_P and OE_N individually control the P and N I/O buffers. When an
I/O is operated as a differential I/O, OE_P controls both the P and N I/O buffers.
The dynamic OE disables or enables an output buffer for all the standards.
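At the RTL level, a dynamically enabled output buffer is commonly described with a tristate assignment. The VHDL below is a generic sketch of that coding pattern, not a description of the RTG4 I/O macros: the entity name bidir_pad, its port names, and the active-high sense of oe are assumptions, and how a synthesis tool maps the description onto the IOD and IOA is not claimed here.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical sketch of a single-ended bidirectional pad with a dynamic OE.
entity bidir_pad is
  port (
    oe       : in    std_logic;  -- output enable, assumed active high
    data_out : in    std_logic;  -- value driven onto the pad when enabled
    data_in  : out   std_logic;  -- value received from the pad
    pad      : inout std_logic
  );
end bidir_pad;

architecture behav of bidir_pad is
begin
  -- Drive the pad only while OE is asserted; otherwise release it.
  pad <= data_out when oe = '1' else 'Z';

  -- The receive path always observes the pad.
  data_in <= pad;
end behav;
```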
Receive Buffer
The enabling and disabling of the input buffer is controlled automatically by the Libero SoC software.
The I/O receiver can be made to operate in four different modes, as shown in Figure 6-2 on page 86.
These modes are selected based on flash configuration bits, which are configured during programming,
after power-on. Following are the four modes of the receiver:
• True differential
• Pseudo-differential
• Single-ended
• Schmitt trigger
In True differential mode, the P and N pad inputs are fed to the comparator, whereas in Pseudo-differential
mode, each pad input is compared with an external reference voltage. Figure 6-2 on page 86
shows the detailed IOA structure of an I/O.
The I/O input can be configured as a Schmitt trigger receiver or a single-ended receiver. When Schmitt
trigger inputs are selected, the input buffers present hysteresis, which filters the noise at the receiver and
prevents double glitching caused by noisy input edges.
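The effect of hysteresis can be illustrated with a simulation-only model. The VHDL below is a conceptual sketch and is not synthesizable. The entity name schmitt_model and the threshold generics VT_HI and VT_LO are hypothetical, and the default values are placeholders rather than RTG4 specifications.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical simulation-only model of a receiver with hysteresis.
entity schmitt_model is
  generic (
    VT_HI : real := 1.9;  -- placeholder rising threshold, in volts
    VT_LO : real := 1.3   -- placeholder falling threshold, in volts
  );
  port (
    vin  : in  real;      -- pad voltage
    dout : out std_logic
  );
end schmitt_model;

architecture sim of schmitt_model is
  signal state : std_logic := '0';
begin
  process (vin)
  begin
    if vin > VT_HI then
      state <= '1';
    elsif vin < VT_LO then
      state <= '0';
    end if;
    -- Between the thresholds the previous state is held, so noise
    -- on a slow edge does not cause repeated output transitions.
  end process;

  dout <= state;
end sim;
```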
Input Programming Delay
Input delays can be used for hold time improvement of the input register by increasing input pin to input
register delay. Refer to "I/O Programmable Features" section on page 91 for more information.
On-Die Termination
The On-die termination (ODT) improves the signaling environment by reducing the electrical
discontinuities introduced with off-die termination and hence enables reliable operation at higher
signaling rates.
For more information on the programmed ODT values for DDRIO, MSIO, and MSIOD, refer to the "I/O
Programmable Features" section on page 91.
Figure 6-2 • IOA Architecture
Radiation Hardening
Radiation hardening is the act of making systems resistant to damage or malfunctions caused by
ionizing radiation (such as particle radiation and high-energy electromagnetic radiation, which are
encountered in space, high-altitude flight, and so on). In this document, the term hardened means
radiation hardened.
RTG4 devices have hardened input buffers for receiving clock inputs or other critical signals. There are
24 primary clock inputs on an RTG4 device. The hardened capability is only available for MSIO and
MSIOD receivers. The DDRIO receivers are not hardened, which means they are susceptible to
radiation.
The RTG4 hardened receiver uses TMR logic (that is, each receiver block is composed of three receivers
with a wire-OR at the output). Each RTG4 hardened receiver in MSIO and MSIOD supports the following
modes of operation:
1. Single-ended ratio receiver mode (LVTTL/LVCMOS) with programmable ON/OFF
2. Reference receiver mode (SSTL/HSTL)
3. Differential input mode (LVDS/RSDS)
In RTG4 devices, hardening applies only to the input buffer, so an I/O configured in bidirectional mode
is not hardened. The hardened input has programmable on-die termination (ON/OFF) and a programmable
weak pull-up/pull-down (ON/OFF) per pad.
I/O Banks
I/Os are grouped in banks on the basis of I/O voltage standard. Each I/O bank has dedicated I/O supply
and ground voltages, and only standards compatible with the voltage supplied to the bank can be used.
Every I/O bank has input and output
buffers to support a wide range of standards, which require different VCC voltages and, for voltage-referenced
standards, reference voltages (VREF). These voltages are externally supplied and connected to
device pins, which serve banks (groups) of I/Os.
This section describes the RT4G150 device in the CG1657 package. There are 10 banks in the
RT4G150 device, as shown in Figure 6-3: three MSIO banks, four MSIOD banks, two
DDRIO banks, and one bank that serves JTAG (see Table 6-1). The maximum number of available I/Os is shown in
parentheses in Figure 6-3. For more information on RTG4 FPGA pin descriptions, supply pins,
unused conditions, and packaging details, refer to the RTG4 FPGA Pin Description.
Figure 6-3 • RT4G-CG1657 I/O Bank Location and Naming
The MSIOs, MSIODs, and DDRIOs are divided into banks, each of which may be configured to support
one of the standards listed in Table 6-2 on page 88.
Table 6-1 shows the organization of I/O banks in the RTG4 devices.
Table 6-1 • The Organization of I/O Banks in RTG4 Devices
I/O Banks | RT4G150CG1657
Bank 7 | MSIOD: Fabric
Bank 8 | MSIOD: Fabric
Bank 9 | DDRIO: FDDR or fabric
Bank 4 | MSIO: Fabric
Bank 5 | MSIO: Fabric
Bank 6 | MSIO: Fabric
Bank 1 | MSIOD: Fabric
Bank 2 | MSIOD: Fabric
Bank 0 | DDRIO: FDDR or fabric
Bank 3 | MSIO: JTAG
Supported I/O Standards
Table 6-2 shows the supported voltage standards for various I/O types.
Table 6-2 • Supported Voltage Standards
I/O Standards | I/O Type: MSIO | I/O Type: MSIOD | I/O Type: DDRIO
Single-Ended Standard
LVTTL 3.3 V | Yes | – | –
LVCMOS 3.3 V | Yes | – | –
PCI | Yes | – | –
LVCMOS 12 | Yes | Yes | Yes
LVCMOS 15 | Yes | Yes | Yes
LVCMOS 18 | Yes | Yes | Yes
LVCMOS 25 | Yes | Yes | Yes
Voltage-Referenced Standard
SSTL2I | Yes | Yes | Yes (DDR1)
SSTL2II | Yes | Yes | Yes (DDR1)
SSTL18I | Yes | Yes | Yes (DDR2)
SSTL18II | Yes | Yes | Yes (DDR2)
SSTL15I | – | – | Yes (DDR3) Only for I/Os used by FDDR
SSTL15II | – | – | Yes (DDR3) Only for I/Os used by FDDR
HSTL18I | Yes | Yes | Yes
HSTL18II | Yes | Yes | Yes
HSTLI | Yes | Yes | Yes
HSTLII | – | – | Yes
LPDDRI | – | – | Yes
LPDDRII | – | – | Yes
Differential Standard
LVPECL | Yes | – | –
LVDS 33 | Yes | – | –
LVDS | Yes | Yes | –
RSDS | Yes | Yes | –
MINILVDS | Yes | Yes | –
BUSLVDS | Yes | Yes (Only Input) | –
MLVDS | Yes | Yes (Only Input) | –
For I/O pin naming and assignments to specific banks, refer to the RTG4 Pin Descriptions document.
Single-Ended Standards
Single-ended I/O standards use a push-pull CMOS output stage with a voltage referenced to the system
ground. The input buffer configuration, output drive, and I/O supply voltage (VCCI) vary among the I/O
standards. The advantage of these standards is that a common ground can be used for multiple I/Os.
This simplifies board layout and reduces system cost. The reduced slew rate of these I/O standards
causes less electromagnetic interference (EMI) on the board. However, these I/Os are not suitable for
high frequency (>200 MHz) switching due to noise and high power consumption.
Low Voltage TTL (LVTTL)
This is a general purpose standard (EIA/JESD8-B) for 3.3 V applications. It uses an LVTTL input buffer
and a push-pull output buffer. The LVTTL output buffer can have up to eight different programmable drive
strengths.
Low Voltage CMOS (LVCMOS)
RTG4 devices provide five different kinds of LVCMOS:
• LVCMOS 3.3 V
• LVCMOS 2.5 V
• LVCMOS 1.8 V
• LVCMOS 1.5 V
• LVCMOS 1.2 V
LVCMOS 3.3 V (only in MSIO) is an extension of the LVCMOS standard (JESD8-B-compliant) used for
general purpose 3.3 V applications. LVCMOS 2.5 V is an extension of the LVCMOS standard
(JESD8-5-compliant) used for general purpose 2.5 V applications.
LVCMOS 1.8 V is an extension of the LVCMOS standard (JESD8-7-compliant) used for general purpose
1.8 V applications. LVCMOS 1.5 V is an extension of the LVCMOS standard (JESD8-11-compliant)
used for general purpose 1.5 V applications.
The VCCI values for these standards are 3.3 V, 2.5 V, 1.8 V, 1.5 V, and 1.2 V, respectively. For MSIOs, all
these versions use a 3.3 V-tolerant CMOS input buffer and a push-pull output buffer. Similar to LVTTL,
the output buffer has up to eight different programmable drive strengths.
3.3 V Peripheral Component Interconnect (PCI)
This standard specifies support for both 33 MHz and 66 MHz PCI bus applications. It uses an LVTTL
input buffer and a push-pull output buffer. With the aid of an external resistor, this I/O standard can be
5 V-compliant.
Voltage-Referenced Standards
I/Os using these standards are referenced to an external reference voltage (VREF).
High-Speed Transceiver Logic (HSTL) Class I
These are general purpose, high-speed 1.5 V bus standards (EIA/JESD8-6) for signaling between
integrated circuits. The signaling range is 0 V to 1.5 V, and signals can be either single-ended or
differential. HSTL requires a differential amplifier input buffer and a push-pull output buffer. These
standards are used in the memory bus interface with data switching capability of up to 400 MHz. The
other advantages of these standards are low power and fewer EMI concerns. HSTL has four classes, of
which RTG4 devices support Class I. The reference voltage (VREF) is 0.75 V.
Stub Series Terminated Logic 2.5 V (SSTL2) Class I and II
These are general purpose 2.5 V memory bus standards (JESD8-9) for driving transmission lines,
designed specifically for driving the DDR SDRAM modules used in computer memory. The SSTL2
requires a differential amplifier input buffer and a push-pull output buffer. The reference voltage (VREF) is
1.25 V.
Stub Series Terminated Logic 1.8 V (SSTL18) Class I and II
These are general purpose 1.8 V memory bus standards (JESD8-15) for driving transmission lines,
designed specifically for driving the DDR2 SDRAM modules used in computer memory. SSTL18 requires
a differential amplifier input buffer and a push-pull output buffer. The VREF is 0.9 V.
Differential Standards
These standards require two I/Os per signal (called a signal pair). Logic values are determined by the
potential difference between the lines, not with respect to ground. This is why differential drivers and
receivers have much better noise immunity than single-ended standards. The differential interface
standards offer higher performance and lower power consumption than their single-ended counterparts.
Two I/O pins are used for each data transfer channel. Differential standards require resistor termination
on both I/Os.
Low Voltage Positive Emitter Coupled Logic
Low voltage positive emitter coupled logic (LVPECL) requires that one data bit is carried through two
signal lines; therefore, two pins are needed per input or output. It also requires external resistor
termination. The voltage swing between the two signal lines is approximately 850 mV. When the power
supply is +3.3 V, it is commonly referred to as LVPECL.
Low Voltage Differential Signaling
Low voltage differential signaling (LVDS) is a differential I/O standard. As with all differential signaling
standards, LVDS requires that one data bit is carried through two signal lines, and it has inherent noise
immunity over single-ended I/O standards. The voltage swing between two signal lines is approximately
350 mV. The external VREF or board termination voltage (VTT) is not required. LVDS requires the use of
two pins per input or output.
Reduced Swing Differential Signaling
Reduced swing differential signaling (RSDS) is a signaling standard that defines the output
characteristics of a transmitter and inputs of a receiver along with the protocol for a chip-to-chip interface
between flat-panel timing controllers and column drivers.
B-LVDS/M-LVDS
Bus LVDS (B-LVDS) refers to bus interface circuits based on LVDS technology. Multipoint LVDS
(M-LVDS) specifications extend the LVDS standard to high-performance multipoint bus applications.
Multidrop and multipoint bus configurations may contain any combination of drivers, receivers, and
transceivers. The LVDS drivers provide the higher drive current required by B-LVDS and M-LVDS to
accommodate the bus loading.
The driver requires series terminations for better signal quality and to control the voltage swing.
Termination is also required at both ends of the bus, since the driver can be located anywhere on the
bus. The RTG4 MSIOD has an internal circuit isolation, and the bus isolation must be implemented in the
design external to the FPGA when using M-LVDS.
Mini-LVDS
Mini-LVDS is a serial, intra-flat-panel interface between the timing control function and an
LCD source driver.
I/O Programmable Features
RTG4 devices support different I/O programmable features for MSIO, MSIOD, and DDRIO. Each I/O pair
(P and N) supports the following programmable features:
• Programmable Input Delay
• Pre-Emphasis
• Programmable Slew Rate Control
• Programmable Output Drive Strength
• Programmable Weak Pull-Up/Pull-Down
• Programmable Schmitt Trigger Input and Receiver
• Configurable ODT and Driver Impedance
These features can be configured using Libero SoC or in a PDC file. Refer to the Libero SoC User Guide
for more details.
Table 6-3 lists all the features supported for single-ended and differential I/Os.
Table 6-3 • RTG4 I/O Features
I/Os
I/O Features
MSIO
MSIOD
DDRIO
Programmable drive strength
Yes
Yes
Yes
Programmable weak pull-up and pull-down
Yes
Yes
Yes
Configurable ODT
Yes
Yes
Yes
–
–
–
Yes
–
–
Pre-emphasis capability
–
Yes
-
Programmable slew rate
–
–
Yes
5 V tolerant with minimal use of external circuitry
Yes
Yes
–
Schmitt receiver
Yes
Yes
Yes
Programmable input delay
Yes
Yes
Yes
Programmable weak pull-up and pull-down
Yes
Yes
Yes
Configurable ODT
Yes
Yes
Yes
–
–
Yes
100 Ω differential ODT
Yes
Yes
–
Schmitt receiver
Yes
Yes
Yes
Programmable input delay
Yes
Yes
Yes
–
–
Yes
Single-Ended Transmitter
Hot insertion capable
LVTTL/LVCMOS 3.3 V outputs compatible with external 5 V TTL inputs
Single-Ended Receiver
Differential Transmitter
Programmable slew rate
Differential Receiver
Programmable Slew rate
Programmable Input Delay
Each I/O, when configured as an input, can be programmed with different input delays. The input delay is
calculated using:
Delay = D + N × 0.1 ns
EQ 1
where,
N ranges from 0 to 63.
D is the intrinsic delay or circuit delay of an input without additional delay, when N is 0. The total delay
ranges from D ns to D + 6.3 ns. The intrinsic delay varies depending on SLOW (SS), MEDIUM (TT),
and FAST (FF) slew rates.
Hence, there are 64 input delay values, which can be selected and configured using the I/O Constraints
Editor of Libero SoC for MSIO, MSIOD, and DDRIO.
Note: Input delays could be used for hold time improvement for the input register by increasing input pin
to input register delay.
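As a quick illustration of EQ 1, the following Python sketch computes the total input delay for a tap
setting N and finds the smallest N that adds a requested amount of delay. The function names and the
intrinsic delay value in the usage lines are hypothetical; the real D depends on the I/O type and
operating conditions and should be taken from the datasheet.

```python
# Sketch of EQ 1: Delay = D + N * 0.1 ns, with N in 0..63.
# Names and the example intrinsic delay are illustrative assumptions,
# not values from the RTG4 datasheet.

STEP_PS = 100   # delay added per tap (0.1 ns), from EQ 1
MAX_TAP = 63    # highest valid N, from the text


def input_delay_ns(intrinsic_ns: float, n: int) -> float:
    """Return the total input delay in ns for tap setting n."""
    if not 0 <= n <= MAX_TAP:
        raise ValueError(f"N must be in 0..{MAX_TAP}, got {n}")
    return intrinsic_ns + n * STEP_PS / 1000.0


def taps_for_extra_delay(extra_ns: float) -> int:
    """Return the smallest N that adds at least extra_ns of delay."""
    extra_ps = round(extra_ns * 1000)      # work in integer picoseconds
    n = -(-extra_ps // STEP_PS)            # ceiling division
    if not 0 <= n <= MAX_TAP:
        raise ValueError(f"{extra_ns} ns is outside the 0 to 6.3 ns range")
    return n


if __name__ == "__main__":
    d_ns = 1.0  # placeholder intrinsic delay (assumption)
    n = taps_for_extra_delay(0.25)
    print(f"N = {n}, total delay = {input_delay_ns(d_ns, n):.1f} ns")
```

Working in integer picoseconds avoids floating-point rounding surprises when dividing by the 0.1 ns step.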
Pre-Emphasis
The MSIOD has pre-emphasis on the differential output only. The RTG4 MSIOD has LVDS pre-emphasis.
Programmable Slew Rate Control
The MSIO and MSIOD do not support a user-programmable slew rate. However, the MSIO and MSIOD
output drive slew rate is managed, to some extent, with staggered output pre-drive stages. Each output
buffer has multiple transistors connected in parallel and driven by corresponding pre-driver circuits.
A delay circuit staggers the pre-driver turn-on times and thereby controls the overshoot of the
switching current.
The DDRIO has two bits of programmable slew control capability on the non-differential drive outputs.
LVCMOS25, LVCMOS18, LVCMOS15, and LVCMOS12 support three levels of slew control:
minimum, medium, and maximum.
The DDRIO output drive slew rate is also managed with staggered output pre-drive stages and by use of
an impedance-matched output driver.
Programmable Weak Pull-Up/Pull-Down
All I/O standards support the Pull-up, Pull-down, and None states. The default configuration is None.
These states can be configured using the I/O Constraints Editor in the Libero SoC software. Pull-up
and Pull-down are mutually exclusive and weakly hold the output to either VDDI or VSS, respectively,
through a 10 kΩ resistor.
Table 6-4 • Weak Pull-Up/Pull-Down
I/O Standard    MSIO              MSIOD             DDRIO
LVTTL33         None, Down, Up    –                 –
LVCMOS33        None, Down, Up    –                 –
PCI             None, Down, Up    –                 –
LVCMOS12        None, Down, Up    None, Down, Up    None, Down, Up
LVCMOS15        None, Down, Up    None, Down, Up    None, Down, Up
LVCMOS18        None, Down, Up    None, Down, Up    None, Down, Up
LVCMOS25        None, Down, Up    None, Down, Up    None, Down, Up
Programmable Schmitt Trigger Input and Receiver
The MSIO, MSIOD, and DDRIO inputs can be configured as Schmitt trigger receivers. When the Schmitt
trigger inputs are enabled, the input buffers provide hysteresis, which filters out noise at the receiver
and prevents double glitching caused by noise at input edges. This feature can be enabled or disabled by
using a physical design constraints (PDC) command or by using the I/O Constraints Editor in the Libero
SoC software. The Schmitt trigger receiver is disabled by default. Table 6-5 shows the different I/O
standards that support the Schmitt receiver option.
Table 6-5 • Schmitt Receiver
I/O Standard    MSIO       MSIOD      DDRIO
LVTTL33         Off, On    –          –
LVCMOS33        Off, On    –          –
PCI             Off, On    –          –
LVCMOS12        Off, On    Off, On    Off, On
LVCMOS15        Off, On    Off, On    Off, On
LVCMOS18        Off, On    Off, On    Off, On
LVCMOS25        Off, On    Off, On    Off, On
Programmable Output Drive Strength
The programmable current drive output buffers can be programmed to select the current drive
capabilities ranging from 2 mA to 16 mA. Programmable values are available only for LVTTL and
LVCMOS standards as shown in Table 6-6. These values can be programmed using I/O Constraints
Editor in the Libero SoC software for the selected I/O standard.
Table 6-6 • Recommended DDRIO Output Drive Strengths
I/O Standard
LVTTL
MSIO (mA)
MSIOD (mA)
DDRIO (mA)
2
–
–
–
–
2
2
2
4
4
4
4
8
12
16
LVCMOS33
2
4
8
12
16
LVCMOS12
6
LVCMOS15
2
2
2
4
4
4
6
6
6
8
8
10
12
LVCMOS18
2
2
2
4
4
4
6
6
6
8
8
8
10
10
10
12
12
16
LVCMOS25
2
2
2
4
4
4
6
6
6
8
8
8
12
12
12
16
16
Configurable ODT and Driver Impedance
The MSIO, MSIOD, and DDRIOs have an ODT or transmitter impedance feature which is calibrated
depending on the I/O standard. If the impedance feature is enabled, impedance can be programmed to
the desired value in three ways. Figure 6-2 on page 86 shows the impedance configuration for DDRIO.
• Calibrate the ODT/Driver Impedance with Calibration Block
• Calibrate the ODT/Driver Impedance with Fixed Calibration Codes
• Configure the ODT/Driver Impedance Statically to Desired Value Directly
There are two DDRIO calibration blocks in each RTG4 device. The FDDR has a DDRIO calibration
block. Each calibration block calibrates ODT/driver impedance for all 44 DDRIO pairs (P and N).
Calibrate the ODT/Driver Impedance with Calibration Block
The I/O calibration block automatically calibrates the I/O drivers to an external resistor. The impedance
control is used to identify the digital values PCODE<5:0> and NCODE<5:0>. These values are fed to the
pull-up/pull-down reference network to match the impedance with the external resistor. Once the
impedance matches, the PCODE and NCODE values are latched in registers and sent to the drivers.
The calibrated impedance value can be configured statically by enabling ODT_STATIC, or dynamically
by enabling ODT_DYN. ODT_STATIC selects the ODT value set in flash configuration bits programmed
during power-on, whereas ODT_DYN selects the ODT value provided at run time. Refer to the FDDR I/O
Calibration Control register of the "System Register Block" in the RTG4 FPGA High Speed DDR User
Guide for enabling the calibration block.
Table 6-7 shows the ODT calibrated impedances for the listed I/O standards.
Table 6-7 • ODT Calibrated Impedance
Driver Mode                   Reference Resistor (Ω)    ODT Calibrated Impedance (Ω)
ODT, DDR3/SSTL 1.5, 1.5 V     240                       120, 60, 40, 30, 20
ODT, DDR2/SSTL 1.8, 1.8 V     150                       150, 75, 50
ODT, HSTL                     191                       47.8
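When a board-level target impedance is known, choosing among the calibrated ODT settings in Table 6-7
is a simple nearest-value lookup. The Python sketch below transcribes the table; the dictionary keys
and the function name are illustrative only and do not correspond to any Libero SoC setting names.

```python
# ODT calibrated impedances in ohms, transcribed from Table 6-7.
# Keys and function names are hypothetical labels for this sketch.
ODT_OPTIONS = {
    "DDR3/SSTL 1.5": {"rref_ohms": 240, "odt_ohms": (120, 60, 40, 30, 20)},
    "DDR2/SSTL 1.8": {"rref_ohms": 150, "odt_ohms": (150, 75, 50)},
    "HSTL": {"rref_ohms": 191, "odt_ohms": (47.8,)},
}


def nearest_odt(mode: str, target_ohms: float) -> float:
    """Return the calibrated ODT value closest to target_ohms for a driver mode."""
    options = ODT_OPTIONS[mode]["odt_ohms"]
    return min(options, key=lambda z: abs(z - target_ohms))


if __name__ == "__main__":
    print(nearest_odt("DDR3/SSTL 1.5", 55))
    print(nearest_odt("DDR2/SSTL 1.8", 100))
```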
To calibrate driver or transmitter impedance for an I/O, configure it to the calibrated impedance according
to the flash configuration bits for the appropriate I/O standard. Recommended reference resistor values
are used for calibration. The calibrated impedance values are shown in Table 6-8.
Table 6-8 • Driver/Transmitter Calibrated Impedance
Driver Mode                       Reference Resistor (Ω)    Transmitter Calibrated Impedance (Ω)
Transmitter, DDR3 SSTL 1.5 V      240                       34, 40
Transmitter, DDR2 SSTL 1.8 V      150                       20, 42
Transmitter, DDR1 SSTL 2.5 V      150                       20, 42
Transmitter, LPDDR SSTL 1.8 V     150                       20, 42
Transmitter, HSTL 1.5 V           191                       25.5, 47.8
LVCMOS 1.2 V and 1.5 V            300                       75, 66.7, 50
LVCMOS 1.8 V                      150                       75, 50, 33, 25
LVCMOS 2.5 V                      150                       75, 50, 33, 25
Calibrate the ODT/Driver Impedance with Fixed Calibration Codes
The DDRIO can use fixed impedance calibration for different drive strengths, and these values can be
programmed using I/O Constraints Editor in the Libero SoC software for the selected I/O standard.
Refer to the I/O Constraints Editor section in the Libero SoC User Guide. Table 6-6 on page 95 shows
the recommended DDRIO output drive strength values. PCODE<5:0> and NCODE<5:0> are registers
accessible through the dedicated APB configuration interface.
Table 6-9 • PCODE and NCODE Values
I/O Standard                  NCODE    PCODE
DDR1 Full Drive/SSTL2 II      42       44
DDR1 Half Drive/SSTL2 I       42       44
DDR2 Full Drive/SSTL18 II     58       61
DDR2 Half Drive/SSTL18 I      58       61
LPDDR Full Drive              58       61
LPDDR Half Drive              58       61
HSTL II                       53       56
HSTL I                        53       56
LVCMOS25 16 mA                42       44
LVCMOS25 12 mA                42       44
LVCMOS25 8 mA                 42       44
LVCMOS25 6 mA                 42       44
LVCMOS25 4 mA                 42       44
LVCMOS25 2 mA                 42       44
LVCMOS18 16 mA                58       61
LVCMOS18 12 mA                58       61
LVCMOS18 10 mA                58       61
LVCMOS18 8 mA                 58       61
LVCMOS18 6 mA                 58       61
LVCMOS18 4 mA                 58       61
LVCMOS18 2 mA                 58       61
LVCMOS15 12 mA                53       56
LVCMOS15 10 mA                53       56
LVCMOS15 8 mA                 53       56
LVCMOS15 6 mA                 53       56
LVCMOS15 4 mA                 53       56
LVCMOS15 2 mA                 53       56
LVCMOS12 6 mA                 40       42
LVCMOS12 4 mA                 40       42
LVCMOS12 2 mA                 40       42
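In Table 6-9, every standard that shares an I/O supply voltage also shares the same NCODE/PCODE pair,
so the table collapses to a lookup by voltage. The Python sketch below shows that lookup and checks
that each code fits the 6-bit PCODE<5:0>/NCODE<5:0> fields. The family names and helper function are
hypothetical; how the codes are written to the device is not modeled here.

```python
# Fixed calibration codes from Table 6-9, grouped by I/O supply voltage.
# Family names and the helper below are illustrative assumptions.
CODES_BY_VCCI = {          # VCCI -> (NCODE, PCODE)
    "2.5": (42, 44),
    "1.8": (58, 61),
    "1.5": (53, 56),
    "1.2": (40, 42),
}

FAMILY_VCCI = {
    "SSTL2": "2.5", "LVCMOS25": "2.5",
    "SSTL18": "1.8", "LPDDR": "1.8", "LVCMOS18": "1.8",
    "HSTL": "1.5", "LVCMOS15": "1.5",
    "LVCMOS12": "1.2",
}


def fixed_codes(family: str) -> tuple:
    """Return (NCODE, PCODE) for an I/O standard family listed in Table 6-9."""
    ncode, pcode = CODES_BY_VCCI[FAMILY_VCCI[family]]
    for code in (ncode, pcode):
        if not 0 <= code < 64:             # must fit a 6-bit <5:0> field
            raise ValueError(f"code {code} does not fit in 6 bits")
    return ncode, pcode


if __name__ == "__main__":
    for name in ("SSTL18", "HSTL", "LVCMOS12"):
        print(name, fixed_codes(name))
```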
Configure the ODT/Driver Impedance Statically to Desired Value Directly
The ODT/driver can be calibrated to a desired value by providing PCODE<5:0> and NCODE<5:0>
values directly through the dedicated APB configuration interface FIC2. In this configuration, the
supplied values overwrite the existing values. Refer to the FDDR I/O Calibration Control register of the
"System Register Block" in the RTG4 FPGA High Speed DDR User Guide for configuring the PCODE
and NCODE values. For MSIO and MSIOD, the ODT values shown in Table 6-10 are configured based
on I/O standard.
Table 6-10 • ODT Values
I/O Standards                 MSIO (Ω)       MSIOD (Ω)      DDRIO (Ω)
SSTL18I & SSTL18II (DDR2)     50, 75, 150    50, 75, 150    50, 75, 150
SSTL15I & SSTL15II (DDR3)     –              –              20, 30, 40, 60, 120
HSTL18I & HSTL18II            50, 75, 150    50, 75, 150    50, 75, 150
LPDDRI & LPDDRII              –              –              50, 75, 150
Default Values Set in Software (Not User-Accessible)
LVDS33                        100            –              –
LVPECL                        100            –              –
LVDS25                        100            100            –
RSDS                          100            100            –
MINILVDS                      100            100            –
BUSLVDS                       100            100            –
MLVDS                         100            100            –
I/O External Termination
If ODT is not used, I/O standards require termination for better signal integrity. Voltage referenced
standards generally have a serial (driver) and parallel (receiver) termination whereas differential
standards have only a parallel termination (receiver). Table 6-11 shows external termination schemes for
the I/O standards supported for DDRIO, MSIO, and MSIOD when the ODT/driver impedance calibration
feature is not used.
Table 6-11 • Termination Schemes
I/O Standard                            External Termination Scheme
SSTL 1.5 single-ended (Class I & II)    Single-ended SSTL I/O standard termination
SSTL 1.8 single-ended (Class I & II)    Single-ended SSTL I/O standard termination
SSTL 2 single-ended (Class II)          Single-ended SSTL I/O standard termination
HSTL 1.5 single-ended (Class II)        Single-ended HSTL I/O standard termination
SSTL 2.5 differential (Class I & II)    Differential SSTL I/O standard termination
SSTL 1.8 differential (Class I & II)    Differential SSTL I/O standard termination
SSTL 1.5 differential (Class I & II)    Differential SSTL I/O standard termination
HSTL 1.5 differential (Class II)        Differential HSTL I/O standard termination
LVCMOS 2.5                              No external termination required
LVCMOS 1.8                              No external termination required
LVCMOS 1.5                              No external termination required
LVCMOS 1.2                              No external termination required
LVDS                                    100 Ω, parallel termination
MLVDS                                   100 Ω, parallel termination
BLVDS                                   100 Ω, parallel termination
RLVDS                                   100 Ω, parallel termination
Mini LVDS                               100 Ω, parallel termination
LVPECL                                  100 Ω, parallel termination
Note: To obtain more information on electrical characteristics, refer to the RTG4 DataSheet (to be released). RTG4
does not support Bus Keeping feature.
Cold Sparing
In cold sparing applications, voltage can be applied to device I/Os before and during power-up. The
RTG4 device supports cold sparing applications using the following strategy:
• The system board integrates two parallel RTG4 devices with shared or common I/O connections.
• The primary RTG4 device has its core powered and is fully functional until a swap of devices is
determined to be necessary.
• The backup RTG4 device has its I/O banks powered, to prevent I/O leakage through the ESD diodes,
and its fabric core unpowered. This establishes a low power, protected state for the backup RTG4
device.
• At any point, you can swap by powering down the core of the primary RTG4 device, powering
up the core of the backup RTG4 device, and going through its configuration sequence.
• Primary and backup devices are identical parts.
• Only one of the two devices can be active at one time.
• Driving the core VDD high activates the part; driving it low deactivates the part.
• The inactive part must have its VDD tied to ground; it must not be left floating.
Following are the advantages of cold sparing:
• Power-up can be done in any sequence (no power supply sequencing requirement).
• No excess device leakage in the spare device (in this cold sparing method).
[Figure: Primary RTG4 (VDD active) and Backup RTG4 (VDD spare), each with core power (VDD) and
I/O bank power (VDDIO), connected to another chip]
Figure 6-4 • Cold Sparing
5 V Input Tolerance and Output Driving Compatibility (only MSIO)
5 V Input Tolerance
I/Os can support 5 V inputs when configured as LVTTL 3.3 V or LVCMOS 3.3 V and when one of the
following techniques is used to reduce the voltage at the I/O. All the recommended solutions meet a
common requirement of limiting the voltage at the input to 3.45 V or less. The I/O absolute maximum
voltage rating is 3.45 V, and any voltage above 3.45 V may cause long-term gate oxide failures.
Solution 1
The board-level design must ensure that the reflected waveform at the pad does not exceed the limits
provided in the recommended operating conditions in the datasheet. This is a requirement to ensure
long-term reliability.
This scheme also works for a 3.3 V PCI configuration, but the internal diode must not be used for
clamping, and the voltage must be limited by two external resistors. Relying on diode clamping would
create an excessive pad DC voltage of 3.3 V + 0.7 V = 4 V.
This solution requires two board resistors. Here are some examples of possible resistor values based on
a simplified simulation model with no line effects and 10 Ω transmitter output resistance,
where,
Rtx_out_high = (VCCI – VOH) / IOH and Rtx_out_low = VOL / IOL
EQ 2
Example 1 (high speed, high current):
Rtx_out_high = Rtx_out_low = 10 Ω
R1 = 36 Ω (±5%), P(r1)min = 0.069 W
R2 = 82 Ω (±5%), P(r2)min = 0.158 W
Imax_tx = 5.5 V / (82 × 0.95 + 36 × 0.95 + 10) = 45.04 mA
tRISE = tFALL = 0.85 ns at C_pad_load = 10 pF (includes up to 25% safety margin)
tRISE = tFALL = 4 ns at C_pad_load = 50 pF (includes up to 25% safety margin)
Example 2 (low-medium speed, medium current):
Rtx_out_high = Rtx_out_low = 10 Ω
R1 = 220 Ω (±5%), P(r1)min = 0.018 W
R2 = 390 Ω (±5%), P(r2)min = 0.032 W
Imax_tx = 5.5 V / (220 × 0.95 + 390 × 0.95 + 10) = 9.33 mA
tRISE = tFALL = 4 ns at C_pad_load = 10 pF (includes up to 25% safety margin)
tRISE = tFALL = 20 ns at C_pad_load = 50 pF (includes up to 25% safety margin)
Other values of resistors are also allowed as long as the resistors are sized to limit the voltage at the
receiving end to 2.5 V < Vin(rx) < 3.6 V when the transmitter sends a logic 1.
This range of Vin_dc(rx) must be assured for any combination of transmitter supply (5 V ± 0.5 V),
transmitter output resistance, and board resistor tolerances.
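The worst-case current and resistor power figures in the examples can be reproduced with Ohm's law. The
Python sketch below assumes the simple series loop implied by the Imax_tx expression (transmitter at
5.5 V, 10 Ω output resistance, both resistors at the low end of their ±5% tolerance). The function and
parameter names are illustrative, and the sketch does not model the divider voltage at the pad or any
line effects.

```python
# Worst-case transmitter current and minimum resistor power ratings for the
# two-resistor 5 V tolerance scheme. Assumes a plain series loop, as in the
# Imax_tx expression in the text; names are hypothetical.

def worst_case_current_and_power(r1_ohms, r2_ohms, v_tx_max=5.5,
                                 r_tx_ohms=10.0, tol=0.05):
    """Return (i_max_A, p_r1_W, p_r2_W) with both resistors at minimum value."""
    r1_min = r1_ohms * (1.0 - tol)
    r2_min = r2_ohms * (1.0 - tol)
    i_max = v_tx_max / (r1_min + r2_min + r_tx_ohms)
    return i_max, i_max ** 2 * r1_min, i_max ** 2 * r2_min


if __name__ == "__main__":
    for name, r1, r2 in (("Example 1", 36, 82), ("Example 2", 220, 390)):
        i_max, p1, p2 = worst_case_current_and_power(r1, r2)
        print(f"{name}: Imax = {i_max * 1e3:.2f} mA, "
              f"P(R1) = {p1:.3f} W, P(R2) = {p2:.3f} W")
```

Resistors should be chosen with a power rating comfortably above the computed minimum.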
Figure 6-5 shows the 5 V input tolerance solution 1.
[Figure: requires two board resistors (Rext); LVCMOS 3.3 V I/Os]
Figure 6-5 • 5 V Input Tolerance Solution 1
Solution 2
The board-level design must ensure that the reflected waveform at the pad does not exceed the voltage
overshoot/undershoot limits provided in the datasheet. This is a requirement to ensure long-term
reliability. This scheme also works for a 3.3 V PCI configuration, but the internal diode must not be used
for clamping, and the voltage must be limited by the external resistor and Zener diode. Relying on the
diode clamping would create an excessive pad DC voltage of 3.3 V + 0.7 V = 4 V.
[Figure: requires one board resistor (Rext) and one Zener diode; LVCMOS 3.3 V I/Os]
Figure 6-6 • 5 V Input Tolerance Solution 2
5 V Output Driving Compatibility
RTG4 I/Os must be set to 3.3 V LVTTL or 3.3 V LVCMOS mode to reliably drive 5 V TTL receivers. It is
also critical that there is NO external I/O pull-up resistor to 5 V, since this resistor would pull the I/O pad
voltage beyond the 3.6 V absolute maximum value and consequently cause damage to the I/O. When
set to 3.3 V LVTTL or 3.3 V LVCMOS mode, the I/Os can directly drive signals into 5 V TTL receivers.
The output levels VOL = 0.4 V and VOH = 2.4 V in both 3.3 V LVTTL and 3.3 V LVCMOS modes satisfy
the VIL = 0.8 V and VIH = 2 V level requirements of 5 V TTL receivers. Therefore, level 1 and level 0
are recognized correctly by 5 V TTL receivers.
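The compatibility argument reduces to two noise-margin checks: the driver's VOL must sit below the
receiver's VIL, and the driver's VOH must sit above the receiver's VIH. The Python sketch below uses
the VOL and VOH values from the text; the receiver thresholds are the conventional 5 V TTL values
(VIL = 0.8 V, VIH = 2.0 V), which is an assumption of this sketch and should be checked against the
receiver's datasheet.

```python
# Noise-margin check for a 3.3 V LVTTL/LVCMOS output driving a 5 V TTL input.
# Receiver thresholds are conventional TTL values (assumption of this sketch).

def noise_margins(vol, voh, vil, vih):
    """Return (low_margin_V, high_margin_V); both must be positive."""
    return vil - vol, voh - vih


def is_compatible(vol, voh, vil, vih):
    """True when both logic levels are recognized by the receiver."""
    low, high = noise_margins(vol, voh, vil, vih)
    return low > 0 and high > 0


if __name__ == "__main__":
    low, high = noise_margins(vol=0.4, voh=2.4, vil=0.8, vih=2.0)
    print(f"low margin = {low:.1f} V, high margin = {high:.1f} V")
```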
Temperature Sensing
This feature provides an internal thermometer for monitoring the die temperature. It is implemented as
a temperature sense diode located in the lower left corner of the device. The temperature sensing diode
has one dedicated pin, PTEMP, connected to the anode of the diode. The cathode of the diode is
connected to the VSS of the die. The diode is a passive device, and the pins are always attached to the
die. The PTEMP pin can be left floating if the feature is not used. Nothing needs to be
programmed in software to enable this temperature sensing feature. To use the temperature
sensing diode, it must be calibrated by user software and/or circuits. To measure the temperature, check
the voltage drop between PTEMP and VSS.
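One common way to perform such a user calibration is a two-point linear fit: record the PTEMP-to-VSS
voltage at two known die temperatures under the same bias current, then interpolate. The Python sketch
below illustrates the idea. It assumes the diode voltage varies linearly with temperature at a fixed
bias current, and the calibration numbers in the usage lines are placeholders, not RTG4
characterization data.

```python
# Two-point linear calibration of a temperature sense diode.
# Assumes V(PTEMP) - V(VSS) is linear in temperature at constant bias current;
# the calibration points used below are placeholders (assumption).

def make_diode_thermometer(cal_low, cal_high):
    """Build a voltage-to-temperature function from two (deg_C, volts) points."""
    (t1, v1), (t2, v2) = cal_low, cal_high
    if v1 == v2:
        raise ValueError("calibration voltages must differ")
    slope = (t2 - t1) / (v2 - v1)   # deg C per volt; negative for a diode

    def to_celsius(v_diode):
        return t1 + slope * (v_diode - v1)

    return to_celsius


if __name__ == "__main__":
    to_celsius = make_diode_thermometer((25.0, 0.70), (100.0, 0.55))
    print(f"{to_celsius(0.625):.1f} deg C")
```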
I/Os Shared by Fabric and FDDR
DDRIOs with FDDR
If FDDR is selected, Libero SoC automatically connects FDDR signals to the DDRIOs. Depending on the
memory configuration, only the required DDRIOs are used by Libero SoC. The unused DDRIOs are
available to connect to the FPGA fabric.
DDRIOs with Fabric
If FDDR is not selected, DDRIOs are available to the FPGA fabric. DDRIOs must be configured manually
in Libero SoC.
MSIOs/MSIODs with Fabric
There are two macros in silicon, DDR_IN and DDR_OUT, which can be connected to a DDR
controller soft IP core in the fabric. You can use the I/O standards of MSIOs and MSIODs for DDR
controllers that cannot be supported by the dedicated DDRIO bank. MSIOs/MSIODs are available to the
FPGA fabric and must be configured manually in Libero SoC.
JTAG I/O
The system controller implements the functionality of a JTAG slave, with IEEE 1532 support, which also
implies IEEE 1149.1 compliance. JTAG communicates with the system controller using a Command
register that conveys the JTAG instruction to be executed and a 128-bit data I/O buffer that transfers any
associated data.
The JTAG pins can be run at any voltage from 1.5 V to 3.3 V (nominal). The IO voltage of this interface is
set by powering the VJTAG power pin with the desired IO voltage. Core voltage must also be powered for
the JTAG state machine to operate, even if the device is in Bypass mode. VJTAG power alone is
insufficient. Both VJTAG and core voltage to the RTG4 part must be supplied to allow JTAG signals to
transit the RTG4 device. Isolating the JTAG power supply in a separate I/O bank gives greater flexibility
with supply selection and simplifies power supply and PCB design. If the JTAG interface is not used and
not planned for use, the VJTAG pin together with the TRSTB pin must be tied to GND.
The TAP controller is a state machine whose transitions are controlled by the TMS signal and controls
the behavior of the JTAG system. The TAP controller uses 8-bit instructions consistent with previous
Microsemi product families.
There are two types of TAP controllers.
• Fabric TAP
• Auxiliary TAP
Table 6-12 • JTAG Pin Description
Name       Type    Bus Size    Description
JTAGSEL    In      1           JTAG controller selection
    Depending on the state of the JTAGSEL pin, an external JTAG controller detects the
    FPGA fabric TAP/auxiliary TAP. The JTAGSEL pin must be connected to an external
    pull-up resistor such that the default configuration selects the FPGA fabric TAP.
    • Logic 1: FPGA fabric TAP selected
    • Logic 0: AUX TAP selected
TCK        In      1           Test Clock
    Serial input for JTAG boundary scan, ISP, and UJTAG. The TCK pin does not have
    an internal pull-up/-down resistor. If JTAG is not used, Microsemi recommends tying
    off TCK to GND or VJTAG through a resistor placed close to the FPGA pin. This
    prevents JTAG operation in case TMS enters an undesired state.
    To operate at all VJTAG voltages, the resistor values mentioned in Table 6-13 on
    page 105 are recommended.
TDI        In      1           Test Data Input
    Serial input for JTAG boundary scan, ISP, and UJTAG usage. There is an internal
    weak pull-up resistor (10K) on the TDI pin.
TDO        Out     1           Test Data Output
    Serial output for JTAG boundary scan, ISP, and UJTAG usage.
TMS        In      1           Test Mode Select
    The TMS pin controls the use of the IEEE1532 boundary scan pins (TCK, TDI, TDO,
    and TRSTB). There is an internal weak pull-up resistor (10K) on the TMS pin.
TRSTB      In      1           Boundary scan reset pin
    The TRSTB pin functions as an active low input to asynchronously initialize (or reset)
    the boundary scan circuitry. There is an internal weak pull-up resistor (10K) on the
    TRSTB pin. If JTAG is not used, an external pull-down resistor must be included to
    ensure the TAP is held in Reset mode. The resistor values must be selected from
    Table 6-13 on page 105 and must satisfy the parallel resistance value requirement.
    The values in Table 6-13 on page 105 correspond to the resistor recommended when a
    single device is used. The values correspond to the equivalent parallel resistor when
    multiple devices are connected through a JTAG chain.
    In safety critical applications (Avionics mode), an upset in the JTAG circuit could
    allow entering an undesired JTAG state. In such cases, Microsemi recommends
    tying off TRSTB to GND through a resistor placed close to the FPGA pin. This keeps
    JTAG circuitry in Reset state.
Table 6-13 • Recommended Tie-Off Values for the TCK and TRSTB Pins
VJTAG              Tie-Off Resistance (Notes 1, 2)
VJTAG at 3.3 V     200 Ω to 1 kΩ
VJTAG at 2.5 V     200 Ω to 1 kΩ
VJTAG at 1.8 V     500 Ω to 1 kΩ
VJTAG at 1.5 V     500 Ω to 1 kΩ
Notes:
1. The TCK pin can be pulled up or down. If it is pulled up, it must be pulled up to the VJTAG voltage.
2. The TRSTB pin can only be pulled down.
3. Equivalent parallel resistance if more than one device is on JTAG chain.
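When several devices share a JTAG chain, Note 3 means the parallel combination of the individual
tie-off resistors is what must fall within the recommended range. The Python sketch below shows one way
to check this, assuming one tie-off resistor per device on a shared net. The range values are
transcribed from Table 6-13; the function names and dictionary keys are hypothetical.

```python
# Check that the equivalent parallel tie-off resistance on a shared JTAG net
# falls within the Table 6-13 range. Assumes one resistor per device on the
# same net; names are illustrative.

TIE_OFF_RANGE_OHMS = {       # VJTAG (V) -> (min, max) in ohms, from Table 6-13
    "3.3": (200.0, 1000.0),
    "2.5": (200.0, 1000.0),
    "1.8": (500.0, 1000.0),
    "1.5": (500.0, 1000.0),
}


def parallel_ohms(resistors):
    """Return the equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistors)


def tie_off_ok(vjtag, resistors):
    """True when the equivalent resistance is within the recommended range."""
    low, high = TIE_OFF_RANGE_OHMS[vjtag]
    return low <= parallel_ohms(resistors) <= high


if __name__ == "__main__":
    chain = [1500.0, 1500.0]   # two devices, one resistor each
    print(parallel_ohms(chain), tie_off_ok("1.8", chain))
```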
Dedicated I/Os
The RTG4 devices have the following dedicated I/Os:
• Device Reset I/O
• SERDES I/O
Device Reset I/O
RTG4 devices have a dedicated input reset. Anytime reset is asserted, the whole chip is reset. The
device reset feeds the system controller, which generates the system reset for the reset controller to
reset the entire device. Figure 6-7 shows the full chip reset flow from device reset. The Libero SoC tool
allows you to configure the reset controller using the System Builder.
[Figure: block diagram showing DEVRST_N, System Controller, System Resets, Reset Controller, and
Chip Level Resets]
Figure 6-7 • Chip Level Resets From Device Reset
Port List and I/O Pins
Table 6-14 • Device Reset I/O Pin
Pin          Type      I/O      Description
DEVRST_N     Analog    Input    Device reset, asserted low, and powered by VPP.
SERDES I/O
The SERDES I/Os available in RTG4 devices are dedicated to high-speed serial communication
protocols. For more information, refer to the SERDES section in the RTG4 FPGA High Speed Serial
Interfaces User Guide. The SERDES I/O supports protocols such as PCI Express 2.0, XAUI, serial
gigabit media independent interface (SGMII), serial rapid I/O (SRIO), and any user-defined high speed
serial protocol implementation in the fabric. These protocols access the SERDES lanes through the
physical media attachment (PMA) and physical coding sub layer (PCS) of SERDES interface. The
detailed configuration of the SERDES interface for various protocols is explained in the "SERDESIF
Block" chapter of the RTG4 FPGA High Speed Serial Interfaces User Guide. This section describes the
SERDES I/O pins, SERDES I/O banks, SERDES I/O standards, and board-level design considerations
available.
SERDES I/O Banks
The SERDES I/Os reside in the dedicated I/O banks. The number of SERDES I/Os depends on the
device size and pin count. For example, the RT4G150 device has four SERDES_IFs (SERDES_IF0,
SERDES_IF1, SERDES_IF2 and SERDES_IF3), which reside on four I/O banks.
Refer to the RTG4 FPGA High Speed Serial Interfaces User Guide for details on I/O bank locations and
I/O electrical specifications.
SERDES I/O Pins
Each SERDES interface in the RTG4 device has four SERDES I/O data lanes or 16 SERDES I/Os
available for accessing the SERDES interface (SERDESIF block). Each data lane has two pairs of
differential signals: one for transmit data (TxDP, TxDN) and other for receive data (RxDP, RxDN). Data
lanes are multiplexed to support different serial protocols and are scalable to various link widths: ×1, ×2, and
×4. These settings can be configured in the SERDES_IF macro using Libero SoC design software. Each
SERDES_IF has two sets of dedicated power, clock, and reference signals. One set for data lane 0 and
lane 1 and another for data lane 2 and lane 3. For more information on SERDES I/O and power pin
names and descriptions, refer to the RTG4 FPGA Pin Descriptions.
Dedicated Global I/Os
Dedicated global I/Os are dual-use I/Os, which can drive the global blocks directly or through clock
conditioning circuits (CCC). They can also be used as regular user I/Os. These global I/Os are the
primary source to bring external clock inputs into the RTG4 device.
Unused dedicated global I/Os behave similarly to unused regular user I/Os (MSIO, MSIOD, DDRIO).
Libero configures unused user I/Os with the input buffer disabled and the output buffer tristated with a
weak pull-up.
The RTG4 devices have 36 I/Os, which are dedicated for global clocks. Out of these 36 global clocks, 12
are dedicated for SERDES clocks.
GRESET generates a global asynchronous reset signal during power-up or programming, and allows the
user to apply an asynchronous reset on the fabric flip-flops globally, if required. For more information on
Global I/Os, refer to the "Fabric Global Routing Resources" chapter of the RTG4 FPGA Clocking
Resources User Guide.
A – Glossary
Acronyms
uSRAM
Micro static random access memory
CCC
Clock conditioning circuits
LSRAM
Large static random access memory
LSB
Least significant bit
ECC
Error correction code
MSB
Most significant bit
STMR
Self-corrected triple module redundancy
DDRIO
Double data rate input output
FDDR
Controller for external DDR memory
IOA
Input output analog
IOD
Input output digital
LPDDR
Low power double data rate memory
ODT
On-die termination
HSTL
High-speed transceiver logic
SSTL
Stub series terminated logic
BLVDS
Bus LVDS
ESD
Electrostatic discharge protection
LPE
Low power exit
LVPECL
Low-voltage positive emitter coupled logic
LVTTL
Low-voltage transistor-transistor logic
MLVDS
Multipoint LVDS
MSIO
Multi-standard I/O
MVN
MultiView Navigator
RSDS
Reduced swing differential signaling
SERDES
Serializer/deserializer
Terminology
Clusters
Clusters are formed by grouping a certain number of logic elements and interconnecting them. This is
related to the clustered routing architecture of the RTG4 FPGA fabric.
Interface Cluster
An interface cluster is formed by grouping 12 interface logic elements.
I/O Cluster
An I/O cluster is formed by grouping either three or four I/O modules.
Interface Logic
An interface logic element consists of a 4-input LUT and an STMR flip-flop. It interfaces the hard
macros (LSRAMs, uSRAMs, and mathblocks) to fabric routing.
I/O Module
An I/O module consists of flip-flops and routing multiplexers. It interfaces the user
I/Os to fabric routing.
Inter-cluster Routing
Inter-cluster routing refers to routing resources between various types of clusters.
Intra-cluster Routing
Intra-cluster routing refers to routing resources existing inside a specific cluster.
Logic Cluster
A logic cluster is formed by grouping 12 logic elements.
Logic Element
The basic logic element in the RTG4 FPGA fabric, consisting of a 4-input LUT, a D-flip-flop, and a
dedicated carry chain.
Flow-Through Read
A read operation performed with the output not being registered by the output pipeline registers.
Pipelined Read
A read operation performed with the output being registered by the output pipeline registers.
Simple Write
A write operation in which the data written does not appear on the SRAM output ports.
Feed-Through Write (Write-Bypass Write)
A write operation in which the data written appears on the SRAM output ports immediately in non-pipelined mode and on the next clock cycle in pipelined mode.
Dual-Port Mode
SRAM with two independent ports, through either of which both read and write operations can be performed.
Two-Port Mode
SRAM with two ports, one dedicated to read operations and the other dedicated to write operations.
Multi-Channeling
Multi-threading applied to a chain of mathblocks.
Multi-Threading
Using a mathblock to perform more than one computation by time-multiplexing it.
Pipelined Operation
The mode of operation where the mathblock output is registered at the pipeline registers.
Transparent Mode
Non-registered/Non-pipelined mode
Inference
Using RTL to infer mathblocks
Bus Keeper
Holds the signal on an I/O pin at its last driven state.
Hot Insertion
Capability to connect I/O to external circuitry even after power-up.
Low Power Exit
Logic that brings the chip out of the low-power state.
B – List of Changes
The following table shows important changes made in this document for each revision.
Date                        Changes                                                                              Page
Revision 2 (April 2015)     Updated the document with FTC inputs (SAR 63317).                                    NA
                            Updated "Features" section (SAR 65204).                                              18
                            Updated "Features" section and Table 5-3 (SAR 66075).                                18 and 66
                            Updated "ECC" section (SAR 65236).                                                   52
                            Updated Table 6-3 (SAR 64973).                                                       92
                            Updated Table 6-10 (SAR 64447).                                                      98
                            Updated "Low Voltage CMOS (LVCMOS)" section (SAR 64028).                             89
                            Added "Cold Sparing" and "Dedicated Global I/Os" sections (SAR 64970 and 64971).     100 and 106
                            Updated "Programmable Slew Rate Control" section.                                    93
                            Added "uPROM" chapter.                                                               56
                            Replaced GSR_N signal with ARST_N (SAR 66394).                                       NA
Revision 1 (November 2014)  Initial release.                                                                     NA
C – Product Support
Microsemi SoC Products Group backs its products with various support services, including Customer
Service, Customer Technical Support Center, a website, electronic mail, and worldwide sales offices.
This appendix contains information about contacting Microsemi SoC Products Group and using these
support services.
Customer Service
Contact Customer Service for non-technical product support, such as product pricing, product upgrades,
update information, order status, and authorization.
From North America, call 800.262.1060
From the rest of the world, call 650.318.4460
Fax, from anywhere in the world, 408.643.6913
Customer Technical Support Center
Microsemi SoC Products Group staffs its Customer Technical Support Center with highly skilled
engineers who can help answer your hardware, software, and design questions about Microsemi SoC
Products. The Customer Technical Support Center spends a great deal of time creating application
notes, answers to common design cycle questions, documentation of known issues, and various FAQs.
So, before you contact us, please visit our online resources. It is very likely we have already answered
your questions.
Technical Support
For Microsemi SoC Products Support, visit
http://www.microsemi.com/products/fpga-soc/designsupport/fpga-soc-support
Website
You can browse a variety of technical and non-technical information on the SoC home page, at
www.microsemi.com/soc.
Contacting the Customer Technical Support Center
Highly skilled engineers staff the Technical Support Center. The Technical Support Center can be
contacted by email or through the Microsemi SoC Products Group website.
Email
You can communicate your technical questions to our email address and receive answers back by email,
fax, or phone. Also, if you have design problems, you can email your design files to receive assistance.
We constantly monitor the email account throughout the day. When sending your request to us, please
be sure to include your full name, company name, and your contact information for efficient processing of
your request.
The technical support email address is [email protected].
My Cases
Microsemi SoC Products Group customers may submit and track technical cases online by going to My
Cases.
Outside the U.S.
Customers needing assistance outside the US time zones can either contact technical support via email
([email protected]) or contact a local sales office. Sales office listings can be found at
www.microsemi.com/soc/company/contact/default.aspx.
ITAR Technical Support
For technical support on RH and RT FPGAs that are regulated by International Traffic in Arms
Regulations (ITAR), contact us via [email protected]. Alternatively, within My Cases, select
Yes in the ITAR drop-down list. For a complete list of ITAR-regulated Microsemi FPGAs, visit the ITAR
web page.
Microsemi Corporation (Nasdaq: MSCC) offers a comprehensive portfolio of semiconductor
and system solutions for communications, defense & security, aerospace and industrial
markets. Products include high-performance and radiation-hardened analog mixed-signal
integrated circuits, FPGAs, SoCs and ASICs; power management products; timing and
synchronization devices and precise time solutions, setting the world’s standard for time; voice
processing devices; RF solutions; discrete components; security technologies and scalable
anti-tamper products; Power-over-Ethernet ICs and midspans; as well as custom design
capabilities and services. Microsemi is headquartered in Aliso Viejo, Calif., and has
approximately 3,400 employees globally. Learn more at www.microsemi.com.
Microsemi Corporate Headquarters
One Enterprise, Aliso Viejo,
CA 92656 USA
Within the USA: +1 (800) 713-4113
Outside the USA: +1 (949) 380-6100
Sales: +1 (949) 380-6136
Fax: +1 (949) 215-4996
E-mail: [email protected]
© 2015 Microsemi Corporation. All
rights reserved. Microsemi and the
Microsemi logo are trademarks of
Microsemi Corporation. All other
trademarks and service marks are the
property of their respective owners.
Microsemi makes no warranty, representation, or guarantee regarding the information contained herein or
the suitability of its products and services for any particular purpose, nor does Microsemi assume any
liability whatsoever arising out of the application or use of any product or circuit. The products sold
hereunder and any other products sold by Microsemi have been subject to limited testing and should not
be used in conjunction with mission-critical equipment or applications. Any performance specifications are
believed to be reliable but are not verified, and Buyer must conduct and complete all performance and
other testing of the products, alone and together with, or installed in, any end-products. Buyer shall not rely
on any data and performance specifications or parameters provided by Microsemi. It is the Buyer's
responsibility to independently determine suitability of any products and to test and verify the same. The
information provided by Microsemi hereunder is provided "as is, where is" and with all faults, and the entire
risk associated with such information is entirely with the Buyer. Microsemi does not grant, explicitly or
implicitly, to any party any patent rights, licenses, or any other IP rights, whether with regard to such
information itself or anything described by such information. Information provided in this document is
proprietary to Microsemi, and Microsemi reserves the right to make any changes to the information in this
document or to any products and services at any time without notice.
50200574-2/04.15