QDRII+ SRAM Controller MACO Core User Guide

ispLeverCORE™
June 2008
ipug45_01.5
Lattice Semiconductor
Introduction
Lattice’s QDRII and QDRII+ (QDRII/II+) SRAM Controller MACO™ core assists the FPGA designer’s efforts by
providing pre-tested, reusable functions that can be easily plugged in, freeing designers to focus on their unique
system architecture. These blocks eliminate the need to “re-invent the wheel,” by providing industry-standard
QDRII/II+ memory controller modules. These proven cores are optimized utilizing the LatticeSCM™ device’s
MACO architecture, resulting in fast, small cores that use the latest architecture to its fullest.
Figure 1. Lattice Semiconductor MACO Conceptual Diagram
[Figure: the Lattice QDRII+ MACO solution within a LatticeSCM device. Labels shown: Lattice IPexpress, Soft IP, FPGA Fabric, User Logic Interface, MACO, Memory Interface, PLL, DLL.]
The Lattice ispLEVER® software includes the IPexpress™ utility, which generates a number of user-customizable
cores. The utility guides the designer through a parameterized design flow and can be used to generate new
configurations of this IP core. The GUI prompts for specific information on bus size, clocking, and memory device
requirements, and compiles it into the FPGA design database. The utility then generates the templates and
HDL-specific files needed to synthesize the FPGA design.
IPexpress, the Lattice IP configuration utility, is included as a standard feature of the ispLEVER design tools.
Details regarding the usage of IPexpress can be found in the IPexpress and ispLEVER on-line Help systems. For
more information on the ispLEVER design tools, visit the Lattice web site at www.latticesemi.com/software.
Overview
The second generation Quad-Data-Rate (QDRII/II+) Static Random Access Memory (SRAM) Controller is a general-purpose memory controller that interfaces with industry standard QDRII/II+ SRAM. The controller can be configured to function in two-word burst or four-word burst modes. It can also be configured to have an 18-bit bus or a
36-bit data bus. The data is transferred on both edges of the clock, doubling the rate of data transfer. Separate read
and write data buses again double the data rate.
This user’s guide explains the functionality of Lattice’s QDRII/II+ Controller core.
Features
• Interfaces to industry standard QDRII/II+ SRAM
• Supports QDRII SRAM memory devices operating up to 250 MHz
• Supports QDRII+ SRAM memory devices operating up to 375 MHz (highest speed grade)
• FPGA can be configured for 18-bit or 36-bit read and write memory data buses (on FPGA, 36-bit or 72-bit data
buses)
• Shared address bus can be configured from 17 bits to 20 bits wide
• Programmable burst lengths of two or four
• Maximum read/write blocks of 31 consecutive locations
Core Deliverables
• Sample instantiation (template)
• Synthesis black box for MACO core
• Pre-compiled ModelSim® MACO core model
• Verilog core source code
• Evaluation design
– Verilog test bench
• Preference files
Getting Started
Requirements to implement a MACO core include:
• ispLEVER 7.0 or later
• MACO design kit
• MACO license file
For information on obtaining the above requirements, please contact your local Lattice Semiconductor sales representative.
Functional Description
The QDRII/II+ Controller comprises an FPGA logic block and an ASIC block. The FPGA logic is sometimes
referred to as the “soft IP” because it is programmed into the FPGA along with the user application. The embedded
ASIC block is called the MACO “hard IP”, because as an ASIC, it is a permanent part of the device.
Together, these components are provided as intellectual property (IP) by Lattice Semiconductor in a single unit,
called qdr_ip_top. This should be instantiated as a single component in the user’s design. Figure 2 depicts the
interface to the qdr_ip_top.
Figure 2. QDR Controller Core, qdr_ip_top
[Figure: block diagram of the QDRII+ SRAM Controller MACO Core. External (User Side) FPGA Interface: internal clocks and PLL status reports (clk, k_clk, pll0_lock, dll0_lock); control and resets (cfg_qdr_bmode, ff_rst_n); FPGA side write ports (qdr_write_addr[(ADDR_WIDTH-1):0], qdr_write_block_length[4:0], qdr_write_data[(2*DATA_WIDTH-1):0], qdr_data_ready, qdr_wcmd_fifo_wenab, qdr_wcmd_fifo_full, qdr_wcmd_fifo_full_m1, qdr_wcmd_fifo_full_m2, qdr_wcmd_fifo_empty); FPGA side read ports (qdr_read_addr[(ADDR_WIDTH-1):0], qdr_read_block_length[4:0], qdr_read_data[(2*DATA_WIDTH-1):0], qdr_read_data_valid, qdr_rcmd_fifo_wenab, qdr_rcmd_fifo_full, qdr_rcmd_fifo_full_m1, qdr_rcmd_fifo_full_m2, qdr_rcmd_fifo_empty). External (Line Side) I/O Pad Interface: K, K_N, A[(ADDR_WIDTH-1):0], D[(DATA_WIDTH-1):0], W_N, R_N, CQ, Q[(DATA_WIDTH-1):0].]
There are two major interfaces to qdr_ip_top: the FPGA User Application Interface and the QDRII/II+ SRAM
interface. The FPGA User Application Interface communicates with the on-FPGA application logic designed by the
user. The QDRII/II+ SRAM interface communicates directly with the FPGA pins connected to the external SRAM
device. No additional user logic is required between the QDRII/II+ Controller core and the QDRII/II+ SRAM.
Differences Between QDRII and QDRII+
The LatticeSCM QDR Memory Controller supports both the QDRII and the QDRII+ protocols. The QDRII+ protocol
has been introduced as a higher-speed enhancement to the earlier QDRII protocol. QDRII+ incorporates the signal
QVLD, which accompanies the Q bus and indicates valid data on that bus. QVLD is edge-aligned with CQ/CQ#,
and precedes the valid Q data by one-half clock cycle. The differences between QDRII and QDRII+ are summarized in Table 1.
Table 1. Differences Between QDRII and QDRII+
Feature | QDRII | QDRII+
Bit Rate per Data Bus Signal (Max) | 400 Mbps (1) | 750 Mbps (1)
Total Bandwidth for 36-Bit Read/Write Buses | 28.8 Gbps | 54.0 Gbps
QVLD Support | No | Yes
C Clock Support | Yes | No
Burst Mode Size | 2-Word, 4-Word | 4-Word
Read Latency | 1.5 Clocks | 2 Clocks, 2.5 Clocks
I/O | 1.5V HSTL, 1.8V HSTL | 1.5V HSTL
1. See the QDRII/II+ Memory Controller Performance table in this document for device-specific supported speeds.
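The bandwidth figures in Table 1 follow from the double-data-rate transfer and the separate read and write buses described in the Overview. The short Python sketch below reproduces that arithmetic. The function names are illustrative only, and the 200 MHz value is simply half of the 400 Mbps per-signal rate listed in the table, not a device specification.

```python
def bit_rate_mbps(clock_mhz):
    """Per-signal bit rate: one bit on each of the two clock edges."""
    return 2 * clock_mhz


def total_bandwidth_gbps(clock_mhz, data_width_bits):
    """Aggregate bandwidth of the separate read and write data buses."""
    buses = 2  # one read bus plus one write bus
    return bit_rate_mbps(clock_mhz) * data_width_bits * buses / 1000


# QDRII+ at 375 MHz with 36-bit buses
print(bit_rate_mbps(375), total_bandwidth_gbps(375, 36))
# Clock implied by the 400 Mbps QDRII entry, with 36-bit buses
print(bit_rate_mbps(200), total_bandwidth_gbps(200, 36))
```

With these inputs the results match the 750 Mbps / 54.0 Gbps and 400 Mbps / 28.8 Gbps entries in the table.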
Parameter Descriptions
Several configuration and timing parameters must be set before the QDRII/II+ SRAM Controller module can be
interfaced to a memory device. For maximum flexibility, these parameters are designed as inputs to the IP core
that can be tied to the desired values within the top-level RTL file. The values are entered through the IPexpress
GUI, which captures them in the user’s customized core. The user enters the physical and timing information that
describes the memory design, and the GUI processes this data into the parameters needed to compile the
customized design.
The QDRII/II+ IP parameters include clocking preferences. The user can customize the width of the address and
data buses and can choose between 4- or 2-word memory modes (4-word only for QDRII+). Sizing of the write and
read command FIFO is also permitted.
Below is an example of the “qdr2_define.v” file generated by IPexpress for a QDRII/II+ memory application. This file
incorporates the user’s design-specific information that is processed for the HDL generation.
`define MCTL25
`define QDR_II_PLUS
`define QDR_DATA_18
`define QDR_ADDR_WIDTH 18
`define MAX_QDR_ADDR_WIDTH 20
`define QDR_4WB
`define QDRPLS_2P0L_4WB
`define WCMD_FIFO_ASIZE 2
`define RCMD_FIFO_ASIZE 2
`define PINOUT_BOTTOM
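The WCMD_FIFO_ASIZE and RCMD_FIFO_ASIZE values in the example appear to relate to the command FIFO depths offered in the GUI (4, 8, 16, 32 or 64). The Python sketch below assumes that ASIZE is the FIFO address width, so that depth = 2^ASIZE. This guide does not state that relationship explicitly, and the function names are hypothetical.

```python
GUI_FIFO_DEPTHS = (4, 8, 16, 32, 64)  # depths listed for the GUI


def fifo_depth(asize):
    """Assumed relationship: depth is two to the power of the address size."""
    return 1 << asize


def asize_for_depth(depth):
    """Inverse mapping, restricted to the depths the GUI offers."""
    if depth not in GUI_FIFO_DEPTHS:
        raise ValueError("depth must be one of %s" % (GUI_FIFO_DEPTHS,))
    return depth.bit_length() - 1


print(fifo_depth(2))                                  # 4
print([asize_for_depth(d) for d in GUI_FIFO_DEPTHS])  # [2, 3, 4, 5, 6]
```

Under this assumption, an ASIZE of 2 corresponds to a FIFO that is four commands deep.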
Internal PLL and DLL
A PLL is used to derive the internal clock, k_clk, from the reference clock, clk. The user can define the relationship
between these two clocks via a setting in the GUI. The pll0_lock output can be used to determine when the PLL frequency is locked.
The SRAM clocks, K and K_N, operate at the same frequency as k_clk. For this reason, the user should set the
PLL to produce a clock frequency that matches the desired memory operation frequency.
The QDRII/II+ SRAM standard requires the clocks to the memory to be phase-shifted 90 degrees with respect to
data. Likewise, it is necessary to shift the echo clock, CQ, coming back from the SRAM, by 90 degrees. This keeps
the clocks in the center or “eye” of the data, providing ample setup and hold times. This is accomplished by a DLL
internal to qdr_ip_top.
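To give a feel for the size of the 90-degree shift, the following Python sketch converts a phase angle into a time offset for a given k_clk frequency. It is plain arithmetic and does not model the DLL. The function names are illustrative.

```python
def clock_period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def phase_shift_ns(freq_mhz, degrees=90.0):
    """Time offset corresponding to a phase shift at the given frequency."""
    return clock_period_ns(freq_mhz) * degrees / 360.0


for freq in (250, 375):
    print(freq, "MHz:", round(clock_period_ns(freq), 3), "ns period,",
          round(phase_shift_ns(freq), 3), "ns for 90 degrees")
```

At 250 MHz the period is 4 ns, so a 90-degree shift is 1 ns; at 375 MHz it is roughly 0.667 ns.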
I/O Signals
System Signals
Table 2 shows the system clock and signals. All signals are active-high unless otherwise noted. These signals are
in the reference clock domain. All other signals are in the k_clk clock domain.
Table 2. System Clock and Signals
Signal Name | Signal Direction (I/O) | Signal Width | Signal Description
clk | I | 1 | Reference Clock. This clock is the input to the QDR_IP PLL.
ff_rst_n | I | 1 | System Reset (active-low). Synchronized to K clock.
cfg_qdr_bmode (1) | I | 1 | QDR Burst Mode Configuration. 1 = 2 word burst (QDRII only); 0 = 4 word burst. This signal is static, not clocked. It should be assigned to the value set in the GUI.
1. This signal is set by one of the parameters set by the ispLEVER GUI.
User Application Interface Signals
The QDRII/II+ Controller core provides a simple FIFO-based interface to receive read and write commands. Since
the FIFOs reside in the soft IP, the user may change their depth via the GUI. By default, these FIFOs are four commands deep.
When the write enable signal for either FIFO goes high, the data present on the address and block length buses
will be written on the rising edge of k_clk. The width of the address bus can also be varied by a setting in the GUI,
to match the address bus of the memory in use. The maximum block length is fixed at 31.
Each FIFO provides empty and full signals. In addition, each provides full-minus-one and full-minus-two signals to
give advance warning and avoid loss of commands.
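The command FIFO status flags can be pictured with a small behavioral model. The Python sketch below is an illustration only, not the core's implementation. The class name CommandFifo is hypothetical, and the model assumes that the full-minus-one and full-minus-two flags are thresholds (asserted when at most one or two free entries remain), which this guide does not specify.

```python
from collections import deque


class CommandFifo:
    """Behavioral model of one command FIFO (hypothetical, for illustration)."""

    def __init__(self, depth=4):
        self.depth = depth          # default depth is four commands
        self.entries = deque()

    @property
    def empty(self):
        return len(self.entries) == 0

    @property
    def full(self):
        return len(self.entries) >= self.depth

    @property
    def full_m1(self):
        # Assumed threshold: at most one free entry remains.
        return len(self.entries) >= self.depth - 1

    @property
    def full_m2(self):
        # Assumed threshold: at most two free entries remain.
        return len(self.entries) >= self.depth - 2

    def write(self, addr, block_length):
        """Models one k_clk rising edge with the write enable high."""
        if not 0 <= block_length <= 31:
            raise ValueError("block length must fit in 5 bits")
        if self.full:
            raise OverflowError("command lost: FIFO is full")
        self.entries.append((addr, block_length))

    def read(self):
        """Removes the oldest command, as the controller would."""
        return self.entries.popleft()


fifo = CommandFifo(depth=4)
fifo.write(0x00100, 8)
fifo.write(0x00200, 31)
print(fifo.full_m2, fifo.full_m1, fifo.full)  # True False False
```

A user design that pipelines its write enable by one or two clocks would watch the corresponding early flag rather than the full flag, so that commands already in flight still find room.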
Two buses and two handshake signals are provided to manage data traffic. When the QDRII/II+ Controller drives
qdr_data_rdy active, it indicates that it has accepted the previous value on the write data bus and is ready for a new
one. When it drives qdr_read_data_valid active, it indicates that the data on the read bus is valid. The read and
write bus widths are configured via the GUI when the user selects either 18-bit or 36-bit memory data words.
Table 3 defines the signals that communicate data and control between the user application and the QDRII/II+ Controller core. All signals are active-high unless otherwise noted. These signals are in the k_clk clock domain.
Table 3. FPGA Application Interface I/O Signals
Signal Name | Signal Direction (I/O) | Signal Width | Description
qdr_write_addr | I | 17-20 | Write address.
qdr_write_block_length | I | 5 | Write block length.
qdr_wcmd_fifo_wenab | I | 1 | Causes write command address and block length to be written.
qdr_read_addr | I | 17-20 | Read address.
qdr_read_block_length | I | 5 | Read block length.
qdr_rcmd_fifo_wenab | I | 1 | Causes read command address and block length to be written.
qdr_write_data | I | 36/72 | Write data bus.
k_clk | O | 1 | Internal clock, derived from system clock, clk.
qdr_wcmd_fifo_empty | O | 1 | Write command FIFO is empty.
qdr_wcmd_fifo_full | O | 1 | Write command FIFO is full.
qdr_wcmd_fifo_full_m1 | O | 1 | Write command FIFO is almost full (minus 1).
qdr_wcmd_fifo_full_m2 | O | 1 | Write command FIFO is almost full (minus 2).
qdr_rcmd_fifo_empty | O | 1 | Read command FIFO is empty.
qdr_rcmd_fifo_full | O | 1 | Read command FIFO is full.
qdr_rcmd_fifo_full_m1 | O | 1 | Read command FIFO is almost full (minus 1).
qdr_rcmd_fifo_full_m2 | O | 1 | Read command FIFO is almost full (minus 2).
qdr_data_rdy | O | 1 | The controller has accepted the data on qdr_write_data.
qdr_read_data_valid | O | 1 | Accompanies valid data on bus qdr_read_data.
qdr_read_data | O | 36/72 | Read data bus.
QDR SRAM I/O Signals
This group of signals provides a standard interface to a QDRII/II+ SRAM device. The outputs consist of a clock and
its inverse, a read strobe, write strobe, address bus and a write data bus. The width of the address bus and data
bus are both configured via the GUI. The controller provides an internal DLL to shift the clocks by 90 degrees. This
is done to provide adequate setup and hold time for the SRAM address and data input.
The inputs consist of an echo clock and the read data bus. The width of the read data bus is also determined by a setting
in the GUI. The echo clock is sent from the QDRII/II+ SRAM along with the read data. This clock is used to account
for data-flight time across the board. Since both data and clock are in phase, the controller uses its internal DLL to
shift this clock by 90 degrees, ensuring adequate setup and hold time for the read data.
Table 4 shows the signals connecting the QDRII/II+ Controller to the QDR SRAM. All signals are active-high unless
otherwise noted.
Table 4. QDR SRAM Interface I/O Signals
Signal Name | Signal Direction (I/O) | Signal Width | Signal Description
K | O | 1 | Input. K is the Memory Controller clock, delayed 90º.
K_N | O | 1 | KN is the inverse of K.
A | O | 17-20 | Address bus.
D | O | 18/36 | Write data bus.
W_N | O | 1 | Active-LOW write enable.
R_N | O | 1 | Active-LOW read enable.
CQ | I | 1 | CQ is the clock for the read data bus, “Q”. Note: the CQ# signal from the QDRII/II+ SRAM is not used. Instead, both the rising and falling edges of CQ are used to clock incoming data.
Q | I | 18/36 | Read data bus.
QVLD | I | 1 | Valid signal for read data bus Q.
Reference Design/Test Bench
Lattice supplies a reference design along with the QDRII/II+ Controller core. While the core design is intended for
use “as is”, the reference design provides a framework for testing the core. In the absence of a real user
application, the reference design provides synchronization between the external and internal clock domains, and
pseudorandom data generation.
Using the supplied reference design and test bench as a guide, users can easily customize the verification of the
core by adding, removing, and customizing tests. The reference design is included in this package to demonstrate
how a design using the QDRII/II+ Controller core can be implemented.
Reference Design Block Diagram
Figure 3. Block Diagram of QDRII/II+ Controller Reference Design
[Figure: block diagram. Labels shown: QDR_TB_v2 (Simulation Test Environment); qdr_top (Reference Design); qdr_ip_top (QDR Memory Controller IP) with pll0 and dll0, ref_clk, k_clk and k_clk_90; MCTL in MACO ASIC Gates; FPGA Array with PRBS Data Generator & Checker (us_qdr_v2_prbs_1.v) on the User Side; Systembus (mpu_8_us_um) with USI Bus and MPI BUS; JTAG (TDI, TMS, TCK, TDO); Micron QDR Memory Module (mt54w512h36j) on the External Memory Side.]
The QDRII/II+ Controller reference design consists of the following blocks:
1. Pseudo-random data generator
2. System bus
3. JTAG
4. QDR IP module
5. Micron memory module test bench
The external QDRII/II+ SRAM Interface I/O signals run directly between the QDRII/II+ IP core and the pads. There
is no extra logic between them in the reference design. Their function is identical to that described in the previous
section.
QDRII/II+ MACO Memory Controller Design Kit Directory
The directory structure of the QDRII/II+ MACO Memory Controller IP, as generated by the IPexpress GUI, is shown
in Figure 4.
A more detailed description of the files generated, as well as information on installation, functional simulation, synthesis, design implementation and timing simulation, is given in the “readme.htm” file. This Readme file can be
invoked in IPexpress by clicking on the “Help” button of the GUI, as shown in Figure 5. It can also be found in the
qdr_maco_eval directory.
Figure 4. QDRII/II+ MACO IP Design Kit Directory Structure
[Figure: directory tree rooted at qdr_maco_eval/<username>. Directory names shown, in order: impl, precision, synplify, sim, aldec (rtl, script, timing), modelsim (rtl, script, timing, work), src, params, top, testbench, memory, top, help_files, models, scm, support.]
Figure 5. GUI Dialog Box for QDRII/II+ Memory Controller
Table 5. GUI Dialog Box for QDRII/II+ Memory Controller
Parameter | Description
Project Path | This is the directory in which the project will be generated
File Name | Enter the project name
Design Entry | The design entry mode is Verilog HDL
Device Family | The device family is LatticeSC
Part Name | Select the desired LatticeSC device size, speed grade and package
Figure 6. GUI Dialog Box for QDRII/II+ Memory Controller Clocks
Table 6. GUI Dialog Box for QDRII/II+ Memory Controller Clocks
Parameter | Description
Input Clock Frequency | Specify the frequency of the input clock to the memory controller.
Clock Multiplier | Set this value to the ratio of the desired Memory Controller Clock Frequency and the selected Input Clock Frequency.
Memory Controller Clock Frequency | The Memory Controller Clock Frequency is the operating frequency of the QDRII/II+ device. It is calculated by IPexpress, and is set to (Input Clock Frequency) * (Clock Multiplier). Result value is up to 375 MHz.
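The clock relationship in Table 6 can be checked by hand or with a few lines of Python. The sketch below is only a calculator for the formula given in the table. The function names and the 125 MHz input clock are illustrative assumptions, and the sketch does not model any restrictions the PLL may place on multiplier values.

```python
MAX_MEMORY_CLOCK_MHZ = 375.0  # upper limit stated in Table 6


def memory_clock_mhz(input_clock_mhz, multiplier):
    """(Input Clock Frequency) * (Clock Multiplier), checked against the limit."""
    freq = input_clock_mhz * multiplier
    if freq > MAX_MEMORY_CLOCK_MHZ:
        raise ValueError("%.1f MHz exceeds the %.1f MHz limit"
                         % (freq, MAX_MEMORY_CLOCK_MHZ))
    return freq


def required_multiplier(input_clock_mhz, target_clock_mhz):
    """Ratio of the desired memory clock to the input clock."""
    return target_clock_mhz / input_clock_mhz


print(required_multiplier(125.0, 375.0))  # 3.0
print(memory_clock_mhz(125.0, 3))         # 375.0
```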
Figure 7. GUI Dialog Box for QDRII/II+ Memory Controller Options
Table 7. GUI Dialog Box for QDRII/II+ Memory Controller Options
Parameter | Description
Memory Controller Type | QDRII or QDRII+
Data Width | Data bus width: 18 or 36 bits
Address Width | Address bus width: 17-20 bits
Burst Mode | 2-word or 4-word (depending on memory controller type)
Latency | 1.5, 2 or 2.5 (depending on memory controller type)
Write Command FIFO Depth | 4, 8, 16, 32 or 64
Read Command FIFO Depth | 4, 8, 16, 32 or 64
Use QVLD | Use QVLD (QDRII+ only)
Figure 8. GUI Dialog Box for QDRII/II+ Memory Controller Location
Table 8. GUI Dialog Box for QDRII/II+ Memory Controller Location
Parameter | Description
LL: Left MACO, Left Pinout | The left-side MACO is used for the QDRII/II+ controller, and the pinout is on the left side.
LC: Left MACO, CIB Pinout | The left-side MACO is used for the QDRII/II+ controller, and the pinout is CIB.
LB: Left MACO, Bottom Pinout | The left-side MACO is used for the QDRII/II+ controller, and the pinout is on the bottom side.
RB: Right MACO, Bottom Pinout | The right-side MACO is used for the QDRII/II+ controller, and the pinout is on the bottom side.
RC: Right MACO, CIB Pinout | The right-side MACO is used for the QDRII/II+ controller, and the pinout is CIB.
RR: Right MACO, Right Pinout | The right-side MACO is used for the QDRII/II+ controller, and the pinout is on the right side.
Add SMI Port Interface for PLL and DLL | Check this box to enable run-time access to PLL and DLL memory-mapped parameters.
Design Guidelines to Optimize Performance
Master Clock
The master reference clock can be sourced from any clock source, either internal or external to the LatticeSC
device. If the source is external, it should use the direct input pin for that PLL’s CLKI input (refer to Table 9). Also,
minimize clock jitter caused by coupling from noisy neighboring signals (refer to the accompanying discussion,
“Selecting a Pin That Has Low Jitter Noise” below). Note that the PLL will filter some of the jitter that exists at the
PLL’s input.
Implementation Details
The following section discusses implementation details, such as pinout selection, clock, PLL and DLL considerations, as well as PCB layout guidelines for optimum performance.
PCB Layout and On-Chip Pinout Considerations
This section discusses some areas of the QDRII/II+ Memory Controller design that require particular attention, and
offers recommendations that will lead to a more robust solution.
Master Clock and its PLL
• The Master Clock can originate from a variety of sources (input pin, another PLL, SERDES clock, FPGA logic,
etc.). This clock drives a PLL via any primary clock net.
• If the Master Clock is sourced by an input pin (or pin pair), use the pin(s) designated for the chosen PLL for that
purpose (refer to Table 9), and observe the recommendations below for minimizing jitter noise.
Clocking Challenges and Solutions
Figure 9 illustrates the clocking network. Several unique features of the LatticeSC architecture are utilized in this
design. A PLL [1] is used to perform frequency multiplication of the input clock “refclk”, and at the same time to generate a second clock that is 90° lagging, so that the clocks “K” and “K#” to the QDRII/II+ SRAM can transition in the
center of the data eye of bus “D”.
Both “k_clk” and “k_clk shifted 90°” are typically routed on primary clock nets so that there is very little skew from
the ideal 90° offset. The clocks “K” and “K#” are then generated using the same DDR output buffer elements as are
used in the buffers for output data and control signals, so that once again very little skew is introduced. These two
clocks are generated by simply sending a constant “10” pattern on outputs that are in every other respect identical
to the data and control outputs.
A Valid Timing Chain [2] generates a data valid signal at the correct time to line up with the returning read data by
duplicating the latency in the external QDRII/II+ SRAM and board routing. This is necessary because there is nothing returned from the QDRII/II+ SRAM with the read data to identify the valid data. Note that for 2-word bursts, the
valid is asserted for one full clock (two half-clocks), and for 4-word bursts, it is asserted for two full clocks (four half-clocks). Note also that the number of registers in the timing chain varies to match the read latency (1.5, 2.0 or 2.5)
of the QDRII/II+ SRAM. The Valid Timing Chain straddles two clock domains having the same frequency but different phases, and performs the clock domain transition between them. The phase difference represents all the cumulative delays in the external path: board trace delays (in both directions), and delay from K/K# to CQ/CQ#. The
clocking scheme described here can accommodate and compensate for approximately 1/2 clock cycle of variation
in this delay.
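The relationship between burst length and the duration of the data valid signal can be written out explicitly. The Python sketch below only restates the rule given above (one data word per half-clock) and does not model the timing chain itself. The function name is hypothetical.

```python
def valid_duration(burst_words):
    """Return (half_clocks, full_clocks) for which data valid is asserted."""
    if burst_words not in (2, 4):
        raise ValueError("burst length must be 2 or 4 words")
    half_clocks = burst_words       # one word is captured per clock edge
    full_clocks = half_clocks // 2  # two edges per full clock
    return half_clocks, full_clocks


for burst in (2, 4):
    half, full = valid_duration(burst)
    print("%d-word burst: valid for %d half-clocks (%d full clocks)"
          % (burst, half, full))
```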
The input registers for the read data bus “Q” and signal "QVLD" [6] require some special clocking, and this need is
handled by special hardware capability. The input bus registers have two clock inputs. The first, ECLK, is fed by the
edge clock, to receive the data at the earliest time, since the edge clock net has less delay and skew to the I/O registers than the primary clock net. But if this data were to be sent directly to a register clocked by the primary clock,
the receiving register’s input hold time could be violated. Therefore, the input register also takes a second input
clock, SCLK, which is fed the primary clock. The register does not output the data until this clock’s edge, avoiding
any hold time issues. This clock domain transfer mechanism is built into the LatticeSC input buffers, thus allowing
operation at the highest speeds.
For the return read data bus “Q” and its accompanying valid signal QVLD, a DLL [3] is employed to dynamically
generate a value that determines the proper delay to cause an effective 90° phase shift on CQ’s input buffer [5], so
that it too is positioned in the center of the data it captures. This takes advantage of the fact that the DLL and the
input buffers contain matching delay blocks, so that the delay selection value generated in the DLL when it generates a 90° shifted clock can also be used in the input buffer to cause the same phase shift. A 9-bit digital bus communicates this delay selection value from the DLL to the “CQ” input buffer.
The read data is then typically transferred from the “CQ” clock domain to the internal clock domain with the assistance of a synchronous FIFO.
For the “Q” data bus and signal QVLD, the delay elements [4] are also used in an “Edge Clock Injection Match”
mode. This compensates for the edge clock routing of the “CQ” input, thereby providing an optimal read data edge.
Manual changes to the input delay can also be made to each individual “Q” input pin to compensate for differences
in board trace delays.
Figure 9. QDRII+ SRAM Memory Interface Clock and Data Paths
[Figure: clock and data paths. [1] PLL (CLKI, CLKFB, CLKINTFB, CLKOP, CLKOS) driven by refclk, producing k_clk and k_clk shifted 90° on primary clock nets. ODDRXA output elements (inputs DA, DB) generate K, K#, A[19:0], WN, RN and D[35:0] from the Address FIFO and Write Data. [2] Register chain D1 to D5 from RN to Data valid. [3] DLL (CLKI, CLKOP, CLKOS, UPDT, DCNTL[8:0]). [4] Fixed delay on Q[35:0] and QVLD. [5] 90° phase delay on the CQ input (CQ# shown). [6] IDDRX1A input registers clocked by ECLK (edge clock) and SCLK (primary clock), feeding the Read Data FIFO.]
PCB Board Trace Matching
• All A, D, WN, RN and K/K# pins must have PCB board trace lengths matched to within 50 psec.
• All Q, QVLD and CQ pins must have PCB board trace lengths matched to within 50 psec.
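To put the 50 psec matching requirement in context, the following Python sketch expresses a skew as a fraction of the bit time, taking the bit time as half a clock period because data moves on both clock edges. This is illustrative arithmetic only, and the function names are not part of the design kit.

```python
def bit_time_ps(clock_mhz):
    """Duration of one data bit in picoseconds (half a clock period)."""
    return 1.0e6 / (2.0 * clock_mhz)


def skew_fraction(clock_mhz, skew_ps=50.0):
    """Fraction of the bit time consumed by a given trace skew."""
    return skew_ps / bit_time_ps(clock_mhz)


for freq in (250, 375):
    print("%d MHz: bit time %.1f ps, 50 ps skew uses %.2f%% of it"
          % (freq, bit_time_ps(freq), 100.0 * skew_fraction(freq)))
```

At 375 MHz the bit time is about 1333 ps, so a 50 ps mismatch consumes 3.75% of it before any other timing uncertainty is counted.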
Other Board-Level Considerations
• All dynamic signal traces must be 50 Ohm transmission lines.
• All power signals, including any VTT power, must be supplied by planes, not traces.
• Care must be taken to keep reference voltages, such as the QDRII/II+ device’s VREF pin, noise-free. This
involves robust, wide-bandwidth decoupling, and isolation of quiet, noise-sensitive signals from noise sources.
• The physical distance between the LatticeSC device and the QDRII/II+ needs to be minimized, since trace
delays, skews and signal degradation will limit overall speed, as previously discussed.
Selecting a Pin That Has Low Jitter Noise
When a signal, such as an input clock or the QDRII/II+ clock K/K#, needs to be especially quiet, with low jitter, some
special design rules can help achieve this goal:
• It is highly preferable to place the pin in a bank that does not also contain single-ended output drivers. Figure 10
shows how bank groups form clusters around the package, in this case for a 256-pin fpBGA.
• If a quiet bank cannot be used, avoid creating inductively coupled paths linked to noisy signals on the package.
These occur when the low-noise signal trace passes through an area on the package substrate from pin to pad
that contains noisy signal pins or traces (in particular single-ended outputs, and especially when those single-ended outputs are unterminated). Figure 10 also illustrates this concept. Two examples are shown:
– Example A shows a noisy output pin (G12, bank 2) that is near the package center, and a low-noise clock pin
(F16, bank 3) that is situated radially outward from that pin. In this case, the pin-to-pad connection for the
clock will route directly past the noisy output pin, resulting in coupled noise. This should be avoided.
– Example B demonstrates the reverse situation, which is also to be avoided. In this case, a noisy output pin
(M16, bank 3) is situated radially outward from a low-noise clock pin (L12, bank 4), so that the noisy output’s
pad-to-pin connection will pass over the clock pin.
– In order to minimize this coupling, it is typically better to place noise-sensitive pins toward the center of the
package. This reduces the trace length of this signal in the package, thus reducing coupling to this signal.
• Noise immunity may be further enhanced by providing extra “ground” pins around the sensitive signal, by driving
adjacent outputs to a constant LO and tying them to signal ground on the PCB. This can enhance noise immunity
in two ways: first, it provides extra signal current return paths, and second, it provides a buffer distance to nearby
signal pins, thus reducing coupling to their signals. The buffers should be set to the maximum drive strength
allowed at the bank’s VCCIO voltage.
Figure 10. Selecting a Pin for Low Jitter Noise
[Figure: pin map of the package (rows A to T, columns 1 to 16) with each pin labeled by its I/O bank number; for example, “7” indicates I/O bank 7. Example A and Example B are marked, each showing a noisy single-ended output and a low-noise clock input.]
Optimum Pinout Selection
In order to ensure that the demanding I/O timing requirements of QDRII/II+ devices will always be met, dedicated
signal paths from the MACO core to the I/O pins have been designed into the LatticeSCM devices. If the designer
chooses to use these optimized locations, I/O timing can be guaranteed, and will not change each time the device
undergoes map/place/route. These designated pinout assignments are given in Tables 11 (for the left-side MACO)
and 12 (for the right-side MACO). In addition, some flexibility has been provided by offering two sets of locations,
one on the side edge and one on the bottom edge, so that conflicts with other pinout placement requirements can
be resolved. Note that these special routings only apply to the signals that connect to the MACO block (address
and control); other signals (data and their clocks) have more freedom of placement, restricted only by the need to
place complete lanes in a single I/O bank adjacent to the PLLs and DLLs, as described previously. In addition to
the two pinout options described above, a third option is provided that interfaces the signals to the general FPGA
routing fabric. This allows the signals to be routed to any pin, or even to FPGA logic, albeit at the penalty of additional and variable routing delay. This option should only be considered when the QDRII/II+ Memory Controller is
being operated well below its maximum operating frequency.
General Considerations
• Lattice recommends simulation of Simultaneous Switching Outputs (SSOs) for the device/package combination
for performance targeted to over 200 MHz.
• Lattice also recommends that the LatticeSC device’s design be placed and routed before commitment of the
board design to manufacture.
Setting Design Timing Constraints
In order to ensure that a design will meet a specific speed requirement, the requirement must be called out as a
preference in the *.lpf file. The design kit gives an example of how this is done, and the values simply need to be
adjusted to meet the specific design’s requirements.
Note that the internal name of a clock net can change if the design is modified or if the synthesis engine version is
changed. In this case, the net names given in the design example will not be correct. To find the new net name, run
the synthesis flow through the map phase, and inspect the Map Report (*.mrp) file. It will list all the clock nets that
the mapper detected. Find the new net name in question and put it in the preference file in place of the old name.
Preferred Pinouts
The tables below show connections from I/O to logic that have been designed-in to be fast and consistent, so that
special signals such as clocks and timing-critical I/O can be guaranteed to always meet requirements. Tables 9 and
10 give the designated pins for driving the PLLs and DLLs respectively. This information is extracted from the
pinout tables in the LatticeSC Family Data Sheet. Tables 11 and 12 show the designated optimum-performance
pins for interfacing the QDRII/II+ Memory Controller to the QDRII/II+ device, for the left-side and right-side MACO
respectively.
Table 9. PLL Direct Input Pins (True/Complement Pair)
PLL          F900         FF1020       FC1152       FC1704
ULC PLL A    D3/D2        K25/J25      F30/G30      J37/J38
ULC PLL B    K4/J4        M23/N23      N25/P25      N33/P33
LLC PLL B    AC6/AC7      AC23/AD24    AG29/AG28    AN36/AP36
LLC PLL A    AH1/AJ1      AJ32/AK32    AM33/AN33    AU42/AV42
LRC PLL A    AJ30/AH30    AJ1/AK1      AN2/AM2      AV1/AU1
LRC PLL B    AD26/AC25    AC10/AD9     AG6/AG7      AN7/AP7
URC PLL B    K25/K26      M10/N10      N10/P10      N10/P10
URC PLL A    D28/E28      K8/J8        F5/G5        J6/J5
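As a rough illustration of how Table 9 might be used when preparing pin preferences, the sketch below stores two of its rows and splits an entry into its true and complement pins. The port name `ref_clk` is hypothetical, and the `LOCATE COMP ... SITE ...` line is an assumed preference form; confirm the exact syntax in the ispLEVER documentation before relying on it.

```python
# Two rows copied from Table 9; the remaining rows follow the same shape.
PLL_INPUT_PINS = {
    "ULC PLL A": {"F900": "D3/D2", "FF1020": "K25/J25",
                  "FC1152": "F30/G30", "FC1704": "J37/J38"},
    "LRC PLL A": {"F900": "AJ30/AH30", "FF1020": "AJ1/AK1",
                  "FC1152": "AN2/AM2", "FC1704": "AV1/AU1"},
}


def pll_input_pair(pll: str, package: str):
    """Return the (true, complement) pin pair for a PLL and package."""
    true_pin, complement_pin = PLL_INPUT_PINS[pll][package].split("/")
    return true_pin, complement_pin


def locate_preference(port_name: str, pll: str, package: str) -> str:
    """Format an assumed LOCATE preference for the true pin of the pair."""
    true_pin, _complement_pin = pll_input_pair(pll, package)
    return 'LOCATE COMP "{}" SITE "{}" ;'.format(port_name, true_pin)


print(pll_input_pair("ULC PLL A", "FC1152"))
print(locate_preference("ref_clk", "ULC PLL A", "FC1152"))
```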
Table 10. DLL Direct Input Pins (True/Complement Pair)
DLL          F900         FF1020       FC1152       FC1704
ULC DLL C    E3/E2        D32/D31      F31/G31      G40/H40
ULC DLL D    F3/G3        E32/E31      D33/E33      G41/H41
LLC DLL E    AB6/AC5      AE26/AE27    AJ30/AK30    AL37/AM37
LLC DLL F    AF2/AG2      AG32/AG31    AL32/AL31    AR39/AR40
LLC DLL C    AF4/AE5      AF27/AG28    AH29/AJ29    AL33/AL34
LLC DLL D    AG3/AH2      AK31/AL31    AM32/AM31    AU38/AV38
LRC DLL C    AJ29/AH29    AL2/AK2      AM3/AM4      AV2/AW2
LRC DLL D    AG28/AG29    AJ2/AH3      AJ6/AH6      AL10/AL9
LRC DLL F    AF29/AF28    AG1/AG2      AL3/AL4      AR4/AR3
LRC DLL E    AB26/AC26    AE7/AE6      AJ5/AK5      AL6/AM6
URC DLL D    G28/F28      E1/E2        D2/E2        G2/H2
URC DLL C    D29/D30      D1/D2        F4/G4        G3/H3
Table 11. Preferred Pinout for Left Side Memory Controller
QDR/QDRII   Bottom Edge Preferred Pinout            Left Edge Preferred Pinout
Port        SC25 900  All 1020  All 1152  All 1704  SC25 900  All 1020  All 1152  All 1704
W_N         AE5       AG28      AJ29      AL34      V4        W25       AA24      AG29
R_N         AJ1       AK32      AN33      AV42      V5        Y26       Y24       AF29
A[0]        AH4       AJ28      AN31      AW40      U5        W29       AA33      AD39
A[1]        AG5       AK28      AN30      AY40      U4        W30       Y33       AC39
A[2]        AF8       AJ31      AP31      AW39      T4        V30       Y31       AB42
A[3]        AG8       AH30      AP30      AW38      T5        V29       W31       AA42
A[4]        AH3       AM30      AM29      AV37      U1        V31       W33       AB38
A[5]        AJ3       AM29      AM28      AV36      T1        V32       V33       AA38
A[6]        AF9       AH29      AJ27      AM31      V3        U31       V34       Y41
A[7]        AE10      AH28      AJ26      AM32      U3        U32       U34       W41
A[8]        AK3       AJ27      AP29      BA40      T6        T27       V25       AA36
A[9]        AJ4       AK27      AP28      BB40      U2        T32       U33       Y40
A[10]       AE11      AL28      AN29      BA39      T2        T31       T33       W40
A[11]       AF10      AL27      AN28      BA38      R4        U24       Y27       AC32
A[12]       AH7       AM28      AL26      AW36      R1        R32       W30       Y39
A[13]       AH8       AM27      AL25      AW35      P1        R31       V30       W39
A[14]       AE12      AG23      AG23      AM28      R3        T26       V28       AB35
A[15]       AE13      AF22      AG22      AL28      R2        R29       T34       Y38
A[16]       AK4       AG26      AN27      AV35      P2        R30       R34       W38
A[17]       AK5       AG25      AN26      AV34      P3        P31       U30       V42
A[18]       AJ5       AL26      AP27      AY36      N3        P32       T30       U42
A[19]       AJ6       AM26      AP26      AY35      R6        T24       V29       W36
Table 12. Preferred Pinout for Right Side Memory Controller
QDR/QDRII   Bottom Edge Preferred Pinout                      Right Edge Preferred Pinout
Port        SC15 900  SC25 900  All 1020  All 1152  All 1704  SC15 900  SC25 900  All 1020  All 1152  All 1704
W_N         AD25      AH30      AK1       AM2       AU1       Y30       W26       W8        AA11      AG14
R_N         AE26      AG29      AH3       AH6       AL9       AA30      V26       Y7        Y11       AF14
A[0]        AK28      AF25      AJ5       AN3       AW3       T30       T27       W4        AA2       AD4
A[1]        AH21      AG25      AK5       AP3       AY3       W28       R27       W3        Y2        AC4
A[2]        AH23      AG24      AH4       AM6       BA2       U26       V27       V3        Y4        AB1
A[3]        AH22      AF24      AH5       AM7       AY2       U28       U27       V4        W4        AA1
A[4]        AG22      AH27      AM3       AP4       AV6       M30       R30       V2        W2        AB5
A[5]        AG21      AH26      AM4       AP5       AV7       R29       P30       V1        V2        AA5
A[6]        AF21      AE22      AF10      AK9       AN11      P29       U29       U2        V1        Y2
A[7]        AE21      AK29      AJ6       AN6       AY4       P27       T29       U1        U1        W2
A[8]        AE20      AK28      AK6       AN7       AY5       N29       T24       T6        V10       AA7
A[9]        AK25      AH25      AG8       AP6       BA4       N28       N30       T1        U2        Y3
A[10]       AH19      AH24      AG7       AP7       BA5       R25       M29       T2        T2        W3
A[11]       AK23      AE23      AL5       AN8       BB4       R28       U26       U9        Y8        AC11
A[12]       AJ21      AD23      AL6       AN9       BB5       N27       U28       R1        W5        Y4
A[13]       AG18      AH21      AC12      AF12      AT10      L30       T28       R2        V5        W4
A[14]       AK21      AH23      AM5       AL9       AV8       J30       W30       AA1       AG2       AK3
A[15]       AJ19      AH22      AM6       AL10      AV9       M26       Y27       AB6       AC6       AJ9
A[16]       AJ18      AG22      AE12      AP8       AY7       G29       W27       AC6       AD6       AK9
A[17]       AG17      AG21      AD12      AP9       AY8       F29       AA30      AC2       AF4       AK5
A[18]       AH18      AF21      AJ8       AM9       AV10      H28       AA25      AD4       AH2       AL1
A[19]       AH17      AE21      AK8       AM10      AV11      J28       AB25      AD3       AJ2       AM1
Timing Specifications
The timing diagrams in Figures 11 and 12 below show the timing on the QDRII/II+ device interface, and Figures 13
and 14 add the timing on the user interface for command and data.
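Figure 11. QDRII/II+ SRAM Interface Timing (4-Word Burst Mode)
[Timing diagram over clock cycles 0 to 9. Signals shown: K, K#, R#, W#, Address[19:0], D[35:0], CQ, Q[35:0], QVLD. Two reads (R1, R2) and two writes (W1, W2) are issued. Each write transfers four words on D (W1a to W1d, W2a to W2d), and each read returns four words on Q (R1a to R1d, R2a to R2d).]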
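Figure 12. QDRII SRAM Interface Timing (2-Word Burst Mode)
[Timing diagram over clock cycles 0 to 9. Signals shown: K, K#, R#, W#, Address[19:0], D[35:0], CQ, Q[35:0]. Four reads (R1 to R4) and four writes (W1 to W4) are issued, with read and write addresses alternating on Address[19:0]. Each write transfers two words on D (W1a, W1b through W4a, W4b), and each read returns two words on Q (R1a, R1b through R4a, R4b).]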
Figure 13. Complete 4-Word Write/Read Sequence, Reading Back Just-Written Data
[Timing diagram over clock cycles 0 to 20. User-interface signals shown: k_clk, qdr_wcmd_fifo_wenab, qdr_write_data_ready, qdr_write_data[71:0] (WD0, WD1), qdr_wcmd_fifo_empty, qdr_rcmd_fifo_wenab, qdr_read_data_valid, qdr_read_data[71:0] (RD0, RD1), qdr_rcmd_fifo_empty. Memory-interface signals shown: K, WN, RN, A[19:0] (WA, RA), D[35:0] (WD0a, WD0b, WD1a, WD1b), CQ, Q[35:0] (RD0a, RD0b, RD1a, RD1b), QVLD.]
Figure 14. Complete 2-Word Write/Read Sequence, Reading Back Just-Written Data
[Timing diagram over clock cycles 0 to 15. User-interface signals shown: k_clk, qdr_wcmd_fifo_wenab, qdr_write_data_ready, qdr_write_data[71:0] (WD0), qdr_wcmd_fifo_empty, qdr_rcmd_fifo_wenab, qdr_read_data_valid, qdr_read_data[71:0] (RD0), qdr_rcmd_fifo_empty. Memory-interface signals shown: K, WN, RN, A[19:0] (RA, WA), D[35:0] (WD0a, WD0b), CQ, Q[35:0] (RD0a, RD0b).]
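Figures 13 and 14 show each 72-bit user-interface word (for example WD0) appearing on the 36-bit memory bus as two words (WD0a, WD0b). The sketch below models that split and the matching merge on the read side. The figures do not state which half of the user word is transferred first, so sending the low 36 bits first is purely an assumption here, and the function names are hypothetical.

```python
from typing import Tuple

USER_BITS = 72      # width of qdr_write_data / qdr_read_data in the figures
MEMORY_BITS = 36    # width of D[35:0] / Q[35:0] in the figures
MEMORY_MASK = (1 << MEMORY_BITS) - 1


def split_user_word(user_word: int) -> Tuple[int, int]:
    """Split one 72-bit user word into two 36-bit memory words (a, b).

    Assumption: word "a" carries the low 36 bits and word "b" the high
    36 bits. The source figures do not specify this ordering.
    """
    if not 0 <= user_word < (1 << USER_BITS):
        raise ValueError("user_word must fit in 72 bits")
    word_a = user_word & MEMORY_MASK
    word_b = (user_word >> MEMORY_BITS) & MEMORY_MASK
    return word_a, word_b


def merge_memory_words(word_a: int, word_b: int) -> int:
    """Rebuild a 72-bit user word from two 36-bit memory words."""
    return ((word_b & MEMORY_MASK) << MEMORY_BITS) | (word_a & MEMORY_MASK)


wd0 = (5 << MEMORY_BITS) | 7
wd0a, wd0b = split_user_word(wd0)
print(wd0a, wd0b, merge_memory_words(wd0a, wd0b) == wd0)
```

Under this model a 4-word burst carries two user words (WD0, WD1) and a 2-word burst carries one (WD0), which matches the labels in the two figures.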
QDRII/II+ Memory Controller Performance
Table 13 lists the bandwidth performance per data bit for the various LatticeSCM packages, device supply voltages,
and device speed grades. All timing is at a junction temperature of 105°C and below.
Table 13. QDRII/II+ Memory Controller Performance
            VCC = 1.0V ±5%        VCC = 1.2V ±5%
Package     -5     -6     -7      -5     -6     -7      Units
QDRII       200    250    250     250    250    250     MHz
            400    500    500     500    500    500     Mbps
QDRII+      275    325    350     325    350    375     MHz
            550    650    700     650    700    750     Mbps
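Each Mbps figure in Table 13 is twice the corresponding clock frequency, which is what you would expect when data is transferred on both clock edges. The sketch below records the table's MHz values and derives the per-bit rate from them. The `port_bandwidth_mbps` helper, which scales the per-bit rate by a bus width, is an illustrative extension and not a figure taken from this guide.

```python
# Clock frequencies in MHz from Table 13, keyed by (type, VCC, speed grade).
CLOCK_MHZ = {
    ("QDRII", "1.0V", "-5"): 200, ("QDRII", "1.0V", "-6"): 250,
    ("QDRII", "1.0V", "-7"): 250, ("QDRII", "1.2V", "-5"): 250,
    ("QDRII", "1.2V", "-6"): 250, ("QDRII", "1.2V", "-7"): 250,
    ("QDRII+", "1.0V", "-5"): 275, ("QDRII+", "1.0V", "-6"): 325,
    ("QDRII+", "1.0V", "-7"): 350, ("QDRII+", "1.2V", "-5"): 325,
    ("QDRII+", "1.2V", "-6"): 350, ("QDRII+", "1.2V", "-7"): 375,
}


def per_bit_mbps(clock_mhz: int) -> int:
    """Data rate per data bit: two transfers per clock cycle."""
    return 2 * clock_mhz


def port_bandwidth_mbps(clock_mhz: int, data_width_bits: int) -> int:
    """Illustrative aggregate rate for one data port of the given width."""
    return per_bit_mbps(clock_mhz) * data_width_bits


fastest = CLOCK_MHZ[("QDRII+", "1.2V", "-7")]
print(per_bit_mbps(fastest))             # per-bit rate for the fastest entry
print(port_bandwidth_mbps(fastest, 36))  # assumed 36-bit port, one direction
```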
QDRII/II+ Memory Controller On-Chip Resources
Figure 15 illustrates some of the resources on the LatticeSCM device that are available to the QDRII/II+ Memory
Controller, including:
• Seven banks of I/O pins;
• Dedicated routing to two sets of pins from each Memory Controller MACO block;
• Edge Clock buses containing eight clock lines per bus (shown), and two DCNTL buses per bank (not shown);
• PLLs for clock conditioning (up/down frequency shifting, duty cycle/phase adjusting, jitter filtering, etc.);
• DLLs for phase and delay adjustment.
Figure 15. MACO Memory Controller Resources
[Block diagram of the device showing Banks 1 through 7, each with an Edge Clock Bus (8); PLLs A and B in each corner (UL, UR, LL, LR); DLLs C and D in the upper corners and DLLs C, D, E, and F in the lower corners; the Left and Right MACO Memory Controllers, each with a Bottom Pinout and a Side Pinout; and the Quad SERDES blocks.]
Conclusion
Applications using QDRII/II+ SRAM are becoming popular in FPGA designs. LatticeSC MACO devices offer a
proven, flexible, high-performance interface to these SRAMs, with consistent timing margins to meet your design
needs. The ease of integration gives the FPGA designer the freedom to choose among different SRAM variants and
reduces the risks of system complexity.
References
• MT54W2MH8J, MT54W1MH18J, MT54W512H36J, 18Mb QDR-II SRAM 4-Word Burst, Micron Technology, Inc.,
2003.
• K7R323684B, K7R321884B, 1Mx36 & 2Mx18 QDR-II b4 SRAM, Samsung Electronics Co. LTD., Dec. 2003, Rev
2.0.
• CY7C1411AV18, CY7C1413AV18, CY7C1415AV18, 36-Mbit QDR-II SRAM 4-Word Burst Architecture, Cypress
Semiconductor Corp., Feb. 11, 2005.
• QDRII/II+ Evaluation Board Demonstration Design
• Lattice technical note TN1033, High-Speed PCB Design Considerations
Technical Support Assistance
Hotline: 1-800-LATTICE (North America)
+1-503-268-8001 (Outside North America)
e-mail: [email protected]
Internet: www.latticesemi.com
Revision History
Date            Version  Change Summary
April 2006      01.0     Initial release.
August 2007     01.1     References to LatticeSC changed to LatticeSCM.
September 2007  01.2     Added QDRII+ documentation support.
February 2008   01.3     Updated Features bullets.
March 2008      01.4     Updated GUI Dialog Box for QDRII/II+ Memory Controller Clocks table.
June 2008       01.5     Title changed from “LatticeSCM QDRII/II+ SRAM Controller MACO Core User’s Guide” to “QDRII+ SRAM Controller MACO Core User’s Guide”. Updated Features bullets.
Appendix for LatticeSCM FPGAs
Table 14. Performance and Resource Utilization (1)
Configuration
Type     Data Width  Address Width  Rd/Wr FIFO Depth  Latency  Burst Mode  Slices  LUT4s  Registers  PIOs
QDRII+   18          18             4/4               2.5      4           230     297    233        194
QDRII+   36          18             4/4               2.0      4           342     406    382        194
QDRII    18          18             64/64             1.5      2           453     717    242        194
1. Performance and utilization characteristics are generated using Lattice’s ispLEVER® 7.0 software. When using this IP core with different
software or in a different speed grade, performance may vary.
Ordering Part Number
All MACO IP, including the Ethernet flexiMAC™ Core, is pre-engineered and hardwired into the MACO structured
ASIC blocks of the LatticeSCM family of parts. Each LatticeSCM device contains a different collection of MACO IP.
Larger FPGA devices will have more instances of MACO IP. Please refer to the Lattice web pages on LatticeSCM
and MACO IP or see your local Lattice sales office for more information.
All MACO IP is licensed free of charge; however, a license key is required. See your local Lattice sales office for the
license key.