DDR/DDR2 SDRAM Controller MACO Core User Guide

DDR/DDR2 SDRAM Controller MACO Cores
User’s Guide
May 2010
ipug46_01.8
DDR/DDR2 SDRAM Controller
MACO Cores User’s Guide
Lattice Semiconductor
Introduction
Lattice’s DDR/DDR2 Memory Controller MACO™ IP core assists the FPGA designer by providing pre-tested, reusable functions that can be easily plugged in, freeing the designer to focus on system architecture design. These
blocks eliminate the need to “re-invent the wheel,” by providing industry-standard DDR and DDR2 memory controller modules. These proven cores are optimized utilizing the LatticeSCM™ device’s MACO architecture, resulting in
fast, small cores that utilize the latest architecture to its fullest.
Figure 1. Lattice MACO Conceptual Diagram
[Diagram labels: Lattice IPexpress; Lattice DDR/DDR2 MACO Solution; LatticeSCM; MACO; Soft IP; FPGA Fabric; User Logic Interface; Memory Interface; PLL; DLL]
The Lattice ispLEVER® software is complemented by the IPexpress™ utility, which generates a number of user-customizable cores. The utility guides the designer through a parameterized design flow and can be used to generate new configurations of this IP core. The GUI prompts for specific information on bus size, clocking, and memory device requirements, which is compiled into the FPGA design database. The utility then generates the templates and HDL-specific files needed to synthesize the FPGA design.
IPexpress, the Lattice IP configuration utility, is included as a standard feature of the ispLEVER design tools.
Details regarding the usage of IPexpress can be found in the IPexpress and ispLEVER online Help systems. For
more information on the ispLEVER design tools, visit the Lattice web site at www.latticesemi.com/software.
Overview
The DDR/DDR2 Synchronous Dynamic Random Access Memory (SDRAM) Controller is a general-purpose memory controller that interfaces with industry standard DDR/DDR2 SDRAM devices and modules. The Lattice Semiconductor DDR SDRAM Controller is a parameterized core that provides the flexibility for modifying data widths,
burst transfer rates, and CAS latency settings in a design. It provides a simple command interface for application
logic. The controller can be configured to function as a DDR only or DDR2 memory controller.
The memory controller comprises an FPGA logic block and an ASIC block. The FPGA logic is sometimes referred
to as the “soft IP” because it is programmed into the FPGA along with the user application. The embedded ASIC
block is called the MACO “hard IP”, because as an ASIC, it is an unmodifiable part of the device. Two (one on SC15) DDR MACO sites are available on the device.
Lattice technical note TN1099, LatticeSC DDR/DDR2 SDRAM Memory Interface User’s Guide covers topics such
as modes of operation, I/O buffer and termination issues, system clocking and timing.
In addition to supporting all the features of regular DDR memory, DDR2 memory also supports:
• The posted CAS functionality to maximize data throughput when successive read/write commands with auto precharge are presented to the memory.
• An on-die termination resistor. The on/off state of this resistor is controlled by a signal driven by the controller.
This user’s guide explains the functionality of the Lattice DDR Controller IP core.
Features
• Interfaces to industry standard DDR and DDR2 SDRAM
• Programmable burst length of 4 or 8
• Posted CAS functionality
• ODT signal generation
• Programmable CAS latency of 3 or higher
• Intelligent bank management to minimize ACTIVE commands
• Synchronous implementation
• Command pipeline to maximize throughput
• Supports SDRAM data path widths of 8, 16, 32, 40, 64 and 72 bits. Data width of 72 is supported in flip-chip or
wire bond packages only with single-ended DQS. Maximum data width with differential mode DQS is 40 in wire
bond packages.
• Varying address widths for different memory devices
• Programmable timing parameters
• Internal core frequency and DDR-2 DRAM frequency of 333MHz with two chip selects used
• Byte-level writing through data mask signals
• Supports both true and complementary DQS during write (for a maximum of 40 data bits). During read, the complementary pin is unused.
• Maximum of two chip selects (includes the capability for both chip selects to be de-selected to allow for other chip
selects to be added via FPGA gates)
• Supports PCB trace lengths of up to eight inches.
Design Kit Deliverables
• Sample instantiation (template)
• Synthesis black box for MACO core
• Pre-compiled ModelSim® MACO core model
• Verilog core source code
• Evaluation design
– Verilog testbench
• Preference files
Getting Started
Requirements to implement a MACO core include:
• ispLEVER version 6.1 SP2 or later
• MACO Design Kit: see the ReadMe file supplied with the IPexpress DDR/DDR2 MACO Kit for details on the Kit’s
contents
• MACO License File
• See the IPexpress Tutorial for more information on the ispLEVER design flow
For information on obtaining the above requirements, please contact your local Lattice Semiconductor sales representative.
Functional Description
DDR/DDR2 SDRAM is similar in function to regular SDRAM, but doubles the bandwidth of the memory by transferring data twice per cycle, on both the rising and falling edges of the clock signal.
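The doubling of bandwidth can be made concrete with a small calculation. The following is a minimal sketch, not part of the core: the function names are hypothetical, and the 200 MHz clock and 32-bit data path are example values only. The user-side bus width follows the DSIZE = 2 x DATA_WIDTH relationship stated in Table 3.

```python
def peak_bandwidth_bytes_per_s(clock_hz: int, data_width_bits: int) -> int:
    """Peak data-bus throughput: one transfer on each clock edge."""
    transfers_per_second = 2 * clock_hz          # rising + falling edge
    return transfers_per_second * data_width_bits // 8


def user_bus_width_bits(data_width_bits: int) -> int:
    """Width of the user-side data bus (DSIZE = 2 * DATA_WIDTH)."""
    return 2 * data_width_bits


if __name__ == "__main__":
    # Example values: 200 MHz memory clock, 32-bit SDRAM data path
    print(peak_bandwidth_bytes_per_s(200_000_000, 32))  # 1600000000
    print(user_bus_width_bits(32))                      # 64
```

The figure is a theoretical peak; refresh, precharge/activate overhead and bus turnaround reduce the sustained rate.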
The memory controller core provides a generic command interface to the user’s application. This interface reduces
the effort to integrate the module with the user’s design and minimizes the need to deal with the DDR/DDR2
SDRAM command interface. The timing parameters for the memory can be set through the signals that are input to
the core. This enables the user to switch between different memory devices and/or to modify the timing parameters
to suit the application using the IPexpress utility.
While most of the functionality of the memory controller remains the same for both DDR and DDR2 mode, certain
differences exist.
Table 1. Basic Differences Between DDR and DDR2
Feature | DDR Mode | DDR2 Mode
CAS Latency | 1, 2, or 3 clocks | 2, 3, 4 or 5 clocks
Write Latency | 1 clock | Read Latency - 1
Burst Length | 2, 4, 8 words | 4, 8 words
DQS as differential signals | No | Yes
Redundant DQS for read data (RDQS, RDQS#) | No | Yes (only for 32x8 configuration)
Ability to interrupt 8-word burst (write or read) | No | Yes
No. of banks per device | 4 | 4 or 8
On Die Termination | NA | Supported
Posted CAS Additive Latency Mode | NA | Supported
Top Level Block Diagram
Figure 2. DDR_IP_TOP Module
[Block diagram of the DDR/DDR2 Memory Controller and its port groups.
Internal clocks and PLL status ports: ddr_ref_clk, ddr_k_clk, ddr_k90_clk, ddr_k2_clk, ddr_k3_clk, ddr_kclk_lock, ddr_k2clk_lock, ddr_dll_lock
Resets: ddr_rst_n
FPGA side ports: ddr_cmd_[3:0], ddr_cmd_valid, ddr_cmd_rdy, ddr_burst_length_[4:0], ddr_burst_terminate, ddr_init_start, ddr_init_done, ddr_addr_[(RA_WIDTH + CA_WIDTH + BSIZE) -1:0], ddr_write_data_[(2*DATA_WIDTH -1):0], ddr_dm_[(DATA_WIDTH/4 -1):0], ddr_data_rdy, ddr_read_data_[(2*DATA_WIDTH -1):0], ddr_read_data_valid
Memory interface ports: DM_[(DATA_WIDTH/8 -1):0], A_[(RA_WIDTH -1):0], BA_[(2 or 1):0], RAS_N, CAS_N, WE_N, CKE, CLK, CLK_N, buses of width DATA_WIDTH, DATA_WIDTH/8 and CS_WIDTH, and a CS_WIDTH-wide bus marked "DDR2 Mode Only"]
Figure 3. DDR_IP_TOP Detailed Diagram of DDR Controller
[Detailed block diagram. The IPexpress GUI parameters (tRAS[4:0], tRC[4:0], tRCD[2:0], tRRD[2:0], tRFC[5:0], tRP[2:0], tMRD[2:0], tWR[2:0], tREFI[15:0], tWTR[2:0], tRTP[1:0], tFAW[4:0], tCKP[7:0], init_cas_latency[2:0], ar_burst_en[2:0]) feed the DDR MACO (ASIC gates). In the FPGA array, the cmd_io and data_io blocks, PLL1, PLL2, the DLL (DCNTL[8:0]) and CLKCNTL connect the MACO to the memory pins (CKE, CS_N, RAS_N, CAS_N, WE_N, ODT, A, BA, DQS, CLK, CLK_N) and to the user-side ports (ddr_init_start, ddr_init_done, ddr_cmd[3:0], ddr_cmd_valid, ddr_cmd_rdy, ddr_burst_length[4:0], ddr_burst_terminate, ddr_addr, ddr_write_data, ddr_dm, ddr_data_rdy, ddr_read_data, ddr_read_data_valid). Clock annotations: k_clk (ref_clk*2), k4_clk (k_clk + 90), k2_clk (match DQS trace delay), k3_clk (match DQS delay), k_clk_in_dly.]
The DDR Controller core includes the following functional blocks:
1. DDR MACO hard-core of the LatticeSCM device.
2. cmd_io – instantiates I/Os for memory device command and address bus
3. data_io – instantiates I/Os for memory device data bus
4. PLLs and DLLs
5. Clock control and clock detection logic blocks
The MACO hard core includes three blocks as shown in Figure 4.
Figure 4. DDR MACO Block Diagram
[Block diagram: Initialization Module, Command Application Logic and Command Decode Logic, placed between the User Interface and the Soft IP Interface]
Command Decode Logic
This module decodes the commands presented by the user and places them into one of the two internal queues.
The controller asserts the ddr_cmd_rdy signal whenever it is capable of accepting a new command from the user.
To ensure that the available bandwidth is fully utilized at a burst length of 8, this module is capable of issuing a
ddr_cmd_rdy signal once every four clock cycles. A command is accepted if the ddr_cmd_valid signal is asserted.
A valid command is then decoded, and the bank management logic compares the row and bank address of the current command with the list of open banks/rows to determine whether a precharge and/or activate command should be applied.
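The open-row comparison described above can be sketched in a few lines. This is an illustrative software model only, not the MACO hard-core implementation; the function name, the dictionary used to track open rows, and the command strings are all assumptions made for the example.

```python
def commands_for_access(open_rows: dict, bank: int, row: int) -> list:
    """Return the SDRAM commands needed before a column access to (bank, row).

    open_rows maps a bank number to the row currently open in that bank.
    A bank with no entry is treated as precharged (idle).
    """
    current = open_rows.get(bank)
    if current == row:
        return []                      # row hit: column command can go directly
    needed = []
    if current is not None:
        needed.append("PRECHARGE")     # a different row is open: close it first
    needed.append("ACTIVE")            # open the requested row
    open_rows[bank] = row
    return needed


if __name__ == "__main__":
    state = {}
    print(commands_for_access(state, 0, 5))   # ['ACTIVE']
    print(commands_for_access(state, 0, 5))   # []
    print(commands_for_access(state, 0, 7))   # ['PRECHARGE', 'ACTIVE']
```

Keeping rows open between accesses is what allows consecutive commands to the same row to skip the ACTIVE command, which matches the "intelligent bank management" feature listed earlier.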
If the command received is a mode register write, the controller first completes execution of all commands in the queue ahead of the mode register update command. New commands are accepted once the register update is complete and the memory chip is reprogrammed with the new values.
This module also maintains a refresh counter and issues a request for a refresh command(s) to be generated. The
controller allows up to eight auto-refresh commands to be issued to the memory chip. The user can select the
exact number to be issued through the ar_burst_en signal.
The generic user interface integrates the core with standard bus interfaces. The user is only required to supply the
Read, Write, Power Down, Load Mode Register, and Self Refresh commands through the interface. The controller
can also accept the read/write with auto precharge commands. The controller applies the proper commands
based on the address of the accessed location. Table 2 shows the valid values for the cmd[3:0] bus.
Table 2. User Interface Commands
Command | Acronym | Decoding cmd[3:0] | CS# | RAS# | CAS# | WE# | SDRAM Address
Read | READ | 0001 | 0 | 1 | 0 | 1 | Column
Write | WRITE | 0010 | 0 | 1 | 0 | 0 | Column
Read with Auto Precharge | READA | 0011 | 0 | 1 | 0 | 1 | Column
Write with Auto Precharge | WRITEA | 0100 | 0 | 1 | 0 | 0 | Column
Power Down | PWRDN | 0101 | - | - | - | - | -
Load Mode Register | LOAD_MR | 0110 | 0 | 0 | 0 | 0 | Opcode A15-A0
Self Refresh | SELF_REFRESH | 0111 | 0 | 0 | 0 | 1 | X
Read Interrupt | READ_INT | 1001 | 0 | 1 | 0 | 1 | Column
Read Interrupt with Auto Precharge | READ_INTA | 1010 | 0 | 1 | 0 | 1 | Column
Write Interrupt | WRITE_INT | 1011 | 0 | 1 | 0 | 0 | Column
Write Interrupt with Auto Precharge | WRITE_INTA | 1100 | 0 | 1 | 0 | 0 | Column
(CS#, RAS#, CAS# and WE# are the control signals driven to the SDRAM.)
The DDR2 IP core automatically closes (precharges) and opens rows according to the user memory address
accesses. Therefore, the READA and WRITEA commands are not needed for most applications. The commands are
provided to comply with the JEDEC DDR2 specification.
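For testbench or host-side modelling, the cmd[3:0] encodings of Table 2 can be captured as an enumeration. The sketch below is an assumption-laden convenience, not part of the design kit: the class and function names are hypothetical, and codes that do not appear in Table 2 are simply reported as unknown.

```python
from enum import IntEnum
from typing import Optional


class UserCmd(IntEnum):
    """User interface command codes, copied from Table 2 (cmd[3:0])."""
    READ = 0b0001
    WRITE = 0b0010
    READA = 0b0011
    WRITEA = 0b0100
    PWRDN = 0b0101
    LOAD_MR = 0b0110
    SELF_REFRESH = 0b0111
    READ_INT = 0b1001
    READ_INTA = 0b1010
    WRITE_INT = 0b1011
    WRITE_INTA = 0b1100


def decode_cmd(bits: int) -> Optional[UserCmd]:
    """Map a 4-bit value to a command, or None if Table 2 does not list it."""
    try:
        return UserCmd(bits)
    except ValueError:
        return None


if __name__ == "__main__":
    print(decode_cmd(0b0110))   # the LOAD_MR member
    print(decode_cmd(0b1000))   # None: not listed in Table 2
```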
Initialization State Machine
This module initializes the DDR SDRAM after power-up when requested by the user. Initialization follows the predefined sequence given in the JEDEC specification. Since initialization must be performed at least 200 µs
after power-up, the user is required to initiate this process at a time that meets this requirement.
The following operations are done as a part of the initialization process:
• Issue a NOP command
• Activate internal DDR SDRAM clock signals by driving the ddr_cke signal HIGH
• Issue a PRECHARGE ALL command
• Enable the DLL by issuing a LOAD MODE REGISTER command to the extended mode register. Write default
values to the register.
• Reset the DLL by issuing a LOAD MODE REGISTER command to the mode register
• Wait for 200 clock cycles for the DDR SDRAM DLL to lock
• Place the device in idle state by issuing a PRECHARGE ALL command
• Once in idle state, issue two AUTO REFRESH commands
• Issue a LOAD MODE REGISTER command to the mode register to program operating parameters with “reset
DLL” deactivated. Writes CFG register value for BL (Burst Length), CL (CAS Latency) and sets the BT (Burst
Type) to sequential mode.
The initialization sequence varies slightly in the DDR2 mode.
Command Application Logic
This command application logic module receives input from the configuration interface as well as the command
decode logic. The commands presented by the decode logic are applied to the memory in the order received.
Commands in the two pipelines are executed in parallel to maintain a high throughput. This module also meets the
timing requirements set by the user through the configuration interface. To maximize data throughput at burst
length of eight, this module is capable of accepting a new command every four clock cycles.
The controller supports a burst mode of command execution in which the user provides a base address and a burst
count. The read or write command is then executed as many times as set on the ddr_burst_length[4:0] signal. The row
address is fixed for every burst while the column address is incremented. If the column address
reaches the page boundary, it wraps around to the beginning of the same page. The controller supports a burst count
of up to 31.
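The column address wrap-around can be illustrated with a short model. This is a sketch under stated assumptions: the function name is hypothetical, the per-command column increment (step) is not specified in this section and is therefore left as a parameter, and a burst count of 1 to 31 is assumed to be the valid range.

```python
def burst_column_addresses(base_col: int, burst_count: int,
                           col_width: int, step: int) -> list:
    """Column address used by each command of a burst-mode request.

    base_col    -- starting column address supplied by the user
    burst_count -- number of read/write commands (assumed range 1..31)
    col_width   -- number of column address bits (page holds 2**col_width columns)
    step        -- column increment per command (assumption; set it to match
                   the configured memory burst length)
    """
    if not 1 <= burst_count <= 31:
        raise ValueError("burst_count must be in 1..31")
    page_size = 1 << col_width
    if not 0 <= base_col < page_size:
        raise ValueError("base_col outside the page")
    # The row stays fixed; only the column advances and wraps within the page.
    return [(base_col + i * step) % page_size for i in range(burst_count)]


if __name__ == "__main__":
    # 9 column bits (512-column page), three commands, step of 4
    print(burst_column_addresses(504, 3, 9, 4))   # [504, 508, 0]
```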
Figure 5. Clocking Scheme
[Schematic of the clocking scheme. Blocks shown: Memory Interface Control Unit (Cmd, Adrs, Config, rst_n), PLL1 (EHXPLLA) driven by ref_clk, PLL2 (EHXPLLA) driven by k_clk_in, DLL1 (TRDDLLA, DCNTL[8:0]), I/O registers, a read data FIFO, ClkDet and the ClkCntl clock control module. Memory-side signals: RAS#, CAS#, WE#, CKE, A, BA, CS, ODT, K, K#, DM, DQ, DQS. User-side signals: datamask, write_data, read_data, DataValid.]
Notes:
1. k_clk (ref_clk x 2)
2. k4_clk (k_clk + 90°)
3. k2_clk (match DQS - trace delay) + 90°
4. k3_clk (match DQS) + 90°
5. Delay for DQS edge clock injection matching (Delay = 13 for SC25, other devices TBD)
The core also utilizes the FPGA fabric for I/O interfaces and clocking. This includes Data I/O and Command I/O as
well as the PLLs, DLLs and the clock detect logic.
Data I/O
Data I/O interfaces with the user logic and the I/O pads to transfer data between the two. The logic for
this module is outside the MACO core. The module transfers write data from the user to the memory and read data from the memory to the user. During a write operation, the user data is transferred to the DRAM on the bi-directional DQ bus using
k4_clk (k_clk + 90 degrees). Data is sent on both edges using ODDRXA pads. During a read operation, the data from
the DRAM is captured using the DQS signal (shifted by 90 degrees) and is given to the user after synchronization
with the system clock (k_clk).
Command I/O
Command I/O interfaces with the DDR MACO and the I/O pads to transmit DDR commands to the memory device. The
DDR commands from the MACO block are sent directly to the memory. This module is also part of the soft IP.
Clocking
Two PLLs and one DLL are used in the soft IP. PLL1 generates k_clk (the core clock) and k4_clk, which is a 90-degree
phase-shifted version of k_clk. k_clk is the main clock used for all the registers in the design. The same
k_clk is sent to the memory as K and K#. k4_clk is used for driving the write data from the user to the memory. The
clocking scheme is shown in Figure 5.
Figure 6. Read Data Capture at SDRAM Controller
[Timing diagram showing K, DQS and DQ with the DQ valid window at three points: at the SDRAM (DQ offset from DQS by tDQSQ), at the FPGA I/O (after the PCB trace delays tpcb_DQS for DQS and tpcb_DQ for DQ), and at the input of the first PIO flip-flop (DQS shifted by 90 degrees, valid window = 487 ps).]
As shown in Figure 6, the read data DQ is captured on the rising or falling edge of the data strobe DQS (DQS# is
not shown). Since DQ and DQS are edge-aligned coming from the SDRAM device, DQS needs to be delayed (ideally centered on DQ) to capture the data effectively. DQS can be delayed with respect to the data by using the cycle stealing delays or by pre-setting the INDEL to a given value, but using the DLL as shown
in Figure 3 to control the INDEL so that it delays the DQS signals by 90 degrees gives the greatest timing margin over PVT
and is independent of the interface speed. The INDEL can be set to a single value per device to match the edge
clock injection delay variations over process, voltage and temperature; thus a fixed INDEL setting on the DQ inputs
is used to match the captured DQ data to the edge clock injection delay for DQS. The memory controller core
uses the K clock. K and its complement K# are also sent to the SDRAM memory device. This core clock K is fed
into DLL0TRDLL (which operates as the master DLL) to produce a T/4 digital control output called dcntlctrl0. This
is a 9-bit bus that controls the INDEL delay cells within the PIOs used for the DQS/DQS# read inputs and
provides a 90 degree time shift for the DQS/DQS# input signals. DLL0TRDLL can be adjusted to give additional
margin on top of the 90 degree delay based on the customer’s actual system.
As shown in Figure 6, the PCB routing delay of DQS is denoted by tpcb_dqs and the PCB routing delay of DQ is
denoted by tpcb_dq. DQS arrives at the FPGA I/O and is routed to the clock pins of all the associated DQ pins.
This results in extra on-chip DQS clock delay and a clock skew on the FPGA device, estimated to be
approximately 50 to 100 ps for the worst case edge clock under worst case conditions. The digital control dctrl0
delays DQS by T/4, which results in DQS0d shown in Figure 6. DQS0e with respect to DQS is Tcyc/4 + clock injection time (Tinj).
The available data window at the first capture flip-flop = data valid window at the SDRAM memory - (setup + hold at
FPGA + package skew + tpcb_dqs + clock skew). Assume FPGA setup and hold is 100 ps.
Example of the data valid window = 987 ps - ((100 + 100+ 50 + 100 + 50) = 487 ps.
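The window budget can be written as a small function so that each deduction is explicit. This is a minimal sketch: the function and parameter names are hypothetical, and the mapping of the example's five numbers onto the five terms (in the order they appear in the formula) is an assumption. Evaluated that way, the five listed values sum to 400 ps and give 587 ps; the quoted 487 ps corresponds to 500 ps of total deductions, so treat the per-term values as placeholders and take real figures from the device data sheets.

```python
def capture_window_ps(sdram_window_ps: int, setup_ps: int, hold_ps: int,
                      package_skew_ps: int, tpcb_dqs_ps: int,
                      clock_skew_ps: int) -> int:
    """Data window left at the first capture flip-flop, in picoseconds."""
    deductions = (setup_ps + hold_ps + package_skew_ps
                  + tpcb_dqs_ps + clock_skew_ps)
    return sdram_window_ps - deductions


if __name__ == "__main__":
    # Values as listed in the example, assumed to map to the terms in order
    print(capture_window_ps(987, 100, 100, 50, 100, 50))   # 587
```

A positive result with adequate margin indicates that the 90-degree shifted DQS edge falls inside the valid data window; a result near zero or negative means the interface speed or board skew budget needs to be revisited.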
Since DQS is a strobe and not a free-running clock, the read data captured with DQS should be recaptured using a
free-running clock. As shown in Figure 3 and Figure 4, this is done using the K3 clock rather than the K clock. This
is done because the DQS signal from the DDR2 memory is generated from the K clock signal sent from the FPGA
device and then sent back to the FPGA device during a read. As shown in Figure 2, the K clock is looped back
within the same I/O pad to the input clock routing in order to generate the K2 clock matched to k_clk_in_dly. Thus,
this delay path has the same output buffer delay as the K clock (including the associated extrinsic loading delay) and
matches the input buffer delay on the DQS/DQS# pins. It is delayed by the dcntl0 control from TRDLL,
which is the same control that is used to provide a 90 degree lag on the DQS pins. On the DDR2 device, the K
clock input is used to generate the DQS strobe at tDQSCK (+/-450 ps for the Micron device). Therefore, the resulting clock signal k2_clk has the same delay as the DQS signal coming back from the SDRAM, except that the DQS
strobe has extra delay associated with the K signal PCB trace delay (tpcb_K) and the DQS return PCB trace delay
(tpcb_DQS), and the DQS arrival can also vary around this delay by tDQSCK (+/-450 ps).
The DQS is then received at the FPGA to capture the read data. The output from the input buffer INDEL element at
the pad for the K clock, referred to as k_clk_in_dly, is fed as the reference clock to PLL2 to generate k2_clk and
k3_clk. If the RAM device is close enough to the FPGA on the board and the SDRAM interface speed is slow
enough, then the k_clk_in_dly (possibly tuned further using INDEL) can be used to hand off from the DQS clock
that will stop at the end of read instructions to an internal continuous clock. Generally however k2_clk is phase
matched to k_clk_in_dly and k3_clk is phase shifted from k2_clk by a value equal to the pcb routing delay. Thus
k3_clk nominally matches the round trip delay of DQS. Generally k_clk is the clock used for other internal logic on
the device.
The read data timing diagram in Figure 7 shows the read data captured using DQS at the FPGA I/O and the relationship between k_clk, k4_clk, k2_clk and k3_clk. It also shows an example of the number of K clock cycles of latency
after which read data is available to the FPGA. The data_valid read_data_start signal generated in the soft IP indicates the start of the read data burst. This is generated by sampling the first rising edge of DQS using the edge
detect capability built into the FPGA PIOs. The naming conventions used in Figure 7 should be used only as a reference.
Figure 7. Read Data Timing
[Timing diagram, 200 MHz example (5 ns clock period). Waveforms shown:
k_clk (from PLL1 on Pclk)
k4_clk (from PLL1 on Pclk), 1.25 ns (90°) after k_clk
K to memory (inverted k_clk -> Obuf), 1.1 ns (Pclk -> Obuf)
k_clk_in_dly (Ibuf -> Indel -> PLL2), 1.1 ns (Ibuf) plus 1.25 ns (90°); the Indel is used for the 90° phase shift
k2_clk (from PLL2, edge aligned to k_clk_in_dly and to DQS minus trace delays), 1.1 ns (Ckin)
k3_clk (from PLL2, trace delay from k2_clk; used to transfer from the DQSin clock); this PLL output is tuned per application
DQS from memory, trace delay approximately 2.9 ns; example only, will not occur in this exact clock cycle
DQSin (on-chip from DQS: Inbuf -> Indel -> Clkcntrl -> edge clock), 1.1 ns (Ibuf), 1.25 ns (90°), 1.1 ns (Eclk); aligned to k3_clk; Clkcntrl keeps this low during 3-state of DQS (preamble and postamble)
clk_turn_off_k, clk_turn_off_k4 (3/4 clock cycle), clk_turn_off_k2_p and clk_turn_off_k2 (1/2 clock cycle), clk_turn_off_k3_n and clk_turn_off_k3_p (1/2 clock cycle, trace delay approximately 2.9 ns)
Obuf + Ibuf + Ckin must be less than the CLK period for reliable operation.
The actual clk_turn_off signal is generated from a combinatorial combination of clk_turn_off_k3_n and clk_turn_off_k3_p.]
GOAL: Transfer signals between k_clk and k3_clk, where k3_clk is created to match the delay of the DQS that is sent back
from the DDR memory device when performing a read. The goal is for this circuit to work regardless of the speed and of the trace length of
K to the memory and DQS back from the memory. The only requirements are the two conditions listed below.
Note: The example shown is for the transfer of the clk_turn_off signal, which is generated on the core clock, k_clk. This signal is transferred through k4_clk,
k2_clk and finally to k3_clk in such a way that delays are not lumped between transfers. The various delays are as shown in the waveform.
k_clk, k2_clk, k3_clk and k4_clk are all shown as they appear at the flip-flops after routing on primary clocks.
Three types of delay are possible:
1. Delays that depend on the clock cycle itself.
2. Trace delay of PMIK to the DDR memory, DLL delay at the DDR memory, and trace delay of DQS returning to the LatticeSCM device.
3. Output buffer (with clk->out of ODDRXA with board load on PMIK) + input buffer (with Clkcntrl delay) + edge clock insertion delay (ECLK).
The above scheme will work for all clock frequencies as long as the following conditions are met:
1. Trace delays + DLL delay < 1 clock cycle.
2. Output buffer + input buffer + edge clock insertion delay < 1 clock cycle.
Note: If all of these delays fit in one clock cycle, k2_clk can be removed and transfers can be made directly from k4_clk to k3_clk,
where k3_clk is used as the feedback to PLL2.
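The two conditions can be checked numerically for a given board and clock rate. The following is a minimal sketch with hypothetical function names. The 5 ns period, 1.1 ns buffer and edge clock delays, and the roughly 2.9 ns trace delay are the example values annotated in Figure 7; the DLL delay is not given separately there, so passing 0.0 for it is an assumption.

```python
def quarter_period_ns(period_ns: float) -> float:
    """90-degree phase shift expressed as a time delay (T/4)."""
    return period_ns / 4


def transfer_conditions_met(period_ns: float, trace_delay_ns: float,
                            dll_delay_ns: float, obuf_ns: float,
                            ibuf_ns: float, eclk_ns: float) -> tuple:
    """Evaluate the two clock-transfer conditions; returns (cond1, cond2)."""
    cond1 = trace_delay_ns + dll_delay_ns < period_ns       # condition 1
    cond2 = obuf_ns + ibuf_ns + eclk_ns < period_ns         # condition 2
    return cond1, cond2


if __name__ == "__main__":
    print(quarter_period_ns(5.0))                                    # 1.25
    print(transfer_conditions_met(5.0, 2.9, 0.0, 1.1, 1.1, 1.1))     # (True, True)
```

Re-running the check with a shorter period shows where the scheme stops holding: at a 3.0 ns period the buffer and edge clock delays alone (about 3.3 ns) exceed one clock cycle.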
Input/Output Signals
Table 3 shows the signals connecting to the user interface.
Table 3. User Interface I/O Signals
Signal Name | Active State | Signal Direction (I/O) | Description
User Interface
ddr_ref_clk | - | I | System Clock
ddr_rst_n | Low | I | System Reset
ddr_init_start | High | I | Asserted when an initialization routine is to be performed, and deasserted when ddr_init_done is asserted, indicating that the initialization routine is complete.
ddr_cmd_valid | High | I | Asserted when the contents of cmd and addr bus are valid
ddr_cmd[3:0] | - | I | Command for controller
ddr_addr[`USR_ASIZE-1:0] | - | I | Address for read/write. USR_ASIZE is a programmable parameter set based on size of memory, which is derived by the following formula: USR_ASIZE = USR_ROW_WIDTH + USR_BSIZE + USR_COL_WIDTH
ddr_burst_length[4:0] | - | I | Indicates the number of read/write commands to be issued to DRAM
ddr_burst_terminate | High | I | Asserted if the burst cycle is to be terminated.
ddr_write_data[`DSIZE-1:0] | - | I | Data input. DSIZE is set to DATA_WIDTH times 2
ddr_dm[(`DSIZE/8)-1:0] | High | I | Data Mask for write data
ddr_cmd_rdy | High | O | Asserted to indicate that the controller is ready to accept a new command.
ddr_data_rdy | High | O | When asserted, the controller is ready to accept data on the write_data bus.
ddr_init_done | High | O | Asserted when the controller has completed the initialization routine.
ddr_read_data_valid | High | O | When asserted, the contents of the ddr_read_data bus are valid
ddr_read_data[`DSIZE-1:0] | - | O | Read Data Out
Configuration Interface Signals (set through ispLEVER/IPexpress GUI)
trefi[15:0] | NA | I | Refresh Interval in clock cycles.
Table 4 shows the signals of the DDR SDRAM memory types.
Table 4. DDR/DDR2 External Interface I/O Signals
Signal Name | Active State | Signal Direction (I/O) | Description
DDR/DDR2 Memory Interface Primary Signals
CLK | High | O | DDR/DDR2 SDRAM clock derived from the system clock
CLK_N | Low | O | Inverted DDR/DDR2 SDRAM clock derived from the system clock
CKE | High | O | Clock enable
CS_N[`USR_CS_WIDTH-1:0] | Low | O | Active low chip select which selects and deselects the DDR SDRAM
RAS_N | Low | O | Row Address Strobe
CAS_N | Low | O | Column Address Strobe
WE_N | Low | O | Write Enable
ODT[`USR_CS_WIDTH-1:0] | Low | O | DDR2 only: Signals controlling the on-die termination resistors on the memory chip.
BA[2:0] | NA | O | Bank address select. DDR2 Mode: [2:0] if INT_BANK is 8, [1:0] if INT_BANK is 4. DDR Mode: default value [1:0]
A[`USR_ROW_WIDTH-1:0] | NA | O | Row or column address lines, depending on whether ddr_ras_n or ddr_cas_n is active.
DQ[`DATA_WIDTH-1:0] | NA | I/O | Bi-directional data bus.
DQS[(`DATA_WIDTH/8)-1:0] | NA | I/O | Bi-directional data strobe.
DM[(`DATA_WIDTH/8)-1:0] | NA | O | Data mask signals used to mask the byte lanes for byte level write control.
Parameter Descriptions
Several configuration and timing parameters must be set before the DDR SDRAM Controller module can be interfaced to a memory device. To ensure maximum flexibility in using the IP core, these parameters are designed as
inputs to the IP core that can be tied to the desired values within the top-level RTL file. The values are entered through the
IPexpress GUI, which captures the parameters into the user’s customized core. The user enters the physical and
timing information that describes the memory design, and this data is processed into the
parameters needed to compile the customized design.
Table 5. Programmable Parameters/User Interface I/O Signals
Signal Name | Active State | Signal Direction (I/O) | Description
Configuration Interface Signals (set through ispLEVER/IPexpress GUI)
tRAS[4:0] | NA | I | ACTIVE to PRECHARGE command delay in clock cycles.
tRC[4:0] | NA | I | ACTIVE to ACTIVE/AUTO REFRESH delay in clock cycles.
tRCD[2:0] | NA | I | ACTIVE to READ/WRITE delay in clock cycles.
tRRD[2:0] | NA | I | ACTIVE bank a to ACTIVE bank b delay in clock cycles.
tRFC[5:0] | NA | I | AUTO REFRESH command period in clock cycles.
tRP[2:0] | NA | I | PRECHARGE command period in clock cycles.
tMRD[2:0] | NA | I | Load Mode Register command period in clock cycles.
tWR[2:0] | NA | I | Write recovery time in clock cycles.
tREFI[15:0] | NA | I | Refresh Interval in clock cycles.
ext_reg_en | High | I | When asserted, EMR is written into during initialization
tWTR[2:0] | NA | I | DDR2 only: Internal Write to Read command delay in clock cycles.
tRTP[1:0] | NA | I | DDR2 only: Internal READ to Precharge command delay.
tFAW[4:0] | NA | I | DDR2 only:
tCKP[7:0] | NA | I | DDR2 only: CKE assertion to Precharge command delay during initialization sequence.
Init_cas_latency[2:0] | NA | I | CAS latency during initialization sequence.
ar_burst_en[2:0] | NA | I | Number of Auto Refresh commands issued at a time.
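Because the timing parameters in Table 5 are expressed in clock cycles, data sheet values given in time units have to be converted before they are entered. The sketch below shows one way to do that and to check the result against the field width; it is not the IPexpress algorithm. The function names and the sample data sheet numbers are hypothetical, and the rounding direction (up for minimum delays, down for the maximum refresh interval) is an assumption based on common practice.

```python
def to_cycles(t_ps: int, tck_ps: int, round_up: bool = True) -> int:
    """Convert a time in picoseconds to clock cycles using integer math.

    round_up=True  suits minimum delays (e.g. tRCD, tRP, tRAS, tRFC).
    round_up=False suits maximum intervals (e.g. tREFI).
    """
    if round_up:
        return -(-t_ps // tck_ps)      # ceiling division
    return t_ps // tck_ps


def fits_field(value: int, width_bits: int) -> bool:
    """True if value can be carried on an unsigned bus of width_bits bits."""
    return 0 <= value < (1 << width_bits)


if __name__ == "__main__":
    tck_ps = 5000                       # 200 MHz clock
    # Hypothetical data sheet values, for illustration only
    t_rcd = to_cycles(15_000, tck_ps)                        # tRCD[2:0]
    t_refi = to_cycles(7_800_000, tck_ps, round_up=False)    # tREFI[15:0]
    print(t_rcd, fits_field(t_rcd, 3))       # 3 True
    print(t_refi, fits_field(t_refi, 16))    # 1560 True
```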
Table 6. Mode Parameters I/O Signals
Parameter Name | Range | Description
Configuration Interface Signals (set through ispLEVER/IPexpress GUI)
CONTROLLER_MODE | DDR, DDR2 | DDR mode or DDR2 mode
DATA_WIDTH | 8, 16, 24, 32, 40, 48, 56, 64, 72 | Data bus width
USR_ROW_WIDTH | 1 to 14 | Row address width
USR_COL_WIDTH | 8 to 13 | Column address width
USR_CS_WIDTH (Note 1) | 1, 2, 4, 8 | Number of chip selects
INT_BANK (Note 2) | 4, 8 | Number of banks
Operating Frequency | 166, 200, 266 | Frequency
BUFFER_TYPE | SSTL2-Class2, HSTL1 | I/O buffers to be selected
Note 1. For DDR, allowed values are 1, 2, 4 and 8. For DDR2, allowed values are 1 and 2.
Note 2. For DDR2 only.
Table 7. Bank Size Dependency on CS_WIDTH and INT_BANK Parameters
CS_WIDTH | Parameters Derived | Value | Example
DDR Mode
1 | BSIZE | 2 | `define BSIZE 2
2 | BSIZE | 3 | `define BSIZE 3
4 | BSIZE | 4 | `define BSIZE 4
8 | BSIZE | 5 | `define BSIZE 5
INT_BANK Set to 8 in DDR2 Mode
1 | BSIZE | 3 | `define BSIZE 3
2 | BSIZE | 4 | `define BSIZE 4
INT_BANK Set to 4 in DDR2 Mode
1 | BSIZE | 2 | `define BSIZE 2
2 | BSIZE | 3 | `define BSIZE 3
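Every row of Table 7 is consistent with BSIZE being the number of chip-select bits plus the number of bank bits. The sketch below encodes that inferred relationship; the guide does not state the formula explicitly, so it is an assumption checked only against the table values, and the function names are hypothetical. The USR_ASIZE sum is the formula given in Table 3.

```python
def bsize(cs_width: int, int_bank: int = 4) -> int:
    """BSIZE inferred from Table 7: log2(CS_WIDTH) + log2(number of banks).

    DDR mode devices have four banks, so int_bank defaults to 4.
    """
    if cs_width not in (1, 2, 4, 8):
        raise ValueError("CS_WIDTH must be 1, 2, 4 or 8")
    if int_bank not in (4, 8):
        raise ValueError("INT_BANK must be 4 or 8")
    return (cs_width.bit_length() - 1) + (int_bank.bit_length() - 1)


def usr_asize(row_width: int, col_width: int, bsize_bits: int) -> int:
    """USR_ASIZE = USR_ROW_WIDTH + USR_BSIZE + USR_COL_WIDTH (Table 3)."""
    return row_width + bsize_bits + col_width


if __name__ == "__main__":
    print(bsize(8))                       # 5  (DDR mode, eight chip selects)
    print(bsize(2, int_bank=8))           # 4  (DDR2 mode, eight banks)
    print(usr_asize(13, 9, bsize(1)))     # 24
```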
User Interface
After a power-on reset, the user requests the IP to initialize by asserting the ddr_init_start signal, and keeping it
asserted until the ddr_init_done signal returns asserted, at which time ddr_init_start is deasserted and initialization
is complete.
After initialization is complete, the user can issue a command by holding the ddr_cmd and ddr_addr buses valid for
two consecutive rising edges of k_clk, the first coinciding with the assertion of ddr_cmd_valid by the user and
ddr_cmd_rdy by the controller.
Along with a read or write command, the user must also supply the ddr_burst_length and ddr_addr values
for that command. When a burst count is used, the controller increments the address automatically,
and the address always stays within the same chip select. After reaching the last address within that chip select, the address
wraps to zero within the same chip select.
If the command issued was a read, the read data will be available on the ddr_read_data bus when
ddr_read_data_valid is active.
If the command issued was a write, the user has to provide the data to be written on the ddr_write_data bus when
ddr_data_rdy is active.
The data mask signal ddr_dm is used to mask the data being written and should be provided along with the data.
In the case of BL=8 (burst length of 8), two kinds of interruption by a new burst access are allowed: a read can be
interrupted by a read, and a write can be interrupted by a write, in each case on a 4-word burst boundary. The minimum CAS-to-CAS delay is defined by tCCD and is at least 2 clocks for read or write cycles. The following
rules apply to burst interrupt:
1. The user command READ_INT will interrupt the immediately preceding READ command.
2. Interruption of a burst read or write cycle during BL=4 mode is not allowed.
3. A read burst with auto-precharge enabled (READA) cannot be interrupted (i.e. READ_INT cannot follow a
READA command).
4. A read burst with auto-precharge enabled, can interrupt the current read burst (i.e. READ_INTA can follow a
READ command).
5. When a current READ command is interrupted, the read data from the device memory is four words instead of
eight.
6. All command timings will be referenced to the burst length mode set in the mode register and not the shortened
burst.
7. The user command WRITE_INT will interrupt the immediately preceding WRITE command. This will cause only
four words of data associated with the WRITE command to be written into memory.
8. WRITE_INT cannot interrupt a WRITEA command (autoprecharge enabled).
9. WRITE_INTA can interrupt a WRITE command.
10. When a WRITE_INTA or a READ_INTA is presented while a multiple burst write/read operation is in progress,
the burst will be terminated.
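The interrupt rules above can be condensed into a small lookup. The Python sketch below is one reading of rules 1 through 9, not the controller's decode logic. The function name is hypothetical, and any combination the rules do not mention explicitly (for example READ_INTA after READA) is treated as not allowed, which is an assumption.

```python
# Pairs of (command in progress, interrupting command) that the rules permit.
ALLOWED_INTERRUPTS = {
    ("READ", "READ_INT"),     # rule 1
    ("READ", "READ_INTA"),    # rule 4
    ("WRITE", "WRITE_INT"),   # rule 7
    ("WRITE", "WRITE_INTA"),  # rule 9
}


def can_interrupt(current: str, new: str, burst_length: int) -> bool:
    """Return True if `new` may interrupt `current` under the listed rules."""
    # Rule 2: interruption is only described for BL=8, never for BL=4.
    if burst_length != 8:
        return False
    # Rules 3 and 8: READA and WRITEA never appear as the interrupted command.
    return (current, new) in ALLOWED_INTERRUPTS
```

For example, `can_interrupt("READ", "READ_INT", 8)` is true, while the same pair with `burst_length=4` is rejected.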
User Address Mapping
For Single Chip Select
Example: 256Mb DDR2 device arranged as 16 Meg x 16 (16-bit data width),
i.e., four banks of 4 Meg locations each.
Row address:     8K (A0-A12)
Column address:  512 (A0-A8)
Bank address:    4 (BA0, BA1)
User Address = 13 + 9 + 2 = 24 bits
Addressing the 16M locations of this device therefore requires 24 address bits. Figure 8 shows how the user address is
mapped to the memory address. If INT_BANK is set to 8 in DDR2 mode, the BA width becomes [2:0] (eight banks
and 32 Meg locations).
Figure 8. Mapping of User Address to Memory Address for Single Chip Select
ddr_addr[23:0]:   Row Addr [23:11]  |  BA [10:9]  |  Col. Addr [8:0]
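As an illustration of the Figure 8 mapping, the following Python sketch splits a 24-bit user address into its row, bank and column fields. The function name and its keyword arguments are hypothetical; the default field widths (13 row bits, 2 bank bits, 9 column bits) come from the single chip select example above.

```python
def split_user_address(ddr_addr: int, row_bits: int = 13,
                       bank_bits: int = 2, col_bits: int = 9):
    """Split a user address into (row, bank, column) per Figure 8.

    Layout, most significant field first: row | bank | column.
    """
    total_bits = row_bits + bank_bits + col_bits
    if not 0 <= ddr_addr < (1 << total_bits):
        raise ValueError(f"address does not fit in {total_bits} bits")
    col = ddr_addr & ((1 << col_bits) - 1)                   # ddr_addr[8:0]
    bank = (ddr_addr >> col_bits) & ((1 << bank_bits) - 1)   # ddr_addr[10:9]
    row = ddr_addr >> (col_bits + bank_bits)                 # ddr_addr[23:11]
    return row, bank, col
```

Passing `bank_bits=3` models the eight-bank DDR2 case, where the user address grows to 25 bits.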
For Two Chip Selects
Two chip selects require one extra address line and the effective user address now is 1+24 bits. BA width becomes
[2:0] and user address [11:9] is assigned to BA.
DDR/DDR2 MACO Memory Controller Design Kit Directory
The directory structure of the DDR/DDR2 MACO Memory Controller IP, as generated by the IPexpress GUI, is
shown in Figure 9.
A more detailed description of the files generated, as well as information on installation, functional simulation, synthesis, design implementation and timing simulation, is given in the “readme.htm” file located in the ddr_maco_eval
directory. This Readme file can also be opened from IPexpress by clicking the “Help” button of the GUI, as shown in
Figure 10.
Figure 9. DDR/DDR2 MACO IP Design Kit Directory Structure
ddr_maco_eval
<username>
impl
precision
synplify
sim
aldec
rtl
script
timing
modelsim
rtl
script
timing
work
src
params
top
testbench
memory
top
help_files
models
scm
support
Parameter Descriptions
Figures 10 through 14 give examples of the five IPexpress GUI windows that allow the user to customize the generated IP to a particular application, and Tables 8 through 12 describe each parameter and its function.
Figure 10. GUI Dialog Box for DDR/DDR2 Memory Controller
Table 8. GUI Dialog Box for DDR/DDR2 Memory Controller
Parameter       Description
Project Path    This is the directory in which the project will be generated
File Name       Enter the project name
Design Entry    The design entry mode is Verilog HDL
Device Family   The device family is LatticeSCM
Part Name       Select the desired LatticeSCM device size, speed grade and package
Figure 11. GUI Dialog Box for DDR/DDR2 Memory Controller Clocks
Table 9. GUI Dialog Box for DDR/DDR2 Memory Controller Clocks
Parameter                         Description
Input Reference Clock Frequency   Specify the frequency of the input clock to the memory controller. Value range is 100 to 400MHz if the multiplier is set to 1, or 50 to 200MHz if the multiplier is set to 2, etc.
Reference Clock Multiplier        Set this value to the ratio of the desired Output Frequency and the selected Input Reference Clock Frequency. Choices are x1, x2, x4, x8. Default is x2.
Output Frequency                  The Output Frequency is the operating frequency of the DDR interface. It is calculated by IPexpress, and is set to (Input Reference Clock Frequency) * (Reference Clock Multiplier).
DQS Trace Delay Compensation      The DQS Trace Delay Compensation is set to the round-trip board trace delay (outbound delay on K, plus inbound delay on DQS) in picoseconds. When the module is being generated for back-annotated simulation purposes, this value should be set to zero.
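The relationship between the clock parameters in Table 9 can be sketched in Python as follows. The function names are hypothetical. The input ranges for the x4 and x8 multipliers are extrapolated from the "etc." in the table (a 100 to 400 MHz window divided by the multiplier), which is an assumption rather than a documented limit.

```python
VALID_MULTIPLIERS = (1, 2, 4, 8)  # x1, x2, x4, x8; the documented default is x2


def input_range_mhz(multiplier: int = 2):
    """Return the (min, max) input reference clock frequency in MHz.

    Assumption: the 100-400 MHz window quoted for x1 scales inversely
    with the multiplier, as the x2 example (50-200 MHz) suggests.
    """
    if multiplier not in VALID_MULTIPLIERS:
        raise ValueError("multiplier must be 1, 2, 4 or 8")
    return 100 / multiplier, 400 / multiplier


def output_frequency_mhz(input_mhz: float, multiplier: int = 2) -> float:
    """Output Frequency = Input Reference Clock Frequency * Multiplier."""
    low, high = input_range_mhz(multiplier)
    if not low <= input_mhz <= high:
        raise ValueError(f"input clock must be between {low} and {high} MHz")
    return input_mhz * multiplier
```

For instance, a 100 MHz reference with the default x2 multiplier gives a 200 MHz DDR interface clock.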
Figure 12. GUI Dialog Box for DDR/DDR2 Memory Controller Configuration
Table 10. GUI Dialog Box for DDR/DDR2 Memory Controller Configuration
Parameter                               Description
Controller Mode                         Select DDR or DDR2 Mode
Data Width                              Width of DQ bus
RA Width                                Row Address Width
CA Width                                Column Address Width
Chip Selects                            Number of chip selects required
Clock Width                             Select the number of clocks to be driven out of the LatticeSC device. Valid choices are 1 or 2 and should be the same as CKE Width.
CKE Width                               Select the number of clock enables to be driven out of the LatticeSC device. Valid choices are 1 or 2 and should be the same as Clock Width.
Number of Auto Refresh Burst Commands   Select number of refresh operations per auto refresh burst
Use Differential DQS                    Check this box to enable differential DQS signals
DDR Mode Only Parameters
Page Size                               Select the desired page size
Extended Register Mode Enable           Enable/disable Extended Mode Register
DDR2 Mode Only Parameters
INT_BANK                                Set this to the internal bank structure of the target DDR device
Number of DIMM Slots                    Set this to the number of DIMM slots that the target board supports
Figure 13. GUI Dialog Box for DDR/DDR2 Memory Controller Timing
Table 11. GUI Dialog Box for DDR/DDR2 Memory Controller Timing
Parameter             Description
Initial CAS Latency   This is the CAS latency assigned during DDR device initialization
MIN tRAS              ACTIVE to PRECHARGE command
MIN tRC               ACTIVE to ACTIVE (same bank) command
MIN tRCD              ACTIVE to READ or WRITE delay
MIN tRRD              ACTIVE bank a to ACTIVE bank b command
MIN tRFC              REFRESH to Active or Refresh to Refresh command interval
MIN tRP               PRECHARGE command period
MIN tWR               Write recovery time
MIN tMRD              LOAD MODE command cycle time
MIN tREFI             Average periodic refresh interval
DDR2-Specific Parameters
MIN tRTP              Internal READ to precharge command delay
MIN tWTR              Internal WRITE to READ command delay
MIN tFAW              Four Bank Activate period
MIN tCKP              CKE assertion to Precharge command delay during initialization sequence. Set this value to (output clock frequency) * 0.4, to produce a 400 ns delay.
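The tCKP rule follows from dividing the 400 ns delay by the clock period: with the output clock frequency expressed in MHz, 400 ns corresponds to 0.4 x frequency clock cycles. The Python sketch below shows the arithmetic. The function name is hypothetical, and rounding up to a whole number of cycles is an assumption, since the guide only gives the multiplication.

```python
import math


def tckp_cycles(output_clock_mhz: float, delay_ns: float = 400.0) -> int:
    """Return the clock cycles needed to cover `delay_ns` at the given clock.

    cycles = delay / period = delay_ns * f_MHz / 1000
    The result is rounded up (an assumption) so the delay is never
    shorter than requested.
    """
    return math.ceil(output_clock_mhz * delay_ns / 1000.0)


if __name__ == "__main__":
    for f_mhz in (166, 200, 266):
        print(f"{f_mhz} MHz -> MIN tCKP = {tckp_cycles(f_mhz)} cycles")
```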
Figure 14. GUI Dialog Box for DDR/DDR2 Memory Controller Location
Table 12. GUI Dialog Box for DDR/DDR2 Memory Controller Location
Parameter                                Description
LL: Left MACO, Left Pinout               The left-side MACO is used for the DDR/DDR2 controller, and the pinout is on the left side.
LC: Left MACO, CIB Pinout                The left-side MACO is used for the DDR/DDR2 controller, and the pinout is CIB.
LB: Left MACO, Bottom Pinout             The left-side MACO is used for the DDR/DDR2 controller, and the pinout is on the bottom side.
RB: Right MACO, Bottom Pinout            The right-side MACO is used for the DDR/DDR2 controller, and the pinout is on the bottom side.
RC: Right MACO, CIB Pinout               The right-side MACO is used for the DDR/DDR2 controller, and the pinout is CIB.
RR: Right MACO, Right Pinout             The right-side MACO is used for the DDR/DDR2 controller, and the pinout is on the right side.
Add SMI Port Interface for PLL and DLL   Check this box to enable run-time access to PLL and DLL memory-mapped parameters
Design Guidelines to Optimize Performance
Master Clock
The master reference clock can be sourced from any clock source, either internal or external to the LatticeSCM
device. It feeds PLL1. If the source is external, it should use the direct input pin for that PLL’s CLKI input
(refer to Table 13). Also, minimize clock jitter caused by coupling from noisy neighboring signals (refer to the
discussion in “Selecting a Pin That Has Low Jitter Noise” below). Note that the PLL filters some of
the jitter present at its input.
Table 13. PLL Direct Input Pins (True/Complement Pair)
PLL          F900         FF1020       FC1152       FC1704
ULC PLL A    D3/D2        K25/J25      F30/G30      J37/J38
ULC PLL B    K4/J4        M23/N23      N25/P25      N33/P33
LLC PLL B    AC6/AC7      AC23/AD24    AG29/AG28    AN36/AP36
LLC PLL A    AH1/AJ1      AJ32/AK32    AM33/AN33    AU42/AV42
LRC PLL A    AJ30/AH30    AJ1/AK1      AN2/AM2      AV1/AU1
LRC PLL B    AD26/AC25    AC10/AD9     AG6/AG7      AN7/AP7
URC PLL B    K25/K26      M10/N10      N10/P10      N10/P10
URC PLL A    D28/E28      K8/J8        F5/G5        J6/J5
Table 14. DLL Direct Input Pins (True/Complement Pair)
DLL          F900         FF1020       FC1152       FC1704
ULC DLL C    E3/E2        D32/D31      F31/G31      G40/H40
ULC DLL D    F3/G3        E32/E31      D33/E33      G41/H41
LLC DLL E    AB6/AC5      AE26/AE27    AJ30/AK30    AL37/AM37
LLC DLL F    AF2/AG2      AG32/AG31    AL32/AL31    AR39/AR40
LLC DLL C    AF4/AE5      AF27/AG28    AH29/AJ29    AL33/AL34
LLC DLL D    AG3/AH2      AK31/AL31    AM32/AM31    AU38/AV38
LRC DLL C    AJ29/AH29    AL2/AK2      AM3/AM4      AV2/AW2
LRC DLL D    AG28/AG29    AJ2/AH3      AJ6/AH6      AL10/AL9
LRC DLL F    AF29/AF28    AG1/AG2      AL3/AL4      AR4/AR3
LRC DLL E    AB26/AC26    AE7/AE6      AJ5/AK5      AL6/AM6
URC DLL D    G28/F28      E1/E2        D2/E2        G2/H2
URC DLL C    D29/D30      D1/D2        F4/G4        G3/H3
Table 15. Preferred Pinout for Left Side Memory Controller
                 Bottom Edge Preferred Pinout              Left Edge Preferred Pinout
DDR/DDR2 Port    SC25 900  All 1020  All 1152  All 1704    SC25 900  All 1020  All 1152  All 1704
ODT[0]           AF7       AK29      AN32      AW42        AA1       Y32       AF34      AG42
ODT[1]           AF6       AL29      AP32      AY42        Y1        W32       AE34      AH42
WE_N             AE5       AG28      AJ29      AL34        V4        W25       AA24      AG29
RAS_N            AJ1       AK32      AN33      AV42        V5        Y26       Y24       AF29
CAS_N            AD6       AE25      AG27      AM34        W2        Y28       AC31      AG39
BA[0]            AJ2       AK30      AL29      AV41        V2        W28       AB31      AF39
BA[1]            AK2       AL30      AL28      AW41        V6        Y27       AA27      AH36
BA[2]            AD7       AD23      AH27      AK30        U6        W27       AA26      AG36
CS_N[0]          AH2       AL31      AM31      AV38        W1        Y31       AC32      AG40
CS_N[1]          AG3       AK31      AM32      AU38        V1        W31       AB32      AF40
CS_N[2]          AK9       AM22      AN20      BA26        AC1       AC31      AF31      AK38
CS_N[3]          AG14      AL20      AK20      AV24        Y6        AD30      AJ33      AM42
CS_N[4]          AK10      AJ19      AL20      BB24        AC3       AC32      AG34      AN42
CS_N[5]          AK11      AK19      AL19      BB25        AD3       AD32      AH34      AP42
CS_N[6]          AH15      AM21      AP21      AW24        AC4       AE30      AK33      AN41
CS_N[7]          AG15      AM20      AP20      AW23        AD4       AE29      AL33      AP41
A[0]             AH4       AJ28      AN31      AW40        U5        W29       AA33      AD39
A[1]             AG5       AK28      AN30      AY40        U4        W30       Y33       AC39
A[2]             AF8       AJ31      AP31      AW39        T4        V30       Y31       AB42
A[3]             AG8       AH30      AP30      AW38        T5        V29       W31       AA42
A[4]             AH3       AM30      AM29      AV37        U1        V31       W33       AB38
A[5]             AJ3       AM29      AM28      AV36        T1        V32       V33       AA38
A[6]             AF9       AH29      AJ27      AM31        V3        U31       V34       Y41
A[7]             AE10      AH28      AJ26      AM32        U3        U32       U34       W41
A[8]             AK3       AJ27      AP29      BA40        T6        T27       V25       AA36
A[9]             AJ4       AK27      AP28      BB40        U2        T32       U33       Y40
A[10]            AE11      AL28      AN29      BA39        T2        T31       T33       W40
A[11]            AF10      AL27      AN28      BA38        R4        U24       Y27       AC32
A[12]            AH7       AM28      AL26      AW36        R1        R32       W30       Y39
A[13]            AH8       AM27      AL25      AW35        P1        R31       V30       W39
CKE              AE12      AG23      AG23      AM28        R3        T26       V28       AB35
Table 16. Preferred Pinout for Right Side Memory Controller
                 Bottom Edge Preferred Pinout                        Right Edge Preferred Pinout
DDR/DDR2 Port    SC15 900  SC25 900  All 1020  All 1152  All 1704    SC15 900  SC25 900  All 1020  All 1152  All 1704
ODT[0]           AE25      AF27      AL3       AL6       AV4         AB25      W29       Y1        AF1       AG1
ODT[1]           AH28      AG26      AL4       AL7       AV3         AD30      V29       W1        AE1       AH1
WE_N             AD25      AH30      AK1       AM2       AU1         Y30       W26       W8        AA11      AG14
RAS_N            AE26      AG29      AH3       AH6       AL9         AA30      V26       Y7        Y11       AF14
CAS_N            AG29      AE25      AD10      AH8       AK13        AA25      U30       Y5        AC4       AG4
BA[0]            AJ28      AD25      AE8       AG8       AM9         AE30      T30       W5        AB4       AF4
BA[1]            AE22      AE26      AE9       AG9       AM10        AB28      V25       Y6        AA8       AH7
BA[2]            AK29      AH29      AK2       AM4       AW2         AC28      U25       W6        AA9       AG7
CS_N[0]          AH30      AH28      AJ3       AN4       AY1         AF30      W28       Y2        AC3       AG3
CS_N[1]          AH29      AJ28      AK3       AN5       AW1         AG30      V28       W2        AB3       AF3
CS_N[2]          AE19      AH18      AM12      AL15      AW19        AC26      AB26      AE7       AJ5       AL6
CS_N[3]          AK24      AH17      AM13      AL16      AW20        AF28      AG30      AE1       AM1       AP5
CS_N[4]          AK22      AK19      AJ15      AM15      AY19        AC25      AC27      AF1       AJ4       AR2
CS_N[5]          AJ20      AK18      AK15      AM16      AY20        AB26      AC26      AE6       AK5       AM6
CS_N[6]          AF18      AH16      AM14      AK17      AV21        AF29      AC25      AD9       AG7       AP7
CS_N[7]          AK20      AE16      AD16      AE17      AP21        AB27      AF28      AG2       AL4       AR3
A[0]             AK28      AF25      AJ5       AN3       AW3         T30       T27       W4        AA2       AD4
A[1]             AH21      AG25      AK5       AP3       AY3         W28       R27       W3        Y2        AC4
A[2]             AH23      AG24      AH4       AM6       BA2         U26       V27       V3        Y4        AB1
A[3]             AH22      AF24      AH5       AM7       AY2         U28       U27       V4        W4        AA1
A[4]             AG22      AH27      AM3       AP4       AV6         M30       R30       V2        W2        AB5
A[5]             AG21      AH26      AM4       AP5       AV7         R29       P30       V1        V2        AA5
A[6]             AF21      AE22      AF10      AK9       AN11        P29       U29       U2        V1        Y2
A[7]             AE21      AK29      AJ6       AN6       AY4         P27       T29       U1        U1        W2
A[8]             AE20      AK28      AK6       AN7       AY5         N29       T24       T6        V10       AA7
A[9]             AK25      AH25      AG8       AP6       BA4         N28       N30       T1        U2        Y3
A[10]            AH19      AH24      AG7       AP7       BA5         R25       M29       T2        T2        W3
A[11]            AK23      AE23      AL5       AN8       BB4         R28       U26       U9        Y8        AC11
A[12]            AJ21      AD23      AL6       AN9       BB5         N27       U28       R1        W5        Y4
A[13]            AG18      AH21      AC12      AF12      AT10        L30       T28       R2        V5        W4
CKE              AK21      AH23      AM5       AL9       AV8         J30       W30       AA1       AG2       AK3
Note that if there are multiple DDR2 Memory Controllers on the same LatticeSCM device that operate at the same
rate, they can share a common PLL PLL1, in which case the two nets “k_clk” and “k4_clk” will also be common
among them. This is accomplished by wrapping each memory controller in its own module that contains all logic
except PLL1, connecting each module's internal nets “k_clk” and “k4_clk” to module inputs, instantiating one copy
of PLL1 outside the DDR2 modules, and connecting that PLL's outputs to those inputs on each module.
DQS Strobe
The DQS strobe and its associated DQ and DM signals must all reside in banks served by a common edge clock.
There is a single edge clock serving the two banks (2-3 or 6-7) on each of the two sides of the device, right and left.
The two banks (4-5) on the bottom edge are each served by a separate edge clock.
The bidirectional DQS strobe to/from the DDR2 device can be implemented as either a single-ended or differential
signal. Lattice recommends differential, and differential is required at clock rates above 200 MHz.
For differential DQS implementations, the differential pairs must be on A/B pairs (for example, PB17A/PB17B) that
are semi-dedicated “PCLKT/C[7:2]_[7:0]” (for example, PCLKT5_2/PCLKC5_2). These pairs feature complementary outputs and differential inputs, and are able to directly drive edge clocks. If IPexpress™ is used to generate the
DDR2 memory Controller, the pins assigned will conform to this requirement.
Differential DQS applications also require that the IOBUF preference for the DQS signals have the IO_TYPE
changed from SSTL18_II to SSTL18D_II. If the design is generated by IPexpress, this is handled automatically.
K/K# Clocks
The K/K# clock pair (and K_copy/K_copy# pair, if used) must be placed on A/B pairs, since they form a complementary output pair.
In order to minimize skew and noise, the K/K# clock pair should be located on the pins that are the driving PLL's
designated direct input pins (refer to Table 13). If this is not possible, then use an input driven by an edge clock.
This may seem unusual, since these are outputs rather than inputs, but the driven signal is also fed back into PLL
PLL2 to create the read data recapture clock, and it is this feedback that needs to be specially handled.
It is important to minimize clock jitter caused by coupling from noisy neighboring signals (refer to the accompanying
discussion, “Selecting a Pin That Has Low Jitter Noise”, below).
PLLs
PLL1 in Figure 5 generates a 90° phase shift between the address/data/control lines to the DDR device and the
accompanying K/K# clock. PLL1 also performs optional clock frequency multiplication when necessary. No custom
adjustment of PLL1 is needed.
PLL2 in Figure 5 performs a 90° phase shift on "k_clk_in", the internally reflected copy of the outbound clock. PLL2
also compensates for the total round-trip delay of the board traces to/from the memory device. The value of this
delay is entered into IPexpress as the “DQS Trace Delay Compensation” when the DDR module is generated. In
order to achieve optimum performance, especially at high clock rates, this delay value can be tuned for the specific
implementation. This tuning need only be performed once, and should be performed using the final board layout.
The simplest way to perform this tuning is by iteratively changing the DQS Trace Delay Compensation in IPexpress
to determine the range of values that yield correct performance, and then using the “sweet spot” centered in that
range. Alternatively, PLL2's behavior can be modified dynamically by writing the relevant parameters, PHASEADJ
and CLKOS_VCODEL, via the System Bus. For details on modifying these parameters, refer to TN1098, LatticeSC
sysCLOCK and PLL/DLL User's Guide. Note that CLKOS_VCODEL is applied at PLL reset, so a reset must be
applied after each change.
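The iterative tuning procedure described above amounts to sweeping the DQS Trace Delay Compensation, recording which settings work, and choosing the center of the passing window. A minimal Python sketch of that selection step follows. The function name and the `(delay_ps, passed)` result format are hypothetical, and how each pass/fail result is obtained (for example from a memory test on the final board) is outside the sketch.

```python
def sweet_spot(results):
    """Return the center of the widest contiguous passing window, or None.

    results -- iterable of (delay_ps, passed) pairs from a delay sweep
    """
    best = None   # widest passing run found so far
    run = []      # passing run currently being collected
    for delay, passed in sorted(results):
        if passed:
            run.append(delay)
            continue
        if run and (best is None or run[-1] - run[0] > best[-1] - best[0]):
            best = run
        run = []
    # Account for a passing run that extends to the end of the sweep.
    if run and (best is None or run[-1] - run[0] > best[-1] - best[0]):
        best = run
    if best is None:
        return None
    return (best[0] + best[-1]) / 2
```

With a sweep that fails at 0, 100 and 500 ps and passes at 200, 300 and 400 ps, the function returns 300.0 ps.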
DLLs
The DLL is used to generate a 90° phase shift so that the receive DQ data eye is centered on the receiving DQS
clock. It uses TRDDLLA (time reference delay) mode to achieve this result.
To achieve maximum performance, it may be necessary to adjust the DLL’s ALU function +/- 1 or more taps in order
to center the DQS in the DQ eye. The optimum setting should be determined experimentally, using the final board
layout. The parameter to adjust is named DCNTL_ADJVAL, and can be set in the DLL’s source code file
“ddr_trdll.v”. Do this by adding two lines similar to the following, which set DCNTL_ADJVAL to -2:
/* synthesis DCNTL_ADJVAL="-2" */
// exemplar attribute ddr_trdll_0_0 DCNTL_ADJVAL -2
Add each of the lines to the group of similar lines in the code. The value can also be modified dynamically by writing it via the LatticeSCM’s SMI Bus, if the DLL has been assigned a unique SMI address (DCNTL_ADJVAL is byte
9). Also, if the device has the ORCAstra interface module implemented in the design, the value can be modified via
the ORCAstra GUI.
Since this DLL drives the DCNTL bus, it should be located in the corner of the device adjacent to the DQS signal
banks. Note that the clock to the DLL is driven internally, so there is no input pin to be located.
Board Layout and Trace Matching
All DQ, DQS and DM signals within a lane must have PCB board trace lengths matched to within 50 picoseconds,
and across lanes, they must be matched to within 150 picoseconds.
All address, control signals, and their clocks (K, K#, K_copy and K_copy#) to the DDR2 device must be matched to
within 50 picoseconds.
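The DQ/DQS/DM matching rule can be checked mechanically once trace delays are extracted from the board layout. The Python sketch below is one possible check. The function name and the dictionary input format are hypothetical, and "across lanes" is interpreted here as the spread between the fastest and slowest trace over all lanes, which is an assumption about how the 150 ps limit is meant to be applied.

```python
def lane_skew_ok(lanes, within_lane_ps=50, across_lanes_ps=150):
    """Check trace delay matching for DQ, DQS and DM signals.

    lanes -- dict mapping a lane name to a non-empty list of trace
             delays in picoseconds for that lane's DQ, DQS and DM signals
    """
    all_delays = []
    for delays in lanes.values():
        # Every signal within a lane must match to within 50 ps.
        if max(delays) - min(delays) > within_lane_ps:
            return False
        all_delays.extend(delays)
    # Assumed reading: the spread over all lanes must stay within 150 ps.
    return max(all_delays) - min(all_delays) <= across_lanes_ps
```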
Lattice recommends simulation of simultaneous switching outputs (SSOs) for the device/package combination for
performance targeted to over 200 MHz.
In order to ensure that potential conflicts are resolved and to provide maximum flexibility when assigning resources,
Lattice recommends that the LatticeSCM device design be placed and routed in ispLEVER before the board design is
committed to manufacture.
Other Board-Level Considerations
All dynamic signal traces must be 50 Ohm transmission lines.
All power signals, including any VTT power, must be supplied by planes, not traces.
Care must be taken to keep reference voltages, such as the DDR2’s VREF pin, noise-free. This involves robust,
wide-bandwidth decoupling, and isolation of quiet, noise-sensitive signals from noise sources.
The physical distance between the LatticeSCM device and the DDR2 memory device needs to be minimized, since
trace delays, skews and signal degradation will limit overall speed.
Selecting a Pin That Has Low Jitter Noise
When a signal, such as an input reference clock or the DDR2 clocks K/K# or DQS, needs to be especially quiet
with low jitter, some special design rules can help achieve this goal:
• It is highly preferable to place the pin in a bank that does not also contain single-ended output drivers. Figure 15
shows how bank groups form clusters around the package for a 256-pin fpBGA. The 256-pin fpBGA is used only for
simple illustration; it is too small to allow a complete dedicated pinout, and thus performance
is not guaranteed for this package. See Table 17 for performance data.
Figure 15. Selecting a Pin for Low Jitter Noise
[Figure: pin map of the 256-pin fpBGA (rows A through T, columns 1 through 16) with each pin labeled by its I/O bank number (for example, “7” indicates I/O bank 7). Example A and Example B each mark a noisy single-ended output pin and a low-noise clock input pin.]
• If a quiet bank cannot be used, avoid creating inductively coupled paths linked to noisy signals on the package.
These occur when the low-noise signal trace passes, on its way from pin to pad, through an area of the package substrate
that contains noisy signal pins or traces (in particular single-ended outputs, and especially when those single-ended outputs are unterminated). Figure 15 also illustrates this concept. Two examples are shown:
– Example A shows a noisy output pin (G12, bank 2) that is near the package center, and a low-noise clock pin
(F16, bank 3) that is situated radially outward from that pin. In this case, the pin-to-die connection for the
clock will route directly past the noisy output pin, resulting in coupled noise. This should be avoided.
– Example B demonstrates the reverse situation, which is also to be avoided. In this case, a noisy output pin
(M16, bank 3) is situated radially outward from a low-noise clock pin (L12, bank 4), so that the noisy output’s
pad-to-pin connection will pass over the clock pin.
– In order to minimize this coupling, it is typically better to place noise-sensitive pins toward the center of the
package. This reduces the trace length of this signal in the package, thus reducing coupling to this signal.
Noise immunity may be further enhanced by providing extra “ground” pins around the sensitive signal, by driving
adjacent outputs to a constant LO and tying them to signal ground on the PCB. This can enhance noise immunity
in two ways: first, it provides extra signal current return paths, and second, it provides a buffer distance to nearby
signal pins, thus reducing coupling to their signals. The buffers should be set to the maximum drive strength
allowed at the bank's VCCIO voltage.
Timing Specifications
The timing diagrams below show the user interface for command and data. For memory interface timing diagrams,
please refer to the data sheet of the memory device.
Figure 16. Write Timing
[Timing diagram. User-side signals: k_clk, ddr_init_done, ddr_cmd_rdy, ddr_cmd_valid, ddr_cmd[3:0], ddr_burst_length[4:0], ddr_addr, ddr_data_rdy, ddr_write_data, ddr_dm. Memory-side signals: CLK, CKE, CS_N, RAS_N, CAS_N, WE_N, A[12:0], BA[1:0], ODT, DQS[1:0], DQ[15:0], DM[1:0]. Two write commands (CMD1 and CMD2), separated by NOPs, are latched with burst_length = 5'b0 at addresses 'h0000000 and 'h0000400. The write data (D02, D01), (D04, D03), (D12, D11) and (D14, D13) is supplied with ddr_dm = 4'b0000 and appears on DQ as D01 through D04 and D11 through D14.]
Figure 17. Read Timing
[Timing diagram. User-side signals: k_clk, ddr_init_done, ddr_cmd_rdy, ddr_cmd_valid, ddr_cmd[3:0], ddr_addr, ddr_read_data_valid, ddr_read_data. Memory-side signals: CLK, CKE, CS_N, RAS_N, CAS_N, WE_N, A[12:0], BA[1:0], ODT, DQS[1:0], DQ[15:0], DM[1:0]. Two commands (CMD1 and CMD2), separated by NOPs, are latched with burst_length = 5'b0 at addresses 'h0000000 and 'h0000400. The data D01 through D04 and D11 through D14 on DQ is returned on ddr_read_data as (D02, D01), (D04, D03), (D12, D11) and (D14, D13).]
Reference Design/Test Bench
Lattice supplies a reference design along with the DDR or DDR2 controller core. While the core design is intended
for use “as is”, the reference design provides a framework for testing the core. In the absence of a real user application, the reference design provides synchronization between the external and internal clock domains and
pseudo-random data generation.
Using the supplied reference design and test bench as a guide, users can easily customize the verification of the
core by adding, removing and customizing tests.
Figure 18. DDR2 Reference Design
[Block diagram. DDR2_TB (simulation test environment) contains ddr2_top (reference design) on the user (FPGA core) side and mt47h32m8bp (Micron DDR2 memory model) on the external memory side. Blocks shown within ddr2_top: ddr_ip_top (DDR2 Memory Controller IP, consisting of soft logic and the MCTL hard core), ddr_pll90 (PLL), ddr_rdpll (PLL), ddr_trdll (DLL), us_ddr_prbs_opt_pcie.v (PRBS data generator and checker), and mpu_8_us_um with the Systembus (USI bus, MPI bus and JTAG: TDI, TMS, TCK, TDO). Signals labeled: ddr_ref_clk, CLK, k_clk, k2_clk, k3_clk, k4_clk, update_cntl.]
DDR/DDR2 SDRAM Memory Controller Performance
Table 17 lists the bandwidth performance per data bit for the various LatticeSCM packages, device supply voltages,
and device speed grades. All timing is at a junction temperature of 105°C and below.
Table 17. DDR/DDR2 SDRAM Memory Controller Performance
                                   VCC = 1.0V ±5%       VCC = 1.2V ±5%
Package                            -5     -6     -7     -5     -6     -7     Units
Wirebond: 900                      533    533    533    533    533    533    Mbps
Flip-Chip: 1020, 1152, 1704 (2)    533    533    533    533    667    667    Mbps (1)
1. For 72-bit configurations in SCM80 and SCM115 devices, a -7 speed grade will be needed to meet 667 Mbps.
2. The 256-pin package is also wirebond. However, pins are too sparse to permit dedicated pinout of all critical signals and thus timing cannot
be guaranteed.
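Because Table 17 quotes performance per data bit, the peak rate of a whole interface follows by multiplying by the data width. The Python sketch below shows this arithmetic. The function names are hypothetical, and the result is a theoretical peak that ignores refresh, bank activation and other protocol overhead.

```python
def peak_bandwidth_mbps(per_bit_mbps: float, data_width: int) -> float:
    """Peak interface bandwidth in Mbps: per-bit rate times DQ width."""
    if data_width <= 0:
        raise ValueError("data width must be positive")
    return per_bit_mbps * data_width


def peak_bandwidth_mbytes(per_bit_mbps: float, data_width: int) -> float:
    """Same figure expressed in megabytes per second (8 bits per byte)."""
    return peak_bandwidth_mbps(per_bit_mbps, data_width) / 8


if __name__ == "__main__":
    # Example: a 32-bit interface at the 533 Mbps per-bit rate.
    print(peak_bandwidth_mbps(533, 32), "Mbps")
    print(peak_bandwidth_mbytes(533, 32), "MB/s")
```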
DDR/DDR2 SDRAM Memory Controller On-Chip Resources
Figure 19 illustrates some of the resources on the LatticeSCM device that are available to the DDR/DDR2 SDRAM
Memory Controller, including:
• Seven banks of I/O pins;
• Dedicated routing to two sets of pins from each Memory Controller MACO block;
• Edge Clock buses containing eight clock lines per bus (shown), and two DCNTL buses per bank (not shown).
• PLLs for clock conditioning (up/down frequency shifting, duty cycle/phase adjusting, jitter filtering, etc.);
• DLLs for phase and delay adjustment.
Figure 19. MACO Memory Controller Resources
[Figure: device floorplan showing I/O Banks 1 through 7, four Quad SERDES blocks, and an Edge Clock Bus (8) along each bank group. Each corner contains PLL A and PLL B; the upper-left and upper-right corners contain DLL C and DLL D, and the lower-left and lower-right corners contain DLL C, DLL D, DLL E and DLL F. The Left MACO Memory Controller and the Right MACO Memory Controller each have a side pinout and a bottom pinout.]
Conclusion
Applications using DDR and DDR2 SDRAM are becoming popular in FPGA designs. LatticeSCM MACO devices
offer a proven, flexible, high-performance interface to these SDRAM with consistent timing margins to meet your
design needs. The ease of integration into the LatticeSCM gives the FPGA designer the freedom to choose different variations of SDRAM and reduces the risk of system complexity.
References
• TN1099, LatticeSC DDR/DDR2 SDRAM Memory Interface User’s Guide
• TN1098, LatticeSC sysCLOCK and PLL/DLL User’s Guide
• JEDEC Standard Publication JESD79C, DDR SDRAM Specification, JEDEC Solid State Technology Association
• JEDEC Standard Publication JESD79-2A, DDR2 SDRAM Specification, JEDEC Solid State Technology Association
• Micron Technical Note DDR333, Memory Design Guide for Two-DIMM Unbuffered Systems
Technical Support Assistance
Hotline: 1-800-LATTICE (North America)
+1-503-268-8001 (Outside North America)
e-mail:
[email protected]
Internet: www.latticesemi.com
Revision History
Date           Version   Change Summary
April 2006     01.0      Initial release.
June 2007      01.1      Updated Clocking Scheme diagram.
                         Added Design Guidelines to Optimize Performance section.
                         Added DDR/DDR2 SDRAM Memory Controller Performance section and DDR/DDR2 SDRAM Memory Controller On-Chip Resources section.
                         Added DDR/DDR2 MACO Memory Controller Design Kit Directory section.
July 2007      01.2      Added PLLs section.
August 2007    01.3      Updated DDR/DDR2 SDRAM Memory Controller Performance table.
                         Replaced references to “LatticeSC” with “LatticeSCM”.
                         Added LatticeSCM appendix.
January 2008   01.4      Updated User Interface text section.
                         Updated Write Timing diagram.
                         Updated GUI Dialog Box for DDR/DDR2 Memory Controller Configuration figure.
                         Updated GUI Dialog Box for DDR/DDR2 Memory Controller Configuration table.
July 2008      01.5      Updated appendix for LatticeSCM FPGAs.
July 2008      01.6      Document title changed from “LatticeSCM DDR/DDR2 SDRAM Controller MACO Cores User’s Guide” to “DDR/DDR2 SDRAM Controller MACO Cores User’s Guide”.
                         Updated Performance and Utilization table footnote in the Appendix for LatticeSCM FPGAs.
July 2008      01.7      Added information regarding READA and WRITEA commands to the Command Decode Logic text section.
May 2010       01.8      Modified DDR/DDR2 SDRAM Memory Controller Performance table and Clocking Scheme figure. Changed references of ddr_ref_clk to k_clk.
Appendix for LatticeSCM FPGAs
Table 18. Performance and Resource Utilization (1)
Configuration Type   Data Width   RA / CA Widths   LatticeSCM Device Speed   Slices   LUTs   Registers   PIOs
DDR2                 16           13 / 9           Typ. (-6)                 269      225    387         43
DDR2                 32           13 / 9           Typ. (-6)                 422      321    629         63
DDR2                 64           13 / 9           Typ. (-6)                 729      515    1113        103
DDR2                 72           13 / 9           Max. (-7)                 806      562    1234        113
1. Performance and utilization characteristics are generated using Lattice's ispLEVER® 7.1 software. When using this IP core with different
software or in a different speed grade, performance may vary. Not all configurations will fit on smaller LatticeSCM devices. These results
are from Synplify Pro v9.4L.
Ordering Part Number
All MACO IP, including the Ethernet flexiMAC™ Core, is pre-engineered and hardwired into the MACO structured
ASIC blocks of the LatticeSCM family of parts. Each LatticeSCM device contains a different collection of MACO IP.
Larger FPGA devices will have more instances of MACO IP. Please refer to the Lattice web pages on LatticeSCM
and MACO IP or see your local Lattice sales office for more information.
All MACO IP is licensed free of charge; however, a license key is required. See your local Lattice sales office for the
license key.