AN203230
Cypress S25FL-S Multi-I/O DDR DLP Optimizes Read Performance and
Reliability
This application note highlights the SPI flash DDR Quad I/O access protocol and how the use of DLP can optimize
performance and reliability.
1 Abstract
Today's embedded systems typically have larger code and data densities, faster start-ups, and higher application
performance requirements, while at the same time trying to reduce overall system cost. Cypress understands
these conflicting constraints and continues its product innovations to offer best-in-class NVM solutions that address
next-generation system requirements. Over the past decade, Cypress innovations have broadened our portfolio
focus beyond parallel NOR's Asynchronous, Page, and Burst interfaces to lower pin count, high read performance,
low-cost NOR SPI-based solutions.
SPI interfaces have progressed from a single-bit, SDR, unidirectional input and output (x1) interface to an
SDR/DDR four-bit, bidirectional (x4) interface. Several of today's leading NOR-based SPI memories achieve SDR
133-MHz clock rates and offer a DDR Quad I/O interface, providing 66 MB/s continuous read throughput using
legacy SPI timing modes. With the addition of DDR timing approaches and the application of a DLP (data learning
pattern), the Cypress S25FL-S SPI family improves read throughput by an additional 20 percent.
2 Multi I/O SPI Data Learning Pattern
Today's high-speed embedded systems have more complex OS and application requirements, which typically
result in increased read bandwidth requirements to provide acceptable performance at a neutral or lower cost
point. Multi I/O SPI-based flash is becoming a pseudo-industry standard; its key features are a low pin count
serial interface and bandwidths comparable to today's higher pin count parallel flash devices. The
Cypress NOR SPI-based S25FL-S Eclipse™ family offers Multi I/O read access capability, providing excellent
tradeoffs between reduced interface pin count and best-in-class read bandwidth. The S25FL-S optimizes the base
Multi I/O SPI-based read performance with the addition of DDR timing and the application of a data learning pattern
(DLP), improving read throughput by an additional 20 percent.
The following sections highlight the SDR Multi I/O Quad read mode operation and provide a comparison
against DDR Multi I/O Quad DLP.
www.cypress.com
Document No. 001-03230 Rev. *A
2.1 Quad Read Overview SDR
The Quad I/O SPI interface retains backward compatibility to support legacy x1 and x2 peripheral products. See
Figure 1.
Figure 1. High Level Quad I/O Master / Slave Interface
[Block diagram. Labels recovered from the figure: SOC / MCU, CPU, SPI Controller, Quad SPI Flash, Data Input, CS#, SCK, IO0, IO1, IO2, IO3.]
The SDR (single data rate) Multi I/O SPI read timing mode outputs a new data value upon each falling edge
of SCK. After a period known as the clock-to-data-out time (tV), data becomes valid and remains valid until shortly
after the next falling SCK edge (see Figure 2). The host typically uses this next falling SCK edge to capture the data
being output by the SPI flash. The hold time (tHO) defines the length of time that data remains valid after a falling
SCK edge.
Figure 2. Legacy SPI Timing: tDV = Psck - tV + tHO
[Timing diagram: SCK with period Psck and IO[3:0] carrying D1 Valid and D2 Valid, annotated with tV and tHO.]
Evaluating Data-Valid Time
With legacy SPI timing values, the size of the data-valid window (tDV) is the clock period (Psck) minus the time until
data becomes valid (tV) plus the hold time (tHO) after the next falling clock edge: tDV = Psck - tV + tHO

Example Legacy SDR Data Valid Window:
– Clock Period: Psck = 7.5 ns (133 MHz)
– Open: tV = 6.5 ns / Close: tHO = 0 ns
– Data Valid = Psck - tV + tHO: 7.5 ns - 6.5 ns + 0 ns = 1 ns
If tV and tHO are assumed to be fixed, the data-valid window shrinks as the SCK frequency increases, which
limits SCK to approximately 133 MHz.
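The shrinking window can be illustrated with a short calculation. The sketch below simply evaluates tDV = Psck - tV + tHO over a few clock rates while holding tV and tHO at the example values above; the function name and the list of frequencies are illustrative choices, not values taken from a data sheet.

```python
def legacy_data_valid_ns(freq_mhz, t_v_ns, t_ho_ns):
    """Legacy SPI data-valid window in ns: tDV = Psck - tV + tHO."""
    p_sck_ns = 1000.0 / freq_mhz  # clock period in ns
    return p_sck_ns - t_v_ns + t_ho_ns


# tV and tHO are held fixed at the example values from the text.
T_V_NS = 6.5
T_HO_NS = 0.0

# Illustrative sweep: the window narrows as the clock rate rises.
for freq_mhz in (50, 80, 104, 133):
    window_ns = legacy_data_valid_ns(freq_mhz, T_V_NS, T_HO_NS)
    print(f"{freq_mhz:>4} MHz: tDV = {window_ns:.2f} ns")
```

With these fixed timings the window is roughly 1 ns at 133 MHz and would reach zero once Psck equals tV, which is why the legacy analysis caps the usable clock rate.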
In an SPI device, the tV and tHO timings track each other: a device with a longer tV has a longer tHO, and a device
with a shorter tV has a shorter tHO. The position of the data-valid window therefore varies with respect to the next
falling clock edge and tV. Because tV and tHO track one another, the size of the data-valid window (tDV) can instead
be defined in terms of the output skew (tO_SKEW) and the output rise/fall time (tOTT):
tDV = Psck - tO_SKEW - tOTT
Understanding this tV/tHO timing relationship allows the flash data-valid window to be optimized.
Consider a 3 V flash using a 133-MHz clock (Psck = 7.5 ns), a tO_SKEW of 600 ps, and an output slew rate of
2 V/ns. The output rise/fall time (tOTT) is:
tOTT = Voutput_swing / Output_slew_rate [4] = 3 V / (2 V/ns) = 1.5 ns
tDV = Psck - tO_SKEW - tOTT
    = 7.5 ns - 600 ps - 1.5 ns = 5.4 ns
This 5.4 ns data-valid window is a significant improvement over the approximately 1 ns window obtained with the
legacy SPI timing analysis.
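The same arithmetic can be written as a small helper, which makes it easy to try other skew or slew-rate values. This is a minimal sketch of the two formulas above; the function and parameter names are illustrative, and all times are in nanoseconds.

```python
def output_transition_time_ns(swing_v, slew_v_per_ns):
    """tOTT = Voutput_swing / Output_slew_rate."""
    return swing_v / slew_v_per_ns


def sdr_data_valid_ns(p_sck_ns, t_o_skew_ns, t_ott_ns):
    """SDR data-valid window: tDV = Psck - tO_SKEW - tOTT."""
    return p_sck_ns - t_o_skew_ns - t_ott_ns


# Example values from the text: 3 V swing, 2 V/ns slew, 600 ps skew, Psck = 7.5 ns.
t_ott_ns = output_transition_time_ns(swing_v=3.0, slew_v_per_ns=2.0)
t_dv_ns = sdr_data_valid_ns(p_sck_ns=7.5, t_o_skew_ns=0.6, t_ott_ns=t_ott_ns)
print(f"tOTT = {t_ott_ns:.2f} ns, tDV = {t_dv_ns:.2f} ns")
```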
2.2 Multi I/O Quad Read DDR
These insights concerning the data-valid window can be applied to SPI DDR mode. In DDR mode the tO_SKEW and
tOTT values do not change, but new data is output every half clock cycle rather than after every full clock cycle, as
is the case for SDR mode (see Figure 3). With tCLH taken as the SCK high/low time (Psck/2 at a 50% duty cycle),
tDV is defined as:
tDV = tCLH - tO_SKEW - tOTT [5]
Figure 3. DDR Quad SPI Timing
The DDR SPI data-valid period for a system running at a given clock speed is identical to the data-valid period for
an SDR system running at twice that clock speed. This means a DDR SPI device can reliably achieve the same
read data rate at a significantly slower clock speed. For example, a Quad DDR SPI device with a clock operating at
80 MHz can achieve 80 MB/s (80 MHz x 2 edges x 4 bits = 640 Mbit/s). Note that operating at a slower clock
speed increases the data-valid time.
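The throughput figures quoted in this note follow from the clock rate, the number of I/Os, and the number of data edges per clock. The sketch below computes the peak continuous-read rate only; it ignores command, address, and dummy-cycle overhead, and it takes 1 MB as 10^6 bytes. The function name is illustrative.

```python
def peak_read_throughput_mb_s(clock_mhz, io_width, ddr):
    """Peak continuous-read rate in MB/s (1 MB = 10**6 bytes)."""
    edges_per_clock = 2 if ddr else 1          # DDR transfers on both edges
    mbit_per_s = clock_mhz * io_width * edges_per_clock
    return mbit_per_s / 8


quad_sdr_133 = peak_read_throughput_mb_s(133, io_width=4, ddr=False)
quad_ddr_80 = peak_read_throughput_mb_s(80, io_width=4, ddr=True)
print(f"Quad SDR @ 133 MHz: {quad_sdr_133} MB/s")
print(f"Quad DDR @  80 MHz: {quad_ddr_80} MB/s")
print(f"DDR gain: {100 * (quad_ddr_80 / quad_sdr_133 - 1):.0f} percent")
```

Comparing the two results reproduces the roughly 20 percent improvement cited for DDR Quad I/O over 133-MHz SDR Quad I/O.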
Consider a 3 V flash DDR read access using an 80-MHz clock (Psck/2 = 6.25 ns), a tO_SKEW of 600 ps, and an
output slew rate of 2 V/ns. Assume a 50% SCK duty cycle. The output rise/fall time (tOTT) is:
tOTT = Voutput_swing / Output_slew_rate [4] = 3 V / (2 V/ns) = 1.5 ns
tDV = Psck/2 - tO_SKEW - tOTT
    = 6.25 ns - 600 ps - 1.5 ns = 4.15 ns
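The example assumes a 50% duty cycle. A hedged sketch of how a non-ideal duty cycle might be folded into the same formula is shown below: it assumes the shorter of the two clock phases limits the window, which is an illustrative modeling choice rather than something stated in the data sheet. The function name and the 45% figure are also illustrative.

```python
def ddr_data_valid_ns(clock_mhz, t_o_skew_ns, t_ott_ns, duty_cycle=0.5):
    """DDR data-valid window: tDV = (shorter clock phase) - tO_SKEW - tOTT."""
    p_sck_ns = 1000.0 / clock_mhz
    # Assumption: the shorter of the high and low phases bounds the window.
    shortest_phase_ns = p_sck_ns * min(duty_cycle, 1.0 - duty_cycle)
    return shortest_phase_ns - t_o_skew_ns - t_ott_ns


ideal_ns = ddr_data_valid_ns(80, t_o_skew_ns=0.6, t_ott_ns=1.5)
skewed_ns = ddr_data_valid_ns(80, t_o_skew_ns=0.6, t_ott_ns=1.5, duty_cycle=0.45)
print(f"50% duty cycle: tDV = {ideal_ns:.3f} ns")
print(f"45% duty cycle: tDV = {skewed_ns:.3f} ns")
```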
SPI-DDR Read Operation — Data Learning Patterns
In a DDR implementation, the tV time can be greater than a half clock period, which means a specific SCK edge
cannot be used by the Master to reliably capture the data coming from the flash. The Master must skew the data-capture point with respect to each SCK edge in order to reliably capture data. Before detailing how the Master
might implement appropriate sampling skews, consider the six new SPI DDR read operations of the S25FL-S. The new
DDR read protocol is available for x1, x2, and x4 interfaces with either three- or four-byte addressing. Consider an
SPI-DDR read operation performed using a Quad I/O interface with three-byte addressing (command EDh). The
command sequence is much like a standard Quad I/O read operation, except that the address, mode,
and data bits are transferred on both rising and falling clock edges (DDR) rather than on a single edge as in the
standard Quad I/O SDR protocol. The SPI-DDR read protocol is processed as follows:
1. The instruction (command operation code) is transferred in an SDR manner for compatibility with all other
   legacy SPI instructions. After the instruction is sent, all remaining transfers are DDR.
2. Target address is transferred (DDR).
3. Mode bits are loaded (DDR).
4. Read latency (dummy) cycles are issued while target data is extracted from the array.
5. A data learning pattern (DLP) is output by the SPI NOR flash during the last four dummy cycles (DDR).
6. Target data is output by the SPI NOR device (DDR).
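The steps above can be sketched as a clock-cycle budget for one Quad I/O SPI-DDR read. The sketch assumes an 8-bit instruction sent one bit per clock on a single I/O, 8 mode bits, and eight bits per clock for every Quad DDR phase. The default of six dummy clocks is a placeholder only; the actual latency depends on device configuration and clock rate and should be taken from the data sheet. All names are illustrative.

```python
QUAD_DDR_BITS_PER_CLOCK = 4 * 2  # four I/Os, two data edges per clock
DLP_CLOCKS = 4                   # the DLP occupies the last four dummy clocks


def ddr_quad_read_phases(address_bytes=3, dummy_clocks=6, data_bytes=16):
    """Return (phase, clocks) pairs for one Quad I/O SPI-DDR read."""
    if address_bytes not in (3, 4):
        raise ValueError("address_bytes must be 3 or 4")
    if dummy_clocks < DLP_CLOCKS:
        raise ValueError("need at least four dummy clocks to carry the DLP")
    return [
        ("instruction (SDR, x1)", 8),  # assumed: 8 bits, one bit per clock
        ("address (DDR)", address_bytes * 8 // QUAD_DDR_BITS_PER_CLOCK),
        ("mode (DDR)", 8 // QUAD_DDR_BITS_PER_CLOCK),  # assumed: 8 mode bits
        ("dummy before DLP", dummy_clocks - DLP_CLOCKS),
        ("dummy carrying DLP (DDR)", DLP_CLOCKS),
        ("data (DDR)", data_bytes * 8 // QUAD_DDR_BITS_PER_CLOCK),
    ]


phases = ddr_quad_read_phases()
for name, clocks in phases:
    print(f"{name:<26} {clocks:>3} clocks")
print(f"{'total':<26} {sum(c for _, c in phases):>3} clocks")
```

Because data moves at one byte per clock, the fixed overhead of instruction, address, mode, and dummy clocks is amortized more effectively over longer reads.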
The new SPI-DDR protocol as shown in Figure 4 adds an 8-bit data-learning pattern (DLP) that is output by the
SPI flash using DDR protocol on the four dummy clock cycles just before output of the target data. The DLP
provides a known data sequence on each data signal so that the host controller can determine the optimal capture
timing to use when receiving the read data. The dummy cycles that carry the DLP occur during the idle period in
the legacy Quad I/O read protocol while the target data is retrieved from the memory array, so the DLP is output
without impacting performance. The DLP presents the same timing, phase delay, and skew characteristics that
exist during output of the target data. The clock to data output delay (tV in Figure 3) will be the same for the DLP
and the target data on each individual data signal. Data phase delay and skew arise from issues related to either
the memory device or the system environment. Memory device timing variations are caused by process, voltage,
temperature, and output-to-output skew. System-level variations are introduced by PCB parasitics, trace-length
mismatches, and bus capacitive loading. Collectively, these timing phase and skew relationships are represented
in Figure 3. The timing characteristics of the known data learning pattern will allow the host controller to
compensate for both the device- and system-level timing phase and skew offsets when valid data is present on
the bus.
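To make the mapping of the 8-bit DLP onto the four dummy clocks concrete, the sketch below lists which bit appears in each half of each clock on a single I/O. It assumes the pattern is shifted out most significant bit first, which matches the order in which the pattern is written in binary but should be confirmed against the data sheet. The pattern value 34h is only an example, and the function name is illustrative.

```python
def dlp_edge_sequence(dlp_value):
    """Return (dummy_clock, half, bit) tuples for one I/O, in output order.

    half is 0 for the first half of the clock and 1 for the second half.
    Assumption: the DLP is shifted out most significant bit first.
    """
    if not 0 <= dlp_value <= 0xFF:
        raise ValueError("the DLP is an 8-bit value")
    bits = [(dlp_value >> shift) & 1 for shift in range(7, -1, -1)]
    return [(index // 2, index % 2, bit) for index, bit in enumerate(bits)]


for clock, half, bit in dlp_edge_sequence(0x34):
    print(f"DLP clock {clock}, half {half}: {bit}")
```

Eight bits at two bits per clock fill exactly four clocks, which is why the DLP fits in the last four dummy cycles.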
Figure 4. Quad IO SPI-DDR Showing DLP Read Transaction
Key Points:
• Data Learning Pattern output by memory
• Oversampled by host
• Optimal data capture point determined
• Data read from device
  – Calibration upon every read transaction
  – Provides compensation for process, voltage, temperature
• 160 Mbps (per IO) data rate today
  – Strategy extendable to higher rates
2.3 Data Learning Pattern Storage and Definition
The non-volatile data learning register (NVDLR) and the volatile data learning register (VDLR) are used to define
the sequence of DLP values (8 bits on each of the four I/Os) that are used during an SPI-DDR read operation to
train the host controller (see Figure 5). The NVDLR is one-time programmable (OTP) and can hold a customer-specific
DLP value. During power-up or reset, the value in the NVDLR is loaded into the VDLR. During SPI-DDR read
operations, the sequence of values output as the DLP is the one held in the VDLR. The VDLR can be read and written
directly by the host system. When the VDLR is 00h, the DLP is not output during DDR read operations, which
provides an option to turn off the DLP.
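The register behavior described above can be captured in a small host-side model, which may be useful for unit-testing driver logic. This is a behavioral sketch only: it models just the rules stated in the text, the class and method names are hypothetical, and the unprogrammed NVDLR value of 00h is an assumption. It does not represent the device's command set.

```python
class DataLearningRegisters:
    """Behavioral model of the NVDLR/VDLR pair (illustrative only)."""

    def __init__(self):
        self.nvdlr = 0x00            # assumed unprogrammed value
        self.vdlr = 0x00
        self._nvdlr_programmed = False

    def program_nvdlr(self, value):
        """Program the one-time-programmable NVDLR."""
        if self._nvdlr_programmed:
            raise RuntimeError("NVDLR is OTP and has already been programmed")
        self.nvdlr = value & 0xFF
        self._nvdlr_programmed = True

    def reset(self):
        """Power-up or reset: the NVDLR value is loaded into the VDLR."""
        self.vdlr = self.nvdlr

    def write_vdlr(self, value):
        """The host can write the volatile register directly."""
        self.vdlr = value & 0xFF

    def dlp_enabled(self):
        """A VDLR value of 00h turns the DLP off."""
        return self.vdlr != 0x00


regs = DataLearningRegisters()
regs.program_nvdlr(0x34)
regs.reset()
print(f"VDLR after reset: {regs.vdlr:02X}h, DLP enabled: {regs.dlp_enabled()}")
regs.write_vdlr(0x00)
print(f"VDLR after write: {regs.vdlr:02X}h, DLP enabled: {regs.dlp_enabled()}")
```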
The choice of an appropriate DLP is up to the system developer, but the pattern should be chosen to maximize
skews that depend on the bit-stream sequence. The most significant pattern-dependent skew is bounded by
transitions from states that have been stable for extended periods and transitions from states that have existed for
shorter periods of time. For example, at higher frequencies, the High-to-Low transition of a signal behaves slightly
differently when the bit stream is xx110 than when the bit stream is xx010. The High reached in the xx110 pattern is
usually a higher voltage than the High reached with the xx010 pattern. The higher starting voltage means that it
takes slightly longer to reach a valid Low state during a High-to-Low transition. The xx110 High-to-Low transition
demonstrates a ‘strong 1,’ while the xx010 transition demonstrates a ‘weak 1.’
The DLP pattern should be chosen to include at least one instance of each of the following: weak 0, strong 0,
weak 1, and strong 1.
One pattern fulfilling these requirements is 34h (00110100b). The edges in the 34h pattern step through the
following transitions: Strong 0 -> Strong 1 -> Weak 0 -> Weak 1
Any data learning pattern that includes these four transition types should maximize pattern-dependent skew
characteristics.
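A candidate pattern can be checked for the four transition types with a short script. The sketch below treats a level held for two or more bit times before a transition as "strong" and a level held for a single bit time as "weak"; that threshold is an interpretation of the description above, and the function names are illustrative. The pattern is read most significant bit first, and the final run is ignored because no transition follows it within the pattern.

```python
def dlp_transition_types(pattern, width=8):
    """List the transition types in a DLP, in the order they occur."""
    bits = [(pattern >> shift) & 1 for shift in range(width - 1, -1, -1)]
    types = []
    run_start = 0
    for index in range(1, width):
        if bits[index] != bits[index - 1]:
            run_length = index - run_start
            # Interpretation: two or more bit times at a level is "strong".
            strength = "strong" if run_length >= 2 else "weak"
            types.append(f"{strength} {bits[index - 1]}")
            run_start = index
    return types


def covers_all_transition_types(pattern, width=8):
    """True if the pattern contains weak 0, strong 0, weak 1, and strong 1."""
    required = {"weak 0", "strong 0", "weak 1", "strong 1"}
    return required <= set(dlp_transition_types(pattern, width))


print(dlp_transition_types(0x34))
print(covers_all_transition_types(0x34))
print(covers_all_transition_types(0x55))
```

For 34h the script reports the same sequence given above (strong 0, strong 1, weak 0, weak 1), while an alternating pattern such as 55h contains only weak transitions and fails the check.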
2.4 Host Capture Strategy
The overall data-capture strategy for the host memory controller is to use the DLP input as a test sequence to
characterize system response and determine tV and tDV. Once the data eye has been identified during the DLP
portion of the DDR read sequence, the controller selects the optimal data-capture point to maximize the timing
margin for the read data.
A common way to create the Master data-capture logic is via a series of skewed data-capture points that span the
data-valid window. The implementation for a single I/O might consist of five channels with a fixed sampling delay
between each of the channels. The five delayed strobes (A through E) could be generated with a delay-locked
loop (DLL) or with an oversampling clock derived from an internally available higher-frequency clock. The host
controller samples the target I/O while the DLP is being output. The phase-delayed strobes (A-E) are triggered by
the eight SCK edges that occur while the DLP is output.
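One way the tap selection might be expressed in software is sketched below: each tap's eight captured bits are compared with the expected DLP, and the tap at the center of the longest run of matching taps is chosen. This is an illustrative model of the decision logic, not a description of any particular controller. The function name and the sample captures are made up for the example, and when a run has an even length the earlier of the two middle taps is picked.

```python
def select_capture_tap(tap_samples, expected_dlp):
    """Pick the tap at the center of the longest run of taps that match the DLP.

    tap_samples maps tap name -> list of captured bits, ordered by delay.
    Returns None if no tap captured the pattern correctly.
    """
    names = list(tap_samples)
    passing = [tap_samples[name] == expected_dlp for name in names]
    best_start, best_length, run_start = None, 0, None
    # A trailing False closes any run that reaches the last tap.
    for index, ok in enumerate(passing + [False]):
        if ok and run_start is None:
            run_start = index
        elif not ok and run_start is not None:
            if index - run_start > best_length:
                best_start, best_length = run_start, index - run_start
            run_start = None
    if best_length == 0:
        return None
    return names[best_start + (best_length - 1) // 2]


expected = [0, 0, 1, 1, 0, 1, 0, 0]          # 34h, MSB first (assumed order)
captures = {                                  # hypothetical captures
    "A": [1, 0, 0, 1, 1, 0, 1, 0],            # too early: still sees old data
    "B": list(expected),
    "C": list(expected),
    "D": list(expected),
    "E": [0, 1, 1, 0, 1, 0, 0, 1],            # too late: sees the next bit
}
print(select_capture_tap(captures, expected))
```

Choosing the center of the passing run, rather than the first passing tap, leaves margin on both sides of the capture point as conditions drift.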
Figure 5. Host Capture Strategy
Key Points:
• Data oversampled
• Samples from ‘taps’ B, C, and D always successfully capture the DLP
• In this case Tap C provides the greatest margin
• Use Tap C to capture data for the remainder of this read transaction
• Recalibration can be performed prior to every read transaction, which provides more robust and reliable
  operation across operating conditions
3 Conclusion
As embedded applications continue to demand higher performance, the legacy SPI interface and protocols must
continue to accommodate higher read speeds. The DLP approach enables SPI-DDR NOR flash to move beyond
today's 133 Mbps (per pin) data rate. This new feature provides embedded designers with another attractive
nonvolatile memory solution to maximize read data throughput while minimizing pin count, PCB complexity,
package size, and cost. These principles and feature improvements apply across many market applications,
whether industrial, automotive graphics, or consumer; DLP-enabled SPI-DDR NOR flash provides another
enhanced solution to improve system design and performance, all for a reasonable cost.
4 References
Cypress: S25FL256S Data Sheet
Cypress Article: Data learning and SPI boost NOR flash performance by Cliff Zitlaw
Document History Page
Document Title: AN203230 - Cypress S25FL-S Multi-I/O DDR DLP Optimizes Read Performance and Reliability
Document Number: 001-03230
Rev.  ECN No.  Orig. of Change  Submission Date  Description of Change
**    –        –                05/20/2015       Initial version
*A    5041698  MSWI             12/08/2015       Updated in Cypress template
Worldwide Sales and Design Support
Cypress maintains a worldwide network of offices, solution centers, manufacturers’ representatives, and distributors. To find the
office closest to you, visit us at Cypress Locations.
MirrorBit®, MirrorBit® Eclipse™, ORNAND™, EcoRAM™ and combinations thereof, are trademarks and registered trademarks of Cypress Semiconductor Corp. All
other trademarks or registered trademarks referenced herein are the property of their respective owners.
Cypress Semiconductor
198 Champion Court
San Jose, CA 95134-1709
Phone: 408-943-2600
Fax: 408-943-4730
Website: www.cypress.com
© Cypress Semiconductor Corporation, 2015. The information contained herein is subject to change without notice. Cypress Semiconductor Corporation assumes
no responsibility for the use of any circuitry other than circuitry embodied in a Cypress product. Nor does it convey or imply any license under patent or other rights.
Cypress products are not warranted nor intended to be used for medical, life support, life saving, critical control or safety applications, unless pursuant to an express
written agreement with Cypress. Furthermore, Cypress does not authorize its products for use as critical components in life-support systems where a malfunction or
failure may reasonably be expected to result in significant injury to the user. The inclusion of Cypress products in life-support systems application implies that the
manufacturer assumes all risk of such use and in doing so indemnifies Cypress against all charges.
This Source Code (software and/or firmware) is owned by Cypress Semiconductor Corporation (Cypress) and is protected by and subject to worldwide patent protection (United States and foreign), United States copyright laws and international treaty provisions. Cypress hereby grants to licensee a personal, non-exclusive,
non-transferable license to copy, use, modify, create derivative works of, and compile the Cypress Source Code and derivative works for the sole purpose of creating
custom software and or firmware in support of licensee product to be used only in conjunction with a Cypress integrated circuit as specified in the applicable agreement. Any reproduction, modification, translation, compilation, or representation of this Source Code except as specified above is prohibited without the express written permission of Cypress.
Disclaimer: CYPRESS MAKES NO WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, WITH REGARD TO THIS MATERIAL, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Cypress reserves the right to make changes without
further notice to the materials described herein. Cypress does not assume any liability arising out of the application or use of any product or circuit described herein.
Cypress does not authorize its products for use as critical components in life-support systems where a malfunction or failure may reasonably be expected to result in
significant injury to the user. The inclusion of Cypress' product in a life-support systems application implies that the manufacturer assumes all risk of such use and in
doing so indemnifies Cypress against all charges.
Use may be limited by and subject to the applicable Cypress software license agreement.