
SDRAM Performance on
Alchemy™ Au1000™,
Au1100™ and Au1500™
Processors from AMD
Application Note
Revision: 1.2
Issue Date: October 2002
© 2002 Advanced Micro Devices, Inc. All rights reserved.
The contents of this document are provided in connection with Advanced Micro Devices,
Inc. (“AMD”) products. AMD makes no representations or warranties with respect to the
accuracy or completeness of the contents of this publication and reserves the right to make
changes to specifications and product descriptions at any time without notice. No license,
whether express, implied, arising by estoppel or otherwise, to any intellectual property
rights is granted by this publication. Except as set forth in AMD’s Standard Terms and
Conditions of Sale, AMD assumes no liability whatsoever, and disclaims any express or
implied warranty, relating to its products including, but not limited to, the implied warranty
of merchantability, fitness for a particular purpose, or infringement of any intellectual property right.
AMD’s products are not designed, intended, authorized or warranted for use as components in systems intended for surgical implant into the body, or in other applications
intended to support or sustain life, or in any other application in which the failure of
AMD’s product could create a situation where personal injury, death, or severe property or
environmental damage may occur. AMD reserves the right to discontinue or make changes
to its products at any time without notice.
Contacts
www.amd.com [email protected]
Trademarks
AMD, the AMD Arrow logo, and combinations thereof, and Au1000, Au1100, Au1500, and Alchemy are trademarks of
Advanced Micro Devices, Inc.
Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.
Rev. 1.2
October 2002
SDRAM Performance on
Au1000™, Au1100™ and Au1500™ Processors
1. Introduction
The Au1000™, Au1500™ and Au1100™ (Au1x00) processors each feature an integrated, high-performance SDRAM controller for connecting to 32-bit wide external SDRAM memory. This document describes the performance characteristics of the SDRAM controller and techniques for
optimizing SDRAM performance.
2. SDRAM Controller Overview
The SDRAM controller on an Au1x00 processor supports three ranks of 32-bit wide SDRAM. A rank
is a physical grouping of SDRAM devices, all tied to the same chip select. The term ‘rank’ is used to
distinguish the physical grouping of the SDRAM chips from the internal banks of an SDRAM chip.
Each rank is independently programmed with row size, column size, bank size, RAS, CAS and other
timing values (as further described in the data books).
The SDRAM data interface is 32-bits wide and supports various arrangements of SDRAM devices:
• one 32-bit wide SDRAM device
• two 16-bit wide SDRAM devices
• four 8-bit wide SDRAM devices
The SDRAM controller supports a maximum of 6 loads, so the physical arrangement of the SDRAM
limits the number of devices that can be installed in a system.
The SDRAM interface operates at one-half the Au1x00 internal system bus clock, up to a maximum
frequency of 125 MHz. Utilizing a 396-MHz Au1x00 system as an example, the Au1 core runs at
396 MHz (sys_cpupll = 33), the system bus operates at one-half the Au1 core frequency at 198 MHz
(sys_powerctrl[SD]=00), thus the SDRAM interface operates at 99 MHz.
NOTE: For an Au1x00 processor rated to run at 492 MHz, an SDRAM clock of
123 MHz is possible, and this is a frequency within the operating capability of the
controller and commodity PC desktop SDRAM (133-MHz PC133).
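The clock relationships described above can be checked with a short calculation. The following is a minimal sketch, not vendor code: the function name and the `bus_divisor` parameter are illustrative assumptions, the divisor of 2 corresponds to the sys_powerctrl[SD]=00 case described in the text, and the 12-MHz multiplier base is the one given in Section 5.

```python
def au1x00_clocks_mhz(sys_cpupll, bus_divisor=2, max_sdclk_mhz=125):
    """Return (core, system bus, SDRAM) frequencies in MHz.

    bus_divisor is the system bus divisor itself (2 for sys_powerctrl[SD]=00);
    the encodings of other SD values are not covered here (assumption).
    """
    core = sys_cpupll * 12        # Au1 core: sys_cpupll times 12 MHz
    bus = core / bus_divisor      # system bus: core divided by the bus divisor
    sdram = bus / 2               # SDRAM interface: one-half the system bus
    if sdram > max_sdclk_mhz:
        raise ValueError(f"SDRAM clock {sdram} MHz exceeds {max_sdclk_mhz} MHz")
    return core, bus, sdram


print(au1x00_clocks_mhz(33))   # the 396-MHz example
print(au1x00_clocks_mhz(41))   # the 492-MHz case from the note
```

For sys_cpupll = 33 this gives a 396-MHz core, a 198-MHz system bus and a 99-MHz SDRAM clock; for sys_cpupll = 41 (492 MHz) the SDRAM clock is 123 MHz.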
The Au1 core and system bus operate at higher frequencies than the SDRAM interface; as a result, the
Au1x00 processor is capable of saturating the SDRAM interface on both reads and writes from/to the
SDRAM. Thus, with a properly configured SDRAM interface, the actual, realized SDRAM throughput is not limited by the SDRAM controller; it is instead highly dependent upon the run-time aspects
of the software and applications running in the system.
3. SDRAM Performance
The performance of the SDRAM interface is measured in throughput, which is the amount of data
that can be transferred to and from the SDRAM in a given time period. The maximum throughput of
the SDRAM controller is approximated by this equation:
TP = SDCLK * BPT * ((RR / (2+CPR)) + (WR / CPW))
where
• TP is the throughput (bytes/second)
• SDCLK is the SDRAM clock frequency (cycles/second)
• BPT is the number of bytes per transfer, 32 (bytes/transfer)
• RR is the read ratio; the fraction of accesses that are reads, expressed as a decimal, e.g.
0.75 for 75% (RR + WR must equal 1.0)
• CPR is the cycles per read for BPT bytes (cycles/transfer), not counting the 2 cycles for
synchronizing to the internal system bus, which the equation adds separately
• WR is the write ratio; the fraction of accesses that are writes, expressed as a decimal, e.g.
0.25 for 25% (RR + WR must equal 1.0)
• CPW is the cycles per write for BPT bytes (cycles/transfer)
This equation is an approximation since the actual number of bytes per transfer (BPT) to/from
SDRAM is not always 32 (8-word burst), and because the read ratio (RR) and write ratio (WR) vary
greatly depending upon the software applications running in the system. Heuristically, read accesses
(instruction fetches and data loads) out-number write accesses (data stores) by three to one. Furthermore, this approximation does not take into account other aspects of SDRAM, such as refresh cycles
and open banks.
Nonetheless, an approximate SDRAM throughput for the example 396-MHz Au1x00 system utilizing
SDRAM with CAS latency of 2 becomes:
TP = 99MHz * 32 * ((0.75 / (2+12)) + (0.25 / 10)) = 248.9MB/s
The read ratio RR is 0.75 and the write ratio WR is 0.25 to reflect that reads heuristically outnumber
writes three-to-one. The cycles per read CPR is 12 (1 cycle activate command + 1 cycle nop + 1 cycle
read command + 1 cycle CAS latency + 8 cycles for burst of 8 words); the equation adds 2 cycles for
synchronizing to the internal system bus, for a total of 14 cycles per read. The cycles per write CPW
is 10 (1 cycle activate command + 1 cycle nop + 8 cycles for burst of 8 words).
The SDRAM throughput approximation equation tends to yield a realistic upper-bound on the
SDRAM throughput. By profiling the application(s) running on the Au1x00 system, one can obtain
more accurate values for the read ratio RR, the write ratio WR, and the average number of bytes per
transfer BPT which in turn will yield a closer approximation for the actual SDRAM throughput.
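The approximation is easy to evaluate with profiled values. The sketch below is illustrative only; the function and parameter names are assumptions chosen to mirror the symbols in the equation, and the 2 synchronization cycles are kept as a separate argument, as in the equation.

```python
def sdram_throughput(sdclk_hz, rr, cpr, cpw, bpt=32, sync_cycles=2):
    """Approximate SDRAM throughput in bytes/second.

    TP = SDCLK * BPT * ((RR / (2 + CPR)) + (WR / CPW)), with WR = 1 - RR.
    """
    wr = 1.0 - rr
    transfers_per_cycle = rr / (sync_cycles + cpr) + wr / cpw
    return sdclk_hz * bpt * transfers_per_cycle


# The 396-MHz example: 99-MHz SDRAM clock, CAS latency 2, reads 3:1 over writes.
tp = sdram_throughput(99e6, rr=0.75, cpr=12, cpw=10)
print(f"{tp / 1e6:.1f} MB/s")
```

This reproduces the 248.9 MB/s figure; substituting a profiled read ratio or a smaller average transfer size shows how far a given workload falls below that upper bound.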
In an Au1x00-based design, the following items can be tuned (both by hardware and software decisions) to achieve the best overall SDRAM throughput:
• SDRAM clock
• CAS latency
• Refreshes
• Access pattern
In practice, the SDRAM throughput is influenced (both positively and negatively) by the software
running in the system. Software initiates the accesses to SDRAM by requesting instruction fetches,
data loads and stores, or DMA activity. The access pattern generated by software determines how
efficiently the SDRAM interface is utilized to transfer data to and from memory, and ultimately the
throughput to the SDRAM.
In the examples that follow, a 99-MHz SDRAM clock and SDRAM with a CAS latency of 2 are assumed
unless otherwise stated.
3.1 SDRAM Clock
The SDRAM clock is the dominating factor in determining SDRAM throughput. Simply stated, the
higher the frequency of the SDRAM clock, the higher the SDRAM throughput.
Since the SDRAM clock frequency is derived from the Au1 core frequency, the process of selecting a
specific speed-grade Au1x00 device should take into account the resulting SDRAM clock frequency.
Table 3 (“Au1 Core and SDRAM frequencies”) lists some common Au1 core and SDRAM
frequencies.
3.2 SDRAM CAS Latency
The SDRAM CAS latency determines the number of cycles until data can be read from the SDRAM.
The CAS latency varies depending upon the SDRAM device, but is usually either 2 or 3. An SDRAM
device with a CAS latency of 2 will yield better throughput than an SDRAM with CAS latency of 3.
For a 396-MHz Au1x00 system utilizing SDRAM with a CAS latency of 3, the approximate throughput becomes:
TP = 99MHz * 32 * ((0.75 / (2+13)) + (0.25 / 10)) = 237.6MB/s
Utilizing CAS 3 SDRAM reduces throughput by 11.3MB/s relative to the CAS 2 SDRAM. Clearly
utilizing an SDRAM with a low CAS latency is desirable.
3.3 SDRAM Refreshes
As is the case for all dynamic RAM, SDRAM requires periodic refreshes to ensure the integrity of the
data arrays. During an SDRAM refresh, accesses by the core are stalled until the refresh completes.
As a result, the actual number of transfers that can occur to the SDRAM is reduced, which in turn
reduces the overall SDRAM throughput.
For example, many SDRAMs require periodic refresh of all rows every 64 milliseconds. For SDRAM
devices with 4096 rows, this equates to a refresh cycle every 15.7 microseconds or 63694 refreshes
per second. For 100-MHz SDRAM devices, an auto refresh phase typically requires about nine cycles
(Trp + Trc); thus, 63694 * 9 = 573246 SDRAM clock cycles are consumed each second by refreshes,
and not by data accesses. While the number is a small fraction of the available 99-MHz SDRAM
clock cycles (0.58%), it does represent a throughput reduction of nearly 1.5MBytes/second.
Refresh cycles have another adverse effect: all the SDRAM banks must be closed prior to the refresh
cycle. As a result, the next read or write to the SDRAM must first re-open the bank, which increases
the number of cycles needed for that read or write. With a refresh interval of 15.7 microseconds, only
1570 SDRAM clocks occur between refreshes (15.7e-6 / 10.0e-9), which permits a maximum of 112
burst reads (1570 / 14 cycles per read) between refreshes before all banks are closed again for the next refresh.
Clearly, a refresh interval that is too conservative (shorter than necessary) only reduces the SDRAM
throughput with no additional benefit.
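The refresh arithmetic above can be reproduced for other refresh intervals. This is a rough sketch using the figures from the text (a 15.7-microsecond interval, about nine cycles per auto refresh, a 10-ns clock period rounded from 10.1 ns, and 14 cycles per burst read); the function name and its parameters are illustrative assumptions, and integer nanoseconds are used to avoid rounding surprises.

```python
def refresh_overhead(refresh_interval_ns, sdclk_period_ns, sdclk_hz,
                     cycles_per_refresh=9, cycles_per_burst_read=14):
    """Estimate the SDRAM clock cycles consumed by refresh."""
    refreshes_per_second = 1_000_000_000 // refresh_interval_ns
    cycles_lost = refreshes_per_second * cycles_per_refresh
    clocks_between = refresh_interval_ns // sdclk_period_ns
    return {
        "refreshes_per_second": refreshes_per_second,
        "cycles_lost_per_second": cycles_lost,
        "fraction_of_clock": cycles_lost / sdclk_hz,
        "clocks_between_refreshes": clocks_between,
        "burst_reads_between_refreshes": clocks_between // cycles_per_burst_read,
    }


overhead = refresh_overhead(15_700, 10, 99_000_000)
for name, value in overhead.items():
    print(name, value)
```

Halving the refresh interval in this model doubles the cycles lost and halves the number of burst reads that fit between refreshes, which is why an overly conservative interval costs throughput.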
3.4 SDRAM Access Pattern
The SDRAM clock, CAS latency and refresh interval are all “static” variables that can be optimized
according to the SDRAM devices to ensure efficient operation. However, it is the “dynamic” variable,
the software, that influences, both positively and negatively, the actual, realized SDRAM throughput.
This “dynamic” variable is the access pattern created by software during run-time, and can be generally classified as follows:
• burst accesses
• single beat accesses
• locality of references
The access patterns are all a direct result of the software running in the system.
3.4.1 Burst Accesses
All SDRAM accesses require a few cycles of setup time before data is accessed. To maximize
throughput, the setup time needs to be minimized. Since the steps to access SDRAM are pre-determined, the setup time typically cannot be reduced, but it can be amortized to minimize its impact.
Burst accesses amortize the setup time by reading/writing more data for the same, fixed setup time.
Burst accesses can be initiated by instruction cache loads, data cache loads, data cache line cast-out,
the write-buffer, or DMA engines. The Au1 core utilizes 8-word (32-byte) cache lines, the write-buffer can burst up to 8 words (32 bytes), and DMA engines can burst as well. The SDRAM throughput approximation equation intentionally utilizes 32-byte transfers to match the Au1 core cache line
size; after all, the majority of SDRAM accesses are initiated by the caches.
For an 8-word read burst to SDRAM that requires 12 cycles, the efficiency of the access is 32 bytes /
12 cycles = 2.67 bytes/cycle. The ideal maximum is to sustain 4 bytes/cycle for a 32-bit SDRAM
interface, but this is not possible due to the non-zero setup time required for all SDRAM accesses.
Only in a system where the majority of the SDRAM accesses are burst accesses can the actual
SDRAM throughput approach the maximum.
3.4.2 Single-Beat Accesses
Single-beat accesses occur when only a single memory location is accessed per SDRAM access. As
such, the SDRAM setup time dominates the access. And by the very nature of the single-beat access,
less data is read or written which lowers the overall SDRAM throughput.
For a single word read access to SDRAM, 6 cycles are necessary which yields 4 bytes / 6 cycles =
0.67 bytes/cycle. Compared to an 8-word read burst access (2.67 bytes/cycle), the single-beat access
is four times less efficient.
Single-beat accesses degrade overall SDRAM throughput. But single-beat accesses do occur and can
be initiated by data cache write misses (because the Au1 core uses a read-allocate cache policy) or by
non-cached reads and writes. If the access is a write, the access travels through the write-buffer on its
way to SDRAM.
Performance features within the write-buffer can reduce the inefficiency of the single-beat access.
The write-buffer can do both merging and gathering. Merging delays writes to the same word address
until a new word address is presented to the write-buffer. Gathering bundles sequential word
addresses into a burst access to SDRAM. See the description in the Au1x00 data books (“2.3 Write
Buffer”) for further details.
For non-cacheable accesses to take advantage of the write-buffer features, the write must be marked
with the CCA encoding 7 (see “2.2.4 Cache Coherency Attributes” in the data books). Cache-able
write accesses that miss in the data cache are automatically marked merge-able and gatherable to the
write-buffer.
If the access is a non-cached read, the access is the least efficient access possible.
3.4.3 Locality of References
The phrase “locality of reference” means that related instructions and/or data are physically located
near one another in memory. The phenomenon of “locality of reference” drove the creation of instruction and data caches that reaped huge overall system performance increases.
For example, a subroutine or an array of data is merely a sequence of items, physically located in adjacent memory addresses. If the subroutine fetches an instruction from memory address X, the instruction cache fetches that instruction from memory, along with several subsequent instructions, into its
cache. Then when the subroutine requests the instruction from address X + 1 (and there is a high
probability that this will be the case), the instruction cache already has the instruction and eliminates
the need to access memory.
The same locality of reference phenomenon applies equally to the memory cells; if cell X was just
accessed, then there is a high probability that cell X + 1 will be next. This mirrors the behavior of the
instruction and data caches and is the reason SDRAMs have burst capability.
3.4.3.1 SDRAM Open Bank
SDRAMs take advantage of the locality of reference phenomenon with the “open bank policy”. The
open bank policy allows an SDRAM bank to remain open after an access so as to reduce the setup
time needed for a subsequent access to the same row within the open bank. By reducing the setup
time, the efficiency of the SDRAM interface increases and throughput increases as well.
The Au1x00 SDRAM controller has the ability to keep 4 banks open per rank, for a total of 12 banks
open at a time. When an access to SDRAM occurs, the controller automatically leaves the bank open.
If an access to an open bank occurs and it is to the same row, the SDRAM controller simply asserts
CAS# and completes the access. This removes up to two cycles from the access and permits faster
accesses to frequently accessed banks.
For a series of 8-word burst accesses to the same row and bank, the number of cycles required is
(12 – 2) = 10 cycles. The efficiency of the access is now 32 bytes / 10 cycles = 3.2 bytes/cycle, a 20%
improvement over the 2.67 bytes/cycle for a non-open bank burst access.
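The efficiency figures quoted for burst, single-beat and open-bank accesses all come from the same ratio of bytes moved to cycles spent. A small sketch, with an illustrative helper name and the cycle counts taken from the text:

```python
def bytes_per_cycle(words, cycles, bytes_per_word=4):
    """Interface efficiency of one SDRAM access on a 32-bit wide interface."""
    return words * bytes_per_word / cycles


burst = bytes_per_cycle(8, 12)       # 8-word burst, bank not open
single = bytes_per_cycle(1, 6)       # single-beat read
open_bank = bytes_per_cycle(8, 10)   # 8-word burst to an open row

print(f"burst     {burst:.2f} bytes/cycle")
print(f"single    {single:.2f} bytes/cycle")
print(f"open bank {open_bank:.2f} bytes/cycle")
```

None of the three reaches the ideal 4 bytes/cycle, because every access carries a non-zero setup time; the open-bank burst comes closest.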
To take explicit advantage of SDRAM open bank feature, the configuration of the SDRAM needs to
be taken into consideration. As a direct result of the locality of reference phenomenon, the open bank
feature is best exploited with a large row size. Various SDRAM configurations are provided in
Table 1 to illustrate the effect the SDRAM configuration has on row size.
Table 1: SDRAM Device Configurations
SDRAM Device Density                # of Bits in BS, RS, CS  Rank Configuration (Total Size)  Bank Size  Row Size
16Mb (2MB) 1Mbit x 8 x 2 banks      BS=1,RS=11,CS=9          x4 (8MB)                         4MB        2KB
16Mb (2MB) 512Kbit x 16 x 2 banks   BS=1,RS=11,CS=8          x2 (4MB)                         2MB        1KB
64Mb (8MB) 2Mbit x 8 x 4 banks      BS=2,RS=12,CS=9          x4 (32MB)                        8MB        2KB
64Mb (8MB) 1Mbit x 16 x 4 banks     BS=2,RS=12,CS=8          x2 (16MB)                        4MB        1KB
64Mb (8MB) 512Kbit x 32 x 4 banks   BS=2,RS=12,CS=7          x1 (8MB)                         2MB        512B
64Mb (8MB) 512Kbit x 32 x 4 banks   BS=2,RS=11,CS=8          x1 (8MB)                         2MB        1KB
128Mb (16MB) 4Mbit x 8 x 4 banks    BS=2,RS=12,CS=10         x4 (64MB)                        16MB       4KB
128Mb (16MB) 2Mbit x 16 x 4 banks   BS=2,RS=12,CS=9          x2 (32MB)                        8MB        2KB
128Mb (16MB) 1Mbit x 32 x 4 banks   BS=2,RS=12,CS=8          x1 (16MB)                        4MB        1KB
256Mb (32MB) 8Mbit x 8 x 4 banks    BS=2,RS=13,CS=10         x4 (128MB)                       32MB       4KB
256Mb (32MB) 4Mbit x 16 x 4 banks   BS=2,RS=13,CS=9          x2 (64MB)                        16MB       2KB
256Mb (32MB) 2Mbit x 32 x 4 banks   BS=2,RS=13,CS=8          x1 (32MB)                        8MB        1KB
The last two columns yield the effective bank and row size for the given SDRAM configuration. Software can take advantage of both the bank size and row size to improve SDRAM throughput.
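The bank and row sizes in Table 1 follow directly from the number of bank, row and column address bits and the 32-bit (4-byte) rank width. The sketch below shows one way to derive them; the function name is an illustrative assumption, not part of any Au1x00 software.

```python
KB = 1024
MB = 1024 * KB


def rank_geometry(bs_bits, rs_bits, cs_bits, rank_width_bytes=4):
    """Return (bank size, row size, total rank size) in bytes."""
    row_size = (1 << cs_bits) * rank_width_bytes   # columns per row x 4 bytes
    bank_size = (1 << rs_bits) * row_size          # rows per bank x row size
    total_size = (1 << bs_bits) * bank_size        # banks x bank size
    return bank_size, row_size, total_size


# 64Mb devices, 2Mbit x 8 x 4 banks, four devices per rank
bank, row, total = rank_geometry(bs_bits=2, rs_bits=12, cs_bits=9)
print(bank // MB, "MB bank,", row // KB, "KB row,", total // MB, "MB rank")
```

For BS=2, RS=12, CS=9 this yields an 8MB bank, a 2KB row and a 32MB rank, matching the corresponding row of Table 1.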
The Au1x00 SDRAM controller can keep four banks open simultaneously per rank. To take advantage of this, software should align program code and data on SDRAM bank boundaries. With this
approach, instructions and data reside in different banks allowing the SDRAM controller to keep the
banks open simultaneously, thus improving access time to these items. Since the controller can maintain four open banks per rank, software should consider partitioning instruction and data sets further
to align on as many of the available SDRAM banks as possible. (For example, program code, stack,
heap, global data, network buffers, and disk buffers are all candidates for aligning on an SDRAM
bank boundary).
The larger the row size, the better the likely utilization of the SDRAM open bank feature—a direct
result of the locality of reference phenomenon. Software should group commonly accessed items
(such as instructions, data sets and stack) together, and then align the individual groups on separate
SDRAM row boundaries.
The linker is used to place the various program sections in RAM. Most linkers accept directives indicating the alignment of the various sections; so by changing the alignments in linker scripts, it is possible to take advantage of the SDRAM open bank and open row features.
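In a real project the alignment is expressed in the linker script; the Python sketch below only illustrates the address arithmetic involved. It assumes that each bank covers a contiguous, bank-size-aligned range of addresses (the actual address-to-bank mapping should be confirmed in the data books), and the section names and sizes are hypothetical.

```python
MB = 1024 * 1024


def align_up(address, boundary):
    """Round address up to the next multiple of boundary (a power of two)."""
    return (address + boundary - 1) & ~(boundary - 1)


def place_sections(base, sections, boundary):
    """Give each (name, size) section a start address aligned to boundary."""
    layout = {}
    cursor = base
    for name, size in sections:
        cursor = align_up(cursor, boundary)
        layout[name] = cursor
        cursor += size
    return layout


BANK_SIZE = 8 * MB   # e.g. the x4 rank of 64Mb devices in Table 1
# Hypothetical section sizes, for illustration only.
sections = [("text", 1536 * 1024), ("data", 512 * 1024),
            ("stack", 64 * 1024), ("heap", 4 * MB)]
layout = place_sections(0, sections, BANK_SIZE)
for name, start in layout.items():
    print(f"{name:6s} 0x{start:08X}")
```

With four sections and four banks per rank, each section starts in its own bank, so under the stated assumption the controller can keep all four banks open at once.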
Fortunately in most software systems, memory is managed in 4KB pages (a result of the typical
MMU/TLB page size), and these pages are always aligned on 4KB boundaries. As a result, memory
pages are already properly aligned to take advantage of the SDRAM open bank feature.
The locality of reference phenomenon improves SDRAM throughput in two ways: 1) it permits
caches to initiate burst accesses, and 2) the SDRAM open-bank policy permits reduced access time
to adjacent items in the same SDRAM bank.
3.4.3.2 Back to Back Accesses
The interaction of the various SDRAM commands (ACTIVATE, PRECHARGE, AUTOREFRESH)
with read and write cycles also negatively impacts the SDRAM open bank policy.
Read or write accesses to rows not in the currently open bank and row must first close the row (PRECHARGE command) before opening the new row. Two additional SDRAM clock cycles are needed
to close the row, thus reducing the SDRAM throughput.
4. SDRAM Performance Checklist
The following items serve as a guideline for optimizing the SDRAM interface in an Au1x00-based
design. Example SDRAM configurations are provided at the end of this application note.
• The speed-rating of the Au1x00 processor determines the SDRAM clock frequency.
Ensure the selected speed-rating can provide the anticipated SDRAM throughput for
the intended application.
• Software must program sys_cpupll and sys_powerctrl[SD] appropriately so that both
the Au1 core and the SDRAM interface operate at the desired frequencies.
• The CAS Latency rating of the SDRAM impacts SDRAM throughput. Choose an
SDRAM device with a small CAS latency, preferably 2.
• Software must program the SDRAM refresh interval timer register mem_sdrefcfg with
an optimal value for the SDRAM devices in use. A refresh interval that is
too conservative (shorter than necessary) only reduces the SDRAM throughput with
no additional benefit.
• Software must program the SDRAM controller configuration registers with optimal
values. The exact values to use are dependent upon the SDRAM clock frequency and
devices in use.
• Software must program the SDRAM MODE register with the appropriate information.
Specifically the burst type and length, and the CAS latency must match the value in
the SDRAM controller.
• Software should utilize cache-able memory to take advantage of SDRAM burst
accesses. The Au1 core initiates cache-able accesses via the KSEG0 region or in the
KUSEG, KSEG2 and KSEG3 regions with a CCA encoding of 3, 5 or 6 in the TLB.
• Software should take advantage of write-buffer gathering to coalesce single-beat
writes into burst transfers. For non-cacheable spaces, CCA encoding 7 in the TLB
enables write-buffer gathering. Cache-able write accesses that miss in the data cache
are automatically marked gatherable to the write-buffer.
• Software should utilize the coherency model of the Au1 core. In doing so, peripheral
DMA transfers can be serviced by the Au1 data cache, which prevents the DMA from
accessing SDRAM unnecessarily and leaves more SDRAM throughput for the
Au1 core.
• Software should take advantage of the SDRAM open-bank policy by locating code,
data, etc. on separate SDRAM bank and row boundaries.
5. SDRAM Controller Configuration
For each Au1x00-based design, the SDRAM controller configuration is dependent upon the SDRAM
devices in use and the SDRAM clock. The SDRAM configuration is essentially unique to each design
and so must be developed by examining the SDRAM data sheets. Table 2 correlates the SDRAM
device data sheet values with the SDRAM controller fields.
Table 2: SDRAM Timing Parameters
Au1x00 SDRAM Controller Field    SDRAM Data Sheet Interpretation
Tras                             ACTIVATE to PRECHARGE time
Tmrd                             MODE Register write to ACTIVATE time
Twr                              Last DATA-IN to PRECHARGE time
Trp                              PRECHARGE to ACTIVATE time
Trcd                             ACTIVATE to READ/WRITE time
Trc                              REFRESH to ACTIVATE time
When examining the timing values specified in the SDRAM data sheets, many values are specified in
nanoseconds. It is thus important to understand the relationship between the SDRAM clock (which
the SDRAM controller uses for the above mentioned fields) and nanoseconds.
The SDRAM interface clock is a function of the Au1 core frequency. The SDRAM clock frequency is
determined by the value written to sys_cpupll and sys_powerctrl[SD]. The SDRAM clock frequency
is the Au1 core frequency (sys_cpupll * 12MHz) divided by the system bus divisor
(sys_powerctrl[SD]) divided again by 2. Table 3 lists common operating frequencies and the resulting
SDRAM clock frequency.
Table 3: Au1 Core and SDRAM frequencies
sys_cpupll  sys_powerctrl[SD]  Au1 Core Frequency  Au1x00 System Bus Frequency  SDRAM Frequency  SDRAM Clock Period
33          00                 396 MHz             198 MHz                      99 MHz           10.10 ns
27          00                 324 MHz             162 MHz                      81 MHz           12.34 ns
22          00                 264 MHz             132 MHz                      66 MHz           15.15 ns
16          00                 192 MHz             96 MHz                       48 MHz           20.83 ns
The important value to use when establishing the SDRAM controller values is the SDRAM clock
period: all values provided to the SDRAM controller are specified in multiples of SDRAM clocks.
For example, in a system with a 66-MHz SDRAM clock and an SDRAM device that specifies 65ns
for a given timing parameter, the number of SDRAM clocks needed to meet
this timing parameter is 5 (65ns / 15.15ns = 4.29, rounded up to the next clock, is 5).
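The conversion from a data sheet value in nanoseconds to SDRAM clocks is a division by the clock period followed by rounding up. A minimal sketch, with an illustrative function name; how a clock count is then encoded into a particular controller field is defined in the data books and is not modeled here.

```python
import math


def clocks_for(t_ns, sdclk_mhz):
    """Number of whole SDRAM clocks needed to cover t_ns nanoseconds."""
    period_ns = 1000.0 / sdclk_mhz      # e.g. 15.15 ns at 66 MHz
    return math.ceil(t_ns / period_ns)


print(clocks_for(65, 66))   # the 65ns example at a 66-MHz SDRAM clock
```

Rounding down instead would violate the device timing, so the result is always taken up to the next whole clock.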
A variety of SDRAM devices work with the Au1x00 SDRAM controller and a few example configurations are provided here as a reference.
5.1 NEC uPD45128163G5-A80-9JF at 99 MHz
The NEC uPD45128163G5-A80-9JF are 16MB devices arranged as 2Mbit x 16 x 4 banks with a
CAS latency of 2 at 125 MHz. These devices are featured on the Pb1000 evaluation platform
(Au1000 processor).
Table 4: NEC uPD45128163G5-A80-9JF at 99 MHz
Field   Name          Value        Description
mem_sdmode            0x00552229
23      SF            0            SDRAM operation
22      F             1            Au1 core is only caching master
21      SR            0            SDRAM operation
20      BS            1            4 banks
19:18   RS            01           12 row address lines
17:15   CS            010          9 column address lines
14:11   Tras          0100         (tRAS) data sheet specs 48ns
10:9    Tmrd          01           (tRSC) data sheet specs 2 clocks
8:7     Twr           00           (tDPL) data sheet specs 8ns
6:5     Trp           01           (tRP) data sheet specs 20ns
4:3     Trcd          01           (tRCD) data sheet specs 20ns
2:0     Tcl           001          CAS Latency 2
mem_sdrefcfg          0x66000C24
31:28   Trc           0110         (tRC1) data sheet specs 70ns
27:26   Trpm          01           Trp from mem_sdmode
25      E             1            Enable refresh
24:0    RI            0xC24        (tREF) 64ms for 4096 rows
mem_sdwrmd            0x00000023
6:4     LTMOD         010          CAS Latency 2
3       WT            0            Sequential Wrap Type
2:0     BL            011          Burst of 8
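The mem_sdmode value in Table 4 is simply the listed fields packed at their bit positions. The sketch below rebuilds it from the field values; the helper and its field table are illustrative and take the bit positions from the table above, not from any vendor header file.

```python
# (name, most-significant bit, least-significant bit) as listed in Table 4
MEM_SDMODE_FIELDS = [
    ("SF", 23, 23), ("F", 22, 22), ("SR", 21, 21), ("BS", 20, 20),
    ("RS", 19, 18), ("CS", 17, 15), ("Tras", 14, 11), ("Tmrd", 10, 9),
    ("Twr", 8, 7), ("Trp", 6, 5), ("Trcd", 4, 3), ("Tcl", 2, 0),
]


def pack_mem_sdmode(**values):
    """Pack raw field encodings into a mem_sdmode register value."""
    word = 0
    for name, msb, lsb in MEM_SDMODE_FIELDS:
        width = msb - lsb + 1
        value = values.get(name, 0)      # unspecified fields default to 0
        if value >> width:
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << lsb
    return word


nec = pack_mem_sdmode(SF=0, F=1, SR=0, BS=1, RS=0b01, CS=0b010,
                      Tras=0b0100, Tmrd=0b01, Twr=0b00, Trp=0b01,
                      Trcd=0b01, Tcl=0b001)
print(f"0x{nec:08X}")
```

The result is 0x00552229, the value shown for mem_sdmode in Table 4. Packing the fields this way, rather than hand-assembling the constant, makes it easier to spot a field that disagrees with the data sheet.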
5.2 Micron 48LC8M16A2-7E at 99 MHz
The Micron 48LC8M16A2-7E are 16MB devices arranged as 2Mbit x 16 x 4 banks with a CAS
latency of 2 at 133 MHz. These devices are featured on the Pb1500 evaluation platform (Au1500 processor).
Table 5: Micron 48LC8M16A2-7E at 99 MHz
Field   Name          Value        Description
mem_sdmode            0x00551AA9
23      SF            0            SDRAM operation
22      F             1            Au1 core is only caching master
21      SR            0            SDRAM operation
20      BS            1            4 banks
19:18   RS            01           12 row address lines
17:15   CS            010          9 column address lines
14:11   Tras          0011         (tRAS) data sheet specs 37ns
10:9    Tmrd          01           (tMRD) data sheet specs 2 clocks
8:7     Twr           01           (tWR) data sheet specs 1 clock + 7ns
6:5     Trp           01           (tRP) data sheet specs 15ns
4:3     Trcd          01           (tRCD) data sheet specs 15ns
2:0     Tcl           001          CAS Latency 2
mem_sdrefcfg          0x66000C24
31:28   Trc           0110         (tRFC) data sheet specs 66ns
27:26   Trpm          01           Trp from mem_sdmode
25      E             1            Enable Refresh
24:0    RI            0xC24        64ms for 4096 rows
mem_sdwrmd            0x00000023
9       WB            0            Bursts
8:7     OpMode        0            Normal
6:4     CAS Latency   010          CAS Latency 2
3       Burst Type    0            Sequential bursts
2:0     Burst Length  011          Burst of 8
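When reviewing an existing configuration it can help to go the other way and split a register value back into its fields. The following sketch decodes mem_sdrefcfg using the bit positions listed in the table above; the function name is an illustrative assumption.

```python
def decode_mem_sdrefcfg(word):
    """Split a mem_sdrefcfg value into its Trc, Trpm, E and RI fields."""
    return {
        "Trc": (word >> 28) & 0xF,     # bits 31:28
        "Trpm": (word >> 26) & 0x3,    # bits 27:26
        "E": (word >> 25) & 0x1,       # bit 25, refresh enable
        "RI": word & 0x1FFFFFF,        # bits 24:0, refresh interval
    }


fields = decode_mem_sdrefcfg(0x66000C24)
print({name: hex(value) for name, value in fields.items()})
```

For 0x66000C24 this returns Trc = 0110, Trpm = 01, E = 1 and RI = 0xC24, the same breakdown given in the table.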
5.3 Samsung K4S28163LD-RF75 at 99 MHz
The Samsung K4S28163LD-RF75 are 16MB devices arranged as 2Mbit x 16 x 4 banks with a
CAS latency of 2 at 100 MHz. These devices are featured on the Pb1100 evaluation platform
(Au1100 processor).
Table 6: Samsung K4S28163LD-RF75 at 99 MHz
Field   Name          Value        Description
mem_sdmode            0x00552229
23      SF            0            SDRAM operation
22      F             1            Au1 core is only caching master
21      SR            0            SDRAM operation
20      BS            1            4 banks
19:18   RS            01           12 row address lines
17:15   CS            010          9 column address lines
14:11   Tras          0100         (tRAS) data sheet specs 45ns
10:9    Tmrd          01           (MRS) data sheet specs 2 clocks
8:7     Twr           00           (tRDL) data sheet specs 10ns
6:5     Trp           01           (tRP) data sheet specs 20ns
4:3     Trcd          01           (tRCD) data sheet specs 20ns
2:0     Tcl           001          CAS Latency 2
mem_sdrefcfg          0x66000C24
31:28   Trc           0110         (tRC) data sheet specs 65ns
27:26   Trpm          01           Trp from mem_sdmode
25      E             1            Enable Refresh
24:0    RI            0xC24        64ms for 4096 rows
mem_sdwrmd            0x00000023
9       WBL           0            Bursts
8:7     Test Mode     0            Mode Register Set
6:4     CAS Latency   010          CAS Latency 2
3       Burst Type    0            Sequential bursts
2:0     Burst Length  011          Burst of 8
5.4 Toshiba TC59SM816CFTL-70 at 99 MHz
The Toshiba TC59SM816CFTL-70 are 32MB devices arranged as 4Mbit x 16 x 4 banks with a CAS
latency of 2 at 143 MHz. These devices are featured in the Mobile Reference Design (Hydrogen).
Table 7: Toshiba TC59SM816CFTL-70 at 99 MHz
Field   Name          Value        Description
mem_sdmode            0x00591A29
23      SF            0            SDRAM operation
22      F             1            Au1 core is only caching master
21      SR            0            SDRAM operation
20      BS            1            4 banks
19:18   RS            10           13 row address lines
17:15   CS            010          9 column address lines
14:11   Tras          0011         (tRAS) data sheet specs 40ns
10:9    Tmrd          01           (tRSC) data sheet specs 14ns
8:7     Twr           00           (tWR) data sheet specs 7ns
6:5     Trp           01           (tRP) data sheet specs 15ns
4:3     Trcd          01           (tRCD) data sheet specs 15ns
2:0     Tcl           001          CAS Latency 2
mem_sdrefcfg          0x6600060A
31:28   Trc           0101         (tRC) data sheet specs 56ns
27:26   Trpm          01           Trp from mem_sdmode
25      E             1            Enable Refresh
24:0    RI            0x60A        (tREF) 64ms for 8192 rows
mem_sdwrmd            0x00000023
9       Burst Mode    0            Bursts
8:7     Test Mode     00           Normal
6:4     CAS Latency   010          CAS Latency 2
3       Burst Type    0            Sequential bursts
2:0     Burst Length  011          Burst of 8
5.5 Samsung K4S281632D-TC/L75 at 99 MHz
The Samsung K4S281632D-TC/L75 are 16MB devices arranged as 2Mbit x 16 x 4 banks with a CAS
latency of 3 at 100 MHz. These devices are featured on the Au1x00/Db1x00 evaluation platforms.
Table 8: Samsung K4S281632D-TC/L75 at 99 MHz
Field   Name          Value        Description
mem_sdmode            0x005522AA
23      SF            0            SDRAM operation
22      F             1            Au1 core is only caching master
21      SR            0            SDRAM operation
20      BS            1            4 banks
19:18   RS            01           12 row address lines
17:15   CS            010          9 column address lines
14:11   Tras          0100         (tRAS) data sheet specs 45ns
10:9    Tmrd          01           (MRS) data sheet specs 2 clocks
8:7     Twr           01           (tRDL) data sheet specs 2 clocks
6:5     Trp           01           (tRP) data sheet specs 20ns
4:3     Trcd          01           (tRCD) data sheet specs 20ns
2:0     Tcl           010          CAS Latency 3
mem_sdrefcfg          0x66000C24
31:28   Trc           0110         (tRC) data sheet specs 65ns
27:26   Trpm          01           Trp from mem_sdmode
25      E             1            Enable Refresh
24:0    RI            0xC24        64ms for 4096 rows
mem_sdwrmd            0x00000033
9       WBL           0            Bursts
8:7     Test Mode     0            Mode Register Set
6:4     CAS Latency   011          CAS Latency 3
3       Burst Type    0            Sequential bursts
2:0     Burst Length  011          Burst of 8
5.6 Micron 48LC4M16A2-75 at 99 MHz
The Micron 48LC4M16A2-75 are 8MB devices arranged as 1Mbit x 16 x 4 banks with a CAS latency
of 3 at 133 MHz.
Table 9: Micron 48LC4M16A2-75 at 99 MHz
Field   Name          Value        Description
mem_sdmode            0x0054A2AA
23      SF            0            SDRAM operation
22      F             1            Au1 core is only caching master
21      SR            0            SDRAM operation
20      BS            1            4 banks
19:18   RS            01           12 row address lines
17:15   CS            001          8 column address lines
14:11   Tras          0100         (tRAS) data sheet specs 44ns
10:9    Tmrd          01           (tMRD) data sheet specs 2 clocks
8:7     Twr           01           (tWR) data sheet specs 1 clock + 7.5ns
6:5     Trp           01           (tRP) data sheet specs 20ns
4:3     Trcd          01           (tRCD) data sheet specs 20ns
2:0     Tcl           010          CAS Latency 3
mem_sdrefcfg          0x66000C24
31:28   Trc           0110         (tRFC) data sheet specs 66ns
27:26   Trpm          01           Trp from mem_sdmode
25      E             1            Enable Refresh
24:0    RI            0xC24        64ms for 4096 rows
mem_sdwrmd            0x00000033
9       WB            0            Bursts
8:7     OpMode        0            Normal
6:4     CAS Latency   011          CAS Latency 3
3       Burst Type    0            Sequential bursts
2:0     Burst Length  011          Burst of 8
5.7 Micron 48LC8M16A2-75 at 99 MHz
The Micron 48LC8M16A2-75 are 16MB devices arranged as 2Mbit x 16 x 4 banks with a CAS
latency of 2 at 100 MHz.
Table 10: Micron 48LC8M16A2-75 at 99 MHz
Field   Name          Value        Description
mem_sdmode            0x005522A9
23      SF            0            SDRAM operation
22      F             1            Au1 core is only caching master
21      SR            0            SDRAM operation
20      BS            1            4 banks
19:18   RS            01           12 row address lines
17:15   CS            010          9 column address lines
14:11   Tras          0100         (tRAS) data sheet specs 44ns
10:9    Tmrd          01           (tMRD) data sheet specs 2 clocks
8:7     Twr           01           (tWR) data sheet specs 1 clock + 7.5ns
6:5     Trp           01           (tRP) data sheet specs 15ns
4:3     Trcd          01           (tRCD) data sheet specs 15ns
2:0     Tcl           001          CAS Latency 2
mem_sdrefcfg          0x66000C24
31:28   Trc           0110         (tRFC) data sheet specs 66ns
27:26   Trpm          01           Trp from mem_sdmode
25      E             1            Enable Refresh
24:0    RI            0xC24        64ms for 4096 rows
mem_sdwrmd            0x00000023
9       WB            0            Bursts
8:7     OpMode        0            Normal
6:4     CAS Latency   010          CAS Latency 2
3       Burst Type    0            Sequential bursts
2:0     Burst Length  011          Burst of 8
5.8 Micron 48LC8M16A2-75 at 66 MHz
The Micron 48LC8M16A2-75 are 16MB devices arranged as 2Mbit x 16 x 4 banks with a CAS
latency of 2 at 100 MHz.
Table 11: Micron 48LC8M16A2-75 at 66 MHz
Field   Name          Value        Description
mem_sdmode            0x00551281
23      SF            0            SDRAM operation
22      F             1            Au1 core is only caching master
21      SR            0            SDRAM operation
20      BS            1            4 banks
19:18   RS            01           12 row address lines
17:15   CS            010          9 column address lines
14:11   Tras          0010         (tRAS) data sheet specs 44ns
10:9    Tmrd          01           (tMRD) data sheet specs 2 clocks
8:7     Twr           01           (tWR) data sheet specs 1 clock + 7.5ns
6:5     Trp           00           (tRP) data sheet specs 15ns
4:3     Trcd          00           (tRCD) data sheet specs 15ns
2:0     Tcl           001          CAS Latency 2
mem_sdrefcfg          0x42000818
31:28   Trc           0100         (tRFC) data sheet specs 66ns
27:26   Trpm          00           Trp from mem_sdmode
25      E             1            Enable Refresh
24:0    RI            0x818        64ms for 4096 rows
mem_sdwrmd            0x00000023
9       WB            0            Bursts
8:7     OpMode        0            Normal
6:4     CAS Latency   010          CAS Latency 2
3       Burst Type    0            Sequential bursts
2:0     Burst Length  011          Burst of 8
5.9 Micron 48LC4M32B2TG-7 at 81 MHz
The Micron 48LC4M32B2TG-7 are 16MB devices arranged as 1Mbit x 32 x 4 banks with a CAS
latency of 2 at 81 MHz.
Table 12: Micron 48LC4M32B2TG-7 at 81 MHz
Field   Name          Value        Description
mem_sdmode            0x00549AA9
23      SF            0            SDRAM operation
22      F             1            Au1 core is only caching master
21      SR            0            SDRAM operation
20      BS            1            4 banks
19:18   RS            01           12 row address lines
17:15   CS            001          8 column address lines
14:11   Tras          0011         (tRAS) data sheet specs 42ns
10:9    Tmrd          01           (tMRD) data sheet specs 2 clocks
8:7     Twr           01           (tWR) data sheet specs 1 clock + 7ns
6:5     Trp           01           (tRP) data sheet specs 20ns
4:3     Trcd          01           (tRCD) data sheet specs 20ns
2:0     Tcl           001          CAS Latency 2
mem_sdrefcfg          0x560009EF
31:28   Trc           0101         (tRFC) data sheet specs 70ns
27:26   Trpm          01           Trp from mem_sdmode
25      E             1            Enable Refresh
24:0    RI            0x9EF        64ms for 4096 rows
mem_sdwrmd            0x00000023
9       WB            0            Bursts
8:7     OpMode        0            Normal
6:4     CAS Latency   010          CAS Latency 2
3       Burst Type    0            Sequential bursts
2:0     Burst Length  011          Burst of 8