TI TMS320C6474

TMS320C6474
Digital Signal Processor
Silicon Revisions 1.3, 1.2
Silicon Errata
Literature Number: SPRZ283
October 2008
Contents
1 Introduction
1.1 Device and Development Support Tool Nomenclature
1.2 Package Symbolization and Revision Identification
2 Silicon Revision 1.3 Known Design Exceptions to Functional Specifications
3 Silicon Revision 1.2 Known Design Exceptions to Functional Specifications

List of Figures
1 Lot Trace Code Examples for TMS320C6474 (ZUN Package)
2 IDMA, SDMA, and MDMA Paths
3 IDMA, SDMA, and MDMA Paths
4 Correct Device Input Clocks, Clock Selects, and Scaled Supply Timings
5 Prog Set Options Register

List of Tables
1 Lot Trace Codes
2 Silicon Revision Variables
3 Silicon Revision 1.3 Advisory List
4 Silicon Revision 1.2 Advisory List
5 TC Registers Summary
1 Introduction
This document describes the silicon updates to the functional specifications for the TMS320C6474 digital
signal processor; see the device-specific data manual, TMS320C6474 Multicore Digital Signal Processor
(literature number SPRS552).
1.1 Device and Development Support Tool Nomenclature
To designate the stages in the product development cycle, TI assigns prefixes to the part numbers of all
DSP devices and support tools. Each DSP commercial family member has one of three prefixes: TMX,
TMP, or TMS (e.g., TMS320C6474ZUN). Texas Instruments recommends two of three possible prefix
designators for its support tools: TMDX and TMDS. These prefixes represent evolutionary stages of
product development from engineering prototypes (TMX/TMDX) through fully qualified production
devices/tools (TMS/TMDS).
Device development evolutionary flow:
TMX    Experimental device that is not necessarily representative of the final device's
       electrical specifications
TMP    Final silicon die that conforms to the device's electrical specifications but has
       not completed quality and reliability verification
TMS    Fully-qualified production device
Support tool development evolutionary flow:
TMDX   Development-support product that has not yet completed Texas Instruments internal
       qualification testing
TMDS   Fully-qualified development-support product
TMX and TMP devices and TMDX development-support tools are shipped against the following
disclaimer:
"Developmental product is intended for internal evaluation purposes."
TMS devices and TMDS development-support tools have been characterized fully, and the quality and
reliability of the device have been demonstrated fully. TI's standard warranty applies.
Predictions show that prototype devices (TMX or TMP) have a greater failure rate than the standard
production devices. Texas Instruments recommends that these devices not be used in any production
system because their expected end-use failure rate still is undefined. Only qualified production devices are
to be used.
1.2 Package Symbolization and Revision Identification
The device revision can be determined by the lot trace code marked on the top of the package. The
location of the lot trace code for the ZUN package is shown in Figure 1. Figure 1 also shows an example
of C6474 package symbolization.
[Figure 1: package top marking with the lines "DSP", "TMS320C6474ZUN", and "#xx-#######"; the last line is the lot trace code.]
Figure 1. Lot Trace Code Examples for TMS320C6474 (ZUN Package)
Silicon revision correlates to the lot trace code marked on the package. This code is of the format
#xx-#######. If xx is "12", then the silicon is revision 1.2. Table 1 lists the silicon revisions associated with
each lot trace code for the C6474 device.
Each silicon revision uses a specific revision of the CPU and the C64x+ megamodule. The CPU revision
ID identifies the silicon revision of the CPU. Table 2 lists the CPU and C64x+ megamodule revision
associated with each silicon revision. The CPU revision can be read from the REVISION_ID field of the
CPU control status register (CSR). The C64x+ megamodule revision can be read from the REVISION field
of the megamodule revision ID register (MM_REVID) located at address 0181 2000h.
The VARIANT field of the JTAG ID register (located at 0288 0814h) changes between silicon revisions.
Table 2 lists the contents of the JTAG ID register for each revision of the device. More details on the
JTAG ID register can be found in the device-specific data manual, TMS320C6474 Multicore Digital Signal
Processor (literature number SPRS552).
Table 1. Lot Trace Codes
LOT TRACE CODE (xx)   SILICON REVISION   COMMENTS
13                    1.3                Silicon revision 1.3
12                    1.2                Silicon revision 1.2
Table 2. Silicon Revision Variables
SILICON REVISION   CPU REVISION             C64X+ MEGAMODULE REVISION          JTAG ID REGISTER VALUE
1.3                1.0 (REVISION_ID = 0h)   Rev. 0 (MM_REVID[REVISION] = 0h)   2009 202Fh (VARIANT = 0010b)
1.2                1.0 (REVISION_ID = 0h)   Rev. 0 (MM_REVID[REVISION] = 0h)   1009 202Fh (VARIANT = 0001b)
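As an illustration of how software might tell the two revisions apart, the following is a minimal sketch in C, not TI-supplied code. It assumes that VARIANT occupies the top four bits of the JTAG ID value, which is what the two Table 2 entries imply (2009 202Fh gives 0010b, 1009 202Fh gives 0001b). The function names are hypothetical, and on the target the value would have to be read from the register at 0288 0814h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: VARIANT is the top nibble of the JTAG ID value, as implied
 * by the two Table 2 entries. */
static unsigned jtag_variant(uint32_t jtag_id)
{
    return (unsigned)(jtag_id >> 28) & 0xFu;
}

/* Map VARIANT to the silicon revision listed in Table 2.
 * Returns NULL for values that Table 2 does not list. */
static const char *silicon_revision(uint32_t jtag_id)
{
    switch (jtag_variant(jtag_id)) {
    case 0x2u: return "1.3";
    case 0x1u: return "1.2";
    default:   return NULL;
    }
}

/* Self-check against the two register values given in Table 2. */
static void check_table2_examples(void)
{
    assert(jtag_variant(0x2009202Fu) == 0x2u);
    assert(jtag_variant(0x1009202Fu) == 0x1u);
}
```

A caller would pass the 32-bit register contents to silicon_revision() and treat a NULL result as a revision not covered by this document.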
2 Silicon Revision 1.3 Known Design Exceptions to Functional Specifications
Table 3. Silicon Revision 1.3 Advisory List
Advisory 1.3.1   DSP SDMA/IDMA: Unexpected Stalling of SDMA/IDMA Access to L2 SRAM
Advisory 1.3.2   Potential Data Corruption on SCR Bridge
Advisory 1.3.3   Potential Insertion or Deletion of 2 Bits in SerDes Data Stream
Advisory 1.3.4   MAC EOI Register Write Causes Potential CPU Lockup
Advisory 1.3.5   Potential SerDes Clocking Issue
Advisory 1.3.6   I2C: Slave Boot Aborts
Advisory 1.3.7   EMAC Boot Issue
Advisory 1.3.1   DSP SDMA/IDMA: Unexpected Stalling of SDMA/IDMA Access to L2 SRAM
Revision(s) Affected: 1.3, 1.2
Details:
Note: Unexpected stalling of DSP SDMA/IDMA accesses may occur only when DSP
level 2 (L2) memory is configured as non-cache (RAM). If DSP L2 memory is used
only as cache, or if L2 RAM is not accessed by IDMA or via the SDMA interface
during run-time, then this exception does not apply.
The C64x+ megamodule has a Master Direct Memory Access (MDMA) bus interface and
a Slave Direct Memory Access (SDMA) bus interface. The MDMA interface provides
DSP access to resources outside the C64x+ megamodule (e.g., DDR2 memory). The
MDMA interface is used for CPU/cache accesses to memory beyond the level 2 (L2)
memory level. These accesses include cache line allocates, write-backs, and
non-cacheable loads and stores to/from system memories. The SDMA interface allows
other master peripherals in the system to access level 1 data (L1D), level 1 program
(L1P), and L2 RAM DSP memories. The masters allowed accesses to these memories
are DMA controllers, EMAC, and SRIO. The DSP Internal Direct Memory Access (IDMA)
is a C64x+ megamodule DMA engine used to move data between internal DSP
memories (L1, L2) and/or the DSP peripheral configuration bus. The IDMA engine
shares resources with the SDMA interface.
The C64x+ megamodule has an L1D cache and an L2 cache, both of which are
write-back data caches. The C64x+ megamodule holds updated values for external
memory as long as possible. It writes these updated values, called victims, to external
memory when it needs to make room for new data, when requested to do so by the
application, or when a load is performed from a non-cacheable memory for which there
is a set match in the cache (i.e., the non-cacheable line would replace a dirty line if
cached). The L1D sends its victims to L2. The caching architecture is pipelined,
meaning multiple requests can be pending between L1, L2, and MDMA. For more
details on the C64x+ megamodule and its MDMA and SDMA ports, see the
TMS320C64x+ Megamodule Reference Guide (literature number SPRU871).
Ideally, the MDMA (the blue lines in Figure 2) and SDMA/IDMA paths (the orange lines
in Figure 2) operate independently with minimal interference. Normally, MDMA accesses
may stall for extended periods of time (clock cycles) due to expected system level delays
(e.g., bandwidth limitations, DDR2 memory refreshes). However, when using L2 as
RAM, SDMA and/or IDMA accesses to L2/L1 may experience unexpected stalling in
addition to the normal stalls seen by the MDMA interface. For latency-sensitive traffic,
the SDMA stall can result in missing real-time deadlines.
Note:
SDMA/IDMA accesses to L1P/D will not experience an unexpected stall if
there are no SDMA/IDMA accesses to L2. Unexpected SDMA/IDMA
stalls to L1 happen only when they are pipelined behind L2 accesses.
Figure 2 is a simplified view for illustrative purposes only. The IDMA/SDMA path (orange
lines) can also go to L1D/L1P memories and IDMA can go to the DSP CFG peripherals.
MDMA transactions (blue lines) can also originate from L1P or L1D through the L2
controller or directly from the DSP.
[Figure 2: block diagram of the C64x+ megamodule showing the C64x+ CPU (instruction fetch, register files A and B), the L1P, L1D, and L2 memories (RAM/cache, each with cache control, memory protection, and bandwidth management), ROM, IDMA, the EMC with its CFG, MDMA, and SDMA ports, the interrupt controller, power-down logic, the peripherals, and the EDMA master peripherals. Legend: CPU/cache access origination; master peripheral origination.]
Figure 2. IDMA, SDMA, and MDMA Paths
SDMA/IDMA stalls may occur during the following scenarios. Each of these scenarios
describes expected normal DSP functionality, but the SDMA/IDMA access potentially
exhibits additional unexpected stalling.
1. Bursts of writes to non-cacheable MDMA space (i.e., DDR2). The DSP buffers up to
4 non-cacheable writes. When this buffer fills, SDMA/IDMA is blocked until the buffer
is no longer full. Therefore, bursts of non-cacheable writes longer than three writes
can stall SDMA/IDMA traffic.
2. Various combinations of L1 and L2 cache activity:
a. L1D read miss generating victim traffic to L2 (cache or SRAM) or external
memory. The SDMA/IDMA may be stalled while servicing the read miss and the
victim. If the read miss also misses L2 cache, the SDMA/IDMA may be stalled
until data is fetched from external memory to service the read miss. If the read
access is to non-cacheable memory there will still potentially be an L1D victim
generated even though the read data will not replace the line in the L1D cache.
b. L1D read request missing L2 (going external) while another L1D request is
pending. The SDMA/IDMA may be stalled until the external memory access is
complete.
c. L2 victim traffic to external memory during any pending L1D request. The
SDMA/IDMA may be stalled until external memory access and the pending L1D
request are complete.
The duration of the SDMA/IDMA stalls depends on the quantity/characteristics of the
L1/L2 cache and the MDMA traffic in the system. In cases 2a, 2b, and 2c, stalling may or
may not occur depending on the state of the cache request pipelines and the traffic
target locations. These stalling mechanisms may also interact in various ways, causing
longer stalls. Therefore, it is difficult to predict if stalling will occur and for how long.
SDMA/IDMA stalling and any system impact is most likely in systems with excessive
context switching, L1/L2 cache miss/victim traffic, and heavily loaded EMIF.
Use the following steps to determine if SDMA/IDMA stalling is the cause of real-time
deadline misses for existing applications. Situations where real-time deadlines may be
missed include loss of McBSP samples and low peripheral throughput.
1. Determine if the transfer missing the real-time deadline is accessing L2 or L1D
memory. If not, then SDMA/IDMA stalling is not the source of the real-time deadline
miss.
2. Identify all SDMA transfers to/from L2 memory (e.g., EDMA transfer to/from L2
from/to a McBSP or from/to AIF, TCP, or VCP). If there are no SDMA transfers going
to L2, then SDMA/IDMA stalling is not the source of the problem.
3. Redirect all SDMA transfers to L2 memory to other memories using one of the
following methods:
• Temporarily transfer all the L2 SDMA transfers to L1D SRAM.
• If not all L2 SDMA transfers can be moved to L1D memory, temporarily direct
some of the transfers to DDR memory and keep the rest in L1D memory. There
should be no L2 SDMA transfers.
• If neither of the above approaches is possible, move the transfer with the
real-time deadline to the EMAC CPPI RAM. If the EMAC CPPI RAM is not big
enough, a two-step mechanism can be used to page a small working buffer
defined in the EMAC CPPI RAM into a bigger buffer in L2 SRAM. The EDMA
module can be set up to automate this double-buffering scheme without CPU
intervention for moving data from the EMAC CPPI RAM. Some throughput
degradation is expected when the buffers are moved to the EMAC CPPI RAM.
Note: EMAC CPPI RAM memory is word-addressable only and,
therefore, must be accessed using an EDMA index of 4 bytes.
If real-time deadlines are still missed after implementing any of the options in Step 3,
then SDMA/IDMA stalling is likely not the cause of the problem. If real-time deadline
misses are solved using any of the options in Step 3, then SDMA/IDMA stalling is likely
the source of the problem.
An extreme manifestation of the IDMA/SDMA stall bug is the C64x+ MDMA-SDMA
deadlock that requires a device reset or power-on reset in order for the system to
recover. The following summarizes the deadlock conditions:
• Master(s) on a single main MSCR port write to the SDMA of a GEM (the C64x+
megamodule), followed by a write to slaveX
• The GEM issues victim traffic or a non-cacheable write to slaveX
• Any one of the following:
– A write data path pipelined in main MSCR between master(s) and the GEM's
SDMA
– A bridge exists between master(s) and the main MSCR
– Master(s) are able to issue a command to slaveX concurrent with the write to the
GEM's SDMA.
A load (either cacheable or non-cacheable) from another core's L1D or L2 memory can
additionally create a deadlock condition. When the load is issued the read command is
propagated to the SDMA port of the other core through a bridge that is shared with the
EDMA TC1, EMAC, RapidIO (both data and CPPI), and other GEM MDMA. When the
load is issued, if a victim is generated in L1D cache, then the SDMA may stall until the
load completes. If other masters are issuing commands through the shared bridge, then
the bridge may fill due to the stalled SDMA before the read command can propagate
through the bridge and complete. In summary, a deadlock can occur if both of the
following are true:
• GEMx issues a read to GEMy or GEMz L1D or L2 SRAM
• Any of the following are issuing commands to GEMx L2: TC1, EMAC (data or CPPI),
RapidIO (data or CPPI), GEMy, or GEMz.
Workarounds:
Method 1
Issues such as dropped McBSP samples can be worked around by moving
latency-sensitive buffers outside the C64x+ megamodule. For example, rather than
placing buffers for the McBSP into L1/L2, those buffers can instead be placed in other
memory, such as the EMAC CPPI RAM.
Note: EMAC CPPI RAM memory is word-addressable only and, therefore,
must be accessed using an EDMA index of 4 bytes.
Method 2
To reduce the SDMA/IDMA stalling system impact, perform any of the following:
1. Improve system tolerance on DMA side (SDMA/IDMA/MDMA):
• Understand and minimize latency-critical SDMA/IDMA accesses to L2 or L1P/D.
• Directly reduce critical real-time deadlines, if possible, at peripheral/IO level (e.g.,
increase word size and/or reduce bit rates on serial ports).
• To reduce DSP MDMA latency:
– Increase the priority of the DSP access to DDR2 so that the latency of the
MDMA accesses causing stalls is minimized.
Note: Other masters may have real-time deadlines that dictate higher priority
than the DSP.
– Lower the PRIO_RAISE field setting in the DDR2 memory controller's burst
priority register. Values ranging between 0x10 and 0x20 should give decent
performance and minimize latency; lower values may cause excessive
SDRAM row thrashing.
2. Minimize offending scenarios on DSP/caching side:
• If the DSP performing non-cacheable writes is causing the issue, insert protected
non-cacheable reads (as shown in the last list item below) every few writes to
allow the write buffer to empty.
• Use explicit cache commands to trigger cache writebacks during appropriate
times (L1D Writeback All, L2 Writeback All). Do not use these commands when
real-time deadlines must be met.
• Restructure program data and data flow to minimize the offending cache activity.
– Define the read-only data as const. The const C keyword tells the compiler
not to write to the array. By default, such arrays are allocated to the .const
section as opposed to BSS. With a suitable linker command file, the
developer can link the .const section off chip, while linking .bss on chip.
Because programs initialize .bss at run time, this reduces the program's
initialization time and total memory image.
– Explicitly allocate lookup tables and writeable buffers to their own sections.
The #pragma DATA_SECTION (label, section) directive tells the compiler to
place a particular variable in the specified COFF section. The developer can
explicitly control the layout of the program with this directive and an
appropriate linker command file.
– Avoid directly accessing data in slow memories (e.g., flash); copy at
initialization time to faster memories.
• Modify troublesome code.
– Rewrite using DMAs to minimize data cache writebacks. If the code accesses
a large quantity of data externally, consider using DMAs to bring in the data,
using double buffering and related techniques. This will minimize cache
write-back traffic and the likelihood of SDMA/IDMA stalling.
– Re-block the loops. In some cases, restructuring loops can increase reuse in
the cache and reduce the total traffic to external memory.
– Throttle the loops. If restructuring the code is impractical, then it is reasonable
to slow it down. This reduces the likelihood that consecutive SDMA/IDMA
blocks stack up in the cache request pipelines, resulting in a long stall.
• Protect non-cacheable reads from generating an SDMA stall by freezing the L1D
cache during the non-cacheable read access(es). The following example code
contains a function that protects non-cacheable reads, avoids blocking during the
reads, and, therefore, avoids the deadlock state.
;; ======================================================================== ;;
;;  Long Distance Load Word                                                 ;;
;;                                                                          ;;
;;      int long_dist_load_word(volatile int *addr)                         ;;
;;                                                                          ;;
;;  This function reads a single word from a remote location with the L1D   ;;
;;  cache frozen.  This prevents L1D from sending victims in response to    ;;
;;  these reads, thus preventing the L1D victim lock from engaging for the  ;;
;;  corresponding L1D set.                                                  ;;
;;                                                                          ;;
;;  The code below does the following:                                      ;;
;;                                                                          ;;
;;      1.  Disable interrupts                                              ;;
;;      2.  Freeze L1D                                                      ;;
;;      3.  Load the requested word                                         ;;
;;      4.  Unfreeze L1D                                                    ;;
;;      5.  Restore interrupts                                              ;;
;;                                                                          ;;
;;  Interrupts are disabled while the cache is frozen to prevent affecting  ;;
;;  the performance of interrupt handlers.  Disabling interrupts during     ;;
;;  the long distance load does not greatly impact interrupt latency,       ;;
;;  because the CPU already cannot service interrupts when it's stalled by  ;;
;;  the cache.  This function adds a small amount of overhead (~20 cycles)  ;;
;;  to that operation.                                                      ;;
;; ======================================================================== ;;

        .asg    0x01840044, L1DCC           ; L1D Cache Control

        .global _long_dist_load_word
        .text
        .asmfunc
; int long_dist_load_word(volatile int *addr)
_long_dist_load_word:
        MVKL    L1DCC,  B4
        MVKH    L1DCC,  B4
||      DINT                                ; Disable interrupts
||      MVK     1,      B5

        STW     B5,     *B4                 ; \_ Freeze cache
        LDW     *B4,    B5                  ; /
        NOP     4

        SHR     B5,     16,     B5          ; POPER -> OPER
||      LDW     *A4,    A4                  ; read value remotely
        NOP     4

        STW     B5,     *B4                 ; \_ Restore cache
        RET     B3
||      LDW     *B4,    B5                  ; /
        NOP     4
        RINT                                ; Restore interrupts
        .endasmfunc

;; ======================================================================== ;;
;;  End of file:  ldld.asm                                                  ;;
;; ======================================================================== ;;
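The suggestion in Method 2 to insert protected non-cacheable reads every few writes could be combined with the routine above roughly as follows. This is a sketch under stated assumptions, not a TI-provided implementation: throttled_copy and WRITES_PER_DRAIN are hypothetical names, the interval of three writes is chosen only because it stays below the four-entry write buffer described earlier, and the host stand-in for long_dist_load_word exists only so the sketch compiles off-target.

```c
#include <assert.h>
#include <stddef.h>

#ifndef __TI_COMPILER_VERSION__
/* Host stand-in (assumption): a plain volatile read so the sketch builds
 * off-target. On the C6474, link the assembly routine from ldld.asm. */
static int long_dist_load_word(volatile int *addr) { return *addr; }
#else
extern int long_dist_load_word(volatile int *addr);
#endif

/* Assumption: drain after every 3 writes, below the 4-entry write buffer. */
#define WRITES_PER_DRAIN 3

/* Copy n words to a non-cacheable destination, issuing one protected read
 * after every WRITES_PER_DRAIN writes. Returns the number of reads issued. */
static size_t throttled_copy(volatile int *dst, const int *src, size_t n)
{
    size_t i;
    size_t reads = 0;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < n; i++) {
        dst[i] = src[i];
        if ((i + 1) % WRITES_PER_DRAIN == 0) {
            /* Read back the word just written; the value is discarded. */
            (void)long_dist_load_word(&dst[i]);
            reads++;
        }
    }
    return reads;
}
```

The returned read count is there only to make the throttling observable in a test; a production routine would not need it.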
In the C6474 multicore device, when one GEM is accessing another GEM's L1 or L2
memory it is an MDMA access, so the potential SDMA/IDMA stall can occur. The stall
can be avoided by using the EDMA to transfer data from one GEM's memory to another.
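The data-placement suggestions in Method 2 (declaring read-only data as const and giving writable buffers their own sections) might look like the following in C. This is a hedged sketch: gain_table, rt_buffer, scaled, and the section name ".rt_buffers" are made up for illustration, and the DATA_SECTION pragma is guarded because it is specific to the TI compiler. The actual memory placement still depends on a matching linker command file, which is not shown.

```c
#include <assert.h>

/* Read-only table: const lets the compiler allocate it to .const, which a
 * suitable linker command file could then map off chip. */
static const short gain_table[4] = { 1, 2, 4, 8 };

/* Writable buffer placed in its own section. ".rt_buffers" is a
 * hypothetical section name; the pragma is ignored on non-TI compilers. */
#ifdef __TI_COMPILER_VERSION__
#pragma DATA_SECTION(rt_buffer, ".rt_buffers")
#endif
int rt_buffer[64];

/* Example use of the read-only table. */
static int scaled(int sample, unsigned idx)
{
    assert(idx < 4u);
    return sample * gain_table[idx];
}
```

Keeping the buffer in a named section lets the linker command file decide where it lives without touching the source again.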
Method 3
Entirely eliminate the exception by removing all SDMA/IDMA accesses to L2 SRAM. For
example, EMAC descriptors and EMAC payload must not reside in L2, and master peripherals
such as the EDMA/QDMA, IDMA, and SRIO must not access L2. There are no issues with the
CPU itself accessing code/data in L2; this issue pertains only to SDMA/IDMA accesses
to L2.
Deadlock Avoidance
To avoid a C64x+ deadlock, the following workarounds are suggested,
depending on the VBUSM master in question:

VBUSM MASTER   WORKAROUND
GEM            GEMs should not write to the memory of any other GEM. This will cause
               complications across any master peripheral that is transferring data to
               multiple L2s. GEMs must not directly read from the memory of any other GEM
               without applying the L1D cache freeze workaround described in Method 2, which
               ensures that the load will not stall itself indefinitely and hang the system.
EDMA3TCx       Inbound and outbound traffic should be programmed on different TC ports (i.e.,
               two different EDMA queues, since a given queue maps to a given TC). Note that
               in-/out-bound direction is defined as the write direction, meaning that a
               DDR2-to-DDR2 transfer is outbound and L2-to-L2 is inbound. Any TC used to
               write to DDR should not be used to write to a GEM, even when the TC writing to
               the DDR is also reading from DDR.
EMAC           EMAC should write to the GEM's memory or the DDR, but not both. This includes
               buffers and buffer descriptors. EMAC CPPI descriptors should be placed wholly
               in the local wrapper memory, any combination of wrapper and L2 memory (must
               match other master transactions), or any combination of wrapper and DDR2
               SDRAM (must match other master transactions). Buffer descriptors should not be
               placed in any combination of L2 and DDR2 SDRAM.
SRIO           SRIO should transfer payload data to only GEM memories or to DDR2 SDRAM,
               but not both. This includes any direct I/O writes as well as any inbound RX
               messaging transfer.
SRIO CPPI      SRIO CPPI descriptors should be placed wholly in the local wrapper memory,
               any combination of wrapper and L2 memory, or any combination of wrapper and
               DDR2 SDRAM. Buffer descriptors should not be placed in any combination of L2
               and DDR2 SDRAM.
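The "one destination type per master" rule that recurs in the table above (a TC, the EMAC, or the SRIO should write to GEM memory or to DDR2, but not both) can be checked against a system's planned transfers with a small helper. The sketch below is illustrative only: the structure, the enum, and the master numbering are hypothetical bookkeeping that an application would have to maintain itself, and the check does not cover the descriptor-placement rules.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one write performed by a VBUSM master. */
enum dest { DEST_GEM_MEMORY, DEST_DDR2 };

struct master_write {
    int       master_id;  /* assumed: non-negative id per TC, EMAC, SRIO */
    enum dest dest;       /* where this master writes */
};

/* Returns the id of a master that writes to both GEM memory and DDR2,
 * or -1 if every master writes to only one kind of destination. */
static int find_mixed_destination_master(const struct master_write *w, size_t n)
{
    size_t i, j;

    assert(w != NULL || n == 0);
    for (i = 0; i < n; i++) {
        for (j = i + 1; j < n; j++) {
            if (w[i].master_id == w[j].master_id && w[i].dest != w[j].dest)
                return w[i].master_id;
        }
    }
    return -1;
}
```

Running such a check at initialization, before any transfer is enabled, would flag a configuration that violates the rule while it is still cheap to fix.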
Advisory 1.3.2   Potential Data Corruption on SCR Bridge
Revision(s) Affected: 1.3, 1.2
Details:
This issue manifests itself when two masters write to a bridge endpoint and the
commands arrive on the same clock cycle. The VBUS protocol is violated and the data is
corrupted. The consequence is that one of the writes goes through with corrupt data, while
the other completes normally.
On some occasions the bridge may not recover without a reset. There is no software
indication of this nor a means to reset only the bridge. Therefore, the situation must be
avoided. The affected bridges are: TCP, VCP, AIF write port, and DMA bridge to
configuration bus (where key endpoints beyond this bus could include Semaphore
configuration port, EMAC configuration ports, PaRAM configuration port, or SRIO
configuration ports).
Workarounds:
Avoid the issue as follows:
• TCP: Access to R/W ports is controlled by the Semaphore module; no issue.
• VCP: Access to R/W ports is controlled by the Semaphore module; no issue.
• AIF: DMA access is from a single transfer controller (TC); no issue. If more than one
TC is used, the issue will be exposed.
• DMA bridge to configuration bus: Dedicate a single TC for use of the DMA to write
through this bridge to all the endpoints beyond or use the configuration bus directly
and do not use the DMA to program the following:
– Semaphore configuration port: Configuration registers
– SRIO configuration ports: Configuration registers
– EMAC configuration ports: Configuration registers (use caution when using the EMAC
CPPI buffer to work around the DSP SDMA/IDMA unexpected stalling; see Advisory
1.3.1)
– PaRAM configuration port: Do not auto-program channels from one TC to another
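The single-TC rule for the AIF write port and for the DMA bridge to the configuration bus can be expressed as a simple consistency check. The following is a minimal sketch that assumes the application keeps a list of the TC number programmed for each DMA write crossing a given bridge; the function name and that bookkeeping are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* tc_ids[i] is the transfer controller used for the i-th DMA write that
 * crosses the bridge (hypothetical bookkeeping kept by the application).
 * Returns 1 if all writes use the same TC (or there are none), else 0. */
static int bridge_uses_single_tc(const int *tc_ids, size_t n)
{
    size_t i;

    assert(tc_ids != NULL || n == 0);
    for (i = 1; i < n; i++) {
        if (tc_ids[i] != tc_ids[0])
            return 0;
    }
    return 1;
}
```

A result of 0 would indicate that two TCs can reach the same bridge endpoint, which is the condition this advisory says must be avoided.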
Advisory 1.3.3   Potential Insertion or Deletion of 2 Bits in SerDes Data Stream
Revision(s) Affected: 1.3, 1.2
Details:
For arbitrary phase mode, a FIFO function is integrated into the SerDes TX serializer.
This FIFO has three states (minus1, center, plus1) and is supposed to be reset to the
center state at startup. From this position, the SerDes is then tolerant to variations of
phase between the input clock (TXBCLKIN) and the SerDes internal clock, caused by
temperature and voltage variations. However, as a result of a logic bug, the possibility
exists that under some circumstances, the FIFO may not start in the center state. When
this happens, there is a risk that the FIFO may subsequently overflow or underflow.
Whether the FIFO fails to initialize to the center state depends on the timing
relationships between several signals, including the SerDes internal clock. Even if the
FIFO fails to initialize to the center state, the FIFO will only underflow or overflow if the
phase relationship between the TXBCLKIN input and the internal SerDes clock varies (due
to temperature or voltage changes) in such a way as to cause their edges to cross in
one particular direction. Overflow results in two bits being added to the data stream.
Underflow results in two bits being deleted. If overflow or underflow occurs at all, it only
happens once per TX lane because after it has occurred the FIFO is configured exactly
as if it had initialized to the center state at startup.
The precise silicon process of the device will also be a factor in whether the overflow or
underflow occurs. Some devices may exhibit this behavior at some particular PVT
combinations; others may never exhibit it. It is not possible to predict whether, or under
what conditions, a device is susceptible. If overflow or underflow occurs, it could happen
at any time, ranging from immediately after startup to weeks, months, or years later.
Workarounds:
The issue can be worked around by software control of two ports on the SerDes. At
initialization, cycling of bits resets the circuit and resolves the issue.
• AIF has a software workaround as follows:
The software workaround limits restart to per macro, not per lane. There is one set of
software control bits for the B8 and another for the B4. For details, see the
device-specific data manual, TMS320C6474 Multicore Digital Signal Processor
(literature number SPRS552). There are new recommendations for the initialization
sequence that is shown in the following code example:
    //Enable the Tx Link
    CSL_FINST(hAifLink[0]->regs->LCFG[1].LINK_CFG, AIF_LINK_CFG_TX_LINK_EN, ENABLED);

    //Set the Link Rate
    if (aCommoncfg[0].linkRate == CSL_AIF_LINK_RATE_1x)
    {
        CSL_FINST(hAifLink[0]->regs->LCFG[1].LINK_CFG, AIF_LINK_CFG_LINK_RATE, 1X);
    }
    else if (aCommoncfg[0].linkRate == CSL_AIF_LINK_RATE_2x)
    {
        CSL_FINST(hAifLink[0]->regs->LCFG[1].LINK_CFG, AIF_LINK_CFG_LINK_RATE, 2X);
    }
    else if (aCommoncfg[0].linkRate == CSL_AIF_LINK_RATE_4x)
    {
        CSL_FINST(hAifLink[0]->regs->LCFG[1].LINK_CFG, AIF_LINK_CFG_LINK_RATE, 4X);
    }

    //Toggle the ENFTP bit
    CSL_FINS(hAifLink[0]->regs->AI_SERDES0_TST_CFG, AIF_AI_SERDES0_TST_CFG_INVPATT, 1);
    CSL_FINS(hAifLink[0]->regs->AI_SERDES0_TST_CFG, AIF_AI_SERDES0_TST_CFG_INVPATT, 0);
CSL version 3.0.6.2 for the C6474 device has a new hardware control command
(CSL_AIF_CMD_ENABLE_DISABLE_TX_LINK_SI1_1) that has the fix for this
advisory.
• EMAC has a software workaround and an auto-recovery for this advisory as follows:
There is a new recommendation for initialization sequence as shown in the following
code example. This example code should be used with CSL version 03.00.06.01.
    SgmiiCfg.masterEn   = 0x1;
    SgmiiCfg.loopbackEn = 0x1;
    SgmiiCfg.auxConfig  = 0x0000000b;
    if (0 == SGMII_config(&SgmiiCfg))
        printf("SGMII config successful........\n");
    else
        printf("SGMII config NOT successful........\n");

    LocalTicks = 0;
    while (LocalTicks != 3); // wait for 2us

    SgmiiCfg.txConfig = 0x00000e21; // enable transmitter
    SgmiiCfg.rxConfig = 0x00081021;
    if (0 == SGMII_config(&SgmiiCfg))
        printf("SGMII config successful........\n");
    else
        printf("SGMII config NOT successful........\n");

    SgmiiCfg.txConfig = 0x00001e21; // toggle the ENFTP bit
    if (0 == SGMII_config(&SgmiiCfg))
        printf("SGMII config successful........\n");
    else
        printf("SGMII config NOT successful........\n");

    SgmiiCfg.masterEn   = 0x1;
    SgmiiCfg.loopbackEn = 0x1;
    SgmiiCfg.txConfig   = 0x00000e21; // toggle the ENFTP bit
    if (0 == SGMII_config(&SgmiiCfg))
        printf("SGMII config successful........\n");
    else
        printf("SGMII config NOT successful........\n");

    // wait for the Auto-negotiation Complete
    SGMII_REGS->CONTROL |= 0x1; // Loopback mode is selected
    // set full duplex and Gig bits
    SGMII_REGS->MR_ADV_ABILITY = 0x9801;
• SRIO has an auto-recovery as follows:
Auto-recovery resets the link and re-exposes the issue. TI is working to understand
the likelihood of repeated recovery and whether there could be performance impacts
due to repeated recovery.
The software workaround is enabled with a partial fix for AIF and EMAC. A complete fix
will be available in silicon revision 2.x.
Advisory 1.3.4
MAC EOI Register Write Causes Potential CPU Lockup
Revision(s) Affected:
1.3, 1.2; Fixed in CSL version 03.00.06.01
Details:
A bug has been found that affects multiple cores writing the EOI register via the
MAC interface. It causes a lockup of one of the three MAC interfaces that is attempting
to write the EOI register.
When multiple cores access the MAC interface at the same time, one of the cores that
requested the EOI write locks up. This situation occurs when the MAC interface
receives EOI register write requests from multiple cores on the same clock cycle, for
example an "X" write and a "Y" write. The EOI register accepts only one write at a
time, X or Y, and ignores the other. The EOI write request that the MAC ignores locks
up the CPU that requested the write.
Workarounds:
Semaphores can be used to work around the EOI issue. Two new APIs have been added to
write the EOI register, one for receive and the other for transmit. The application can
use these APIs together with the semaphore module to protect the EOI write when all
three cores try to access the EOI register at the same time.
This workaround is required only when more than one core requests the EOI write. Code
examples for receive and transmit writes are shown below.
• Before and after rxEoiWrite, the semaphore APIs are called:
/* Check whether the handle opened successfully, then read the module status */
if (hSemHandle != NULL) {
    /* Check whether the semaphore resource is free */
    do {
        /* Get the semaphore */
        CSL_semGetHwStatus(hSemHandle, CSL_SEM_QUERY_DIRECT, &response);
    } while (response.semFree != CSL_SEM_FREE);
    /* Write the EOI register */
    EMAC_rxEoiWrite(coreNum);
    /* Release the semaphore */
    CSL_semHwControl(hSemHandle, CSL_SEM_CMD_FREE_DIRECT, NULL);
}
• Before and after txEoiWrite, the semaphore APIs are called:
/* Check whether the handle opened successfully, then read the module status */
if (hSemHandle != NULL) {
    /* Check whether the semaphore resource is free */
    do {
        /* Get the semaphore */
        CSL_semGetHwStatus(hSemHandle, CSL_SEM_QUERY_DIRECT, &response);
    } while (response.semFree != CSL_SEM_FREE);
    /* Write the EOI register */
    EMAC_txEoiWrite(coreNum);
    /* Release the semaphore */
    CSL_semHwControl(hSemHandle, CSL_SEM_CMD_FREE_DIRECT, NULL);
}
This issue is fixed in the CSL version 03.00.06.01.
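The acquire, write, release pattern used in both examples can also be sketched in portable C. This is a minimal host-side illustration only: a C11 atomic_flag stands in for the hardware semaphore, and guarded_eoi_write, eoi_write_fn, demo_eoi_write, and eoi_write_count are hypothetical names that are not part of the CSL.

```c
#include <stdatomic.h>

/* Hypothetical callback type standing in for EMAC_rxEoiWrite/EMAC_txEoiWrite. */
typedef void (*eoi_write_fn)(unsigned core_num);

/* Demo stub (assumption): counts EOI writes instead of touching hardware. */
static unsigned eoi_write_count;
static void demo_eoi_write(unsigned core_num)
{
    (void)core_num;
    eoi_write_count++;
}

/* Serialize one EOI write: spin until the lock is acquired, write, release.
 * The atomic_flag is an assumption that stands in for the hardware semaphore. */
static void guarded_eoi_write(atomic_flag *sem, eoi_write_fn write_eoi,
                              unsigned core_num)
{
    /* test_and_set returns the previous value; true means another owner holds it. */
    while (atomic_flag_test_and_set(sem)) {
        /* busy-wait until the current owner releases the lock */
    }
    write_eoi(core_num);      /* only one caller at a time reaches this point */
    atomic_flag_clear(sem);   /* release so the next caller can proceed */
}
```

The point of the sketch is that the EOI write sits strictly between a successful acquire and the release, so two cores can never present their writes to the MAC on the same clock cycle.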
Advisory 1.3.5
Potential SerDes Clocking Issue
Revision(s) Affected:
1.3, 1.2
Details:
A bug has been found in the SerDes interfaces that causes a SerDes clocking problem
in normal functional operation. This problem will not occur when external pull-down is
applied on the TCK pin (JTAG controller clock). SerDes are used in the Ethernet
interface (EMAC), Serial RapidIO interface (SRIO) and the Antenna Interface (AIF).
The TCK pin (JTAG controller clock) is internally assigned to an internal signal that is
used by the SerDes macro. For the SerDes macro to get proper clocking in the normal
functional operation, it needs the internal signal to be held low. However, there is an
internal pull-up on the TCK, creating problems for SerDes operation. This problem exists
on all SerDes interfaces.
Workaround:
The TCK pin should be externally pulled down with a 1-kΩ resistor.
Advisory 1.3.6
I2C: Slave Boot Aborts
Revision(s) Affected:
1.3, 1.2
Details:
I2C Slave Boot is intended to speed the boot process for a system with more than two
devices by allowing a single master read of the I2C EEPROM followed by a broadcast
by that master to all remaining devices on the I2C bus. However, during the I2C slave
boot process an internal exception is encountered, causing the boot sequence to abort
on the slave device(s). Consequently, I2C slave boot does not complete.
Workaround:
Use I2C master boot for all devices in the system. Other boot modes with SRIO or
EMAC may also be utilized, if available on system.
Advisory 1.3.7
EMAC Boot Issue
Revision(s) Affected:
1.3, 1.2
Details:
The EMAC ready announcement frame is not transmitted when the C6474 device is
booted in EMAC master or slave mode.
When the DSP is booted in EMAC master/slave boot modes (boot modes 4, 5), the DSP
is expected to transmit an Ethernet Ready Announcement (ERA) frame in the form of a
BOOTP request. The BOOTP request is intended to inform the host server that the DSP is ready
to receive boot packets. The ERA frame packet is described in more detail in the
TMS320C6474 Bootloader User's Guide (literature number SPRUG24).
Texas Instruments will fix the Ethernet Ready Announcement frame transmission in the
next silicon revision for C6474 devices.
Workaround 1:
Have the host that is responsible for sending the boot packets broadcast a small boot
table with the program that is shown in the example below. This will cause any C6474
device to restart the EMAC boot procedure (without configuring the MAC peripheral
again) and re-transmit the ERA.
Re-send ERA packet code:
BOOT_REENTRY_ADDR .equ 03c000110h
BOOT_EMAC_OPT     .equ 01088480Ah
        MVKL    BOOT_EMAC_OPT, A1
        MVKH    BOOT_EMAC_OPT, A1
        MVKL    0x00000026, A4
        MVKH    0x00000026, A4
        STH     A4, *A1                 ;overwrite option field in EMAC bootparam
        NOP     4
        MVKL    BOOT_REENTRY_ADDR, B3
        MVKH    BOOT_REENTRY_ADDR, B3
        BNOP    B3, 5
Workaround 2:
The host server would need to rely on prior knowledge of the DSP MAC address to
transmit boot packets to the correct DSP. The DSP will be ready to receive EMAC boot
packets within 2 ms following deassertion of reset.
In the scenario where the boot server reads the MAC address of the DSP from the ERA
packet, the procedure would need to be changed. After a customer-defined delay (TBD)
during which the ERA is not received, the host sends the broadcast packet with the
payload described in Workaround 1.
3 Silicon Revision 1.2 Known Design Exceptions to Functional Specifications
Table 4. Silicon Revision 1.2 Advisory List
Advisory          Title
Advisory 1.2.1    DSP SDMA/IDMA: Unexpected Stalling of SDMA/IDMA Access to L2 SRAM
Advisory 1.2.2    Potential Data Corruption on SCR Bridge
Advisory 1.2.3    Potential Insertion or Deletion of 2 Bits in SerDes Data Stream
Advisory 1.2.4    MAC EOI Register Write Causes Potential CPU Lockup
Advisory 1.2.5    Potential SerDes Clocking Issue
Advisory 1.2.6    I2C: Slave Boot Aborts
Advisory 1.2.7    Potential Random E-fuse Blow
Advisory 1.2.8    EMAC Boot Issue
Advisory 1.2.9    EDMA3CC COMPACTV Issue
Advisory 1.2.1
DSP SDMA/IDMA: Unexpected Stalling of SDMA/IDMA Access to L2 SRAM
Revision(s) Affected:
1.3, 1.2
Details:
Note:
Only when DSP level 2 (L2) memory is configured as non-cache (RAM),
unexpected stalling may occur on DSP SDMA/IDMA accesses. If DSP L2
memory is used only as cache or if L2 RAM is not accessed by IDMA or
via the SDMA interface during run-time, then this exception does not
apply.
The C64x+ megamodule has a Master Direct Memory Access (MDMA) bus interface and
a Slave Direct Memory Access (SDMA) bus interface. The MDMA interface provides
DSP access to resources outside the C64x+ megamodule (e.g., DDR2 memory). The
MDMA interface is used for CPU/cache accesses to memory beyond the level 2 (L2)
memory level. These accesses include cache line allocates, write-backs, and
non-cacheable loads and stores to/from system memories. The SDMA interface allows
other master peripherals in the system to access level 1 data (L1D), level 1 program
(L1P), and L2 RAM DSP memories. The masters allowed accesses to these memories
are DMA controllers, EMAC, and SRIO. The DSP Internal Direct Memory Access (IDMA)
is a C64x+ megamodule DMA engine used to move data between internal DSP
memories (L1, L2) and/or the DSP peripheral configuration bus. The IDMA engine
shares resources with the SDMA interface.
The C64x+ megamodule has an L1D cache and an L2 cache, both of which are
write-back data caches. The C64x+ megamodule holds updated values for external
memory as long as possible. It writes these updated values, called victims, to external
memory when it needs to make room for new data, when requested to do so by the
application, or when a load is performed from a non-cacheable memory for which there
is a set match in the cache (i.e., the non-cacheable line would replace a dirty line if
cached). The L1D sends its victims to L2. The caching architecture has pipelining,
meaning multiple requests could be pending between L1, L2, and MDMA. For more
details on the C64x+ megamodule and its MDMA and SDMA ports, see the
TMS320C64x+ Megamodule Reference Guide (literature number SPRU871).
Ideally, the MDMA (the blue lines in Figure 3) and SDMA/IDMA paths (the orange lines
in Figure 3) operate independently with minimal interference. Normally, MDMA accesses
may stall for extended periods of time (clock cycles) due to expected system level delays
(e.g., bandwidth limitations, DDR2 memory refreshes). However, when using L2 as
RAM, SDMA and/or IDMA accesses to L2/L1 may experience unexpected stalling in
addition to the normal stalls seen by the MDMA interface. For latency-sensitive traffic,
the SDMA stall can result in missing real-time deadlines.
Note:
SDMA/IDMA accesses to L1P/D will not experience an unexpected stall if
there are no SDMA/IDMA accesses to L2. Unexpected SDMA/IDMA
stalls to L1 happen only when they are pipelined behind L2 accesses.
Figure 3 is a simplified view for illustrative purposes only. The IDMA/SDMA path (orange
lines) can also go to L1D/L1P memories and IDMA can go to the DSP CFG peripherals.
MDMA transactions (blue lines) can also originate from L1P or L1D through the L2
controller or directly from the DSP.
[Figure 3: block diagram of the C64x+ megamodule showing the C64x+ CPU, L1P, L1D, L2,
IDMA, and the EMC with its MDMA and SDMA ports, CFG peripherals, and EDMA master
peripherals. Legend: CPU/Cache Access Origination; Master Peripheral Origination.]
Figure 3. IDMA, SDMA, and MDMA Paths
SDMA/IDMA stalls may occur during the following scenarios. Each of these scenarios
describes expected normal DSP functionality, but the SDMA/IDMA access potentially
exhibits additional unexpected stalling.
1. Bursts of writes to non-cacheable MDMA space (e.g., DDR2). The DSP buffers up to
4 non-cacheable writes. When this buffer fills, SDMA/IDMA is blocked until the buffer
is no longer full. Therefore, bursts of non-cacheable writes longer than three writes
can stall SDMA/IDMA traffic.
2. Various combinations of L1 and L2 cache activity:
a. L1D read miss generating victim traffic to L2 (cache or SRAM) or external
memory. The SDMA/IDMA may be stalled while servicing the read miss and the
victim. If the read miss also misses L2 cache, the SDMA/IDMA may be stalled
until data is fetched from external memory to service the read miss. If the read
access is to non-cacheable memory there will still potentially be an L1D victim
generated even though the read data will not replace the line in the L1D cache.
b. L1D read request missing L2 (going external) while another L1D request is
pending. The SDMA/IDMA may be stalled until the external memory access is
complete.
c. L2 victim traffic to external memory during any pending L1D request. The
SDMA/IDMA may be stalled until external memory access and the pending L1D
request are complete.
The duration of the SDMA/IDMA stalls depends on the quantity/characteristics of the
L1/L2 cache and the MDMA traffic in the system. In cases 2a, 2b, and 2c, stalling may or
may not occur depending on the state of the cache request pipelines and the traffic
target locations. These stalling mechanisms may also interact in various ways, causing
longer stalls. Therefore, it is difficult to predict if stalling will occur and for how long.
SDMA/IDMA stalling and any system impact is most likely in systems with excessive
context switching, L1/L2 cache miss/victim traffic, and heavily loaded EMIF.
Use the following steps to determine if SDMA/IDMA stalling is the cause of real-time
deadline misses for existing applications. Situations where real-time deadlines may be
missed include loss of McBSP samples and low peripheral throughput.
1. Determine if the transfer missing the real-time deadline is accessing L2 or L1D
memory. If not, then SDMA/IDMA stalling is not the source of the real-time deadline
miss.
2. Identify all SDMA transfers to/from L2 memory (e.g., EDMA transfer to/from L2
from/to a McBSP or from/to AIF, TCP, or VCP). If there are no SDMA transfers going
to L2, then SDMA/IDMA stalling is not the source of the problem.
3. Redirect all SDMA transfers to L2 memory to other memories using one of the
following methods:
• Temporarily transfer all the L2 SDMA transfers to L1D SRAM.
• If not all L2 SDMA transfers can be moved to L1D memory, temporarily direct
some of the transfers to DDR memory and keep the rest in L1D memory. There
should be no L2 SDMA transfers.
• If neither of the above approaches is possible, move the transfer with the
real-time deadline to the EMAC CPPI RAM. If the EMAC CPPI RAM is not big
enough, a two-step mechanism can be used to page a small working buffer
defined in the EMAC CPPI RAM into a bigger buffer in L2 SRAM. The EDMA
module can be set up to automate this double buffering scheme without CPU
intervention for moving data from the EMAC CPPI RAM. Some throughput
degradation is expected when the buffers are moved to the EMAC CPPI RAM.
Note: EMAC CPPI RAM memory is word-addressable only and,
therefore, must be accessed using an EDMA index of 4 bytes.
If real-time deadlines are still missed after implementing any of the options in Step 3,
then SDMA/IDMA stalling is likely not the cause of the problem. If real-time deadline
misses are solved using any of the options in Step 3, then SDMA/IDMA stalling is likely
the source of the problem.
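The diagnostic steps above amount to a small decision procedure, which can be sketched in C as follows. This is only an illustration of the reasoning; the function name, the enum, and the three boolean inputs are hypothetical and would be filled in by hand from observations of the system.

```c
#include <stdbool.h>

typedef enum {
    STALL_RULED_OUT,   /* steps 1-2: the transfer cannot be affected */
    STALL_UNLIKELY,    /* step 3 applied, deadlines are still missed */
    STALL_LIKELY       /* step 3 applied, deadlines are now met      */
} stall_verdict;

/* Encode the three-step check as a pure function of the observations. */
stall_verdict diagnose_sdma_stall(bool transfer_touches_l2_or_l1d,
                                  bool any_sdma_transfer_to_l2,
                                  bool misses_remain_after_redirect)
{
    if (!transfer_touches_l2_or_l1d)   /* step 1 */
        return STALL_RULED_OUT;
    if (!any_sdma_transfer_to_l2)      /* step 2 */
        return STALL_RULED_OUT;
    /* step 3: all L2 SDMA transfers were redirected to other memories */
    return misses_remain_after_redirect ? STALL_UNLIKELY : STALL_LIKELY;
}
```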
An extreme manifestation of the IDMA/SDMA stall bug is the C64x+ MDMA-SDMA
deadlock that requires a device reset or power-on reset in order for the system to
recover. The following summarizes the deadlock conditions:
• Master(s) on a single main MSCR port write to the GEM's SDMA followed by a write
to slaveX
• The GEM issues victim traffic or a non-cacheable write to slaveX
• Any one of the following:
– A write data path pipelined in main MSCR between master(s) and the GEM's
SDMA
– A bridge exists between master(s) and the main MSCR
– Master(s) are able to issue a command to slaveX concurrent with the write to the
GEM's SDMA.
A load (either cacheable or non-cacheable) from another core's L1D or L2 memory can
additionally create a deadlock condition. When the load is issued the read command is
propagated to the SDMA port of the other core through a bridge that is shared with the
EDMA TC1, EMAC, RapidIO (both data and CPPI), and other GEM MDMA. When the
load is issued, if a victim is generated in L1D cache, then the SDMA may stall until the
load completes. If other masters are issuing commands through the shared bridge, then
the bridge may fill due to the stalled SDMA before the read command can propagate
through the bridge and complete. In summary, a deadlock can occur if the following is
true:
• GEMx issues a read to GEMy or GEMz L1D or L2 SRAM
• Any of the following are issuing commands to GEMx L2: TC1, EMAC (data or CPPI),
RapidIO (data or CPPI), GEMy, or GEMz.
Workarounds:
Method 1
Issues such as dropped McBSP samples can be worked around by moving
latency-sensitive buffers outside the C64x+ megamodule. For example, rather than
placing buffers for the McBSP into L1/L2, those buffers can instead be placed in other
memory, such as the EMAC CPPI RAM.
Note: EMAC CPPI RAM memory is word-addressable only and, therefore,
must be accessed using an EDMA index of 4 bytes.
Method 2
To reduce the SDMA/IDMA stalling system impact, perform any of the following:
1. Improve system tolerance on DMA side (SDMA/IDMA/MDMA):
• Understand and minimize latency-critical SDMA/IDMA accesses to L2 or L1P/D.
• Directly reduce critical real-time deadlines, if possible, at peripheral/IO level (e.g.,
increase word size and/or reduce bit rates on serial ports).
• To reduce DSP MDMA latency:
– Increase the priority of the DSP access to DDR2 such that MDMA latency of
MDMA accesses causing stalls is minimized.
Note: Other masters may have real-time deadlines that dictate higher priority
than the DSP.
– Lower the PRIO_RAISE field setting in the DDR2 memory controller's burst
priority register. Values ranging between 0x10 and 0x20 should give decent
performance and minimize latency; lower values may cause excessive
SDRAM row thrashing.
2. Minimize offending scenarios on DSP/caching side:
• If the DSP performing non-cacheable writes is causing the issue, insert protected
non-cacheable reads (as shown in the last list item below) every few writes to
allow the write buffer to empty.
• Use explicit cache commands to trigger cache writebacks during appropriate
times (L1D Writeback All, L2 Writeback All). Do not use these commands when
real-time deadlines must be met.
• Restructure program data and data flow to minimize the offending cache activity.
– Define the read-only data as const. The const C keyword tells the compiler
not to write to the array. By default, such arrays are allocated to the .const
section as opposed to BSS. With a suitable linker command file, the
developer can link the .const section off chip, while linking .bss on chip.
Because programs initialize .bss at run time, this reduces the program's
initialization time and total memory image.
– Explicitly allocate lookup tables and writeable buffers to their own sections.
The #pragma DATA_SECTION (label, section) directive tells the compiler to
place a particular variable in the specified COFF section. The developer can
explicitly control the layout of the program with this directive and an
appropriate linker command file.
– Avoid directly accessing data in slow memories (e.g., flash); copy at
initialization time to faster memories.
• Modify troublesome code.
– Rewrite using DMAs to minimize data cache writebacks. If the code accesses
a large quantity of data externally, consider using DMAs to bring in the data,
using double buffering and related techniques. This will minimize cache
write-back traffic and the likelihood of SDMA/IDMA stalling.
– Re-block the loops. In some cases, restructuring loops can increase reuse in
the cache and reduce the total traffic to external memory.
– Throttle the loops. If restructuring the code is impractical, then it is reasonable
to slow it down. This reduces the likelihood that consecutive SDMA/IDMA
blocks stack up in the cache request pipelines, resulting in a long stall.
• Protect non-cacheable reads from generating an SDMA stall by freezing the L1D
cache during the non-cacheable read access(es). The following example code
contains a function that protects non-cacheable reads, avoids blocking during the
reads, and, therefore, avoids the deadlock state.
;; ======================================================================== ;;
;;  Long Distance Load Word
;;
;;  int long_dist_load_word(volatile int *addr)
;;
;;  This function reads a single word from a remote location with the L1D
;;  cache frozen.  This prevents L1D from sending victims in response to
;;  these reads, thus preventing the L1D victim lock from engaging for the
;;  corresponding L1D set.
;;
;;  The code below does the following:
;;
;;      1.  Disable interrupts
;;      2.  Freeze L1D
;;      3.  Load the requested word
;;      4.  Unfreeze L1D
;;      5.  Restore interrupts
;;
;;  Interrupts are disabled while the cache is frozen to prevent affecting
;;  the performance of interrupt handlers.  Disabling interrupts during
;;  the long distance load does not greatly impact interrupt latency,
;;  because the CPU already cannot service interrupts when it's stalled by
;;  the cache.  This function adds a small amount of overhead (~20 cycles)
;;  to that operation.
;; ======================================================================== ;;
        .asg    0x01840044, L1DCC       ; L1D Cache Control
        .global _long_dist_load_word
        .text
        .asmfunc
; int long_dist_load_word(volatile int *addr)
_long_dist_load_word:
        MVKL    L1DCC,  B4
        MVKH    L1DCC,  B4
||      DINT                            ; Disable interrupts
||      MVK     1,      B5
        STW     B5,     *B4             ; \_ Freeze cache
        LDW     *B4,    B5              ; /
        NOP     4
        SHR     B5,     16,     B5      ; POPER -> OPER
||      LDW     *A4,    A4              ; read value remotely
        NOP     4
        STW     B5,     *B4             ; \_ Restore cache
        RET     B3
||      LDW     *B4,    B5              ; /
        NOP     4
        RINT                            ; Restore interrupts
        .endasmfunc
;; ======================================================================== ;;
;;  End of file: ldld.asm
;; ======================================================================== ;;
In the C6474 multicore device, when one GEM is accessing another GEM's L1 or L2
memory it is an MDMA access, so the potential SDMA/IDMA stall can occur. The stall
can be avoided by using the EDMA to transfer data from one GEM's memory to another.
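The Method 2 recommendation to insert a protected non-cacheable read every few writes can be sketched in C as shown below. This is an illustrative sketch, not TI code: throttled_write, protected_read_fn, and plain_read are hypothetical names, and on the target the read hook would typically wrap long_dist_load_word rather than the plain host-side stand-in used here.

```c
#include <stddef.h>

/* Hypothetical hook: a protected non-cacheable read, for example a wrapper
 * around long_dist_load_word(); passed in so the sketch stays portable. */
typedef int (*protected_read_fn)(volatile int *addr);

/* Portable stand-in for host-side experiments only (assumption). */
int plain_read(volatile int *addr)
{
    return *addr;
}

/* Copy n words to a non-cacheable destination, issuing one protected read
 * after every writes_per_read writes so the write buffer can drain.
 * Returns the number of protected reads issued. */
size_t throttled_write(volatile int *dst, const int *src, size_t n,
                       size_t writes_per_read, protected_read_fn drain)
{
    size_t reads = 0;
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
        if (writes_per_read != 0 && (i + 1) % writes_per_read == 0) {
            (void)drain(&dst[i]);   /* read back the word just written */
            reads++;
        }
    }
    return reads;
}
```

Because the DSP buffers up to 4 non-cacheable writes, a caller might start with writes_per_read set to 3; that value is an assumption to be tuned against the actual system.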
Method 3
Entirely eliminate the exception by removing all SDMA/IDMA accesses to L2 SRAM. For
example, EMAC descriptors and EMAC payload cannot reside in L2. Master peripherals
like the EDMA/QDMA, IDMA, and SRIO cannot access L2. There are no issues with the
CPU itself accessing code/data in L2. This issue only pertains to SDMA/IDMA accesses
to L2.
Deadlock Avoidance
To avoid the manifestation of a C64x+ deadlock, several workarounds are suggested
depending on the VBUSM master in question:
VBUSM MASTER
WORKAROUND
GEM
GEMs should not write to the memory of any other GEM. This will cause
complications across any master peripheral that is transferring data to multiple
L2s. GEMs must not directly read from the memory of any other GEMs without
providing the L1D cache disable workaround mentioned in Method 2 to ensure
that the load will not stall itself indefinitely and hang the system.
EDMA3TCx
Inbound and outbound traffic should be programmed on different TC ports (i.e.,
two different EDMA queues, since a given queue maps to a given TC). Note that
in-/out-bound direction is defined as the write direction, meaning that a
DDR2-to-DDR2 transfer is outbound and L2-to-L2 is inbound. Any TC used to
write to DDR should not be used to write to a GEM even when the TC writing to
the DDR is also reading from DDR.
EMAC
EMAC should write to the GEM's memory or the DDR, but not both. This includes
buffers and buffer descriptors. EMAC CPPI descriptors should be placed wholly
in the local wrapper memory, any combination of wrapper and L2 memory (must
match other master transactions), or any combination of wrapper and DDR2
SDRAM (must match other master transactions). Buffer descriptors should not be
placed in any combination of L2 and DDR2 SDRAM.
SRIO
SRIO should transfer payload data to only GEM memories or to DDR2 SDRAM,
but not both. This includes any direct I/O writes as well as any inbound RX
messaging transfer.
SRIO CPPI
SRIO CPPI descriptors should be placed wholly in the local wrapper memory,
any combination of wrapper and L2 memory, or any combination of wrapper and
DDR2 SDRAM. Buffer descriptors should not be placed in any combination of L2
and DDR2 SDRAM.
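The EDMA3TCx rule above (a TC that writes to DDR should not also write to a GEM) lends itself to a simple configuration check. The following C sketch is one possible way to express it; the types xfer and dst_kind, the MAX_TC bound, and the function name are all hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { DST_DDR2, DST_GEM } dst_kind;           /* write destination   */
typedef struct { unsigned tc; dst_kind dst; } xfer;    /* one planned transfer */

#define MAX_TC 8u   /* assumption: upper bound chosen for this sketch only */

/* Direction is defined by the write destination: DDR2 writes are outbound,
 * GEM writes are inbound. Returns true if no TC carries both directions. */
bool tc_directions_separated(const xfer *xfers, size_t n)
{
    bool writes_ddr[MAX_TC] = { false };
    bool writes_gem[MAX_TC] = { false };

    for (size_t i = 0; i < n; i++) {
        if (xfers[i].tc >= MAX_TC)
            return false;               /* outside the range this sketch models */
        if (xfers[i].dst == DST_DDR2)
            writes_ddr[xfers[i].tc] = true;
        else
            writes_gem[xfers[i].tc] = true;
    }
    for (unsigned tc = 0; tc < MAX_TC; tc++) {
        if (writes_ddr[tc] && writes_gem[tc])
            return false;               /* same TC used for both directions */
    }
    return true;
}
```

Note that the read source is deliberately ignored: per the table, a TC writing to DDR stays outbound even when it is also reading from DDR.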
Advisory 1.2.2
Potential Data Corruption on SCR Bridge
Revision(s) Affected:
1.3, 1.2
Details:
This issue manifests itself when two masters write to a bridge endpoint and the
commands arrive on the same clock cycle. The VBUS protocol is violated and the data is
corrupted. The consequence is that one of the writes goes through with corrupt data, the
other completes normally.
On some occasions the bridge may not recover without a reset. There is no software
indication of this nor a means to reset only the bridge. Therefore, the situation must be
avoided. The affected bridges are: TCP, VCP, AIF write port, and DMA bridge to
configuration bus (where key endpoints beyond this bus could include Semaphore
configuration port, EMAC configuration ports, PaRAM configuration port, or SRIO
configuration ports).
Workarounds:
Corrective action is taken by avoiding the issue as described in the following:
• TCP: Access to R/W ports is controlled by the Semaphore module; no issue.
• VCP: Access to R/W ports is controlled by the Semaphore module; no issue.
• AIF: DMA access is from a single transfer controller (TC); no issue. If more than one
TC is used, the issue will be exposed.
• DMA bridge to configuration bus: Dedicate a single TC for use of the DMA to write
through this bridge to all the endpoints beyond or use the configuration bus directly
and do not use the DMA to program the following:
– Semaphore configuration port: Configuration registers
– SRIO configuration ports: Configuration registers
– EMAC configuration ports: Configuration registers (caution on using EMAC CPPI
buffer to work around the DSP SDMA/IDMA unexpected stalling, see Advisory
1.2.1)
– PaRAM configuration port: Do not auto-program channels from one TC to another
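The avoidance rule for the affected bridges is that each bridge sees DMA writes from at most one transfer controller. A minimal C sketch of such a check over a planned DMA configuration is shown below; bridge_write, MAX_BRIDGES, NO_TC, and the function name are hypothetical, and the bridge numbering is whatever the caller chooses.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct { unsigned bridge_id; unsigned tc; } bridge_write;

#define MAX_BRIDGES 8u          /* assumption: bound for this sketch only */
#define NO_TC       0xFFFFFFFFu /* marker: no TC recorded for this bridge */

/* Returns true if every bridge is written by a single TC only. */
bool single_tc_per_bridge(const bridge_write *w, size_t n)
{
    unsigned owner[MAX_BRIDGES];
    for (unsigned b = 0; b < MAX_BRIDGES; b++)
        owner[b] = NO_TC;

    for (size_t i = 0; i < n; i++) {
        if (w[i].bridge_id >= MAX_BRIDGES)
            return false;                   /* outside the modeled range */
        if (owner[w[i].bridge_id] == NO_TC)
            owner[w[i].bridge_id] = w[i].tc;  /* first writer becomes owner */
        else if (owner[w[i].bridge_id] != w[i].tc)
            return false;                   /* second TC on the same bridge */
    }
    return true;
}
```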
Advisory 1.2.3
Potential Insertion or Deletion of 2 Bits in SerDes Data Stream
Revision(s) Affected:
1.3, 1.2
Details:
For arbitrary phase mode, a FIFO function is integrated into the SerDes TX serializer.
This FIFO has three states (minus1, center, plus1) and is supposed to be reset to the
center state at startup. From this position, the SerDes is then tolerant to variations of
phase between the input clock (TXBCLKIN) and the SerDes internal clock, caused by
temperature and voltage variations. However, as a result of a logic bug, the possibility
exists that under some circumstances, the FIFO may not start in the center state. When
this happens, there is a risk that the FIFO may subsequently overflow or underflow.
Whether the FIFO fails to initialize to the center state depends on the timing
relationships between several signals, including the SerDes internal clock. Even if the
FIFO fails to initialize to the center state, the FIFO will only underflow or overflow if the
phase relationship between the TXBCLKIN input and the internal SerDes clock varies (due
to temperature or voltage changes) in such a way as to cause their edges to cross in
one particular direction. Overflow results in two bits being added to the data stream.
Underflow results in two bits being deleted. If overflow or underflow occurs at all, it only
happens once per TX lane because after it has occurred the FIFO is configured exactly
as if it had initialized to the center state at startup.
The precise silicon process of the device will also be a factor in whether the overflow or
underflow occurs. Some devices may exhibit this behavior at some particular PVT
combinations, others may never exhibit it. It is not possible to predict whether, or under
what conditions, a device is susceptible. If overflow or underflow occurs, it could be at any
time ranging from immediately after startup to weeks, months, or years later.
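The behavior described above can be pictured with a toy model in C. This is not the hardware logic: it assumes the FIFO acts like a counter that saturates at minus1 and plus1, that each clock-edge crossing moves it one step, and that a positive crossing corresponds to overflow. The names tx_fifo and fifo_crossing are hypothetical. Under those assumptions the model reproduces the two properties stated in the text: a FIFO that starts in the center state tolerates drift, and a FIFO that starts off-center slips two bits at most once.

```c
typedef struct {
    int state;          /* -1 = minus1, 0 = center, +1 = plus1 */
    int bits_inserted;  /* total bits added by overflow        */
    int bits_deleted;   /* total bits removed by underflow     */
} tx_fifo;

/* Apply one clock-edge crossing; dir is +1 or -1 (sign convention assumed).
 * A crossing that would push the FIFO beyond plus1 or minus1 costs two bits,
 * and the state saturates at the edge. */
void fifo_crossing(tx_fifo *f, int dir)
{
    int next = f->state + dir;
    if (next > 1) {
        f->bits_inserted += 2;   /* overflow: two bits added to the stream */
        next = 1;
    } else if (next < -1) {
        f->bits_deleted += 2;    /* underflow: two bits deleted */
        next = -1;
    }
    f->state = next;
}
```

After the single slip, the off-center FIFO tracks exactly the same states as one that started in the center, which matches the statement that the event happens only once per TX lane.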
Workarounds:
The issue can be worked around by software control of two ports on the SerDes. At
initialization, cycling of bits resets the circuit and resolves the issue.
• AIF has a software workaround as follows:
The software workaround limits restart to per macro, not per lane. There is one set of
software control bits for the B8 and another for the B4. For details, see the
device-specific data manual, TMS320C6474 Multicore Digital Signal Processor
(literature number SPRS552). The new recommended initialization
sequence is shown in the following code example:
// Enable the Tx Link
CSL_FINST(hAifLink[0]->regs->LCFG[1].LINK_CFG, AIF_LINK_CFG_TX_LINK_EN, ENABLED);
// Set the Link Rate
if (aCommoncfg[0].linkRate == CSL_AIF_LINK_RATE_1x)
{
    CSL_FINST(hAifLink[0]->regs->LCFG[1].LINK_CFG, AIF_LINK_CFG_LINK_RATE, 1X);
}
else if (aCommoncfg[0].linkRate == CSL_AIF_LINK_RATE_2x)
{
    CSL_FINST(hAifLink[0]->regs->LCFG[1].LINK_CFG, AIF_LINK_CFG_LINK_RATE, 2X);
}
else if (aCommoncfg[0].linkRate == CSL_AIF_LINK_RATE_4x)
{
    CSL_FINST(hAifLink[0]->regs->LCFG[1].LINK_CFG, AIF_LINK_CFG_LINK_RATE, 4X);
}
// Toggle the ENFTP bit
CSL_FINS(hAifLink[0]->regs->AI_SERDES0_TST_CFG, AIF_AI_SERDES0_TST_CFG_INVPATT, 1);
CSL_FINS(hAifLink[0]->regs->AI_SERDES0_TST_CFG, AIF_AI_SERDES0_TST_CFG_INVPATT, 0);
CSL version 3.0.6.2 for the C6474 device has a new hardware control command
(CSL_AIF_CMD_ENABLE_DISABLE_TX_LINK_SI1_1) that has the fix for this
advisory.
• EMAC has a software workaround and an auto-recovery for this advisory as follows:
There is a new recommendation for initialization sequence as shown in the following
code example. This example code should be used with CSL version 03.00.06.01.
SgmiiCfg.masterEn   = 0x1;
SgmiiCfg.loopbackEn = 0x1;
SgmiiCfg.auxConfig  = 0x0000000b;
if (0 == SGMII_config(&SgmiiCfg))
    printf("SGMII config successful........\n");
else
    printf("SGMII config NOT successful........\n");
LocalTicks = 0;
while(LocalTicks !=3); // wait for 2us
SgmiiCfg.txConfig = 0x00000e21; // enable transmitter
SgmiiCfg.rxConfig = 0x00081021;
if (0 == SGMII_config(&SgmiiCfg))
    printf("SGMII config successful........\n");
else
    printf("SGMII config NOT successful........\n");
SgmiiCfg.txConfig = 0x00001e21; // toggle the ENFTP bit
if (0 == SGMII_config(&SgmiiCfg))
    printf("SGMII config successful........\n");
else
    printf("SGMII config NOT successful........\n");
SgmiiCfg.masterEn   = 0x1;
SgmiiCfg.loopbackEn = 0x1;
SgmiiCfg.txConfig   = 0x00000e21; // toggle the ENFTP bit
if (0 == SGMII_config(&SgmiiCfg))
    printf("SGMII config successful........\n");
else
    printf("SGMII config NOT successful........\n");
// wait for the Auto-negotiation Complete
SGMII_REGS->CONTROL |= 0x1; // Loopback mode is selected
// set full duplex and Gig bits
SGMII_REGS->MR_ADV_ABILITY = 0x9801;
• SRIO has an auto-recovery as follows:
Auto-recovery resets the link and re-exposes the issue. TI is working to understand
the likelihood of repeated recovery and whether there could be performance impacts
due to repeated recovery.
The software workaround is enabled with a partial fix for AIF and EMAC. A complete fix
will be available in silicon revision 2.x.
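The txConfig values written in the EMAC example above differ in a single bit. The following C sketch derives that bit by XOR; it assumes, based only on the comments in the example and not on a register description, that the differing bit is the ENFTP bit. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* txConfig values copied from the EMAC example above. */
#define SGMII_TXCFG_NORMAL  0x00000e21u  /* transmitter enabled */
#define SGMII_TXCFG_TOGGLED 0x00001e21u  /* value written for the toggle step */

/* Hypothetical helper: mask of the bits that differ between the two writes. */
static uint32_t sgmii_txcfg_toggle_mask(void)
{
    uint32_t mask = SGMII_TXCFG_NORMAL ^ SGMII_TXCFG_TOGGLED;

    /* Sanity check: exactly one bit differs between the two values. */
    assert(mask != 0u && (mask & (mask - 1u)) == 0u);
    return mask;
}
```

Under this reading, the toggle in the example sets and then clears bit 12 of txConfig while leaving all other fields unchanged.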
Advisory 1.2.4
MAC EOI Register Write Causes Potential CPU Lockup
Revision(s) Affected:
1.3, 1.2; Fixed in CSL version 03.00.06.01
Details:
A bug has been found affecting multiple cores trying to write the EOI register via the
MAC interface. It causes a lockup of one of the three MAC interfaces that is attempting
to write the EOI register.
When multiple cores access the MAC interface, one of the cores that requested the
EOI write locks up. This occurs when the MAC interface receives EOI register write
requests from multiple cores in the same clock cycle (for example, a write from core "X"
and a write from core "Y"). The EOI register accepts only one write at a time, X or Y, and
ignores the other. The CPU whose EOI write request was ignored by the MAC locks
up.
Workarounds:
Semaphores can be used to work around the EOI issue. Two new APIs have been added to
write the EOI register, one for receive and the other for transmit. The application can use
these APIs together with the semaphore module to protect the EOI write when all three
cores try to access the EOI register at the same time.
This workaround is required only when more than one core requests the EOI write. Code
examples for receive and transmit writes are shown below.
• Before and after rxEoiWrite, the semaphore APIs are called:
/* Check whether the handle opened successfully, then read the module status */
if (hSemHandle != NULL) {
    /* Check whether the semaphore resource is free or not */
    do {
        /* Get the semaphore */
        CSL_semGetHwStatus(hSemHandle, CSL_SEM_QUERY_DIRECT, &response);
    } while (response.semFree != CSL_SEM_FREE);
    /* Write the EOI register */
    EMAC_rxEoiWrite(coreNum);
    /* Release the semaphore */
    CSL_semHwControl(hSemHandle, CSL_SEM_CMD_FREE_DIRECT, NULL);
}
• Before and after txEoiWrite, the semaphore APIs are called:
/* Check whether the handle opened successfully, then read the module status */
if (hSemHandle != NULL) {
    /* Check whether the semaphore resource is free or not */
    do {
        /* Get the semaphore */
        CSL_semGetHwStatus(hSemHandle, CSL_SEM_QUERY_DIRECT, &response);
    } while (response.semFree != CSL_SEM_FREE);
    /* Write the EOI register */
    EMAC_txEoiWrite(coreNum);
    /* Release the semaphore */
    CSL_semHwControl(hSemHandle, CSL_SEM_CMD_FREE_DIRECT, NULL);
}
This issue is fixed in CSL version 03.00.06.01.
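The acquire, write, release pattern used in both examples can be factored into one helper. The following C sketch models it on a host with stand-in functions; the semaphore and EOI stubs are hypothetical and only simulate the behavior, whereas on the device the CSL semaphore calls and EMAC_rxEoiWrite/EMAC_txEoiWrite shown above would take their place.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the hardware semaphore and the EOI register. */
static bool sem_taken = false;
static int  eoi_writes_in_flight = 0;
static int  eoi_writes_done = 0;

/* Try to take the semaphore; returns true when this caller now owns it. */
static bool sem_try_acquire(void)
{
    if (sem_taken) {
        return false;
    }
    sem_taken = true;
    return true;
}

static void sem_release(void)
{
    sem_taken = false;
}

/* Simulated EOI write: asserts that no other write overlaps with it. */
static void eoi_write_stub(int core_num)
{
    (void)core_num;
    eoi_writes_in_flight++;
    assert(eoi_writes_in_flight == 1);
    eoi_writes_done++;
    eoi_writes_in_flight--;
}

/* Serialize the EOI write: spin until the semaphore is owned, write, release. */
static void eoi_write_protected(int core_num)
{
    while (!sem_try_acquire()) {
        /* Another core holds the semaphore; retry. */
    }
    eoi_write_stub(core_num);
    sem_release();
}
```

Each core would call eoi_write_protected() in place of a bare EOI write, so that only one EOI write can reach the MAC at a time.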
Advisory 1.2.5
Potential SerDes Clocking Issue
Revision(s) Affected:
1.3, 1.2
Details:
A bug has been found in the SerDes interfaces that causes a SerDes clocking problem
in normal functional operation. This problem does not occur when an external pull-down is
applied to the TCK pin (JTAG controller clock). SerDes are used in the Ethernet
interface (EMAC), the Serial RapidIO interface (SRIO), and the Antenna Interface (AIF).
The TCK pin (JTAG controller clock) is internally assigned to an internal signal that is
used by the SerDes macro. For the SerDes macro to get proper clocking in the normal
functional operation, it needs the internal signal to be held low. However, there is an
internal pull-up on the TCK, creating problems for SerDes operation. This problem exists
on all SerDes interfaces.
Workaround:
The TCK pin should be externally pulled down with a 1-kΩ resistor.
Advisory 1.2.6
I2C: Slave Boot Aborts
Revision(s) Affected:
1.3, 1.2
Details:
I2C Slave Boot is intended to speed the boot process for a system with more than two
devices by allowing a single master read of the I2C EEPROM followed by a broadcast
by that master to all remaining devices on the I2C bus. However, during the I2C slave
boot process an internal exception is encountered, causing the boot sequence to abort
on the slave device(s). Consequently, I2C slave boot does not complete.
Workaround:
Use I2C master boot for all devices in the system. Other boot modes using SRIO or
EMAC may also be used, if available on the system.
Advisory 1.2.7
Potential Random E-fuse Blow
Revision(s) Affected:
1.2
Details:
In the final stages of screening the C6474 device for qualification, a subtle issue has
been uncovered involving e-fuses being inadvertently blown during power up if an
improper power and clock sequence is applied to the device. The e-fuse controller on
the C6474 device may unintentionally blow e-fuses during power up when an invalid
power sequence is used:
• The e-fuse controller has a defect that gates the output of an accidental programming
prevention circuit with a clocked register.
• If proper sequencing of supplies and clocks is not maintained, then the program
enable on the e-fuse ROM will be active until a valid reset (SYS_INITZ) is
propagated to the register.
• The result is susceptibility to inadvertent blowing of e-fuses.
If the 1.1-V CVDD scaled supply ramps before the 1.8-V and 1.1-V fixed supplies, the
logic in the CVDD domain powers up in a random state. In this random state, there is a small
probability that the conditions are met for inadvertent blowing of e-fuses.
The possible impact is that e-fuses that are not supposed to be blown are blown.
• Which e-fuses could inadvertently be blown is random.
• The type of e-fuse randomly blown will determine the end impact to the system,
ranging from no impact to severe impact.
• Each power up event is a new opportunity for exposure to the issue in which e-fuses
could be unintentionally blown.
• The probability of this is low, but not low enough to be disregarded.
Workaround:
Guarantee that the 1.8-V DVDD device input clocks and clock selects are active before a
1.1-V CVDD scaled supply ramps (see Figure 4).
[Figure 4: timing diagram. Signals shown: 1.8-V DVDD (0.8 V level, marker 1), 1.1-V DVDD (0.8 V level, marker 3), 1.1-V CVDD (0.4 V level, marker 2), SYSCLK or ALTCORECLK, CLKSEL, and POR.]
A. 1.8-V DVDD valid to 1.1-V CVDD valid, ≥0.5 ms (min).
B. Stable clock to 1.1-V CVDD start, ≥100 µs (min).
C. 1.1-V DVDD to 1.1-V CVDD start, ≥0.5 µs (min).
Figure 4. Correct Device Input Clocks, Clock Selects, and Scaled Supply Timings
TI's TMS test procedure follows the above recommendation to ensure that no devices
are shipped with unintentionally blown e-fuses.
Note:
E-fuses are used in multiple areas within the device. They are used for
memory repair, device ID, EMAC ID, etc.
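The three minimum intervals listed under Figure 4 (A, B, and C) can be checked against measured board timings. The following C sketch is one way to do that; the structure, field names, and the choice of nanosecond timestamps are assumptions for illustration, and the reference point used for the 1.1-V DVDD supply in interval C is assumed to be the instant measured by the user, since the note does not state it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Measured event times in nanoseconds from an arbitrary origin
 * (hypothetical structure; names are not from the errata). */
typedef struct {
    int64_t dvdd18_valid_ns;  /* 1.8-V DVDD valid */
    int64_t dvdd11_ns;        /* 1.1-V DVDD reference point for interval C */
    int64_t clock_stable_ns;  /* SYSCLK or ALTCORECLK stable */
    int64_t cvdd_start_ns;    /* 1.1-V CVDD starts to ramp */
    int64_t cvdd_valid_ns;    /* 1.1-V CVDD valid */
} power_seq_t;

#define MIN_A_NS 500000  /* A: 1.8-V DVDD valid to CVDD valid, 0.5 ms */
#define MIN_B_NS 100000  /* B: stable clock to CVDD start, 100 us */
#define MIN_C_NS 500     /* C: 1.1-V DVDD to CVDD start, 0.5 us */

/* Returns true when all three minimum intervals are met. */
static bool power_seq_ok(const power_seq_t *s)
{
    assert(s != NULL);
    return (s->cvdd_valid_ns - s->dvdd18_valid_ns >= MIN_A_NS) &&
           (s->cvdd_start_ns - s->clock_stable_ns >= MIN_B_NS) &&
           (s->cvdd_start_ns - s->dvdd11_ns >= MIN_C_NS);
}
```

A sequence in which CVDD starts ramping before the clock has been stable for 100 µs fails the check, which corresponds to the exposure described in this advisory.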
Advisory 1.2.8
EMAC Boot Issue
Revision(s) Affected:
1.3, 1.2
Details:
The Ethernet Ready Announcement (ERA) frame is not transmitted when the C6474 device is
booted in EMAC master or slave boot mode.
When the DSP is booted in EMAC master/slave boot modes (boot modes 4, 5), the DSP
is intended to transmit an ERA frame in the form of a BOOTP
request. The BOOTP request is intended to inform the host server that the DSP is ready
to receive boot packets. The ERA frame packet is described in more detail in the
TMS320C6474 Bootloader User's Guide (literature number SPRUG24).
Texas Instruments will fix the Ethernet Ready Announcement frame transmission in the
next silicon revision for C6474 devices.
Workaround 1:
Have the host that is responsible for sending the boot packets broadcast a small boot
table with the program that is shown in the example below. This will cause any C6474
device to restart the EMAC boot procedure (without configuring the MAC peripheral
again) and re-transmit the ERA.
Re-send ERA packet code:
BOOT_REENTRY_ADDR .equ 03c000110h
BOOT_EMAC_OPT     .equ 01088480Ah
        MVKL    BOOT_EMAC_OPT, A1
        MVKH    BOOT_EMAC_OPT, A1
        MVKL    0x00000026, A4
        MVKH    0x00000026, A4
        STH     A4, *A1             ;overwrite option field in EMAC bootparam
        NOP     4
        MVKL    BOOT_REENTRY_ADDR, B3
        MVKH    BOOT_REENTRY_ADDR, B3
        BNOP    B3, 5
Workaround 2:
The host server would need to rely on prior knowledge of the DSP MAC address to
transmit boot packets to the correct DSP. The DSP will be ready to receive EMAC boot
packets within 2 ms following deassertion of reset.
In the scenario where the boot server reads the MAC address of the DSP from the ERA
packet, the procedure would need to be changed. After a customer-defined (TBD) delay
during which the ERA is not received, the host sends the broadcast packet with the payload
described in Workaround 1.
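The host-side decision described above can be sketched as a small C function. This is only an illustration of the control flow; the enum, the function name, and the timeout parameter are hypothetical, and the timeout value itself is left to the system designer as stated in the text.

```c
#include <assert.h>
#include <stdbool.h>

/* Possible host actions (hypothetical names). */
typedef enum {
    HOST_WAIT_FOR_ERA,           /* keep waiting for the ERA frame */
    HOST_SEND_BOOT_PACKETS,      /* ERA seen: MAC address is known */
    HOST_BROADCAST_RESEND_TABLE  /* no ERA: broadcast the Workaround 1 boot table */
} host_action_t;

/* Decide what the boot host should do next.
 * era_received   - true once an ERA frame from the DSP has been seen
 * elapsed_ms     - time since the host started waiting
 * era_timeout_ms - customer-defined delay before falling back to Workaround 1 */
static host_action_t host_next_action(bool era_received,
                                      unsigned elapsed_ms,
                                      unsigned era_timeout_ms)
{
    assert(era_timeout_ms > 0u);
    if (era_received) {
        return HOST_SEND_BOOT_PACKETS;
    }
    if (elapsed_ms >= era_timeout_ms) {
        return HOST_BROADCAST_RESEND_TABLE;
    }
    return HOST_WAIT_FOR_ERA;
}
```

After the broadcast, the host would go back to waiting, since the small boot table causes the DSP to restart the EMAC boot procedure and re-transmit the ERA.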
Advisory 1.2.9
EDMA3CC COMPACTV Issue
Revision(s) Affected:
1.2
Details:
A bug has been found inside the EDMA3 channel controller (EDMA3CC). The logic for
decrementing the completion request active (COMPACTV) counter is incorrect for
devices having six or more EDMA3 transfer controllers (EDMA3TCs). Therefore, the
C6474 devices are affected by this bug.
The COMPACTV field inside the channel controller status register (CCSTAT) indicates
the count for the number of outstanding transfer requests requiring completion status
that have been submitted to the transfer controllers. The channel controller increments
this count every time a transfer request (TR) is submitted and is programmed to report
completion (the TCINTEN or TCCHEN or the ITCINTEN or ITCCHEN bits in OPT in the
parameter entry associated with the TR are set). The counter decrements for every valid
transfer completion code (TCC) received back from the transfer controllers. The bug
occurs because the channel controller decrements the counter by an insufficient value
when multiple responses are received concurrently from multiple (two or more) transfer
controllers. Thus, the counter may gradually increase over time until it saturates at 0x3F.
If at any time the count reaches a value of 0x3F, the channel controller does not service
new TRs until the count is less than 0x3F (which will happen when a transfer completion
code is received from a transfer controller for an in-flight request). Once the state is
reached where the counter is close to the saturation value of 0x3F, the performance of
the EDMA decreases dramatically. This decreased performance happens because the
channel controller will artificially limit its number of TRs in flight to the COMPACTV
saturation value thereby preventing full usage of the available TCs. When the count
reaches 0x3F, the TCCERR bit is set in the channel controller error register (CCERR)
causing an error interrupt when enabled.
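The drift described above can be illustrated with a simplified host-side model in C. This is a sketch under stated assumptions, not a description of the actual logic: it assumes that two TRs are submitted per round, that their completions arrive concurrently, and that the faulty logic then decrements by 1 instead of 2. The function name and the per-round structure are hypothetical.

```c
#include <assert.h>

#define COMPACTV_MAX 0x3Fu  /* saturation value given in the text */

/* Modeled COMPACTV value after `rounds` rounds. In each round up to two TRs
 * are submitted (none once the counter is at 0x3F) and their completions
 * arrive together. dec_per_pair is the decrement applied for two concurrent
 * completions: 2 models correct logic, 1 models the assumed faulty logic. */
static unsigned compactv_after_rounds(unsigned rounds, unsigned dec_per_pair)
{
    unsigned count = 0;

    assert(dec_per_pair <= 2u);
    for (unsigned i = 0; i < rounds; i++) {
        unsigned submitted = 0;
        for (unsigned tr = 0; tr < 2u; tr++) {
            if (count < COMPACTV_MAX) {  /* CC stalls new TRs at 0x3F */
                count++;
                submitted++;
            }
        }
        /* A lone completion is counted correctly; only a concurrent pair
         * is subject to the modeled under-decrement. */
        count -= (submitted == 2u) ? dec_per_pair : submitted;
    }
    return count;
}
```

With a decrement of 2 the modeled counter returns to zero after every round; with a decrement of 1 it climbs by one per round until it sits just below the saturation value.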
Workaround:
The workaround is achieved by having the DSP directly program one of the transfer
controllers (bypassing the channel controller) with a transfer request that requires
completion. This request avoids the COMPACTV increment (because TC is programmed
directly) and forces a COMPACTV decrement when the TC responds to the CC with the
completion signaling.
A specific transfer controller and a specific TCC value should be dedicated in the
system for this workaround. TC2 or TC5 is suggested. TC2 is suggested because
TC0 and TC1 can replace its connectivity; TC0 can be used for TCP/VCP transfers. TC5
is suggested because TC4 can replace its connectivity; TC4/TC3 can be used for AIF
transfers.
The DSP should poll the COMPACTV field often enough such that the counter is not
allowed to exceed 0x30. The actual COMPACTV polling interval may need to be set
through experimentation on the specific end system, since the rate of increment of the
counter is system and load specific.
Upon polling, if the value of the COMPACTV field is greater than a certain threshold
(0x20 is suggested), then the DSP should program the TC with a COMPACTV
decrement transfer. Upon completion of that transfer (as signaled in the CC IPR register)
the COMPACTV field should be re-checked, and another COMPACTV decrement
transfer submitted until the value of the counter is less than the threshold.
Note:
Care must be taken such that the software does not over-decrement the
counter since at the time of polling multiple requests may be in flight in
the system and may result in additional decrements compared to the
current observed value. If too many decrements occur, the counter may
roll under from 0x0 to 0x3F and accidentally result in saturation of the
counter. This is why a value of 0x20 is suggested as the threshold value
(sufficiently large with respect to the number of actual requests that may
be outstanding).
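The polling procedure above can be sketched in C as follows. The operations structure and the simulated counter are hypothetical stand-ins so that the loop can be exercised on a host; on the device, reading COMPACTV would come from CCSTAT, and the trigger step would program the dedicated TC, wait for the dedicated TCC bit in IPR/IPRH, and clear it.

```c
#include <assert.h>
#include <stddef.h>

#define COMPACTV_THRESHOLD 0x20u  /* threshold suggested in the text */

/* Hardware access hooks (hypothetical abstraction). */
typedef struct {
    unsigned (*read_compactv)(void);  /* current COMPACTV field value */
    void (*trigger_and_wait)(void);   /* one decrement transfer, waited to completion */
} compactv_ops_t;

/* One polling pass: if the counter is above the threshold, issue decrement
 * transfers one at a time, re-checking after each completes, until the
 * counter is below the threshold. Returns the number of transfers issued. */
static unsigned compactv_service(const compactv_ops_t *ops)
{
    unsigned issued = 0;

    assert(ops != NULL);
    if (ops->read_compactv() > COMPACTV_THRESHOLD) {
        do {
            ops->trigger_and_wait();
            issued++;
        } while (ops->read_compactv() >= COMPACTV_THRESHOLD);
    }
    return issued;
}

/* Simulation stand-ins used to exercise the loop on a host. */
static unsigned sim_compactv = 0;
static unsigned sim_read_compactv(void) { return sim_compactv; }
static void sim_trigger_and_wait(void) { sim_compactv--; }
```

Because each transfer is waited on before the counter is read again, the loop never has more than one workaround decrement outstanding, which limits the over-decrement risk mentioned in the note.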
This workaround requires that a specific TC instance is dedicated to the COMPACTV
decrement transfer. The reason is that, depending on the nature of the traffic on a given
queue/TC, it may be difficult to control the timing of the normal CC TR submission to that
TC versus the DSP programming of that TC. There is no hardware protection to prevent
corruption of the TC registers in the case that both CC and DSP software attempt to
program the TC simultaneously.
For the base addresses of the TCs, see the device-specific data manual, TMS320C6474
Multicore Digital Signal Processor (literature number SPRS552). A brief summary of the
TC registers to be configured is provided in Table 5.
Table 5. TC Registers Summary
ADDRESS             REGISTER DESCRIPTION    SUGGESTED VALUE
TCx Base + 0x0200   Prog Set Options        See the Prog Set Options Register description below
TCx Base + 0x0204   Prog Set Src Address    See Prog Set Src/Dst Address Register description below
TCx Base + 0x0208   Prog Set Count          0x00010004 (ACNT = 4 and BCNT = 1)
TCx Base + 0x020C   Prog Set Dst Address    See Prog Set Src/Dst Address Register description below
TCx Base + 0x0210   Prog Set B-Dim Idx      0x0 (don't care since BCNT = 1). Writing to the PBIDX register triggers the transfer. Thus, this register should be written.
Note: The five registers listed in Table 5 should be written in the sequence shown (i.e.,
top to bottom). The last write, to the Prog Set B-Dim Idx register, triggers the transfer.
Prog Set Options Register
The Prog Set Options register is shown in Figure 5. The TCINTEN bit should be set to
0x1. The TCC code should be set to some known value that is not used by other
requests in the system. The other fields should be set to 0x0. Upon completion of the
transfer, the TCC value will be set in the corresponding bit in the IPR/IPRH registers.
The software should poll for this bit in the IPR/IPRH registers and then clear it with the
ICR/ICRH registers before programming the next COMPACTV decrement transfer.
Figure 5. Prog Set Options Register
BITS     FIELD       ACCESS
31-23    Reserved    R-0
22       TCCH_EN     R/W-0
21       Rsvd        R-0
20       TCINT_EN    R/W-0
19-18    Reserved    R-0
17-12    TCC         R/W-0
11       Rsvd        R-0
10-8     FWID        R/W-0
7        Rsvd        R-0
6-4      PRI         R/W-0
3-2      Reserved    R-0
1        DAM         R/W-0
0        SAM         R/W-0
LEGEND: R/W = Read/Write; R = Read only; -n = value after reset
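As a cross-check of the field settings described for this register, the following C sketch composes the Prog Set Options value by hand. The bit positions are read from Figure 5 (TCINT_EN at bit 20, TCC in bits 17-12); the macro and function names are hypothetical and are not CSL definitions.

```c
#include <assert.h>
#include <stdint.h>

#define POPT_TCINTEN_SHIFT 20u    /* TCINT_EN, bit 20 per Figure 5 */
#define POPT_TCC_SHIFT     12u    /* TCC, bits 17-12 per Figure 5 */
#define POPT_TCC_MASK      0x3Fu  /* six-bit field */

/* Build the Prog Set Options value for a COMPACTV decrement transfer:
 * TCINTEN = 1, TCC = the dedicated completion code, all other fields 0. */
static uint32_t popt_for_compactv_dec(uint32_t tcc)
{
    assert(tcc <= POPT_TCC_MASK);
    return ((uint32_t)1u << POPT_TCINTEN_SHIFT) |
           ((tcc & POPT_TCC_MASK) << POPT_TCC_SHIFT);
}
```

For a dedicated TCC of 63, this gives 0x0013F000; the completion then appears as bit 63 of the IPR/IPRH pair, that is, bit 31 of IPRH.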
Prog Set Src/Dst Address Register
Although the user can specify any address for src/dst, one of the following settings is
suggested:
1. Set the src/dst address to 0x31000000. This is a reserved location, and a transfer to
this address has lower latency. However, the bus error (BUSERR) bit in the TCx
error status register (ERRSTAT) will be set (the TCx error details register (ERRDET)
will also be set). This TCx error should be ignored. This error is localized to the
dedicated TC for this transfer and will not affect the system. Also, by default, the
BUSERR will not cause the EDMA3TC error interrupt. This interrupt is generated
only when the TCx error enable (ERREN) register is set.
2. The other option is to set the src/dst address to the EDMA3TCx or the EDMA3CC
peripheral ID (PID) register location. This transfer has more latency when compared
to option 1, but will not cause a TCx BUSERR condition.
Example code for programming the TC for this workaround (this example uses TC5):
#include <csl_edma3.h>
#include <soc.h>
#define EDMA3TC_POPT_REG  (*(volatile Uint32*)(CSL_EDMA3TC_5_REGS + 0x200))
#define EDMA3TC_PSRC_REG  (*(volatile Uint32*)(CSL_EDMA3TC_5_REGS + 0x204))
#define EDMA3TC_PCNT_REG  (*(volatile Uint32*)(CSL_EDMA3TC_5_REGS + 0x208))
#define EDMA3TC_PDST_REG  (*(volatile Uint32*)(CSL_EDMA3TC_5_REGS + 0x20C))
#define EDMA3TC_PBIDX_REG (*(volatile Uint32*)(CSL_EDMA3TC_5_REGS + 0x210))
#define COMPACTV_XFER_ADDRESS (0x31000000)
#define COMPACTV_XFER_COMPLETION_CODE (63) /* dedicate one TCC value for this */
void triggerCompactvDecTransfer ()
{
    EDMA3TC_POPT_REG = CSL_EDMA3_OPT_MAKE(FALSE, FALSE, FALSE, TRUE,\
                           COMPACTV_XFER_COMPLETION_CODE, FALSE,\
                           CSL_EDMA3_FIFOWIDTH_NONE, FALSE, FALSE,\
                           CSL_EDMA3_ADDRMODE_INCR, CSL_EDMA3_ADDRMODE_INCR);
    EDMA3TC_PSRC_REG = COMPACTV_XFER_ADDRESS;
    EDMA3TC_PCNT_REG = CSL_EDMA3_CNT_MAKE(4, 1);
    EDMA3TC_PDST_REG = COMPACTV_XFER_ADDRESS;
    EDMA3TC_PBIDX_REG = CSL_EDMA3_BIDX_MAKE(0, 0);
}
IMPORTANT NOTICE
Texas Instruments Incorporated and its subsidiaries (TI) reserve the right to make corrections, modifications, enhancements, improvements,
and other changes to its products and services at any time and to discontinue any product or service without notice. Customers should
obtain the latest relevant information before placing orders and should verify that such information is current and complete. All products are
sold subject to TI’s terms and conditions of sale supplied at the time of order acknowledgment.
TI warrants performance of its hardware products to the specifications applicable at the time of sale in accordance with TI’s standard
warranty. Testing and other quality control techniques are used to the extent TI deems necessary to support this warranty. Except where
mandated by government requirements, testing of all parameters of each product is not necessarily performed.
TI assumes no liability for applications assistance or customer product design. Customers are responsible for their products and
applications using TI components. To minimize the risks associated with customer products and applications, customers should provide
adequate design and operating safeguards.
TI does not warrant or represent that any license, either express or implied, is granted under any TI patent right, copyright, mask work right,
or other TI intellectual property right relating to any combination, machine, or process in which TI products or services are used. Information
published by TI regarding third-party products or services does not constitute a license from TI to use such products or services or a
warranty or endorsement thereof. Use of such information may require a license from a third party under the patents or other intellectual
property of the third party, or a license from TI under the patents or other intellectual property of TI.
Reproduction of TI information in TI data books or data sheets is permissible only if reproduction is without alteration and is accompanied
by all associated warranties, conditions, limitations, and notices. Reproduction of this information with alteration is an unfair and deceptive
business practice. TI is not responsible or liable for such altered documentation. Information of third parties may be subject to additional
restrictions.
Resale of TI products or services with statements different from or beyond the parameters stated by TI for that product or service voids all
express and any implied warranties for the associated TI product or service and is an unfair and deceptive business practice. TI is not
responsible or liable for any such statements.
TI products are not authorized for use in safety-critical applications (such as life support) where a failure of the TI product would reasonably
be expected to cause severe personal injury or death, unless officers of the parties have executed an agreement specifically governing
such use. Buyers represent that they have all necessary expertise in the safety and regulatory ramifications of their applications, and
acknowledge and agree that they are solely responsible for all legal, regulatory and safety-related requirements concerning their products
and any use of TI products in such safety-critical applications, notwithstanding any applications-related information or support that may be
provided by TI. Further, Buyers must fully indemnify TI and its representatives against any damages arising out of the use of TI products in
such safety-critical applications.
TI products are neither designed nor intended for use in military/aerospace applications or environments unless the TI products are
specifically designated by TI as military-grade or "enhanced plastic." Only products designated by TI as military-grade meet military
specifications. Buyers acknowledge and agree that any such use of TI products which TI has not designated as military-grade is solely at
the Buyer's risk, and that they are solely responsible for compliance with all legal and regulatory requirements in connection with such use.
TI products are neither designed nor intended for use in automotive applications or environments unless the specific TI products are
designated by TI as compliant with ISO/TS 16949 requirements. Buyers acknowledge and agree that, if they use any non-designated
products in automotive applications, TI will not be responsible for any failure to meet such requirements.
Following are URLs where you can obtain information on other Texas Instruments products and application solutions:
Products
Amplifiers: amplifier.ti.com
Data Converters: dataconverter.ti.com
DSP: dsp.ti.com
Clocks and Timers: www.ti.com/clocks
Interface: interface.ti.com
Logic: logic.ti.com
Power Mgmt: power.ti.com
Microcontrollers: microcontroller.ti.com
RFID: www.ti-rfid.com
RF/IF and ZigBee® Solutions: www.ti.com/lprf
Applications
Audio: www.ti.com/audio
Automotive: www.ti.com/automotive
Broadband: www.ti.com/broadband
Digital Control: www.ti.com/digitalcontrol
Medical: www.ti.com/medical
Military: www.ti.com/military
Optical Networking: www.ti.com/opticalnetwork
Security: www.ti.com/security
Telephony: www.ti.com/telephony
Video & Imaging: www.ti.com/video
Wireless: www.ti.com/wireless
Mailing Address: Texas Instruments, Post Office Box 655303, Dallas, Texas 75265
Copyright © 2008, Texas Instruments Incorporated
