Intel® E7320 Memory Controller Hub (MCH)
Specification Update
June 2005

Notice: The Intel® E7320 MCH may contain design defects or errors known as errata that may cause the product to deviate from published specifications. Current characterized errata are documented in this Specification Update.

Document Number: 303042-003

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Intel products are not intended for use in medical, life saving, or life sustaining applications.

Intel may make changes to specifications and product descriptions at any time, without notice.

Designers must not rely on the absence or characteristics of any features or instructions marked “Reserved” or “undefined.” Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.

Contact your local Intel sales office or your distributor to obtain the latest specifications before placing your product order.

Intel, Intel Xeon, Intel NetBurst, and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Copyright © 2004, Intel Corporation.

*Other names and brands may be claimed as the property of others.

Contents

Revision History
Preface
Summary Table of Changes
Identification Information
Errata
Specification Changes
Specification Clarifications
Documentation Changes

Revision History

Version   Description                                 Date
-001      • Initial publication.                      August 2004
-002      • Added C4 Stepping information.            November 2004
          • Added errata 24-26.
-003      • Added Specification Clarification 1.      June 2005

Preface

This document is an update to the memory interface specifications contained in the Affected Documents/Related Documents table below. This document is a compilation of device and document errata and specification clarifications and changes. It is intended for hardware system manufacturers and software developers of applications, operating systems, or tools.

Information types defined in Nomenclature are consolidated into this update document and are no longer published in other documents.
This document may also contain information that was not previously published.

Affected Documents/Related Documents

Document Title                                          Reference Number
Intel® E7320 Memory Controller Hub (MCH) Datasheet      303007

Nomenclature

Errata are design defects or errors. These may cause the Intel® E7320 MCH to deviate from published specifications. Hardware and software designed to be used with any given stepping must assume that all errata documented for that stepping are present on all devices.

Specification Changes are modifications to the current published specifications. These changes will be incorporated in any new release of the specification.

Specification Clarifications describe a specification in greater detail or further highlight a specification’s impact to a complex design situation. These clarifications will be incorporated in any new release of the specification.

Documentation Changes include typos, errors, or omissions from the current published specifications. These will be incorporated in any new release of the specification.

Note: Errata remain in the specification update throughout the product’s lifecycle, or until a particular stepping is no longer commercially available. Under these circumstances, errata removed from the specification update are archived and available upon request. Specification changes, specification clarifications and documentation changes are removed from the specification update when the appropriate changes are made to the appropriate product specification or user documentation (datasheets, manuals, etc.).

Summary Table of Changes

The following table indicates the errata, specification changes, specification clarifications, or documentation changes which apply to the E7320 MCH. Intel may fix some errata in a future stepping of the component, and account for the other outstanding issues through documentation or specification changes as noted.
This table uses the following notations:

Codes Used in Summary Table

X: Errata exists in the stepping indicated. Specification Change or Clarification that applies to this stepping.
(No mark) or (Blank box): This erratum is fixed in listed stepping or specification change does not apply to listed stepping.
Doc: Document change or update will be implemented.
Plan Fix: This erratum may be fixed in a future stepping of the component.
Fixed: This erratum has been previously fixed.
No Fix: There are no plans to fix this erratum.

Change bar to left of table row indicates this item is either new or modified from the previous version of this document.

Errata

No.  C1  C2  C4  Status    ERRATA
1    X   X   X   No Fix    Data corruption after an illegal front side bus configuration Write
2    X   X   X   No Fix    Improper ECC and Memory Initialization while in Symmetric mode
3    X   X   X   No Fix    Single Channel ECC Error Injection issue
4    X   X   X   No Fix    PCI Express* add-in card presence detect state misreported
5    X   X   X   No Fix    Incorrect PCI Express Link/Lane numbers driven in degraded link
6    X   X   X   No Fix    PCI Express Compliance Mode issue
7    X   X   X   No Fix    PCI Express link training failures on hot reset
8    X   X   X   No Fix    Subsystem Identification and Subsystem Vendor Identification register issue
9    X   X   X   No Fix    MCH responds with illegal access on the Hub Interface for 32 GB configurations
10   X   X   X   No Fix    MCH hang on PCI Express enhanced configurations to non-existent devices causes hang
11   X   X   X   No Fix    Spurious errors logged during link training events
12   X   X   X   No Fix    DDR2 write offset issue
13   X   X   X   No Fix    HiLoCS bit not readable in memory error address registers
14   X   X   X   No Fix    MCH transitions from Polling.Active prematurely
15   X   X   X   No Fix    Non-fatal completion timeout errors observed on PCI Express devices
16   X   X   X   No Fix    MCH fails to train when non-TS1/TS2 training sequences are received
17   X   X   X   No Fix    DIMM sparing issue with demand scrub enabled
18   X   X   X   No Fix    Configuration transaction may be ignored in MCH when Configuration Request Retry Status is enabled in PCI Express to PCI/PCI-X bridges
19   X   X   X   No Fix    PCI Express x4, x8 links may train down to lower width
20   X   X   X   No Fix    SKP ordered set may not be sent within required interval
21   X   X   X   No Fix    END symbol omitted from the last PM_Request_Ack DLLP while entering L2 state on x1 PCI Express link
22   X   X       Plan Fix  System hang may occur when entering S4 and S5 power states
23   X   X       Plan Fix  Transposed interrupt messages across Hub Interface
24   X   X   X   No Fix    Completion timeout errors in the presence of heavy PCI Express peer-to-peer traffic
25   X   X   X   No Fix    SMBDAT and SMBCLK signals pulled down in S5
26   X   X   X   No Fix    Multiple PCI Express protocol errors may result in fatal receiver overflow

Specification Changes

Number  SPECIFICATION CHANGES
        None for this revision of the Specification Update

Specification Clarifications

Number  SPECIFICATION CLARIFICATIONS
1       Clarification to Section 4.4.1, “Memory Remapping”, in the EDS

Documentation Changes

Number  DOCUMENTATION CHANGES
1       Interrupt Redirection

Identification Information

Component Identification via Programming Interface

The Intel® E7320 MCH can be identified by the following register contents:

MCH Version  Stepping  Vendor ID (1)  Device ID (2)  Revision Number (3)
E7320        C-1       8086h          3592h          09h
E7320        C-2       8086h          3592h          0Ah
E7320        C-4       8086h          3592h          0Ch

NOTES:
1. The Vendor ID corresponds to bits 15:0 of the Vendor ID Register located at offset 00-01h in the PCI bus 0, device 0, function 0 configuration space.
2. The Device ID corresponds to bits 15:0 of the Device ID Register located at offset 02-03h in the PCI bus 0, device 0, function 0 configuration space.
3. The Revision Number corresponds to bits 7:0 of the Revision ID Register located at offset 08h in the PCI bus 0, device 0, function 0 configuration space.

Component Marking Information

The Intel® E7320 MCH stepping can be identified by the following component markings:

MCH    Stepping  Spec
E7320  C-1       SL7P4
E7320  C-2       SL7RE
E7320  C-4       SL7XV

Errata

1. Data corruption after an illegal front side bus configuration Write
Problem: When an illegal FSB configuration write occurs (bits [30:24] of the Configuration Address Register (CONFIG_ADDRESS, I/O address 0CF8h) are non-zero), PCI configuration accesses following this write may be corrupted.
Implication: This illegal write is a mishandled error case and corrupts the transactions that follow it.
Workaround: Do not write non-zero values to the PCI configuration address register reserved fields.
Status: For the steppings affected, see the Summary Table of Changes.

2. Improper ECC and Memory Initialization while in Symmetric mode
Problem: ECC and memory initialization is not properly executed when the MCH is in Symmetric Addressing mode. The MCH automatically enters symmetric address bit permuting when precisely four identical ranks of memory are available.
Implication: Correctable and uncorrectable memory errors may be detected since ECC is not properly initialized. The entire memory array is not initialized with zeros.
Workaround: Refer to your Intel representative for details.
Status: For the steppings affected, see the Summary Table of Changes.

3. Single Channel ECC Error Injection issue
Problem: In single channel mode, single ECC error injection to Quad-word 4/5 or Quad-word 6/7 is not functional. The “Inject all” function works for all Quad-words as expected, as do all injection cases in dual channel mode.
Implication: Injected errors will not propagate to the memory array.
As a result, when the memory location is read, the Correctable Read Memory Error Channel B and Correctable Read Memory Error Channel A bits of the DRAM_FERR Register (Device 0, Function 1, Offset 80h, bits 0 and 8, respectively) report no errors.
Workaround: Use “inject always” or limit error injection via the ECCDIAG register to the first half of the cache line when in single channel mode.
Status: For the steppings affected, see the Summary Table of Changes.

4. PCI Express* add-in card presence detect state misreported
Problem: PCI Express ports that are configured as non-hot plug capable incorrectly assert the add-in card Presence Detect State in the PCI Express Slot Status Register (EXP_SLTSTS, Device 2-3, Function 0, Offset 7E-7Fh bit 6) regardless of the presence of an add-in card.
Implication: Software may infer that an add-in card is present when none exists.
Workaround: Use the Link Active bit in the Vendor Specific Status Register 1 (VS_STS1, Device 2-3, Function 0, Offset 47h bit 1) as an alternative to the Presence Detect State bit.
Status: For the steppings affected, see the Summary Table of Changes.

5. Incorrect PCI Express Link/Lane numbers driven in degraded link
Problem: If a failure of receiver detect or bit/symbol lock occurs on lane 0 (lane 7 in the case of physical lane reversal) while other lanes successfully achieve bit/symbol lock in the early stages of Polling.Active, the MCH will exhibit anomalous lane numbering during the ensuing failed training sequence. Note that this anomalous behavior only occurs in situations where the combination of successful and failing lanes will result in a training failure, and a return to the Polling state.
Implication: When such a failed training is in progress, non-compliant non-PAD lane numbers may be observed on the MCH downstream lanes. The observed behavior may be seen as the MCH attempting a link split.
Workaround: None
Status: For the steppings affected, see the Summary Table of Changes.

6. PCI Express Compliance Mode issue
Problem: When a x8 link exits PCI Express Compliance Mode, the MCH will attempt to retrain as two x4 links. This issue manifests itself when the MCH inadvertently enters Compliance Mode.
Implication: Upon exiting Compliance Mode, the MCH link will attempt to train a downstream x8 device as two separate x4 links. Depending on the capabilities of the downstream device, the link width will be configured as either x4 or x1.
Workaround: Set bit 0 to 1b in Bus 0, Device 0, Function 0, Offset F5h. This will force the MCH to not enter compliance mode. Note that the MCH defaults to Compliance Mode disabled.
Status: For the steppings affected, see the Summary Table of Changes.

7. PCI Express link training failures on hot reset
Problem: When issuing a hot reset to a PCI Express slot via the secondary bus reset bit of the bridge control register (BCTRL, Bus 0, Device 2-3, Function 0, Offset 3Eh bit 6, 1b), the link may fall back to a lower link width.
Implication: The link may degrade in width or fail to train altogether after a hot reset.
Workaround: Implement a software algorithm that, upon a link training failure, issues a Secondary Bus Reset for 2 ms. The algorithm should support at least three iterations of Secondary Bus Resets.
Status: For the steppings affected, see the Summary Table of Changes.

8. Subsystem Identification and Subsystem Vendor Identification register issue
Problem: The Subsystem Vendor Identification register (SVID, Bus 0, D0:F0/F1, D1:F0, D2:F0 & D8:F0, Offset 2C-2Dh) and the Subsystem Identification register (SID, Bus 0, D0:F0/F1, D1:F0, D2:F0 & D8:F0, Offset 2E-2Fh) cannot be written independently. Writing to one register causes both to become Read Only.
Implication: If the values for these two registers are not written together via the Dword address, then the second value written will not be set.
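The Dword-write workaround for this erratum can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not BIOS code: `write_dword` is a hypothetical callback standing in for a platform's port I/O to 0CF8h/0CFCh, and the helper names are invented for this example. The address layout is the standard PCI configuration address format, with the reserved bits [30:24] described in erratum 1 left at zero.

```python
SVID_OFFSET = 0x2C  # SVID occupies 2C-2Dh, SID occupies 2E-2Fh (from the erratum text)


def config_address(bus, device, function, offset):
    """Build a CONFIG_ADDRESS value; reserved bits 30:24 are never set."""
    if not (0 <= bus <= 0xFF and 0 <= device <= 0x1F and 0 <= function <= 0x7):
        raise ValueError("bus/device/function out of range")
    if not (0 <= offset <= 0xFF) or offset & 0x3:
        raise ValueError("offset must be Dword-aligned and within 00h-FFh")
    # Bit 31 is the enable bit; bits 1:0 are zero for a Dword-aligned access.
    return 0x80000000 | (bus << 16) | (device << 11) | (function << 8) | offset


def pack_svid_sid(svid, sid):
    """Combine both 16-bit IDs into one Dword (SVID low word, SID high word)."""
    if not (0 <= svid <= 0xFFFF and 0 <= sid <= 0xFFFF):
        raise ValueError("SVID and SID are 16-bit values")
    return (sid << 16) | svid


def write_subsystem_ids(write_dword, bus, device, function, svid, sid):
    """Set SVID and SID with a single Dword write.

    write_dword(address, value) is a hypothetical platform hook.
    """
    write_dword(config_address(bus, device, function, SVID_OFFSET),
                pack_svid_sid(svid, sid))
```

Because both registers become Read Only after the first write, the example performs exactly one write per function and never writes either register on its own.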
Workaround: Write to both registers at the same time using PCI configuration Dword writes.
Status: For the steppings affected, see the Summary Table of Changes.

9. MCH responds with illegal access on the Hub Interface for 32 GB configurations
Problem: When devices behind the ICH try to access a memory address above 4 GB in systems with 32 GB of physical memory, an illegal access error is incorrectly flagged by the MCH.
Implication: A spurious error is flagged, and accesses between 4 GB and 32 GB will not succeed in the 32 GB (maximum) memory configuration, which can result in a system hang.
Workaround: Refer to your Intel representative or to the Intel® E7520, E7320, and E7525 Memory Controller Hub (MCH) Components BIOS Specification for details.
Status: For the steppings affected, see the Summary Table of Changes.

10. MCH hang on PCI Express enhanced configurations to non-existent devices causes hang
Problem: A system hang may occur when writing or reading to offsets above 0x0FF using the PCI Express enhanced configuration space of a non-existent device.
Implication: An invalid access error will be flagged, and a system hang may result.
Workaround: Polling or testing for devices must be done using offsets below 0x0FF. Accesses must not be issued to offsets above 0x0FF unless the targeted device is confirmed present.
Status: For the steppings affected, see the Summary Table of Changes.

11. Spurious errors logged during link training events
Problem: The MCH reports spurious receiver errors during initial link training, after a retrain, or after a secondary bus reset has occurred.
Implication: Spurious receiver errors will be logged in the associated port. There are no negative side effects besides the misreported error.
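The clearing sequence called for by the workaround for this erratum can be sketched in Python as follows. This is a hedged illustration only: `write_config` is a hypothetical platform hook, the table and function names are invented for the example, and the sketch assumes every listed bit is cleared by writing 1b to it (the standard PCI Express status bits are write-1-to-clear; treating EXP_FERR and EXP_NERR the same way is an assumption to check against the datasheet).

```python
# (register name, offset, access width in bytes, bit to clear)
# Offsets and bit positions are taken from the workaround text.
SPURIOUS_ERROR_BITS = (
    ("EXP_DEVSTS", 0x6E, 2, 0),      # correctable error detected
    ("EXP_CORERRSTS", 0x110, 4, 0),  # receiver error status
    ("EXP_FERR", 0x160, 4, 6),       # first-error flag for correctable errors
    ("EXP_NERR", 0x164, 4, 6),       # next-error flag for correctable errors
)

PCI_EXPRESS_PORT_DEVICES = (2, 3)  # MCH PCI Express ports, function 0


def clear_spurious_receiver_errors(write_config, devices=PCI_EXPRESS_PORT_DEVICES):
    """Clear the status bits that link training may set spuriously.

    write_config(device, function, offset, width, value) is a hypothetical
    hook. Only the target bit is written as 1, so other write-1-to-clear
    bits in the same register are left untouched.
    """
    for device in devices:
        for _name, offset, width, bit in SPURIOUS_ERROR_BITS:
            write_config(device, 0, offset, width, 1 << bit)
```

A caller would run this once after initial training and again after each retrain or secondary bus reset, so that later receiver errors logged on the port are not masked by stale status.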
Workaround: Upon initial training and after each retrain or secondary bus reset, clear the correctable error detected bit of the PCI Express Device Status register (EXP_DEVSTS, Device 2-3, Function 0, Offset 6E-6Fh bit 0, 1b) and the receiver error status bit of the PCI Express Correctable Error Status register (EXP_CORERRSTS, Device 2-3, Function 0, Offset 110-113h bit 0, 1b). Also clear the FERR/NERR bits that flag correctable errors (EXP_FERR/EXP_NERR, Device 2-3, Function 0, Offset 160-163h / 164-167h bit 6, 1b).
Status: For the steppings affected, see the Summary Table of Changes.

12. DDR2 write offset issue
Problem: DQ/DQS signals terminate to a level about 300 mV below VDDQ/2 between write bursts. No functional failures have been observed as a function of this issue.
Implication: Signal integrity issues may be observed.
Workaround: None
Status: For the steppings affected, see the Summary Table of Changes.

13. HiLoCS bit not readable in memory error address registers
Problem: In the Error Address registers, bit 0 (HiLoCS) is not accessible via software and will always return 0b if read. The affected registers are:

Register         Device:Function:Offset
DRAM_SEC1_ADD    D0:F1:A0-A3h
DRAM_DED_ADD     D0:F1:A4-A7h
DRAM_SCRB_ADD    D0:F1:A8-ABh
DRAM_RETR_ADD    D0:F1:AC-AFh
DRAM_SEC2_ADD    D0:F1:C8-CBh

Implication: Software cannot rely on these bits.
Workaround: Refer to your Intel representative for workaround details.
Status: For the steppings affected, see the Summary Table of Changes.

14. MCH transitions from Polling.Active prematurely
Problem: During a standard link training sequence, the MCH should remain in Polling.Active until TS1 ordered sets with link and lane set to PAD are received on all lanes that passed Receiver Detect.
Because the MCH does not explicitly check for PAD on the link and lane numbers, it is possible for the MCH to transition from Polling.Active to Polling.Config when a downstream device is not executing a standard link training sequence (i.e., when the downstream device is actually in recovery or reset).
Implication: This early transition to Polling.Config may result in a degraded link width (e.g., a x4 port may train as x1), but the link will train.
Workaround: None required.
Status: For the steppings affected, see the Summary Table of Changes.

15. Non-fatal completion timeout errors observed on PCI Express devices
Problem: When PCI configuration accesses are made on secondary buses to MCH PCI Express bridges (Device 2-3, Function 0), non-fatal completion timeout errors (EXP_UNCERRSTS, Device 2-3, Function 0, Offset 104h bit 14) may be observed in the MCH. This condition also applies to PCI configuration accesses on any downstream device that is in the hot reset state or is disabled.
Implication: The system may escalate non-fatal PCI Express completion timeout errors inadvertently.
Workaround: There are two viable workarounds:
1. Mask the completion timeout errors on MCH PCI Express bridge devices with unpopulated slots as identified by the Presence Detect State bit (EXP_SLTSTS, Device 2-3, Function 0, Offset 7Eh bit 6) in the PCI Express Slot Status register. If a device is present but disabled or in the hot reset state, then the Link Active bit (VS_STS1, Device 2-3, Function 0, Offset 47h bit 1) should be checked for link status.
2. Construct a completion timeout handler to clear the error and return if the Presence Detect State bit and the Link Active bit are clear.
Status: For the steppings affected, see the Summary Table of Changes.

16. MCH fails to train when non-TS1/TS2 training sequences are received
Problem: During the PCI Express training sequence, if a broken endpoint or a good endpoint on a broken board has correct receiver termination on any lane and transmits signals on that lane that can be seen at the MCH and are not valid TS1/TS2 training sequences, the MCH will fail to train that link at all.
Implication: The PCI Express specification intends that, if some lanes are transmitting bogus data instead of valid training sequences, those lanes should be treated as broken, and the link should fail down to an acceptable width (such as x1). If lane 0 were failing in this manner, the link would fail to train per the PCI Express specification. If a higher-numbered lane were failing in this manner, the PCI Express specification requires that the link attempt to train as a x1 on lane 0; the MCH will not train in this scenario.
Failures are anticipated to occur because of a broken transmitter/receiver path, or a silent transmitter. None of those failure modes will cause the MCH to fail to train, since either the receiver termination will be missing, or the transmitted signals will not be seen at the MCH. In order to see invalid transmitted signals at the MCH, either a logic bug in the other PCI Express endpoint would be required, or a signal integrity issue so severe as to make operation impossible.
Workaround: None
Status: For the steppings affected, see the Summary Table of Changes.

17. DIMM sparing issue with demand scrub enabled
Problem: When spare copy is in progress and a demand scrub (as a result of a demand fetch with a correctable error) to an address resolving to the SCRUBLIM is performed, the process of spare copy from the failing DIMM to the spare DIMM may terminate prematurely.
Implication: A system hang may occur when the spare DIMM is brought “on-line” prematurely and bad data is read from this DIMM.
This condition is a result of the premature exit of the spare copy process.
Workaround: BIOS should disable demand scrub prior to initiating spare copy and re-enable it after the data migration is complete. Demand scrubbing can be enabled and disabled by updating the Scrub Limit and Control Register (SCRUBLIM, Device 8, Function 0, Offset C8-CBh bit 27).
Status: For the steppings affected, see the Summary Table of Changes.

18. Configuration transaction may be ignored in MCH when Configuration Request Retry Status is enabled in PCI Express to PCI/PCI-X bridges
Problem: Under certain circumstances that include a mix of PCI Express traffic in the presence of completions with Configuration Request Retry Status (configuration space traffic receiving CRS, and other traffic that is posted / governed by Posted Flow Control credits) on a given PCI Express port, the MCH may ignore and fail to issue an outbound configuration space access indefinitely. This behavior has been observed in configurations with PCI Express to PCI/PCI-X bridge devices under circumstances where at least one device “behind” the bridge is active and operational, while at least one other device “behind” the bridge remains unresponsive to configuration requests for an extended period of time. Such failures ultimately manifest themselves as CPU IERR# assertions, which commonly precipitate a platform reboot.
Completions with Configuration Request Retry Status are generally sent by a PCI Express to PCI/PCI-X bridge when it relays configuration space traffic to a PCI/PCI-X device which exhibits a long latency in responding to configuration space traffic. The CRS completion status mechanism is intended to prevent a PCI Express completion timeout from occurring in cases where historical PCI/PCI-X implementations would experience an extended latency without response, but would not generate any timeout or associated error.
Implication: A system hang may occur.
Workaround: To prevent configuration transactions from being ignored, Intel strongly recommends that BIOS disable Configuration Request Retries in all PCI Express bridge devices. For the Intel® 6700PXH 64-bit PCI Hub this is accomplished by clearing the Bridge Configuration Retry Enable bit in the Device Control register (D0:F0,2:R04Ch bit 15). This bit is cleared by default.
Some PCI or PCI-X devices may require a lengthy self-initialization sequence (up to 1.5 sec as defined by PCI Express Base Specification 1.0a) to complete before they are able to service Configuration Requests after reset. In order to ensure the ability of the system to successfully enumerate PCI devices, BIOS should disable PCI Express Completion Timeout in the root port configuration of MCH links connected to Intel® 6700PXH 64-bit PCI Hub, Intel® IOP332, and Intel® 41210 devices (including add-in cards) by setting the Completion Timeout Timer Disable bit in the Vendor Specific command register (D2-3:F0:R045h bit 3). BIOS should ensure that the Completion Timeout Timer remains enabled (default) for other active PCI Express links. BIOS should also ensure that the Completion Timeout Error Mask is set in MCH root ports associated with inactive PCI Express links (unpopulated slots or disabled devices); refer to erratum 15 for details.
Status: For the steppings affected, see the Summary Table of Changes.

19. PCI Express x4, x8 links may train down to lower width
Problem: It has been observed that x4, x8 links may fail to train to their full link widths. This behavior occurs infrequently. The issue is caused by the MCH exiting the Polling.Active state and entering the Polling.Config state prior to the downstream device entering the Polling.Active state.
Implication: PCI Express ports may fail to train at full width.
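The retry workaround for this erratum (a Secondary Bus Reset on a link training failure, with support for at least three iterations) can be sketched in Python as follows. This is a minimal illustration, not BIOS code: `link_ok` and `set_sbr` are hypothetical callbacks for checking the trained link width and for driving the Secondary Bus Reset bit (BCTRL bit 6), and holding the reset asserted for 2 ms is one reading of the workaround text.

```python
import time

SBR_HOLD_SECONDS = 0.002  # 2 ms, from the workaround text
MAX_RESETS = 3            # "at least three iterations"


def recover_link(link_ok, set_sbr, max_resets=MAX_RESETS,
                 hold_seconds=SBR_HOLD_SECONDS, sleep=time.sleep):
    """Retry link training with Secondary Bus Resets.

    link_ok()       -> True when the link trained at the expected width
                       (hypothetical hook; it is assumed to wait for
                       training to finish before sampling the result).
    set_sbr(active) -> asserts or deasserts the Secondary Bus Reset bit
                       (hypothetical hook).
    Returns (trained, resets_issued).
    """
    resets = 0
    while not link_ok():
        if resets == max_resets:
            return False, resets  # give up after the allowed iterations
        set_sbr(True)
        sleep(hold_seconds)       # keep the reset asserted for 2 ms
        set_sbr(False)
        resets += 1
    return True, resets
```

Bounding the loop keeps a slot with a genuinely broken device from stalling boot indefinitely; the caller can then decide whether to accept the degraded width or report the failure.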
Workaround: Intel recommends an algorithm that, upon a link training failure, issues a Secondary Bus Reset for 2 ms. The algorithm should support at least three iterations of Secondary Bus Resets.
Status: For the steppings affected, see the Summary Table of Changes.

20. SKP ordered set may not be sent within required interval
Problem: During Link Recovery on a PCI Express port, the MCH may fail to transmit a SKP ordered set within the required time interval as defined in the PCI Express 1.0a Specification if a TLP or DLLP was pending when the link entered the Recovery.Idle state.
Implication: If the receiving device depends upon receipt of a SKP ordered set to progress through Link Recovery, a timeout will occur, resulting in Link Down and automatic reinitialization of the PCI Express link. A link transitions through Recovery only under exceptional operational conditions. Following the Link Recovery timeout and reinitialization, the link should resume normal operation unless the original Link Recovery condition was entered as a result of a hard failure mechanism.
Workaround: None
Status: For the steppings affected, see the Summary Table of Changes.

21. END symbol omitted from the last PM_Request_Ack DLLP while entering L2 state on x1 PCI Express link
Problem: When a x1 link transitions into the L2 state, the MCH may fail to transmit the END symbol of the last PM_Request_Ack DLLP.
Implication: If a downstream device expects an END symbol in the last PM_Request_Ack DLLP from the MCH, it may incorrectly decode the electrical ordered set that follows. Endpoints should expect the COM symbol in the electrical ordered set to indicate a final confirmation to transition the link to the L2 state.
Workaround: None
Status: For the steppings affected, see the Summary Table of Changes.

22. System hang may occur when entering S4 and S5 power states
Problem: When the system is transitioning into the S4 or S5 state, the MCH may fail to respond to an ICH power management handshake event, resulting in a system lock. Specifically, this erratum can be encountered when the duration between the rising edge of HICLK and the preceding rising edge of HCLKIN is between 1.6 ns and 2.7 ns, measured at the MCH pins. A platform outside this range will not encounter this erratum. When this failure occurs, the system will maintain power and remain unresponsive indefinitely. Once the system becomes unresponsive, it will only resume operation after an AC power cycle or an unconditional powerdown.
Under normal operation, a transition into S3-S5 will have the following processor bus signature:
1. ICH asserts STPCLK# to the processor.
2. Processor issues a Stop Grant Acknowledge transaction on the processor bus.
3. ICH asserts SLP# to the processor.
In the failing case steps 1 and 2 are observed, but step 3 is not.
Implication: System may hang during a power management transition.
Workaround: Refer to your Intel representative for workaround details.
Status: For the steppings affected, see the Summary Table of Changes.

23. Transposed interrupt messages across Hub Interface
Problem: In cases where virtual wire interrupt messages (Assert/Deassert-INT[A, B, C, D]) received on PCI Express are spuriously short (the deassert message is received before the assert message can be forwarded by the MCH to the ICH), the MCH may infrequently transpose the interrupt assert and deassert messages across the Hub Interface. Under normal conditions the MCH will forward an interrupt assert message originating from a PCI Express port over the Hub Interface prior to receiving or forwarding the corresponding deassert message.
In the event that the transposition occurs, the virtual wire is left asserted at the ICH when it is in fact de-asserted at the source. The virtual wire will remain asserted until a subsequent interrupt on that same virtual wire arrives to clear the condition. During the period where the virtual wire remains “stuck” asserted, spurious interrupts will be forwarded to the processor(s).
Implication: Systems running with a single logical processor (most commonly in uni-processor configurations when Hyper-Threading Technology is disabled) and operating in legacy PIC mode or virtual wire mode A may hang under high I/O-driven interrupt stress. For systems operating in full APIC mode where the number of virtual interrupt lines (intA, intB, etc.) used by all PCI Express adapters in a system exceeds the number of logical processors (threads), the system may hang.
Workaround: BIOS updates are required to support PIC mode. Use of PCI Express adapters is not recommended. PCI Express devices down on the motherboard are supported if they are single function devices or have their IOAPIC enabled. Refer to your Intel representative for details.
Status: For the steppings affected, see the Summary Table of Changes.

24. Completion timeout errors in the presence of heavy PCI Express peer-to-peer traffic
Problem: When a single PCI Express port receives a continuous stream of posted transactions targeting a peer PCI Express port (as opposed to targeting memory), and the throughput into the sending port is equal to or higher than that of the destination port, the MCH will continuously grant the sending port access to the target port until a break in the posted traffic occurs. Under these conditions, a third PCI Express port attempting to send one or more posted transactions to the same target port will be held off for an unbounded period of time (until a break occurs in the transmit stream from the port currently granted access).
Given the right mix of traffic to the port that is thus blocked, and sufficient duration of the “continuous stream” of posted transactions at the target port, a completion timeout error may occur on the port that is blocked. Note that outbound CPU traffic to the target port and completions for inbound reads from the target port are not impacted by the blocking mechanism; only competing peer transfers to the target will be stalled.

Implication: When PCI Express peer-to-peer transfers are sufficiently large and uninterrupted, and transfers are initiated on multiple source ports targeting the same destination port, completion timeout errors may occur. In order to trigger such a timeout, one of the peer source ports must be blocked for at least 16.7 ms.

Workaround: Limit the uninterrupted duration (total data payload size) of transfers between peer PCI Express ports, such that no single continuous transfer exceeds a duration of 16.7 ms. For reference, each x4 PCI Express port is capable of transferring well over 12 MB of data in 16.7 ms, so an uninterrupted blockage of such duration is not expected to occur unless extreme circumstances are contrived.

Status: For the steppings affected, see the Summary Table of Changes.

25. SMBDAT and SMBCLK signals pulled down in S5

Problem: According to SMBus Specification 2.0, the SMBDAT and SMBCLK signals are to float while in the S5 state. Due to device protection circuitry, these signals are pulled down while in the S5 state.

Implication: Devices on auxiliary power, such as a BMC, that share an SMBus connection with the MCH will not be able to signal on the SMBus in the S5 state because the signals are pulled down.

Workaround: A mux can be incorporated into the SMBus to disconnect the MCH when the platform goes into the S5 state.

Status: For the steppings affected, see the Summary Table of Changes.

26.
Multiple PCI Express protocol errors may result in fatal receiver overflow

Problem: If a PCI Express device connected to the MCH generates multiple transaction layer protocol errors, including unexpected completion packets or malformed transaction layer packets (TLPs) that otherwise pass all link-layer error checking and have the correct alignment on the interface, the MCH may experience a fatal receiver overflow.

Implication: If the above conditions are met, the MCH may detect and log a “fatal” receiver overflow error. MCH behavior in the presence of this error is consistent with the specification, in that continued operation on the port after such an error may be unreliable.

Workaround: Intel recommends avoiding use of PCI Express devices that generate unexpected completion or malformed TLP protocol violations. If this is unavoidable, the receiver overflow error detected by the MCH may be escalated to a system event (e.g., SERR#) that prevents continued operation on the compromised link.

Status: For the steppings affected, see the Summary Table of Changes.

27. System marginalities may result in spurious link-down error events on power state changes

Problem: On system power state changes (S3, S4, and S5), PCI Express devices are placed in the D3 device power state by the operating system, which results in automatic negotiation with the MCH to enter the L1 link state. In systems where the cumulative noise present at the MCH receiver pins exceeds the MCH receiver threshold for detecting Electrical Idle, the transition into L1 may fail to complete normally, ultimately resulting in a spurious link-down error from the MCH. If the link-down error (D27:F0:R140h, bit 11) is escalated using a fatal system error (SERR#) mechanism, a blue screen may result on exposed systems.
The PCI Express specification for Electrical Idle at the receiver is 65 mV peak-to-peak differential, and characterization of the MCH indicates that some lanes on some devices are marginal with respect to this specification. While L1 failures should be exceedingly rare, Intel recognizes that this specification is difficult to meet and acknowledges the exposure.

Implication: Systems with sufficient noise at the MCH receivers and a BIOS profile that escalates the “link down error” as a fatal system event may be exposed to a blue screen on system power state transitions. Exposure to the error increases with the cumulative noise (platform noise + silicon noise) present at the MCH receivers when the link is in Electrical Idle. Systems utilizing a BIOS configuration that does not escalate the “link down error” as a fatal error are not exposed.

Custom operating systems or future operating systems that independently manage the power state of PCI Express devices outside the scope of system power state transitions would be similarly exposed to link-down errors via the same mechanism. In cases where the destination power state on the attached device is between D0 and D3, any such link-down event constitutes a real error from which software may only recover by fully reconfiguring the devices below the affected link.

Workaround: None.

Status: For the steppings affected, see the Summary Table of Changes.

28. Possible loss of Hot-swap Power Fault Event in dual PCI Express Hot-swap port configurations

Problem: During boot, as part of normal PCI enumeration, the external hot-swap expander device on the PCI Express hot-swap ports must be configured. This PCI enumeration proceeds on a per-device basis, during which an expander input change on the second port might get lost.
There is sufficient time between configuration of the first PCI Express hot-swap port and the second port for this erratum to happen, due to interaction between the internal hot-swap controller and the external hot-swap expander.

Implication: A power fault event, as an example, could occur on the second port before it is configured. In that case the power fault event is not reported.

Workaround: Force the configuration of both hot-swap controllers to occur back-to-back in time. This prevents any controller/expander traffic other than the configuration until both expanders have been configured, and ensures that the controller and the external expander are in agreement. In BIOS, make sure that setting the hot-swap capable bit for one of the hot-swap ports is followed immediately by setting the same bit for the other hot-swap port.

Status: For the steppings affected, see the Summary Table of Changes.

Specification Changes

There are no Specification Changes in this revision of the Specification Update.

Specification Clarifications

1. Clarification to Section 4.4.1, “Memory Remapping”, in the EDS

Section 4.4.1 currently reads as follows:

4.4.1 Memory Remapping

An incoming address (referred to as a logical address) is checked to see if it falls in the memory remap window. The bottom of the remap window is defined by the value in the REMAPBASE register. The top of the remap window is defined by the value in the REMAPLIMIT register. An address that falls within this window is remapped to the physical memory starting at the address defined by the TOLM register.

A clarification will be made to Section 4.4.1 by adding a second paragraph, after which the section will read as follows:

4.4.1 Memory Remapping

An incoming address (referred to as a logical address) is checked to see if it falls in the memory remap window.
The bottom of the remap window is defined by the value in the REMAPBASE register. The top of the remap window is defined by the value in the REMAPLIMIT register. An address that falls within this window is remapped to the physical memory starting at the address defined by the TOLM register.

The remap operation increases the latency of CPU-to-memory accesses (within the remap area) by three clocks when the pipeline between the CPU interface and memory is empty (the “idle latency” case). This may result in a measurable performance degradation within the remap range for latency-sensitive benchmarks that are run on lightly loaded systems. This latency difference disappears when latency-sensitive benchmarks are run on moderately to heavily loaded systems.

Documentation Changes

There are no Documentation Changes in this revision of the Specification Update.

1. Interrupt Redirection

The bit definition for the hardware interrupt redirection has been added. The following changes will be reflected in the next release of the Datasheet.

REDIRCTL - Redirection Control - (D8:F0)
Address Offset: 4C - 4Fh
Access:         R/W, RO
Size:           32 bits
Default Value:  0000_648Ch

Bit Field   Default & Access   Description
31:14       00001h             Reserved
13          1b R/W             Interrupt Redirection Algorithm (XTPR).
                               0 = LRU (least recently used within the lowest priority pool)
                               1 = highest number in lowest priority pool, default
12:0        048Ch              Reserved
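
As an illustration of the REDIRCTL bit layout in the table above, the following minimal Python sketch splits a 32-bit register value into the three listed fields and checks that the per-field defaults recombine to the stated default value of 0000_648Ch. The function and field names (decode_redirctl, set_xtpr, reserved_31_14, and so on) are hypothetical and chosen only for this example. The sketch performs no configuration-space access, and preserving the reserved bits on a write is an assumption of the example, not a requirement stated in this document.

```python
# Illustrative only: field boundaries come from the REDIRCTL table above;
# all names below are hypothetical and do not appear in the Datasheet.
REDIRCTL_DEFAULT = 0x0000_648C  # default value listed for REDIRCTL
XTPR_BIT = 13                   # Interrupt Redirection Algorithm (XTPR) bit


def decode_redirctl(value: int) -> dict:
    """Split a 32-bit REDIRCTL value into the three fields in the table."""
    if not 0 <= value <= 0xFFFF_FFFF:
        raise ValueError("REDIRCTL is a 32-bit register")
    return {
        "reserved_31_14": (value >> 14) & 0x3FFFF,  # 18 reserved bits
        "xtpr": (value >> XTPR_BIT) & 0x1,          # 0 = LRU, 1 = highest number
        "reserved_12_0": value & 0x1FFF,            # 13 reserved bits
    }


def set_xtpr(value: int, use_highest_number: bool) -> int:
    """Return value with only bit 13 changed (assumption: reserved bits are kept)."""
    cleared = value & ~(1 << XTPR_BIT) & 0xFFFF_FFFF
    return cleared | (int(use_highest_number) << XTPR_BIT)


fields = decode_redirctl(REDIRCTL_DEFAULT)

# Recombine the per-field defaults from the table: 00001h, 1b, 048Ch.
recombined = (0x00001 << 14) | (0x1 << XTPR_BIT) | 0x048C
```

With the default value, the decoded fields match the per-field defaults in the table (00001h, 1b, 048Ch), and clearing bit 13 to select the LRU algorithm leaves the other bits unchanged.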