BookL64364PG.fm5 Page i Friday, January 28, 2000 4:58 PM
L64364 ATMizer® II+
ATM-SAR Chip
Programming Guide
January 2000
Order Number R14012
This document contains proprietary information of LSI Logic Corporation. The
information contained herein is not to be used by or disclosed to third parties
without the express written permission of an officer of LSI Logic Corporation.
Document number DB15-000072-01, First Edition (January 2000)
This document is a guide for system programmers involved in the development
of application software for the LSI Logic L64364 ATMizer® II+ ATM-SAR Chip,
and will remain the official reference source for all revisions/releases of this
product until rescinded by an update.
To receive product literature, visit us at http://www.lsilogic.com.
LSI Logic Corporation reserves the right to make changes to any products herein
at any time without notice. LSI Logic does not assume any responsibility or
liability arising out of the application or use of any product described herein,
except as expressly agreed to in writing by LSI Logic; nor does the purchase or
use of a product from LSI Logic convey a license under any patent rights,
copyrights, trademark rights, or any other of the intellectual property rights of LSI
Logic or third parties.
Copyright © 1997 – 2000 by LSI Logic Corporation. All rights reserved.
TRADEMARK ACKNOWLEDGMENT
The LSI Logic logo design and ATMizer are trademarks or registered trademarks
of LSI Logic Corporation. All other brand and product names may be trademarks
of their respective companies.
Contents
Chapter 1  Introduction
  1.1  Hardware Overview
  1.2  Typical Application
  1.3  Software Overview
    1.3.1  Data Structures and Maintenance
    1.3.2  Host Messaging
    1.3.3  Scheduling
    1.3.4  Hashing Function
    1.3.5  Packet Aging
    1.3.6  Interrupt Handling
    1.3.7  OAM Cell Processing
    1.3.8  AAL3/4 Processing
    1.3.9  Initialization
    1.3.10  Operating Software
Chapter 2  Host Messaging
  2.1  Host Messaging Overview
  2.2  Buffer Processing
    2.2.1  Buffer Flow
    2.2.2  FIFO Location
    2.2.3  FIFO Contents
    2.2.4  FIFO Implementations
  2.3  Rings
    2.3.1  Ring Structure
    2.3.2  Ring Management
    2.3.3  Ring Implementation (Initialization)
Chapter 3  Scheduling
  3.1  Scheduling Invocation
    3.1.1  Line Recovered Clock Synchronization
    3.1.2  FIFO Full Synchronization
  3.2  Scheduler Commands
    3.2.1  SCD_Serv( ) Command
    3.2.2  SCD_Sched( ) Command
    3.2.3  SCD_Tic( ) Command
  3.3  The Scheduling Process
    3.3.1  A Simple Scheduling Function
    3.3.2  Scheduling Lag
    3.3.3  Rate Granularity
    3.3.4  Time Comparisons
    3.3.5  Stopping Connection Scheduling
    3.3.6  Race Conditions and Hazards
    3.3.7  Scheduling ABR Connections
  3.4  UBR Connections
    3.4.1  Managing the UBR List in Software
    3.4.2  Managing UBR Connections Using the Scheduler
  3.5  VBR Connections
    3.5.1  PCR-Based Implementation
    3.5.2  SCR-Based Implementation
  3.6  ABR Connections
  3.7  Local Congestion
    3.7.1  Fairness
    3.7.2  List Lengths
    3.7.3  Detecting a Local Congestion
    3.7.4  Minimum Cell Rate Guarantees
    3.7.5  MultiPHY Operation
  3.8  Source Code Listings
    3.8.1  Macros and Types Header File (uTypes.h)
    3.8.2  ATMizer II+ Header File (ATMizer2.h)
    3.8.3  ATMizer II+ Hardware Header File (Hdr.h)
    3.8.4  Extended Instructions Header File (Instr.h)
    3.8.5  ABR Functions Header File (ABR.h)
    3.8.6  TxCell() and RxCell() (Cell.c)
    3.8.7  Transmit a CBR Cell (CBR.c)
    3.8.8  Transmit a VBR Cell (VBR.c)
    3.8.9  Transmit and Receive ABR Cells (ABR.c)
Chapter 4  Unschedule
  4.1  Introduction
  4.2  Unschedule Routine
Chapter 5  Hashing Function
  5.1  Hashing Mechanism
  5.2  Hashing Function
  5.3  Hash Implementation
Chapter 6  Packet Aging
  6.1  Introduction
  6.2  Mailbox Processing
  6.3  Packet Aging Routine
Chapter 7  Interrupt Handling
  7.1  Introduction
  7.2  Nonvectored Interrupt Handler
  7.3  Vectored Interrupt Handler
    7.3.1  Enable Interrupts
    7.3.2  General Handler
    7.3.3  Individual Handlers
Chapter 8  OAM Cell Processing
  8.1  Introduction
  8.2  F4 OAM Flow
    8.2.1  Initialization of F4 Flow
    8.2.2  F4 Flow Transmit
    8.2.3  F4 Flow Receive
    8.2.4  Host Processing of F4 Flow
  8.3  F5 OAM Flow
Chapter 9  AAL3/4 Processing
  9.1  Introduction
  9.2  AAL3/4 Segmentation
  9.3  AAL3/4 Reassembly
Chapter 10  Initialization
  10.1  Initialization Overview
  10.2  Booting Procedures
    10.2.1  Default ATMizer II+ Chip Initialization
    10.2.2  Secondary Port EPROM Boot Sequence
    10.2.3  Cell Buffer Memory/Serial PROM Boot Sequence
    10.2.4  Cell Buffer Memory/Primary Port Boot Sequence
  10.3  C Preamble Execution
  10.4  CPU Initialization and Configuration
    10.4.1  Configuration and Cache Control Register
    10.4.2  Cache Configuration
    10.4.3  Dcache and D-RAM Configuration
    10.4.4  Dcache and C-RAM Usage
    10.4.5  Icache and I-RAM Configuration
    10.4.6  Icache and I-RAM Usage
  10.5  Configuration Header File
  10.6  Host PCI Access
    10.6.1  PCI Bus Configuration
    10.6.2  PCI Access to the ATMizer II+ Memory Space
  10.7  Memory Allocation
    10.7.1  Receive Direction
    10.7.2  Transmit Direction
    10.7.3  Connection Descriptors
    10.7.4  Buffer Descriptors
    10.7.5  Data Exchanging Blocks
    10.7.6  Related Issues
  10.8  Hardware Registers Initialization
    10.8.1  EDMA Registers
    10.8.2  Scheduler Registers
    10.8.3  ACI Registers
    10.8.4  Timer Registers
    10.8.5  APU Registers
  10.9  Data Structures Initialization
    10.9.1  VCD and ACD Initialization
    10.9.2  BFD Initialization
    10.9.3  Calendar Table Initialization
    10.9.4  Ring Initialization
    10.9.5  Free Cell List
    10.9.6  Miscellaneous Data Structures
Chapter 11  Operating Software
  11.1  Top Level Structure
  11.2  APU Program
    11.2.1  Cell Operation Flow
    11.2.2  Buffer Operation Flow
    11.2.3  Pseudocode
  11.3  Host Program
    11.3.1  Setting up a Configuration File
    11.3.2  Host Tasks
Customer Feedback
Figures
  1.1  L64364 Functional Block Diagram
  1.2  ATMizer II+ Application Development Platform Block Diagram
  1.3  Host Connection Descriptor Format
  1.4  Buffer Descriptor and Buffer Relationship
  1.5  ATMizer II+ Memory Organization
  1.6  Mailbox Entry Format
  1.7  Ring Message Format
  2.1  Buffer Descriptor Layout
  2.2  Transmit Flow
  2.3  Receive Flow
  2.4  FIFO Descriptor Declaration
  2.5  PutFifo() Routine
  2.6  GetFifo() Routine
  2.7  FIFO Operations
  2.8  Enhanced Fifo Descriptor Declaration
  2.9  CBM Layout
  2.10  TxFifo Descriptor Location
  2.11  APU TxFifo Descriptor Initialization
  2.12  Host TxFifo Descriptor Initialization
  2.13  Modified PutFifo() and GetFifo() Routines
  2.14  PutFifo() and GetFifo() without Rd Pointer Update
  2.15  Ring Descriptors Declaration
  2.16  CBM Ring Size
  2.17  Primary Memory Ring Size
  2.18  Ring Initialization
  2.19  Host PutRing() Call
  2.20  Host GetRing() Call
  2.21  APU Ring Initialization
  2.22  APU PutRing() Call
  2.23  APU GetRing() Call
  2.24  GetRing() and PutRing() Routines
  3.1  A Simple Scheduling Function
  3.2  Handling Scheduling Lag
  3.3  Connection Scheduled with Rate 0.3
  3.4  Calculating Fractional Service Time
  3.5  Handling Time Comparisons
  3.6  Stopping Connection Scheduling
  3.7  Buff Completion Queue Interrupt Handler
  3.8  Connection Rescheduling
  3.9  Race Conditions
  3.10  Interrupt Handler without Race Condition
  3.11  Resetting the Connection Scheduled Flag
  3.12  TxCell() Routine for Multiple Class Connections
  3.13  TxCell() Routine Handling Out-of-Rate Cells
  3.14  Implementing a UBR Connection List
  3.15  Managing UBR Lists with the Scheduler
  3.16  UBR_Send and CBR_Send Combined
  3.17  A Leaky Bucket Routine
  3.18  An SCR-Based Leaky Bucket Algorithm
  3.19  A MultiPHY TxCell()
  3.20  Enhanced MultiPHY Code
  3.21  Buff Completion Queue Interrupt Handler for MultiCalendar Support
  4.1  Unschedule Routine
  5.1  Hashing Table Declarations
  5.2  Hashing Table Initialization
  5.3  Find Prime Routine
  5.4  Inserting a Connection into the Hashing Table
  6.1  HCD_Rx Structure Declarations
  7.1  Nonvectored Interrupts General Handler
  7.2  General Handler Exit to PMON
  7.3  Vectored Interrupts Enabling Routine
  7.4  Vectored Interrupts General Handler
  7.5  IntRxMbx Interrupt Handler
  8.1  OAM Cell Declarations
  8.2  OAM Flow Connection Information
  8.3  OAM Cell Initialization
  8.4  OAM Cell Header Formation
  8.5  OAM_Send() Routine
  8.6  APU OAM_Receive() Routine
  8.7  Host OAM_Receive() Routine
  9.1  AAL3/4 Cell Layout
  9.2  ACD_Ctrl_t Structure
  9.3  SAR_PDU Header Declarations
  9.4  AAL34_Send() Routine
  9.5  AAL34_Receive() Routine
  10.1  CCC Register and SDRAM Controller Initialization
  10.2  Serial Boot Routine
  10.3  Sample Initialization Code
  10.4  CCC Register Layout
  10.5  Tag Test Mode Loaded Data Format
  10.6  Data RAM Configuration Code
  10.7  Separating the Code with the Linker Script
  10.8  Main Loop Example
  10.9  Setting and Loading IRAM
  10.10  PCI Configuration Space Registers
  10.11  PCI Configuration Address Format
  10.12  Programming the Latency Timer in the PCI Configuration Register
  10.13  Allocating Memory to Data Structures
  10.14  ATMizer Code Size Calculation
  10.15  Memory-T Variables Initialization
  10.16  Updating Memory Pointers
  10.17  Loc_BuffPCI and Loc_BuffSec Format
  10.18  Initializing EDMA Registers
  10.19  Initializing Scheduler Registers
  10.20  Initializing ACI Registers
  10.21  Cascading Timers for a Long Watchdog Timeout
  10.22  Clearing the APU_Reset Bit
  10.23  Clearing VCD Fields
  10.24  Initializing BFDs
  10.25  Clearing the Calendar Table
  10.26  Initializing Host Rings
  10.27  Initializing APU Rings
  10.28  Free Cell List Initialization
  11.1  A Typical Configuration File
  11.2  Opening Connections
Tables
  1.1  Host Connection Descriptor Fields
  1.2  Data Transfer Modes
  1.3  Data Exchange with Host DMA
  1.4  Data Exchange without Host DMA
  1.5  Mailbox Messages
  1.6  Statistics Result Fields
  2.1  FIFOs between the Host and APU
  2.2  Three-Way Messaging, Transmit Direction
  2.3  Three-Way Messaging, Receive Direction
  2.4  Two-Way Messaging, Transmit Direction
  2.5  Two-Way Messaging, Receive Direction
  3.1  Time Comparisons
  3.2  Simulation Results for Class 0 and 1 Connections
  3.3  Simulation Results for Class 2 Connections
  3.4  Calendar List Length for Varying Link Utilizations
  3.5  Initial Setup for MultiPHY Connections
  3.6  PHY 0 Statistics at 155 Mbps with a Single Calendar
  3.7  PHY 0 Statistics at 155 Mbps with Multiple Calendars
  7.1  Nonvectored Interrupt Sources
  7.2  Vectored Interrupt Sources
  7.3  General Register Map
  10.1  Data Section Allocation
  10.2  Configuration Header File Contents
  10.3  PCI Virtual Address vs. Base Addresses
  10.4  ATMizer II+ External Memory Map
  10.5  Secondary Bus Memory Map
  10.6  BFD Number Allocation
  10.7  Buffer Location in Secondary Memory
  10.8  ATMizer II+ Hardware Registers to be Initialized
  10.9  Data and BFD Transfer Modes
  10.10  ACI Control Register Initialization
  10.11  External Vectored Interrupts
  10.12  Required Open Connection Parameters
  10.13  ACD Field Calculations
Preface
This document provides information for system programmers and
developers who have a need to evaluate or program the L64364
ATMizer® II+ ATM-SAR Chip.
Audience
This document assumes that you have some familiarity with ATM,
microprocessors and related support devices. The people who benefit
from this book are:
• engineers and managers who are evaluating the processor for
  possible use in a system, and
• software engineers who are designing the processor into a system.
Organization
This document has the following chapters and appendices:
• Chapter 1, Introduction, describes the general characteristics and
  features of the L64364 ATMizer II+ ATM-SAR chip, describes the
  data structures used by the chip, and provides an overview of the
  typical software.
• Chapter 2, Host Messaging, describes the exchange of control
  information between the host and the ATMizer II+ chip. It also
  discusses methods of transferring information over the PCI bus.
• Chapter 3, Scheduling, describes the scheduling process and its
  implementation in hardware by the ATMizer II+ chip, and includes
  sample scheduling code.
• Chapter 4, Unschedule, describes the unscheduling process and its
  implementation in hardware by the ATMizer II+ chip, and includes
  sample unscheduling code.
• Chapter 5, Hashing Function, describes the hashing mechanism
  and its implementation in the ATMizer II+ chip with sample code.
• Chapter 6, Packet Aging, presents an overview of the packet aging
  process and its relationship to the host processor.
• Chapter 7, Interrupt Handling, describes the external interrupts
  and resets, their interaction with the ATMizer II+ chip, and includes
  sample code for interrupt handlers.
• Chapter 8, OAM Cell Processing, describes the handling of F4 and
  F5 Operations and Management Cells by the ATMizer II+ chip.
• Chapter 9, AAL3/4 Processing, describes ATM Adaptation Layer
  3/4 and how the ATMizer II+ chip’s application code can be modified
  to support AAL3/4 processing.
• Chapter 10, Initialization, describes initialization, configuration, and
  booting procedures to prepare the ATMizer II+ chip for programming.
• Chapter 11, Operating Software, describes typical APU and host
  operations and includes code segments.
Related Publications
• L64364 ATMizer® II+ ATM-SAR Chip Technical Manual,
  LSI Logic Corporation, Order Number R14008.
• L64364 ATMizer® II+ Application Development Platform User’s
  Guide, Revision 1.0, Preliminary.
• ATM Forum Traffic Management Specifications
• MIPS Programmer’s Handbook
Conventions Used in This Manual
The first time a word or phrase is defined in this manual, it is italicized.
The word assert means to drive a signal true or active. The word
deassert means to drive a signal false or inactive.
Hexadecimal numbers are indicated by the prefix “0x” —for example,
0x32CF. Binary numbers are indicated by the prefix “0b” —for example,
0b0011.0010.1100.1111.
Chapter 1
Introduction
The application code developed in this manual is provided as a design
example. It is distributed with the expectation that it will be useful, but
without warranty of any kind. The code may be changed without further
notice.
The code supplied with your ATMizer II+ chip or Application Development
Platform (ADP) was initially written for the ADP. You will need to modify
it appropriately for your system design.
This chapter contains the following sections:
• Section 1.1, “Hardware Overview”
• Section 1.2, “Typical Application”
• Section 1.3, “Software Overview”
1.1 Hardware Overview
The L64364 ATMizer II+ ATM-SAR chip provides 155 Mbits/s of
full-duplex operation while performing segmentation and reassembly
(SAR) of ATM Adaptation Layer 5 (AAL5) Convergence Sublayer Protocol
Data Units (CS-PDUs). Refer to the block diagram in Figure 1.1.
A specialized, hardwired AAL5 protocol SAR engine, called the
Enhanced DMA (EDMA), assists the MIPS-based ATM Processing Unit
(APU) in segmentation and reassembly tasks and memory management
functions. Although the EDMA is responsible for all basic segmentation
and reassembly functions, it operates under full control of the APU.
The APU is responsible for traffic management, host messaging, and any
other upper layer tasks. As an option, the advanced functions of the
hardwired units may be switched off to give the APU full control of all
operations. However, this impacts overall performance.
The APU is based on the LSI Logic MIPS II compatible CW4011 RISC
microprocessor core. The processor delivers 160 MIPS peak (110 MIPS
sustained) when operating at 80 MHz. The APU instruction set is
extended with ATM-specific instructions to enhance performance. These
instructions accelerate the cell rate calculations for Available Bit Rate
(ABR) services by allowing direct arithmetic operations (add, subtract
and multiply) on rates expressed as ATM Forum floating point 15-bit
numbers.
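The ATM Forum rate format mentioned above can also be decoded in plain C where the extended instructions are not available, for example in host-side tools. The sketch below assumes the layout commonly given for the ATM Forum Traffic Management rate format: a nonzero flag nz in bit 14, a 5-bit exponent e in bits 13 to 9, and a 9-bit mantissa m in bits 8 to 0, with rate = 2^e × (1 + m/512) × nz cells/s. The function name af_rate_to_cps is hypothetical and is not part of the code supplied with the chip.

```c
#include <stdint.h>

/* Decode an ATM Forum floating-point rate into cells per second.
 * Assumed layout: bit 14 = nz, bits 13..9 = e, bits 8..0 = m.
 * Bit 15 is ignored. */
double af_rate_to_cps(uint16_t r)
{
    unsigned nz = (r >> 14) & 0x1u;   /* nonzero flag      */
    unsigned e  = (r >> 9)  & 0x1Fu;  /* 5-bit exponent    */
    unsigned m  = r & 0x1FFu;         /* 9-bit mantissa    */

    if (nz == 0u)
        return 0.0;                   /* rate is zero by definition */

    /* 2^e * (1 + m/512); e is at most 31, so the shift is safe
     * for a 32-bit unsigned value. */
    return (double)(1UL << e) * (1.0 + (double)m / 512.0);
}
```

For instance, a value with nz = 1, e = 1, and m = 256 decodes to 2 × 1.5 = 3 cells/s.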
Scheduling and policing of different ATM Quality of Service (QoS)
connections can be achieved efficiently with the help of the integrated
hardware Scheduler. The Scheduler supports six priority classes. It uses
calendar tables to create arbitrary traffic schemes to a limit of 64 K
Virtual Connections.
The ATM Cell Interface (ACI) handles the transfer of cells between the
CBM and the Utopia Port. The Utopia Port complies with The ATM Forum
Utopia Level 2, v1.0, multi-PHY specification. The port operates at 50
MHz with 8-bit data buses and cell-level handshaking.
The Timer Unit includes a set of hardware timers and registers that
provide real-time events for the APU. There are seven general-purpose
timers and a TimeStamp Counter implemented in a set of registers. The
start count of each general-purpose timer can be set, and the timers can be
cascaded for longer timed intervals. The input clock to each timer is
individually selectable between an external input and the L64364 system
clock.
The primary host interface for the device is a 33 MHz, 32-bit wide
Peripheral Component Interconnect (PCI) bus. As the bus master, the
L64364 is able to autonomously access control and data structures
located in the system memory. As a bus slave, the device provides
transparent access to secondary memory and to the internal Cell Buffer
Memory (CBM) for external PCI bus masters. The PCI interface
implements four separate FIFOs to maximize the performance of
simultaneous read/write operations as bus master or slave.
The L64364 integrates a Secondary Bus memory controller that provides
a glueless interface for asynchronous SRAMs, synchronous SRAMs and
synchronous DRAMs for secondary memory. It can also serve as an
interface to external physical layer devices such as framers. The memory
controller allows APU booting from parallel, byte-wide EPROMs, and
from serial EPROMs.
The device includes a JTAG controller and boundary scan logic to
simplify board-level tests.
Figure 1.1 L64364 Functional Block Diagram
[Block diagram. Blocks shown: PCI Interface (Primary Port, PCI Bus),
Secondary Bus Memory Controller (Secondary Port, Local Bus), Clock PLL
(Clock In), JTAG Controller, ATM Processing Unit, 8 KB Instruction Cache,
4 KB Data Cache, Enhanced DMA, 4 KB Cell Buffer Memory, Scheduler Unit,
Timer Unit, and ATM Cell Interface (Utopia Bus).]
1.2 Typical Application
Figure 1.2 shows a block diagram of LSI Logic Corporation’s ATMizer II+
Application Development Platform (ADP). The main features of the
system are:
• A MIPS 4011 RISC processor from LSI Logic working as the control
  (Host) processor, running at 80 MHz.
• Two ATMizer II+ ATM-SAR devices, with embedded MIPS 4011
  RISC processors, running at up to 80 MHz.
• The host CPU and the two ATMizer II+ SAR devices interface with each
  other over a 33 MHz, 32-bit PCI bus, compliant with the PCI Local Bus
  Specification, Version 2.1.
• The host CPU interfaces with the PCI bus through a PCI bridge chip from
  V3 Semiconductor.
• Extra PCI motherboard connectors for supporting up to two
  additional PCI cards in the system.
• All three processors execute the PROM-based debug monitor (PMON)
  from LSI Logic, which provides a command-line user interface over RS232
  serial ports.
• 10BASE-T Ethernet interface with the host CPU, with Trivial File Transfer
  Protocol (TFTP) support.
• Four ATM physical layer devices (PHY) supporting the multiPHY
  Utopia Level 2 functionality.
• Utopia bus configuration (ATM-PHY interface) configurable through
  front panel DIP switches.
• Utopia frequency configurable through on-board jumpers; default is
  33 MHz.
• All important nets in the design available on headers for probing and
  logic analysis.
The external interfaces with the ADP system include:
• Six RS232C serial ports, two per processor. One port is used for the
  command-line user interface, and the other port is used for code
  downloading.
• 10BASE-T Ethernet interface with the host CPU. The PMON monitor
  program on the host CPU supports TFTP for data transfer over Ethernet.
• Four OC-3 (SONET) ATM-UNI interfaces for ATM traffic. The
  aggregate throughput is limited by the ATMizer II+ Utopia interface.
• Front panel DIP switches for system configuration.
• Front and back panel LED indicators.
• 110/240 V, 50/60 Hz AC power supply.
Figure 1.2 ATMizer II+ Application Development Platform Block Diagram
[Block diagram in two parts. Host side: LR4500 host CPU, host CPU
interface controller EPLD, local SDRAM (1M x 64), SONIC Ethernet
controller, SCN2681 DUART, Am29F040 flash (512K x 8), 82C55 PIO, PCI
bridge controller EPLD, V292PBC PCI bridge, PCI arbiter EPLD, shared
ASRAM (128K x 32), and a PCI bus carrying the first and second
ATMizer II+ SARs and two spare PCI connectors. SAR side: ATMizer II+
ATM-SAR with XC1736D serial PROM and SAR controller EPLD; a secondary
bus with local SDRAM (1M x 32), local SSRAM (32K x 32), local ASRAM
(128K x 32), Am29F040 flash (512K x 8), 82C54 timer, and SCN2681 DUART;
and a Utopia bus with S/UNI-LITE and CY7C955 ATM-UNI PHY devices, SONET
optical transceivers, a zero delay buffer, a Utopia controller EPLD, and
a link to the Utopia bus of the other SAR.]
1.3 Software Overview
This section describes the data structures used for the ATMizer II+ chip
and summarizes the contents of the remaining chapters in this manual.
1.3.1 Data Structures and Maintenance
The following sections describe the functions, format, and maintenance
of connection numbers, Virtual Connection Descriptors (VCDs), APU
Connection Descriptors (ACDs), Host Connection Descriptors (HCDs),
Buffers, and Buffer Descriptors (BFDs).
1.3.1.1 Connection Numbers
Ideally, the host processor knows which connection number to use when it
opens a connection: it reuses the most recently freed connection number,
if there is one, for the next connection it opens. In the receive
direction, the APU reads the cell header and uses a hashing table
mechanism to determine which connection the cell belongs to. In the
transmit direction, the host inserts a connection number into the
VPI/VCI fields in the cell header. Connection numbers are limited to
the range of 0 to MAX_CON_NUM − 1. OAM cells are processed by
using predefined connection numbers that are different from those
associated with regular data flow. All connections are statically opened
during the initialization period.
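The reuse rule described above (hand out the most recently freed connection number first) amounts to a last-in, first-out free list. The following is a minimal sketch under stated assumptions: the helper names ConNum_Init(), ConNum_Alloc(), and ConNum_Free() and the value of MAX_CON_NUM are illustrative only and are not taken from the supplied code.

```c
#include <assert.h>

#define MAX_CON_NUM 8      /* illustrative; the real limit is system-specific */
#define NO_CON      (-1)   /* returned when no connection number is free      */

static int free_stack[MAX_CON_NUM];  /* stack of free connection numbers */
static int free_top;                 /* number of entries on the stack   */

/* Mark every connection number as free. Numbers are pushed from high
 * to low so that 0 is handed out first. */
void ConNum_Init(void)
{
    free_top = 0;
    for (int n = MAX_CON_NUM - 1; n >= 0; n--)
        free_stack[free_top++] = n;
}

/* Pop the most recently freed connection number, or NO_CON if none. */
int ConNum_Alloc(void)
{
    return (free_top > 0) ? free_stack[--free_top] : NO_CON;
}

/* Return a connection number to the free stack. The caller must not
 * free the same number twice; this sketch does not detect that. */
void ConNum_Free(int n)
{
    assert(n >= 0 && n < MAX_CON_NUM && free_top < MAX_CON_NUM);
    free_stack[free_top++] = n;
}
```

A host that closes connection 0 and then opens a new connection would get 0 back from ConNum_Alloc() before any number that has been free for longer.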
1.3.1.2 Virtual Connection Descriptors
VCDs store control information about virtual connections and are
typically created when connections are established. They are initialized
by the APU and managed automatically by the EDMA. Only the EDMA
and the Scheduler can access them during normal operation. VCDs can
be located in secondary memory and/or in CBM. Generally, the VCDs
are located in a non-cacheable area of secondary memory to keep them
consistent, since multiple modules can access and modify their
contents. Refer to the L64364 ATMizer II+ ATM-SAR Chip Technical
Manual for details.
1.3.1.3 Host Connection Descriptors
A Host Connection Descriptor (HCD) contains parameters required by
the APU for an open connection operation plus some other fields
necessary for the host. Parameters needed by the APU depend on the
connection traffic class. The HCDs are located in the 8 Mbyte host
private memory. The host initializes and maintains an array containing
one HCD per requested connection. The HCD size is 128 bytes.
Figure 1.3 shows the format of the descriptor and Table 1.1 describes the
fields of the descriptor.
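Figure 1.3 Host Connection Descriptor Format
(Each row is one 32-bit word, bits 31 to 0. Offsets are in bytes.)

Offset  Field                      Group
0       Connection Number          Host --> APU fields
4       VCD_Ctrl                   Host --> APU fields
8       Cell Header (reserved)     Host --> APU fields
12      Crc32                      Host --> APU fields
16      PCR                        Host --> APU fields
20      SCR (VBR) or MCR (ABR)     Class dependent fields
24      MBS (VBR) or ICR (ABR)     Class dependent fields
28      TBE                        Class dependent fields
32      FRTT                       Class dependent fields
36      Status                     Host maintenance fields
40      BytesRec                   Host maintenance fields
44      BytesSent                  Host maintenance fields
48      BadBuff                    Host maintenance fields
52      PDUSize                    Host maintenance fields
56      StartTime                  Host maintenance fields
60      TimeStamp                  Host maintenance fields
64      BFD_HT                     Host maintenance fields
68      Head_PDU                   Host maintenance fields
72      Curr_PDU                   Host maintenance fields
76      Tail_PDU                   Host maintenance fields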
Table 1.1 Host Connection Descriptor Fields

Name       Addr  Class  Description                                     Init
ConNum     0     All    Connection Number                               Yes
VCD_Ctrl   4     All    VCD_Ctrl field                                  Yes
Reserved   8     All    Cell header (implemented later)                 Yes
Crc32      12    All    CRC32 for AAL0 mode                             Yes
PCR        16    All    Peak Cell Rate in cells/s, 24-bit integer       Yes
MCR        21    ABR    Minimum Cell Rate in cells/s, 24-bit integer    Yes
SCR        22    VBR    Sustained Cell Rate in cells/s, 24-bit integer  Yes
ICR        25    ABR    Initial Cell Rate in cells/s, 24-bit integer    Yes
MBS        26    VBR    Maximum Burst Size                              Yes
TBE        29    ABR    Transient Buffer Exposure                       Yes
FRTT       33    ABR    Fixed Round-Trip Time                           Yes
Status     36    Host   Connection Status                               Closed
BytesRec   40    Host   Number of bytes received                        0
BytesSent  44    Host   Number of bytes sent                            0
BadBuff    48    Host   Number of bad buffers received                  0
PDUSize    52    Host   Size of PDU                                     0
StartTime  56    Host   Start time of connection                        0
TimeStamp  60    Host   TimeStamp of last received BFD                  0
BFD_HT     64    Host   Head and tail of BFD list of PDU                0
Head_PDU   68    Host   Head of PDU list                                0
Curr_PDU   72    Host   Current PDU being sent to APU                   0
Tail_PDU   76    Host   Tail of PDU list                                0
The APU only needs the first 32 bytes of information in the HCD. Before
the host issues an open connection command, it copies the first 32 bytes
in the HCD for the ready-to-open connection from the host private
memory to the primary memory. The open command tells the APU the
memory location of these bytes. See Section 1.3.2.1, “Mailbox,” for
details.
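The copy step described above can be sketched in C as follows. This is an illustration under stated assumptions, not the supplied firmware: the type name HCD_t, the function HCD_CopyForOpen(), and the choice of one 32-bit word per field follow the word offsets shown in Figure 1.3 and are otherwise hypothetical. The sketch copies exactly the 32 bytes that the text names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HCD_SIZE      128u  /* HCD size stated in the text         */
#define HCD_APU_BYTES 32u   /* leading bytes copied for the APU    */

/* Assumed layout: one 32-bit word per field, at the byte offsets
 * shown in Figure 1.3. The trailing pad fills the HCD to 128 bytes. */
typedef struct {
    uint32_t ConNum;      /*  0 */
    uint32_t VCD_Ctrl;    /*  4 */
    uint32_t CellHeader;  /*  8, reserved */
    uint32_t Crc32;       /* 12 */
    uint32_t PCR;         /* 16 */
    uint32_t SCR_MCR;     /* 20, class dependent */
    uint32_t MBS_ICR;     /* 24, class dependent */
    uint32_t TBE;         /* 28 */
    uint32_t FRTT;        /* 32 */
    uint32_t Status;      /* 36, host maintenance fields from here on */
    uint32_t BytesRec;    /* 40 */
    uint32_t BytesSent;   /* 44 */
    uint32_t BadBuff;     /* 48 */
    uint32_t PDUSize;     /* 52 */
    uint32_t StartTime;   /* 56 */
    uint32_t TimeStamp;   /* 60 */
    uint32_t BFD_HT;      /* 64 */
    uint32_t Head_PDU;    /* 68 */
    uint32_t Curr_PDU;    /* 72 */
    uint32_t Tail_PDU;    /* 76 */
    uint8_t  pad[HCD_SIZE - 80u];
} HCD_t;

/* Compile-time checks that the struct matches the assumed offsets. */
_Static_assert(sizeof(HCD_t) == HCD_SIZE, "HCD must be 128 bytes");
_Static_assert(offsetof(HCD_t, Status) == 36, "Status at offset 36");
_Static_assert(offsetof(HCD_t, Tail_PDU) == 76, "Tail_PDU at offset 76");

/* Copy the leading HCD bytes from host private memory to a location
 * in primary memory whose address is then passed in the open command. */
void HCD_CopyForOpen(void *primary_mem, const HCD_t *hcd)
{
    memcpy(primary_mem, hcd, HCD_APU_BYTES);
}
```

The static assertions make any mismatch between the struct and the assumed offsets a compile-time error rather than a silent layout bug.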
The rest of the fields in the HCD are for the host’s internal maintenance.
For each connection the host needs to know the:
Status – This field gives the status of the connection.
CLOSED – The connection is closed (the SAR acknowledged the
close request).
REQ_OPEN – The host requests the SAR to open the connection.
OPEN – The SAR acknowledges the open request from the host.
REQ_CLOSED – The host requests the SAR to close the
connection.
BytesRec – This field stores the number of bytes received. It is updated
each time a buffer is received from the SAR.
BytesSent – This field stores the number of bytes sent. It is updated
each time a buffer is sent to the SAR.
BadBuff – This field stores the number of bad buffers received. It is
updated according to the BFD_Ctrl field each time a buffer is
received from the SAR. This is included on the statistics display.
PDUSize – This is the size of the PDU accumulated by the host from the
APU.
StartTime – This field contains the timer value at the time the connection was opened. It is used to calculate the actual transmission rate
for that connection.
TimeStamp – The timestamp of the last received BFD from the APU.
Head_PDU, Curr_PDU and Tail_PDU – The pointers to the PDU list
attached to the HCD to be sent to the APU.
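The four Status values listed above form a simple request/acknowledge cycle. The sketch below shows one way the host might guard the transitions; the ST_ prefixes, the event names, and HCD_NextStatus() are hypothetical and are used here only to make the cycle explicit.

```c
#include <stdbool.h>

/* Status values from the text, with an ST_ prefix added here. */
typedef enum {
    ST_CLOSED,      /* SAR acknowledged the close request */
    ST_REQ_OPEN,    /* host asked the SAR to open         */
    ST_OPEN,        /* SAR acknowledged the open request  */
    ST_REQ_CLOSED   /* host asked the SAR to close        */
} HCD_Status_t;

/* Hypothetical events that move a connection between states. */
typedef enum {
    EV_HOST_OPEN,
    EV_SAR_ACK_OPEN,
    EV_HOST_CLOSE,
    EV_SAR_ACK_CLOSE
} HCD_Event_t;

/* Apply an event. Returns true and updates *s if the transition is
 * legal; returns false and leaves *s unchanged otherwise. */
bool HCD_NextStatus(HCD_Status_t *s, HCD_Event_t ev)
{
    switch (*s) {
    case ST_CLOSED:
        if (ev == EV_HOST_OPEN)     { *s = ST_REQ_OPEN;   return true; }
        break;
    case ST_REQ_OPEN:
        if (ev == EV_SAR_ACK_OPEN)  { *s = ST_OPEN;       return true; }
        break;
    case ST_OPEN:
        if (ev == EV_HOST_CLOSE)    { *s = ST_REQ_CLOSED; return true; }
        break;
    case ST_REQ_CLOSED:
        if (ev == EV_SAR_ACK_CLOSE) { *s = ST_CLOSED;     return true; }
        break;
    }
    return false;
}
```

Rejecting out-of-order events keeps the host from, for example, sending buffers on a connection that the SAR has not yet acknowledged as open.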
1.3.1.4 APU Connection Descriptors
ACDs are used by the APU to hold connection related parameters. They
are only accessible by the APU and can be located in secondary
memory, CBM, Dcache and/or D-RAM. There are different ACDs for
different types of connections. To reduce access time, the size of an
ACD should not exceed 32 bytes. The APU must initialize the
necessary fields in the ACDs for each connection. At connection setup,
the APU fetches these parameters from a known location in the host’s
primary memory and manipulates them internally. Refer to Chapter 3,
Scheduling, for details. The calculation of the ACD is described in
Table 10.13.
1.3.1.5 Buffers
Buffers hold actual data. The buffer data is transferred by the EDMA and
generated/consumed by the host. Buffers can be located in secondary
memory and/or primary memory. To simplify buffer
management, buffer memory is allocated at initialization time. In this
static memory allocation scheme, the free buffers are managed as a
queue (stack) of pointers to the preallocated memory.
1.3.1.6 Buffer Descriptors
Buffer Descriptors (BFDs) hold control information about buffers and are
attached to VCDs when buffers are segmented or reassembled (see
Figure 1.4). BFDs are mainly accessed by the EDMA. The APU may also
access them when a more specialized buffer memory management scheme
is used. BFDs can be located in secondary memory and/or primary
memory. Both the host and the APU have their own buffers. Refer to the
L64364 ATMizer II+ ATM-SAR Chip Technical Manual for more details.
Figure 1.4 Buffer Descriptor and Buffer Relationship
[Diagram: a buffer list of BFDs, each pointing through pBuffData to a
buffer that holds the actual data. Buffer Descriptor fields shown:
BFD_Ctrl, BFD_UU, ConNum, BuffSize, NextBFD, pBuffData_PCI,
BFD_FreeSel, and pBuffData_Sec.]
1.3.1.7 Buffer, PDU, and BFD Maintenance
Data communication between the host and the APU is performance-critical
because of the memory accesses involved. Figure 1.5 briefly illustrates
the memory organization of the ATMizer II+.
Figure 1.5 ATMizer II+ Memory Organization
[Diagram: Primary Memory, the L64364, and Secondary Memory.]
Table 1.2 lists the different modes of transferring buffers and BFDs
between primary memory (PM), secondary memory (SM), and cell buffer
memory (CBM).
Table 1.2 Data Transfer Modes

Mode Type       Description
Cell Mode       Individual cells are exchanged between the CBM and PM or SM.
Packet Mode     Packets are exchanged between the SM and PM using the
                EDMA move processor.
BFD Far Mode    BFDs are located in PM.
BFD Local Mode  BFDs are located in SM.
BFD Copy Mode   BFDs are copied between PM and SM.
There are two methods of exchanging the cells/packets and BFDs
between memories. The first method is to let the host DMA write the
transmitting cells/packets and BFDs to secondary memory and let the
EDMA write the received cells/packets and BFDs back to primary
memory. Since write operations through the PCI Bus are always faster
than read operations, this will save PCI transmission time, given the
assumption that the host has DMA capability.
The second method assumes that the host does not have DMA
capability. The EDMA performs the data exchange between primary
memory and secondary memory and the operation is transparent to the
ATMizer II+ and the host. The ADP uses this method. Also, the firmware
operates in both Cell mode and Packet mode. The data exchanges are
summarized in Table 1.3 and Table 1.4.
Table 1.3 Data Exchange with Host DMA

Direction   Cell Mode                      Packet Mode
Tx          Host writes to SM (Optimum)    Move processor reads from PM
Rx          Host reads from SM             Move processor writes to PM
                                           (Optimum)
Table 1.4 Data Exchange without Host DMA

Direction   Cell Mode                      Packet Mode
Tx          Tx processor reads from PM     Move processor reads from PM
                                           (Optimum)
Rx          Host reads from PM             Move processor writes to PM
                                           (Optimum)
The location of the BFDs is based on how the following configuration
fields in the EDMA_Ctrl register are defined:
• EDMA_TxBFD_Far
• EDMA_TxBFD_Copy
• EDMA_RxBFD_Far
• EDMA_RxBFD_Copy
Refer to the L64364 ATMizer II+ ATM-SAR Chip Technical Manual for
details.
1.3.1.8 Calendar Table
The Calendar Table is a cell slot array managed by the Scheduler. Each
entry in the Calendar Table corresponds to one cell slot and contains
connection numbers of VCs to be serviced in that slot. It is implemented
in secondary memory. Refer to the L64364 ATMizer II+ ATM-SAR Chip
Technical Manual for details.
1.3.2 Host Messaging
The APU and host pass messages to each other in one of the following
ways:
• Mailbox: command and feedback exchanges
• Ring: buffer number exchanges
1.3.2.1 Mailbox
The external primary port bus master (the host) issues statistics and
connection commands to the ATMizer II+ through the PCI Mailbox (built
into the ATMizer II+ chip). When the command actions are completed, the
APU sends an acknowledgment back to the host by writing it into a
predefined memory location. The content of the acknowledgment is
exactly the same as that of the command. This fixed memory location
functions like a mailbox. Note that, to reduce PCI bus traffic, the
acknowledgment is not written into the host PCI Mailbox. Refer to
the L64364 ATMizer II+ ATM-SAR Chip Technical Manual for the detailed
functionality and structure of the Mailbox.
Mailbox Message Data Structure – Each entry in the Mailbox is a
32-bit word. The general format of this word is shown in Figure 1.6. Only
the lower 16 bits (bits 15 - 0) are used to decode the type of the
message. The rest of the bits are message-type dependent. All the
messages are defined in Table 1.5.
Figure 1.6 Mailbox Entry Format

Bits 31:16          Bits 15:0
Message Dependent   Message Type

Table 1.5 Mailbox Messages

Command                 Bits [15:0]   Bits [31:16]
Get Statistics Report   0x0001        Target Address
Open Connection         0x0002        Address to the Host Connection
                                      Descriptor
Close Connection        0x0003        Connection Number
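As an illustration of the entry format in Figure 1.6 and the codes in
Table 1.5, the following minimal sketch packs and unpacks a Mailbox
word. The function and enumerator names are hypothetical (they do not
come from the ATMizer II+ firmware), and fixed-width types from
<stdint.h> are assumed to be available.

```c
#include <assert.h>
#include <stdint.h>

/* Message type codes from Table 1.5 (enumerator names are illustrative). */
enum {
    MBX_GET_STATS  = 0x0001,
    MBX_OPEN_CONN  = 0x0002,
    MBX_CLOSE_CONN = 0x0003
};

/* Build a Mailbox word: bits 31:16 message dependent, bits 15:0 type. */
uint32_t mbx_pack(uint16_t type, uint16_t arg)
{
    return ((uint32_t)arg << 16) | type;
}

/* Extract the message type (bits 15:0). */
uint16_t mbx_type(uint32_t msg)
{
    return (uint16_t)(msg & 0xFFFFu);
}

/* Extract the message-dependent field (bits 31:16). */
uint16_t mbx_arg(uint32_t msg)
{
    return (uint16_t)(msg >> 16);
}

/* The acknowledgment repeats the command word, so the host can
 * compare the predefined acknowledgment location with what it sent. */
int mbx_is_acked(uint32_t cmd, uint32_t ackWord)
{
    return cmd == ackWord;
}
```

For example, mbx_pack(MBX_CLOSE_CONN, 0x0012) yields 0x00120003, a
close command for connection number 0x12.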
Connection Commands – The open connection command transfers the
host-negotiated/determined connection parameters to the ATMizer II+ in
the following manner. The host specifies an address pointer pointing to
a block of memory, copies the part of the HCD associated with that
connection to this memory block, and then tells the APU the starting
address of this block in the open connection command message. The
APU can then read the desired parameters of the connection from the
HCD. The size of this memory block is fixed at 32 bytes but the content
varies according to the traffic class of the connection. Refer to
Section 1.3.1.3 for details.
Generally, the host statically opens all connections, one by one, during
the initialization period. After retrieving an open connection command,
the APU:
• initializes the VCD for that connection,
• reads the parameters from the address-given memory block, and
• calculates the associated ACD.
For each connection, the APU needs to open two connections, a transmit
(Tx) and a receive (Rx) connection. The connection number for the Tx
connection is in the parameter memory block.
If the MAXIMUM_CONNECTION_NUMBER is 1 K, the APU simply
assigns the Tx connection number + 1 K as the Rx connection number.
No specific parameters are needed for the Rx connection. After the
creation of both connections, the APU acknowledges to the host by
repeating the command message back to the predefined primary
memory location. When the host receives the acknowledgment, it
continues with the next open connection command and so on.
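The Tx/Rx connection number pairing described above can be sketched as
follows. This is only an illustration under the 1 K assumption used in
the text; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 1 K connections, matching the example in the text. */
#define MAXIMUM_CONNECTION_NUMBER 1024u

/* Rx connection number paired with a given Tx connection number. */
uint16_t rx_con_num(uint16_t txConNum)
{
    assert(txConNum < MAXIMUM_CONNECTION_NUMBER);
    return (uint16_t)(txConNum + MAXIMUM_CONNECTION_NUMBER);
}

/* Tx connection number recovered from an Rx connection number. */
uint16_t tx_con_num(uint16_t rxConNum)
{
    assert(rxConNum >= MAXIMUM_CONNECTION_NUMBER);
    return (uint16_t)(rxConNum - MAXIMUM_CONNECTION_NUMBER);
}
```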
In a system loopback configuration, the Rx connection number lookup
method will not be correct since the number in the VPI/VCI fields of the
cell header contains the Tx connection number. To correct this, the APU
adds MAXIMUM_CONNECTION_NUMBER to the result retrieved from
the VPI/VCI fields of the received cell. When two ADPs are connected
back-to-back, the APU in one system modifies the received cell header
VPI/VCI fields before it sends the cells back to the other system.
For the close connection command, the APU clears both the Tx
connection VCD and the Rx connection VCD. The Tx connection number
is in the command message and the Rx connection number is derived
as described above. After that, the APU acknowledges by copying the
close connection command back to the predefined location in the primary
memory. The host checks the content of this location to make sure that
the APU has completed the command before issuing another command.
Get Statistics Command – Statistics results include the number of:
• cells sent,
• cells received,
• PDUs transmitted,
• PDUs received, and
• errors.
The three MSBs of the command indicate the command type and the
rest of the bits point to the initial address of a 64-byte fixed block in the
primary memory. The APU puts all of the statistics information in this
block. Table 1.6 describes the fields of the statistics information. Then the
APU acknowledges by copying the command back to the predefined
location in primary memory. Again, the host checks this location before
issuing another command to avoid a race condition.
Table 1.6 Statistics Result Fields

Field           Addr (Byte)   Description
RxCells         0             number of received cells
TxCells         4             number of transmitted cells
RxPDU           8             number of received PDUs
TxPDU           12            number of transmitted PDUs
ErrTimeout      16            number of received aborted (timeout) PDUs
ErrRxLost       20            number of received lost cells
ErrConNum       24            number of wrong connection number
ErrCrc10        28            number of errored (CRC10) RM/OAM cells
ErrCrc          32            number of received CRC32-errored PDUs
ErrLength       36            number of received length-errored PDUs
ErrAbort        40            number of received aborted (zero length) PDUs
ErrLowMem       44            times that one free buffer list is empty
ErrNoContBuff   48            number of transmitted partially-built PDUs
ErrNoMem        52            times that both free buffer lists are empty
ErrNoData       56            times of no-buffer attached to VCD
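One way a host program might view the 64-byte statistics block is as a
C structure whose members follow the byte offsets in Table 1.6. The
sketch below assumes that each counter is a 32-bit unsigned value (the
offsets advance in steps of four) and that the last four bytes of the
block are unused; the structure name, the reserved member, and the
helper function are hypothetical. Byte ordering between the host and
the APU is not addressed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout of the statistics block, following the offsets in Table 1.6. */
typedef struct {
    uint32_t RxCells;        /* offset  0 */
    uint32_t TxCells;        /* offset  4 */
    uint32_t RxPDU;          /* offset  8 */
    uint32_t TxPDU;          /* offset 12 */
    uint32_t ErrTimeout;     /* offset 16 */
    uint32_t ErrRxLost;      /* offset 20 */
    uint32_t ErrConNum;      /* offset 24 */
    uint32_t ErrCrc10;       /* offset 28 */
    uint32_t ErrCrc;         /* offset 32 */
    uint32_t ErrLength;      /* offset 36 */
    uint32_t ErrAbort;       /* offset 40 */
    uint32_t ErrLowMem;      /* offset 44 */
    uint32_t ErrNoContBuff;  /* offset 48 */
    uint32_t ErrNoMem;       /* offset 52 */
    uint32_t ErrNoData;      /* offset 56 */
    uint32_t reserved;       /* offset 60: assumed padding to 64 bytes */
} StatsBlock;

/* Returns 1 if the compiler laid the structure out as the table expects. */
int stats_layout_ok(void)
{
    return offsetof(StatsBlock, RxCells) == 0
        && offsetof(StatsBlock, ErrCrc) == 32
        && offsetof(StatsBlock, ErrNoData) == 56
        && sizeof(StatsBlock) == 64;
}
```

Calling stats_layout_ok() once at start-up guards against a compiler
inserting unexpected padding.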
1.3.2.2 Rings
Rings are data structures that support fast messaging between the host
and the APU. They minimize control traffic over the shared PCI Bus and
support the master write-only method of exchanging data. The rings are
described in greater detail in Chapter 2.
Figure 1.7 shows the format of a ring message. In the receive direction,
the ATMizer II+ reassembles cells into buffers. After one buffer is full, the
APU places this buffer number with the status bits (returned from the
EDMA Buffer Completion Queue) into the RxRing. The host checks the
RxRing, retrieves and consumes the data, updates the statistics and then
returns this free buffer to the ATMizer II+ by writing (copying) BuffLarge,
BuffFree and Buffer Number fields into the TxRing with the BuffFree bit
set. After getting this message, the APU issues a buff command to the
EDMA. The EDMA returns the buffer to the large buffer free list if the
BuffFree and the BuffLarge fields are set, or returns the buffer to the
small buffer free list if the BuffFree field is set and the BuffLarge field is
cleared.
In the transmit direction, the host writes the Buffer Number fields into the
TxRing with the BuffFree bit clear. Then the APU issues a buff
command to the EDMA by copying this message into the EDMA_Buff
register. Since the BuffFree bit is cleared, the EDMA segments the buffer
for transmission regardless of the BuffLarge bit. Refer to the L64364
ATMizer II+ ATM-SAR Chip Technical Manual for details. After the buffer
is completely segmented, the APU puts the buffer number into the
RxRing with the BuffFree bit clear. The host checks the RxRing and
places this free buffer back in its own free buffer list.
Figure 1.7 Ring Message Format

Bits 31:21    Bits 20:18   Bit 17     Bit 16      Bits 15:0
Status Bits   FreeSel      BuffFree   BuffLarge   Buffer Number
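The ring message fields in Figure 1.7 can be expressed with a few
masks and helpers. This is a sketch only: the macro and function names
are hypothetical, and the helpers cover just the host-side operations
described in this section (posting a buffer for transmission and
returning a received buffer to a free list).

```c
#include <assert.h>
#include <stdint.h>

/* Field positions taken from Figure 1.7 (macro names are illustrative). */
#define RING_BUFFNUM_MASK   0x0000FFFFu  /* bits 15:0  Buffer Number */
#define RING_BUFFLARGE      (1u << 16)   /* bit 16     BuffLarge     */
#define RING_BUFFFREE       (1u << 17)   /* bit 17     BuffFree      */
#define RING_FREESEL_SHIFT  18           /* bits 20:18 FreeSel       */
#define RING_FREESEL_MASK   0x7u
#define RING_STATUS_SHIFT   21           /* bits 31:21 Status Bits   */
#define RING_STATUS_MASK    0x7FFu

uint16_t ring_buff_num(uint32_t msg)
{
    return (uint16_t)(msg & RING_BUFFNUM_MASK);
}

uint32_t ring_status(uint32_t msg)
{
    return (msg >> RING_STATUS_SHIFT) & RING_STATUS_MASK;
}

uint32_t ring_free_sel(uint32_t msg)
{
    return (msg >> RING_FREESEL_SHIFT) & RING_FREESEL_MASK;
}

/* TxRing message asking for a buffer to be segmented: BuffFree clear. */
uint32_t ring_tx_request(uint16_t buffNum)
{
    return (uint32_t)buffNum;
}

/* TxRing message returning a received buffer to a free list: copy the
 * BuffLarge bit and Buffer Number from the RxRing message and set BuffFree. */
uint32_t ring_return_free(uint32_t rxMsg)
{
    return (rxMsg & (RING_BUFFLARGE | RING_BUFFNUM_MASK)) | RING_BUFFFREE;
}
```

With BuffFree and BuffLarge both set, the buffer goes back to the
large buffer free list; with BuffFree set and BuffLarge clear, it goes
back to the small buffer free list, as described above.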
1.3.3 Scheduling
The primary objective of a scheduling task is to decide which connection
should be serviced in the current ATM cell slot.
The scheduling task in the ATMizer II+ chip is managed by the on-chip
MIPS processor, the ATM Processing Unit (APU). The APU uses a
hardware Scheduler module to minimize the processor load. Chapter 3
discusses different approaches for writing the APU application code that
performs the scheduling function. A complete application example is
developed, starting from the simplest code and progressively enhancing
it with additional features until the desired behavior is reached. At each
step, the code is fully commented and various alternatives are discussed.
The hardware Scheduler allows the ATMizer II+ chip to manage a large
number of connections with arbitrary data rates. Other segmentation and
reassembly (SAR) processors base connection scheduling on a set of
timers. Since these processors include a limited number of timers, they
have a limited number of data rates that they can handle. All connections
have to be assigned to one of those rates.
This approach is practical with constant bit rate (CBR) sources whose
rates can be approximated with that of one of the timers. Variable bit rate
(VBR) sources are typically built by executing a leaky bucket algorithm
at each peak cell rate (PCR) event to check whether a cell can be sent
from a connection. Although inefficient, VBR sources can be serviced
with a set of timers.
With the advent and standardization of available bit rate (ABR) by the
ATM Forum, the timer-based approach is no longer practical. An ABR
source may have an arbitrary, time-varying rate that is difficult to
match with a finite set of timers.
The ATMizer II+ chip uses the hardware Scheduler to effectively create
arbitrary traffic patterns on a large number of connections. The
Scheduler provides primitives that are executed under control of the
APU, an 80 MHz, superscalar MIPS processor core capable of sustaining
110 MIPS performance. The Scheduler primitives act like software
routines except that they are executed by dedicated hardware units and
do not consume CPU bandwidth. However, since the management of the
scheduling process is actually done in software, you can change device
behavior by downloading new application code.
1.3.4 Hashing Function
ATM technology is connection oriented: the data flow between two
end-station entities is based on an established virtual connection
between them. The routing information for the cells that carry the data
is held in the cell header. The address space comprises 24 bits, which
are subdivided into two fields (the VPI and the VCI). At the
end-stations, the cells are processed based on a connection number.
Typically, the maximum number of connections that an end-station
processes is much smaller than the address space available in the cell
header. Therefore, a hashing mechanism is needed to obtain the
connection number of a cell from the cell header value.
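To make the idea concrete, the following sketch maps a 24-bit VPI/VCI
value to a connection number with a small open-addressing hash table.
It is a generic illustration, not the hashing scheme used by the
ATMizer II+ firmware; the table size, the folding hash, and all names
are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define HASH_SIZE    256u         /* hypothetical table size (power of two) */
#define HASH_EMPTY   0xFFFFFFFFu  /* marks an unused slot                   */
#define CON_INVALID  0xFFFFu      /* returned when no connection matches    */

typedef struct {
    uint32_t key;     /* 24-bit VPI/VCI value from the cell header */
    uint16_t conNum;  /* connection number assigned to that value  */
} HashEntry;

static HashEntry hashTable[HASH_SIZE];

/* Fold the 24-bit key down to a table index. */
static uint32_t hash_index(uint32_t vpivci)
{
    return (vpivci ^ (vpivci >> 8) ^ (vpivci >> 16)) & (HASH_SIZE - 1u);
}

void hash_init(void)
{
    for (uint32_t i = 0; i < HASH_SIZE; i++)
        hashTable[i].key = HASH_EMPTY;
}

/* Insert or update a mapping; returns 0 on success, -1 if the table is full. */
int hash_insert(uint32_t vpivci, uint16_t conNum)
{
    uint32_t idx = hash_index(vpivci);
    assert(vpivci <= 0xFFFFFFu);
    for (uint32_t n = 0; n < HASH_SIZE; n++) {
        HashEntry *e = &hashTable[(idx + n) & (HASH_SIZE - 1u)];
        if (e->key == HASH_EMPTY || e->key == vpivci) {
            e->key = vpivci;
            e->conNum = conNum;
            return 0;
        }
    }
    return -1;
}

/* Look up a connection number using linear probing. */
uint16_t hash_lookup(uint32_t vpivci)
{
    uint32_t idx = hash_index(vpivci);
    for (uint32_t n = 0; n < HASH_SIZE; n++) {
        const HashEntry *e = &hashTable[(idx + n) & (HASH_SIZE - 1u)];
        if (e->key == vpivci)
            return e->conNum;
        if (e->key == HASH_EMPTY)
            return CON_INVALID;
    }
    return CON_INVALID;
}
```

Linear probing keeps the sketch short; entries are never deleted here,
so closing a connection would need additional handling.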
1.3.5 Packet Aging
Packet aging notifies the host of idle connections, that is,
connections that have not received a cell for a predefined period. The
ATM Processing Unit (APU) samples the Virtual Connection Descriptor
(VCD) and examines the TimeStamp value in the VCD to determine whether
the connection has to be labeled as an idle connection.
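The idle test can be sketched as a comparison between the current time
and the TimeStamp value. The sketch assumes a free-running 32-bit tick
counter and a hypothetical threshold; the actual TimeStamp width and
time base are defined by the hardware and are not taken from this
guide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical idle threshold, in timer ticks. */
#define AGING_THRESHOLD 1000u

/* Returns 1 if the connection should be reported to the host as idle. */
int con_is_idle(uint32_t now, uint32_t timeStamp)
{
    /* Unsigned subtraction yields the elapsed ticks even if the
     * counter has wrapped around once since the last cell arrived. */
    uint32_t elapsed = now - timeStamp;
    return elapsed >= AGING_THRESHOLD;
}
```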
1.3.6 Interrupt Handling
The CW4010 processor used in the L64364 ATMizer II+ ATM-SAR Chip
supports three types of interrupt signals:
• Cold/warm resets and nonmaskable interrupts,
• External nonvectored interrupts (6), and
• External vectored interrupts (16).
The six external nonvectored and 16 vectored interrupts require a
general handler to pass control off to a unique handler for each specific
interrupt. The vectored interrupts also require an enabling routine at
initialization. See Chapter 7 for handler code samples and a sample
enabler routine.
1.3.7 OAM Cell Processing
Chapter 8 outlines the implementation of the Operation and
Maintenance (OAM) cell processing function. The OAM function is
defined at the Physical and the ATM layers. The Physical Layer OAM cell
processing is done by the Framer (for example, the SuniLite Framer
chip). The software running on the ATM Processing Unit performs the
ATM Layer OAM cell processing. The main goal of this software is to
provide the means for you to perform the OAM cell processing.
1.3.8 AAL3/4 Processing
Chapter 9 describes the implementation of AAL3/4 processing in the
ATMizer II+ chip. The EDMA in the ATMizer II+ chip is designed to
implement the AAL5 CS-PDU processing and to support AAL0 type
connections. The segmentation and reassembly support for AAL0
connections provided by the EDMA can be used to implement the
SAR-PDU segmentation and reassembly for AAL3/4 connections. The
AAL3/4 processing can be implemented in the ATMizer II+ by the APU
which can preprocess the PDU data before it is segmented or
reassembled by the EDMA.
1.3.9 Initialization
Chapter 10 discusses the following initialization steps for the ATMizer II+
chip and provides sample code.
1. Booting
The booting step initializes the Configuration and Cache Control
(CCC) register and the SDRAM controller, then copies the
initialization and application code to an executable memory location.
2. C Preamble Execution
C preamble execution includes .bss section clearing, stack
allocation, and initialization of the global data pointer and stack
pointer registers.
3. CPU Initialization and Configuration
CPU initialization and configuration mainly includes cache
configuring and flushing, and interrupt and exception handler setting.
4. Memory Allocation
Memory allocation defines the maps for primary, secondary, and Cell
Buffer Memory (CBM).
5. Hardware Registers Initialization
ATMizer II+ chip hardware initialization and configuration includes all
hardware module registers and mode setting.
6. Data Structures Initialization
Data structures initialization includes Free Cell List initialization,
clearing the Virtual Connection Descriptors (VCDs), and setting the
Buffer Descriptors (BFDs) and Scheduler Calendar Table (SCDs).
1.3.10 Operating Software
Chapter 11 describes the functions performed by the ATM Processing
Unit (APU) in the ATMizer II+ chip and those required of the host. The
APU is responsible for traffic management, host messaging, OAM cell
processing, and statistics collection. The host program allows you to
send commands to the ATMizer II+ and to display the results of these
commands. This involves opening connections, transmitting and
receiving data, and displaying statistics such as effective rate, errors
received, etc.
Chapter 2
Host Messaging
This chapter describes the types of control information that need to be
exchanged between a host and the ATM Processing Unit (APU) in the
ATMizer II+ chip. It also discusses methods of efficiently moving that
information over the shared interconnecting bus, the PCI Bus. Messaging
application code is developed for each method. The chapter includes the
following sections:
• Section 2.1, “Host Messaging Overview”
• Section 2.2, “Buffer Processing”
• Section 2.3, “Rings”
2.1 Host Messaging Overview
The ATMizer II+ chip is a Segmentation and Reassembly (SAR)
Processor and thus is typically a slave in a system in that it executes
commands issued by an external bus master (host). In this chapter it is
assumed that there is only one host in the system. You can easily extend
the techniques developed here to cases where there are multiple hosts.
The term buffer in this manual is used to denote a memory location and
the data in that location. It is used in this way to simplify the discussion
since the buffer location may hold a packet, part of a packet, or an ATM
cell.
The communications between a host and the ATMizer II+ chip described
in this chapter includes the following host tasks:
1. sending buffers for segmentation
2. getting back completely sent (segmented) buffers
3. receiving reassembled buffers
4. returning received buffers to the free list
5. opening connections
6. closing connections
7. requesting statistics
Because of the flexibility of the ATMizer II+ chip, you may decide to
implement additional commands or certain host tasks on the ATMizer II+
APU.
In a typical system, tasks one through four involve buffer processing, are
performed very often, and should be optimized for performance. Tasks
five through seven are executed only occasionally and their impact on
performance is minimal. Therefore, the two groups of tasks are examined
separately.
Throughout this chapter simple type definitions are used:
typedef unsigned long ulong;
typedef unsigned short ushort;
typedef unsigned char uchar;
2.2 Buffer Processing
In an ATMizer II+ system, a buffer holds payload data for segmentation
or reassembly. Each buffer has an associated Buffer Descriptor (BFD)
that holds control information as described in the L64364 ATMizer II+
ATM-SAR Chip Technical Manual. See Figure 2.1 for a summary of the
BFD layout.
Figure 2.1 Buffer Descriptor Layout

Word   Fields (bit ranges)
1      [31:24] BFD_Ctrl   [23:16] UU   [15:0] ConNum
2      [31:16] BuffSize   [15:0] NextBFD
3      [31:0] pBuffData_PCI
4      [31:29] FreeSel   [28:27] R   [26:0] pBuffData_Sec
BFD_Ctrl   BFD Control Bits   Word 1, [31:24]
    Refer to the L64364 ATMizer II+ ATM-SAR Chip Technical Manual
    for descriptions of these bits.

UU   AAL5 User-to-User   Word 1, [23:16]
    Data byte reserved for user’s use.

ConNum   Connection Number   Word 1, [15:0]
    Each buffer has a 16-bit ConNum associated with it.

BuffSize   Buffer Size   Word 2, [31:16]
    The BuffSize field specifies the number of bytes in the
    buffer payload.

NextBFD   Next BFD   Word 2, [15:0]
    The NextBFD field is used to link BFDs. In the transmit
    direction, the EDMA ignores the NextBFD field when it
    executes the Buff command so that neither the host nor
    the APU needs to initialize this field. In the receive
    direction, in the case of a fragmented PDU, the EDMA places
    the number of the next buffer belonging to the same PDU
    in NextBFD. If the buffer is the last one for the PDU, the
    EDMA places zero in NextBFD.

pBuffData_PCI   Pointer to Buffer Data in PCI Memory   Word 3, [31:0]
    The pBuffData_PCI field holds the pointer to the buffer
    payload in PCI memory. If this field is zero, then the value
    in the pBuffData_Sec field is used to point to the buffer
    payload in Secondary memory.

FreeSel   Free Select   Word 4, [31:29]
    The APU uses this field to indicate to which Free List
    (0–5) the BFD belongs.

R   Reserved   Word 4, [28:27]
    Do not modify this field.

pBuffData_Sec   Pointer to Buffer Data in Secondary Memory   Word 4, [26:0]
    The pBuffData_Sec field holds the pointer to the buffer
    payload in Secondary memory. If this field is zero, then
    the value in the pBuffData_PCI field is used to point to
    the buffer payload in PCI memory. If both
    pBuffData_PCI and pBuffData_Sec are non-zero, the
    BFD is in packet mode.
Buffer Descriptors are referenced using 16-bit wide Buffer Numbers
(BuffNum). A Buffer Number is an index into the Buffer Descriptor array.
The base of the array is programmed using the EDMA_BFD_Base
register. Since each entry in the array holds 16 bytes, the EDMA shifts
a BuffNum left by four positions before adding it to EDMA_BFD_Base to
obtain a BFD address.
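The BFD layout and the address calculation above can be sketched in C
as follows. The structure and function names are hypothetical, the
base address is passed as a plain 32-bit value rather than read from
the EDMA_BFD_Base register, and fixed-width types from <stdint.h> are
used in place of the ulong and ushort typedefs so that the sketch does
not depend on the size of long.

```c
#include <assert.h>
#include <stdint.h>

/* One 16-byte Buffer Descriptor, as four 32-bit words (Figure 2.1). */
typedef struct {
    uint32_t word1;  /* [31:24] BFD_Ctrl, [23:16] UU, [15:0] ConNum        */
    uint32_t word2;  /* [31:16] BuffSize, [15:0] NextBFD                   */
    uint32_t word3;  /* [31:0]  pBuffData_PCI                              */
    uint32_t word4;  /* [31:29] FreeSel, [28:27] R, [26:0] pBuffData_Sec   */
} Bfd;

/* BFD address: BuffNum shifted left by four, added to the array base. */
uint32_t bfd_address(uint32_t bfdBase, uint16_t buffNum)
{
    return bfdBase + ((uint32_t)buffNum << 4);
}

uint16_t bfd_con_num(const Bfd *b)   { return (uint16_t)(b->word1 & 0xFFFFu); }
uint16_t bfd_buff_size(const Bfd *b) { return (uint16_t)(b->word2 >> 16); }
uint16_t bfd_next(const Bfd *b)      { return (uint16_t)(b->word2 & 0xFFFFu); }
uint32_t bfd_free_sel(const Bfd *b)  { return b->word4 >> 29; }
uint32_t bfd_sec_ptr(const Bfd *b)   { return b->word4 & 0x07FFFFFFu; }

/* Packet mode: both payload pointers are non-zero. */
int bfd_is_packet_mode(const Bfd *b)
{
    return b->word3 != 0u && bfd_sec_ptr(b) != 0u;
}
```

For instance, with a base of 0x1000, Buffer Number 3 refers to the BFD
at address 0x1030.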
2.2.1 Buffer Flow
Before describing the different methods of buffer messaging, it is
important to analyze data and control flows for systems including the
ATMizer II+ chip and an external host.
2.2.1.1 Transmit Flow
Since the PCI Bus is a shared bus, you will need to construct some form
of FIFOs to handle data and control information for both transfer
directions. The transmit flow using FIFOs between the host and APU is
shown in Figure 2.2.
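Figure 2.2 Transmit Flow
[Figure: transmit data and control flow. Host Messaging (above the
dashed line): Host (Buffer for Transmission), TxFifo, APU, EDMA Buff
Request Queue, EDMA; and EDMA TxCell Completion Queue, APU, TxDone,
Host (Buffer Transmitted). Scheduling (lower left corner): APU, EDMA
TxCell Request Queue. The EDMA works on the VCD and its BFDs and fills
the ACI TxFifo in Cell Buffer Memory, from which the ACI drives the
Utopia Bus.]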
When a host has a buffer ready for segmentation, it places an
appropriate message in the TxFifo. The contents of the message and
various implementations of the TxFifo are described elsewhere in this
section. The APU retrieves the message from the TxFifo and depending
on the messaging scheme used, performs various BFD formatting. In the
simplest form, the host uses exactly the same BFD format as the EDMA
and the APU needs only to issue a Buff command that places a BuffNum
in the EDMA Buff Request Queue. The EDMA retrieves the BuffNum
from the Request Queue and links the referenced BFD to a VCD. The
Connection Number to use is read from the first word of the BFD.
As a separate asynchronous process governed by the scheduling
mechanism, the APU issues TxCell commands to the EDMA. An
example scheduling mechanism is described in Chapter 3, Scheduling.
The EDMA retrieves data payloads from buffers, creates cells in Cell
Buffer Memory (CBM), and puts cells in the ACI TxFifo for transmission
to the Utopia Bus.
Buffer completion occurs when all the data from the buffer payload has
been successfully segmented into cells and placed in the ACI TxFifo.
When a buffer is completed, the EDMA places the BuffNum in the EDMA
TxCell Completion Queue. For the last buffer of a PDU, the EDMA may
have to build a cell without any buffer payload and holding only the AAL5
trailer (padding, UU, CPI, Length and CRC32). This situation occurs
when the PDU length modulo 48 is equal to zero or greater than 40. In
this case, the current Buffer Number is placed in the EDMA Completion
Queue only after this last cell is built.
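The condition for the extra trailer-only cell follows from the AAL5
format: each cell carries 48 payload bytes, and the trailer fields
other than padding (UU, CPI, Length, and CRC32) occupy 8 bytes, so the
trailer fits in the last data cell only when 1 to 40 payload bytes
remain in it. The short sketch below works through that arithmetic;
the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ATM_PAYLOAD   48u  /* payload bytes per ATM cell                  */
#define AAL5_TRAILER   8u  /* UU (1) + CPI (1) + Length (2) + CRC32 (4)   */

/* Returns 1 if the last cell of the PDU holds no buffer payload,
 * that is, when the PDU length modulo 48 is zero or greater than 40. */
int aal5_needs_trailer_cell(uint32_t pduLen)
{
    uint32_t rem = pduLen % ATM_PAYLOAD;
    return rem == 0u || rem > ATM_PAYLOAD - AAL5_TRAILER;
}

/* Total number of cells needed for the PDU, including padding and trailer. */
uint32_t aal5_cell_count(uint32_t pduLen)
{
    return (pduLen + AAL5_TRAILER + ATM_PAYLOAD - 1u) / ATM_PAYLOAD;
}
```

A 40-byte PDU fits in one cell together with its trailer, whereas a
41-byte or a 48-byte PDU each needs a second cell that carries only
padding and the trailer.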
The APU retrieves the BuffNum from the TxCell Completion Queue and
writes an appropriate message to the TxDone FIFO. Although different
implementations are possible, the message typically consists of the
Buffer Number.
The process above the dashed line in Figure 2.2 (Host Messaging) is the
subject of this chapter while the process in the lower left corner
(Scheduling) is described in Chapter 3, Scheduling.
2.2.1.2 Receive Flow
The data and control flow for the receive direction is shown in Figure 2.3.
This section describes the process to the right and above the dashed line
in Figure 2.3.
Figure 2.3 describes only the case when buffers are pulled out by the
EDMA from a free list as needed. It does not describe the case when
the APU explicitly attaches free buffers to a VCD in the receive direction.
Figure 2.3 Receive Flow
[Figure: receive data and control flow. Utopia Bus, ACI, ACI RxFifo in
Cell Buffer Memory, APU (Cell Header Translation), EDMA RxCell Request
Queue, EDMA (VCD, BFDs, free BFDs), EDMA RxCell Completion Queue, APU,
RxFifo, Host; and Host, RxDone, APU, EDMA Buff Request Queue, EDMA.
The Host Messaging portion lies to the right of and above the dashed
line.]
When a cell is received from the Utopia Bus, the ACI places it in the ACI
RxFifo located in the CBM. The APU retrieves the cell from ACI RxFifo
and performs cell header lookup. This operation consists of classifying
the cell based on the cell header and determining the Connection
Number. The Connection Number identifies the Virtual Connection to
which the cell belongs. There are numerous ways to perform this
operation. The simplest one is to form the Connection Number from the
appropriate bits of the cell header VPI/VCI fields by shift and mask
operations. This method is appropriate when the VPI and VCI are
assigned as contiguous numbers. If this is not the case, a more complex
cell header lookup (for example, using hashing) should be performed.
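The shift-and-mask method can be sketched as follows, assuming the
standard UNI cell header layout for the first four header bytes (GFC
in bits 31:28, VPI in bits 27:20, VCI in bits 19:4, PT in bits 3:1,
and CLP in bit 0). The number of VPI and VCI bits used to form the
Connection Number is a hypothetical choice, as is the function name.

```c
#include <assert.h>
#include <stdint.h>

#define VPI_BITS 2u  /* hypothetical: low VPI bits used in the ConNum */
#define VCI_BITS 8u  /* hypothetical: low VCI bits used in the ConNum */

/* Form a Connection Number from the first 32 bits of a UNI cell header. */
uint16_t con_num_from_header(uint32_t header)
{
    uint32_t vpi = (header >> 20) & 0xFFu;    /* VPI, header bits 27:20 */
    uint32_t vci = (header >> 4) & 0xFFFFu;   /* VCI, header bits 19:4  */

    vpi &= (1u << VPI_BITS) - 1u;             /* keep the low VPI bits  */
    vci &= (1u << VCI_BITS) - 1u;             /* keep the low VCI bits  */
    return (uint16_t)((vpi << VCI_BITS) | vci);
}
```

With these widths, a cell with VPI 1 and VCI 5 maps to Connection
Number 261. The mapping is unique only while the assigned VPI and VCI
values stay within the chosen bit widths.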
When the APU determines the Connection Number, it places a command
in the EDMA RxCell Request Queue. The EDMA retrieves the command
from the queue and transfers the cell payload to a buffer. If there is no
buffer available, the EDMA pulls out a free buffer from one of the two free
buffer lists and attaches it to the VCD. When a buffer is completed, the
EDMA detaches it from the linked list and places the Buffer Number in
the EDMA RxCell Completion Queue. Note that the linked list of buffers
attached to a receive VCD has a maximum length of one when only
buffers from free lists are used. The list can have more than one BFD
only if buffers are attached explicitly using the EDMA Buff command for
a receive VCD.
The APU retrieves the BuffNum from the EDMA RxCell Completion
Queue and places an appropriate message in the RxFifo. The contents
of the message and various implementations of the RxFifo are described
in Section 2.2.3, “FIFO Contents” and Section 2.2.4, “FIFO
Implementations.” The host retrieves the message and may process the
buffer payload. When the buffer processing is done, the host places a
message in the RxDone FIFO. Although various implementations are
possible, such a message typically consists of the Buffer Number with
additional control bits. The APU retrieves the message from the RxDone
FIFO and issues the EDMA Buff command to place the BuffNum in the
EDMA Buff Request Queue. The EDMA retrieves the BuffNum from the
request queue and links the corresponding BFD to a large or small free
buffer list.
2.2.2 FIFO Location
To maximize system performance, it is important to place the FIFOs in
appropriate locations. The ATMizer II+ chip uses the PCI Bus as the
shared bus to communicate with the host. Since shared buses typically
experience increased latencies, it is important to minimize the total
number of bus requests. Once the bus is acquired, it is less expensive
to continue an established burst than to start a new one. Therefore, a
design goal for an ATMizer II+ chip host messaging system is to minimize
the total number of bus accesses, if necessary at the expense of
increasing burst lengths.
The second issue to consider is that write operations are typically much
less expensive than read operations in both the bus utilization and
master device processing power. This is because write operations are
nonblocking while read operations are blocking. When a data element is
written to an external target device over the PCI Bus, the initiating device
typically writes the data into the PCI master FIFO and then continues
with other tasks. From that perspective, a write operation is of the
“shoot-and-forget” type. When the PCI controller acquires the bus, it has data
available in the internal FIFO and may put the stream over the bus
immediately.
When an initiating device has to read a data element from a target device
over the PCI bus, it has to request the bus, wait until the data is retrieved
by the target device, and wait until the data is placed on the bus. This
may take a relatively long time due to bus latencies. When the bus is
acquired and the target is selected by address decoding logic, the target
must respond. The PCI controller has to fetch data from the target
location and place it on the bus. For example, a DRAM-based memory
may require as much as 120 ns (four cycles) to place data on the bus,
assuming the memory bus is not used. These four cycles are then lost
for bus utilization.
If read latency is sufficiently high, better effects may be achieved if the
target device immediately disconnects after decoding its own address.
The master device is required to retry the same operation. In the
meantime, the target has time to fetch data and place it in its slave read
FIFO. In between, the bus arbiter may decide to grant the bus to another
target, effectively increasing the time for the target to fetch data without
adversely affecting bus utilization. However, even this method increases
bus utilization since the bus has to be arbitrated twice for the same data
and the process doesn’t help to resolve the master blocking issue. It may
actually make things worse due to bus re-arbitration.
Consequently, the second design goal for an ATMizer II+ chip host
messaging system should be to favor PCI write operations and
avoid PCI read operations where possible. Since PCI FIFOs hold data that is
written by a sender and read by the receiver, this goal is achieved if the
FIFOs are located in the receiver memory. Therefore, the RxFifo and
TxDone FIFOs should be located in the PCI primary memory or host
memory accessible from the PCI Bus, and the TxFifo and RxDone FIFO
should be located in either the CBM or secondary memory. To reduce
the secondary memory’s bandwidth requirements, it is recommended
that you place both the TxFifo and the RxDone FIFO in the CBM.
2.2.3 FIFO Contents
As discussed in the previous section, four FIFOs are needed to
exchange information between the host and APU. This section discusses
which elements are actually placed in the FIFOs. FIFO implementation
is described in a later section. See Table 2.1.
Table 2.1 FIFOs between the Host and APU

Name     Sender   Receiver   Contents
TxFifo   Host     APU        buffers for transmission (for segmentation)
TxDone   APU      Host       buffers sent
RxFifo   APU      Host       buffers received (from reassembly)
RxDone   Host     APU        processed buffers to a free list
The elements placed in the TxFifo and RxFifo may be either Buffer
Descriptors or Buffer Numbers, leading to what is called 2-way or
3-way messaging, respectively. Since the TxDone and RxDone FIFOs are
used only to signal that a given buffer was completely processed, it is
sufficient to store the Buffer Numbers in them.
2.2.3.1 Three-Way Messaging
In this case, the host writes Buffer Numbers into the TxFifo and the APU
writes Buffer Numbers into the RxFifo. The sequence of events for the
transmit direction is shown in Table 2.2.
Table 2.2 Three-Way Messaging, Transmit Direction

Step                                                          PCI Operation
1. The host writes the BuffNum into the TxFifo.               PCI write
2. The APU reads the BuffNum from the TxFifo.
3. The APU issues the Buff command.
4. The EDMA copies the BFD to secondary memory.               PCI read
5. The EDMA processes the buffer.
6. The EDMA places the completed BuffNum in the TxCell
   Completion Queue.
7. The APU writes the BuffNum to the TxDone FIFO.             PCI write
As shown, this method involves two write operations and one read
operation over the shared PCI Bus, hence the name 3-way
messaging. The sequence of events for the receive direction is shown in
Table 2.3.
Table 2.3 Three-Way Messaging, Receive Direction

Step  Event                                                   PCI Operation
1     The EDMA completes a buffer and copies the BFD to       PCI write
      primary memory.
2     The EDMA places the BuffNum in the RxCell Completion
      Queue.
3     The APU writes the BuffNum to the RxFifo.               PCI write
4     The host reads the BuffNum and BFD, and processes the
      data.
5     The host writes the processed BuffNum to the RxDone     PCI write
      FIFO.
6     The APU issues the Buff command.
7     The EDMA attaches the BFD to a free list.
There are three write operations over the shared PCI Bus in the receive
direction.
Note that the same type of element (BuffNum) is used for both the TxFifo
and the RxDone FIFO. These two FIFOs, which both carry data from the host
to the APU, can therefore be combined into one, using special tag bits so
that the receiver (APU) can tell the two kinds of entry apart. Similarly,
the RxFifo may be combined with the TxDone FIFO.
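One possible encoding of such tag bits is sketched below. The guide does not specify a layout, so the tag position (bit 16), the 16-bit Buffer Number width, and the names MSG_TAG_RXDONE, MsgPack(), MsgIsRxDone(), and MsgBuffNum() are all assumptions made for illustration.

```c
#include <assert.h>

typedef unsigned long ulong;

/* Hypothetical layout: bit 16 carries the tag, bits 15..0 the BuffNum. */
#define MSG_BUFFNUM_MASK 0xFFFFUL
#define MSG_TAG_RXDONE   0x10000UL  /* set: RxDone entry; clear: TxFifo entry */

/* Build one entry of the combined host-to-APU FIFO. */
ulong MsgPack(ulong BuffNum, int IsRxDone)
{
    /* Buffer Number zero is not used by the hardware. */
    assert(BuffNum != 0 && BuffNum <= MSG_BUFFNUM_MASK);
    return BuffNum | (IsRxDone ? MSG_TAG_RXDONE : 0UL);
}

/* Receiver side: tell the two kinds of entry apart. */
int MsgIsRxDone(ulong Entry)
{
    return (Entry & MSG_TAG_RXDONE) != 0;
}

/* Receiver side: recover the Buffer Number. */
ulong MsgBuffNum(ulong Entry)
{
    return Entry & MSG_BUFFNUM_MASK;
}
```

With such a layout, the APU would read a single combined FIFO and dispatch on MsgIsRxDone(): untagged entries go to segmentation, tagged entries return a buffer to a free list.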
2.2.3.2 Two-Way Messaging
In this case the host writes Buffer Descriptors into the TxFifo and the
APU writes Buffer Descriptors into the RxFifo. The sequence of events
for the transmit direction is as shown in Table 2.4.
Table 2.4 Two-Way Messaging, Transmit Direction

Step  Event                                                   PCI Operation
1     The host writes the BFD into the TxFifo.                PCI write
2     The APU copies the BFD to secondary memory and
      allocates a local BuffNum.
3     The APU issues the Buff command.
4     The EDMA links the BFD to an appropriate VCD.
5     The EDMA processes the buffer.
6     The EDMA places the completed BuffNum in the TxCell
      Completion Queue.
7     The APU writes the host's BuffId for that BuffNum to    PCI write
      the TxDone FIFO.
This method involves two write operations over the shared PCI Bus, hence
the name 2-way messaging. There are some additional steps that the APU
has to perform in 2-way messaging as compared to 3-way messaging.
First, the APU must allocate a free BFD in which to copy the contents of
the BFD retrieved from the TxFifo. This is necessary since the TxFifo
must be emptied rapidly to avoid overflow. Second, the BFD in the TxFifo
must have some sort of an identifier that can be returned later to the host
(step 7 in the transmit events). Such an identifier should be allocated by
the host and may be stored, for example, in the last unused halfword of
the BFD. When the EDMA completes a buffer segmentation and returns
a BuffNum into the TxCell Completion Queue, the APU has to read the
BFD[BuffNum].BuffId field and place this field in the TxDone FIFO,
instead of the BuffNum. Note that in this case, BuffNum is purely a local
number exchanged between the APU and the EDMA while the BuffId is
exchanged between the host and the APU.
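The BuffId bookkeeping described above might be sketched as follows. The Bfd_t layout here is a placeholder, not the format of Figure 2.1: it simply assumes a four-word descriptor whose last halfword holds the BuffId. NUM_BFD, TxAccept(), and TxDoneValue() are likewise hypothetical names.

```c
#include <assert.h>

typedef unsigned long  ulong;
typedef unsigned short ushort;

/* Placeholder descriptor layout (assumption, see Figure 2.1 for the real one). */
typedef struct {
    ulong  Word[3];  /* fields consumed by the EDMA */
    ushort Misc;     /* placeholder halfword */
    ushort BuffId;   /* host-allocated identifier, opaque to the EDMA */
} Bfd_t;

#define NUM_BFD 8
static Bfd_t BFD[NUM_BFD];  /* local descriptors, indexed by BuffNum */

/* Step 2: copy the BFD taken from the TxFifo into local descriptor BuffNum. */
void TxAccept(ulong BuffNum, const Bfd_t *FromFifo)
{
    assert(BuffNum != 0 && BuffNum < NUM_BFD);  /* BuffNum 0 is not used */
    BFD[BuffNum] = *FromFifo;                   /* the BuffId travels with the copy */
}

/* Step 7: translate a completed local BuffNum into the value for TxDone. */
ushort TxDoneValue(ulong BuffNum)
{
    assert(BuffNum != 0 && BuffNum < NUM_BFD);
    return BFD[BuffNum].BuffId;
}
```

The point of the sketch is only that the host never sees the local BuffNum: whatever identifier it stored in the descriptor comes back unchanged through the TxDone FIFO.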
Step 2 of the transmit sequence involves a copy operation and its
implementation depends on the location of the TxFifo. The highest
performance can be achieved if the TxFifo is located in the Cell Buffer
Memory. In this case, the APU may simply copy the BFD word by word
into the secondary memory, optionally performing some formatting
operations if the Buffer Descriptor format used by the host differs from
that shown in Figure 2.1.
Similar events happen in the receive direction. See Table 2.5.
Table 2.5 Two-Way Messaging, Receive Direction

Step  Event                                                   PCI Operation
1     The EDMA completes a buffer and places the BuffNum in
      the RxCell Completion Queue.
2     The APU writes the BuffNum to the BuffId field.
3     The APU programs the Move processor to write the BFD    PCI write
      into the RxFifo.
4     The host reads the BuffNum and BFD, and processes the
      data.
5     The host writes the processed BuffNum to the RxDone     PCI write
      FIFO.
6     The APU issues the Buff command.
7     The EDMA attaches the BFD to a free list.
This time the APU has to explicitly store the BuffNum in the BuffId field
so that the host may return it later to the RxDone FIFO.
Compared to 3-way messaging, 2-way messaging has the advantage of
creating less traffic over the shared PCI bus. One disadvantage of 2-way
messaging is that it requires the host to create a PCI burst to send the
BFD to the TxFifo. Depending on the host PCI Bus controller, a PCI burst
may require setting up a DMA transfer. If a DMA is required, setting it up
may impose a high burden on the host processor. If the host bus
controller is able to collect multiple single-word transactions at
consecutive addresses into 4-word bursts, then 2-way messaging is
much more efficient. The second disadvantage of 2-way messaging is that
it requires more involvement from the APU and may create a bottleneck
if the APU performs many non-SAR related tasks. It also creates slightly
more traffic to and from secondary memory.
2.2.4 FIFO Implementations
This section presents a discussion of various implementations for host
messaging FIFOs, while abstracting the FIFOs’ contents.
2.2.4.1 ATMizer II+ Chip Mailbox
The simplest implementation for a FIFO is to use existing hardware
resources. The ATMizer II+ chip includes a 4-entry deep, 32-bit wide,
bidirectional FIFO called the Mailbox that may be used for communications
between the APU and an external PCI Bus master. However, there are
significant drawbacks to using the Mailbox for buffer processing:
• The Mailbox is only four entries deep.
• The APU-to-host direction involves reading the data from the Mailbox
  over the PCI Bus.
The reasons why PCI read operations should be avoided have already
been described in Section 2.2.2, “FIFO Location.”
The APU is usually much faster than any host processor since its
firmware has direct access to hardware resources without the burden of
the multilevel function calls required by typical host operating systems. It
is therefore tempting to assume that Mailbox overflow can be avoided
easily. In the host-to-APU direction, the APU just has to read the Mailbox
quickly enough. In the APU-to-host direction, the APU must check the
Mailbox occupancy level before writing in a new value.
The above assumption would hold if the processing time of a command
placed in the host-to-APU Mailbox depended only on the APU. In fact,
the Buff processor takes care of Buff command processing. If the host
writes back-to-back Buff commands to the Mailbox FIFO, the APU will
quickly empty the Mailbox and put appropriate commands in the Buff
Request Queue. Since the Buff processor needs some time to execute
a command, the Buff Request Queue will fill quickly. When this happens,
the APU needs to buffer additional commands into a software-controlled
FIFO and notify the host of an overflow condition. If such a FIFO is
required, the Mailbox may be bypassed and that FIFO used exclusively
to reduce the total overhead.
2.2.4.2 Software Controlled FIFO – Shared Descriptor
Multiple software implementations of a FIFO are possible. Choose an
implementation that suits the situation in which the reader (APU or
host) and the writer (APU or host) run as separate asynchronous
processes, and that avoids the need to lock the FIFO descriptor.
Assuming that the FIFO holds 32-bit, unsigned integers, the FIFO
descriptor is declared as shown in Figure 2.4.
Figure 2.4 FIFO Descriptor Declaration

    typedef struct {
        ulong *Rd;      /* current position to read from */
        ulong *Wr;      /* current position to write to */
        ulong *Base;    /* Fifo array base */
        ulong *End;     /* end of the Fifo array */
    } Fifo_t, *pFifo_t;
With this declaration, a routine (Figure 2.5) can be implemented to put a
data element in the FIFO:
Figure 2.5 PutFifo() Routine

    int PutFifo(pFifo_t pFifo, ulong Data)
    {
        ulong *Ptr = (pFifo->Wr == pFifo->End) ? pFifo->Base : pFifo->Wr + 1;
        if (Ptr == pFifo->Rd)
            return 0;
        *pFifo->Wr = Data;
        pFifo->Wr = Ptr;
        return 1;
    }
In the above code, the next write position is first computed in the
temporary variable Ptr, wrapping around to the FIFO base if necessary.
If this new position reaches the Rd pointer, the FIFO is full and the
routine returns a failure code. Otherwise, the data element is placed at
the current Wr pointer and the value of the temporary variable Ptr is
assigned to the Wr pointer (i.e., the Wr pointer is incremented).
The implementation of a routine to retrieve a data element from a FIFO
is similar, as shown in Figure 2.6.
Figure 2.6 GetFifo() Routine

    int GetFifo(pFifo_t pFifo, ulong *Data)
    {
        if (pFifo->Rd == pFifo->Wr)
            return 0;
        *Data = *pFifo->Rd;
        pFifo->Rd = (pFifo->Rd == pFifo->End) ? pFifo->Base : pFifo->Rd + 1;
        return 1;
    }
A FIFO underflow condition is detected by comparing the Rd and Wr
pointers. If they are equal, the FIFO is empty and the routine returns a
failure status. Otherwise, a data element is retrieved and the Rd pointer
is incremented.
Figure 2.7 shows how these routines cooperate to control the FIFO.
Figure 2.7 FIFO Operations
[Diagram: positions of the Rd and Wr pointers between Base and End of the
FIFO array in six states: a) initial; b) after 1 PutFifo; c) after 2
PutFifos; d) after 2 PutFifos and 1 GetFifo; e) after 3 PutFifos and 1
GetFifo; f) after 4 PutFifos and 1 GetFifo.]
Note that with this approach there is always one FIFO element empty
(Figure 2.7). To use it, additional boolean flags would have to be
introduced to distinguish between FIFO full and FIFO empty conditions
(when the Rd pointer matches the Wr pointer). Such flags would have to
be manipulated both by the reader, GetFifo(), and the writer,
PutFifo(). This, in turn, would require locking the FIFO descriptor.
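The cost of the lock-free scheme, one slot that is never filled, can be checked with a small self-contained sketch. PutFifo() and GetFifo() are the routines of Figures 2.5 and 2.6; InitFifo() and FillFifo() are hypothetical helpers added only for this illustration.

```c
#include <assert.h>

typedef unsigned long ulong;

typedef struct {
    ulong *Rd;    /* current position to read from */
    ulong *Wr;    /* current position to write to */
    ulong *Base;  /* Fifo array base */
    ulong *End;   /* last slot of the Fifo array */
} Fifo_t, *pFifo_t;

/* Hypothetical helper: bind a descriptor to an array of Slots elements. */
void InitFifo(pFifo_t pFifo, ulong *Array, int Slots)
{
    assert(Slots >= 2);
    pFifo->Base = Array;
    pFifo->End  = Array + Slots - 1;
    pFifo->Rd   = pFifo->Wr = pFifo->Base;
}

int PutFifo(pFifo_t pFifo, ulong Data)
{
    ulong *Ptr = (pFifo->Wr == pFifo->End) ? pFifo->Base : pFifo->Wr + 1;
    if (Ptr == pFifo->Rd)
        return 0;                     /* full: advancing Wr would hit Rd */
    *pFifo->Wr = Data;
    pFifo->Wr = Ptr;
    return 1;
}

int GetFifo(pFifo_t pFifo, ulong *Data)
{
    if (pFifo->Rd == pFifo->Wr)
        return 0;                     /* empty */
    *Data = *pFifo->Rd;
    pFifo->Rd = (pFifo->Rd == pFifo->End) ? pFifo->Base : pFifo->Rd + 1;
    return 1;
}

/* Hypothetical helper: write 1, 2, 3, ... until the FIFO reports full. */
int FillFifo(pFifo_t pFifo)
{
    int n = 0;
    while (PutFifo(pFifo, (ulong)(n + 1)))
        n++;
    return n;
}
```

Starting from an empty FIFO bound to a four-element array, FillFifo() accepts three elements; the fourth slot stays free so that Rd == Wr can only mean "empty".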
In contrast, the implementation shown is completely safe from any race
conditions due to the asynchronous and overlapping execution of the
GetFifo() and PutFifo() routines by different CPUs. It is hazard
free because only the reader can modify the Rd pointer and only the
writer can modify the Wr pointer.
The main problem with this implementation is that it requires a shared
descriptor since both the reader and writer need to access it. As
discussed previously in Section 2.2.2, “FIFO Location,” transactions
through the PCI Bus, particularly reads, are expensive in terms of time
and should be avoided.
2.2.4.3 Software Controlled FIFO – Double Descriptors
You can avoid all read and many (but not all) write accesses to the PCI
Bus when:
• the FIFO is located in the reader’s memory.
• both the reader and writer maintain their local copies of a FIFO
  descriptor.
Since secondary memory is fast, each master can read it quickly and the
only remaining PCI operations are:
• for the writer - one write operation to update the reader’s local Wr
  pointer
• for the writer - one write operation to place data in the FIFO
• for the reader - one write operation to update the writer’s local Rd
  pointer
The FIFO descriptor has to be enhanced with one additional field as
shown in Figure 2.8.
Figure 2.8 Enhanced Fifo Descriptor Declaration

    typedef struct {
        ulong *Rd;       /* current position to read from */
        ulong *Wr;       /* current position to write to */
        ulong **Other;   /* the other master's copy of the pointer this side updates */
        ulong *Base;     /* Fifo array base */
        ulong *End;      /* end of the Fifo array */
    } Fifo_t, *pFifo_t;
Assuming that you implement the TxFifo from Figure 2.2 in Cell Buffer
Memory and that the APU FIFO descriptor is also located in Cell Buffer
Memory, the Cell Buffer Memory layout may look like Figure 2.9.
Figure 2.9 CBM Layout

    struct {
        Fifo_t TxFifo;
        ulong TxFifoData[TX_FIFO_SIZE];
        /* other stuff */
    } CBM, *pCBM_t;
Similarly, the host TxFifo Descriptor may be located in host memory
(Figure 2.10):
Figure 2.10 TxFifo Descriptor Location

    struct {
        Fifo_t TxFifo;
        /* other stuff */
    } SHM, *pSHM_t;
The following APU code initializes the APU TxFifo descriptor
(Figure 2.11):
Figure 2.11 APU TxFifo Descriptor Initialization

    TxFifo->Base  = &TxFifoData[0];
    TxFifo->End   = &TxFifoData[TX_FIFO_SIZE - 1];
    TxFifo->Rd    = TxFifo->Wr = TxFifo->Base;
    TxFifo->Other = &SHM.TxFifo.Rd;
The following host code initializes the host TxFifo descriptor
(Figure 2.12):
Figure 2.12 Host TxFifo Descriptor Initialization

    TxFifo->Base  = &CBM.TxFifoData[0];
    TxFifo->End   = &CBM.TxFifoData[TX_FIFO_SIZE - 1];
    TxFifo->Rd    = TxFifo->Wr = TxFifo->Base;
    TxFifo->Other = &CBM.TxFifo.Wr;
The reader and writer routines are modified as follows (Figure 2.13):
Figure 2.13 Modified PutFifo() and GetFifo() Routines

    int PutFifo(pFifo_t pFifo, ulong Data)
    {
        ulong *Ptr = (pFifo->Wr == pFifo->End) ? pFifo->Base : pFifo->Wr + 1;
        if (Ptr == pFifo->Rd)
            return 0;
        *pFifo->Wr = Data;              /* ---- PCI write ---- */
        pFifo->Wr = Ptr;
        *pFifo->Other = Ptr;            /* ---- PCI write ---- */
        return 1;
    }

    int GetFifo(pFifo_t pFifo, ulong *Data)
    {
        if (pFifo->Rd == pFifo->Wr)
            return 0;
        *Data = *pFifo->Rd;
        pFifo->Rd = (pFifo->Rd == pFifo->End) ? pFifo->Base : pFifo->Rd + 1;
        *pFifo->Other = pFifo->Rd;      /* ---- PCI write ---- */
        return 1;
    }
This implementation reduces PCI operations to three writes per data
element. One write is for the data element transfer and the others are for
pointer updates. Obviously, you cannot eliminate the data element
transfer, although it is possible to reduce bandwidth requirements by
grouping multiple data elements in one PCI burst at the expense of
increased processing delay. However, there are ways to reduce the
pointer update traffic.
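To see how the two descriptors stay in step, the following sketch runs both sides in a single program. It is only a model: ApuTxFifo and HostTxFifo stand in for the descriptors in CBM and host memory, InitBoth() is a hypothetical helper, and the stores through Other are ordinary memory writes rather than PCI transactions.

```c
#include <assert.h>

typedef unsigned long ulong;

typedef struct {
    ulong  *Rd;
    ulong  *Wr;
    ulong **Other;  /* the other master's copy of the pointer this side updates */
    ulong  *Base;
    ulong  *End;
} Fifo_t, *pFifo_t;

#define TX_FIFO_SIZE 4
static ulong  TxFifoData[TX_FIFO_SIZE];  /* stands in for the array in CBM */
static Fifo_t ApuTxFifo;                 /* reader's descriptor */
static Fifo_t HostTxFifo;                /* writer's descriptor */

/* Hypothetical helper combining the two initializations shown above. */
void InitBoth(void)
{
    ApuTxFifo.Base = HostTxFifo.Base = &TxFifoData[0];
    ApuTxFifo.End  = HostTxFifo.End  = &TxFifoData[TX_FIFO_SIZE - 1];
    ApuTxFifo.Rd = ApuTxFifo.Wr = HostTxFifo.Rd = HostTxFifo.Wr = &TxFifoData[0];
    ApuTxFifo.Other  = &HostTxFifo.Rd;   /* reader publishes Rd to the writer */
    HostTxFifo.Other = &ApuTxFifo.Wr;    /* writer publishes Wr to the reader */
}

int PutFifo(pFifo_t pFifo, ulong Data)
{
    ulong *Ptr = (pFifo->Wr == pFifo->End) ? pFifo->Base : pFifo->Wr + 1;
    assert(pFifo->Other != 0);
    if (Ptr == pFifo->Rd)
        return 0;
    *pFifo->Wr = Data;           /* would be a PCI write */
    pFifo->Wr = Ptr;
    *pFifo->Other = Ptr;         /* would be a PCI write */
    return 1;
}

int GetFifo(pFifo_t pFifo, ulong *Data)
{
    if (pFifo->Rd == pFifo->Wr)
        return 0;
    *Data = *pFifo->Rd;
    pFifo->Rd = (pFifo->Rd == pFifo->End) ? pFifo->Base : pFifo->Rd + 1;
    *pFifo->Other = pFifo->Rd;   /* would be a PCI write */
    return 1;
}
```

Each side only ever reads its own descriptor; the full and empty tests therefore work on possibly stale copies, which is safe because a stale Rd can only make the writer see the FIFO as fuller than it is, and a stale Wr can only make the reader see it as emptier.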
2.2.4.4 Eliminating Rd Pointer Update
The master Rd pointer update may be eliminated if a special element
value is used as a mark for an empty FIFO position. In the case of the
TxFifo containing Buffer Numbers, zero can be safely used as a mark
because Buffer Number zero is not used by the ATMizer II+ chip
hardware. With this assumption, the PutFifo() and GetFifo() routines are
rewritten as follows (Figure 2.14):
Figure 2.14 PutFifo() and GetFifo() without Rd Pointer Update

     1  int PutFifo(pFifo_t pFifo, ulong Data)
     2  {
     3      ulong *Ptr = (pFifo->Wr == pFifo->End) ? pFifo->Base : pFifo->Wr + 1;
     4      if (Ptr == pFifo->Rd)
     5          return 0;
     6      *pFifo->Wr = Data;              /* ---- PCI write --- */
     7      pFifo->Wr = Ptr;
     8      return 1;
     9  }
    10  ulong GetFifo(pFifo_t pFifo)
    11  {
    12      ulong Data = *pFifo->Rd;
    13      if (Data == 0)
    14          return 0;
    15      pFifo->Rd = (pFifo->Rd == pFifo->End) ? pFifo->Base : pFifo->Rd + 1;
    16      *pFifo->Other = pFifo->Rd;      /* --- PCI write --- */
    17      return Data;
    18  }
Note that the PutFifo() routine now requires only one PCI access to write
the data element at line 6. GetFifo() still needs a PCI write at line 16. You
cannot eliminate this second PCI write operation but can reduce its
frequency by performing it for a group of data elements instead of each
data element. If the group size is chosen to be half of the FIFO size, the
resulting data structure is called a ring and is described in detail in the
next section.
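For zero to keep working as the empty mark after the pointers wrap around, a consumed slot has to hold zero again before the reader returns to it. The reader-side sketch below assumes the reader clears the slot itself, which is what the ring reader of Section 2.3.2 does; GetFifoMarked() is a hypothetical name, and the clearing line is not part of Figure 2.14.

```c
#include <assert.h>

typedef unsigned long ulong;

typedef struct {
    ulong  *Rd;
    ulong  *Wr;
    ulong **Other;  /* writer's local copy of Rd */
    ulong  *Base;
    ulong  *End;
} Fifo_t, *pFifo_t;

/* Reader side: zero marks an empty slot, so no Wr pointer is consulted. */
ulong GetFifoMarked(pFifo_t pFifo)
{
    ulong Data;
    assert(pFifo->Other != 0);
    Data = *pFifo->Rd;
    if (Data == 0)
        return 0;                 /* slot still empty */
    *pFifo->Rd = 0;               /* assumption: restore the empty mark */
    pFifo->Rd = (pFifo->Rd == pFifo->End) ? pFifo->Base : pFifo->Rd + 1;
    *pFifo->Other = pFifo->Rd;    /* would be a PCI write */
    return Data;
}
```

Because the FIFO sits in the reader's memory, the extra clearing store is a local write and adds no PCI traffic.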
2.3 Rings
Rings are special data structures supporting very fast messaging
between the host and the APU.
At a minimum, two rings have to be maintained. The APU-to-host ring is
located in primary (PCI) memory and is used for both the RxFifo and the
TxDone FIFO. The host-to-APU ring is located in Cell Buffer Memory and
is used to implement both the TxFifo and the RxDone FIFO. As an
alternative, four rings (one per FIFO) may be built. Only the two-ring
method is discussed here.
2.3.1 Ring Structure
Rings are described by the following Ring Descriptors (Figure 2.15):
Figure 2.15 Ring Descriptors Declaration

    typedef struct {
        ulong  *Ptr;     /* current element to be read (retrieved from Fifo) */
        ulong  *End;     /* end of the payload */
        ushort *Credit;  /* pointer to Credit field in sender memory */
        ushort  Size;    /* size of the Ring, in words */
        ushort  Count;   /* total number of elements retrieved from the Fifo */
                         /* or current credit value */
    } RingDesc_t, *pRingDesc_t;
Each ring has an associated Ring Descriptor and Ring Credit field. A
Ring Credit is an unsigned short integer located in the writer’s space and
updated by the reader to enable the writer to send more data. There are
two Ring Credits:
• the APU_Host_Credit located in the Cell Buffer Memory
• the Host_APU_Credit located in primary memory
There are two Ring Descriptors per ring. One is owned and maintained
by the writer and the other by the reader.
The CBM layout may look like the ring structure shown in Figure 2.16:
Figure 2.16 CBM Ring Size

    struct {
        ulong APU_Host_Credit;
        ulong Host_APU[HOST_APU_RING_SIZE];    /* for example 32 */
        /* Rx Fifo */
        /* Tx Fifo */
        /* Free cells */
    }
Similarly, in the primary memory, the layout might be like the one shown
in Figure 2.17:
Figure 2.17 Primary Memory Ring Size

    struct {
        ulong Host_APU_Credit;
        ulong APU_Host[APU_HOST_RING_SIZE];    /* may be big, like 256 */
        /* other stuff */
    }
The four Ring Descriptors are located in the memory of each master.
Specifically:
• Host Ring Descriptors are located in host memory.
• APU Ring Descriptors are located in ATMizer II+ memory.
The host cannot (and does not need to) access the APU Ring Descriptor
and vice versa. We call the host-to-APU ring the TxRing and the
APU-to-host ring the RxRing (although BuffNums of the other direction are
placed in both).
2.3.2 Ring Management
The writer places a data element at the current ring pointer. The current
ring pointer is the writer’s local variable. It is incremented after each write
operation and wraps down to the ring base address when the ring size
is reached.
The reader periodically checks the ring element to which its own local
ring index points. If the data element is not zero, the reader consumes
the data, clears the element to zero, and increments its pointer. It also
wraps down to the ring base address when the ring size is reached.
Since the indexes to the rings are kept separate by the reader and the
writer, it is possible to cause FIFO overflow in the case where one
processor overruns the other processor. We introduce credits to avoid
overflow. The reader gives the writer credits to enable the writer to write
more data. In order to reduce PCI Bus traffic, the reader gives credits in
bursts equal to half of the ring size each time.
To avoid race conditions, the writer keeps a private copy of the number
of words it has sent in the Count variable. The reader informs the writer
at each half-size of the FIFO that it is ready to get another batch by
placing the number of received elements in the *Credit variable. To avoid
FIFO overflow, the writer needs to verify that its Count value does not
reach the *Credit value.
2.3.3 Ring Implementation (Initialization)
The assumptions in the following code are:
• the ring size is a power of two,
• the value zero is not used as a data element, and
• all data elements of the rings are initially set to zero.
The initialization for the host looks like the following (Figure 2.18):
Figure 2.18 Ring Initialization

    RingDesc_t TxRing, RxRing;

    TxRing.Ptr     = &Host_APU[0];              /* points to CBM */
    TxRing.Size    = HOST_APU_RING_SIZE;
    TxRing.End     = TxRing.Ptr + TxRing.Size;
    TxRing.Count   = 0;
    TxRing.Credit  = &Host_APU_Credit;          /* points to primary memory */

    RxRing.Ptr     = &APU_Host[0];              /* points to primary memory */
    RxRing.Size    = APU_HOST_RING_SIZE;
    RxRing.End     = RxRing.Ptr + RxRing.Size;
    RxRing.Count   = RxRing.Size;
    RxRing.Credit  = &APU_Host_Credit;          /* points to CBM */
    *RxRing.Credit = RxRing.Size;
When the host wants to put an element (data) in the TxRing,
(host->APU), it calls the following (Figure 2.19):
Figure 2.19 Host PutRing() Call

    while (PutRing(&TxRing, Data) == 0)
        ;
The PutRing() routine takes a pointer to the Ring Descriptor and
attempts to put a data element (ulong) into the ring. It returns one if it
succeeds and zero if it fails. (It fails when the ring overflows because the
receiver does not remove the elements in time.)
Note that if you have other tasks to perform, you can replace the while
loop with an if statement and try again later. As written above, the
host busy-waits until there is room in the ring.
When the host wants to retrieve an element from the RxRing,
(APU->host), it calls (Figure 2.20):
Figure 2.20 Host GetRing() Call

    while ((n = GetRing(&RxRing)) == 0)
        ;
The GetRing() routine takes a pointer to the Ring Descriptor and returns
an element from the ring, or returns zero if the ring is empty. Again, if
you do not want to be stalled while waiting for data, replace the while
statement with an if statement.
Similarly, initialization for the APU is (Figure 2.21):
Figure 2.21 APU Ring Initialization

    RingDesc_t TxRing, RxRing;

    RxRing.Ptr     = &APU_Host[0];              /* points to primary memory */
    RxRing.Size    = APU_HOST_RING_SIZE;
    RxRing.End     = RxRing.Ptr + RxRing.Size;
    RxRing.Count   = 0;
    RxRing.Credit  = &APU_Host_Credit;          /* points to CBM */

    TxRing.Ptr     = &Host_APU[0];              /* points to CBM */
    TxRing.Size    = HOST_APU_RING_SIZE;
    TxRing.End     = TxRing.Ptr + TxRing.Size;
    TxRing.Count   = TxRing.Size;
    TxRing.Credit  = &Host_APU_Credit;          /* points to primary memory */
    *TxRing.Credit = TxRing.Size;
When the APU wants to put an element in the RxRing it calls
(Figure 2.22):
Figure 2.22 APU PutRing() Call

    if (PutRing(&RxRing, Data) == 0) {
        /* here error code, this ring should never overflow */
    }
and when it wants to get an element from the TxRing it calls
(Figure 2.23):
Figure 2.23 APU GetRing() Call

    if ((n = GetRing(&TxRing)) != 0) {
        /* process the element */
    }
The source code for both routines follows (Figure 2.24):
Figure 2.24 GetRing() and PutRing() Routines

    ulong GetRing(pRingDesc_t Ring)
    {
        ulong Result = *Ring->Ptr;
        if (Result != 0) {
            *Ring->Ptr = 0;
            if (++Ring->Ptr == Ring->End)
                Ring->Ptr = Ring->End - Ring->Size;
            /* half of the ring */
            if ((++Ring->Count & ((Ring->Size >> 1) - 1)) == 0)
                *Ring->Credit = Ring->Count;
        }
        return Result;
    }

    int PutRing(pRingDesc_t Ring, ulong Data)
    {
        if (*Ring->Credit == Ring->Count)
            return 0;
        *Ring->Ptr = Data;
        if (++Ring->Ptr == Ring->End)
            Ring->Ptr = Ring->End - Ring->Size;
        Ring->Count++;
        return 1;
    }
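The credit mechanism can be exercised end to end with a single-program model of one ring. In this sketch the ring array, the Credit word, and the Writer and Reader descriptors all live in ordinary memory; InitRing() and PutUntilFull() are hypothetical helpers, the ring size of 8 is arbitrary, and the assert on Data simply enforces the assumption that zero is never used as a data element.

```c
#include <assert.h>

typedef unsigned long  ulong;
typedef unsigned short ushort;

typedef struct {
    ulong  *Ptr;     /* current element */
    ulong  *End;     /* one past the last element */
    ushort *Credit;  /* credit word in the writer's memory */
    ushort  Size;    /* ring size in words, a power of two */
    ushort  Count;   /* elements sent (writer) or credit value (reader) */
} RingDesc_t, *pRingDesc_t;

#define RING_SIZE 8
static ulong      RingData[RING_SIZE];  /* would live in the reader's memory */
static ushort     Credit;               /* would live in the writer's memory */
static RingDesc_t Writer, Reader;       /* one private descriptor per side */

void InitRing(void)
{
    int i;
    for (i = 0; i < RING_SIZE; i++)
        RingData[i] = 0;                /* zero marks an empty element */
    Writer.Ptr = Reader.Ptr = RingData;
    Writer.End = Reader.End = RingData + RING_SIZE;
    Writer.Size = Reader.Size = RING_SIZE;
    Writer.Credit = Reader.Credit = &Credit;
    Writer.Count = 0;
    Reader.Count = RING_SIZE;
    Credit = RING_SIZE;                 /* the writer may fill the whole ring */
}

ulong GetRing(pRingDesc_t Ring)
{
    ulong Result = *Ring->Ptr;
    if (Result != 0) {
        *Ring->Ptr = 0;
        if (++Ring->Ptr == Ring->End)
            Ring->Ptr = Ring->End - Ring->Size;
        if ((++Ring->Count & ((Ring->Size >> 1) - 1)) == 0)
            *Ring->Credit = Ring->Count;   /* grant half a ring of credit */
    }
    return Result;
}

int PutRing(pRingDesc_t Ring, ulong Data)
{
    assert(Data != 0);                  /* zero is reserved as the empty mark */
    if (*Ring->Credit == Ring->Count)
        return 0;
    *Ring->Ptr = Data;
    if (++Ring->Ptr == Ring->End)
        Ring->Ptr = Ring->End - Ring->Size;
    Ring->Count++;
    return 1;
}

/* Hypothetical helper: write First, First + 1, ... until credit runs out. */
int PutUntilFull(pRingDesc_t Ring, ulong First)
{
    int n = 0;
    while (PutRing(Ring, First + (ulong)n))
        n++;
    return n;
}
```

In this model the writer can send eight elements, then stalls. Three reads do not unblock it; the fourth read completes half a ring, raises Credit from 8 to 12, and lets the writer send four more.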
Chapter 3
Scheduling
This chapter discusses an ATMizer II+ chip scheduling task. Detailed
information describing the Scheduler may be found in the L64364
ATMizer II+ ATM-SAR Chip Technical Manual. This chapter includes the
following sections.
• Section 3.1, “Scheduling Invocation”
• Section 3.2, “Scheduler Commands”
• Section 3.3, “The Scheduling Process”
• Section 3.4, “UBR Connections”
• Section 3.5, “VBR Connections”
• Section 3.6, “ABR Connections”
• Section 3.7, “Local Congestion”
• Section 3.8, “Source Code Listings”
This chapter uses simple type definitions:
typedef unsigned long ulong;
typedef unsigned short ushort;
typedef unsigned char uchar;
3.1 Scheduling Invocation
The primary task of a scheduling process is to decide which connection
should be serviced in the current cell slot. The scheduling process is not
concerned about any memory management or segmentation issues;
these are handled by other processes or by dedicated hardware units.
For the purposes of the scheduling process, each connection appears as
a logical entity with an infinite and continuous data buffer attached to it.
The primitive time unit in the scheduling process is a cell slot. A cell slot
is the time necessary to send one cell on a physical link (for example,
2.82 µs for OC-3 line rates). There are many methods of synchronizing
the ATMizer II+ chip’s scheduling process to physical time. Two
approaches, the Line Recovered Clock and the FIFO Full methods, are
described in the following sections.
3.1.1 Line Recovered Clock Synchronization
In this method, a clock recovered from the line is applied to one of the
ATMizer II+ chip’s timers. The timer is then polled or generates an
interrupt to trigger the scheduling process. If other tasks prevent the
scheduling process from servicing the timer interrupt, the timer handler
can increment a service counter. The scheduling process is called as
long as the counter is nonzero. The counter is decremented after each
call.
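The service counter pattern just described might look like the sketch below. It models the timer interrupt as an ordinary function call; TimerHandler(), RunPendingSlots(), and CountSlot() are hypothetical names, and CountSlot() merely stands in for the real scheduling process.

```c
#include <assert.h>

typedef unsigned long ulong;

static volatile ulong ServiceCount;   /* cell slots not yet serviced */
static ulong          SlotsServiced;  /* bookkeeping for this sketch only */

/* Timer handler: runs once per cell slot of the line-recovered clock. */
void TimerHandler(void)
{
    ServiceCount++;
}

/* Stand-in for the real scheduling process (hypothetical). */
void CountSlot(void)
{
    SlotsServiced++;
}

/* Background loop body: catch up on every slot the handler recorded. */
void RunPendingSlots(void (*Schedule)(void))
{
    assert(Schedule != 0);
    while (ServiceCount != 0) {
        Schedule();
        ServiceCount--;  /* real firmware must make this safe against the handler */
    }
}
```

If three timer ticks arrive while the APU is busy elsewhere, the next call to RunPendingSlots() invokes the scheduling process three times, so no cell slot is lost.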
3.1.2 FIFO Full Synchronization
A simpler approach relies on the fact that, although the ATMizer II+ chip’s
transmit FIFO is not drained at a fixed clock rate due to UTOPIA
start/stop boundary conditions, the PHY device FIFO is drained at the
constant line rate. On average, the ATMizer II+ chip’s transmit FIFO drain
rate is equal to the line rate. Therefore, it is sufficient to fill the transmit
FIFO as fast as possible until it becomes full. The scheduling process is
called as long as the FIFO is not full. At each call, it puts exactly one cell
in the FIFO and advances its internal time counter. With this approach,
be careful to handle situations correctly when the scheduling process is
unable to create a cell because there is no data to send. An idle cell must
then be put in the FIFO to avoid violating the connection service contract
when data is available. Note that, in many practical situations, the
violation is of a very short duration. It is proportional to the size of the
transmit FIFO and is usually only of importance for CBR traffic. Explicit
generation of idle cells in the transmit FIFO makes this technique
unusable for the multiPHY environment.
3.2 Scheduler Commands
The Scheduler offers three commands for scheduling connections. To
increase readability of the code, Scheduler commands have been
encapsulated in C macros. They are:
    N = SCD_Serv();
    SCD_Sched(N, T);
    SCD_Tic();
3.2.1 SCD_Serv( ) Command
The SCD_Serv() command retrieves the number of the connection to be
serviced in the current cell slot from an internal register of the Scheduler.
After reading the register, the macro returns the connection number
immediately. No memory accesses are necessary. The Scheduler then
automatically fetches the next connection number to be serviced.
3.2.2 SCD_Sched( ) Command
The SCD_Sched(N, T) command schedules connection N for service at
cell slot T. The connection descriptor is inserted into a linked list if slot T
is already occupied. The insertion position depends on the Scheduler
mode.
In the Flat mode, all connections get equal priority so a new connection
is always appended to the end of the list. Since the Scheduler maintains
both a head and a tail pointer in the list in the calendar table, the
SCD_Sched() command is executed in constant time.
In Priority mode, the SCD_Sched() macro has to scan the list of
connections present in slot T and insert connection N based on its class.
The list is scanned until the Scheduler finds a connection, M, with a
higher class (and thus lower priority) than the class of connection N.
Then connection N is inserted before connection M. Due to the scanning
of the list, the execution time of SCD_Sched() is variable.
Extensive simulations have shown that lists of connections scheduled for
the same slot, except for the current cell slot, are very short. The average
length is less than one. This observation is true even when there is local
congestion and the aggregate rate of all connections exceeds the link
rate. System behavior in the presence of local congestion is analyzed in
detail in Section 3.7, “Local Congestion.”
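The difference between the two insertion policies can be modeled in software. The sketch below is only an illustration of the list behavior described above, not a description of the Scheduler hardware; Conn_t, Slot_t, SlotAppend(), and SlotInsertByClass() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Conn {
    int          Class;  /* lower value = higher priority */
    struct Conn *Next;
} Conn_t;

typedef struct {
    Conn_t *Head;
    Conn_t *Tail;
} Slot_t;

/* Flat mode: append at the tail in constant time. */
void SlotAppend(Slot_t *S, Conn_t *N)
{
    assert(S != NULL && N != NULL);
    N->Next = NULL;
    if (S->Tail != NULL)
        S->Tail->Next = N;
    else
        S->Head = N;
    S->Tail = N;
}

/* Priority mode: insert before the first connection of a higher class. */
void SlotInsertByClass(Slot_t *S, Conn_t *N)
{
    Conn_t **Link;
    assert(S != NULL && N != NULL);
    Link = &S->Head;
    while (*Link != NULL && (*Link)->Class <= N->Class)
        Link = &(*Link)->Next;   /* variable-time scan */
    N->Next = *Link;
    *Link = N;
    if (N->Next == NULL)
        S->Tail = N;
}
```

Connections of equal class keep their arrival order in this model, because the scan passes every entry whose class is not higher than that of the new connection.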
The SCD_Sched(0, T) form of the command is an indication to the
Scheduler that it should use the connection returned by the last
SCD_Serv() command. In Flat mode, commands SCD_Sched(N, T) and
SCD_Sched(0, T) are equivalent. However, the latter form is preferred in
Priority mode as it does not require the Scheduler to fetch the connection
class from the memory.
3.2.3 SCD_Tic( ) Command
The SCD_Tic() command is used to advance the Scheduler to the next
cell slot. As for the SCD_Sched() command, the execution time of
SCD_Tic() depends on the Scheduler mode. It is constant in Flat mode
and depends on the average length of the list in Priority mode.
3.3 The Scheduling Process
This section describes scheduling CBR, VBR, and ABR connections and
develops progressive application code along with the discussion.
Unspecified bit rate (UBR) connection scheduling is covered in Section
3.4, “UBR Connections.”
3.3.1 A Simple Scheduling Function
For an example of the use of Scheduler commands refer to Figure 3.1.
Figure 3.1 A Simple Scheduling Function

     1  void TxCell(ulong aCell)
     2  {
     3      ulong N, T;
     4      N = SCD_Serv();
     5      EDMA_TxCell(N, aCell);
     6      T = TimeNow + ACD[N].ICG;
     7      SCD_Sched(0, T);
     8      SCD_Tic();
     9      TimeNow++;
    10  }
As explained in Section 3.1, “Scheduling Invocation,” the TxCell()
routine (line 1 in Figure 3.1) is called when a cell should be sent. The
routine’s single parameter is a cell location in Cell Buffer Memory that
will be used for building the cell. The calling routine must first get a free
cell location before calling TxCell(). If there are no free cell locations,
the transmit FIFO is full and a cell cannot be sent.
In line 4, the SCD_Serv() command retrieves the number of the
connection to service in the current cell slot. Next (line 5), the APU
issues a command to the EDMA to send a cell from that connection. The
next service time is computed in line 6 by adding an intercell gap (ICG)
to the current time value. The ICG is the inverse of the current
connection rate. It is stored in the APU Connection Descriptor (ACD) as
ACD[N].ICG where N is the connection number.
After a cell is sent and the new service time is computed, the
SCD_Sched() command in line 7 schedules the connection for service at
the new service time. Finally, the Scheduler time is advanced (line 8) by
issuing an SCD_Tic() command and the internal time is advanced (line
9) by incrementing the TimeNow variable.
It is easy to see that the previous code listing can be improved. It will be
used as a starting point, identifying and fixing the flaws one by one until
the desired behavior is achieved.
3.3.2 Scheduling Lag
The code for handling scheduling lag is shown in Figure 3.2. One
problem with the code in Figure 3.1 is that when a connection is
scheduled for service at time T, it may not actually be serviced until
some later time T1 > T. In
Flat mode, if slot T already has n connections scheduled for service then
connection N will actually be serviced at time T1 = T + n. In Priority
mode, even if slot T is initially empty, connection N may be pushed to
some time later by higher priority connections. This scheduling lag
effectively reduces the connection rate to a lower value.
To solve that problem, it is necessary to maintain the desired connection
service time, T, in the ACD. Time T is stored as ThTxTime (theoretical
transmit time). With these naming conventions, the scheduling function
can be enhanced to that shown in Figure 3.2.
Figure 3.2  Handling Scheduling Lag

 1  void TxCell(ulong aCell)
 2  {
 3      ulong N, T;
 4      N = SCD_Serv();
 5      EDMA_TxCell(N, aCell);
 6      ACD[N].ThTxTime += ACD[N].ICG;   /* advance the theoretical time */
 7      T = ACD[N].ThTxTime;
 8      if (T <= TimeNow)                /* new service time is in the past */
 9          T = TimeNow + 1;
10      SCD_Sched(0, T);
11      SCD_Tic();
12      TimeNow++;
13  }
Now, the new service time (line 6) is computed by adding the ICG to the
time when the cell should have been sent and not to the current time.
However, an additional precaution is necessary because the new service
time may actually be in the past (that is, less than current time). This
situation may arise when a connection is delayed, due to a transient local
congestion, more than one ICG. Lines 8 and 9 check for this condition
and schedule the connection at the next available time if it is true. Note
that the Scheduler does not allow scheduling connections at the current
slot. The earliest possible time is TimeNow + 1.
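The lag correction above can be tried in isolation. The following is a minimal sketch in plain C, not ATMizer II+ firmware: next_service_slot() is a hypothetical helper, time is kept in whole cell slots, and wrap-around of the time variables is ignored.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the chip API): returns the slot at which
 * a connection should be rescheduled after it has been serviced.
 *   th_tx_time  the connection's theoretical transmit time (ThTxTime)
 *   icg         the connection's intercell gap, in cell slots
 *   now         the current slot (TimeNow)
 * Integer slots only; wrap-around of the counters is not handled here. */
uint32_t next_service_slot(uint32_t *th_tx_time, uint32_t icg, uint32_t now)
{
    uint32_t t;

    *th_tx_time += icg;     /* advance from the ideal time, not from now */
    t = *th_tx_time;
    if (t <= now)           /* delayed by more than one ICG */
        t = now + 1;        /* earliest slot the Scheduler accepts */
    return t;
}
```

With ThTxTime = 10 and an ICG of 3, a connection serviced two slots late (TimeNow = 12) is still rescheduled at slot 13, so the lag does not accumulate. Only when TimeNow has already passed the new theoretical time does the helper fall back to TimeNow + 1.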
3.3.3 Rate Granularity
In addition to the issues discussed in the previous paragraphs, the code
in Figure 3.2 has another problem. Storing the ICG as an integer variable
restricts the achievable rates to LCR/n, where LCR is the line cell rate of
the physical link in cells/second and n is an integer (the ICG in cell slots).
This limitation would, in fact, remove all need for the Scheduler since
similar results are easy to achieve with a set of hardware timers.
To overcome this problem, the ICG has to be stored as either a floating-point
or fractional number. Although the floating-point format is easier to
manipulate in software, it would introduce a severe performance
bottleneck since the ATMizer II+ chip does not have a complete floating-point
hardware unit. The APU has the necessary hardware module to
execute arithmetic operations in ATM Forum floating-point format for rate
description, but this format has only nine bits for the mantissa.
In fact, the ICG calculations do not have to be performed with high
precision. A simple fractional format, for example 24.8, is sufficient. In the
24.8 format, 24 bits are used to express the integer part and 8 bits to
express a fractional part. This provides a precision of 1/256 or better than
0.4%. The theoretical transmit time should also use the same format.
Figure 3.3 shows an example of a connection scheduled with a fixed,
normalized rate of 0.3 (normalized means that the LCR is equal to 1 by
definition). The ICG is 1/0.3 = 3.333..., or 0x00000355 in the 24.8 format
(integer part 3, fractional part 0x55/256).
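The 24.8 arithmetic for this example can be checked with a few lines of C. This is a sketch only: icg_from_rate() and slot_of() are hypothetical helper names, and the rate is passed as a fraction num/den of the line cell rate to avoid floating point.

```c
#include <stdint.h>

#define TIME_FRAC 8   /* number of fractional bits in the 24.8 format */

/* ICG in 24.8 format for a normalized rate of num/den (LCR = 1).
 * The result is truncated, not rounded. */
uint32_t icg_from_rate(uint32_t num, uint32_t den)
{
    return (den << TIME_FRAC) / num;
}

/* Integer cell slot in which a 24.8 time value falls. */
uint32_t slot_of(uint32_t t)
{
    return t >> TIME_FRAC;
}
```

For a rate of 3/10 the helper returns 0x355. Adding it repeatedly to a ThTxTime of zero gives 0x0355, 0x06AA, 0x09FF, 0x0D54, and 0x10A9, which fall in slots 3, 6, 9, 13, and 16. The gaps of 3, 3, 3, 4, 3 slots average out to the required 3.33.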
Figure 3.3  Connection Scheduled with Rate 0.3
(Timeline figure. The time axis runs from 0 to 18 cell slots; the successive
ThTxTime values are 0x0000, 0x0355, 0x06AA, 0x09FF, 0x0D54, and 0x10A9.)
The code performing fractional service time calculations is shown in
Figure 3.4.
Figure 3.4  Calculating Fractional Service Time

 1  #define TIME_FRAC 8
 2
 3  void TxCell(ulong aCell)
 4  {
 5      ulong N, T;
 6      N = SCD_Serv();
 7      EDMA_TxCell(N, aCell);
 8      ACD[N].ThTxTime += ACD[N].ICG;       /* both values in 24.8 format */
 9      T = ACD[N].ThTxTime;
10      if ( T <= TimeNow )
11          T = TimeNow + (1 << TIME_FRAC);  /* next whole cell slot */
12      SCD_Sched(0, T >> TIME_FRAC);        /* Scheduler takes integer slots */
13      SCD_Tic();
14      TimeNow += 1 << TIME_FRAC;
15  }
TIME_FRAC is a constant number of fractional bits. Line 1 sets it to eight
bits.
Note that the TimeNow variable in line 11 of Figure 3.4 is now
incremented by (1 << TIME_FRAC). This is necessary in order to have
both ThTxTime and TimeNow in the same units. An alternative approach
would be to clear the eight most significant bits (TIME_FRAC) of TimeNow
before the comparison at line 10 is made.
3.3.4 Time Comparisons
A careful reader of line 10 might ask if the comparison is always valid.
TimeNow is continuously incremented and will eventually wrap around as
the result of arithmetic overflow. An example for a scaled-down
(8 bits instead of 32) version of ThTxTime and TimeNow is given
next. Since the fact that both variables are fractional is irrelevant here,
the integer values shown in Table 3.1 will be used.
Table 3.1  Time Comparisons

TimeNow   ThTxTime   Delta   Hex Delta   Result
    3          1         2     0x002     Past
  255        253         2     0x002     Past
    0        254      -254     0x102     Past
    1        255      -254     0x102     Past
    2          0         2     0x002     Past
    1          3        -2     0x1FE     Future
  253        255        -2     0x1FE     Future
  254          0       254     0x0FE     Future
  255          1       254     0x0FE     Future
    0          2        -2     0x1FE     Future
Although the real difference between TimeNow and ThTxTime varies
depending on the absolute value of both, the difference truncated to eight
bits remains the same. If the difference between TimeNow and ThTxTime
is interpreted as an 8-bit signed number, positive values mean that
ThTxTime is in the past and negative values mean that ThTxTime is in
the future, compared to TimeNow.
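The 8-bit interpretation can be reproduced with the short sketch below. The helper name delta8() is hypothetical; it computes TimeNow - ThTxTime modulo 256 and then reads the result as a signed 8-bit number without relying on implementation-defined conversions.

```c
#include <stdint.h>

/* TimeNow - ThTxTime, truncated to 8 bits and read as a signed number.
 * A positive result means ThTxTime is in the past, a negative result
 * means it is in the future. */
int delta8(uint8_t time_now, uint8_t th_tx_time)
{
    uint8_t d = (uint8_t)(time_now - th_tx_time);   /* wraps modulo 256 */

    return d < 128 ? (int)d : (int)d - 256;         /* sign-extend by hand */
}
```

For example, delta8(3, 1), delta8(255, 253), and delta8(0, 254) all return 2 (past), while delta8(1, 3), delta8(254, 0), and delta8(0, 2) all return -2 (future), regardless of where the counters wrap.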
The following equation may be used to determine if time T is in the past:
Equation 3.1
(TimeNow - T) >= 0
The ‘>=’ is used in the comparison because the Scheduler doesn’t
support scheduling connections in the current cell slot (when the
difference is equal to zero). Equation 3.1 may be rewritten as follows:
Equation 3.2
(TimeNow - T + 1) > 0
However, one time unit in our system is numerically equal to (1 <<
TIME_FRAC). The code in Figure 3.5 defines a special macro to render
the comparison more readable.
Figure 3.5  Handling Time Comparisons

#define TIME_FRAC 8
#define IsInPast(T) ( ((long) (TimeNow - (T) + (1 << TIME_FRAC))) > 0)

void TxCell(ulong aCell)
{
    ulong N, T;
    N = SCD_Serv();
    EDMA_TxCell(N, aCell);
    ACD[N].ThTxTime += ACD[N].ICG;
    T = ACD[N].ThTxTime;
    if ( IsInPast(T) )                   /* wrap-safe signed comparison */
        T = TimeNow + (1 << TIME_FRAC);
    SCD_Sched(0, T >> TIME_FRAC);
    SCD_Tic();
    TimeNow += 1 << TIME_FRAC;
}
The code in Figure 3.5 is immune to arithmetic overflow of TimeNow. Of
course if ThTxTime lags by more than 2^(31 - TIME_FRAC) cell slots
behind TimeNow, the comparison yields invalid results. The value above
represents 4.7 seconds for OC-3, far beyond what a local congestion
may create.
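The behavior of the comparison near the wrap-around point can be checked with the following sketch. It is an illustration, not the macro itself: is_in_past() is a hypothetical function that takes TimeNow as a parameter, and it tests the sign bit with unsigned arithmetic, which gives the same result as the signed cast on two's complement machines.

```c
#include <stdint.h>

#define TIME_FRAC 8
#define ONE_SLOT  (1u << TIME_FRAC)   /* one cell slot in 24.8 format */

/* Returns 1 if time t is in the past or in the current slot relative to
 * time_now, 0 if it is at least one slot in the future. Both values are
 * 32-bit 24.8 times that may have wrapped around. */
int is_in_past(uint32_t time_now, uint32_t t)
{
    uint32_t d = time_now - t + ONE_SLOT;   /* computed modulo 2^32 */

    return d != 0 && d < 0x80000000u;       /* same as (int32_t)d > 0 */
}
```

With TimeNow = 0x00000100 (just after a wrap), a ThTxTime of 0xFFFFFF00 is reported as past and 0x00000300 as future. A ThTxTime equal to TimeNow counts as past, because the current slot cannot be scheduled.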
3.3.5 Stopping Connection Scheduling
So far, it has been assumed that there is always data to send from a connection
when the connection is serviced. Since this is rarely true in a real
application, it is necessary to handle situations when a connection
should be serviced but there is no data to send.
The Scheduler is equipped to help the APU detect this situation. It reads
the VCD_Ctrl.BuffPres bit from the Virtual Connection Descriptor (VCD)
and returns it as SCD_BuffPres in bit position 31 (the sign bit) together
with the connection number. This explains why the link field (NextVCD)
used by the Scheduler is positioned just before the VCD_Ctrl field. In
addition, the EDMA is able to signal through a dedicated, on-chip,
signaling path to the Scheduler that the BuffPres bit has changed for a
given connection. This is necessary since the Scheduler reads the
BuffPres bit and the NextVCD field in advance to have it ready for the
APU when it issues the SCD_Serv() command. If the EDMA changes the
bit later, it signals the change to the Scheduler. This feature is equivalent
to cache snooping in microprocessor caches.
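The value returned by the Scheduler can therefore be in one of three states. The sketch below classifies it. The names serv_state_t and classify_serv() are hypothetical, and treating a value of zero as "nothing left to service in this slot" is an assumption taken from the way the listings in this chapter test the result.

```c
#include <stdint.h>

#define SCD_BUFFPRES (1u << 31)   /* copy of BuffPres in the sign bit */

typedef enum {
    SERV_NONE,       /* no connection returned for this slot */
    SERV_NO_DATA,    /* connection returned, BuffPres clear  */
    SERV_HAS_DATA    /* connection returned, BuffPres set    */
} serv_state_t;

/* Classifies a value as returned by an SCD_Serv()-style command. */
serv_state_t classify_serv(uint32_t n)
{
    if (n == 0)
        return SERV_NONE;        /* assumption: 0 means list exhausted */
    if (n & SCD_BUFFPRES)
        return SERV_HAS_DATA;
    return SERV_NO_DATA;
}
```

Testing the sign bit is equivalent to casting the value to a signed type and comparing it with zero, which is why a single signed comparison is enough to detect a connection with data.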
If there is no data to send, simply stop scheduling the connection and
continue to issue the SCD_Serv() command until a connection with data
to send is found or the list of connections scheduled for the current slot
is exhausted. This is illustrated in Figure 3.6.
Figure 3.6  Stopping Connection Scheduling

 1  #define TIME_FRAC 8
 2  #define IsInPast(T) ( ((long) (TimeNow - (T) + (1 << TIME_FRAC))) > 0)
 3  #define ConHasNoData(N) ((long) (N) > 0)
 4  #define ConHasData(N)   ((long) (N) < 0)
 5
 6  void TxCell(ulong aCell) {
 7      ulong N, T;
 8      do {
 9          N = SCD_Serv();
10      } while (ConHasNoData(N));
11      if (ConHasData(N)) {
12          EDMA_TxCell(N, aCell);
13          ACD[N].ThTxTime += ACD[N].ICG;
14          T = ACD[N].ThTxTime;
15          if ( IsInPast(T) )
16              T = TimeNow + (1 << TIME_FRAC);
17          SCD_Sched(0, T >> TIME_FRAC);
18      }
19      else {
20          pCell_t pCell = (pCell_t) &CBM[aCell];
21          pCell->CDS = CDS_IDLE_CELL;
22          pCell->CellHdr = 0;
23          EDMA_TxCell(0, aCell);
24      }
25      SCD_Tic();
26      TimeNow += 1 << TIME_FRAC;
27  }
3.3.5.1 Connection Rescheduling
The code in Figure 3.6 has two interesting features. First, the
connections that have no data to send are effectively removed from the
calendar table and a question immediately arises as to when the
connections are reinserted in the calendar table (rescheduled).
Obviously, a connection should be rescheduled when it receives data to
send.
To implement that strategy, the APU would have to check the BuffPres
bit each time it requests that a buffer be attached to a VCD, that is, when
it issues a Buff command to the EDMA. Fortunately, this is unnecessary
since the EDMA performs this task automatically. When the BuffPres bit
goes from zero to one as a result of the buffer attachment, the EDMA
puts the connection number in the Buff Completion Queue. This event
might be polled or set up to generate an interrupt. In both cases, the APU
should reschedule the connection at the next available slot.
The rescheduling code may be installed as a vectored interrupt handler
for this task as shown in Figure 3.7.
Figure 3.7  Buff Completion Queue Interrupt Handler

1  void ServBuffComplQueue() {
2      ulong N = EDMA_ComplQueue();
3      ulong T = SCD_Now();
4      SCD_Sched(N, T + 1);
5  }
If this code is installed as an interrupt handler, additional code is required
to implement register save and restore. In line 2, the connection to
reschedule is retrieved from the EDMA Buff Completion Queue. The
code in line 3 finds the current scheduler time. The connection is
rescheduled to the next time slot (T + 1) in line 4.
Rescheduling the connection immediately when data becomes available
might be questionable. Since the connection is rescheduled at the next
available slot, it might seem that the traffic contract could be violated if
data is absent for a very short time and then available again. However,
detailed analysis in Figure 3.8 shows this is not a problem.
Figure 3.8  Connection Rescheduling
(Timing diagram showing the Data, BuffPres, and Scheduled signals against a
time axis marked 0, 2, and T.)
A connection is scheduled at time 2 and a TxCell command sends the
last cell for the connection. The BuffPres bit is cleared some time after
and when the connection is serviced again at time 3, no data is available.
Cells are not sent and the connection is removed from the calendar.
Later, when data is available, the connection is rescheduled again at
time T. Therefore, the minimum time between T and 2 is always at least
one ICG.
3.3.5.2 Sending Idle Cells
As discussed in Section 3.1, “Scheduling Invocation,” when there is no
cell to send, an explicit idle cell is sent to avoid future contract violations.
This is easily accomplished by the code in lines 20-23 of Figure 3.6. The
pointer to the free Cell Buffer location (&CBM[aCell]) is cast to a pointer
to the ATMizer II+ chip's cell structure (pCell_t).
The structure contains a 4-byte cell descriptor followed by a 4-byte cell
header and a 48-byte cell payload. If a timer is used to derive the cell
slot clock as discussed in Section 3.1.1, “Line Recovered Clock
Synchronization,” the ATM Cell Interface (ACI) in the ATMizer II+ chip or
the framer automatic idle cell generation may be used.
The cell descriptor is set in such a way that the CDS_Tbytes field has
the value 48, which when sent, instructs the ACI to clear all cell payload
to the UTOPIA Bus. The cell header is cleared, creating an explicit idle
cell. The cell is put into the ACI transmit FIFO by specifying the null
connection to the EDMA. Note that using the EDMA to put the idle cell
in the transmit FIFO guarantees correct cell ordering. If a cell is put
directly into the transmit FIFO, it might get in front of other cells that are
to be built by the EDMA through processing requests already present in
the EDMA Request Queue.
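The cell layout described above can be sketched as a C structure. This is an illustration under stated assumptions: the field names CDS and CellHdr follow the listings, while cell_t, Payload, and make_idle_cell() are hypothetical names, and the descriptor value for an idle cell is passed in rather than defined here.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of the 56-byte cell structure. */
typedef struct {
    uint32_t CDS;           /* 4-byte cell descriptor            */
    uint32_t CellHdr;       /* 4-byte ATM cell header            */
    uint8_t  Payload[48];   /* 48-byte payload (name is assumed) */
} cell_t;

/* Turns a free cell location into an explicit idle cell. idle_cds stands
 * for the descriptor value that the listings call CDS_IDLE_CELL. */
void make_idle_cell(cell_t *cell, uint32_t idle_cds)
{
    cell->CDS = idle_cds;   /* descriptor selects idle-cell handling */
    cell->CellHdr = 0;      /* cleared header marks the idle cell    */
}
```

The payload is deliberately left untouched: according to the text, the descriptor instructs the ACI to clear it on the way to the UTOPIA bus.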
The ACI module may send idle cells from cell location 0 automatically
when the transmit FIFO is empty and the ACI_TxIdle bit is set in the
ACI_Ctrl register. However, this feature cannot be used if the scheduling
method described in Section 3.1.2, “FIFO Full Synchronization,” is used.
3.3.6 Race Conditions and Hazards
When there are two or more processing units using the same resource
(in this case the Scheduler), it is important to analyze the system’s
behavior for possible race conditions. Careful analysis of Figure 3.6 and
Figure 3.7 reveals that a race condition exists. Refer to Figure 3.9.
Figure 3.9  Race Conditions
(Timing diagram showing the BuffPres signal against cell slots 0 through 7,
with the events T1 and T2 marked.)
After the last cell is sent at slot 1, the BuffPres bit is cleared at time T1.
Next, a new buffer is attached at time T2 and the code from Figure 3.7
is invoked. The code attempts to reschedule a connection that is still
scheduled at time 2. This is a fatal failure and results in some lost
connections.
The solution in this case is to store an explicit flag in the ACD to signal
if the connection is currently scheduled or not. The code in Figure 3.7 is
modified as shown in Figure 3.10 to check the flag before rescheduling
the connection.
Figure 3.10  Interrupt Handler without Race Condition

void ServBuffComplQueue() {
    ulong N = EDMA_ComplQueue();
    ulong T = SCD_Now();
    if (!ACD[N].Scheduled) {      /* skip connections already in the calendar */
        SCD_Sched(N, T + 1);
        ACD[N].Scheduled = 1;
    }
}
The code from Figure 3.6 is also modified as in Figure 3.11 to reset the
flag.
Figure 3.11  Resetting the Connection Scheduled Flag

 1  #define TIME_FRAC 8
 2  #define IsInPast(T) ( ((long) (TimeNow - (T) + (1 << TIME_FRAC))) > 0)
 3  #define ConHasNoData(N) ((long) (N) > 0)
 4  #define ConHasData(N)   ((long) (N) < 0)
 5  void TxCell(ulong aCell)
 6  {
 7      ulong N, T;
 8      do {
 9          N = SCD_Serv();
10          if (ConHasNoData(N))
11              ACD[N].Scheduled = 0;
12      } while (ConHasNoData(N));
13
14      if (ConHasData(N)) {
15          EDMA_TxCell(N, aCell);
16          ACD[N].ThTxTime += ACD[N].ICG;
17          T = ACD[N].ThTxTime;
18          if ( IsInPast(T) )
19              T = TimeNow + (1 << TIME_FRAC);
20          SCD_Sched(0, T >> TIME_FRAC);
21      }
22      else {
23          pCell_t pCell = (pCell_t) &CBM[aCell];
24          pCell->CDS = CDS_IDLE_CELL;
25          pCell->CellHdr = 0;
26          EDMA_TxCell(0, aCell);
27      }
28      SCD_Tic();
29      TimeNow += 1 << TIME_FRAC;
30  }
If the Buff Completion Queue handler is invoked using interrupts, the
interrupts should be disabled immediately after entering the TxCell()
routine (after line 7). They may be re-enabled after exiting the do {} while
loop (after line 12). This precaution is necessary to avoid the race
condition due to the TxCell() routine clearing the Scheduled bit while
the ServBuffComplQueue() routine is attempting to set it.
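The effect of the Scheduled flag can be demonstrated with a small host-side mock. Nothing here touches the hardware: the array scheduled[] stands in for ACD[N].Scheduled, the counter sched_calls stands in for SCD_Sched() requests, and all function names are hypothetical.

```c
#include <stdint.h>

#define MAX_CON 16                    /* illustrative table size */

static uint8_t  scheduled[MAX_CON];   /* stand-in for ACD[N].Scheduled  */
static unsigned sched_calls;          /* counts mock SCD_Sched() calls  */

/* Models the Buff Completion Queue handler for connection n. */
void on_buffer_attached(uint32_t n)
{
    if (!scheduled[n]) {
        sched_calls++;                /* real code would call SCD_Sched() */
        scheduled[n] = 1;
    }
}

/* Models the transmit routine finding connection n without data. */
void on_serviced_without_data(uint32_t n)
{
    scheduled[n] = 0;                 /* connection has left the calendar */
}

/* Number of schedule requests issued so far. */
unsigned schedule_requests(void)
{
    return sched_calls;
}
```

Two buffer attachments in a row produce only one schedule request. A further request is issued only after the connection has been serviced without data and has therefore left the calendar.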
3.3.7 Scheduling ABR Connections
The code in Figure 3.10 and Figure 3.11 correctly handles Constant Bit
Rate (CBR) and Variable Bit Rate (VBR) connections. Of course, inverse
leaky bucket calculations would have to be performed to compute new
ICGs as described in Section 3.5.1, “PCR-Based Implementation,” but
the scheduling process is now complete.
However, the situation is slightly more complex with Available Bit Rate
(ABR) connections that may have Resource Management (RM) cells to
send even if there is no data to send. The detailed algorithm for handling
ABR connections is described in Section 3.6, “ABR Connections.” Here,
the discussion will concentrate only on the scheduling task.
The interface with the ABR specific code is handled using two functions:
• int ABR_Send(ConNum, aCell, ...)
• int ABR_Receive(ConNum, aCell, ...)
3.3.7.1 Sending a Cell from an ABR Connection
The ABR_Send() routine is called to send a cell from an ABR connection.
The routine decides if a Forward Resource Management (FRM) cell, a
Backward Resource Management (BRM) cell, or a data cell should be
sent. Next, it builds the appropriate cell in the cell buffer and sends it out.
The routine also updates the ICG in the ACD, increasing or decreasing
the connection rate according to the information present in the RM cells.
Even if the connection has no data to send, it may still have an FRM or
a BRM cell to send. This suggests a complete rewrite of the TxCell()
routine since the ConHasNoData() test can no longer be relied upon to
skip connections that do not require any processing. The new TxCell()
routine is shown in Figure 3.12.
Figure 3.12  TxCell() Routine for Multiple Class Connections

 1  #define TIME_FRAC 8
 2  #define IsInPast(T) ( ((long) (TimeNow - (T) + (1 << TIME_FRAC))) > 0)
 3  #define ConHasData(N) ((long) (N) < 0)
 4  #define ConClass(N)   ( ((N) >> 16) & 3)
 5
 6  typedef int QoS_Send_t(ulong ConNum, ulong aCell);
 7  extern QoS_Send_t CBR_Send, VBR_Send, ABR_Send, UBR_Send;
 8
 9  QoS_Send_t *QoS_Send[] = {
10      CBR_Send, VBR_Send, ABR_Send, UBR_Send };
11
12  void TxCell(ulong aCell)
13  {
14      ulong N, T;
15
16      do {
17          N = SCD_Serv();
18          if (N == 0) {
19              pCell_t pCell = (pCell_t) &CBM[aCell];
20              pCell->CDS = CDS_IDLE_CELL;
21              pCell->CellHdr = 0;
22              EDMA_TxCell(0, aCell);
23              break;
24          }
25          if (QoS_Send[ConClass(N)](N, aCell)) {
26              ACD[N].ThTxTime += ACD[N].ICG;
27              T = ACD[N].ThTxTime;
28              if ( IsInPast(T) )
29                  T = TimeNow + (1 << TIME_FRAC);
30              SCD_Sched(0, T >> TIME_FRAC);
31              break;
32          }
33          else
34              ACD[N].Scheduled = 0;
35      } while (1);
36
37      SCD_Tic();
38      TimeNow += 1 << TIME_FRAC;
39  }
40
41  int CBR_Send(ulong ConNum, ulong aCell)
42  {
43      if (ConHasData(ConNum)) {
44          EDMA_TxCell(ConNum, aCell);
45          return 1;
46      }
47      return 0;
48  }
To process different traffic classes, four routines are defined in
Figure 3.12:
• CBR_Send(...)
• VBR_Send(...)
• ABR_Send(...)
• UBR_Send(...)
Each routine returns a boolean status code indicating if the connection
should be rescheduled. Status code 0 means that the connection should
not be rescheduled, usually because there was no data to send.
The type of the CBR/VBR/ABR/UBR_Send routine is declared as
QoS_Send_t in line 7. Also an array of four pointers to QoS functions is
defined and initialized in line 10.
The code starts with the same do {} while loop at line 16 as before.
This time, however, if the Scheduler does not have a cell to send, an idle
cell is built and sent out. Otherwise, the appropriate QoS_Send function
is called to return a status code.
The CBR_Send(...) function defined in line 41 simply checks if there is
data to send, sends a cell, and returns a status code. More complex
calculations involving recomputing the ICG are necessary for VBR and
ABR connections.
If required, the connection is unscheduled (line 34) and the search
continues for other connections to service in the current cell slot by
reentering the do {} while loop. Otherwise, the new ThTxTime is
computed and the connection is rescheduled. Finally in line 37, the time
is advanced.
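The class-based dispatch can be exercised on a host with mock send routines. In the sketch below only the ConClass() macro and the QoS_Send_t function type follow the listing; the mock_* routines, dispatch(), and last_dispatched_class() are hypothetical and exist only to show which table entry is called.

```c
#include <stdint.h>

#define ConClass(N) (((N) >> 16) & 3)   /* traffic class, bits 17:16 */

typedef int QoS_Send_t(uint32_t ConNum, uint32_t aCell);

static int last_class = -1;   /* records which mock routine ran last */

/* Mock send routines: each notes its class and returns a status code
 * (1 = reschedule the connection, 0 = do not reschedule). */
static int mock_cbr(uint32_t n, uint32_t c) { (void)n; (void)c; last_class = 0; return 1; }
static int mock_vbr(uint32_t n, uint32_t c) { (void)n; (void)c; last_class = 1; return 1; }
static int mock_abr(uint32_t n, uint32_t c) { (void)n; (void)c; last_class = 2; return 1; }
static int mock_ubr(uint32_t n, uint32_t c) { (void)n; (void)c; last_class = 3; return 0; }

static QoS_Send_t *const qos_send[4] = { mock_cbr, mock_vbr, mock_abr, mock_ubr };

/* Calls the send routine selected by the class bits of n. */
int dispatch(uint32_t n, uint32_t aCell)
{
    return qos_send[ConClass(n)](n, aCell);
}

int last_dispatched_class(void)
{
    return last_class;
}
```

A value of 0x80020005 selects entry 2 of the table (the ABR routine in the listing's ordering), and 0x00030007 selects entry 3. The indirect call costs one table lookup and avoids a chain of class comparisons in the transmit path.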
3.3.7.2 Receiving an RM Cell
The ABR_Receive() routine is called when an RM cell has been received
on an ABR connection. The routine first determines if this is an FRM or
a BRM. In the case of a BRM, the routine updates the ACD by computing
the new connection rate and ICG, and then discards the cell. In the case
of an FRM, the required fields from the cell are stored in the ACD and
the cell is also discarded. However, as described in The ATM Forum
Traffic Management Specifications, v4.0, the ABR_Receive() routine may
also choose to immediately send back (turn around) the FRM cell as an
out-of-rate cell (CLP = 1).
The last requirement (turning around an FRM cell and sending it out of
rate) is rather awkward to implement. At this time, the RM cell is present
in the cell buffer. If it is put immediately in the transmit FIFO, it may result
in a priority inversion since the out-of-rate cell may take the place of a
higher priority (for example, CBR) cell in the transmit direction. Note that
you could choose to simply discard out-of-rate cells as this is an allowed
option according to the ATM Forum specifications. This is not an efficient
solution (from the network perspective); there should at least be an
attempt to send the cell out.
Another option would be to build a separate FIFO of out-of-rate cells and
keep them in the cell buffer waiting for an empty slot. This approach is
not recommended for two reasons. First, it reduces the size of the
Receive FIFO which, in turn, increases the risk of cell loss. Second,
before sending an ABR cell, you would have to scan this new FIFO to
determine if an FRM cell from the same connection is waiting there. If
this is not done, there is the risk that an out-of-rate cell may sit in the
FIFO and its data age past usefulness.
The solution adopted is a compromise. Just one out-of-rate cell is
allowed to be saved temporarily in the cell buffer. If another out-of-rate
cell has to be saved, possibly from another connection, the previous cell
is discarded. This strategy has the following advantages:
• avoids priority inversion as the cells are sent only by the sending task,
  which can correctly decide on priorities
• keeps the Receive FIFO sufficiently large
• simplifies discarding outdated RM cells as only one cell has to be
  checked
The disadvantage, of course, is that out-of-rate cells (already at their
destination) will be discarded if the destination experiences a local
congestion or is close to congestion. However, this is a small price
compared with the simplifications listed above. Also, since the out-of-rate
cells are tagged (CLP=1), they have a higher probability of being
dropped by the network. The expanded TxCell() code is shown in
Figure 3.13.
Figure 3.13  TxCell() Routine Handling Out-of-Rate Cells

 1  #define TIME_FRAC 8
 2  #define IsInPast(T) ( ((long) (TimeNow - (T) + (1 << TIME_FRAC))) > 0)
 3  #define ConHasData(N) ((long) (N) < 0)
 4  #define ConClass(N)   ( ((N) >> 16) & 3)
 5
 6  typedef int QoS_Send_t(ulong ConNum, ulong aCell);
 7  extern QoS_Send_t CBR_Send, VBR_Send, ABR_Send, UBR_Send;
 8
 9  QoS_Send_t *QoS_Send[] = {
10      CBR_Send, VBR_Send, ABR_Send, UBR_Send };
11
12  void TxCell(ulong aCell)
13  {
14      ulong N, T;
15      do {
16          N = SCD_Serv();
17          if (N == 0) {
18              if (aOutOfRateCell) {
19                  EDMA_TxCell(0, aOutOfRateCell);
20                  aOutOfRateCell = 0;
21                  ACI_Free(aCell);
22              }
23              else {
24                  pCell_t pCell = (pCell_t) &CBM[aCell];
25                  pCell->CDS = CDS_IDLE_CELL;
26                  pCell->CellHdr = 0;
27                  EDMA_TxCell(0, aCell);
28              }
29              break;
30          }
31          if (QoS_Send[ConClass(N)](N, aCell)) {
32              ACD[N].ThTxTime += ACD[N].ICG;
33              T = ACD[N].ThTxTime;
34              if ( IsInPast(T) )
35                  T = TimeNow + (1 << TIME_FRAC);
36              SCD_Sched(0, T >> TIME_FRAC);
37              break;
38          }
39          else
40              ACD[N].Scheduled = 0;
41      } while (1);
42      SCD_Tic();
43      TimeNow += 1 << TIME_FRAC;
44  }
The modified code shows that, before sending an explicit idle cell, the
routine checks (line 18) whether an out-of-rate cell is waiting for
transmission. If there is one, it is sent and the current cell location is
returned to the ACI free cell list (line 21).
Note that, after sending a regular cell (line 32), the out-of-rate cell must
be checked to see if it has become outdated. However, since this can occur
only for an ABR connection, it can be handled in the ABR_Send()
routine. The CBR_Send() routine is not changed and is not repeated in
the listing.
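The single-slot policy for out-of-rate cells can be sketched as follows. The two helper functions are hypothetical; the only convention taken from the listing is that a cell location of 0 means that no out-of-rate cell is waiting.

```c
#include <stdint.h>

static uint32_t out_of_rate_cell;   /* saved cell location, 0 = none */

/* Saves the cell at location `cell` as the pending out-of-rate cell.
 * Returns the location of the cell it displaced, which the caller must
 * return to the free list, or 0 if the slot was empty. */
uint32_t save_out_of_rate(uint32_t cell)
{
    uint32_t discarded = out_of_rate_cell;

    out_of_rate_cell = cell;
    return discarded;
}

/* Removes and returns the pending out-of-rate cell, or 0 if none. */
uint32_t take_out_of_rate(void)
{
    uint32_t cell = out_of_rate_cell;

    out_of_rate_cell = 0;
    return cell;
}
```

Because at most one location is ever held, the receive side gives up no more than one cell buffer, and the transmit side needs only a single test before building an idle cell.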
3.3.7.3 Unscheduled ABR Connections
An FRM cell may also be received when the connection in the opposite
direction is not scheduled because it has no data to send. In this case
there are several options. The simplest option would be to immediately
put the received cell in the transmit FIFO, possibly after verifying that the
current ICG is respected. However, as explained above, this may result
in priority inversion, in which the turned-around cell takes the place of a
higher-priority cell in the other direction.
The alternative, already shown in Figure 3.12, is to schedule the connection.
This method consumes slightly more resources, but the correct priorities
are respected and full ABR source behavior is implemented.
3.4 UBR Connections
The Unspecified Bit Rate (UBR) connections are usually at the lowest
priority and should be serviced when no cells from higher priority QoS
connections can be sent. Within the UBR class, connections are serviced
round robin.
The ATM Forum Traffic Management Specifications allow you the option
of defining Peak Cell Rate (PCR) for UBR connections. If link rates are
collected during the signaling phase, it makes sense to set the PCR to
the minimum link rate since cells at faster rates are discarded anyway. In
this example, it is assumed that the PCR is not enforced. However, it will
be seen that the UBR scheduling code developed here can handle PCR.
To implement the non-PCR service policy, the UBR connections are kept
in a circular list. When a connection is serviced, the list pointer is
advanced to the next connection. There are several possible
implementations of this scheduling strategy. Two of them are to:
• manage the UBR list in software
• use the Scheduler to manage the UBR list
3.4.1 Managing the UBR List in Software
Managing the UBR list in software is an easy task but it consumes CPU
resources. The natural place to put the connection list pointer is in the
NextVCD field of the VCD. Note that VCDs and ACDs are two different
data structures. The EDMA uses VCDs to maintain SAR state variables.
The software uses ACDs to maintain rate-related variables. In both the
demonstration software and in this manual, ACDs are used only for the
transmit direction. Other applications also may require building ACDs for
the receive direction.
Retrieving a connection from a UBR list requires reading the NextVCD
field and the VCD_Ctrl.BuffPres bit. Since the EDMA may modify the
BuffPres bit at any time, the VCD_Ctrl field has to be read through
uncacheable memory space. Therefore, to avoid cache thrashing, it is
probably also more efficient to read the NextVCD from the uncacheable
space. This results in two separate read operations from the secondary
memory, slowing down the APU considerably. However, for the sake of
the example, the code is developed in Figure 3.14.
Figure 3.14  Implementing a UBR Connection List

ulong UBR_List;

int UBR_Send(ulong aCell) {
    ulong Head = UBR_List;
    ulong N;

    if (Head)
        do {
            N = UBR_List;                  /* connection at the current position */
            UBR_List = VCD[N].NextVCD;     /* advance the list pointer */
            if (VCD[N].VCD_Ctrl.BuffPres) {
                EDMA_TxCell(N, aCell);
                return 1;
            }
        } while (UBR_List != Head);        /* stop after one full circle */
    return 0;
}
First, a static variable (UBR_List) is defined to hold the current pointer
of the list. Next, if the list is not empty, the connection is retrieved from
the current position and the list pointer is advanced. If the connection has
data to send, a cell is sent and status code 1 is returned, signalling that
a cell was sent. If not, then the pointer continues scanning for a full circle,
coming back to the starting position.
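The round-robin scan described above can be tried on a host with the VCD fields replaced by plain arrays. This is a sketch under assumptions: next_con[] and buff_pres[] stand in for the NextVCD field and the BuffPres bit, connection number 0 is reserved to mean "empty list", and all function names are hypothetical.

```c
#include <stdint.h>

#define MAX_CON 8                     /* illustrative table size */

static uint32_t next_con[MAX_CON];    /* stand-in for VCD[N].NextVCD  */
static uint8_t  buff_pres[MAX_CON];   /* stand-in for the BuffPres bit */
static uint32_t ubr_list;             /* current position, 0 = empty   */

/* Links connection n to `next` and records whether it has data. */
void ubr_link(uint32_t n, uint32_t next, int has_data)
{
    next_con[n] = next;
    buff_pres[n] = (uint8_t)(has_data != 0);
}

void ubr_set_position(uint32_t n)
{
    ubr_list = n;
}

/* Returns the next connection with data in round-robin order, or 0 if a
 * full circle finds none. The list position advances as it scans. */
uint32_t ubr_pick(void)
{
    uint32_t head = ubr_list;

    if (head == 0)
        return 0;
    do {
        uint32_t n = ubr_list;

        ubr_list = next_con[n];       /* advance before testing for data */
        if (buff_pres[n])
            return n;
    } while (ubr_list != head);       /* back at the start: give up */
    return 0;
}
```

With a circular list 1, 2, 3 in which only connection 3 has data, the first call returns 3 and leaves the position at connection 1. Once all three connections have data, successive calls return 1, 2, 3 in turn.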
3.4.2 Managing UBR Connections Using the Scheduler
If a fixed ICG of one cell slot is used, and all UBR connections are put
in class 3 (lowest priority), the Scheduler can manage the UBR list in
hardware.
It might seem that this approach has a drawback when the Scheduler is
configured to operate in the Priority mode, where the calendar table
holds only a head pointer to the list of connections scheduled for the slot.
To schedule a new connection, the Scheduler has to scan the list in order
to put the connection at the end of the list of connections with the same
priority. This is not a problem for other QoS traffic sources, such as CBR
or ABR, where connection rates are typically much lower than the link
rate. For those classes, the ICG makes the lists quite short because
connections to service are spread over multiple slots. For UBR, however,
all connections are at the same slot and each time one is scheduled, the
Scheduler has to scan the entire list. The list can be very long.
Fortunately, this is not a serious problem. When an ICG of one is used
for all UBR connections, the long list is built at the current cell slot and
not at the next cell slot. To see how that occurs, assume that a long list
is present at the current slot and the next slot is empty. When a
connection is serviced, it is placed at the next slot. This is a fast
operation since the next slot is empty. Since a connection was serviced,
the time is advanced and the SCD_Tic() command is executed. The
Scheduler advances to the next slot, grabs the previously scheduled
connection and appends it at the end of the list for the current slot, which
is quite long. Since the Scheduler returned to the starting point, this
process repeats forever.
The key factor to the execution efficiency is that the Scheduler keeps
both head and tail pointers of lists for all four traffic classes in the current
slot in internal registers. It means that the execution time of the
SCD_Tic() command does not depend on the number of connections
in the current slot (it does of course depend on the number of
connections in the next slot) and the number of memory accesses is
minimized.
Given these considerations, the Scheduler is used to manage the UBR
connection list. The pointer to the UBR_Send() routine is already
installed in the QoS_Send array (line 10 of Figure 3.13), so the only task
remaining is to define the routine itself. See Figure 3.15.
Figure 3.15  Managing UBR Lists with the Scheduler

int UBR_Send(ulong ConNum, ulong aCell)
{
    if (ConHasData(ConNum)) {
        EDMA_TxCell(ConNum, aCell);
        return 1;
    }
    return 0;
}
If the ICG of UBR connections is set to zero, the code at line 34 of
Figure 3.13 takes care of rescheduling the connection to the next slot.
In fact, looking carefully at the UBR_Send() routine, it can be seen that
it is exactly the same as the CBR_Send() routine. The only differences
are the class and the ICG value. To save instruction memory, just one of
them may be used as shown in Figure 3.16.
Figure 3.16  UBR_Send and CBR_Send Combined

#define UBR_Send CBR_Send

QoS_Send_t *QoS_Send[] = {
    CBR_Send, VBR_Send, ABR_Send, UBR_Send };
Now it is easy to see that if a PCR enforcement is required for a UBR
connection, the ICG is simply set to 1/PCR instead of zero.
3.5 VBR Connections
The VBR connections usually are scheduled using timers running at
connection PCRs. Many existing SAR devices have a finite set of timers
and a connection rate has to be approximated by the closest timer rate.
Other SARs use a bandwidth table approach where connections to be
serviced are kept in a table with each entry corresponding to a cell slot.
Bandwidth tables are primitive versions of the Scheduler calendar table
with two important differences:
• Bandwidth tables are static. They are set at initialization time and not
  modified during run time. The calendar table is created dynamically
  as the connections are rescheduled.
• Each bandwidth table entry holds just one connection, while each calendar
  table entry holds a list of connections. Since bandwidth tables are static,
  any conflicts are resolved at initialization. A conflict is a situation where
  more than one connection should be serviced in one cell slot. The
  calendar table resolves conflicts by delaying connection servicing
  during run time.
3.5.1 PCR-Based Implementation
The Scheduler module in the ATMizer II+ chip allows very easy
implementation of VBR scheduling. A connection ICG is set to 1/PCR.
When a connection is to be serviced, a leaky bucket test is performed.
If the test is positive, a cell is sent. Otherwise, another connection is
tested. The leaky bucket test can be implemented as shown in
Figure 3.17.
Figure 3.17 A Leaky Bucket Routine
 1  int LeakyBucket(ulong N) {
 2      long X = ACD[N].Bucket;
 3      X -= TimeNow - ACD[N].LastComplTime;
 4      if (X <= ACD[N].Limit) {
 5          if (X < 0)
 6              X = 0;
 7          ACD[N].Bucket = X + ACD[N].Increment;
 8          ACD[N].LastComplTime = TimeNow;
 9          return 1;
10      }
11      return 0;
12  }
This routine is called to check if connection N (passed as a parameter)
is allowed to send a cell. First (line 2), the current contents of the
connection bucket are assigned to a temporary variable. Next (line 3),
the time elapsed since the last cell was sent from this connection is
subtracted from the bucket. If the resulting value does not exceed the
limit, a cell is sent, the bucket (clamped at zero) is incremented, and
the routine returns 1 to signal that the test is positive.
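As a worked illustration, the following self-contained sketch applies the same test to a single connection. The Conn structure and the explicit now argument are assumptions made for this example; they stand in for the ACD entry and the global TimeNow used in Figure 3.17.

```c
#include <assert.h>

/* Hypothetical stand-in for the ACD fields used by Figure 3.17. */
typedef struct {
    long Bucket;         /* current bucket level, in cell slots   */
    long Limit;          /* burst tolerance, in cell slots        */
    long Increment;      /* 1/SCR, in cell slots                  */
    long LastComplTime;  /* time of the last conforming cell      */
} Conn;

/* Returns 1 if a cell may be sent at time now, 0 otherwise. */
int LeakyBucketTest(Conn *c, long now)
{
    long x;

    assert(c != 0);
    x = c->Bucket - (now - c->LastComplTime);  /* drain since last cell */
    if (x <= c->Limit) {
        if (x < 0)
            x = 0;                             /* bucket never goes negative */
        c->Bucket = x + c->Increment;
        c->LastComplTime = now;
        return 1;
    }
    return 0;
}
```

With Limit = 6 and Increment = 4 slots, and the test run once per slot starting from an empty bucket at t = 0, cells pass at t = 0, 1, 2, 6, and 10: a short burst at the peak rate followed by one cell every four slots.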
This algorithm has to be executed at every PCR event to make sure that
the cell under consideration is conforming. This is inefficient in the
following circumstances:
1. If the data is being sent into the network at a rate close to the
Sustainable Cell Rate (SCR), which is the case under steady
conditions, then the leaky bucket test finds X > Limit at many of the
PCR events. In other words, the algorithm must be invoked about
PCR/SCR times to send one cell, leading to inefficient utilization of
resources.
2. If an application temporarily has no data to send, the leaky bucket
algorithm must continue to run at every PCR event, which wastes
cycles.
The second problem can be resolved easily by suspending the
scheduling of connections with no data to send, much the same as was
done for CBR connections. To resolve the first problem, the leaky bucket
algorithm must be revised.
3.5.2 SCR-Based Implementation
To avoid these repeated invocations, every time a cell is successfully
transmitted, the earliest time at which the next conforming cell can be
transmitted is computed. Then, instead of executing the leaky bucket
algorithm at every PCR event, the APU can wait until the earliest
conforming time is reached to transmit the cell.
Before the APU code is developed, first look at what happens when a
leaky bucket algorithm is used to check whether a received cell is
conforming. A cell is conforming if:
Equation 3.3
Bucket - (ArrivalTime - LastComplTime) <= Limit
In addition, each time a conforming cell is received, the bucket is updated
as follows:
Equation 3.4
Bucket -= (ArrivalTime - LastComplTime)
Bucket += Increment
Therefore, to compute the earliest time, T, when a conforming cell may
be sent, it is necessary to solve the following equation for T:
Equation 3.5
Bucket - (T - TimeNow) = Limit
Equation 3.6
T = Bucket + TimeNow - Limit
Since the ICG is the difference between current time and the next cell
transmission time, it is computed easily as:
Equation 3.7
ICG = Bucket - Limit
Note that TimeNow is used in the above equations instead of ThTxTime
as for other classes of service. The leaky bucket calculations
automatically adjust for the lag.
Figure 3.18 is an enhanced version of Figure 3.17. If there is data to be
sent, the code first updates (line 4) the contents of the bucket. The
LastComplTime can be computed easily as the difference between
ThTxTime and ICG. The code then computes the new ICG (line 6)
corresponding to the earliest time a conforming cell from the same
connection can be sent. The newly computed ICG is compared to 1/PCR
(line 7) to avoid sending cells at rates exceeding PCR. This is a crude
way to perform this check; a more elaborate way would be to execute a
second leaky bucket calculation for PCR conformance.
Note that ThTxTime is updated to the current time (line 8) so that the
next cell is scheduled after the current ICG.
Figure 3.18 An SCR-Based Leaky Bucket Algorithm
 1  int VBR_Send(ulong N, ulong aCell) {
 2      if (ConHasData(N)) {
 3          N = ConConNum(N);
 4          ACD[N].Bucket -= TimeNow - (ACD[N].ThTxTime - ACD[N].ICG);
 5          ACD[N].Bucket = maxi(0, ACD[N].Bucket) + ACD[N].Increment;
 6          ACD[N].ICG = ACD[N].Bucket - ACD[N].Limit;
 7          ACD[N].ICG = maxi(ACD[N].ICG_PCR, ACD[N].ICG);
 8          ACD[N].ThTxTime = TimeNow;
 9          return 1;
10      }
11      return 0;
12  }
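The effect of the SCR-based calculation can be checked with a small standalone sketch. The VbrConn structure, the maxl() helper, and the choice to fold the rescheduling step (ThTxTime advanced by the new ICG) into the same function are assumptions of this example, not part of the listing above.

```c
#include <assert.h>

/* Hypothetical stand-in for the ACD fields used by Figure 3.18. */
typedef struct {
    long Bucket;     /* bucket level after the last cell, in slots */
    long Limit;      /* burst tolerance, in slots                  */
    long Increment;  /* 1/SCR, in slots                            */
    long IcgPcr;     /* 1/PCR, in slots                            */
    long Icg;        /* gap chosen after the last cell             */
    long ThTxTime;   /* time the current cell was scheduled for    */
} VbrConn;

static long maxl(long a, long b) { return a > b ? a : b; }

/* Accounts for a cell sent at time now; returns the time of the next one. */
long VbrNextTime(VbrConn *c, long now)
{
    assert(c != 0);
    /* ThTxTime - Icg is the time the previous cell was sent. */
    c->Bucket -= now - (c->ThTxTime - c->Icg);
    c->Bucket  = maxl(0, c->Bucket) + c->Increment;
    /* Earliest conforming time, but never faster than PCR. */
    c->Icg = maxl(c->IcgPcr, c->Bucket - c->Limit);
    /* Assumed rescheduling step: next slot is the current time plus ICG. */
    c->ThTxTime = now + c->Icg;
    return c->ThTxTime;
}
```

Starting from an empty bucket with Limit = 6, Increment = 4, and 1/PCR = 1, and assuming each cell is sent exactly at its scheduled time, the routine schedules cells at t = 0, 1, 2, 6, and 10. This is the same pattern a per-slot leaky bucket test would produce, obtained with one invocation per transmitted cell.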
3.6 ABR Connections
For Available Bit Rate (ABR) service, the behavior of an end system is
governed by a set of rules for both source and destination end systems.
The rules are defined in The ATM Forum Traffic Management
Specifications, v4.0 for rate-based flow control. The basic idea of
rate-based flow control is to send special cells at regular intervals. These
special cells, called Resource Management (RM) cells, are used to probe
the state of the network. As the RM cells travel through the network,
following the same route as the data cells, ATM switches may change
their contents. When an RM cell arrives at its destination, it is turned
around and sent back to the source following the same route in the
opposite direction. Finally, when the RM cell returns to the source, the
source has to modify the connection rate based on the information
inserted in the RM cell by the switches along the connection route.
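The final step, adjusting the rate when an RM cell returns, can be sketched as follows. This is a simplified reading of the source rules in the Traffic Management Specifications, not the ABR_Receive() code of Section 3.8. The AbrSrc structure and the function name are hypothetical; CI, NI, and ER are the congestion indication, no-increase, and explicit rate fields of the RM cell, and RIF and RDF are the rate increase and decrease factors negotiated for the connection.

```c
#include <assert.h>

/* Hypothetical per-connection ABR source state; rates in cells/s. */
typedef struct {
    double Acr;  /* Allowed Cell Rate    */
    double Pcr;  /* Peak Cell Rate       */
    double Mcr;  /* Minimum Cell Rate    */
    double Rif;  /* Rate Increase Factor */
    double Rdf;  /* Rate Decrease Factor */
} AbrSrc;

/* Adjusts ACR from the CI, NI, and ER fields of a returned RM cell. */
void AbrOnBackwardRm(AbrSrc *s, int ci, int ni, double er)
{
    assert(s != 0);
    if (ci)
        s->Acr -= s->Acr * s->Rdf;         /* congestion: decrease      */
    else if (!ni)
        s->Acr += s->Rif * s->Pcr;         /* increase is permitted     */
    if (s->Acr > s->Pcr) s->Acr = s->Pcr;  /* never exceed PCR          */
    if (s->Acr > er)     s->Acr = er;      /* never exceed explicit rate */
    if (s->Acr < s->Mcr) s->Acr = s->Mcr;  /* never drop below MCR      */
}
```

The ordering of the clamps means that MCR takes precedence over ER, so a switch cannot push the connection below its guaranteed minimum.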
Since the source and destination rules, and the associated pseudocode,
are clearly described in the ATM Forum specifications, it is not necessary
to build the ABR code progressively as was done for the preceding
cases. Instead, fully commented C code implementing the ABR_Send()
and the ABR_Receive() functions is given in Section 3.8, “Source Code
Listings.” The C code is quite faithful to the pseudocode given in the
Traffic Management Specifications except for the following differences:
•   Variable name choices in the ATM Forum pseudocode are quite poor.
    To make the code easier to read, the following variable names are
    used:
    –   Count is replaced by InRateCell
    –   Turn-around is replaced by PresBRM
    –   First-turn is replaced by LastWasFRM
    –   Unack is replaced by FRM_sinceBRM
•   The following ABR parameters are set to the constant values
    specified below:
    –   Nrm = 32
    –   Trm = 100 ms
    –   Mrm = 2
•   There is no support for out-of-rate Forward Resource Management
    (FRM) cells. Source behavior No. 11 specifies that FRM cells may be
    sent out of rate at a rate not exceeding TCR (Tagged Cell Rate).
    The pseudocode of the Traffic Management Specifications, v4.0,
    Appendix I, implements this behavior by sending out-of-rate FRM
    cells only if the Allowed Cell Rate (ACR) is below TCR. This may
    not be the best course of action since it stops any data traffic.
    Moreover, it requires that a separate data structure be used to
    schedule the out-of-rate FRM cells, with its accompanying cost in
    memory usage. In light of this, the advantages of supporting
    out-of-rate FRM cells at TCR (TCR = 10 cells/s) were judged not
    worth the cost, and they are not supported in this implementation.
•   Section 3.8.9, “Transmit and Receive ABR Cells (ABR.c),” contains
    the code that discards out-of-date BRM cells waiting for
    transmission. This is also a deviation from the ATM Forum
    pseudocode. A justification for the deviation was given in Section
    3.3.7.2, “Receiving an RM Cell.”
•   Section I.7 of Appendix I of the Traffic Management Specifications
    has a detailed discussion of the options available for turning around
    FRM cells at the destination. There are five distinct implementations
that maintain compliance with the Traffic Management Specifications,
namely:
1. The newly arrived cell is sent as an out-of-rate BRM cell in
addition to being scheduled for in-rate transmission.
2. The old cell is sent as an out-of-rate BRM cell and the
newly arrived cell is scheduled for in-rate transmission.
3. The newly arrived cell is scheduled for in-rate transmission
and the old cell is dropped.
4. Two copies of the newly arrived cell are scheduled for
in-rate transmission.
5. Both the old cell and the newly arrived cell are scheduled
for in-rate transmission.
The implementation described in Section 3.3.7.2, “Receiving an RM
Cell,” does not strictly fit in any of these five categories. It lies somewhere
between options 1 and 3. If the link is lightly loaded, then the
implementation approaches the behavior of option 1. If the link is heavily
loaded, it approaches the behavior of option 3. The discussion in the
Traffic Management Specifications, Section I.7.1 of Appendix I
discourages the use of option 3, since the analysis given there leads to
the conclusion that the rate of BRM cells would be much lower than that
of FRM cells and lead to a decrease in responsiveness. However, this
analysis is believed to be inaccurate, particularly for the case when
ACRbck = 0. The text claims that there will be no flow of BRM cells at all
under option 3. As explained in the next bullet, this is not the case in our
implementation. When ACRbck = 0, the rate of in-rate BRM cells should
approach 1/32 of the rate of FRM cells. Under this condition, option 3 still
leads to more acceptable performance, even in the presence of heavy
link traffic.
An aspect of destination behavior that is not very clear from the
pseudocode given in Appendix I of the Traffic Management
Specifications is handling the FRM cells that arrive when the connection
is not scheduled for transmission. If the connection is immediately
scheduled for transmission for the next cell slot, as recommended, this
may lead to the flow of in-rate BRM cells at a rate that cannot be policed
by the return path congestion-control mechanisms. To avoid this, invoke
the connection rescheduling rules described in Section 3.3.5.1,
“Connection Rescheduling.” Under those rules, the connection is
rescheduled for the next transmission (even though there may be no data
to send) and, if another FRM cell arrives before the next transmission
instant, it awaits its turn so that it can be sent in-rate.
To optimize the cache performance of the architecture, the contents of
the VCD must fit within 32 bytes since this also happens to be the size
of the cache line in the 4010 RISC processor. The code in Section 3.8.9,
“Transmit and Receive ABR Cells (ABR.c)” contains the details of how
this is achieved.
3.7 Local Congestion
Local congestion is the situation in which the sum of the active
connection rates exceeds the output link bandwidth. Local congestion may
be transient, causing a small buildup of connection lists in the calendar
table, or of long duration, causing significant delays in connection
service times. In the presence of local congestion, some connections are
serviced at rates lower than requested. This section analyzes system
behavior in the presence of local congestion.
3.7.1 Fairness
Normally, all connections are serviced according to their rates. In the
presence of congestion, some connections have their actual rates
reduced. It is important that the rate reductions satisfy the following
requirements:
•   Priorities are respected. Rates of lower priority connections are
    reduced, eventually to zero, before the rates of higher priority
    connections are reduced.
•   Rates are reduced in a fair manner. One example of a fairness
    criterion is the maximum-minimum fairness defined in the Traffic
    Management Specifications.
To verify that the scheduling algorithm described in this chapter satisfies
the above requirements, a series of simulations was performed. An
important result of these simulations is that priorities are respected and
maximum-minimum fairness is achieved. One set of simulations with the
achieved results is described in the following paragraphs.
In this set of simulations, there are two constant sets of connections
assigned to class 0 and 1. Each set is composed of ten connections and
uses 30% of the link bandwidth. The third set is composed of ten
connections belonging to class 2 and is requesting an increasing share
of the bandwidth. The normalized aggregate rates of the class 2
connections are set to 0, 0.3, 0.4, 0.5, 0.6, 0.9, and 1.2. The total
requested link utilization is thus 0.6, 0.9, 1.0, 1.1, 1.2, 1.5, and 1.8. Since
the actual link utilization cannot exceed 1.0, the actual rates of the class
2 connections are decreased by the system, while classes 0 and 1 are
unaffected.
Table 3.2 shows the results of the simulations for classes 0 and 1. The
column Actual is the measured connection rate, the column Requested is
the requested connection rate, and the column Error is their relative
difference, (Requested - Actual) / Requested. For reference, the
requested ICG (the inverse of the requested rate) is shown in the last
column.
Table 3.2   Simulation Results for Class 0 and 1 Connections

ConNum  Class  Actual  Requested    Error   Req. ICG
------  -----  ------  ---------  --------  --------
   1      0    0.0080    0.0071   -12.68%    140.85
   2      0    0.0100    0.0096    -4.17%    104.17
   3      0    0.0510    0.0511     0.20%     19.57
   4      0    0.0410    0.0404    -1.49%     24.75
   5      0    0.0410    0.0404    -1.49%     24.75
   6      0    0.0400    0.0392    -2.04%     25.51
   7      0    0.0150    0.0141    -6.38%     70.92
   8      0    0.0250    0.0250     0.00%     40.00
   9      0    0.0290    0.0289    -0.35%     34.60
  10      0    0.0440    0.0442     0.45%     22.62
  11      1    0.0130    0.0123    -5.69%     81.30
  12      1    0.0350    0.0347    -0.86%     28.82
  13      1    0.0060    0.0051   -17.65%    196.08
  14      1    0.0190    0.0183    -3.83%     54.64
  15      1    0.0380    0.0381     0.26%     26.25
  16      1    0.0190    0.0187    -1.60%     53.48
  17      1    0.0390    0.0397     1.76%     25.19
  18      1    0.0570    0.0573     0.52%     17.45
  19      1    0.0280    0.0284     1.41%     35.21
  20      1    0.0470    0.0475     1.05%     21.05
The differences between requested and actual rates are small.
Connection 13 has a high error which is actually a simulation artifact due
to the short simulation time (1000 slots compared to an ICG of 196
slots). With that exception, all connections are serviced at actual rates
close to the requested rates in spite of a local congestion experienced
by lower priority connections in class 2.
The situation is different for class 2 connections that oversubscribe the
link. See Table 3.3.
Table 3.3   Simulation Results for Class 2 Connections

                     Set 0.9                               Set 1.0
ConNum  Actual   Req.     Error   Req. ICG    Actual   Req.     Error   Req. ICG
------  ------  ------  --------  --------    ------  ------  --------  --------
  21    0.0250  0.0248    -0.81%     40.32    0.0340  0.0348     2.30%     28.74
  22    0.0200  0.0204     1.96%     49.02    0.0450  0.0464     3.02%     21.55
  23    0.0280  0.0279    -0.36%     35.84    0.0340  0.0351     3.13%     28.49
  24    0.0490  0.0498     1.61%     20.08    0.0310  0.0319     2.82%     31.35
  25    0.0300  0.0303     0.99%     33.00    0.0420  0.0429     2.10%     23.31
  26    0.0440  0.0451     2.44%     22.17    0.0590  0.0606     2.64%     16.50
  27    0.0120  0.0115    -4.35%     86.96    0.0510  0.0530     3.77%     18.87
  28    0.0510  0.0528     3.41%     18.94    0.0030  0.0026   -15.38%    384.62
  29    0.0150  0.0155     3.23%     64.52    0.0300  0.0311     3.54%     32.15
  30    0.0210  0.0218     3.67%     45.87    0.0600  0.0617     2.76%     16.21

                     Set 1.1                               Set 1.2
ConNum  Actual   Req.     Error   Req. ICG    Actual   Req.     Error   Req. ICG
------  ------  ------  --------  --------    ------  ------  --------  --------
  21    0.0550  0.0769    28.48%     13.00    0.0530  0.0658    19.45%     15.20
  22    0.0540  0.0617    12.48%     16.21    0.0530  0.0571     7.18%     17.51
  23    0.0550  0.0806    31.76%     12.41    0.0060  0.0058    -3.45%    172.41
  24    0.0020  0.0015   -33.33%    666.67    0.0550  0.1022    46.18%      9.78
  25    0.0340  0.0345     1.45%     28.99    0.0540  0.0846    36.17%     11.82
  26    0.0540  0.0741    27.13%     13.50    0.0040  0.0041     2.44%    243.90
  27    0.0200  0.0206     2.91%     48.54    0.0100  0.0103     2.91%     97.09
  28    0.0550  0.0827    33.49%     12.09    0.0530  0.0637    16.80%     15.70
  29    0.0400  0.0411     2.68%     24.33    0.0540  0.1303    58.56%      7.67
  30    0.0250  0.0262     4.58%     38.17    0.0530  0.0761    30.35%     13.14

                     Set 1.5                               Set 1.8
ConNum  Actual   Req.     Error   Req. ICG    Actual   Req.     Error   Req. ICG
------  ------  ------  --------  --------    ------  ------  --------  --------
  21    0.0550  0.1361    59.59%      7.35    0.0110  0.0107    -2.80%     93.46
  22    0.0550  0.1892    70.93%      5.29    0.0480  0.1597    69.94%      6.26
  23    0.0450  0.0467     3.64%     21.41    0.0470  0.1027    54.24%      9.74
  24    0.0540  0.1793    69.88%      5.58    0.0100  0.0095    -5.26%    105.26
  25    0.0540  0.1692    68.09%      5.91    0.0470  0.1766    73.39%      5.66
  26    0.0260  0.0270     3.70%     37.04    0.0470  0.0887    47.01%     11.27
  27    0.0540  0.0993    45.62%     10.07    0.0470  0.1918    75.50%      5.21
  28    0.0110  0.0111     0.90%     90.09    0.0460  0.0790    41.77%     12.66
  29    0.0200  0.0200     0.00%     50.00    0.0460  0.2087    77.96%      4.79
  30    0.0210  0.0220     4.55%     45.45    0.0460  0.1727    73.36%      5.79
When the link capacity is not exceeded (sets 0.9 and 1.0 in the table),
class 2 connections are scheduled at rates close to those requested.
However, in the presence of local congestion (sets 1.1 and above), rates
of some connections are reduced.
It is interesting to analyze how the connection rates are reduced. The
aggregate link bandwidth available for class 2 connections is 0.4 and,
since there are ten class 2 connections, the fair share of the link is 0.04
per connection. All connections with rates below the fair share, which are
called conforming connections, are satisfied at their requested rates. The
rates of all conforming connections are then subtracted from the
available link bandwidth (0.4 in this example). The nonconforming
connections equally share the result of the subtraction.
A connection is conforming if and only if:
Equation 3.8
\[ r \le S = \frac{L}{N_C} \]
where   r     is the actual rate of the connection
        S     is the fair share
        L     is the link’s total bandwidth
and     N_C   is the total number of connections on the link
The actual rate of a nonconforming connection, j, may be calculated as
follows:
Equation 3.9
\[ r_j = \frac{L - \sum_{i \in K} r_i}{N_C - N_K} \]
where   r_j   is the actual rate of nonconforming connection j
        r_i   is the actual rate of connection i
        K     is the set of all conforming connections
and     N_K   is the number of conforming connections
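The two equations can be turned into a short calculation. The sketch below is illustrative only: the function name and array interface are hypothetical, and it applies the conformance test repeatedly to the bandwidth that remains, which is the usual way of computing a maximum-minimum fair allocation. That repetition is an assumption of this example rather than something stated by Equation 3.8.

```c
#include <assert.h>

/*
 * Fills actual[] with the rate each of n connections receives when their
 * requested rates req[] share a bandwidth of link.
 */
void FairShare(const double *req, double *actual, int n, double link)
{
    int i, left = n, changed = 1;

    assert(req != 0 && actual != 0 && n > 0);
    for (i = 0; i < n; i++)
        actual[i] = -1.0;                    /* -1 marks "not yet decided"   */

    while (changed && left > 0) {
        double share = link / left;          /* Equation 3.8 on what remains */
        changed = 0;
        for (i = 0; i < n; i++) {
            if (actual[i] < 0.0 && req[i] <= share) {
                actual[i] = req[i];          /* conforming: fully satisfied  */
                link -= req[i];
                left--;
                changed = 1;
            }
        }
    }
    for (i = 0; i < n; i++)
        if (actual[i] < 0.0)
            actual[i] = link / left;         /* Equation 3.9                 */
}
```

Applied to the requested rates of set 1.1 in Table 3.3 with 0.4 of the link available, the calculation satisfies connections 24, 25, 27, 29, and 30 in full and gives about 0.0552 to each of connections 21, 22, 23, 26, and 28, which is close to the simulated rates of 0.054 to 0.055.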
3.7.2 List Lengths
Another area of concern in the presence of local congestion is the length
of the calendar lists, particularly in Priority mode. Intuitively, one may
think that the lists become longer, slowing down Scheduler operations.
Since the Scheduler has to scan the lists in Priority mode, the execution
time is proportional to the average list length.
Fortunately, in this case, the intuition is wrong. In fact, the list lengths
average less than one, except for the current cell slot. To understand how
this works, first consider system behavior in the absence of local
congestion and assume that the number of active connections does not
change. This is equivalent to saying that all scheduled connections have
infinite buffers of data to send. When the link utilization is less than 1.0,
some cell slots are empty and the calendar table is sparse. As the link
utilization increases, more and more slots are nonempty. When the link
utilization exceeds 1.0 and the system enters a local congestion state,
the list of connections at the current slot starts to grow, creating a wave
propagating throughout the calendar table. Slots in front of the wave have
the list length decreasing while the list length at the current slot
increases.
Simulations described in the previous section were used to extract the
average length of connection lists at the current cell slot and the next
nine cell slots. As shown in Table 3.4, in the absence of local congestion,
the list length at the current slot approaches 1.0 while subsequent slots
have lengths proportional to the sum of all rates, called the actual link
rate. When the actual link rate increases to the maximum link rate (1.0),
the average list length also increases. In the presence of local
congestion, the list length of the current slot increases further, while the
list lengths at subsequent slots decrease.
Table 3.4   Calendar List Length for Varying Link Utilizations

Link  Now+0  Now+1  Now+2  Now+3  Now+4  Now+5  Now+6  Now+7  Now+8  Now+9
----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
0.6    0.95   0.61   0.61   0.61   0.61   0.60   0.60   0.60   0.60   0.60
0.9    2.39   0.91   0.90   0.90   0.90   0.90   0.90   0.90   0.90   0.89
1.0    5.13   0.99   0.98   0.97   0.95   0.94   0.92   0.90   0.87   0.86
1.1    7.62   0.76   0.75   0.75   0.75   0.74   0.73   0.72   0.72   0.71
1.2    7.98   0.66   0.65   0.65   0.65   0.65   0.64   0.64   0.64   0.64
1.5    7.96   0.73   0.73   0.72   0.71   0.71   0.71   0.70   0.70   0.69
1.8    9.15   0.64   0.64   0.64   0.64   0.64   0.63   0.63   0.63   0.63
2.1   10.10   0.64   0.64   0.64   0.63   0.63   0.63   0.63   0.63   0.63
3.7.3 Detecting a Local Congestion
In some applications, it may be necessary to detect local congestion.
This is achieved easily by monitoring the connection lag, that is, the
difference between the Theoretical Transmit Time stored in the ACD and
the actual time. If the difference exceeds a threshold, the local
congestion state is declared.
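A minimal sketch of such a check follows. The AcdEntry structure, the function name, and the threshold parameter are assumptions made for illustration; the actual ACD layout is defined in the header files of Section 3.8.

```c
#include <assert.h>

typedef unsigned long ulong;

/* Hypothetical subset of an ACD entry. */
typedef struct {
    ulong ThTxTime;   /* Theoretical Transmit Time, in cell slots */
} AcdEntry;

/*
 * Returns 1 when the connection is being serviced more than threshold
 * cell slots later than its Theoretical Transmit Time.
 */
int IsLocallyCongested(const AcdEntry *acd, ulong timeNow, ulong threshold)
{
    long lag;

    assert(acd != 0);
    /* Signed difference, so a connection scheduled in the future
     * (or a wrapped time counter) yields a negative lag. */
    lag = (long)(timeNow - acd->ThTxTime);
    return lag > (long)threshold;
}
```

The threshold would normally be chosen larger than the lag that ordinary calendar conflicts produce, so that transient list buildup is not reported as congestion.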
3.7.4 Minimum Cell Rate Guarantees
One of the ABR parameters is the Minimum Cell Rate (MCR). An ABR
connection is guaranteed to be serviced at no less than the MCR even in
the presence of network congestion. The guarantee is possible because
ATM is connection-oriented: during call setup, the network verifies that
enough bandwidth is available. If the network cannot satisfy MCR, the
new call is rejected.
An ATM end system might also perform a similar verification to ensure
that the sum of all MCRs does not exceed the outgoing link bandwidth
minus the rates of higher priority connections. Alternatively, this check
may be performed by an ingress switch.
Even if the call setup verifications are performed, they are not
sufficient to guarantee MCR during run time in the presence of local
congestion. To understand why, group connections into two sets: set K
with connections that have MCR > 0 and set M with connections that have
MCR = 0. In the absence of local congestion, MCRs for set K are
respected due to the appropriate rules in the Traffic Management
Specifications. In the presence of local congestion, it is preferable that
the rates of all nonconforming connections (as defined in Section 3.7.1,
“Fairness”) be decreased in a fair manner unless this decrease results
in a rate lower than MCR, in which case the rate should be set to MCR.
However, the actual rates are limited by Equation 3.9 on page 3-35,
which does not take into account MCR. In other words, connections with
MCR = 0 take outgoing link bandwidth from connections with MCR > 0,
resulting in MCR violations.
One possible solution to this problem is to put set K (connections with
MCR > 0) in a separate class with higher priority than set M (connections
with MCR = 0). Although simple, this solution is unfair to set M: the
algorithm would always satisfy set K in full and reduce only the rates
of set M, while the desired behavior is to reduce the rates of both sets
K and M unless the decrease results in a rate lower than MCR.
A more complex solution is to dynamically switch connections between
two priority classes in run time. For that to work, the Real Cell Rate must
be measured. The Real Cell Rate (RCR) is defined as the rate observed
on the outgoing link as opposed to the Allowed Cell Rate (ACR) which is
governed by ABR source and destination rules. The measurement may
be performed by computing inverses of real ICGs and averaging them
over some time. If the Real Cell Rate drops below MCR, the connection
is moved to a higher priority class. It returns to a lower priority class
when the Real Cell Rate exceeds MCR. Sufficient hysteresis should be
introduced to avoid oscillations.
This manual does not contain the C code necessary to implement the
MCR guarantees.
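For orientation only, the measurement and the hysteresis rule can be sketched as shown below. Every name and constant here is hypothetical, and floating point is used for readability; an APU implementation would more likely use fixed-point arithmetic.

```c
#include <assert.h>

/* Hypothetical per-connection state for the MCR guard. */
typedef struct {
    double Rcr;       /* smoothed Real Cell Rate, cells per slot       */
    double Mcr;       /* Minimum Cell Rate, cells per slot             */
    int    HighPrio;  /* 1 while the connection is in the higher class */
} McrGuard;

#define RCR_GAIN   0.25   /* assumed averaging weight                 */
#define MCR_MARGIN 1.10   /* assumed hysteresis: demote at MCR + 10%  */

/*
 * Call after each transmitted cell with the measured gap in cell slots.
 * Returns 1 if the connection should be in the higher priority class.
 */
int McrGuardUpdate(McrGuard *g, double realIcg)
{
    assert(g != 0 && realIcg > 0.0);
    /* Running average of the inverse of the real ICG. */
    g->Rcr += RCR_GAIN * (1.0 / realIcg - g->Rcr);
    if (!g->HighPrio && g->Rcr < g->Mcr)
        g->HighPrio = 1;                      /* below MCR: promote   */
    else if (g->HighPrio && g->Rcr > g->Mcr * MCR_MARGIN)
        g->HighPrio = 0;                      /* recovered: demote    */
    return g->HighPrio;
}
```

The gap between the promotion threshold (MCR) and the demotion threshold (MCR times the margin) is what keeps a connection hovering near MCR from oscillating between the two classes.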
3.7.5 MultiPHY Operation
The code described in the previous sections is well suited for a single
PHY environment. It can also be used for multiPHY applications without
ABR, where local congestion may be avoided by rejecting calls that would
result in congestion. Since this is inherently impossible with ABR (there
is no notion of average rate), a multiPHY environment with ABR requires
enhancements to the basic code.
When a FIFO full strategy is used to pace the invocation of the TxCell()
scheduling routine (see Section 3.1.2, “FIFO Full Synchronization”), the
ATMizer II+ chip tries to send cells to a PHY FIFO as quickly as possible.
When the PHY FIFO becomes full, it paces down the ACI TxFIFO which,
in turn, paces down invocation of the scheduling routine. Recall that each
invocation of TxCell() results in one cell placed in the ACI TxFIFO.
Consider a situation where there are two outgoing links, A and B. Link A
is oversubscribed and link B is not. Since there is only one calendar, the
scheduling algorithm puts more cells in the TxFIFO for link A than for link
B. This, in turn, results in a head-of-line blocking and under-utilization of
link B. Note that it is not good enough to have a separate TxFIFO for
each PHY device to avoid this problem. The root cause is that there is
only one calendar.
To avoid the problem, one calendar table is needed per PHY device. If
all PHY devices are synchronized to the same network clock, they can
be served in a round-robin way, invoking the TxCell() routine with a
different calendar each time. The CalSwitch command can be used to
change the calendar table of the ATMizer II+ hardware Scheduler. The
Scheduler modifies the internal pointers such that all subsequent
commands are performed on the new calendar.
The TxCell() routine needs to be encapsulated in a wrapper routine,
SendCell(), that issues the CalSwitch command as shown in
Figure 3.19.
Figure 3.19 A MultiPHY TxCell()
ushort aCell[PHY_NUM];      /* free cell addresses */
ulong  SCD_Ctrl[PHY_NUM];   /* SCD_Ctrl registers for each calendar */
ushort CurrentPHY;          /* current index into above tables */

int SendCell()
{
    int i;
    ulong SaveTime = TimeNow;

    do {
        aCell[CurrentPHY] = ACI_Free();
        if (aCell[CurrentPHY] == 0)
            return 0;
    } while (++CurrentPHY < PHY_NUM);

    for (i = 0; i < PHY_NUM; i++) {
        TimeNow = SaveTime;
        /* Cal_Switch() is a macro that issues CalSwitch Command */
        Cal_Switch(i);
        TxCell(aCell[i]);
    }
    CurrentPHY = 0;
    return 1;
}
The code first tries to acquire one free cell location per PHY device and
to store it in the aCell array. If this is not possible because the ACI
TxFIFO does not have enough free locations, the routine exits with a
failure status code. The next invocation then continues free cell
acquisition at the place where the previous invocation failed. When
enough free cells are acquired, the code invokes the TxCell() routine
once per PHY device, each time modifying the calendar base address.
The current time must be preserved as each invocation of TxCell()
increments it.
In high-speed applications (for example three DS-3 lines), the memory
bandwidth requirements may be reduced if multiple cells for each PHY
are built before the calendar table is changed. The code in Figure 3.20
implements this strategy.
Figure 3.20 Enhanced MultiPHY Code
ushort aCell[PHY_NUM * PHY_BLOCK];     /* free cell addresses */
ulong  SCD_Ctrl[PHY_NUM * PHY_BLOCK];  /* SCD_Ctrl registers for each calendar */
ushort CurrentPHY;                     /* current index into above tables */

int SendCell()
{
    int i, j;
    ulong SaveTime = TimeNow;

    /* Acquire PHY_BLOCK free cells for every PHY device */
    do {
        aCell[CurrentPHY] = ACI_Free();
        if (aCell[CurrentPHY] == 0)
            return 0;
    } while (++CurrentPHY < PHY_NUM * PHY_BLOCK);

    for (i = 0; i < PHY_NUM; i++) {
        Cal_Switch(i);
        TimeNow = SaveTime;
        /* Build PHY_BLOCK cells before switching calendars again */
        for (j = i * PHY_BLOCK; j < (i + 1) * PHY_BLOCK; j++)
            TxCell(aCell[j]);
    }
    CurrentPHY = 0;
    return 1;
}
This code is very similar to the previous one with the exception that
PHY_BLOCK cells are built for each PHY before the calendar base
address is switched.
Since multiple calendars are used for the connections of different PHY
devices, the connections need to be rescheduled after data is attached
to the VCD in the corresponding calendar. Therefore, the rescheduling
code from Section 3.3.5.1, “Connection Rescheduling” is modified for
multicalendar support as shown in Figure 3.21.
Figure 3.21 Buff Completion Queue Interrupt Handler for MultiCalendar Support
void ServBuffComplQueue() {
    ulong N = EDMA_ComplQueue();
    int Cal_No = (ACD[N].ACD_Ctrl >> ACD_CalNo) & 0x3;
    int CurrCal = (Hdr->SCD.CalSwitch) >> 6;
    ulong T;

    if (CurrCal == Cal_No) {
        T = SCD_Now();
        SCD_Sched(N, T + 1);
    }
    else {
        /* Cal_Switch() is the calendar switching macro of Figure 3.19 */
        Cal_Switch(Cal_No);
        T = SCD_Now();
        SCD_Sched(N, T + 1);
        /* Restore the calendar that was active on entry */
        Cal_Switch(CurrCal);
    }
}
If the PHY devices are not running on the same network clock and are
of different rates, calendar switching can be done such that the number
of cells serviced from each calendar (including idle cells) is proportional
to the line rate of the PHY device corresponding to the calendar.
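One way to obtain such a proportional split is a credit-based (smooth weighted round-robin) selection, sketched below. This is an illustration under stated assumptions rather than a method described in this guide: the NextCalendar() function, the Credit array, and the two-PHY configuration are all hypothetical.

```c
#include <assert.h>

#define PHY_NUM 2               /* assumed number of PHY devices */

static long Credit[PHY_NUM];    /* running credit per calendar   */

/*
 * Returns the calendar to service in the next cell slot so that, over
 * time, calendar i is chosen in proportion to rate[i].
 */
int NextCalendar(const long rate[PHY_NUM])
{
    int i, best = 0;
    long total = 0;

    for (i = 0; i < PHY_NUM; i++) {
        assert(rate[i] > 0);
        Credit[i] += rate[i];          /* every calendar earns its rate  */
        total += rate[i];
        if (Credit[i] > Credit[best])
            best = i;                  /* largest credit is served next  */
    }
    Credit[best] -= total;             /* the served calendar pays total */
    return best;
}
```

The value returned would be passed to Cal_Switch() before TxCell() is invoked. With rates of 155 and 45, for example, calendar 0 is selected for roughly three of every four cell slots, and the selections for calendar 1 are spread out rather than bunched together.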
Multiple calendars are used in the multiPHY operation to improve the
QoS for connections of different PHY devices. Using multiple calendars
reduces the head-of-line blocking inherent with a single calendar in
multiPHY operation.
An analysis of the jitter of CBR connections is included here to illustrate
the improvement in the variance of the intercell gap when multiple
calendars are used for a multiPHY operation. The CBR connections are
opened on two PHY devices as shown in Table 3.5. The line rate of PHY
0 is OC-3 (155 Mbps) and the line rate of PHY 1 is DS3 (45 Mbps). Note
that, for the connections of PHY 1, the intercell gap in the calendar is
multiplied by 155/45 since the calendar slot time corresponds to the line
rate of OC-3.
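When a single calendar paced at the OC-3 rate serves both PHY devices, every ICG is expressed in OC-3 cell slots, whichever link the connection uses. A minimal sketch of that conversion follows. The function name is hypothetical, and the slot rate constant assumes the OC-3 payload rate of 149.76 Mbit/s and 424 bits per cell.

```c
#include <assert.h>

/* Assumed OC-3 cell slot rate: 149.76 Mbit/s payload / 424 bits per cell. */
#define OC3_SLOTS_PER_SEC 353207.5

/* Returns the intercell gap, in calendar slots, for a rate in cells/s. */
double IcgInSlots(double cellsPerSec, double slotsPerSec)
{
    assert(cellsPerSec > 0.0 && slotsPerSec > 0.0);
    return slotsPerSec / cellsPerSec;
}
```

For the rates used in this analysis, a connection at 176603 cells/s needs an ICG of about 2 slots, and a connection at 7580 cells/s needs an ICG of about 46.6 slots.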
Table 3.5   Initial Setup for MultiPHY Connections

Connection  Calendar  PHY Device  Rate in  Rate in  Intercell Gap
Number      Number    Number      Cells/s  Mbits/s  (µs)
----------  --------  ----------  -------  -------  -------------
     1         0          0        176603    74.88        5.66
     2         0          0        176603    74.88        5.66
     3         0          1          7580     3.21      132.09
     4         0          1          7580     3.21      132.09
     5         0          1          7580     3.21      132.09
     6         0          1          7580     3.21      132.09
     7         0          1          7580     3.21      132.09
     8         0          1          7580     3.21      132.09
     9         0          1          7580     3.21      132.09
    10         0          1          7580     3.21      132.09
    11         0          1          7580     3.21      132.09
    12         0          1          7580     3.21      132.09
    13         0          1          7580     3.21      132.09
    14         0          1          7580     3.21      132.09
    15         0          1          7580     3.21      132.09
    16         0          1          7580     3.21      132.09
Two measurements were made on the two PHY 0 connections using an
HP E1697A. Table 3.6 shows the variance in the intercell gap for the
connections of PHY 0 when a single calendar is used to schedule cells
to both the PHY devices.
Table 3.6   PHY 0 Statistics at 155 Mbps with a Single Calendar

Connection  Calendar  PHY Device  Intercell Gap  Intercell Gap  Rate
Number      Number    Number      (µs)           Variance (µs)  (Mb/s)
----------  --------  ----------  -------------  -------------  ------
     1         0          0            8.7           15.35       48.74
     2         0          0            8.7           15.35       48.74
As described earlier, when multiple PHY devices are scheduled using the
same calendar, the connections on the faster PHY device suffer a greater
variance in the intercell gap (and thereby jitter in the rate) due to
head-of-line blocking by the cells belonging to the slower devices.
The same measurements were taken using two calendars, one for each
PHY. The results are shown in Table 3.7.
Table 3.7   PHY 0 Statistics at 155 Mbps with Multiple Calendars

Connection  Calendar  PHY Device  Intercell Gap  Intercell Gap  Rate
Number      Number    Number      (µs)           Variance (µs)  (Mb/s)
----------  --------  ----------  -------------  -------------  ------
     1         0          0            5.83           0.55       73.1
     2         0          0            5.83           0.55       73.1
It can be seen from Tables 3.6 and 3.7 that using two calendars reduces
the intercell gap variance from 15.35 µs to 0.55 µs. Note also that the
transmission rate increased from 48.74 Mb/s to 73.1 Mb/s, close to the
configured rate of 74.88 Mb/s.
The code listing provided in Section 3.8, “Source Code Listings,” does
not include multiPHY operation.
3.8 Source Code Listings
The remainder of this section provides sample listings for all of the
ATMizer II+ code developed for topics described in this chapter. The code
is composed of the following files:
• uTypes.h    defines basic types
• ATMizer2.h  main header file
• Hdr.h       all hardware definitions
• Instr.h     definitions of CW4010 extended instructions
• ABR.h       declarations specific to ABR
• Cell.c      main routine for sending and receiving cells
• CBR.c       handles CBR and UBR traffic
• VBR.c       handles VBR traffic
• ABR.c       handles ABR traffic
3.8.1 Macros and Types Header File (uTypes.h)
/* $Id: uTypes.h,v 1.3 1996/06/04 22:16:59 zhifeng Exp $ */

/* ------------------------------------------------------------
 *  ATMizer-2
 *  Copyright (C) 1995-1999 LSI Logic Corporation
 *
 *  uTypes.h - Main include file defining basic types and macros
 *
 * ------------------------------------------------------------ */

#ifndef _UTYPES_H
#define _UTYPES_H

typedef unsigned long  ulong;
typedef unsigned short ushort;
typedef unsigned char  uchar;

#define U16     0x0000ffff

/* ------------------------------------------
 *  Macros to access a data element of different type
 */
#define byte(x) ( *( (uchar  *) &(x)) )
#define half(x) ( *( (ushort *) &(x)) )
#define word(x) ( *( (ulong  *) &(x)) )

/* ------------------------------------------
 *  Macro to create bit masks
 */
#define one(x)  ((ulong) 1 << (ulong) (x))

#endif
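The following sketch shows how the one() and byte() macros might be used. It copies the two macros from uTypes.h; the helpers bit_is_set and set_first_byte are hypothetical names introduced only for illustration. Note that byte(x) addresses the first byte of x in memory, so which bits it touches depends on the byte order of the processor: the ACD_Sched macros in ATMizer2.h imply that on the target this is the most significant byte, while on a little-endian development host it is the least significant one.

```c
#include <assert.h>

typedef unsigned long  ulong;
typedef unsigned char  uchar;

/* Copied from uTypes.h */
#define byte(x) ( *( (uchar *) &(x)) )
#define one(x)  ((ulong) 1 << (ulong) (x))

/* Hypothetical helper: nonzero if bit `pos` of `value` is set. */
static int bit_is_set(ulong value, unsigned pos)
{
    assert(pos < 32);
    return (value & one(pos)) != 0;
}

/* Hypothetical helper: overwrite only the first byte in memory of a word. */
static void set_first_byte(ulong *w, uchar v)
{
    byte(*w) = v;
}
```

A call such as bit_is_set(status, 21) tests a single flag without spelling out the mask by hand.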
3.8.2 ATMizer II+ Header File (ATMizer2.h)
/* $Id: ATMizer2.h,v 1.13 1996/08/06 22:15:42 thomasd Exp $ */

/* ------------------------------------------------------------
 *  ATMizer-2
 *  Copyright (C) 1995-1999 LSI Logic Corporation
 *
 *  ATMizer2.h - Main include file for L64364
 * ------------------------------------------------------------ */

#ifndef _ATMIZER2_H
#define _ATMIZER2_H

/* ------------------------------------------------------------
 *  MACRO DEFINITIONS
 */

/*
 * Fractional part of time, currently 24.8 is recommended
 * If you increase this value, make sure that your ICG
 * can accommodate the maximum value
 */
#define TIME_FRAC       8

/* Calendar size mask. This value can be adjusted according
 * to the calendar size.
 */
#define CAL_SIZE_MASK   63

/*
 * Value to use for Cell Descriptor to send an explicit
 * idle cell
 */
#define CDS_IDLE        (4 << 10)

/*
 * Test to avoid scheduling connections in the past
 */
#define IsInPast(T)     ((long)TimeNow - (long)(T) + (1 << TIME_FRAC) > 0)

/*
 * Extract SCD_BuffPres and SCD_Class fields from
 * a value returned by SCD_Serv()
 */
#define ConHasData(N)   ((long)(N) < 0)
#define ConClass(N)     ( ((N) >> 16) & 3)

/*
 * Check the PTI field of a cell header
 */
#define PTI_HDR         (7 << 1)
#define PTI_RM          (6 << 1)

/* ------------------------------------------------------------
 *  TYPE DECLARATIONS
 */

/* ------------------------------------------------------------
 *  Receive and transmit rings
 */
typedef struct {
    ulong *Ptr;
    ulong *Base;
    ulong *End;
} Ring_t, *pRing_t;

/* ------------------------------------------------------------
 *  Statistics vector
 */
typedef struct {
    ulong RxCells;        /* received cells */
    ulong TxCells;        /* transmitted cells */
    ulong RxPDU;          /* received PDUs */
    ulong TxPDU;          /* transmitted PDUs */
    ulong ErrCrc;         /* received crc errored PDUs */
    ulong ErrLength;      /* received length errored PDUs */
    ulong ErrAbort;       /* received aborted (zero length) PDUs */
    ulong ErrLowMem;      /* one free buffer list is empty */
    ulong ErrNoContBuff;  /* received partially built PDUs */
    ulong ErrNoMem;       /* both free buffer lists are empty */
    ulong ErrNoData;      /* no buffer is attached to VCD */
    ulong ErrTimeout;     /* received aborted (timeout) PDUs */
    ulong ErrRxLost;      /* received lost cells */
    ulong ErrConNum;      /* wrong connection number */
    ulong ErrCrc10;       /* errored (crc10) RM cells */
} Stat_t, *pStat_t;

/* ------------------------------------------------------------
 *  ATM cell in Cell Buffer. This declaration does not support tag bytes
 */
typedef struct {
    ulong CDS;
    ulong CellHdr;
    uchar Payd[48];
} Cell_t, *pCell_t;

/* ------------------------------------------------------------
 *  APU Connection descriptor.
 *  The size depends on the QoS supported.
 *      CBR, UBR              :  8 bytes
 *      CBR, VBR, UBR         : 20 bytes
 *      CBR, VBR, UBR and ABR : 32 bytes
 *
 *  To simplify the coding (and speed array index calculations)
 *  we will always use 32 bytes per connection.
 *
 *  ABR specific data is declared in ABR.h, here we only declare
 *  descriptors for other QoS. In all cases, the first two words
 *  are the same (ICG and ThTxTime)
 *  The following 3 words are used by VBR only.
 *
 *  The first word (ICG) also stores the connection 'Scheduled' flag
 *  on bit 31, while the ICG uses bits 23:0 and bits 30:24 are unused
 *  and all zero. Instead of using bitfields for this data structure,
 *  for example like this:
 *      typedef struct {
 *          ulong Scheduled:1,
 *                ICG:30;
 *  which is quite inefficient, we define access macros:
 *      ACD_Sched(ACD[N])   - sets connection state to 'Scheduled'
 *      ACD_UnSched(ACD[N]) - sets connection state to 'not Scheduled'
 *      ACD_IsSched(ACD[N]) - returns true if connection is scheduled
 *
 *  Since the value of ICG is only used when connection is scheduled
 *  we store state 'Scheduled' as bit 31 = 0, which avoids clearing
 *  this bit every time ICG is fetched.
 */
typedef struct {
    ulong  ICG;         /* in 16.8 format */
    ulong  ThTxTime;
    /* The following declarations are for VBR connections only */
    ulong  Bucket;      /* Current Bucket contents */
    ulong  Increment;   /* Bucket Increment each time a cell is sent */
    ulong  Limit;       /* Bucket limit */
    ulong  ICG_PCR;     /* 1/PCR */
    uchar  Pad[32-6*4];
    ushort ACD_Ctrl;
} ACD_t, *pACD_t;

#define ACD_Sched(x)    byte(x) = 0
#define ACD_UnSched(x)  byte(x) = 0x80
#define ACD_IsSched(x)  ( *( (long *) &(x)) >= 0 )

/* ------------------------------------------------------------
 *  Type of CBR/VBR/ABR/UBR_send functions
 */
typedef int QoS_Send_t(const ulong, const ulong);

/* ------------------------------------------------------------
 *  Out of rate cell in cell buffer
 */
typedef struct {
    ushort aCell;
    ushort ConNum;
} OutOfRate_t;

/* ------------------------------------------------------------
 *  EXTERNAL DECLARATIONS
 */

/* ------------------------------------------------------------
 *  Declaration of data structures located in external memory
 */
extern pACD_t ACD;
extern pVCD_t VCD;
extern pBFD_t BFD;

/* ------------------------------------------------------------
 *  Declaration of global variables located in Data RAM
 */
extern ulong        TimeNow;
extern Ring_t       TxRing;
extern Ring_t       RxRing;
extern OutOfRate_t  OutOfRate;
extern Stat_t       Stat;
extern ulong*       HCD_MsgBase;
extern ulong*       Stats_MsgBase;
extern ulong*       APU2Host_Mbx;

/* ------------------------------------------------------------
 *  Declarations of global functions
 */
extern void Initialize(void);
extern QoS_Send_t CBR_Send, VBR_Send, ABR_Send, UBR_Send;
extern int ABR_Receive(const ulong, const ulong);

extern void  ComplMsg(const ulong, const pRing_t);
extern void  BuffMsg(void);
extern void  HostMsg(const ulong);
extern void  RxCell(const ulong);
extern void  TxCell(const ulong);
extern void  BFS_Error(const ulong);
extern void  HostMsg(const ulong);
extern ulong GetRing(pRing_t);
extern ulong PutRing(pRing_t, ulong);
extern ulong iramSize(ulong, ulong);
extern void  setDram(const ulong, const ulong);
extern void  setIram(const ulong, const ulong);
extern void  loadIram(const ulong, const ulong, const ulong);

#endif
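The IsInPast() test in ATMizer2.h compares two fixed-point times by their signed difference, which keeps working when the time counter wraps around. The sketch below restates that test with fixed-width types so that it can be tried on a development host. It assumes 32-bit time values with TIME_FRAC fractional bits; the function names slots_to_time and is_in_past are hypothetical, and the current time is passed as a parameter instead of being read from the global TimeNow.

```c
#include <assert.h>
#include <stdint.h>

#define TIME_FRAC 8   /* fractional bits of the 24.8 time format, as in ATMizer2.h */

/* Convert a whole number of cell slots to 24.8 fixed-point time. */
static uint32_t slots_to_time(uint32_t slots)
{
    assert(slots < (1u << 24));
    return slots << TIME_FRAC;
}

/* Same comparison as IsInPast(T), with the current time passed explicitly.
 * True when t is behind now, or less than one cell slot ahead of it.
 * The unsigned subtraction wraps modulo 2^32, so the result stays
 * correct across a counter wraparound. */
static int is_in_past(uint32_t now, uint32_t t)
{
    return (int32_t)(now - t + (1u << TIME_FRAC)) > 0;
}
```

With now equal to 16 slots, a time of 15 slots is reported as past, while a time exactly one slot ahead (17 slots) is not.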
3.8.3 ATMizer II+ Hardware Header File (Hdr.h)
/* $Id: Hdr.h,v 1.4 1996/06/04 22:16:05 zhifeng Exp $ */

/* ------------------------------------------------------------
 *  ATMizer-2
 *  Copyright (C) 1995-1999 LSI Logic Corporation
 *
 *  Hdr.h - Declarations for all the L64364 hardware resources
 *
 * ------------------------------------------------------------ */

#ifndef _HDR_H_
#define _HDR_H_

/*_____________________________________________________________
 *
 *  HARDWARE REGISTERS MAP
 *_____________________________________________________________
 */

/* EDMA memory mapped registers */
typedef struct EDMA_Reg_s {
    /* ItemName;           Offs, Size, R/W, Description */
    ulong  TxCompl;     /* 0x00, 32, R,   Read Transmit Completion Queue. */
    ulong  TxConNum;    /* 0x04, 32, R/W, Connection number for the TxCell command. */
    ulong  TxCell;      /* 0x08, 32, R/W, Issue a TxCell command. */
    ulong  pad1;        /* 0x0c, 32, N/A, Padding bits. */
    ulong  TxConAct;    /* 0x10, 32, R,   Current ConNum processed by TxCell processor. */
    ulong  pad2[11];    /* 0x14, 11*32, N/A, Padding bits. */
    ulong  RxCompl;     /* 0x40, 32, R,   Read Receive Completion Queue. */
    ulong  RxConNum;    /* 0x44, 32, R/W, Connection number for the RxCell command. */
    ulong  RxCell;      /* 0x48, 32, R/W, Issue a RxCell command. */
    ulong  pad3;        /* 0x4c, 32, N/A, Padding bits. */
    ulong  RxConAct;    /* 0x50, 32, R,   Current ConNum processed by RxCell processor. */
    ulong  pad4;        /* 0x54, 32, N/A, Padding bits. */
    ushort RxBuffOffs;  /* 0x58, 16, R/W, Offset for the receive Buffers payload. */
    ushort pad5;        /* 0x5a, 16, N/A, Padding bits. */
    ulong  pad6[9];     /* 0x5c, 9*32, N/A, Padding bits. */
    ulong  Buff;        /* 0x80, 32, R/W, Issue a buff command. */
    ulong  pad7;        /* 0x84, 32, N/A, Padding bits. */
    ulong  ConReAct;    /* 0x88, 32, R,   Buff processor connection Reactivation message. */
    ulong  pad8;        /* 0x8c, 32, N/A, Padding bits. */
    ulong  BuffConAct;  /* 0x90, 32, R,   Current ConNum processed by Buff processor. */
    ushort LBuff;       /* 0x94, 16, R/W, head of Large Free Buffer lists. */
    ushort SBuff;       /* 0x96, 16, R/W, head of Small Free Buffer lists. */
    ushort TxBuffOffs;  /* 0x98, 16, R/W, Offset for the transmit Buffers payload. */
    ushort pad9;        /* 0x9a, 16, N/A, Padding bits. */
    ulong  pad10;       /* 0x9c, 32, N/A, Padding bits. */
    ulong  MoveSrc;     /* 0xa0, 32, R/W, Program source address for a move command. */
    ulong  MoveDst;     /* 0xa4, 32, R/W, Program destination address for a move command. */
    ushort MoveCount;   /* 0xa8, 16, R/W, Program the byte count and issue a move command. */
    ushort pad11;       /* 0xaa, 16, N/A, Padding bits. */
    ulong  pad12[5];    /* 0xac, 5*32, N/A, Padding bits. */
    ushort Ctrl;        /* 0xc0, 16, R/W, EDMA control bits. */
    ushort pad13;       /* 0xc2, 16, N/A, Padding bits. */
    ushort Status;      /* 0xc4, 16, R,   Check the EDMA status. */
    ushort pad14;       /* 0xc6, 16, N/A, Padding bits. */
    ushort LBuffSize;   /* 0xc8, 16, R/W, Size of large buffer in bytes. */
    ushort SBuffSize;   /* 0xca, 16, R/W, Size of small buffer in bytes. */
    ushort VCD_Base;    /* 0xcc, 16, R/W, Base address of the VC Descriptor Table. */
    ushort pad15;       /* 0xce, 16, N/A, Padding bits. */
    ushort BFD_LBase;   /* 0xd0, 16, R/W, Local Base address of Buffer Descriptor Table. */
    ushort BFD_FBase;   /* 0xd2, 16, R/W, Far Base address of Buffer Descriptor Table. */
} EDMA_Reg_t, *pEDMA_Reg_t;

/* ACI memory mapped registers */
typedef struct ACI_Reg_s {
    /* ItemName;         Offs, Size, R/W, Init, Description */
    ushort Ctrl;      /* 0x00, 16, R/W, Y, ACI Control field. */
    ushort FreeList;  /* 0x02, 16, R/W, Y, Beginning of free cell list. */
    uchar  TxTimer;   /* 0x04, 8,  R/W, Y, Transmit time-out. */
    uchar  TxSize;    /* 0x05, 8,  R/W, Y, Maximum number of cells in Transmit Fifo. */
    uchar  TxLimit;   /* 0x06, 8,  R/W, Y, Num of cells in TxFifo to generate an interrupt. */
    uchar  RxLimit;   /* 0x07, 8,  R/W, Y, Num of cells in RxFifo to generate an interrupt. */
    ulong  RxMask;    /* 0x08, 32, R/W, Y, Receive polling mask. */
    ushort Free;      /* 0x0c, 16, R/W, Get or return a free cell location. */
    ushort pad1;      /* 0x0e, 16, N/A, Padding bits. */
    ushort RxRead;    /* 0x10, 16, R,   Get cell from Receive Fifo. */
    ushort pad2;      /* 0x12, 16, N/A, Padding bits. */
    ushort TxWrite;   /* 0x14, 16, W,   Put cell in Transmit Fifo. */
    ushort pad3;      /* 0x16, 16, N/A, Padding bits. */
    uchar  RxCels;    /* 0x18, 8,  R,   Number of cells in the Receive Fifo. */
    uchar  pad4;      /* 0x19, 8,  N/A, Padding bits. */
    uchar  TxCels;    /* 0x1a, 8,  R,   Number of cells in the Transmit Fifo. */
    uchar  pad5;      /* 0x1b, 8,  N/A, Padding bits. */
    ushort Error;     /* 0x1c, 16, R,   Get a cell from the Fifo. */
    ushort pad6;      /* 0x1e, 16, N/A, Padding bits. */
} ACI_Reg_t, *pACI_Reg_t;

/* Timer Unit memory mapped registers */
typedef struct TIM_Reg_s {
    /* ItemName;         Offs, Size, R/W, Init, Description */
    ulong TimeStamp;   /* 0x00, 32, R/W, 0, Time Stamp Counter. */
    uchar Timer1;      /* 0x04, 8,  R/W, Y, Timer Value. */
    uchar pad1;        /* 0x05, 8,  N/A, Padding bits. */
    uchar TimerInit1;  /* 0x06, 8,  R/W, 0, Timer Initialization value. */
    uchar pad2;        /* 0x07, 8,  N/A, Padding bits. */
    uchar Timer2;      /* 0x08, 8,  R/W, Y, Timer Value. */
    uchar pad3;        /* 0x09, 8,  N/A, Padding bits. */
    uchar TimerInit2;  /* 0x0a, 8,  R/W, 0, Timer Initialization value. */
    uchar pad4;        /* 0x0b, 8,  N/A, Padding bits. */
    uchar Timer3;      /* 0x0c, 8,  R/W, Y, Timer Value. */
    uchar pad5;        /* 0x0d, 8,  N/A, Padding bits. */
    uchar TimerInit3;  /* 0x0e, 8,  R/W, 0, Timer Initialization value. */
    uchar pad6;        /* 0x0f, 8,  N/A, Padding bits. */
    uchar Timer4;      /* 0x10, 8,  R/W, Y, Timer Value. */
    uchar pad7;        /* 0x11, 8,  N/A, Padding bits. */
    uchar TimerInit4;  /* 0x12, 8,  R/W, 0, Timer Initialization value. */
    uchar pad8;        /* 0x13, 8,  N/A, Padding bits. */
    uchar Timer5;      /* 0x14, 8,  R/W, Y, Timer Value. */
    uchar pad9;        /* 0x15, 8,  N/A, Padding bits. */
    uchar TimerInit5;  /* 0x16, 8,  R/W, 0, Timer Initialization value. */
    uchar pad10;       /* 0x17, 8,  N/A, Padding bits. */
    uchar Timer6;      /* 0x18, 8,  R/W, Y, Timer Value. */
    uchar pad11;       /* 0x19, 8,  N/A, Padding bits. */
    uchar TimerInit6;  /* 0x1a, 8,  R/W, 0, Timer Initialization value. */
    uchar pad12;       /* 0x1b, 8,  N/A, Padding bits. */
    uchar Timer7;      /* 0x1c, 8,  R/W, Y, Timer Value. */
    uchar pad13;       /* 0x1d, 8,  N/A, Padding bits. */
    uchar TimerInit7;  /* 0x1e, 8,  R/W, 0, Timer Initialization value. */
    uchar pad14;       /* 0x1f, 8,  N/A, Padding bits. */
    uchar Enable;      /* 0x20, 6,  R/W, Y, Timer-out enable. */
    uchar pad15[3];    /* 0x21, 3*8, N/A, Padding bits. */
    uchar Clear;       /* 0x24, 6,  W,   Timer-out clear. */
    uchar pad16[3];    /* 0x25, 3*8, N/A, Padding bits. */
    ulong ClockSel;    /* 0x28, 32, W,  Y, Timer clock selection. */
} TIM_Reg_t, *pTIM_Reg_t;

/* Scheduler Unit memory mapped registers */
typedef struct SCD_Reg_s {
    /* ItemName;        Offs, Size, R/W, Description */
    ulong  Ctrl;     /* 0x00, 32, R/W, Control register. */
    ushort pad1;     /* 0x04, 16, N/A, Padding bits. */
    ushort CalSize;  /* 0x06, 16, R/W, Size of the Calendar Table. */
    ushort pad2;     /* 0x08, 16, N/A, Padding bits. */
    ushort Now;      /* 0x0a, 16, R/W, Current cell slot pointer. */
    ulong  Serv;     /* 0x0c, 32, R,   execute service command. */
    ulong  Sched;    /* 0x10, 32, W,   execute schedule command. */
    ulong  pad3;     /* 0x14, 32, N/A, Padding bits. */
    ulong  Tic;      /* 0x18, 32, W,   execute tic command. */
} SCD_Reg_t, *pSCD_Reg_t;

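The register structures in this file rely on the compiler laying out each field at the offset given in its comment. One way to check this is with compile-time assertions, sketched below for the scheduler registers. This is an illustration only: it assumes a C11 compiler, SCD_Mirror_t is a hypothetical host-side copy of SCD_Reg_t, and fixed-width types stand in for ulong (assumed 32 bits on the target) and ushort (assumed 16 bits).

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical host-side mirror of SCD_Reg_t with fixed-width types. */
typedef struct {
    uint32_t Ctrl;     /* 0x00 */
    uint16_t pad1;     /* 0x04 */
    uint16_t CalSize;  /* 0x06 */
    uint16_t pad2;     /* 0x08 */
    uint16_t Now;      /* 0x0a */
    uint32_t Serv;     /* 0x0c */
    uint32_t Sched;    /* 0x10 */
    uint32_t pad3;     /* 0x14 */
    uint32_t Tic;      /* 0x18 */
} SCD_Mirror_t;

/* Each check fails the build if a field drifts from its documented offset. */
static_assert(offsetof(SCD_Mirror_t, CalSize) == 0x06, "CalSize offset");
static_assert(offsetof(SCD_Mirror_t, Now)     == 0x0a, "Now offset");
static_assert(offsetof(SCD_Mirror_t, Serv)    == 0x0c, "Serv offset");
static_assert(offsetof(SCD_Mirror_t, Sched)   == 0x10, "Sched offset");
static_assert(offsetof(SCD_Mirror_t, Tic)     == 0x18, "Tic offset");
static_assert(sizeof(SCD_Mirror_t) <= 128, "must fit the 128-byte SCD window");
```

The same pattern could be applied to the other register blocks. It is most useful when the header is compiled on a host where unsigned long is wider than 32 bits, since the original ulong-based declarations would then no longer match the documented offsets.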
/* APU memory mapped registers */
typedef struct APU_Reg_s {
    /* ItemName;           Offs, Size, R/W, Description */
    ulong  AddrMap;     /* 0x00, 32, R/W, Memory mapping register. */
    ushort pad1;        /* 0x04, 16, N/A, Padding bits. */
    ushort Watchdog;    /* 0x06, 16, R/W, APU watchdog timer value. */
    ulong  Srl;         /* 0x08, 32, R/W, Read a word from a serial EPROM. */
    ushort pad2;        /* 0x0c, 16, N/A, Padding bits. */
    ushort VIntEnable;  /* 0x0e, 16, R/W, Interrupt mask. */
    ulong  VIntBase;    /* 0x10, 32, R/W, Interrupt base address. */
    ulong  Status;      /* 0x14, 32, R,   System status bits. */
} APU_Reg_t, *pAPU_Reg_t;

/* PORT Controller memory mapped registers */
typedef struct PC_Reg_s {
    /* ItemName;          Offs, Size, R/W, Description */
    uchar pad1[3];     /* 0x00, 3*8, N/A, Padding bits. */
    uchar PP_Ctrl;     /* 0x03, 8,  R/W, Primary Port control. */
    ulong PP_RxMbx;    /* 0x04, 32, R,   Input Mailbox (host -> APU). */
    ulong PP_TxMbx;    /* 0x08, 32, R/W, Output Mailbox (APU -> host). */
    ulong pad2[29];    /* 0x0c, 29*32, N/A, Padding bits. */
    ulong SP_Ctrl;     /* 0x80, 32, R/W, Secondary Port control register. */
    ulong SP_SDRAM;    /* 0x84, 32, R/W, SDRAM control register. */
    ulong SP_Refresh;  /* 0x88, 32, R/W, SDRAM refresh register. */
} PC_Reg_t, *pPC_Reg_t;

/* Reserved for hardware registers external to ATMizer-II+ CWM. */
typedef struct EXT_Reg_s {
    uchar Extern[1024];  /* Hardware registers for external modules. */
} EXT_Reg_t, *pEXT_Reg_t;

/* ATMizer-II+ CWM hardware register map */
typedef struct Hdr_Reg_s {
    /* ItemName;               Size, VirAddr, Description */
    EDMA_Reg_t volatile EDMA;  /* 256 bytes, b8000000, Hardware registers for EDMA. */
    uchar      pad1[256 - sizeof(EDMA_Reg_t)];
    ACI_Reg_t  volatile ACI;   /* 256 bytes, b8000100, Hardware registers for ACI. */
    uchar      pad2[256 - sizeof(ACI_Reg_t)];
    SCD_Reg_t  volatile SCD;   /* 128 bytes, b8000200, Hardware registers for SCHEDULER. */
    uchar      pad4[128 - sizeof(SCD_Reg_t)];
    TIM_Reg_t  volatile TIM;   /* 128 bytes, b8000280, Hardware registers for TIMER. */
    uchar      pad3[128 - sizeof(TIM_Reg_t)];
    APU_Reg_t  volatile APU;   /* 256 bytes, b8000300, Hardware registers for APU. */
    uchar      pad5[256 - sizeof(APU_Reg_t)];
    PC_Reg_t   volatile PC;    /* 256 bytes, b8000400, Hardware registers for Port Controller. */
    uchar      pad6[256 - sizeof(PC_Reg_t)];
    EXT_Reg_t  volatile EXT;   /* 1k bytes, b8000800, Hardware registers for external modules. */
} Hdr_t, *pHdr_t;

/*_____________________________________________________________
 *
 *  HARDWARE REGISTER DEFINITIONS
 *_____________________________________________________________
 */

/* -------------------------------------------------------
 *  Buffer Status bits, returned in a Completion Queue
 */
#define BFS_ErrAll          31
#define BFS_ConNumRet       30
#define BFS_BuffCont        29
#define BFS_DirTx           28
#define BFS_ErrNoData       27
#define BFS_ErrNoMem        26
#define BFS_ErrNoContBuff   25
#define BFS_ErrLowMem       24
#define BFS_ErrAbort        23
#define BFS_ErrLength       22
#define BFS_ErrCrc          21
#define BFS_BuffFree        17
#define BFS_BuffLarge       16
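The BFS_* constants are bit positions, not masks, so they are meant to be combined with a shift such as the one() macro from uTypes.h. The sketch below shows one possible way to update error counters from a completion status word. It is an illustration under assumptions: the ErrCount_t structure and the count_errors function are hypothetical, the local BIT macro stands in for one(), and only three of the error bits are examined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions copied from Hdr.h */
#define BFS_ErrAbort   23
#define BFS_ErrLength  22
#define BFS_ErrCrc     21

#define BIT(n) ((uint32_t)1 << (n))   /* same idea as one() in uTypes.h */

/* Hypothetical counters, similar in spirit to Stat_t in ATMizer2.h. */
typedef struct {
    unsigned crc;
    unsigned length;
    unsigned abort;
} ErrCount_t;

/* Update the counters from one status word.
 * Returns how many of the three error bits were set. */
static int count_errors(uint32_t status, ErrCount_t *c)
{
    int found = 0;

    assert(c != NULL);
    if (status & BIT(BFS_ErrCrc))    { c->crc++;    found++; }
    if (status & BIT(BFS_ErrLength)) { c->length++; found++; }
    if (status & BIT(BFS_ErrAbort))  { c->abort++;  found++; }
    return found;
}
```

Testing each bit separately keeps the counters independent, at the cost of a few extra branches per completion message.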
/* ----------------------------------------
 *  EDMA Control register EDMA_Ctrl
 */
typedef struct {
    ushort Res1:4,
           RxBFD_Far:1,
           RxBFD_Copy:1,
           TxBFD_Far:1,
           TxBFD_Copy:1,
           Res2:2,
           ByteSwap:1,
           OrHdr:1,
           ConReAct:1,
           UU:1,
           RxCopy:1,
           TxCopy:1;
} EDMA_Ctrl_t;

#define EDMA_RxBFD_Far      11
#define EDMA_RxBFD_Copy     10
#define EDMA_TxBFD_Far      9
#define EDMA_TxBFD_Copy     8
#define EDMA_ByteSwap       5
#define EDMA_OrHdr          4
#define EDMA_ConReAct       3
#define EDMA_UU             2
#define EDMA_RxBuffCopy     1
#define EDMA_TxBuffCopy     0

/* -------------------------------------------------------
 *  Buffer Descriptor control bits
 */
typedef struct {
    ushort BuffCont:1,
           EFCI:1,
           CLP:1,
           BuffFree:1,
           BuffLarge:1,
           ErrAbort:1,
           ErrLength:1,
           ErrCrc:1,
           ConNumMSB:8;
} BFD_Ctrl_t;

#define BFD_BuffCont    15
#define BFD_EFCI        14
#define BFD_CLP         13
#define BFD_BuffFree    12
#define BFD_BuffLarge   11
#define BFD_ErrAbort    10
#define BFD_ErrLength   9
#define BFD_ErrCrc      8

/* ----------------------------------------
 *  Buffer Descriptor (BFD)
 */
typedef struct {
    BFD_Ctrl_t BFD_Ctrl;
    ushort     ConNum;     /* Connection number to which buffer belongs */
    ushort     BuffSize;   /* Size of the buffer */
    ushort     NextBFD;    /* Index to next BFD in the list */
    ulong      pBuffData;  /* pointer to the payload */
    ushort     UU_CPI;
    ushort     BuffNum;
} BFD_t, *pBFD_t;

/* ------------------------
 *  Control field of the VCD
 */
typedef struct {
    ushort BuffPres:1,   /* connection is open */
           ConAct:1,     /* connection active status */
           BuffCont:1,
           BuffFree:1,
           BuffLarge:1,
           BuffDone:1,
           EFCI:1,
           CLP:1,
           PHY:5,        /* address of physical device to use */
           CellHold:1,   /* do not send out cell */
           AAL0:1,       /* AAL0 mode of operation */
           DirTx:1;      /* set for Tx, cleared for Rx */
} VCD_Ctrl_t;

#define VCD_BuffPres    15
#define VCD_ConAct      14
#define VCD_BuffCont    13
#define VCD_BuffFree    12
#define VCD_BuffLarge   11
#define VCD_BuffDone    10
#define VCD_EFCI        9
#define VCD_CLP         8
#define VCD_CellHold    2
#define VCD_AALO        1
#define VCD_DirTx       0

/* -------------------------------
 *  auxiliary control word for AAL0
 */
typedef struct {
    ulong Tbytes:6,
          Crc10:1,
          Reserved:1,
          Offs:6,
          Unused:18;
} AAL0_Ctrl_t;

/* --------------------------------
 *  ATM cell header
 */
typedef struct {
    ulong VPI:12,
          VCI:16,
          PTI:1,    /* only one PTI bit is named as such */
          EFCI:1,
          EOM:1,
          CLP:1;
} CellHdr_t;

#define CELL_EFCI   2
#define CELL_EOM    1
#define CELL_CLP    0

/* --------------------------------
 *  Virtual Circuit Descriptor (VCD)
 */
typedef struct {
    ushort      Class;      /* used by the scheduler */
    ushort      NextVCD;    /* used by the scheduler */
    VCD_Ctrl_t  VCD_Ctrl;   /* VCD control bits */
    ushort      Nbytes;     /* num of bytes processed */
    ulong       pBuffData;  /* pointer to curr buffer */
    AAL0_Ctrl_t Crc32;      /* partial CRC-32 result */
    CellHdr_t   CellHdr;    /* Cell Header, tx only */
    ushort      BuffSize;   /* size of the current buffer */
    ushort      PayldLen;   /* total length of frame */
    ushort      TailBFD;    /* index to tail BFD */
    ushort      NextBFD;    /* index to next BFD */
    ushort      CurrBFD;    /* index to current BFD */
    ushort      UU_CPI;
} VCD_t, *pVCD_t;

/* --------------------------------
 *  Cell Descriptor in Cell Buffer
 */
typedef struct {
    ulong Next:16,
          Tbytes:6,
          Crc10:1,
          Par:1,
          BOM:1,
          EOM:1,
          Len:1,
          PHY:5;
} CDS_t;

#define CDS_Tbytes  10
#define CDS_Crc10   9
#define CDS_Par     8
#define CDS_BOM     7
#define CDS_EOM     6
#define CDS_Len     5
#define CDS_PHY     0

/* --------------------------------
 *  EDMA Status register
 */
typedef struct {
    ushort RxCellComplFull:1,
           TxCellComplFull:1,
           BuffComplFull:1,
           MoveRxPend:1,
           RxCellMsg:1,
           TxCellMsg:1,
           BuffMsg:1,
           MoveBuffPend:1,
           RxCellReqFull:1,
           TxCellReqFull:1,
           BuffReqFull:1,
           MoveReqFull:1,
           RxCellBusy:1,
           TxCellBusy:1,
           BuffBusy:1,
           MoveBusy:1;
} EDMA_Status_t;

#define EDMA_MoveBusy           0
#define EDMA_BuffBusy           1
#define EDMA_TxCellBusy         2
#define EDMA_RxCellBusy         3
#define EDMA_MoveReqFull        4
#define EDMA_BuffReqFull        5
#define EDMA_TxCellReqFull      6
#define EDMA_RxCellReqFull      7
#define EDMA_MoveBuffPend       8
#define EDMA_BuffMsg            9
#define EDMA_TxCellMsg          10
#define EDMA_RxCellMsg          11
#define EDMA_MoveRxPend         12
#define EDMA_BuffComplFull      13
#define EDMA_TxCellComplFull    14
#define EDMA_RxCellComplFull    15
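The CDS_* bit positions defined earlier in this file can be used to assemble the low half of a cell descriptor word with shifts instead of bitfields. The sketch below is a minimal illustration; make_cds and cds_phy are hypothetical helper names, and only the Tbytes, Crc10, and PHY fields are handled. The two constants that the listings build by hand, CDS_IDLE in ATMizer2.h as (4 << 10) and RM_CDS in ABR.h as ((8 << 10) | (1 << 9)), serve as reference values.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions copied from Hdr.h */
#define CDS_Tbytes  10
#define CDS_Crc10   9
#define CDS_PHY     0

/* Hypothetical helper: pack Tbytes (6 bits), Crc10 (1 bit) and PHY (5 bits). */
static uint32_t make_cds(unsigned tbytes, int crc10, unsigned phy)
{
    assert(tbytes < 64 && phy < 32);
    return ((uint32_t)tbytes << CDS_Tbytes)
         | ((uint32_t)(crc10 != 0) << CDS_Crc10)
         | ((uint32_t)phy << CDS_PHY);
}

/* Hypothetical helper: extract the 5-bit PHY field again. */
static unsigned cds_phy(uint32_t cds)
{
    return (cds >> CDS_PHY) & 0x1f;
}
```

make_cds(4, 0, 0) reproduces the CDS_IDLE value and make_cds(8, 1, 0) reproduces RM_CDS; passing a nonzero last argument selects a different PHY device for a multiPHY setup.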
/* ------------------------------------------------------------
 *  APU_AddrMap register
 */
typedef struct {
    ulong Reset:1,
          Boot:2,
          SecMSB:5,
          Res1:3,
          PriMSB:5,
          IntAck:6,
          Res2:3,
          ExcMap:7;
} APU_AddrMap_t;

/* ------------------------------------------------------------
 *  APU_Status register
 */
typedef struct {
    ulong MbxFull:1,
          Res1:3,
          NowBusy:1,
          TicBusy:1,
          SchedBusy:1,
          ServBusy:1,
          Res2:7,
          Watchdog:1,
          EDMA_RxCelFull:1,
          ACI_RxFull:1,
          RxMbx:1,
          EDMA_TxCellFull:1,
          EDMA_RxCell:1,
          ACI_Rx:1,
          EDMA_TxCell:1,
          EDMA_Buff:1,
          ACI_Err:1,
          ACI_Tx:1,
          IntExt:2,
          IntTim:4;
} APU_Status_t;

/*_____________________________________________________________
 *
 *  ACCESS MACROS
 *_____________________________________________________________
 */
#define ACI_Send(x)         Hdr->ACI.TxWrite = (x)
#define ACI_GetFree()       Hdr->ACI.Free
#define ACI_Free(x)         Hdr->ACI.Free = (x)
#define ACI_RxRead()        Hdr->ACI.RxRead

#define EDMA_TxCell(x, y)   {Hdr->EDMA.TxConNum = (x); \
                             Hdr->EDMA.TxCell = (y);}
#define EDMA_RxCell(x, y)   {Hdr->EDMA.RxConNum = (x); \
                             Hdr->EDMA.RxCell = (y);}
#define EDMA_Status()       Hdr->EDMA.Status
#define EDMA_TxCompl()      Hdr->EDMA.TxCompl
#define EDMA_RxCompl()      Hdr->EDMA.RxCompl
#define EDMA_BuffConReAct() Hdr->EDMA.ConReAct
#define EDMA_Buff(x)        Hdr->EDMA.Buff = (x)

#define SCD_Sched(x, y)     Hdr->SCD.Sched = ( (x) << 16 | (y) )
#define SCD_Serv()          Hdr->SCD.Serv
#define SCD_Tic()           Hdr->SCD.Tic = 0
#define SCD_Now()           Hdr->SCD.Now
#define SCD_SetNow(x)       Hdr->SCD.Now = (x)

#define PP_RxMbx()          Hdr->PC.PP_RxMbx

/*_____________________________________________________________
 *
 *  EXTERNAL DECLARATIONS FOR HARDWARE RESOURCES
 *_____________________________________________________________
 */
extern pHdr_t  Hdr;
extern uchar * CBM;

#endif
3.8.4 Extended Instructions Header File (Instr.h)
/* $Id: Instr.h,v 1.2 1996/06/04 22:16:18 zhifeng Exp $ */

/* ------------------------------------------------------------
 *  ATMizer-2
 *  Copyright (C) 1995-1999 LSI Logic Corporation
 *
 *  Instr.h - Declarations for the CW4010 extended instructions
 *            suitable for ATM Forum defined rate calculation.
 *            These declarations can be used with a GNU GCC compiler
 *
 * ------------------------------------------------------------ */

#ifndef _INSTR_H
#define _INSTR_H

#define maxi(a, b)                                                  \
({  int __z, __a = (a), __b = (b);                                  \
    __asm__ ("maxi %0,%1,%2" : "=r" (__z) : "r" (__a), "r"(__b) );  \
    __z; })

#define mini(a, b)                                                  \
({  int __z, __a = (a), __b = (b);                                  \
    __asm__ ("mini %0,%1,%2" : "=r" (__z) : "r" (__a), "r"(__b) );  \
    __z; })

#define rmul(a, b)                                                  \
({  int __z, __a = (a), __b = (b);                                  \
    __asm__ ("rmul %1,%2\n\tmflo %0" :                              \
             "=r" (__z) : "r" (__a), "r"(__b) : "h", "l" );         \
    __z; })

#define radd(a, b)                                                  \
({  int __z, __a = (a), __b = (b);                                  \
    __asm__ ("radd %1,%2\n\tmflo %0" :                              \
             "=r" (__z) : "r" (__a), "r"(__b) : "h", "l" );         \
    __z; })

#define rsub(a, b)                                                  \
({  int __z, __a = (a), __b = (b);                                  \
    __asm__ ("rsub %1,%2\n\tmflo %0" :                              \
             "=r" (__z) : "r" (__a), "r"(__b) : "h", "l" );         \
    __z; })

#define r2u(a)                                                      \
({  int __z, __a = (a);                                             \
    __asm__ ("r2u %1\n\tmflo %0" :                                  \
             "=r" (__z) : "r" (__a) : "h", "l" );                   \
    __z; })

#define u2r(a)                                                      \
({  int __z, __a = (a);                                             \
    __asm__ ("u2r %1\n\tmflo %0" :                                  \
             "=r" (__z) : "r" (__a) : "h", "l" );                   \
    __z; })

#endif
3.8.5 ABR Functions Header File (ABR.h)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
3-62
/* $Id: ABR.h,v 1.5 1996/07/03 01:04:30 zhifeng Exp $ */
/* -----------------------------------------------------------*
ATMizer-2
*
Copyright (C) 1995 LSI Logic Corporation
*
* Available Bit Rate - Source and Destination End System Behavior
*
* ABR.h - Header file for ABR functions
*
* -----------------------------------------------------------*/
#ifndef _ABR_H
#define _ABR_H
typedef struct {
/*
0 */ ulong
ICG;
/* Inter-Cell-Gap, in fractional fromat */
/* 4 */ ulong ThTxTime; /* Theoretical Transmit Time in fract format*/
/* 8 */ ulong LastTimeFRM; /* Last Time a Forward RM cell was sent*/
/* 12 */ uchar
logRIF:4,
logRDF:4;
/* 13 */ uchar
CRM;
/* limit of FRM in absence of BRM */
/* 14 */ uchar FRM_SinceBRM; /* count of FRM since last received BRM */
/* 15 */ uchar InRateCell;
/* Count of In-Rate cells since last FRM*/
/* 16 */ ushort ACR;
/* Allowed Cell Rate */
/* 18 */ ushort MCR;
/* Minimum Cell Rate */
/* 20 */ ushort ICR;
/* Initial Cell Rate */
/* 22 */ ushort PCR;
/* Peak Cell Rate */
/* 24 */ ushort PVec;
/* binary flags and some service parameters*/
/* 26 */ ushort
BRM_ER;
/* BRM: Explicit Rate */
/* 28 */ ushort
BRM_CCR;
/* BRM: Current Cell Rate */
/* 30 */ ushort
BRM_MCR;
/* BRM: Minimum Cell Rate */
} ABR_t, *pABR_t;
#define LCR (149.76e6/8/53)
/* Line Cell Rate */
/* ------------------------------------------------------------
Scheduling
BookL64364PG.fm5 Page 63 Friday, January 28, 2000 4:58 PM
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
*
*/
#define
#define
#define
#define
#define
ABR constant service parameters
logNRM
NRM
TRM
MRM
TCR
5
(1 << logNRM)
/* default 32 */
( (ulong) (0.1 * LCR) ) /* default 100 ms */
2
((1 << RATE_NZ) | (3 << RATE_EXP) | 128 )
/* decimal 10 in ABR rate format */
/* -----------------------------------------------------------
 *  Definition of PVec bitfield.
 *  Bits 1-0 store binary state variables
 *  Bits 4-2 store 3 bits from MSG field of BRM
 *  Bits 15-5 store optionally negotiable service parameters.
 *
 *       0   LastWasFRM
 *       1   PresBRM
 *       2   BRM_NI
 *       3   BRM_CI
 *       4   BRM_BN
 *     7:5   logCDF
 *    15:8   ADTF  (TM specs require 10 bits here, only 8 fit)
 *
 *  CDF has a default of 1/16 and is optionally negotiated.
 *  If you need to negotiate that value, use the definition below
 *
 *  #define logCDF ((pABR->PVec >> 6) & 3)
 *  otherwise the default one is faster.
 */
#define logCDF 4
/*
 * ADTF has a default value of 0.5 s and is optionally negotiated.
 * If you need to negotiate that value per VC, use the definition below.
 * This definition provides granularity of 80 ms which is not as good
 * as 10 ms required by TM specs, but it should be sufficient.
 * #define ADTF ( (ulong) ((pABR->PVec >> 9) * LCR * 10.23 / 128) )
 * otherwise the default one is (much) faster
 */
#define ADTF ( (ulong) (0.5 * LCR) )
/* -----------------------------------------------------------
 * Macros to manipulate binary state variables
 */
#define F_LastWasFRM   0x01
#define F_AllowIncACR  0x02
#define F_PresBRM      0x04
#define F_ALL          (F_LastWasFRM | F_AllowIncACR | F_PresBRM)
/*
 * Macros to manipulate Message Type field of a RM cell
 */
#define BRM_NI     0x10   /* No Increase */
#define BRM_CI     0x20   /* Congestion Indication */
#define BRM_BN     0x40   /* BECN Cell */
#define BRM_DIR    0x80   /* Direction, not stored */
#define BRM_ALL    (BRM_NI | BRM_CI | BRM_BN | BRM_DIR)
#define BRM_SHIFT  2      /* offset to shift BRM bits */
/* -----------------------------------------------------------
 * Private local macros
 */
/* outer parentheses keep the shift intact when the macro is used in an expression */
#define InterCellGap(x) ((((ulong) LCR) / r2u(x)) << TIME_FRAC)
#define RATE_EXP   9      /* Exponent part of ABR rate */
#define RATE_NZ    14     /* Valid bit of ABR rate */
#define RM_CDS     ((8 << 10) | (1 << 9))  /* RM cell buff descriptor */
#define ABR_ID     0x01
#endif
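The rate fields of ABR_t (ACR, MCR, ICR, PCR) and the TCR constant hold rates in the 16-bit floating-point format of the TM 4.0 specification: a nonzero bit at RATE_NZ, a 5-bit exponent starting at RATE_EXP, and a 9-bit mantissa. The following is a minimal sketch of how such a value decodes, assuming that layout. The helper rate_to_cps() is hypothetical and is not part of the ATMizer sources; the listings use an r2u() macro whose definition is not shown here.

```c
#include <stdint.h>

/* Bit positions as defined in ABR.h. */
#define RATE_EXP 9    /* exponent field starts at bit 9 (5 bits wide) */
#define RATE_NZ  14   /* nonzero (valid) bit */

/* Hypothetical helper: decode a 16-bit ABR rate into cells per second.
 * rate = nz * 2^e * (1 + m/512)
 */
static double rate_to_cps(uint16_t r)
{
    unsigned nz = (r >> RATE_NZ) & 1u;     /* 0 means "rate is zero" */
    unsigned e  = (r >> RATE_EXP) & 0x1Fu; /* 5-bit exponent */
    unsigned m  = r & 0x1FFu;              /* 9-bit mantissa */

    if (!nz)
        return 0.0;
    return (double)(1ul << e) * (1.0 + m / 512.0);
}
```

With this decoding, the TCR definition (exponent 3, mantissa 128) evaluates to 8 * 1.25 = 10 cells per second, which matches the "decimal 10" comment in the header.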
3.8.6 TxCell() and RxCell() (Cell.c)
/* $Id: Cell.c,v 1.9 1996/06/25 01:16:26 zhifeng Exp $ */
/* -----------------------------------------------------------
 *  ATMizer-2
 *
 *  Copyright (C) 1995–1999 LSI Logic Corporation
 *
 *  Cell.c - Receive and Transmit a cell
 * -----------------------------------------------------------*/
#include "uTypes.h"
#include "Config.h"
#include "Hdr.h"
#include "ATMizer2.h"
/* -----------------------------------------------------------
 *  Receive Cell
 *
 *  Name:         RxCell(const ulong aCell)
 *
 *  Description:  This function is called if the RxCell processor's
 *                request queue is not full and there is a cell in
 *                the ACI Receive Fifo. APU gets this cell and checks
 *                the cell header. If it is a RM cell, process it.
 *                Otherwise, invoke RxCell processor to process it.
 *
 *  Parameter:    aCell - address of the cell in Cell Buffer
 *
 *  Return value: None
 *
 * -----------------------------------------------------------*/
void RxCell(const ulong aCell)
{
    /* retrieve the header */
    ulong CellHdr = word(CBM[aCell + 4]);
#ifdef LOOP_BACK
    ulong ConNum = (CellHdr >> 4) + MAX_CON_NUM;
#else
    ulong ConNum = CellHdr >> 4;
#endif
    /*
     * Very Simple Cell Header Lookup:
     *   take only VCI (must be in range 0 .. CON_NUM-1)
     * OAM cells and signalling VCI are not processed
     */
    if (ConNum >= MAX_CON_NUM)
    {
        Stat.ErrConNum++;
        ACI_Free(aCell);
    }
    else {
        /* Check if it is a RM cell */
        if ( (CellHdr & PTI_HDR) == PTI_RM ) {
            if (word(CBM[aCell]) & one(CDS_Crc10)) {
                Stat.ErrCrc10++;
                ACI_Free(aCell);
            }
            else
                ABR_Receive(ConNum, aCell);
        }
        else
            EDMA_RxCell(ConNum, (CellHdr << 16) | aCell );
    }
    Stat.RxCells++;
}
/* -----------------------------------------------------------
 *  Transmit a Cell
 *
 *  Name:         TxCell(const ulong aCell)
 *  Description:  This function is called if the TxCell processor's
 *                request queue is not full and there is a free cell
 *                location. APU gets the ConNum from the Scheduler,
 *                checks the class of this connection and calls the
 *                corresponding procedures.
 *  Parameters:   aCell - address of a free cell location in Cell Buffer
 *  Return value: None
 * -----------------------------------------------------------*/
#define UBR_Send CBR_Send
QoS_Send_t *QoS_Send[] = {
    CBR_Send, VBR_Send, ABR_Send, UBR_Send };
void TxCell(const ulong aCell)
{
    ulong N, T;
    do {
        N = SCD_Serv();
        if ((N & U16) == 0) {
            if (OutOfRate.aCell) {
                EDMA_TxCell(0, OutOfRate.aCell);
                OutOfRate.aCell  = 0;
                OutOfRate.ConNum = 0;
                ACI_Free(aCell);
            }
            else {
                pCell_t pCell = (pCell_t) &CBM[aCell];
                pCell->CDS = CDS_IDLE;
                pCell->CellHdr = 0;
                EDMA_TxCell(0, aCell);
            }
            break;
        }
        if (QoS_Send[ConClass(N)](N, aCell)) {
            N &= U16;
            ACD[N].ThTxTime += ACD[N].ICG;
            T = ACD[N].ThTxTime;
            if ( IsInPast(T) )
                T = TimeNow + (1 << TIME_FRAC);
            SCD_Sched(0, (T >> TIME_FRAC) & CAL_SIZE_MASK);
            break;
        }
        ACD_UnSched(ACD[N]);
    } while (1);
    SCD_Tic();
    TimeNow += 1 << TIME_FRAC;
    Stat.TxCells++;
}
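The rescheduling step in TxCell() can be sketched in isolation: the theoretical transmit time advances by one inter-cell gap, and if the result already lies in the past, the connection is placed in the very next slot instead. The sketch below is a self-contained illustration only. The values of TIME_FRAC and CAL_SIZE_MASK and the helpers is_in_past() and next_slot() are assumptions made for the example, not definitions taken from the ATMizer headers.

```c
#include <stdint.h>

#define TIME_FRAC      8        /* assumed: number of fractional bits in a time value */
#define CAL_SIZE_MASK  0x3FFu   /* assumed: calendar with 1024 slots */

/* Hypothetical stand-in for IsInPast(): the signed difference stays
 * correct when the 32-bit time counter wraps around. */
static int is_in_past(uint32_t t, uint32_t now)
{
    return (int32_t)(t - now) < 0;
}

/* Advance the theoretical transmit time by one inter-cell gap and
 * return the calendar slot in which the next cell should be scheduled. */
static uint32_t next_slot(uint32_t *th_tx_time, uint32_t icg, uint32_t now)
{
    uint32_t t;

    *th_tx_time += icg;
    t = *th_tx_time;
    if (is_in_past(t, now))
        t = now + (1u << TIME_FRAC);   /* running late: use the next slot */
    return (t >> TIME_FRAC) & CAL_SIZE_MASK;
}
```

Keeping the fractional part in ThTxTime while scheduling only on the integer slot lets a connection with a gap of, say, 2.5 slots alternate between gaps of 2 and 3 slots without accumulating rounding error.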
3.8.7 Transmit a CBR Cell (CBR.c)
/* $Id: CBR.c,v 1.2 1996/06/04 00:35:45 zhifeng Exp $ */
/* -----------------------------------------------------------
 *  ATMizer-2
 *
 *  Copyright (C) 1995–1999 LSI Logic Corporation
 *
 *  CBR.c - Send a cell from a Constant Bit Rate connection.
 * -----------------------------------------------------------*/
#include "uTypes.h"
#include "Hdr.h"
#include "ATMizer2.h"
/* -----------------------------------------------------------
 * Name:        CBR_Send()
 * Description: Behavior for a transmit CBR cell
 * parameters:  ConNum: BuffPres|Class|Connection Number (from Scheduler)
 *              aCell:  cell address in Cell Buffer
 * returns status code:
 *      0   no cell sent
 *      1   data cell sent
 * -----------------------------------------------------------*/
int CBR_Send(const ulong ConNum, const ulong aCell)
{
    if (ConHasData(ConNum)) {
        EDMA_TxCell(ConNum, aCell);
        return 1;
    }
    return 0;
}
3.8.8 Transmit a VBR Cell (VBR.c)
/* $Id: VBR.c,v 1.6 1996/08/06 19:02:47 thomasd Exp $ */
/* -----------------------------------------------------------
 *  ATMizer-2
 *
 *  Copyright (C) 1995–1999 LSI Logic Corporation
 *
 *  VBR.c - Send a cell from a VBR connection
 * -----------------------------------------------------------*/
#include "uTypes.h"
#include "Hdr.h"
#include "ATMizer2.h"
#include "Instr.h"
/* -----------------------------------------------------------
 * Name:        VBR_Send()
 * Description: Behavior for a transmit VBR cell
 * parameters:  ConNum: BuffPres|Class|Connection Number (from Scheduler)
 *              aCell:  cell address in Cell Buffer
 * returns status code:
 *      0   no cell sent
 *      1   data cell sent
 * -----------------------------------------------------------*/
int VBR_Send(const ulong ConNum, const ulong aCell)
{
    if (ConHasData(ConNum)) {
        ulong N = ConNum & U16;
        ACD[N].Bucket -= TimeNow - (ACD[N].ThTxTime - ACD[N].ICG);
        ACD[N].Bucket = maxi(0, ACD[N].Bucket) + ACD[N].Increment;
        ACD[N].ICG = maxi(ACD[N].ICG_PCR, ACD[N].Bucket - ACD[N].Limit);
        ACD[N].ThTxTime = TimeNow;
        EDMA_TxCell(N, aCell);
        return 1;
    }
    return 0;
}
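The bucket arithmetic in VBR_Send() behaves like a leaky bucket: the bucket drains by the time elapsed since the previous cell, fills by Increment for each cell sent, and the next gap grows once the fill exceeds Limit. The following is a simplified, self-contained sketch of that update. The type vbr_t and the function vbr_on_send() are hypothetical, and the sketch stores the previous send time directly instead of deriving it from ThTxTime and ICG as the listing does.

```c
#include <stdint.h>

/* Hypothetical per-connection state, in arbitrary time units. */
typedef struct {
    int32_t bucket;     /* current bucket fill */
    int32_t increment;  /* fill added per cell sent (gap at the sustained rate) */
    int32_t limit;      /* burst tolerance */
    int32_t icg_pcr;    /* minimum gap (gap at the peak rate) */
    int32_t last_tx;    /* time the previous cell was sent */
} vbr_t;

static int32_t max_i32(int32_t a, int32_t b) { return a > b ? a : b; }

/* Update the bucket for a cell sent at time `now`; return the gap to the next cell. */
static int32_t vbr_on_send(vbr_t *c, int32_t now)
{
    int32_t fill = c->bucket - (now - c->last_tx);   /* drain by elapsed time */

    c->bucket  = max_i32(0, fill) + c->increment;    /* never below empty, then fill */
    c->last_tx = now;
    return max_i32(c->icg_pcr, c->bucket - c->limit);
}
```

Starting from an empty bucket with increment 10, limit 25 and a peak-rate gap of 2, the returned gaps are 2, 2, 2, 9, 10, 10 and so on: a short burst at the peak rate, followed by the sustained rate.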
3.8.9 Transmit and Receive ABR Cells (ABR.c)
/* $Id: ABR.c,v 1.5 1996/06/04 01:15:55 zhifeng Exp $ */
/* -----------------------------------------------------------
 *  ATMizer-2
 *
 *  Copyright (C) 1995–1999 LSI Logic Corporation
 *
 *  ABR.c - Available Bit Rate Source and Destination End System Behavior
 *
 *  As per ATM Forum Traffic Management specifications v4.0 (af-tm-0056.000)
 *  The following simplifications that are believed to be realistic
 *  were made:
 *      1. Following parameters are set to constant
 *         Nrm = 32, Trm = 100 ms, CDF = 1/16, ADTF = 0.5 s
 *      2. CRM range is 0 .. 255
 *      3. Sending out-of-rate cells at TCR is not implemented
 * -----------------------------------------------------------*/
#include "uTypes.h"
#include "ABR.h"
#include "Hdr.h"
#include "Instr.h"
#include "ATMizer2.h"
/* -----------------------------------------------------------
 * Name:        ABR_Send()
 * Description: Behavior for a transmit ABR cell
 * parameters:  ConNum: BuffPres|Class|Connection Number (from Scheduler)
 *              aCell:  cell address in Cell Buffer
 * returns status code:
 *      0   no cell sent
 *      1   data cell sent
 *      2   backward RM cell sent
 *      3   forward RM cell sent
 * -----------------------------------------------------------*/
int ABR_Send(const ulong ConNum, const ulong aCell)
{
    register pABR_t pABR = (pABR_t) &ACD[ConNum & U16];
    /*
     * The subtraction TimeNow-LastTimeFRM works even when TimeNow wraps
     * around. It is mandatory that TimeNow is maintained as a ulong
     * variable incremented at each cell slot.
     */
    register ulong TimeDiff = (TimeNow - pABR->LastTimeFRM) >> TIME_FRAC;
    /* ----- Source Rule 3a -----
     * - After the first in-rate forward RM cell, in-rate cells shall
     * - be sent in the following order:
     * - a. The next in-rate cell shall be in-rate forward RM cell
     * -    if and only if, since the last in-rate forward RM cell was
     *      sent, either:
     *      i.  at least Mrm in-rate cells have been sent and
     *          at least Trm time has elapsed
     *      or
     *      ii. Nrm-1 in-rate cells have been sent
     */
    if ( pABR->InRateCell >= NRM ||
         (TimeDiff > TRM && pABR->InRateCell > MRM) ) {
        /* ------ Source Rule 5 -----
         * - Before sending a forward in-rate RM-cell, if ACR > ICR
         * - and the time T that has elapsed since the last in-rate
         * - forward RM-cell was sent is greater than ADTF, then ACR
         *   shall be reduced to ICR.
         */
        if (TimeDiff > ADTF && pABR->ACR > pABR->ICR)
            pABR->ACR = pABR->ICR;
        /* ------ Source Rule 6 -----
         * - Before sending in-rate forward RM cell and after adjusting
         * - ACR according to Rule 5 above, if at least CRM in-rate
         * - forward RM-cells have been sent since the last backward
         * - RM-cell with BN = 0 was received, then ACR shall be
         * - reduced by at least ACR*CDF, unless this reduction would
         * - result in a rate below MCR, in which case ACR shall be
         *   set to MCR
         * Expression evaluation:
         *   ACR = ACR - ACR * CDF = ACR - ACR / (1/CDF)
         *       = ACR - ( (ACR.exp - logCDF) | ACR.frac )
         *       = rsub(ACR, ACR - (logCDF << RATE_EXP))
         */
        if (pABR->FRM_SinceBRM >= pABR->CRM) {
            /* the subtraction below may underflow - but then the NZ bit
             * will be cleared, effectively resetting the result to 0
             */
            pABR->ACR = rsub(pABR->ACR, pABR->ACR - (logCDF << RATE_EXP) );
            pABR->ACR = maxi(pABR->ACR, pABR->MCR);
        }
        /*
         * build and send in-rate forward RM cell according to
         * Source Rules 4, 7, 10.
         */
        word(CBM[aCell])      = RM_CDS;
        /* RM Cell Header with PTI set to 6 (PTI_RM) */
        word(CBM[aCell +  4]) = word(VCD[ConNum & U16].CellHdr) | PTI_RM;
        /* ID = 1, Msg = 0, ER = PCR */
        word(CBM[aCell +  8]) = (ABR_ID << 24) | pABR->PCR;
        /* CCR = ACR, MCR = MCR */
        word(CBM[aCell + 12]) = word(pABR->ACR);
        EDMA_TxCell(0, aCell);
        pABR->FRM_SinceBRM++;
        pABR->InRateCell  = 1;
        pABR->LastTimeFRM = TimeNow;
        pABR->PVec       |= F_LastWasFRM;
        pABR->ICG         = InterCellGap(pABR->ACR);
        return 3;
    }
    /* ------ Source Rule 3-b -----
     * - b. The next in-rate cell shall be a backward RM cell if
     *      condition a. above is not met, if a backward RM cell is
     *      waiting for transmission and if either:
     *      i.  no in-rate backward RM cell has been sent since the last
     *          in-rate forward RM cell
     *      ii. no data cell is waiting for transmission
     */
    else if ( (pABR->PVec & F_PresBRM) &&
              (!ConHasData(ConNum) || (pABR->PVec & F_LastWasFRM))) {
        /*
         * build and send in-rate backward RM cell
         */
        register ulong Msg = (ABR_ID << 8) | BRM_DIR |
                             ((pABR->PVec << BRM_SHIFT) & BRM_ALL);
        if (VCD[ConNum & U16].VCD_Ctrl.EFCI)
            Msg |= BRM_CI;
        /* Cell Descriptor */
        word(CBM[aCell])      = RM_CDS;
        /* Cell Header with PTI = 6 (PTI_RM) */
        word(CBM[aCell +  4]) = word(VCD[ConNum & U16].CellHdr) | PTI_RM;
        /* ID = 1, DIR=1, (BN, CI, NI) <- BRM, ER <- BRM_ER */
        word(CBM[aCell +  8]) = (Msg << 16) | pABR->BRM_ER;
        /* CCR <- BRM_CCR, MCR <- BRM_MCR */
        word(CBM[aCell + 12]) = word(pABR->BRM_CCR);
        EDMA_TxCell(0, aCell );
        pABR->InRateCell++;
        pABR->PVec &= ~F_LastWasFRM & ~F_PresBRM;
        /*
         * If the waiting out of rate cell is from the same
         * connection, discard it because it is outdated.
         */
        if (OutOfRate.ConNum == (ConNum & U16)) {
            ACI_Free(OutOfRate.aCell);
            OutOfRate.aCell  = 0;
            OutOfRate.ConNum = 0;
        }
        return 2;
    }
    /* ------ Source Rule 3-c -----
     * - c. The next in-rate cell sent shall be a data cell if neither
     *      condition a. nor condition b. above is met, and if a data
     *      cell is waiting for transmission
     */
    else if ( ConHasData(ConNum)) {
        /*
         * send a data cell.
         */
        EDMA_TxCell(ConNum, aCell);
        pABR->InRateCell++;
        return 1;
    }
    return 0;
}
/* -----------------------------------------------------------
 * Name:        ABR_Receive()
 * Description: Behavior for a received RM cell
 * parameters:  ConNum: Connection Number after header look-up
 *              aCell:  cell address in Cell Buffer
 * return code:
 *      0   received backward RM cell
 *      1   received forward RM cell
 * -----------------------------------------------------------*/
int ABR_Receive( const ulong ConNum, const ulong aCell)
{
    register pABR_t pABR = (pABR_t) &ACD[ConNum & U16];
    /*
     * get ID, Msg, ER fields in RM cell
     */
    register ulong ER  = word(CBM[aCell + 8]);
    register ulong Msg = ER >> 16;
    ER &= U16;
    /*
     * Test for bit DIR that occupies sign position after the shifting
     */
    if ( Msg & BRM_DIR ) {                 /* if DIR == Backward */
        /* ------ Source rule 8a -------
         * - When a backward RM cell is received with CI=1
         * - then ACR shall be reduced by at least ACR*RDF,
         * - unless that reduction would result in a rate below MCR
         * - in which case ACR shall be set to MCR.
         */
        if ( Msg & BRM_CI ) {              /* if CI set in BRM */
            /*
             * Expression evaluation:
             *   ACR = ACR - ACR * RDF = ACR - ACR / (1/RDF)
             *       = ACR - ( (ACR.exp - logRDF) | ACR.frac )
             *       = rsub(ACR, ACR - (logRDF << RATE_EXP))
             * RDF is power of 2 in range 1..1/32,768
             * logRDF is stored in PVec (4 bits)
             *
             * the subtraction below may underflow - but then the
             * NZ bit will be cleared, effectively resetting X to 0
             */
            register ulong X = pABR->ACR - (pABR->logRDF << RATE_EXP);
            pABR->ACR = rsub(pABR->ACR, X);
        }
        /* ------ Source rule 8b -----
         * - If the backward RM cell has both CI=0 and NI=0 then
         * - ACR may be increased by no more than RIF*PCR, to a rate
         * - not greater than PCR
         */
        else if ( (Msg & BRM_NI) == 0 ) {
            /*
             * ACR = ACR + RIF * PCR = ACR + PCR / (1/RIF)
             *     = ACR + ( (PCR.exp - logRIF) | PCR.frac )
             *     = ACR + PCR - (logRIF << RATE_EXP)
             * RIF is power of 2 in range 1/32768 .. 1
             */
            ulong X = pABR->PCR - (pABR->logRIF << RATE_EXP);
            pABR->ACR = radd(pABR->ACR, X);
            pABR->ACR = mini(pABR->ACR, pABR->PCR);
        }
        /* ------ Source rule 8b and 9 -----
         * - When a backward RM-cell is received and after ACR is
         * - adjusted according to Source Rule 8, if ACR is greater
         * - than ER from RM cell, then ACR shall be reduced to no
         * - greater than ER, unless ER is less than MCR, in which
         * - case ACR shall be set to MCR
         */
        pABR->ACR = mini(ER, pABR->ACR);
        pABR->ACR = maxi(pABR->ACR, pABR->MCR);
        pABR->ICG = InterCellGap(pABR->ACR);
        if (( Msg & BRM_BN) == 0)   /* if it is source generated RM cell */
            pABR->FRM_SinceBRM = 0;
        ACI_Free(aCell);
        return 0;
    }
    else {                                 /* if DIR == Forward */
        /* ------ Destination rule 3 -----
         * Destination rule 3 has 5 options. The option implemented is
         * described below. This is believed to be the most reasonable
         * although it is not the cheapest one.
         *
         * - If a forward RM cell is received by the destination while
         * - another turned-around RM cell (on the same VC) is scheduled
         * - for in-rate transmission:
         * - a. The contents of the old cell are overwritten by the
         *      contents of the new cell
         * - b. The old cell (after being overwritten) shall be sent
         *      out-of-rate
         * - c. The new cell is scheduled for in-rate transmission
         */
        pABR->PVec = (pABR->PVec & ~(BRM_ALL >> BRM_SHIFT)) |
                     ((Msg & BRM_ALL) >> BRM_SHIFT);
        pABR->BRM_ER        = ER;
        word(pABR->BRM_CCR) = word(CBM[aCell + 12]);
        /*
         * Check whether a BRM cell is already present for the current
         * connection. If so, replace the contents of the old one
         * and send it out-of-rate.
         */
        if (pABR->PVec & F_PresBRM) {
            if (OutOfRate.aCell)
                ACI_Free(OutOfRate.aCell);
            ER |= BRM_DIR;
            if (VCD[ConNum & U16].VCD_Ctrl.EFCI)
                ER |= BRM_CI;
            word(CBM[aCell])      = RM_CDS;
            word(CBM[aCell + 4]) |= CELL_CLP;
            half(CBM[aCell + 8])  = ER;
            OutOfRate.aCell       = aCell;
            OutOfRate.ConNum      = ConNum & U16;
        }
        else
            ACI_Free(aCell);
        pABR->PVec |= F_PresBRM;
        /*
         * Reschedule this connection if it is not currently scheduled
         */
        if (!ACD_IsSched(ACD[ConNum & U16])) {
            SCD_Sched(ConNum & U16, \
                      ((TimeNow >> TIME_FRAC) + 1) & CAL_SIZE_MASK );
            ACD_Sched(ACD[ConNum & U16]);
        }
        return 1;
    }
}
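Source Rules 6 and 8 scale a rate by a power-of-two factor (CDF, RDF or RIF) by subtracting the logarithm of the divisor from the exponent field of the 16-bit rate, which avoids a multiplication. The sketch below illustrates the trick on its own, assuming the rate layout given in ABR.h (nonzero bit at RATE_NZ, 5-bit exponent at RATE_EXP, 9-bit mantissa). The helper names are hypothetical and do not appear in the ATMizer sources.

```c
#include <stdint.h>

#define RATE_EXP 9    /* exponent field starts at bit 9 (5 bits wide) */
#define RATE_NZ  14   /* nonzero (valid) bit */

/* Hypothetical helper: divide a nonzero rate by 2^log2div (log2div <= 15).
 * The subtraction lowers the exponent. If the exponent is smaller than
 * log2div, the borrow clears the NZ bit, so the result reads as rate 0. */
static uint16_t rate_shift_down(uint16_t r, unsigned log2div)
{
    return (uint16_t)(r - (log2div << RATE_EXP));
}

/* Field accessors used to inspect the result. */
static int      rate_is_nonzero(uint16_t r) { return (r >> RATE_NZ) & 1; }
static unsigned rate_exp(uint16_t r)        { return (r >> RATE_EXP) & 0x1Fu; }
static unsigned rate_frac(uint16_t r)       { return r & 0x1FFu; }
```

For example, a rate with exponent 10 and mantissa 256 (1536 cells/s) shifted down by 4 keeps its mantissa and gets exponent 6 (96 cells/s), which is 1536 * 1/16. The listings then pass such a value to rsub() or radd() to apply the decrease or increase.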
Chapter 4
Unschedule
This chapter describes the motivation and the procedure to unschedule
a connection that is already scheduled in the calendar of the ATMizer II+
chip and includes the following sections:
• Section 4.1, “Introduction”
• Section 4.2, “Unschedule Routine”
4.1 Introduction
When a connection needs to be unscheduled, it must be removed from
the calendar table. There are two reasons that a connection may need
to be unscheduled.
1. The rate for a scheduled connection increases. This can introduce
jitter for the connection when the intercell gap decreases. Hence, a
connection that is scheduled many slots down in the calendar now
has to be rescheduled closer to the current slot. To ensure that the
same connection does not appear twice in the calendar, the
connection should be unscheduled from its previous slot and then
rescheduled in the new slot.
2. The connection is closed. When a connection is closed, it is
necessary to unschedule or remove the connection from the
calendar to make sure that no more cells are sent for the connection.
4.2 Unschedule Routine
The following code illustrates the unscheduling of a connection
(Figure 4.1):
Figure 4.1   Unschedule Routine
 1 /* Unschedule if Connection in Calendar */
 2 void UnscheduleCal(ulong ConNum){
 3     ushort *Calendar;
 4     ushort CurrConNum;
 5     uchar CalNo = (uchar) ((ACD[ConNum].ACD_Ctrl >> 13) & 0x03);
 6     Calendar = (ushort *)CalendarAddr[CalNo];
 7
 8     if ( ACD_IsSched(ACD[ConNum]) ) {   /* only if the ACD marks it as scheduled */
 9
10         if(ACD[ConNum].ThTxTime > SCD_Now()){
11             if(word(Read_SCD_Ctrl()) & ONE(SCD_FlatMode))
12                 CurrConNum = Calendar[(ACD[ConNum].ThTxTime)*2];
13             else
14                 CurrConNum = Calendar[ACD[ConNum].ThTxTime];
15
16             if(CurrConNum == ConNum){
17                 if(word(Read_SCD_Ctrl()) & ONE(SCD_FlatMode)){
18                     Calendar[(ACD[ConNum].ThTxTime)*2] = VCD[ConNum].NextVCD;
19                     /* if only one conn in slot*/
20                     if(Calendar[((ACD[ConNum].ThTxTime)*2)+1] == ConNum)
21                         Calendar[((ACD[ConNum].ThTxTime)*2)+1] = 0;
22                 }
23                 else
24                     Calendar[ACD[ConNum].ThTxTime] = VCD[ConNum].NextVCD;
25             }
26             else{
27                 while(VCD[CurrConNum].NextVCD != ConNum){
28                     CurrConNum = VCD[CurrConNum].NextVCD;
29                 }
30                 VCD[CurrConNum].NextVCD = VCD[ConNum].NextVCD;
31                 if(word(Read_SCD_Ctrl()) & ONE(SCD_FlatMode))
32                     if(Calendar[((ACD[ConNum].ThTxTime)*2)+1] == ConNum)
33                         Calendar[((ACD[ConNum].ThTxTime)*2)+1] = CurrConNum;
34             }
35             ACD_UnSched(ACD[ConNum]);
36         }
37     }
38 }
The sample code takes the following steps to unschedule connection ConNum:
Line 6   Get the base address of the calendar in which the connection
         is scheduled (this may not be the current calendar).
Line 8   Make sure the connection is scheduled in the calendar by
         checking the ACD.
Line 10  Check the slot (ThTxTime) where the connection is supposed to
         be scheduled in the calendar. If ThTxTime is greater than the
         Now slot, the connection is not yet merged into the internal
         cache of the scheduler and can still be removed from the
         calendar. If ThTxTime is less than or equal to the current
         slot, the connection is already cached by the scheduler. The
         scheduler has no knowledge of the connection until the
         connection is cached.
Line 20  Once you know that the connection is still in the calendar,
         traverse the linked list of its slot to find the connection
         and remove it. First get the head connection of the calendar
         slot (line 12 for Flat mode, line 14 for Priority mode). If
         the head of the slot is the connection to be unscheduled
         (line 16), make the NextVCD field of the connection the new
         head of the slot. In Flat mode, the scheduler keeps both the
         head and the tail of the slot, so also check whether the
         connection is the only one in the slot. If so, clear the tail
         of the slot.
Line 26  If the connection is elsewhere in the slot, walk the list
         until you find the connection, then remove it from the list.
Line 35  Once the connection is removed from the list, the ACD has to
         be updated to reflect it.
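The list manipulation described in the steps above can be sketched on its own as the removal of one element from a singly linked list that has both a head and a tail, as a Flat-mode slot does. The following self-contained example is an illustration only. The array next_con[] stands in for the NextVCD fields, slot_t stands in for one calendar slot, and all names and sizes are hypothetical. Unlike Figure 4.1, the walk stops at the end of the list, so a connection that is not in the slot is reported instead of looping.

```c
#include <stdint.h>

#define MAX_CON 16                  /* assumed number of connections in the sketch */

static uint16_t next_con[MAX_CON];  /* stand-in for VCD[n].NextVCD; 0 ends a list */

typedef struct {
    uint16_t head;                  /* first connection in the slot, 0 if empty */
    uint16_t tail;                  /* last connection in the slot, 0 if empty */
} slot_t;

/* Remove connection `con` from `slot`. Returns 1 if it was found, 0 otherwise. */
static int slot_remove(slot_t *slot, uint16_t con)
{
    uint16_t cur = slot->head;

    if (con == 0 || cur == 0)
        return 0;
    if (cur == con) {                       /* the connection is the head */
        slot->head = next_con[con];
        if (slot->tail == con)
            slot->tail = 0;                 /* it was the only entry */
        next_con[con] = 0;
        return 1;
    }
    while (next_con[cur] != 0 && next_con[cur] != con)
        cur = next_con[cur];                /* find the predecessor */
    if (next_con[cur] != con)
        return 0;                           /* not in this slot */
    next_con[cur] = next_con[con];          /* unlink */
    if (slot->tail == con)
        slot->tail = cur;                   /* predecessor becomes the tail */
    next_con[con] = 0;
    return 1;
}
```

In Priority mode only the head is kept, so the two tail updates would simply be omitted.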
If the connection is cached internally in the scheduler, it may not be
worthwhile to uncache it. The reason is that, if the connection is now
faster than before, it is not possible to schedule it any faster than the
current slot and, if the connection is to be closed, then you can close it
once it is served. By reading the head and tail registers it is possible to
know when the connection will be served and then you can take
appropriate action (either remove it so that it is closed or schedule it with
the new ICG).
Chapter 5
Hashing Function
This chapter describes the hashing function implemented in the ATMizer
II+ chip and includes the following sections:
• Section 5.1, “Hashing Mechanism”
• Section 5.2, “Hashing Function”
• Section 5.3, “Hash Implementation”
5.1 Hashing Mechanism
ATM technology is connection oriented and the data flow between two
end-station entities is based on an established virtual connection
between them. The routing information for the cells that hold the data
is carried in the cell header; the address space comprises 24 bits,
subdivided into two fields (VPI and VCI). At the end stations, the cells are
processed based on a connection number. Typically, the maximum
number of connections that an end station processes is much smaller
than the address space available in the cell header. Therefore, a need
exists for a hashing mechanism to obtain the connection number of a cell
based on the cell header value.
The input value to the hashing mechanism is the cell header and the
output is the connection number corresponding to the cell header. Thus,
the hash table indexes each cell header with the corresponding
connection number and enables the retrieval of the connection number
based on the cell header. The hashing function uses the cell header to
generate an index and the entry in the table the index points to is
checked to yield the connection number for the cell header. Since the
address space is much larger than the maximum number of connections
(and hence the maximum size of the table), there is a possibility that two
distinct cell headers can give rise to the same index value. This is
referred to as a “collision.” Therefore, all the cell headers that give rise
to the same index value are linked together by means of a linked list. If
a hash table slot holds more than one entry, the entries are searched to find
the connection number that corresponds to the cell header. The hash
table is structured in an array as defined in Figure 5.1.
Figure 5.1   Hashing Table Declarations
/* -----------------------------------------------------
 *  Hash Type/Function Type
 */
typedef struct Hash_Entry_t Hash_Entry_t, *pHash_Entry_t;
struct Hash_Entry_t {
    ulong         ConNum;     /* Connection number */
    ulong         VPI_VCI;    /* VPI_VCI field of the Cell header */
    pHash_Entry_t Hash_Next;  /* Pointer to next Hash entry */
};
The collision resolution described above is referred to as “chained
addressing.” The size of the array that constitutes the hash table is
determined by the maximum number of connections and by the hash
function used to compute the index value from the cell header. The next
section presents the details of the hashing function and discusses some
implementation issues.
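A lookup with chained addressing can be sketched as follows: compute the index from the cell header, then walk the chain that starts at that table entry until the stored VPI_VCI matches. The structure repeats the declarations of Figure 5.1. The function HashLookup() and its parameters are hypothetical (this guide does not show a lookup routine), and the sketch assumes that connection number 0 marks an unused entry.

```c
#include <stddef.h>

typedef unsigned long ulong;

/* Declarations as in Figure 5.1. */
typedef struct Hash_Entry_t Hash_Entry_t, *pHash_Entry_t;
struct Hash_Entry_t {
    ulong         ConNum;     /* Connection number, 0 assumed to mean unused */
    ulong         VPI_VCI;    /* VPI_VCI field of the cell header */
    pHash_Entry_t Hash_Next;  /* Pointer to next hash entry */
};

/* Hypothetical lookup: returns the connection number for vpi_vci,
 * or 0 if no connection is registered for that header. */
static ulong HashLookup(const Hash_Entry_t *table, ulong prime, ulong vpi_vci)
{
    const Hash_Entry_t *e = &table[vpi_vci % prime];

    for (; e != NULL; e = e->Hash_Next)       /* walk the collision chain */
        if (e->ConNum != 0 && e->VPI_VCI == vpi_vci)
            return e->ConNum;
    return 0;
}
```

The cost of a lookup is proportional to the length of the chain at the computed index, which is why an even spread of index values matters.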
5.2 Hashing Function
The key step in obtaining the connection number from the cell header is
the computation of the index of the hash table where the cell header and
the connection number are stored. This function necessarily maps more
than one cell header (or VPI/VCI value) to one index since it cannot be
a one-to-one function. However, the function is chosen such that the
spread of the index numbers is statistically even across the range of cell
header values, given that the cell header values are randomly chosen
and are equally likely. This is best achieved if the index is computed as
the mod function of the cell header value with respect to a prime number.
Therefore, the design variable in the choice of the hashing function and
hence the hash table is the prime number for the computation of the
index using the mod function.
The total number of connections that can be processed by the ATMizer
II+ chip is determined by the number of VCD structures that can be
supported in the memory up to a maximum of 2^24. Let the total number
of receive connections that are supported in an application be Nmax.
Therefore, the prime number for the computation of the hash function is
less than Nmax. The simplest choice of this prime number is that it is the
largest prime less than Nmax. The hashing function can be chosen to be
the mod function with respect to the largest prime number less than the
maximum number of connections.
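As a worked example, assume Nmax = 1024. The largest prime below 1024 is 1021, so a cell header value of 1026 maps to index 1026 mod 1021 = 5, the same index as header value 5 (a collision). The short sketch below reproduces this arithmetic. The helper names are hypothetical, and the prime search uses plain trial division rather than the routine of Figure 5.3.

```c
/* Hypothetical helper: trial-division primality test. */
static int is_prime(unsigned long n)
{
    unsigned long d;

    if (n < 2)
        return 0;
    for (d = 2; d * d <= n; d++)
        if (n % d == 0)
            return 0;
    return 1;
}

/* Largest prime strictly less than nmax, or 0 if there is none. */
static unsigned long largest_prime_below(unsigned long nmax)
{
    unsigned long p;

    for (p = nmax; p-- > 2; )      /* tests nmax-1, nmax-2, ..., 2 */
        if (is_prime(p))
            return p;
    return 0;
}

/* Hash index of a cell header value for a given prime. */
static unsigned long hash_index(unsigned long vpi_vci, unsigned long prime)
{
    return vpi_vci % prime;
}
```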
5.3 Hash Implementation
The Hashing Table size is based on the total available memory space.
The trade-off is that a larger table spreads the hashed entries more
widely and evenly and reduces the chance of a collision, at the cost of
memory. Once the size of the Hashing Table is determined, the table
needs to be initialized. Since each entry in the hash table is actually
the head of a flat linked list, the user also needs to reserve and
initialize a free entry pool. An entry is taken from this pool when a
connection hashes to a table entry that is already taken by another
connection. Figure 5.2 shows the C routine to accomplish this task.
Figure 5.2   Hashing Table Initialization
static void InitHash()
{
    pHash_Entry_t temp;
    ulong n;
    /*
     * HashTable entries are already cleared.
     * Need to create a free entry pool.
     */
    /* HashTable is indexed as an array of Hash_Entry_t (Figure 5.4), so
     * pointer arithmetic already scales by the entry size */
    freeEntryPool = HashTable + MAX_CON_NUM;
    temp = freeEntryPool;
    for(n = 0; n < MAX_CON_NUM; n++)
        temp[n].Hash_Next = &temp[n + 1];
    temp[n - 1].Hash_Next = 0;    /* terminate the free list */
}
After the initialization of the hash table, the program needs to calculate
the prime number. The C code segment for the calculation is shown in
Figure 5.3.
Figure 5.3   Find Prime Routine
ulong FindPrime(ulong N)
{
    ulong i, j, temp;
    /* Find the largest prime less than the TableSize */
    for (i = 1; i < N; i++) {
        temp = N - i;
        for (j = temp / 2; j > 1; j--) {
            if ( (temp % j) == 0 ) {
                break;
            }
        }
        if (j == 1) {
            return temp;
        }
    }
    return 0;
}
Since each entry in the hash table is actually the head of a flat linked
list, the insertion procedure needs to check whether there is a connection
already in this slot. If so, it takes one free entry from the free entry
pool and links it into the list of that slot (the sample code inserts it
right after the head). Figure 5.4 shows the C code segments for these
actions.
Figure 5.4   Inserting a Connection into the Hashing Table
/****************** Insertion *******************/
/*
 * Now insert this connection into the Hashing Table.
 * Note, only Rx direction needs hashing table.
 * On SDP we always loop Tx cells back as Rx cells, so the
 * cell header remains the same.
 */
/* Entry in the hashing table = VPI_VCI MOD prime */
{
    ulong n, VPI_VCI;
    pHash_Entry_t entry;
    VPI_VCI = Tx;
    n = VPI_VCI % prime;
    /* If HashTable[n].ConNum = 0, then insert into this entry */
    if (HashTable[n].ConNum == 0) {
        HashTable[n].ConNum = Rx;
        HashTable[n].VPI_VCI = VPI_VCI;
    }
    else { /* else, always insert the new one right after the
              head of the list */
        entry = freeEntryPool;
        freeEntryPool = entry->Hash_Next;
        entry->ConNum = Rx;
        entry->VPI_VCI = VPI_VCI;
        entry->Hash_Next = HashTable[n].Hash_Next;
        HashTable[n].Hash_Next = entry;
    }
}
Chapter 6
Packet Aging
This chapter describes the packet aging function in the ATMizer II+ chip
and includes the following sections:
• Section 6.1, “Introduction”
• Section 6.2, “Mailbox Processing”
• Section 6.3, “Packet Aging Routine”
6.1 Introduction
The concept of packet aging is the notification to the host of idle
connections that have not received a cell for a predefined period. The
ATM Processing Unit (APU) samples the Virtual Connection Descriptor
(VCD) and examines the TimeStamp value on the VCD to determine if
the connection has to be labeled as an idle connection. In the present
scheme, the APU terminates the unfinished buffer by issuing an
EDMA_RxCell command together with an End of Message (EOM) cell. The
buffer is terminated with (possibly) the ErrCRC, ErrLength and ErrAbort
bits set, and the EDMA places the terminated buffer in the
EDMA_RxCompl queue. The APU then places this buffer on the ring in
sequence with the rest of the buffers that correspond to the
connection.
The idle connection number and the terminated buffer number are sent
to the host so that the connection status can be updated to the idle state.
The APU sends the host a message through the Mailbox with the
address of the HCD_Rx structure in the PCI memory where the
connection number and the buffer number are placed by the APU. The
host samples the Mailbox and, if it is not empty, retrieves the message
and processes it. Also, the buffer that is retrieved from the ring is
processed according to the buffer processing policy of the host. In the
current scheme, there is no identifier in the buffer status bits to indicate
that the buffer was terminated by the APU and thereby should be
differentiated from other buffers that have CRC, length, or abort errors.
Therefore, the possibility exists that the buffer from the ring is retrieved
before the Mailbox message is processed. The host’s buffer and
connection processing protocols should be designed to account for this
possibility.
The discussion of the above scheme can be summarized as follows: the
APU performs the packet aging routine and informs the host of the idle
connections based on the expiration of the TimeStamp values in the
VCD.
6.2 Mailbox Processing
The APU uses the Mailbox to send the host a message about a
connection that has become idle. The connection number and
the terminated buffer number are placed in the HCD_Rx structure in the
PCI memory. The layout of the PCI memory is shown below in Figure 6.1.
Figure 6.1
HCD_Rx Structure Declarations
/* layout of the primary PCI memory */
typedef struct {
    ulong ConNum;       /* Connection Number */
    ulong CellHeader;   /* Cell Header in Tx direction; In Rx - BuffNum */
    ulong Class;        /* Class of the connection */
    ulong PCR;          /* PCR of the connection */
    ulong SCR_MCR;      /* SCR(VBR) or MCR(ABR) of the connection */
    ulong MBS_ICR;      /* MBS(VBR) and ICR(ABR) */
    ulong TBE;          /* TBE */
    ulong FRTT;         /* Round trip time */
} PCI_HCD_t, *pPCI_HCD_t;

typedef volatile struct {
    ulong     CmdAck_APU;
    ulong     CmdAck;
    ulong     TxCredit;
    ulong     RxRing[RX_RING_SIZE];
    ulong     Ext_Msg;
    Stat_t    Stat[2];
    Config_t  Config;
    PCI_HCD_t HCD_Tx;
    PCI_HCD_t HCD_Rx;
    uchar     Buff[4];
} PCI_t, *pPCI_t;
Since more than one connection may be in the idle state, the APU
updates the HCD_Rx after a confirmation that the previous message
sent to the host was processed by using the CmdAck_APU field in the
PCI memory. This handshake mechanism prevents the APU from
overwriting the HCD_Rx before the corresponding message is processed
by the host.
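The handshake described above can be sketched in C as follows. This is a
minimal model for illustration, not the ADP source: the AgingMailbox type,
the two function names, and the numeric values of MAILBOX_FREE and
MAILBOX_BUSY are assumptions, and the real APU also sends a Mailbox
message after filling HCD_Rx.

```c
#include <stdint.h>

/* Assumed handshake values; the guide does not list them. */
enum { MAILBOX_FREE = 0, MAILBOX_BUSY = 1 };

/* Reduced, hypothetical model of the shared PCI area. */
typedef struct {
    volatile uint32_t CmdAck_APU; /* handshake word */
    uint32_t ConNum;              /* models HCD_Rx.ConNum */
    uint32_t BuffNum;             /* models HCD_Rx.CellHeader (BuffNum in Rx) */
} AgingMailbox;

/* APU side: publish one idle-connection report.
 * Returns 1 if posted, 0 if the previous report is still pending. */
int apu_post_idle(AgingMailbox *mb, uint32_t con, uint32_t buff)
{
    if (mb->CmdAck_APU != MAILBOX_FREE)
        return 0;                 /* host has not consumed HCD_Rx yet */
    mb->ConNum = con;
    mb->BuffNum = buff;
    mb->CmdAck_APU = MAILBOX_BUSY;
    return 1;
}

/* Host side: read the report and release HCD_Rx for the next one.
 * Returns 1 if a report was read, 0 if none was pending. */
int host_take_idle(AgingMailbox *mb, uint32_t *con, uint32_t *buff)
{
    if (mb->CmdAck_APU != MAILBOX_BUSY)
        return 0;
    *con = mb->ConNum;
    *buff = mb->BuffNum;
    mb->CmdAck_APU = MAILBOX_FREE;
    return 1;
}
```

In this model a second call to apu_post_idle() fails until the host has
called host_take_idle(), which is the property the handshake is meant to
provide.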
6.3 Packet Aging Routine
The APU enters the packet aging procedure periodically and checks the
connections in sequential order for idle connections. The APU checks if
the TimeStamp on the VCD is expired based on a preprogrammed
timeout value. If the connection has been idle, the APU checks if there
is a buffer that is being processed by the EDMA. In the current
implementation, the VCD_BuffPres bit is set if there is a buffer currently
being processed that is attached to the VCD. If the buffer is present, the
APU terminates the buffer by sending an EOM cell and puts the
connection number and the buffer number in the HCD_Rx structure
before issuing a message to the host. If there is no buffer present, the
connection number and a zero buffer number are placed in HCD_Rx.
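The per-connection decision described above can be sketched in C as
follows. The VcdModel and IdleReport types and the aging_check() function
are hypothetical names used only for illustration; the real VCD layout is
given in the Technical Manual, and the sketch leaves out the step that
terminates the buffer with an EOM cell.

```c
#include <stdint.h>

/* Hypothetical, reduced view of a VCD. */
typedef struct {
    uint32_t TimeStamp; /* tick at which the last cell arrived */
    uint32_t BuffPres;  /* nonzero if a buffer is attached (VCD_BuffPres) */
    uint32_t BuffNum;   /* number of the attached buffer */
} VcdModel;

/* What the APU would place in HCD_Rx for the host. */
typedef struct {
    uint32_t ConNum;
    uint32_t BuffNum;
} IdleReport;

/* Returns 1 and fills *rep if connection 'con' has been idle for at
 * least 'timeout' ticks; returns 0 otherwise. */
int aging_check(const VcdModel *vcd, uint32_t con, uint32_t now,
                uint32_t timeout, IdleReport *rep)
{
    /* Unsigned subtraction stays correct when the tick counter wraps. */
    if ((uint32_t)(now - vcd->TimeStamp) < timeout)
        return 0;
    rep->ConNum = con;
    /* A zero buffer number tells the host that no buffer was pending. */
    rep->BuffNum = vcd->BuffPres ? vcd->BuffNum : 0;
    return 1;
}
```

A periodic routine would call aging_check() for each connection in
sequence and post every report it produces through the Mailbox handshake
described in Section 6.2.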
Chapter 7
Interrupt Handling
This chapter describes the interrupt handler function implemented in the
L64364 ATMizer II+ ATM-SAR Chip and includes the following sections:
• Section 7.1, “Introduction”
• Section 7.2, “Nonvectored Interrupt Handler”
• Section 7.3, “Vectored Interrupt Handler”
7.1 Introduction
The CW4011 processor used in the L64364 ATMizer II+ ATM-SAR Chip
supports three types of interrupt signals:
• Cold/warm resets (CRESETn and WRESETn signals) and
  nonmaskable interrupts (NMIn signal)
• External nonvectored interrupts (EXiNTn[5:0])
• External vectored interrupt (EXViNTn)
This chapter focuses on the software side of external interrupt handling
performed by the L64364 ATMizer II+ ATM-SAR Chip. It uses sample
code developed by LSI Logic Corporation for the ATMizer II+ Application
Development Platform (ADP) to illustrate the design steps. Refer to the
L64364 ATMizer II+ ATM-SAR Chip Technical Manual for a detailed
description of the interrupt handling mechanism.
7.2 Nonvectored Interrupt Handler
The L64364 ATMizer II+ architecture has six nonvectored
interrupts, which are used to handle catastrophic events. They are listed
in Table 7.1; number 5 has the highest priority.
Table 7.1
Nonvectored Interrupt Sources

Name             Number   IP Bit in    CW4011 Status   Description
                          Cause Reg.   Reg. Bit
IntPCIErr        5        7            15              PCI abort or parity error
IntSBErr         4        6            14              Secondary Bus error
IntRateExc       3        5            13              Rate calculation exception or
                                                       OCA Bus timeout
IntRxMbxOvr      2        4            12              Receive Mailbox overflow
IntSCD_BusErr    1        3            11              Scheduler bus error
IntEDMA_BusErr   0        2            10              EDMA bus error
The general handler detects which interrupt occurs by checking the
Cause register, then jumps to the specific handler that takes care of that
event. The following code is a sample of event handling (Figure 7.1):
Figure 7.1
Nonvectored Interrupts General Handler
#include <mips.h>
#include <lr64363.h>

#define int0    0x0400
#define int1    0x0800
#define int2    0x1000
#define int3    0x2000
#define int4    0x4000
#define int5    0x8000

    .text
    .globl handler
    .ent handler
    .set noat
handler:
    # check to see whether it's for me
    .set noreorder
    mfc0    k0,C0_CAUSE
    nop
    .set reorder
    # first see whether EXCCODE = 0
    and     k1,k0,CAUSE_EXCMASK
    bne     k1,zero,trap
    # mask IP bits with int mask
    .set noreorder
    mfc0    k1,C0_SR
    nop
    and     k0,k1
    .set reorder
    # now check the intEDMA_BusErr IP bit
    and     k1,k0,int0
    bne     k1,zero,intEDMA_BusErr
    # now check the intSCD_BusErr IP bit
    and     k1,k0,int1
    bne     k1,zero,intSCD_BusErr
    # now check the intRxMbxOvr IP bit
    and     k1,k0,int2
    bne     k1,zero,intRxMbxOvr
    # now check the intRateExc IP bit
    and     k1,k0,int3
    bne     k1,zero,intRateExc
    # now check the intSBAddrErr IP bit
    and     k1,k0,int4
    bne     k1,zero,intSBAddrErr
    # now check the intPCIErr IP bit
    and     k1,k0,int5
    bne     k1,zero,intPCIErr
    # handle trap exceptions such as bus error
    # address error, etc ...
trap:
    b       done
intRxMbxOvr:
intRateExc:
intSBAddrErr:
intPCIErr:
    # reset the hardware modules
    li      k0,M_APU_AddrMap
    li      k1,ADRM_RESET
    sw      k0,(k1)
    b       done
intEDMA_BusErr:
intSCD_BusErr:
    # reset the hardware modules
    li      k0,M_APU_AddrMap
    li      k1,ADRM_RESET
    sw      k0,(k1)
    # reinitialize the SDRAM controller
    # since it is possible that the bus error
    # occurs on the SDRAM page
    b       sdram_init
sdram_init:
    # Load base address
    li      t2, SBC_BASE
    /* Precharge command */
    li      t0, 0x4033b753
    sw      t0, 4(t2)           # control reg
    /* Mode register */
    li      t0, 0x20228530
    sw      t0, 4(t2)           # control reg
    /* Set Mode */
    li      t0, 0x0000eeee
    li      t3, 0x80811800
    sw      t0, (t3)
    /* Set Control */
    li      t0, 0x10228530
    sw      t0, 4(t2)           # control reg
    /* Load refresh register */
    li      t0, 0x00000300
    sw      t0, 8(t2)           # refresh reg
    .set noreorder
done:
    mfc0    k0,C0_EPC
    nop
    j       k0
    rfe
    .set reorder
    .end handler
    .set at
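The dispatch decision made by the general handler in Figure 7.1 can be
modeled in C as shown below. This is only a sketch for reasoning about
the logic: nonvectored_select() is a hypothetical name, and the ExcCode
mask value is an assumption (Cause bits 6:2, as on R3000-class cores),
since the guide names CAUSE_EXCMASK without giving its value.

```c
#include <stdint.h>

/* Interrupt mask bits int0..int5 as defined in Figure 7.1. */
#define INT_BIT(n) (0x0400u << (n))

/* Assumed position of the ExcCode field in the Cause register. */
#define CAUSE_EXCMASK_MODEL 0x7Cu

/* Returns the interrupt number (0..5) that Figure 7.1 would branch to,
 * or -1 when the handler would fall into the trap path instead
 * (ExcCode is nonzero, or no enabled interrupt is pending). */
int nonvectored_select(uint32_t cause, uint32_t status)
{
    uint32_t pending;
    int n;

    if (cause & CAUSE_EXCMASK_MODEL)
        return -1;              /* not an interrupt exception */

    pending = cause & status;   /* IP bits masked by the enable bits */
    for (n = 0; n <= 5; n++)    /* same test order as the figure */
        if (pending & INT_BIT(n))
            return n;
    return -1;
}
```

Note that the model follows the test order of the figure, which checks
int0 first, rather than the priority order listed in Table 7.1.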
If you use the LSI Logic PMON for debugging, the handler can print out
a message that describes the type of the interrupt. It then transfers
control to PMON for debugging by calling the exit() function as the
following example code illustrates (Figure 7.2):
Figure 7.2
General Handler Exit to PMON
    .data
int0_msg:
    .asciiz "Interrupt: EDMA Bus Error\n"

    .text
    la      a0,int0_msg         # get message to print
    jal     printf              # jump to printf routine
    jal     exit                # exit to PMON
7.3 Vectored Interrupt Handler
The L64364 ATMizer II+ architecture has 16 prioritized vectored
interrupts, as shown in Table 7.2.
Table 7.2
Vectored Interrupt Sources

Name                 Number(1)   Description
IntEDMA_ComplFull    15          TxCell, RxCell or Buff Completion Queue is full
IntACI_RxFull        14          ACI Receive FIFO full
IntRxMbx             13          Receive Mailbox FIFO nonempty
IntMove_Compl        12          Move complete
IntEDMA_RxCompl      11          RxCell Completion Queue nonempty
IntACI_RxThrld       10          ACI Receive FIFO exceeds threshold
IntEDMA_TxCompl      9           TxCell Completion Queue nonempty
IntEDMA_BuffCompl    8           Buff Completion Queue nonempty
IntACI_Err           7           Timeout, parity, or short-cell error
IntACI_TxThrld       6           ACI Transmit FIFO drops below threshold
IntExt[1:0]          5–4         External interrupt inputs (user-defined)
IntTim[3:1]          3–1         Timers 3–1 timeout
IntTim[0]            0           Timer 8 timeout

(1) APU Status register interrupt bit numbers and APU_VIntEnable register bit
    numbers match the interrupt numbers shown here.
The L64364 ATMizer II+ ATM-SAR Chip Technical Manual has a detailed
description of its interrupt mechanism. This section shows how to handle
the vectored interrupts with sample code.
7.3.1 Enable Interrupts
The following code shows how to enable the vectored interrupts
(Figure 7.3):
Figure 7.3
Vectored Interrupts Enabling Routine
#include "regdef.h"
#include "cp0_scobra.h"

    .text
    .set noreorder
    .globl VectEn
    .ent VectEn
VectEn:
    /* store return address on stack */
    subu    sp,24
    sw      ra,20(sp)
    /*
     * Map vector interrupt table through APU_VIntBase and
     * APU_VIntEnable registers.
     */
    /* Hardware Register Base address */
    la      t1, 0xb8000000
    /* load address of vector interrupt table */
    la      t0, V_Handler
    /* align vector interrupt table address */
    srl     t0, t0, 7
    /* store address in the APU_VIntBase reg */
    sw      t0, 0x310(t1)
    /* interrupt mask */
    /*
     * Don't enable IntAci_Tx here, it will jump
     * to its handler right away. Enable it when
     * a connection is opened.
     */
#define APU_VINT_MASK \
    (EDMA_COMPLFULL << 15) | \
    (ACI_RXFULL     << 14) | \
    (RXMBX          << 13) | \
    (EDMA_MOVE      << 12) | \
    (EDMA_RXCELL    << 11) | \
    (ACI_RX         << 10) | \
    (EDMA_TXCELL    << 9)  | \
    (EDMA_BUFF      << 8)  | \
    (ACI_ERR        << 7)  | \
    (ACI_Tx         << 6)  | \
    (EXT1           << 5)  | \
    (EXT0           << 4)  | \
    (TIMER3         << 3)  | \
    (TIMER2         << 2)  | \
    (TIMER1         << 1)  | \
    (TIMER8         << 0)
    li      t0, APU_VINT_MASK
    /* enable interrupts at APU */
    sh      t0, 0x30e(t1)
    /* enable Vectored Interrupts in CCC reg */
    /* t0 <- CCC Register */
    mfc0    t0, C0_CONFIG
    nop
    nop
    /* enable vectored interrupts(set EVI) */
    li      t1, 0x02000000
    or      t0, t0, t1
    mtc0    t0, C0_CONFIG
    nop
    nop
    /* enable Vectored Interrupts in C0_SR reg */
    mfc0    t0, C0_SR
    nop
    nop
    ori     t0, t0, SR_IE
    mtc0    t0, C0_SR
    nop
    nop
    /* Return to the caller */
    lw      ra, 20(sp)
    addu    sp, 24
    j       ra
    nop
    nop
    .set reorder
    .end VectEn
7.3.2 General Handler
The following code is a sample for the general handler (Figure 7.4):
Figure 7.4
Vectored Interrupts General Handler
    .text
    .align 7                    # aligned for APU_VIntBase register.
    .set noreorder
V_Handler:
/* ################################################## */
/* Vectored table                                     */
/* ################################################## */
timer8:                         /* Timer #8 timed out */
    j       timer8_handler      /* Int 0 */
    nop
timer1:                         /* Timer #1 timed out */
    j       timer1_handler      /* Int 1 */
    nop
timer2:                         /* Timer #2 timed out */
    j       timer2_handler      /* Int 2 */
    nop
timer3:                         /* Timer #3 timed out */
    j       timer3_handler      /* Int 3 */
    nop
ext_int0:                       /* External interrupt #0 */
    j       ext_int0_handler    /* Int 4 */
    nop
ext_int1:                       /* External interrupt #1 */
    j       ext_int1_handler    /* Int 5 */
    nop
aci_tx:                         /* ACI Tx FIFO drops below threshold */
    j       aci_tx_handler      /* Int 6 */
    nop
aci_err:                        /* ACI Error FIFO non-empty */
    j       aci_err_handler     /* Int 7 */
    nop
edma_bcq_ne:                    /* EDMA buffer completion queue non-empty */
    j       edma_bcq_ne_handler /* Int 8 */
    nop
edma_txcq_ne:                   /* EDMA TxCell completion queue non-empty */
    j       edma_txcq_ne_handler /* Int 9 */
    nop
aci_rx:                         /* ACI Rx FIFO exceeds threshold */
    j       aci_rx_handler      /* Int a */
    nop
edma_rxcq_ne:                   /* EDMA RxCell completion queue non-empty */
    j       edma_rxcq_ne_handler /* Int b */
    nop
edma_move:                      /* EDMA move completion */
    j       edma_move_handler   /* Int c */
    nop
rx_mbox_ne:                     /* Rx mailbox non-empty */
    j       rx_mbox_ne_handler  /* Int d */
    nop
aci_rx_full:                    /* ACI Rx FIFO full */
    j       aci_rx_full_handler /* Int e */
    nop
edma_cq_full:                   /* EDMA completion queue full */
    j       edma_cq_full_handler /* Int f */
    nop
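The layout of the vector table in Figure 7.4 implies some simple address
arithmetic, sketched here in C. The function names are hypothetical, and
the 8-byte slot size is an inference from the figure (one j instruction
plus its delay-slot nop per interrupt), not a value quoted from the
Technical Manual.

```c
#include <stdint.h>

#define VINT_ENTRY_BYTES 8u /* one j plus one nop, as in Figure 7.4 */
#define VINT_BASE_SHIFT  7u /* Figure 7.3 shifts the table address by 7 */

/* Computes the value that Figure 7.3 stores in APU_VIntBase.
 * Returns 1 on success, 0 if the table is not 128-byte aligned. */
int vint_base_value(uint32_t table_addr, uint32_t *value)
{
    if (table_addr & ((1u << VINT_BASE_SHIFT) - 1u))
        return 0;           /* .align 7 was not honored */
    *value = table_addr >> VINT_BASE_SHIFT;
    return 1;
}

/* Address of the table slot for interrupt n (0..15), assuming the
 * slots are consecutive as in Figure 7.4. */
uint32_t vint_entry_addr(uint32_t table_addr, unsigned n)
{
    return table_addr + n * VINT_ENTRY_BYTES;
}
```

With 16 slots of 8 bytes each, the table occupies exactly 128 bytes, which
matches the alignment requested by the .align 7 directive.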
7.3.3 Individual Handlers
Each interrupt event has its own interrupt handler. The handling routine
is the same C or assembly code used in a regular polling-mode application,
except that it is invoked in interrupt-driven mode. The following
sample code handles the IntRxMbx interrupt (Figure 7.5):
Figure 7.5
IntRxMbx Interrupt Handler
#include "regdef.h"
#include "cp0_scobra.h"
#include "Interrupt.h"
.text
.set noreorder
rx_mbox_ne_handler:
/* Rx mailbox non-empty */
/*
 * allocate some space on the stack prior to enabling ints.
 */
subu sp,C_SIZE*4
.set reorder
#if !REG_MAP | RXMBX_DEBUG
/* now save the rest of the registers */
sw AT,C_AT*4(sp)
sw v0,C_V0*4(sp)
sw v1,C_V1*4(sp)
sw a0,C_A0*4(sp)
sw a1,C_A1*4(sp)
sw a2,C_A2*4(sp)
sw a3,C_A3*4(sp)
sw t0,C_T0*4(sp)
sw t1,C_T1*4(sp)
sw t2,C_T2*4(sp)
sw t3,C_T3*4(sp)
sw t4,C_T4*4(sp)
sw t5,C_T5*4(sp)
sw t6,C_T6*4(sp)
sw t7,C_T7*4(sp)
sw t8,C_T8*4(sp)
sw t9,C_T9*4(sp)
#endif
sw ra,C_RA*4(sp)
mflo t6
sw t6,C_LO*4(sp)
mfhi t6
sw t6,C_HI*4(sp)
subu sp,24 # allocate min size context
li k1, 0xb8000000 # atmizer hardware register base
.set noreorder
#if RXMBX_DEBUG
la a0, got_rx_mbox_ne
jal printf
nop
la a0, apu_status_msg
jal printf # list all active interrupts
nop
lw a0,0x314(k1) # read APU_Status register
jal print_reg
nop
#endif
#if !REG_MAP
lw a0, 0x404(k1) # read PP_RxMbx register
#else
lw s4, 0x404(k1) # read PP_RxMbx register
#endif
jal HostMsg
nop
#if RXMBX_DEBUG
la a0, pass_msg
jal printf
nop
nop
#endif
#if RXMBX_DEBUG
la a0, EPC_msg
jal printf
nop
#lw a0,(C_EPC*4+24)(sp)
mfc0 k0,C0_EPC
nop
jal print_reg
jal printf
nop
#endif
.set reorder
addu sp,24 # deallocate
#if !REG_MAP | RXMBX_DEBUG
lw AT,C_AT*4(sp)
lw v0,C_V0*4(sp)
lw v1,C_V1*4(sp)
lw a0,C_A0*4(sp)
lw a1,C_A1*4(sp)
lw a2,C_A2*4(sp)
lw a3,C_A3*4(sp)
lw t0,C_T0*4(sp)
lw t1,C_T1*4(sp)
lw t2,C_T2*4(sp)
lw t3,C_T3*4(sp)
lw t4,C_T4*4(sp)
lw t5,C_T5*4(sp)
# t6 is restored later
lw t7,C_T7*4(sp)
lw t8,C_T8*4(sp)
lw t9,C_T9*4(sp)
#endif
lw ra,C_RA*4(sp)
lw t6,C_LO*4(sp)
mtlo t6
lw t6,C_HI*4(sp)
mthi t6
.set noreorder
/* restore t6, EPC and deallocate stack */
#if !REG_MAP | RXMBX_DEBUG
lw t6,C_T6*4(sp)
#endif
#lw k0,C_EPC*4(sp)
mfc0 k0,C0_EPC
nop
addu sp,C_SIZE*4
j k0
/* return from interrupt R3000 mode */
rfe
.set reorder
The above code is a regular interrupt handler. It must save the contents
of registers before jumping into the handling routine and restore them
before exiting from the handler. Because the vectored interrupt events
happen so frequently, this saving and restoring decreases the overall
performance dramatically. To avoid it, you may separate the entire
general register set into the three domains shown in Table 7.3.
Table 7.3
General Register Map

General       Domain1       Domain2
$zero ($0)    $at ($1)      $s8 ($30)
$ra ($31)     $v0 ($2)      $v1 ($3)
$sp ($29)     $a0 ($4)      $s4 ($20)
$gp ($28)     $a1 ($5)      $s5 ($21)
LO            $a2 ($6)      $s6 ($22)
HI            $a3 ($7)      $s7 ($23)
              $s0 ($16)     $k0 ($26)
              $s1 ($17)     $k1 ($27)
              $s2 ($18)     $t4 ($12)
              $s3 ($19)     $t5 ($13)
              $t0 ($8)      $t6 ($14)
              $t1 ($9)      $t7 ($15)
              $t2 ($10)     $t8 ($24)
              $t3 ($11)     $t9 ($25)
The idea behind this separation is to divide the general registers into two
nonoverlapping sets: one used only by the regular routines and the
other used only by the interrupt handlers. Thus, the handlers do
not have to save and restore the registers that are not shared. The
shared registers still need to be saved and reloaded.
Many tools support an option that restricts references to specified
registers when generating the assembly code. The following procedure
uses GNU tools. As shown in Table 7.3, the general registers are required
by the compiler and assembler, and are used by both the regular routines
and the interrupt handlers. The source files are separated into two sets,
one called regular files and the other called handler files.
Step 1. Compile the files with the -S and -ffixed-reg options (one
-ffixed-reg option per excluded register) to generate the
assembly files using only registers in domain 1.
Step 2. Convert domain 1 registers in the handler code to domain 2
registers through a post-processing script file.
Step 3. Call the assembler to assemble the files to generate the object
and executable code.
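The register substitution in Step 2 can be sketched as a lookup table in
C. The pairing below is an assumption: it reads Table 7.3 row by row and
maps each domain 1 register to the domain 2 register on the same row. The
guide does not state the pairing explicitly, although Figure 7.5 uses s4
in place of a0 when REG_MAP is set, which agrees with it. The remap_reg()
name is hypothetical.

```c
/* Assumed domain 1 -> domain 2 pairing, read row by row from Table 7.3.
 * Registers are given by their hardware numbers ($n). */
static const struct { int d1; int d2; } reg_pairs[] = {
    {  1, 30 }, {  2,  3 }, {  4, 20 }, {  5, 21 }, {  6, 22 },
    {  7, 23 }, { 16, 26 }, { 17, 27 }, { 18, 12 }, { 19, 13 },
    {  8, 14 }, {  9, 15 }, { 10, 24 }, { 11, 25 }
};

/* Maps a register number used in handler code to its domain 2
 * counterpart. Registers outside domain 1 (for example $sp or $ra)
 * are returned unchanged because they are shared. */
int remap_reg(int reg)
{
    unsigned i;

    for (i = 0; i < sizeof reg_pairs / sizeof reg_pairs[0]; i++)
        if (reg_pairs[i].d1 == reg)
            return reg_pairs[i].d2;
    return reg;
}
```

A real post-processing script would apply the same mapping to register
names in the generated assembly text of the handler files only.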
The drawback of this scheme is that it violates the MIPS register
convention, which makes debugging difficult (if not impossible) because
the debugger is unaware of the register shuffle. You may have to follow
the MIPS convention until the code is working and free of bugs. The
sample code provided in this section allows for this as well.
Chapter 8
OAM Cell Processing
This chapter outlines the implementation of the Operations and
Management (OAM) cell processing function in ATMizer II+ software. The
OAM function is defined at the Physical and the ATM layers. The Physical
Layer OAM cell processing is done by the framer (for example, the
SuniLite Framer chip). The software running on the ATM Processing Unit
(APU) performs the ATM Layer OAM cell processing. The software
examples in this chapter are provided for demonstration and evaluation
purposes.
This chapter contains the following sections:
• Section 8.1, “Introduction”
• Section 8.2, “F4 OAM Flow”
• Section 8.3, “F5 OAM Flow”
8.1 Introduction
The OAM cells are defined by the International Telecommunication
Union in specification ITU-T I.610. OAM cells are loaded by the host to
the ATMizer II+ chip. These cells are transferred between two ATM end
units to convey management information about the network. The OAM
cell flows are defined at the Physical Layer and the ATM Layer. They have
predefined header values to distinguish them from the regular data cells
on the link. The OAM flows ‘F1’ through ‘F3’ are at the Physical Layer. The
flows ‘F4’ and ‘F5’ are at the ATM Layer. An ATM cell is identified as an
F4 OAM cell by a 0x00000040 header and as an F5 OAM cell
by a 0x0000000A header. The F4/F5 OAM cells are treated as out-of-band
cells: the APU processes them without involving the EDMA and passes
them directly to the host. The software support for OAM is used
to create OAM cell flows and filter the incoming OAM cells.
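The header filtering described above can be sketched in C as follows. The
two header values come from the text; the field masks are an assumption
based on the standard UNI cell header layout (VCI in bits 19:4, payload
type in bits 3:1), and the CellKind type and classify_cell() function are
hypothetical names. Only the end-to-end values quoted in this section are
checked.

```c
#include <stdint.h>

/* Assumed field masks for the first 32 bits of a UNI cell header. */
#define HDR_VCI_MASK 0x000FFFF0u /* VCI, bits 19:4 */
#define HDR_PT_MASK  0x0000000Eu /* payload type, bits 3:1 */

/* Header values quoted in the text. */
#define OAM_F4E_HDR  0x00000040u /* F4: VCI field equals 4 */
#define OAM_F5E_HDR  0x0000000Au /* F5: payload type equals 101b */

typedef enum { CELL_DATA, CELL_OAM_F4, CELL_OAM_F5 } CellKind;

/* Classifies a received cell by its header, ignoring the VPI so that
 * the F4 check works for any Virtual Path. */
CellKind classify_cell(uint32_t hdr)
{
    if ((hdr & HDR_VCI_MASK) == OAM_F4E_HDR)
        return CELL_OAM_F4;
    if ((hdr & HDR_PT_MASK) == OAM_F5E_HDR)
        return CELL_OAM_F5;
    return CELL_DATA;
}
```

The F4 test is made on the VCI field and the F5 test on the payload type
field, which is why an F5 cell keeps the VPI and VCI of the connection it
belongs to.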
8.2 F4 OAM Flow
The F4 OAM flow is for the management of the Virtual Paths (VPs)
between ATM end units. The connections in the ATMizer II+ chip are
characterized by a connection number for the hardware. Since the F4
OAM cell flow for a VP is independent of the connections in that VP, a
separate connection number needs to be reserved for the support of the
OAM flow for each VP. This connection number is used by the Scheduler
to send the OAM cells in the Tx direction. In the Rx direction, the OAM
cells are received and processed by the APU. OAM cells are passed to
the host through Mailbox messaging because:
• they are individually complete and do not have to be reassembled,
• they occur at a low rate, and
• since they contain important information on the condition of the
  network, the host needs to be notified immediately.
Since only one cell is to be sent, the use of a buffer to carry the cell
(small buffer) is inefficient. Therefore, the received OAM cell contents for
a VP are copied into the primary memory and the host is notified. A
structure called OAM_VPC_t is defined in the primary memory to transfer
the contents of the OAM cell to and from the host as shown below in
Figure 8.1.
Figure 8.1
OAM Cell Declarations
/* -----------------------------------------------------
 * OAM cell declaration
 */
typedef struct {
    ulong  ConNum;
    Cell_t OAM_Cell;
} OAM_VPC_t, *pOAM_VPC_t;

typedef volatile struct {
    ulong     CmdAck;
    ulong     CmdAck_APU;
#if defined(OAM_F4) || defined(OAM_F5)
    ulong     CmdAck_OAM_Tx;
    ulong     CmdAck_OAM_Rx;
#endif
    ulong     TxCredit;
    ulong     RxRing[RX_RING_SIZE];
    ulong     Ext_Msg;
    Stat_t    Stat[2];
    Config_t  Config;
    PCI_HCD_t HCD_Tx;
    PCI_HCD_t HCD_Rx;
#if defined(OAM_F4) || defined(OAM_F5)
    OAM_VPC_t OAM_VPC_Tx;
    OAM_VPC_t OAM_VPC_Rx;
#endif
    uchar     Buff[4];
} PCI_t, *pPCI_t;
The OAM_VPC_Tx structure is used by the host to send the contents of
a VP OAM cell along with the connection number associated with the
OAM flow. OAM_VPC_Rx is used by the APU to send the contents of
the OAM cell filtered by the APU to the host. Since several OAM flows
may be open at the same time, the messaging between the host and the
APU is done using a handshake through the CmdAck_OAM_Tx and
CmdAck_OAM_Rx fields in the primary memory. This prevents the
structure from being overwritten before it is read. The host also maintains
a structure (OAM_VP) for each VP OAM flow of type OAM_VPC_t. The
connection information for the OAM flow is maintained in the HCD_PAR_t
structure OAM_VP_HCD as shown in Figure 8.2.
Figure 8.2
OAM Flow Connection Information
#ifdef OAM_F4
#define OAM_F4_COUNT 1
OAM_VPC_t OAM_VP[OAM_F4_COUNT];
HCD_PAR_t OAM_VP_HCD[OAM_F4_COUNT];
#endif
The ATM end unit that transmits the F4 OAM cells is denoted by
F4_B_NT1 and the ATM end unit that receives and turns around the
OAM cells is denoted by F4_B_NT2. The code for these two end units
is compiled separately by the compiler directives #ifdef F4_B_NT1 and
#ifdef F4_B_NT2, respectively.
8.2.1 Initialization of F4 Flow
The F4_B_NT1 host opens an F4 flow for a VP by issuing an open
connection command with a connection number. The connection
parameters are initialized in OAM_VP_HCD. The connection is initialized
as a CBR connection with a specified cell rate thereby setting the cell
rate of the OAM cell flow. In the ATMizer II+ chip, the scheduling of the
connection is done by the Scheduler and the first cell is scheduled when
a buffer is attached to the VCD corresponding to the connection. Since
the OAM flow does not use the data from the buffer attached to the VCD,
a dummy buffer is attached by the host to the VCD to start the OAM cell
transmission. Before attaching a dummy buffer, the host initializes the
OAM cell in the primary memory for the VP and then issues a Buff
command via the TxRing to attach the dummy buffer. This procedure is
shown in the code below, Figure 8.3.
Figure 8.3
OAM Cell Initialization
#ifdef OAM_F4
#ifdef F4_B_NT1
/* Open Connection for OAM_VP */
ConNum = OAM_VP_OFFSET;
for (i=0; i < pPCI->Config.OAM_VPCount; i++)
{
OAM_VP_HCD[i].ConNum = ConNum;
OAM_VP_HCD[i].CellHeader = OAM_F4E_CellHdr |
((ConNum - OAM_VP_OFFSET) << CELL_VPI);
OAM_VP_HCD[i].Class = Class_CBR;
OAM_VP_HCD[i].PCR = OAM_F4_PCR;
OAM_VP_HCD[i].Status = REQ_OPEN;
if (Open_Connection(&OAM_VP_HCD[i]))
Halt("Cannot open OAM_VP connection");
OAM_VP_HCD[i].StartTime = mfc0(9);
OAM_VP_HCD[i].Status = OPEN;
/* Initialize the OAM Cell */
OAM_Init(OAM_VP_HCD[i].ConNum,
OAM_VP_HCD[i].CellHeader, OAM_Perf_Mon, &OAM_VP[i]);
/* Send OAM Cell to the APU */
OAM_Send(&OAM_VP[i], MSG_OAM_F4);
printf("Opened Connection OAM_VP with VP= %d.\n",OAM_VP_HCD[i].ConNum);
/* Send dummy Buffer to start connection */
half(BFD[BuffNum].BFD_Ctrl) = 0;
BFD[BuffNum].ConNum = ConNum;
n = 0;
while (RingPut(&TxRing, BuffNum) == 0)
if (++n == TIMEOUT_TX_RING)
Halt("TxRing timeout in OpenConnection");
ConNum++;
}
#endif
#endif
In the code in Figure 8.3, OAM_VP_OFFSET is the offset
reserved for the connection numbers of the VP OAM flows. The OAM
cell header is formed using the connection number for the VP and
OAM_F4E_CellHdr, which is defined in Figure 8.4.
Figure 8.4
OAM Cell Header Formation
#define OAM_F4E_CellHdr 0x00000040 /* ATM Layer VPI OAM cell- F4 End-to-End */
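The header formation used in Figure 8.3, and the reverse mapping that the
receive side needs, can be sketched in C as follows. Only
OAM_F4E_CellHdr is taken from the guide. The values of CELL_VPI,
OAM_VP_MASK, and OAM_VP_OFFSET are assumptions chosen for illustration
(a standard 8-bit UNI VPI field at bit 20 and an arbitrary offset), and
both function names are hypothetical.

```c
#include <stdint.h>

#define OAM_F4E_CellHdr 0x00000040u /* from Figure 8.4 */

/* Assumed values; the guide does not list them. */
#define CELL_VPI        20          /* bit position of the VPI field */
#define OAM_VP_MASK     0x0FF00000u /* 8-bit UNI VPI field */
#define OAM_VP_OFFSET   1000u       /* first connection number for VP OAM */

/* Builds the F4 cell header for a reserved OAM connection number,
 * using the same expression as Figure 8.3. */
uint32_t oam_f4_header(uint32_t con_num)
{
    return OAM_F4E_CellHdr | ((con_num - OAM_VP_OFFSET) << CELL_VPI);
}

/* Recovers the OAM connection number from a received F4 cell header.
 * The mask is applied before the shift. */
uint32_t oam_f4_con_num(uint32_t cell_hdr)
{
    return OAM_VP_OFFSET + ((cell_hdr & OAM_VP_MASK) >> CELL_VPI);
}
```

Under these assumptions, connection number OAM_VP_OFFSET + 3 produces the
header 0x00300040, and the two functions are inverses of each other for
VPI values that fit in the 8-bit field.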
The routine OAM_Init() initializes the OAM_VP structure and the
routine OAM_Send() sends it to the APU. In OAM_Init(), the contents
of the OAM cell are initialized based on the function of the OAM cell as
defined in ITU-T I.610. Once the APU receives the message, it initializes
the OAM cell contents in the secondary memory. From secondary
memory, the contents are copied into the cell in Cell Buffer Memory
(CBM) before the OAM cell is sent out.
8.2.2 F4 Flow Transmit
When the connection number assigned to the VP is serviced by the
Scheduler, the OAM cell contents are copied from the secondary
memory OAM_VP structure to the cell in CBM. The cell is then sent out.
This is achieved by the routine OAM_Send() in the TxCell routine of the
APU as shown below in Figure 8.5.
Figure 8.5
OAM_Send() Routine
/* -------------------------------------------------------
 * Name:        OAM_Send()
 *
 * Description: Send OAM cell for F4 and F5 flows
 *
 * parameters:  pCell: cell address in Cell Buffer
 *
 * -------------------------------------------------------*/
void OAM_Send(const ulong ConNum, const ulong OAM_Type, const pCell_t pCell)
{
#if defined(F4_B_NT1) || defined(F5_B_NT1)
int i;
#endif
uchar *Ptr = 0;
switch (OAM_Type)
{
#ifdef OAM_F4
case MSG_OAM_F4:
Ptr = (uchar *) &OAM_VP[ConNum-OAM_VP_OFFSET].OAM_Cell.Payld;
break;
#endif
#ifdef OAM_F5
case MSG_OAM_F5:
Ptr = (uchar *) &OAM_VC[ConNum-OAM_VC_OFFSET].OAM_Cell.Payld;
break;
#endif
default:
break;
}
#if defined(F4_B_NT1) || defined(F5_B_NT1)
/* Processing for B_NT1 - Send OAM Cell */
pCell->CDS = ONE(CDS_Crc10);
switch (OAM_Type)
{
#ifdef OAM_F4
case MSG_OAM_F4:
pCell->CellHdr = OAM_VP[ConNum-OAM_VP_OFFSET].OAM_Cell.CellHdr;
break;
#endif
#ifdef OAM_F5
case MSG_OAM_F5:
pCell->CellHdr = OAM_VC[ConNum-OAM_VC_OFFSET].OAM_Cell.CellHdr;
break;
#endif
default:
break;
}
for (i=0; i < 48; i++)
pCell->Payld[i] = *Ptr++;
EDMA_TxCell(0,pCell);
return;
#endif
#if defined(F4_B_NT2) || defined(F5_B_NT2)
Halt("Example code does not send OAM cells from B_NT2");
#endif
}
8.2.3 F4 Flow Receive
On the receiving side, the F4 OAM cell is filtered by the APU using the
cell header and the OAM_Receive()routine is called in the RxCell
processing by the APU to process the OAM cell. In the OAM_Receive()
code shown in the following example, the F4_B_NT2 end unit turns the
cell around and re-transmits it back to F4_B_NT1. In other applications,
the cell contents may be modified to notify F4_B_NT1 of changes in the
network conditions. The same routine is used by F4_B_NT1 to process
8-6
OAM Cell Processing
BookL64364PG.fm5 Page 7 Friday, January 28, 2000 4:58 PM
the OAM cell received back from F4_B_NT2. The F4_B_NT1 end unit
informs the host of the OAM cell by copying the cell contents to the
primary memory and sending a message to the host through the
Mailbox, as shown in Figure 8.6.
Figure 8.6
APU OAM_Receive() Routine
/* ------------------------------------------------------
 * Name:        OAM_Receive()
 *
 * Description: Processing of OAM cell received for F4 or F5 flows
 *
 * parameters:  pCell: cell address in Cell Buffer
 *
 * ------------------------------------------------------*/
void OAM_Receive(const ulong OAM_Type, const pCell_t pCell)
{
/* In this example code we turn around the OAM cell and update
* the outgoing OAM cell.
*/
#if defined(F4_B_NT1) || defined(F5_B_NT1)
ulong Msg, Cmd;
ulong *Src, *Dst;
long Status;
int i;
#endif
ulong ConNum = 0;
switch (OAM_Type)
{
#ifdef OAM_F4
case MSG_OAM_F4:
ConNum = OAM_VP_OFFSET + ((pCell->CellHdr & OAM_VP_MASK) >> CELL_VPI);
break;
#endif
#ifdef OAM_F5
case MSG_OAM_F5:
ConNum = ((pCell->CellHdr & OAM_VC_MASK) >> CELL_VCI);
break;
#endif
default:
break;
}
if (pCell->CDS & ONE(CDS_Crc10))
Stat.ErrCrc10++;
#if defined(F4_B_NT1) || defined(F5_B_NT1)
/* Test to see if the Mailbox is free */
Cmd = pPCI->CmdAck_OAM_Rx;
while (Cmd != MAILBOX_FREE)
Cmd = pPCI->CmdAck_OAM_Rx;
pPCI->CmdAck_OAM_Rx = MAILBOX_BUSY;
/* Copy the contents of the cell to the OAM_VP */
Src = (ulong *) pCell;
Dst = (ulong *) &pPCI->OAM_VPC_Rx.OAM_Cell;
pPCI->OAM_VPC_Rx.ConNum = ConNum;
for (i=0; i < sizeof(Cell_t)/sizeof(ulong); i++)
*Dst++ = *Src++;
/* pPCI->OAM_VPC_Rx.OAM_Cell = *pCell; */
/* Send the ConNum in the MailBox to host */
switch (OAM_Type)
{
#ifdef OAM_F4
case MSG_OAM_F4:
pPCI->Ext_Msg = MSG(MSG_OAM_F4, &pPCI->OAM_VPC_Rx);
break;
#endif
#ifdef OAM_F5
case MSG_OAM_F5:
pPCI->Ext_Msg = MSG(MSG_OAM_F5, &pPCI->OAM_VPC_Rx);
break;
#endif
default:
break;
}
Msg = MSG(MSG_ASYNC, &pPCI->Ext_Msg);
Status = (long) APU_Status();
while (Status < 0)
Status = (long) APU_Status();
PP_TxMbx(Msg);
ACI_Free(pCell);
return;
#endif
#if defined(F4_B_NT2) || defined(F5_B_NT2)
/* Update using contents of OAM_Cell
* Turn around the received OAM Cell
*/
/*
NULL Update in this example code
*/
/* Send the cell with modified payload */
while (EDMA_Status() & ONE(EDMA_TxCellReqFull)) { ; }
EDMA_TxCell(0,pCell);
switch (OAM_Type)
{
#ifdef OAM_F4
case MSG_OAM_F4:
Stat.OAM_VP++;
break;
#endif
#ifdef OAM_F5
case MSG_OAM_F5:
Stat.OAM_VC++;
break;
#endif
default:
break;
}
return;
#endif
}
8.2.4 Host Processing of F4 Flow
The host processes the OAM cell for the VP with the OAM_Receive()
routine. An example of that routine is shown below. You can easily
modify the code or add to it for your specific application (Figure 8.7).
Figure 8.7
Host OAM_Receive() Routine
/* -------------------------------------------------------
 * Name:        OAM_Receive()
 *
 * Description: Processing of OAM cell received for F4 and F5 flow
 *
 * parameters:  Msg: Message with address of OAM Cell
 *
 * -------------------------------------------------------*/
void OAM_Receive(const ulong Msg)
{
/* In this example code we update the contents of the OAM_VP */
ulong *TargetAddr = (ulong *) MapFromAPU((ulong) MSG_PTR(Msg));
ulong *Dst;
ulong OAM_Type = MSG_TAG(Msg);
ulong ConNum = *TargetAddr;
pOAM_F45_t Payld;
int i;
switch (OAM_Type)
{
#ifdef OAM_F4
case MSG_OAM_F4:
Payld = (pOAM_F45_t) &OAM_VP[ConNum - OAM_VP_OFFSET].OAM_Cell.Payld;
Dst = (ulong *) &OAM_VP[ConNum - OAM_VP_OFFSET];
for (i=0; i < (sizeof(Cell_t)/sizeof(ulong) + 1); i++)
*Dst++ = *TargetAddr++;
break;
#endif
#ifdef OAM_F5
case MSG_OAM_F5:
Payld = (pOAM_F45_t) &OAM_VC[ConNum - OAM_VC_OFFSET].OAM_Cell.Payld;
Dst = (ulong *) &OAM_VC[ConNum - OAM_VC_OFFSET];
for (i=0; i < (sizeof(Cell_t)/sizeof(ulong) + 1); i++)
*Dst++ = *TargetAddr++;
break;
#endif
default:
break;
}
switch (Payld->OAM_Func_Type)
{
case OAM_Fault_AIS:
case OAM_Fault_FERF:
{
pOAM_AIS_t OAM_AIS = (pOAM_AIS_t) Payld->R;
/* Process the information */
#if defined(F4_B_NT1) || defined(F5_B_NT1)
/* NULL BEHAVIOR */
#endif
#if defined(F4_B_NT2) || defined(F5_B_NT2)
/* NULL BEHAVIOR FOR B_NT2 */
#endif
break;
}
/*Placeholders for user modification*/
case OAM_Perf_Fwd:
case OAM_Perf_Bck:
case OAM_Perf_Mon:
{
pOAM_Perf_t OAM_Perf = (pOAM_Perf_t) Payld->R;
/* Process the information */
#if defined(F4_B_NT1) || defined(F5_B_NT1)
#endif
#if defined(F4_B_NT2) || defined(F5_B_NT2)
/* NULL BEHAVIOR FOR B_NT2 */
#endif
break;
}
case OAM_Act_Perf:
case OAM_Act_Cont:
{
pOAM_Act_t OAM_Act = (pOAM_Act_t) Payld->R;
/* Process the information */
#if defined(F4_B_NT1) || defined(F5_B_NT1)
#endif
#if defined(F4_B_NT2) || defined(F5_B_NT2)
#endif
break;
}
default: break;
}
#if defined(F4_B_NT1) || defined(F5_B_NT1)
/* Copy the Modified OAM Cell to the APU */
pPCI->CmdAck_OAM_Rx = MAILBOX_FREE;
return;
#endif
}
Therefore, using the OAM_VPC_t structure, the OAM cells are passed
from the host to the APU and back. The F4 flow for each VP is facilitated
by opening a connection and initializing a VCD.
8.3 F5 OAM Flow
The F5 OAM flow is defined for a Virtual Connection (VC). As before, a
different connection number is assigned to the OAM flow for a VC.
Therefore, a VC with OAM flow is represented by two connection
numbers in the ATMizer II+ chip. One connection number is for the
regular data flow with the appropriate scheduling mechanism (CBR,
VBR, ABR etc.), and the other is for the OAM flow associated with the
VC. The Mailbox messaging is used by the APU to convey the OAM cell
contents to and from the host as described in the previous section. The
procedures for sending and receiving OAM cells remain the same as
before, with the exception of the cell header filtering, which is different for
F5 flow as defined in ITU-T I.610.
Chapter 9
AAL3/4 Processing
This chapter describes the software for AAL3/4 processing in the
ATMizer II+ chip. The EDMA in the ATMizer II+ chip is designed to
process AAL5 CS-PDUs and to support AAL0 type connections. The
segmentation and reassembly support for AAL0 connections provided by
the EDMA can be used to implement the CS-PDU segmentation and
reassembly for AAL3/4 connections. The ATM Processing Unit (APU) in
the ATMizer II+ chip can preprocess AAL3/4 PDU data before it is
segmented or reassembled by the EDMA.
This chapter contains the following sections:
• Section 9.1, “Introduction”
• Section 9.2, “AAL3/4 Segmentation”
• Section 9.3, “AAL3/4 Reassembly”
9.1 Introduction
The AAL3/4 CS-PDU defined by ITU-T I.363 is segmented into cells as
shown in Figure 9.1. The Segment Type in the SAR-PDU header
indicates whether the SAR-PDU is a Beginning of Message (BOM), a
Continuation of Message (COM), an End of Message (EOM), or a Single
Segment Message (SSM). The Sequence Number (SN) is incremented
by the sender, starting at the BOM, so the receiver can detect missing
or out-of-order cells. The Multiplex Identification (MID) allows up to 1024
logical connections to be multiplexed over a single ATM virtual channel.
The Length Indicator in the trailer indicates the number of valid data
octets in the payload (44 for BOM and COM segments; possibly fewer for
EOM and SSM segments).
Figure 9.1    AAL3/4 Cell Layout

ATM AAL3/4 cell:

Field              Size         Part of
Cell Header        5 octets     ATM cell
ST                 2 bits       SAR-PDU header
SN                 4 bits       SAR-PDU header
MID                10 bits      SAR-PDU header
SAR-PDU Payload    44 octets    SAR-PDU
LI                 6 bits       SAR-PDU trailer
CRC                10 bits      SAR-PDU trailer

ST = Segment Type
SN = Sequence Number
MID = Multiplex Identification
LI = Length Indicator
CRC = Cyclic Redundancy Check
In the segmentation process, the SAR-PDU header and trailer have
to be assembled by the APU before the cell is transmitted. The next
section outlines the steps needed to support an AAL3/4 connection in
the ATMizer II+ application code.
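The field widths in Figure 9.1 can be illustrated with a small packing sketch. This is a minimal example, not code from the ATMizer II+ software: the function names and the SEG_* constants are hypothetical, and the Segment Type codes (BOM = 0b10, COM = 0b00, EOM = 0b01, SSM = 0b11) follow ITU-T I.363 rather than any header file of this product.

```c
#include <stdint.h>

/* Segment Type codes per ITU-T I.363 (names are illustrative only). */
enum { SEG_COM = 0x0, SEG_EOM = 0x1, SEG_BOM = 0x2, SEG_SSM = 0x3 };

/* Pack ST (2 bits), SN (4 bits), and MID (10 bits) into the 16-bit header. */
static uint16_t sar_pack_header(unsigned st, unsigned sn, unsigned mid)
{
    return (uint16_t)(((st & 0x3u) << 14) | ((sn & 0xFu) << 10) | (mid & 0x3FFu));
}

/* Pack LI (6 bits) and CRC (10 bits) into the 16-bit trailer. */
static uint16_t sar_pack_trailer(unsigned li, unsigned crc10)
{
    return (uint16_t)(((li & 0x3Fu) << 10) | (crc10 & 0x3FFu));
}

/* Field extraction, the inverse of the packing above. */
static unsigned sar_header_st(uint16_t h)  { return (h >> 14) & 0x3u; }
static unsigned sar_header_sn(uint16_t h)  { return (h >> 10) & 0xFu; }
static unsigned sar_header_mid(uint16_t h) { return h & 0x3FFu; }
static unsigned sar_trailer_li(uint16_t t) { return (t >> 10) & 0x3Fu; }
```

For example, a BOM segment with SN 5 and MID 0x155 packs to 0x9555, and a trailer with LI 44 and a zero CRC field packs to 0xB000.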
9.2 AAL3/4 Segmentation
The processing needed for AAL3/4 connections is different from that for
AAL5 connections. To distinguish AAL3/4 connections from AAL5
connections, the AAL type needs to be a part of the ACD structure.
The AAL3/4 SAR-PDU header and trailer are processed by the APU
after the EDMA fills in the SAR-PDU data in the cell payload. To facilitate
this, the VCD is set in the AAL0 mode (VCD_AAL0 control bit set) with
Cell Hold turned on (VCD_CellHold control bit set). In the CellHold mode,
the EDMA processes the cell but does not send the cell out to the ACI.
Instead, the EDMA returns the cell address to the auxiliary completion
queue with the Cell-Hold completion message, which also carries the
connection number. The APU determines the AAL type of the connection
based on the AAL_Type field in the ACD_Ctrl_t structure. The format of
the ACD_Ctrl_t structure is shown in Figure 9.2.
Figure 9.2
ACD_Ctrl_t Structure
typedef struct
{
ushort Tx:1,
CalNum:2,
Class:3,
EFCI:1,
CLP:1,
PHY:5,
CellHold:1,
AAL_Type:2;
} ACD_Ctrl_t;
The Cell-Hold completion message is processed by the APU to complete
the header and the trailer of the SAR-PDU using the SAR_Hdr field,
which holds the sequence number, segment type, and the MID of the
SAR-PDU as shown in Figure 9.3.
Figure 9.3
SAR_PDU Header Declarations
/* ---------------------------------------
 * SAR-PDU Header type
 */
typedef struct
{
    ushort ST:2,      /* Segment Type */
           SN:4,      /* Sequence Number */
           MID:10;    /* Multiplexing Identifier */
} SAR_HDR_t;
The ACD structure is modified to hold the SAR_Hdr field as shown in the
following:
typedef struct
{
    ulong ICG;            /* in 16.8 format */
    ulong ThTxTime;
    /* The following declarations are for VBR connections only */
    ulong Bucket;         /* Current Bucket contents */
    ulong Increment;      /* Bucket Increment each time a cell is sent */
    ulong Limit;          /* Bucket limit */
    ulong ICG_PCR;        /* 1/PCR */
    uchar Pad[32-6*4];
    ushort ACD_Ctrl;      /* ACD Control for the connection */
    SAR_HDR_t SAR_Hdr;    /* SAR-PDU Header */
} ACD_t, *pACD_t;
To leave the space for the header and the trailer in the cell, the AAL0
connection type is provided with the VCD_Offs and VCD_Tbytes fields
in the CRC32 field of the VCD. The VCD_Offs specifies the offset from
the beginning of the cell to the start of the payload data, and
VCD_Tbytes specifies the number of bytes to be filled in the payload of
the cell. For a cell size of 52 bytes (cell header and payload), VCD_Offs
is set to 10 (4 bytes for CDS, 4 bytes for the cell header, and 2 bytes for
the SAR-PDU header), and VCD_Tbytes is set to 44, which is the size
of the SAR-PDU payload. The EDMA returns the actual number of bytes copied
into the cell in CDS_Tbytes, which can then be used to determine the LI
field of the trailer.
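The offset arithmetic above can be written out as constants. The following is a sketch only; the enumerator and function names are hypothetical and are not taken from the ATMizer II+ header files. It assumes the 52-byte cell format described in the text (4-byte cell header followed by a 48-byte payload).

```c
#include <stddef.h>

enum {
    CDS_BYTES      = 4,   /* CDS word stored ahead of the cell header */
    CELL_HDR_BYTES = 4,   /* cell header in the 52-byte cell format */
    SAR_HDR_BYTES  = 2,   /* ST + SN + MID */
    SAR_DATA_BYTES = 44,  /* SAR-PDU payload filled in by the EDMA */
    SAR_TLR_BYTES  = 2    /* LI + CRC */
};

/* Value programmed into VCD_Offs: start of the payload data in the cell. */
static size_t aal34_vcd_offs(void)
{
    return CDS_BYTES + CELL_HDR_BYTES + SAR_HDR_BYTES;
}

/* Value programmed into VCD_Tbytes: bytes the EDMA copies per cell. */
static size_t aal34_vcd_tbytes(void)
{
    return SAR_DATA_BYTES;
}

/* Byte index of the trailer within the 48-byte cell payload. */
static size_t aal34_trailer_index(void)
{
    return SAR_HDR_BYTES + SAR_DATA_BYTES;
}
```

The three SAR-PDU parts add up to the 48-byte cell payload, and the trailer index of 46 matches the Payld[46] access used by the send routine in Figure 9.4.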
In the TxCell routine of the application code, an EDMA_TxCell command
for an AAL3/4 connection is issued in the same way as it is issued for an
AAL5 connection. Therefore, the TxCell routine is not altered to support
AAL3/4 processing.
When the payload data is copied from the buffer to the cell, the EDMA
returns the connection number in the completion queue and the APU
processes the message to complete the header and the trailer for the
SAR-PDU. The ST field of the SAR-PDU is based on the CDS_BOM and
CDS_EOM bits that are set by the EDMA to indicate the beginning and
end of a message. The SN field of the SAR-PDU is copied from the ACD
structure. The trailer is formed by setting LI based on the CDS_Tbytes
and CDS_Crc10 is set in the CDS to enable the CRC10 generation for
the contents of the cell by the ACI. Finally, the SN is updated in the ACD
for the connection. Note that the CDS_Head is also updated to point to
the next cell address that was sent to the EDMA for this connection. The
SAR-PDU is then sent to the ACI by issuing an EDMA_TxCell with a zero
connection number. This is shown in the following code (Figure 9.4):
Figure 9.4
AAL34_Send() Routine
/* -------------------------------------------------------
 * Name:        AAL34_Send()
 *
 * Description: Send AAL34 cell after it is returned by the
 *              EDMA in the TxCompl queue.
 *
 * parameters:  ConNum: Connection number of the VC.
 *
 * -------------------------------------------------------*/
void AAL34_Send(const ulong ConNum)
{
register pCell_t pCell = 0;
ulong CDS_Head = (ulong) ACD[ConNum].CDS_Head;
if (ACD[ConNum].CDS[CDS_Head])
{
pCell = (pCell_t) ((ulong)pCBM + (ulong) ACD[ConNum].CDS[CDS_Head]);
}
else
Halt("Empty Cell for AAL34 Tx Connection");
ACD[ConNum].CDS[CDS_Head] = 0;
ACD[ConNum].CDS_Head = (CDS_Head + 1) % 10;
/* Set the Crc10 bit of the CDS */
pCell->CDS |= ONE(CDS_Crc10);
/* Set the Segment type of the cell */
if (pCell->CDS & ONE(CDS_BOM))
{
    if (pCell->CDS & ONE(CDS_EOM))
    {
        ACD[ConNum].SAR_Hdr.ST = ST_SSM;
    }
    else
        ACD[ConNum].SAR_Hdr.ST = ST_BOM;
}
else
{
    if (pCell->CDS & ONE(CDS_EOM))
    {
        ACD[ConNum].SAR_Hdr.ST = ST_EOM;
    }
    else
        ACD[ConNum].SAR_Hdr.ST = ST_COM;
}
/* Update the SAR_Hdr and SAR_Trailer fields of the cell */
half(pCell->Payld[0]) = half(ACD[ConNum].SAR_Hdr);
half(pCell->Payld[46]) = (ushort) pCell->CDS & 0x0000fc00;
/* Update the ACD for the connection */
ACD[ConNum].SAR_Hdr.SN = (ACD[ConNum].SAR_Hdr.SN + 1) & 0xf;
EDMA_TxCell(0,pCell);
}
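The segment-type decision and the sequence-number update in AAL34_Send() can be isolated as two small pure functions, which makes them easy to test off-target. This is a sketch under assumptions: the names are hypothetical, and the SEG_* values follow the ITU-T I.363 coding, which may differ from how ST_BOM, ST_COM, ST_EOM, and ST_SSM are defined in the application headers.

```c
#include <stdbool.h>

/* Segment Type codes per ITU-T I.363 (illustrative names). */
enum { SEG_COM = 0, SEG_EOM = 1, SEG_BOM = 2, SEG_SSM = 3 };

/* Map the beginning/end-of-message flags reported for a cell
 * to the SAR-PDU Segment Type. */
static unsigned aal34_segment_type(bool bom, bool eom)
{
    if (bom)
        return eom ? SEG_SSM : SEG_BOM;
    return eom ? SEG_EOM : SEG_COM;
}

/* Advance the 4-bit Sequence Number, wrapping from 15 back to 0. */
static unsigned aal34_next_sn(unsigned sn)
{
    return (sn + 1u) & 0xFu;
}
```

A cell flagged as both beginning and end of message is a Single Segment Message; a cell with neither flag is a Continuation of Message.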
9.3 AAL3/4 Reassembly
The processing of an AAL3/4 connection in the receive direction involves
extracting the data from the SAR-PDU after verifying that the
header and trailer are free of errors. Since the
BFD_ErrLength and BFD_ErrCrc control bits in the BFD are not set by
the EDMA, the APU needs to update them when the SAR-PDU is
processed. The processing of the AAL3/4 cell is done by the
AAL34_Receive() routine as shown in Figure 9.5.
Figure 9.5
AAL34_Receive() Routine
/* -------------------------------------------------------
 * Name:        AAL34_Receive()
 *
 * Description: Processing of AAL34 cell received
 *
 * parameters:  pCell: cell address in Cell Buffer
 *
 * -------------------------------------------------------*/
void AAL34_Receive(const ulong ConNum, const pCell_t pCell)
{
pAAL34_Payld_t Payld = (pAAL34_Payld_t) &pCell->Payld;
ulong CurrBFD = (ulong) VCD[ConNum].CurrBFD;
ulong CellHdr = pCell->CellHdr;
switch (Payld->SAR_Hdr >> ST_SHIFT )
{
case ST_BOM:
pCell->CDS |= ONE(CDS_BOM);
break;
case ST_SSM:
case ST_EOM:
pCell->CDS |= ONE(CDS_EOM);
CellHdr |= ONE(CELL_EOM);
break;
case ST_COM:
/* pCell->CDS &= ~( ONE(CDS_BOM) | ONE(CDS_EOM) ); */
break;
default:
break;
}
/* Set the Tbytes of the cell */
pCell->CDS &= 0xffff03ff; /* Clear out the CDS_TBytes */
pCell->CDS |= (ulong) (Payld->SAR_Tlr & LI_MASK);
/* Error in the Crc10 of the cell */
if (pCell->CDS & ONE(CDS_Crc10))
half(BFD[CurrBFD].BFD_Ctrl) = half(BFD[CurrBFD].BFD_Ctrl) |
ONE(BFD_ErrCrc);
/* Sequence number error */
if ((Payld->SAR_Hdr & SN_MASK) != (half(ACD[ConNum].SAR_Hdr) & SN_MASK) )
half(BFD[CurrBFD].BFD_Ctrl) |= ONE(BFD_ErrLength);
/* Update the SN of the ACD */
ACD[ConNum].SAR_Hdr.SN = (ACD[ConNum].SAR_Hdr.SN + 1) & 0xf;
/* Send the cell to the EDMA for processing */
EDMA_RxCell(ConNum, CellHdr, pCell);
}
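The sequence-number check in AAL34_Receive() can be sketched in isolation as follows. The type and function names are hypothetical, and the SN position (bits 13:10 of the 16-bit header) is assumed from the SAR_HDR_t layout in Figure 9.3. Like the routine above, the sketch advances the expected value by one after every cell and does not resynchronize to the received value on a mismatch.

```c
#include <stdint.h>
#include <stdbool.h>

/* Per-connection receive state: the next Sequence Number expected. */
typedef struct {
    uint8_t expected_sn;
} aal34_rx_state_t;

/* Return true when the received SAR-PDU header carries the expected
 * Sequence Number, then advance the expected value modulo 16. */
static bool aal34_rx_check_sn(aal34_rx_state_t *s, uint16_t sar_hdr)
{
    uint8_t sn = (uint8_t)((sar_hdr >> 10) & 0xFu);
    bool ok = (sn == s->expected_sn);

    s->expected_sn = (uint8_t)((s->expected_sn + 1u) & 0xFu);
    return ok;
}
```

A caller would set the error flag for the current buffer when the function returns false, as the routine above does with the BFD control bits.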
Chapter 10
Initialization
This chapter describes initialization and configuration tasks, and provides
sample code for the ATMizer II+ chip.
• Section 10.1, “Initialization Overview”
• Section 10.2, “Booting Procedures”
• Section 10.3, “C Preamble Execution”
• Section 10.4, “CPU Initialization and Configuration”
• Section 10.5, “Configuration Header File”
• Section 10.6, “Host PCI Access”
• Section 10.7, “Memory Allocation”
• Section 10.8, “Hardware Registers Initialization”
• Section 10.9, “Data Structures Initialization”
10.1 Initialization Overview
The following steps are typically required to initialize the ATMizer II+ chip:
Step 1. Booting
The booting step initializes the Configuration and Cache
Control (CCC) register and the SDRAM controller, then copies
the initialization and application code to an executable memory
location.
Step 2. C Preamble Execution
C preamble execution includes .bss section clearing, stack
allocation, and initialization of the global data pointer and stack
pointer registers.
Step 3. CPU Initialization and Configuration
CPU initialization and configuration mainly includes cache
configuring and flushing, and interrupt and exception handler
setting.
Step 4. Memory Allocation
Memory allocation defines the maps for primary, secondary,
and Cell Buffer Memory (CBM).
Step 5. Hardware Registers Initialization
ATMizer II+ chip hardware initialization and configuration
includes all hardware module registers and mode setting.
Step 6. Data Structures Initialization
Data structures initialization includes Free Cell List initialization,
clearing the Virtual Connection Descriptors (VCDs), and setting
the Buffer Descriptors (BFDs) and Scheduler Calendar Table
(SCDs).
Step 7. Interrupt Handler Initialization
When the interrupts are enabled, the interrupt handlers are
used by the software to process the interrupts. See Chapter 7,
“Interrupt Handling” for details.
The LSI Logic PROM Monitor (PMON) can be used for default
initialization for steps 1 through 3 above and as a software application
debugger. However, you do not need PMON for your system. Following
steps 1 through 3, you may load the application program (with the load
command) to the desired memory location.
To compile and link a program for execution, PMON provides a utility
program called pmcc. Pmcc invokes the host’s C compiler with the correct
arguments and flags, then generates FastFormat records of the code for
downloading. You may suppress the default initialization included in
PMON and provide your own initialization code. Sample code is provided
in Section 10.2, “Booting Procedures,” Section 10.3, “C Preamble
Execution,” and Section 10.4, “CPU Initialization and Configuration.”
The remainder of this chapter follows the above steps and illustrates
each in detail.
10.2 Booting Procedures
The L64364 ATMizer II+ chip supports the following booting procedures:
• booting from the Secondary Port’s byte-wide EPROM page
• booting from CBM/serial PROM
• booting from CBM/Primary Port
Each booting procedure requires a unique sequence of code before
reading the actual application code. In the boot code, the ATM
Processing Unit (APU) in the ATMizer II+ chip must make sure that the
CCC register and the SDRAM controller are initialized first. The SDRAM
controller initialization is necessary before it can be used; this is
important for applications running code out of SDRAM. The APU also
needs to remap the exception vector space to a location where the
application program will reside. An example of this default initialization
code from the LSI Logic PMON is shown in Section 10.2.1. For
applications using boot procedures from CBM, the boot code should
contain a routine for copying the application program to an executable
memory location before jumping to that location for execution. The
remainder of this section assumes that the application code is stored in
SDRAM.
10.2.1 Default ATMizer II+ Chip Initialization
The following code (Figure 10.1) initializes the CCC register and the
SDRAM controller to default settings before jumping to the main loop of
PMON:
Figure 10.1 CCC Register and SDRAM Controller Initialization
#include <regdef.h>
#include <cp0_scobra.h>
.text
.globl atmizer2Init
.ent atmizer2Init
atmizer2Init:
# setup the CCC configuration register
# enable: CMP, IIE, DIE, MUL, MAD, BGE, IPWE(1K), WB
# Icache: 2 way set assoc, 4K set size
# Dcache: 2 way set assoc, 4K set size
# CCC <- 0000 0001 1111 0011 1011 0110 0010 0000
li t0, 0x01f3b620
.set noreorder
# load the CP0 configuration register
mtc0 t0, C0_CCC
.set reorder
# setup the wait states of secondary memory page 0, 1, 2
li t1, 0xf8000888
li t2, M_SBCR
sw t1, (t2)
# setup the AddrMap Register to direct exceptions
# to the SDRAM.
li t1,(0<<ADRM_EXCMAP_SHFT)
li t2,M_APU_AddrMap
sw t1,(t2)
# setup the SDRAM config info
# Load base address
li t2, SBC_BASE
/* Precharge command */
li t0, 0x4033b753
sw t0, 4(t2) # control reg
/* Mode register */
li t0, 0x20228530
sw t0, 4(t2) # control reg
/* Set Mode */
li t0, 0x0000eeee
li t3, 0x80810000
sw t0, (t3)
/* Set Control */
li t0, 0x10228530
sw t0, 4(t2) # control reg
/* Load refresh register */
li t0, 0x00000300
sw t0, 8(t2) # refresh reg
j ra
.end atmizer2Init
The CCC register’s default setting can be overwritten with the desired
configuration for the specific application after the APU jumps out of the
boot sequence and into the application-specific code. This is described
in detail in Section 10.4, “CPU Initialization and Configuration.”
10.2.2 Secondary Port EPROM Boot Sequence
Booting from the Secondary Port’s 8-bit EPROM is selected if the
SYS_BOOT[1:0] pins are 0b00 when the PCI_RSTn signal is deasserted. The boot exception address of 0xBFC0.0000 is mapped to
physical address 0x0000.0000. There is no special boot code required in
this case since the EPROM is mapped to a page on the Secondary Port
and is used only for the firmware for the APU and data structures.
10.2.3 Cell Buffer Memory/Serial PROM Boot Sequence
When serial PROM boot is selected (SYS_BOOT[1:0] = 0b11), the APU
remains in reset after the PCI_RSTn signal is deasserted until the first
256 bytes of data from the serial PROM are copied into CBM. When the
copy is completed, the APU leaves the reset state and begins execution
from CBM. The 64 instructions in CBM copy the remaining code from the
serial PROM to a valid memory location (CBM, Primary Port memory, or
Secondary Port memory) by reading the APU_SRL register and then
jumping to that location to continue execution.
The boot code in CBM has to be located in a different address area from
the actual application program. See Section 10.4.5, “Icache and I-RAM
Configuration” for descriptions of how this is done. The CBM boot code
cannot be compiled separately from the main application program
because it requires the size of the application code to be copied.
The following code (Figure 10.2) is used to boot from a serial PROM:
Figure 10.2 Serial Boot Routine
#include <regdef.h>
#include <cp0_scobra.h>
#define start_addr 0x80820000
        .extern start_addr 4
        .extern etext 4
        .text
        .globl serialBoot
        .ent serialBoot
serialBoot:
        # set SR and CAUSE to something sensible
        li      v0,SR_BEV
        .set noreorder
        .set noat
        mtc0    v0,C0_SR
        nop
        mtc0    zero,C0_CAUSE
        nop
        # set up the CPU and SDRAM controller
        # this is the routine described in Section 10.2.1
        jal     atmizer2Init
        nop
        # copy code from Serial PROM to SDRAM
        # get size of code and data to be copied
        # assuming they are to be copied to SDRAM addr 0xa0820000
        li      t1,start_addr   # beginning of .text section to be copied
        la      t0,etext        # end of .text section
        subu    t2,t0,t1        # number of instr bytes to copy
        li      t1,0xa0820000   # SDRAM address where the program
                                # is to be copied to
loop_text:
        lw      t4,M_APU_SRL    # read Serial PROM (into t4 so that the
                                # byte counter in t2 is not overwritten)
        sub     t2,4            # decrement counter by 4 bytes (1 word)
        sw      t4,(t1)         # store instr word from Serial PROM
                                # to SDRAM
        .set noreorder
        .set noat
        bnez    t2,loop_text    # get next word
        addi    t1,0x4          # point to next addr in SDRAM
        .set at
        .set reorder
        la      t3,edata        # end of .data section
        subu    t2,t3,t0        # number of data bytes to copy
        la      t1,_fdata       # address where the .data section
                                # is to be copied to
loop_data:
        lw      t4,M_APU_SRL    # read Serial PROM (into t4 so that the
                                # byte counter in t2 is not overwritten)
        sub     t2,4            # decrement counter by 4 bytes (1 word)
        sw      t4,(t1)         # store data word from Serial PROM
                                # to SDRAM
        .set noreorder
        .set noat
        bnez    t2,loop_data    # get next data word
        addi    t1,0x4          # point to next addr in SDRAM
        .set at
        .set reorder
        # jump to SDRAM for start of execution
        li      t1,0xa0820000
        j       t1
        nop
        .set at
        .set reorder
        .end serialBoot
This code consists of about 56 instructions after compilation. It fits
completely in CBM and executes when the APU jumps to the reset
exception vector. It is compiled with and linked to the application code.
The link address for the boot code is at 0xB000.0000 and the link
address for the application code is at 0x80B2.0000 if the application code
is cache resident.
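The copy loops in Figure 10.2 follow a simple pattern: every read of the serial PROM register yields the next word of the image, which is stored at an incrementing destination address. The C sketch below models that pattern for host-side testing only. All names are hypothetical; the register read is abstracted behind a callback, and fake_prom_t is a stand-in for the device, not a description of the APU_SRL register.

```c
#include <stdint.h>
#include <stddef.h>

/* Callback that returns the next word from the serial PROM. On the
 * target this would be a volatile read of the APU_SRL register. */
typedef uint32_t (*srl_read_fn)(void *ctx);

/* Copy nbytes (rounded down to whole words) from the serial source to
 * dst, one word per read. Returns the number of words copied. */
static size_t serial_copy_words(srl_read_fn read_word, void *ctx,
                                uint32_t *dst, size_t nbytes)
{
    size_t words = nbytes / 4;

    for (size_t i = 0; i < words; i++)
        dst[i] = read_word(ctx);
    return words;
}

/* Hypothetical stand-in for the PROM: hands out successive words. */
typedef struct {
    const uint32_t *words;
    size_t pos;
} fake_prom_t;

static uint32_t fake_prom_read(void *ctx)
{
    fake_prom_t *p = (fake_prom_t *)ctx;
    return p->words[p->pos++];
}
```

Calling serial_copy_words() once for the .text image and once for the .data image mirrors the two loops of the boot routine.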
If the application code is separated into two parts, one in a noncacheable
area and the other in I-RAM, the copying process is a little more
complicated. The copying code has to be modified to copy the two
sections to two different memory areas.
10.2.4 Cell Buffer Memory/Primary Port Boot Sequence
Booting from CBM/Primary Port is selected if the SYS_BOOT[1:0] pins
are 0b10 when the PCI_RSTn signal is deasserted. In this boot
sequence, the reset exception address is remapped to CBM address 0.
However, the APU is still in reset until the XPP_APU_Reset bit of the
XPP_Ctrl register is cleared. After remapping the reset address, the
external PCI master first configures the PCI configuration space of the
ATMizer II+ chip and then puts the boot code (such as the one described
above with minor modification to the copy-from address) into CBM. When
this is done, clear the XPP_APU_Reset bit so the APU will jump to CBM
and start execution.
10.3 C Preamble Execution
The initialization code following the booting procedure prepares the
processor for execution of an application program. It typically performs
the following tasks:
• initializes memory
• clears the .bss and .sbss sections
• flushes the caches
• copies program data from PROM to RAM
• initializes the stack pointer and the global data pointer registers
• enables interrupts
• switches from noncacheable to cacheable space
The sample initialization code follows (Figure 10.3):
Figure 10.3 Sample Initialization Code
/* start-up code, cacheable, no exceptions */
#include <regdef.h>
#include <cp0_scobra.h>
#define STKSIZE 8192
        .comm stack, STKSIZE
        .text
/*
 * This is the entry point of the entire application
 */
init:
        .set noreorder
/*
 * setup the Status register (BEV=0, Kernel Mode, Rupts enabled)
 * enable access to CP0 in user mode
 * enable interrupts
 * enable HW interrupt 1
 */
        li      a0, (SR_CU0 | SR_IE | 0x0000)
        mtc0    a0, C0_SR       /* load the CP0 status register */

/* clear the SW interrupt bits in the Cause Register */
        nop
        mtc0    zero, C0_CAUSE  /* only SW interrupt bits are writeable */
        nop

        # clear bss
        la      v0, _fbss
        la      v1, end
1:      sw      $0, 0x0(v0)
        sw      $0, 0x4(v0)
        sw      $0, 0x8(v0)
        sw      $0, 0xc(v0)
        addu    v0,16
        blt     v0, v1, 1b

        # flush the caches
        # first set up a K1seg sp & gp
        la      sp, stack+STKSIZE-24
        or      sp, K1BASE
        la      gp, _gp
        or      gp, K1BASE

/* flush the caches */
        .set nowarn
        .word ( 0xbc030000 )    /* instruction CACHE_FLUSHID */
        .set warn

        # copy .data to RAM
        # src=etext dst=_fdata stop=edata
        la      t0, etext
        la      t1, _fdata
        la      t2, edata
1:      lw      t3, (t0)
        sw      t3, (t1)
        addu    t0, 4
        addu    t1, 4
        blt     t1, t2, 1b

        # ok to use k0seg now, so initialize sp & gp
        la      sp, stack+STKSIZE-24
        la      gp, _gp

        # transfer to main program
        # reg indirect necessary to switch segments
        la      t0, main
        jal     t0

_exit:
        b       _exit
        .end init
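For readers less familiar with MIPS assembly, the two memory loops in Figure 10.3 (clearing .bss and copying .data) correspond to the C sketch below. The function names are hypothetical. On the target, the pointer arguments would come from the linker symbols used above (_fbss, end, etext, _fdata, edata), and this work has to be done before any C code relies on initialized globals, which is why the original is written in assembly.

```c
#include <stdint.h>

/* Zero every word in [start, end), as the "clear bss" loop does. */
static void clear_words(uint32_t *start, uint32_t *end)
{
    while (start < end)
        *start++ = 0;
}

/* Copy words from src to [dst, dst_end), as the ".data to RAM" loop does. */
static void copy_words(const uint32_t *src, uint32_t *dst, uint32_t *dst_end)
{
    while (dst < dst_end)
        *dst++ = *src++;
}
```

The assembly version clears four words per iteration; the sketch clears one word at a time for clarity, which gives the same result when the region is a multiple of 16 bytes.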
The above example is a general initialization routine that you can adapt
to your application. For instance, you may choose your own $gp and
$sp values to map the .sbss and .sdata sections and the stack to the
data RAM instead of using the linker default $gp and $sp values. You
may modify the linker script to achieve the same goal. This is described
in detail in Section 10.4, “CPU Initialization and Configuration.”
If the .sbss section is mapped to data RAM, then clearing the .sbss
section is not necessary since this external block of memory will not be
referenced. Instead, the .sbss section in the data RAM should be
cleared. Note, both Icache and Dcache should be flushed before they are
set and loaded. Also, the above code assumes that the program will run
from the PROM, that is, the .text section is located in the PROM. If the
application code is in I-RAM or SSRAM of secondary memory, it should
be loaded to the corresponding destination address. The sample code is
illustrated in Section 10.2.3, “Cell Buffer Memory/Serial PROM Boot
Sequence.”
10.4 CPU Initialization and Configuration
CPU initialization includes secondary memory controller initialization,
cache configuration, and interrupt settings. Secondary memory controller
initialization and interrupt settings are described in the L64364 ATMizer
II+ ATM-SAR Chip Technical Manual. This section discusses different
choices of Icache and Dcache configuration, initialization, and utilization.
10.4.1 Configuration and Cache Control Register
The CCC register allows software to configure various aspects of the
APU. Figure 10.4 shows the format of the CCC register in the ATMizer
II+ chip. The paragraphs following the figure describe the fields and bits
and their required settings. Refer to the L64364 ATMizer II+ ATM-SAR
Chip Technical Manual for more detail.
Figure 10.4 CCC Register Layout
Bit    31:29  28   27  26    25   24   23   22   21   20   19   18   17   16
Field  R      EWP  R   ISR1  EVI  CMP  IIE  DIE  MUL  MAD  TMR  BGE  IE0  IE1

Bit    15:14    13   12   11:10    9    8:7       6   5   4    3    2    1    0
Field  IS[1:0]  DE0  DE1  DS[1:0]  IPW  IPS[1:0]  TE  WB  SR0  SR1  IsC  TAG  INV
R       Reserved    [31:29]
        These bits are not used in the L64364 and should be cleared.
EWP     External Write Priority    28
        This bit defines SCBus arbitration priority between data
        reads and writes in the 4-level write buffer. Clearing EWP
        gives higher priority to data read requests if the read
        address does not match any of the write addresses in the
        write buffer. Setting EWP gives higher priority to data
        writes.
R       Reserved    27
        This bit is not used in the L64364 and should be cleared.
ISR1    Icache 1 Enable    26
        Scratch-pad RAM mode enable (Icache set 1). Setting
        depends on application.
EVI     External Vectored Interrupt    25
        1 = enable, 0 = disable. Setting depends on application.
CMP     R3000 Compatibility Mode    24
        1 = enable, 0 = disable. Setting depends on application.
IIE     Icache Invalidate Request Enable    23
        1 = enable, 0 = disable. Setting depends on application.
DIE     Dcache Invalidate Request Enable    22
        1 = enable, 0 = disable. Setting depends on application.
MUL     Floating-Point Multiplier Unit Enable    21
        1 = enable, 0 = disable. Enabled in ATMizer II+ chip.
MAD     Multiplier Accumulate Extension Enable    20
        1 = enable, 0 = disable. Disabled in ATMizer II+ chip.
TMR     Timer Facility Enable    19
        1 = enable, 0 = disable. Setting depends on application.
        When enabled and the value in the Count register equals
        the value in the Compare register, sets interrupt IP7 in
        the Cause register.
BGE     Bus Grant Enable    18
        1 = enable, 0 = disable. Enabled in ATMizer II+ chip.
        When enabled, the ATMizer II+ chip recognizes external
        logic as the BIU Bus master.
IE0     Icache Set 0 Enable    17
        1 = enable, 0 = disable. Setting depends on application.
IE1     Icache Set 1 Enable    16
        1 = enable, 0 = disable. Setting depends on application.
Note:   For I-cache, IE1 MUST be set to enable operation of the
        cache memory and ISR1 determines whether it is used in
        cache mode or scratch pad mode.
IS[1:0]   Icache Size    [15:14]
        0b00 = 1 Kbyte, 0b01 = 2 Kbyte, 0b10 = 4 Kbyte,
        0b11 = 8 Kbyte. Set to 0b10 in the ATMizer II+ chip for a
        4 Kbyte Icache.
DE0     Dcache Set 0 Enable    13
        1 = enable, 0 = disable. Setting depends on application.
DE1     Dcache Set 1 Enable    12
        1 = enable, 0 = disable. Setting depends on application.
DS[1:0]   Dcache Size    [11:10]
        0b00 = 1 Kbyte, 0b01 = 2 Kbyte, 0b10 = 4 Kbyte,
        0b11 = 8 Kbyte. Set to 0b01 in the ATMizer II+ chip for a
        2 Kbyte Dcache.
IPW     Internal Page Write Enable    9
        1 = enable, 0 = disable. Setting depends on application.
IPS[1:0]  Internal Page Size    [8:7]
        0b00 = 1 Kbyte, 0b01 = 2 Kbyte, 0b10 = 4 Kbyte,
        0b11 = 8 Kbyte. Setting depends on application.
TE      Translation Buffer Enable    6
        1 = enable, 0 = disable. Disabled in ATMizer II+ chip.
WB      Write Through/Write Back Cache Select    5
        0 = write through, 1 = write back. Defines cache operation
        for addresses not mapped by the Translation Buffer.
        Setting depends on application.
SR0     Scratch-pad RAM Mode Enable (Dcache Set 0)    4
        When this bit is set and the DE0 bit is cleared, Dcache
        Set 0 is configured as scratch-pad RAM. When this bit is
        cleared, the DE0 bit enables/disables Dcache mode for
        Set 0. Setting depends on application.
SR1     Scratch-pad RAM Mode Enable (Dcache Set 1)    3
        When this bit is set and the DE1 bit is cleared, Dcache
        Set 1 is configured as scratch-pad RAM. When this bit is
        cleared, the DE1 bit enables/disables Dcache mode for
        Set 1. Setting depends on application.
Note:   For the data cache, either SRx or DEx, but NOT both, is set
        to enable either scratch pad or cache operation.
IsC     Isolate Cache Mode    2
        When set, APU store operations go to the cache but do
        not propagate to external memory. Setting depends on
        application.
TAG     Tag Test Mode    1
        When set, load and store operations access the Tag
        RAMs and can be used for Tag RAM testing. Setting
        depends on application.
INV     Cache Invalidate Mode    0
        When set, cache contents are invalidated. Used only for
        cache diagnostic and debug operations. Setting depends
        on application.
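As a worked example of the field positions listed above, the following sketch decodes a CCC value into its fields. The macro, type, and function names are hypothetical and are not taken from cp0_scobra.h; only the bit positions come from this section. Only a subset of the fields is decoded.

```c
#include <stdint.h>

/* Extract a single bit or a multi-bit field from a register value. */
#define CCC_BIT(v, n)         (((v) >> (n)) & 1u)
#define CCC_FIELD(v, lsb, w)  (((v) >> (lsb)) & ((1u << (w)) - 1u))

typedef struct {
    unsigned cmp, iie, die, mul, mad, tmr, bge;
    unsigned ie0, ie1, is, de0, de1, ds, ipw, ips, te, wb;
} ccc_fields_t;

/* Decode a 32-bit CCC value using the bit positions in this section. */
static ccc_fields_t ccc_decode(uint32_t v)
{
    ccc_fields_t f;

    f.cmp = CCC_BIT(v, 24);  f.iie = CCC_BIT(v, 23);
    f.die = CCC_BIT(v, 22);  f.mul = CCC_BIT(v, 21);
    f.mad = CCC_BIT(v, 20);  f.tmr = CCC_BIT(v, 19);
    f.bge = CCC_BIT(v, 18);  f.ie0 = CCC_BIT(v, 17);
    f.ie1 = CCC_BIT(v, 16);  f.is  = CCC_FIELD(v, 14, 2);
    f.de0 = CCC_BIT(v, 13);  f.de1 = CCC_BIT(v, 12);
    f.ds  = CCC_FIELD(v, 10, 2);
    f.ipw = CCC_BIT(v, 9);   f.ips = CCC_FIELD(v, 7, 2);
    f.te  = CCC_BIT(v, 6);   f.wb  = CCC_BIT(v, 5);
    return f;
}

/* Size codes 0b00..0b11 select 1, 2, 4, or 8 Kbytes. */
static unsigned ccc_size_kbytes(unsigned code)
{
    return 1u << (code & 3u);
}
```

Applied to the default value 0x01f3b620 loaded in Figure 10.1, this decoding gives IS = 0b10 (4 Kbyte), DS = 0b01 (2 Kbyte), both Icache and both Dcache sets enabled, IPW = 1, IPS = 0b00, and WB = 1. Under the same bit positions, TMR (bit 19) and BGE (bit 18) decode as 0 for that value.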
10.4.2 Cache Configuration
The ATMizer II+ Icache and Dcache organizations are as follows:
• The Icache consists of two sets of 4 Kbytes (8 Kbytes total). The
Dcache consists of two sets of 2 Kbytes (4 Kbytes total).
• Direct mapped is selected when only one set of cache is enabled.
Two-way set associative is selected when two sets are enabled.
• One cache line is eight words (4 double-words = 32 bytes = 256
bits). Refill address ordering is wrap-around from the missing
address.
• Write back or write through is selectable by the WB bit in the CCC
register.
Both Icache and Dcache can be configured as scratch-pad RAMs. Each
scratch-pad RAM must be located in one specific physical address space
such as a local or secondary data memory. The APU may load the
frequently referenced instructions and data structures into the
scratch-pad RAMs to greatly reduce memory access time.
10.4.3 Dcache and D-RAM Configuration
Dcache can be configured as direct mapped if one set is enabled or two-way
set associative if both sets are enabled. When configured either way,
Dcache behaves like regular cache. It may also be configured as data
RAM. Either Dcache set (0 or 1) can be configured as scratch-pad RAM
by setting the SR0 or SR1 bit of the CCC register. The scratch-pad RAM
must be located at a specific physical address like a secondary data
memory. Since the ATMizer II+ chip has Dcache Tag RAMs, the Tags
must be programmed by isolating the cache before setting the SR bit.
To program a data Tag RAM, set the following bits in the CCC register
using an MTC0 instruction (Figure 10.6 shows example code at the
end of this section):
CCC_ISC = 1 - Isolate Cache Mode Enable
CCC_INV = 0 - Invalidate Mode Disable
CCC_TAG = 1 - Tag Test Mode Enable
CCC_DC0 or CCC_DC1 = 1 - Set 0 or Set 1 Enable
The MTC0 instruction has one delay slot and the instruction immediately
following it should not be a load or a store. All load and store instructions
following the MTC0 instruction access the data Tag RAM selected by the
CCC_DC0 or CCC_DC1 bit using the format shown in Figure 10.5. Since
the Dcache set size is 2 Kbytes, only the upper 21 bits of the data are
for the tag. Also, the Valid (V) bit should always be 1 to initialize the tag
field. The Hit (HT) bit is ignored during a store operation. For a load
operation, the Hit bit is set if a match occurs.
Figure 10.5 Tag Test Mode Loaded Data Format
    Bits     Field
    31:10    Tag Data
    9:3      Reserved
    2        HT
    1        V
    0        WB
If the Dcache scratch-pad RAM is enabled, an access to the scratch-pad
RAM area is a secondary memory access without any stall cycle.
You can choose one or two sets of Dcache, two sets of data RAM, or
one set of Dcache and one set of data RAM. Note that only one set of
cache can be set (that is, tag field specified) at a time. To map the data
RAM to a physical memory area, isolate the Dcache (ISC = 1) and set
the tag test mode (TAG = 1). In the tag test mode, all the memory
accesses go only to the Tag RAM and the APU stores the tags in the
Dcache. The following code is an example of data RAM configuration
(Figure 10.6):
Figure 10.6 Data RAM Configuration Code
/*****************************************************
 * setDram(addr, set)
 * set tag field of the selected set to map the D-ram
 * to addr, the addr should be a cachable virtual address.
 * This can be executed from kseg1 with interrupts disabled.
 */
        .text
        .globl  setDram
        .ent    setDram
setDram:
        .set    noreorder
        subu    sp, 24                  # allocate min size context
        sw      s0, 4(sp)
        move    s0, a0                  # save a0 for later use

        /* Fill the tags of the selected set with the incoming addr in a0 */
        mfc0    t3, C0_CONFIG           # save the original CP0 configuration register
        nop
        nop
        nop
        move    t0, t3

        # Select set0 or set 1?
        # a1 = 0 -> t0 = 1 << 12
        # a1 = 1 -> t0 = 1 << 13
        andi    a1, 1
        xor     a1, 1
        addiu   a1, 1

        /*
         * disable cache mode
         * enable Tag test and Isolate cache mode for Dcache set0 or set1
         */
        and     t0, ~(CCC_IE1 | CCC_IE0 | CCC_DE1 | CCC_DE0 | CCC_INV)
        or      t0, (CCC_TAG | CCC_ISC)
        sll     t1, a1, 12
        or      t0, t1
        mtc0    t0, C0_CONFIG           # load to the CP0 configuration register
        nop
        nop
        nop

        and     a0, 0xfffffc00          # Map to cachable virtual address
        or      a0, 2                   # 2 = valid
        li      t1, 0
LOOP4:
        sw      a0, 0(t1)               # store the tag ram
        addiu   a0, a0, 32              # advance the tag value by one cache line (32 bytes)
        addiu   t1, t1, 32              # advance the tag position by one cache line (32 bytes)
        sltiu   t2, t1, 2048            # continue while position < 2 Kbytes
        bne     t2, zero, LOOP4
        nop

        mtc0    t3, C0_CONFIG           # restore the original CP0 configuration register
        nop
        nop
        nop

        .set    reorder
        lw      s0, 4(sp)
        addu    sp, 24                  # deallocate
        j       ra

        .end    setDram
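The tag-fill loop of Figure 10.6 can be modeled on a workstation to see which words it would store. The following C sketch is illustrative only: the function and macro names are hypothetical, the sizes (2 Kbyte set, 32 byte line) and the mask 0xfffffc00 are taken from the text and listing above, and no hardware access is performed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; values come from the text and from Figure 10.6. */
#define DCACHE_SET_BYTES 2048u        /* size of one Dcache set          */
#define CACHE_LINE_BYTES 32u          /* size of one cache line          */
#define TAG_MASK         0xFFFFFC00u  /* mask applied to a0 in the loop  */
#define TAG_VALID        0x2u         /* V bit, as in "or a0, 2"         */

/*
 * Fill 'tags' with the words the loop would store, one per cache line.
 * Returns the number of entries written (never more than 'max').
 */
static size_t model_dram_tags(uint32_t addr, uint32_t *tags, size_t max)
{
    uint32_t word = (addr & TAG_MASK) | TAG_VALID;
    uint32_t off;
    size_t n = 0;

    assert(DCACHE_SET_BYTES % CACHE_LINE_BYTES == 0);
    for (off = 0; off < DCACHE_SET_BYTES && n < max; off += CACHE_LINE_BYTES) {
        tags[n++] = word;           /* models: sw a0, 0(t1)       */
        word += CACHE_LINE_BYTES;   /* models: addiu a0, a0, 32   */
    }
    return n;
}
```

With a 2 Kbyte set and 32 byte lines the model produces 64 tag words, one for each line of the set.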
10.4.4 Dcache and C-RAM Usage
When the compiler and linker generate executables, they divide all data
into one of the four sections listed in Table 10.1.
Table 10.1 Data Section Allocation
Name      Description
.data     The .data section contains memory that the linker can initialize to
          nonzero values before the program begins to execute. The assembler
          uses 32-bit addressing to access these symbols.
.sdata    Similar to the .data section, except that the linker places it within
          a 64 Kbyte region pointed to by the $gp register so that the
          assembler can use economical 16-bit addressing to access it.
.bss      The .bss section consists of noninitialized data, which should be
          initialized to zero by the C preamble before the program begins to
          execute. Its data size is greater than the value specified by the -G
          command line option. The assembler uses 32-bit addressing to access
          these symbols.
.sbss     Similar to the .bss section, except that its data size is smaller
          than the value specified by the -G command line option and the
          linker places it within a 64 Kbyte region pointed to by the $gp
          register. The assembler can use economical 16-bit addressing to
          access it.
The combined size of the .sdata and .sbss sections must not exceed 64
Kbytes. Items equal to or smaller than the specified size go in the .sdata
or the .sbss section. The -G command line option for each compiler or
assembler can increase the size of the data items to be put into the
.sdata and .sbss sections. If a -G value is not specified to the compiler,
the default is eight.
Here is an example:
    int a = 5;
    char b;
Variable a will be located in the .data or .sdata section and initialized to
five. Variable b will be located in the .bss or .sbss section and initialized
to zero. If the code is compiled with a -G 4 option, both variables will be
put into the small sections since both their data sizes are not greater than
four bytes. If another value is chosen for the -G option, like -G 2, then
variable a will be put in the .data section since it is larger than two bytes
while variable b will be put in the .sbss section since it is smaller than
two bytes.
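The placement rule described above can be written down as a small decision function. This is only a sketch of the rule as stated in this section, not the logic of any particular compiler: the type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the four data sections of Table 10.1. */
typedef enum { SEC_DATA, SEC_SDATA, SEC_BSS, SEC_SBSS } section_t;

/*
 * Items no larger than the -G threshold go to the small sections;
 * initialized items go to .data/.sdata, the others to .bss/.sbss.
 */
static section_t section_for(size_t size, int initialized, size_t g_threshold)
{
    int is_small;

    assert(size > 0);
    is_small = (size <= g_threshold);
    if (initialized)
        return is_small ? SEC_SDATA : SEC_DATA;
    return is_small ? SEC_SBSS : SEC_BSS;
}
```

For the example above, a 4-byte initialized int and a 1-byte uninitialized char land in .sdata and .sbss with -G 4, and in .data and .sbss with -G 2.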
The small data section (.sdata) and the small bss section (.sbss) are
relatively addressed through the Global Pointer register $gp. The
assembler code looks like the following line:
    lw      $v1, offset($gp)
The data section (.data) and the bss (.bss) section are absolutely
addressed. The assembler code looks like the following two lines:
    lui     $v1, 16-bit-upper-address
    lw      $v1, 16-bit-offset($v1)
Because addressing an item through $gp takes one instruction instead of
the two needed by the general method, you should put as many items as
possible in the .sdata or .sbss sections.
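The two addressing forms can be illustrated with a little address arithmetic. The sketch below assumes the usual MIPS convention that lw sign-extends its 16-bit offset, so the upper half loaded by lui is adjusted by 0x8000; the function names are hypothetical and the code only computes numbers, it does not emit instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 16 bits for lui, adjusted because lw sign-extends its offset. */
static uint16_t addr_hi(uint32_t addr)
{
    return (uint16_t)((addr + 0x8000u) >> 16);
}

/* Signed 16-bit offset for lw, in the range -32768..32767. */
static int32_t addr_lo(uint32_t addr)
{
    int32_t lo = (int32_t)(addr & 0xFFFFu);
    return (lo >= 0x8000) ? lo - 0x10000 : lo;
}

/* Recombine the halves the way lui followed by lw forms the address. */
static uint32_t addr_join(uint16_t hi, int32_t lo)
{
    assert(lo >= -32768 && lo <= 32767);
    return ((uint32_t)hi << 16) + (uint32_t)lo;
}

/* Nonzero if 'addr' can be reached with a single gp-relative lw. */
static int gp_reachable(uint32_t gp, uint32_t addr)
{
    int64_t delta = (int64_t)addr - (int64_t)gp;
    return delta >= -32768 && delta <= 32767;
}
```

The gp_reachable() check shows why the combined small sections are limited to 64 Kbytes: only offsets that fit in a signed 16-bit field can be encoded in the single lw.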
To optimize code execution, you can intentionally force the frequently
referenced data structures (that is, .sdata and .sbss sections) to be
located in the data RAM area. During the design phase, part of the
physical memory is allocated and the data RAM is mapped into it by
setting the corresponding tag address.
Note:
When setting the tag address, use a cacheable virtual
address.
The .sdata section, the .sbss section, and the stack are then forced into
the data RAM range. This can be achieved in two ways, as described
below.
The first way is relatively simple. In the C preamble, as described in
Section 10.3, $gp and $sp are set so that the .sbss section and the
stack fall into the data RAM range. Note that in this method, you should
declare all the global variables as noninitialized variables and do the
initialization in the code with expressions. In this way, all the global
variables will be in the .sbss section and none in the .sdata section.
Normally, when the linker links the object files, it creates a _gp symbol
and its value should be assigned to the $gp register. After you modify
the $gp register, the whole .sbss section is allocated to a different
memory (data RAM in this case). Since .sbss holds only noninitialized
variables and their default values should be zero, it does not matter
where the .sbss section is located, provided it does not overwrite any other
legal memory contents. The same reasoning applies to the stack which
holds all the local variables. Since the .sdata section contents are
initialized, you cannot simply change the $gp value. To do so would result
in incorrect data addressing and fetching.
Note:
It is your responsibility to make sure that no actual data
from the .sbss section and the stack overlap when they are
put into the same cache set. You can always allocate one
cache set to the .sbss section and the other set to the
stack.
The second method is more complex. You can modify the linker script to
set the .sdata and .sbss sections’ starting and ending addresses and the
$gp value according to the design. Then let the compiler and linker put
all global variables into these two sections with the proper -G option. The
data RAM now is exactly mapped to the .sdata and .sbss sections. In
the program, after setting the Dcache tags, you will need to load the
contents of the .sdata section into the Dcache before enabling the data
RAM mode. In this way, all global and local variables are intentionally put
within the range of the data RAM. During normal operation, all references
to these variables go to the data RAM area and dramatically reduce the
data fetching time.
10.4.5 Icache and I-RAM Configuration
The Icache, similar to the Dcache, may be configured as I-RAM.
However, in contrast to the Dcache, only Set 1 may be configured as
either Icache or I-RAM. Set 0 can be configured only as cache. The
I-RAM set is a 4 Kbyte, single-cycle SRAM contained within the ATMizer
II+ chip. If the Icache scratch-pad RAM mode is enabled, an access to
the scratch-pad RAM mapped area is a secondary memory access
without any stall cycle.
Set 1 of the Icache is configured as I-RAM by setting the IR1 bit of the
CCC register. The procedure to configure an Icache as I-RAM is similar
to that described in the Dcache section. Set or clear the following CCC
register bits as indicated:
CCC_IR1 = 1 - Configure Icache Set 1 as scratch-pad RAM
CCC_ISC = 1 - Isolate Cache Mode Enable
CCC_INV = 0 - Invalidate Mode Disable
CCC_TAG = 1 - Tag Test Mode Enable
CCC_IS0 or CCC_IS1 = 1 - Set 0 or Set 1 Enable
The Icache tag setting is also similar to the Dcache tag setting with the
data format shown in Figure 10.5, except that only the upper 20 bits are
for the tag (since the Icache set size is 4 Kbytes) and the WB bit is
ignored.
10.4.6 Icache and I-RAM Usage
The I-RAM set may hold up to 1 K instructions of the user-written
firmware to power the APU. These instructions are frequently referenced
and need to reside permanently in the I-RAM to reduce the instruction
fetching time. The two sets of Icache can hold up to 2 K instructions.
Based on different application code sizes, you may configure the Icache
differently. For code size smaller than 8 Kbytes, you may configure two
4 Kbyte Icache sets and let the APU automatically load the instructions
into the Icaches after the first reference. If the total code size is larger
than 8 Kbytes, you may separate the code into two parts and configure
Set 0 as regular Icache and Set 1 as I-RAM. The frequently referenced
code part (for example, the interrupt handler) is preloaded into the
I-RAM. The other part (for example, the initialization routine) is run
regularly with only one Icache set. Another choice is to dedicate all 8
Kbytes of Icache to one part of the code and locate the rest of the code
in off-chip memory. Note that it is not possible to configure Icache Set 0
as I-RAM. However, you can obtain the same effect by leaving the two
sets as true instruction caches, mapping the code that should reside in
the Icache into a cacheable memory area and all other code into the
noncacheable area. It is not necessary to load instructions into Icaches
since the APU will do that after the first code access.
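The sizing rules in this section reduce to simple arithmetic. The following C sketch restates them under the assumption that code of exactly 8 Kbytes still fits the two cache sets (the text only distinguishes "smaller" and "larger"); all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ICACHE_SET_BYTES 4096u  /* one Icache set                 */
#define INSTR_BYTES      4u     /* one 32-bit APU instruction     */

typedef enum {
    ICFG_TWO_CACHE_SETS,   /* both sets used as regular Icache    */
    ICFG_CACHE_PLUS_IRAM   /* Set 0 as Icache, Set 1 as I-RAM     */
} icache_cfg_t;

/* Pick a configuration from the total code size in bytes. */
static icache_cfg_t choose_icache_cfg(size_t code_bytes)
{
    assert(code_bytes > 0);
    return (code_bytes <= 2u * ICACHE_SET_BYTES) ? ICFG_TWO_CACHE_SETS
                                                 : ICFG_CACHE_PLUS_IRAM;
}

/* Number of instructions one I-RAM set can hold. */
static size_t iram_capacity_instructions(void)
{
    return ICACHE_SET_BYTES / INSTR_BYTES;
}

/* Nonzero if the frequently referenced code fits in the I-RAM set. */
static int iram_fits(size_t hot_code_bytes)
{
    return hot_code_bytes <= ICACHE_SET_BYTES;
}
```

The capacity function reproduces the 1 K instruction figure quoted above for a single 4 Kbyte set.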
To separate the code into several parts, you can modify the linker script
to create multiple, noncontiguous, text sections. As an example, the
following code is part of a linker script used by the GNU linker
(Figure 10.7):
Figure 10.7 Separating the Code with the Linker Script
 1  SECTIONS
 2  {
 3    /* Read-only sections, merged into text segment: */
 4    .iram 0x80b00000 :
 5    {
 6      _sIram = . ;
 7      APU_Loop.o(.text)
 8      APU_CBR.o(.text)
 9      APU_VBR.o(.text)
10      APU_ABR.o(.text)
11      APU_Cell.o(.text)
12      APU_Compl.o(.text)
13      APU_HostMsg.o(.text)
14      APU_Error.o(.text)
15      Ring.o(.text)
16      _eIram = . ;
17    }
18    .loader 0xa0b20000 :
19    {
20      _sLoader = .;
21      APU_Ram.o(.text)
22      _eLoader = .;
23    }
24    .rest 0x80b21000 :
25    {
26      *(.text)
27    }
In the previous code example, three noncontiguous text sections were
created: .iram, .loader, and .rest. The .iram section starts from
symbol _sIram in line 6 and ends at symbol _eIram in line 16. In the
following code example, the size of .iram is calculated by subtracting
_sIram from _eIram. Then the Icache loading routine is called to load the
.iram section into the I-RAM.
Note:
A similar technique also is used to determine the size of the
code when loading from serial PROM.
As in the above script, the Icache/Dcache tag setting routine and the
Icache loading routine (APU_Ram.o) are linked to the noncacheable area
(.loader section) since they must run without going through the caches. The
.iram section is linked to the cacheable area since it should be running
from the Icache. The .rest section is also linked to a cacheable area.
After the Icache is loaded and I-RAM mode is enabled, the program
counter jumps to the starting routine in the .iram section. Since the
program counter jumps from the noncacheable area to the cacheable
area, whose address differs in the upper bits, the jalr instruction should be
used for the jump rather than the jal instruction, whose 26-bit target field
can reach only addresses within the current 256 Mbyte region.
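Whether a jal can reach a given target can be checked with one comparison. The sketch below relies on the standard MIPS definition of jal (the upper four address bits are taken from the delay-slot address); the function name is hypothetical, and the addresses used in the note are the section addresses of Figure 10.7.

```c
#include <assert.h>
#include <stdint.h>

/*
 * jal keeps the upper 4 bits of the delay-slot address (pc + 4) and
 * replaces the low 28 bits with the 26-bit field shifted left by 2,
 * so the target must lie in the same 256 Mbyte region as pc + 4.
 */
static int jal_can_reach(uint32_t pc, uint32_t target)
{
    assert((pc & 3u) == 0 && (target & 3u) == 0);  /* word aligned */
    return ((pc + 4u) & 0xF0000000u) == (target & 0xF0000000u);
}
```

A jump from the .loader section at 0xa0b20000 to the .iram section at 0x80b00000 fails this check, which is why jalr (register target) is needed, while a jump from .rest at 0x80b21000 to .iram passes it.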
Also, external variables _sIram and _eIram should be declared as
arrays of unknown length to avoid any relocation errors caused by the -G
option. The constant definitions are left out of the example in Figure 10.8
for simplicity.
Figure 10.8 Main Loop Example
/*________________________________________________________
 *
 *                      MAIN LOOP
 *________________________________________________________
 */
int main(void)
{
    Func *f = (Func *)IRAM_START;
    extern char _sIram[], _eIram[];
    char *src = _sIram;
    char *dst = _eIram;
    ulong iram_size = mini( (dst - src), IRAM_SIZE);

    /*
     * set tag field to map the DRAM to the memory
     */
    setDram(DRAM_START & 0xffffff, 1);

    /*
     * set tag field to map the IRAM to the memory
     */
    setIram(IRAM_START & 0xffffff);

    /*
     * Load the code into the iram
     */
    loadIram(0, (ulong)src, iram_size);

    /*
     * Initialize all necessary configurations for ATMizer-II+
     */
    Initialize();

    (* f)();
    return 0;
}
During the initialization period, the APU first needs to map the I-RAM to
the designated physical memory area and then load the instructions into
Icache. Mapping the I-RAM to a physical memory area is similar to
mapping the D-RAM. Isolate the Icache (ISC = 1) and put it in the tag
test mode (TAG = 1). The APU then stores the tag in the Icache. To load
the firmware into the I-RAM, enable the cache mechanism and put the
Icache in the data test mode. The APU first disables the cache to load
the instructions from the external memory. It then enables Icache Set 1
(IC1 = 1) and the cache isolated mode (ISC = 1), so the following
memory access goes only to the Icache. Thus the fetched instruction is
stored in Icache. The above procedure is repeated until the complete
.iram section is loaded into Icache. The Icache is then configured as
I-RAM (IR1 = 1, IC1 = 1) and the .iram section now resides permanently
in I-RAM. The following code (Figure 10.9) is an example of setting and
loading the I-RAM:
Figure 10.9 Setting and Loading IRAM
/****************************************************
 * setIram(addr)
 *   set tag field of set 1 to map the iram to addr, the
 *   addr should be the physical address.
 */
        .text
        .globl  setIram
        .ent    setIram

setIram:
        .set    noreorder
        subu    sp, 24                  # allocate min size context

        /* Fill the tags set1 with the coming addr in a0 */
        mfc0    t3, C0_CONFIG           # save the original CP0 configuration register
        nop
        nop
        nop
        move    t0, t3

        /*
         * disable cache mode
         * enable Tag test and Isolate cache mode for Icache set1
         */
        and     t0, ~(CCC_IE1 | CCC_IE0 | CCC_DE1 | CCC_DE0)
        or      t0, (CCC_TAG | CCC_ISC | CCC_IE1)
        mtc0    t0, C0_CONFIG           # load to the CP0 configuration register
        nop
        nop
        nop

        or      a0, 2                   # 2 = valid
        li      t1, 0
LOOP1:
        sw      a0, 0(t1)               # store the tag ram
        addiu   a0, a0, 32              # advance the tag value by one cache line (32 bytes)
        addiu   t1, t1, 32              # advance the tag position by one cache line (32 bytes)
        sltiu   t2, t1, 4096            # continue while position < 4 Kbytes
        bne     t2, zero, LOOP1
        nop

        mtc0    t3, C0_CONFIG           # restore the original CP0 configuration register
        nop
        nop
        nop

        .set    reorder
        addu    sp, 24                  # deallocate
        j       ra

        .end    setIram

/*************************************************************
 * loadIram(dst, src, n)
 *   copy n bytes, one word at a time, from src into iram at dst
 */
        .text
        .globl  loadIram
        .ent    loadIram

loadIram:
        .set    noreorder
        subu    sp, 24                  # allocate min size context

        mfc0    t3, C0_CONFIG           # save the original CP0 configuration register
        nop
        nop
        nop

LOOP2:
        /* disable IsC so that instructions can now be fetched */
        /* from the memory                                     */
        /*
         * disable cache
         * disable Tag test and Isolate cache mode
         */
        move    t0, t3
        and     t0, ~(CCC_IE1 | CCC_IE0 | CCC_TAG | CCC_ISC | CCC_DE1 | CCC_DE0)
        mtc0    t0, C0_CONFIG           # load to the CP0 configuration register
        nop
        nop
        nop
        lw      t1, 0(a1)               # load an instruction from the memory

        /* Enable IsC so that instructions can now be written  */
        /* into the data part of the iram                      */
        /*
         * enable Icache set1 with size 4k.
         * enable Isolate cache mode
         */
        or      t0, (CCC_IE1 | CCC_IS4 | CCC_ISC)
        mtc0    t0, C0_CONFIG           # load to the CP0 configuration register
        nop
        nop
        nop
        sw      t1, 0(a0)               # store the instruction into the iram

        addiu   a0, a0, 4               # advance the dst by 4 bytes
        addiu   a1, a1, 4               # advance the src by 4 bytes
        sub     a2, a2, 4               # decrement the size by 4 bytes
        bgez    a2, LOOP2               # continue if n >= zero
        nop

        /*
         * IRAM operation on Icache Bank 1 is enabled by setting both IE1
         * and IR1 bits.
         * Also, enable WB mode to speed up the performance.
         */
        or      t3, CCC_IR1 | CCC_IE1 | CCC_IS4 | CCC_WB
        mtc0    t3, C0_CONFIG           # load the original configuration plus the I-RAM bits
        nop
        nop
        nop

        .set    reorder
        addu    sp, 24                  # deallocate
        /* change to non-cachable address */
        j       ra

        .end    loadIram
10.5 Configuration Header File
Table 10.2 describes the contents of the configuration header file
(config.h). All the parameters for the system configuration (shared or
not) by the ATMizer II+ chip and the host are defined in this file. The
ATMizer II+ chip and the host initialize the system according to this
header file. Changes in the corresponding values of the parameters will
adjust the configuration of the system.
Table 10.2 Configuration Header File Contents
Name                      APU (1)            Host (1)                 Description
MaxOpenCon                1024               1024                     Maximum open connections allowed

APU -> Host RxRing
RxRing_Credit             16                 16                       APU -> Host RxRing credit value
pRxRing_Credit_APU/Host   0xA806.8480        0xBA06.8480              APU -> Host RxRing credit address
RxRing_Base_APU/Host      0xA806.8400        0xBA06.8400              APU -> Host RxRing base
RxRing_Size               32                 32                       APU -> Host RxRing size

Host -> APU TxRing
TxRing_Credit             16                 16                       Host -> APU TxRing credit value
pTxRing_Credit_APU/Host   0xB000.0080        0xB400.0080              Host -> APU TxRing credit address
TxRing_Base_APU/Host      0xB000.0000        0xB400.0000              Host -> APU TxRing base
TxRing_Size               32                 32                       Host -> APU TxRing size

Commands Related
HCD_MessBase_APU/Host     0xA806.8484        0xBA06.8484              Common location for open connection message data
Stats_MessBase_APU/Host   0xA806.8584        0xBA06.8584              Common location for statistics message data

Rx Direction
Rx_SBuffSize              64                 64                       Size of small buffers for Rx data
Rx_LBuffSize              256                256                      Size of large buffers for Rx data
Rx_SBuffCount0            170                170                      Number of small buffers for Rx data, list 0
Rx_LBuffCount0            170                170                      Number of large buffers for Rx data, list 0
Rx_SBuffCount1            170                170                      Number of small buffers for Rx data, list 1
Rx_LBuffCount1            170                170                      Number of large buffers for Rx data, list 1
Rx_SBuffCount2            170                170                      Number of small buffers for Rx data, list 2
Rx_LBuffCount2            170                170                      Number of large buffers for Rx data, list 2
Rx_SBuffCount3            170                170                      Number of small buffers for Rx data, list 3
Rx_LBuffCount3            170                170                      Number of large buffers for Rx data, list 3
Rx_SBuffCount4            170                170                      Number of small buffers for Rx data, list 4
Rx_LBuffCount4            170                170                      Number of large buffers for Rx data, list 4
Rx_SBuffCount5            170                170                      Number of small buffers for Rx data, list 5
Rx_LBuffCount5            170                170                      Number of large buffers for Rx data, list 5
RxBFDCount                2040               2040                     Total RxBFD in all 6 Small and Large lists
RxBFDBase_APU/Host        0xA800.0000        0xBA00.0000              Rx BFD table base address
RxSBuff_APU/Host          0xA801.0000        0xBA01.0000              Rx small buffers pool base address
RxLBuff_APU/Host          0xA802.0000        0xBA02.0000              Rx large buffers pool base address

Tx Direction
Tx_BuffSize0              1024               1024                     Maximum size of buffers for Tx data, list 0
Tx_BuffSize1              1024               1024                     Maximum size of buffers for Tx data, list 1
Tx_BuffSize2              1024               1024                     Maximum size of buffers for Tx data, list 2
Tx_BuffSize3              1024               1024                     Maximum size of buffers for Tx data, list 3
Tx_BuffSize4              1024               1024                     Maximum size of buffers for Tx data, list 4
Tx_BuffSize5              1024               1024                     Maximum size of buffers for Tx data, list 5
Tx_BuffSize6              1024               1024                     Maximum size of buffers for Tx data, list 6
Tx_BuffSize7              1024               1024                     Maximum size of buffers for Tx data, list 7
Tx_BuffCount0             256                256                      Number of buffers for Tx data, list 0
Tx_BuffCount1             256                256                      Number of buffers for Tx data, list 1
Tx_BuffCount2             256                256                      Number of buffers for Tx data, list 2
Tx_BuffCount3             256                256                      Number of buffers for Tx data, list 3
Tx_BuffCount4             256                256                      Number of buffers for Tx data, list 4
Tx_BuffCount5             256                256                      Number of buffers for Tx data, list 5
Tx_BuffCount6             256                256                      Number of buffers for Tx data, list 6
Tx_BuffCount7             256                256                      Number of buffers for Tx data, list 7
TxBFDCount                2048               2048                     Total Count of TxBFDs in 8 lists
TxBFDBase_APU/Host        0xA800.8000        0xBA00.8000              Tx BFD table base address
TxBuff_APU/Host           0xA806.0000        0xBA06.0000              Tx buffers pool base address

EDMA Related
EDMA_BFD_FBase            0xA800.0000        n/a                      Buffer Descriptor table in primary memory
EDMA_BFD_LBase            0xA080.0000        n/a                      Buffer Descriptor table in secondary memory
EDMA_VCD_Base             0xA060.0000        n/a                      Virtual Connection Descriptor table base
EDMA_TxBFD_Copy           1                  n/a                      Tx BFD local or far mode
EDMA_RxBFD_Copy           1                  n/a                      Rx BFD local or far mode
EDMA_TxBFD_Far            n/a                n/a                      Tx BFD are copied to/from far address
EDMA_RxBFD_Far            n/a                n/a                      Rx BFD are copied to/from far address
EDMA_ConReAct             0                  n/a                      Enable connection reactivation in Tx direction
EDMA_ByteSwap             0                  n/a                      Ctrl byte swapping for cell transferring
EDMA_Compat               0                  n/a                      Compatibility in byte swapping for off-word boundary buffers with L64364.
EDMA_OrHdr                0                  n/a                      Ctrl the generation and extraction of the cell header
EDMA_Ctrl                 see note 2         n/a                      EDMA control fields

Scheduler Related
SCD_CalBase0              0xA061.0000 >> 9   n/a                      Calendar 0 base address
SCD_CalBase1              0xA061.0400        n/a                      Calendar 1 base address
SCD_CalBase2              0xA061.0800        n/a                      Calendar 2 base address
SCD_CalBase3              0xA061.0C00        n/a                      Calendar 3 base address
SCD_FlatMode              see note 2         n/a                      Scheduler’s operating mode: flat or priority
SCD_VCDinCB               0                  n/a                      Number of VCDs in cell buffer
SCD_Cal_Size0             1024               n/a                      Number of cell slots in calendar 0
SCD_Cal_Size1             1024               n/a                      Number of cell slots in calendar 1
SCD_Cal_Size2             1024               n/a                      Number of cell slots in calendar 2
SCD_Cal_Size3             1024               n/a                      Number of cell slots in calendar 3
SCD_Ctrl                  see note 3         n/a                      Scheduler control fields

ACI Related
ACI_TxSize                16                 n/a                      Maximum number of cells in Tx FIFO
ACI_RxLimit               4                  n/a                      Threshold in RxFIFO to generate an interrupt
ACI_TxLimit               8                  n/a                      Threshold in TxFIFO to generate an interrupt
ACI_RxMask                0x00FF.FFFF        n/a                      Rx polling mask
ACI_Phy                   0                  n/a                      Phy physical address to respond to in slave mode
ACI_LoopBack              see note 2         n/a                      Set for on-chip loop back
ACI_Parity                0                  n/a                      Enable for Utopia parity generation and error detection
ACI_CellSize              00                 n/a                      The cell size
ACI_HEC                   1                  n/a                      Set to generate or verify the HEC bit
ACI_TxIdle                0                  n/a                      Set to generate idle cells when Tx FIFO is empty
ACI_FixedPr               0                  n/a                      Set to enable the priority of Phy device in Rx direction
ACI_Slave                 0                  n/a                      Ctrl the master/slave operation of the Utopia bus
ACI_DirectPoll            0                  n/a                      Enable direct or multiplexed polling scheme
ACI_Reset                 0                  n/a                      Set the Tx and Rx state machines to idle state
ACI_Ctrl                  see note 4         n/a                      ACI control fields
ResvCB                    0                  n/a                      In words, needed to calculate ACI_freelist

Host Private
HCD_Base                  n/a                defined at compile time  Host Connection Descriptor table in private memory

1. n/a = not applicable
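Several entries of Table 10.2 depend on each other, so a change to one count should be checked against the totals. The following C sketch shows one way such a check might look; the macro and function names are hypothetical stand-ins (the real config.h is not reproduced here), and only the numeric values are taken from the table.

```c
#include <assert.h>

/* Values copied from Table 10.2; the macro names are illustrative only. */
#define RX_LIST_COUNT    6     /* Rx lists 0..5                    */
#define RX_SBUFF_COUNT   170   /* Rx_SBuffCountN, per list         */
#define RX_LBUFF_COUNT   170   /* Rx_LBuffCountN, per list         */
#define RX_BFD_COUNT     2040  /* RxBFDCount                       */
#define TX_LIST_COUNT    8     /* Tx lists 0..7                    */
#define TX_BUFF_COUNT    256   /* Tx_BuffCountN, per list          */
#define TX_BFD_COUNT     2048  /* TxBFDCount                       */
#define RING_SIZE        32    /* RxRing_Size and TxRing_Size      */
#define RING_CREDIT      16    /* RxRing_Credit and TxRing_Credit  */

/* Total Rx buffer descriptors implied by the per-list counts. */
static int rx_bfd_total(void)
{
    return RX_LIST_COUNT * (RX_SBUFF_COUNT + RX_LBUFF_COUNT);
}

/* Total Tx buffer descriptors implied by the per-list counts. */
static int tx_bfd_total(void)
{
    return TX_LIST_COUNT * TX_BUFF_COUNT;
}

/* Abort if the derived totals disagree with the configured ones. */
static void config_check(void)
{
    assert(rx_bfd_total() == RX_BFD_COUNT);
    assert(tx_bfd_total() == TX_BFD_COUNT);
    assert(RING_CREDIT <= RING_SIZE);
}
```

With the default values the totals agree: 6 x (170 + 170) = 2040 Rx descriptors and 8 x 256 = 2048 Tx descriptors.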
10.6 Host PCI Access
For the host, the accesses to the ATMizer II+ CBM, Mailbox FIFO,
XPP_Control register, and secondary memory are handled through the
PCI Bus. The following discussion assumes that the ATMizer II+ is a
satellite.
10.6.1 PCI Bus Configuration
Before accessing the data for read or write, the software has to initialize
and configure the PCI Bus. To do so, it has to set up the SAR PCI
configuration space.
The ATMizer II+ chip supports type 0 configuration space access. PCI
configuration space registers are shown in Figure 10.10. Shaded
registers in the figure are not used by the ATMizer II+ chip. Configuration
space writes to unused registers are completed normally, although data
is ignored. Configuration space reads of unused registers are completed
normally with all data bits 0. Refer to the L64364 ATMizer II+ ATM-SAR
Chip Technical Manual for more detail.
Figure 10.10 PCI Configuration Space Registers
    Offset   Fields (most significant first)
    0x00     Device ID (31:16), Vendor ID (15:0)
    0x04     Status (31:16), Command (15:0)
    0x08     Class Code (31:8), Revision ID (7:0)
    0x0c     BIST (31:24), Header Type (23:16), Latency Timer (15:8), Cache Line Size (7:0)
    0x10     Base Address Register 1
    0x14     Base Address Register 2
    0x18     Base Address Register 3
    0x1c     Base Address Register 4
    0x20     Base Address Register 5
    0x24     Base Address Register 6
    0x28     Cardbus CIS Pointer
    0x2c     Subsystem ID (31:16), Subsystem Vendor ID (15:0)
    0x30     Expansion ROM Base Address
    0x34     Reserved
    0x38     Reserved
    0x3c     Max Latency (31:24), Min Grant (23:16), Interrupt Pin (15:8), Interrupt Line (7:0)
Note:
The configuration space registers are documented in the
PCI Bus little endian format (least significant byte is byte 0).
The PCI configuration space is accessed by the host with the address
format described in Figure 10.11.
Figure 10.11 PCI Configuration Address Format
    Bits     Value
    31:24    1 0 1 1 0 1 0 0
    23:6     Don’t Care
    5:0      Offset
Hexadecimal base address: 0xB400.0000
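The address format of Figure 10.11 amounts to combining the fixed upper byte with a 6-bit register offset. A minimal C sketch follows; the macro and function names are hypothetical, and the don't-care bits are simply left at zero.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CFG_BASE     0xB4000000u  /* bits 31:24 = 1011 0100          */
#define PCI_CFG_OFF_MASK 0x0000003Fu  /* bits 5:0 carry the offset       */

/* Host virtual address of a configuration space byte offset (0x00-0x3F). */
static uint32_t pci_cfg_addr(uint32_t offset)
{
    assert(offset <= PCI_CFG_OFF_MASK);
    return PCI_CFG_BASE | offset;
}
```

For example, offset 0x10 (Base Address Register 1) maps to host address 0xB400.0010.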
The first thing the host must do is set the Command field in the PCI
configuration space registers (offset 0x06, virtual address 0xB400.0006) to
a default value. A reasonable setting is 0x0006. To enable configuration
write cycles to the SAR, the bridge chip must first be enabled with the
sequence shown in Figure 10.12.
Figure 10.12 Programming the Latency Timer in the PCI Configuration Register
/* Program the Latency timer in the Configuration register */
*((uchar *) 0xb800005e) = 0x0a;
*((uchar *) 0xbd000000) = 0xe7;
printf("Programming Command register; ");
pPCI_Conf->Command = 0x06;
printf("Programmed Command register.");
*((uchar *) 0xbd000000) = 0xff;
*((uchar *) 0xb800005e) = 0x06;
Further details on the programming of the PCI configuration registers can
be obtained from the L64364 ATMizer II+ ATM-SAR Chip Technical
Manual.
10.6.2 PCI Access to the ATMizer II+ Memory Space
Next, it is necessary to set up the PCI address space that will be used
to access the SAR memory. This is done by writing the base address of
the memory space you want to use into Base Address registers 1 and 2
(offsets 0x10 and 0x14) in the PCI configuration space. Each address can
be one of the four PCI base addresses described in Table 10.3.
Note that there are four addresses, but you actually need only two
of them. Only Base Address registers 1 and 2 are defined in the PCI
configuration space of the ATMizer II+ chip. The other two addresses could
be used if a second ATMizer II+ chip were connected to the same PCI
Bus.
Base Address register 1 defines slave transfers to the ATMizer II+ CBM,
Mailbox FIFO, and XPP_Control register. The memory map for this
address range is shown in Table 10.4. Base Address register 2 maps the
ATMizer II+ secondary memory into PCI memory space. Refer to
Table 10.5.
Once the base address registers are set, the host must use the address
format specified in Table 10.3 when it wants to access the ATMizer II+
memory space through the PCI bus.
Table 10.3 PCI Virtual Address vs. Base Addresses
    PCI Base Address (1)    Host Virtual Base Address (2)
    0xB500.0000             0xB500.0000
    0xB600.0000             0xB600.0000
    0xB580.0000             0xB580.0000
    0xB700.0000             0xB700.0000
1. The base address the ATMizer II+ chip will scan on the PCI Bus,
   according to the “Base Register 1 or 2” value.
2. The base address used by the host to access the ATMizer II+
   memory space.
Example:
If Base Address register 1 is set to 0xB500.0000, the virtual
base address the host must use to write to the ATMizer II+
CBM is 0xB500.0000. The CBM range then is 0xB500.0000
to 0xB500.0FFF. The hardware registers are accessed by
the host in little endian format starting at 0xB500.7000.
Table 10.4 ATMizer II+ External Memory Map
    PCI Memory       Module                Size
    0x0000–0x0FFF    Cell Buffer Memory    4 Kbyte
    0x4000–0x400F    Mailbox FIFO          16 bytes
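Given a Base Address register 1 value, the host addresses of the modules in Table 10.4 follow by simple addition. The C sketch below is illustrative only; the names are hypothetical, and it assumes the host virtual base equals the programmed PCI base, as in the example above.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets and sizes from Table 10.4; names are illustrative only. */
#define CBM_OFFSET      0x0000u
#define CBM_SIZE        0x1000u  /* 4 Kbytes  */
#define MAILBOX_OFFSET  0x4000u
#define MAILBOX_SIZE    0x0010u  /* 16 bytes  */

/* Host address of a byte inside the Cell Buffer Memory window. */
static uint32_t cbm_addr(uint32_t bar1_base, uint32_t byte_off)
{
    assert(byte_off < CBM_SIZE);
    return bar1_base + CBM_OFFSET + byte_off;
}

/* Host address of a byte inside the Mailbox FIFO window. */
static uint32_t mailbox_addr(uint32_t bar1_base, uint32_t byte_off)
{
    assert(byte_off < MAILBOX_SIZE);
    return bar1_base + MAILBOX_OFFSET + byte_off;
}
```

With a base of 0xB500.0000 the last CBM byte is at 0xB500.0FFF, which matches the range given in the example.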
Table 10.5 Secondary Bus Memory Map
    Start Address    End Address    Size        Device Type    Bus Size
    0x0000.0000      0x000F.FFFF    1 Mbyte     EPROM/SRAM     8
    0x0020.0000      0x002F.FFFF    1 Mbyte     PHY            8
    0x0040.0000      0x005F.FFFF    2 Mbytes    EPROM/SRAM     32
    0x0060.0000      0x007F.FFFF    2 Mbytes    SSRAM          32
    0x0080.0000      0x00FF.FFFF    8 Mbytes    SDRAM          32
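The memory map of Table 10.5 can be held in a small lookup table, for example to validate an offset before it is used. This is a sketch only; the structure and function names are hypothetical and the entries simply repeat the table rows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row of Table 10.5; the type name is illustrative only. */
typedef struct {
    uint32_t    start;     /* first offset of the region (inclusive) */
    uint32_t    end;       /* last offset of the region (inclusive)  */
    const char *device;    /* device type                            */
    unsigned    bus_bits;  /* bus size in bits                       */
} sec_region_t;

static const sec_region_t sec_map[] = {
    { 0x00000000u, 0x000FFFFFu, "EPROM/SRAM",  8 },
    { 0x00200000u, 0x002FFFFFu, "PHY",         8 },
    { 0x00400000u, 0x005FFFFFu, "EPROM/SRAM", 32 },
    { 0x00600000u, 0x007FFFFFu, "SSRAM",      32 },
    { 0x00800000u, 0x00FFFFFFu, "SDRAM",      32 },
};

/* Return the region containing 'offset', or NULL for an unmapped hole. */
static const sec_region_t *sec_lookup(uint32_t offset)
{
    size_t i;

    for (i = 0; i < sizeof sec_map / sizeof sec_map[0]; i++) {
        assert(sec_map[i].start <= sec_map[i].end);
        if (offset >= sec_map[i].start && offset <= sec_map[i].end)
            return &sec_map[i];
    }
    return NULL;
}
```

Note that the map has holes (for example 0x0010.0000 to 0x001F.FFFF), for which the lookup returns NULL.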
10.7 Memory Allocation
In the ATMizer II+ chip, MAX_CON_NUM (1024) connections can be
opened simultaneously. This value has a direct impact on the allocation
of different memory blocks.
Assuming that only two buffers per connection are used at a given
time, at most 2 K transmit buffers and 2 K receive buffers are needed
in system memory. The memory space needed by the different modules
of the ATMizer II+ chip must be allocated before the software can
enter the main loop and data transfer can begin. To make optimum
use of the memory available on the ADP, the Memory_t structure shown
in Figure 10.13 is used to allocate the memory
to the different data structures.
Figure 10.13 Allocating Memory to Data Structures
/*
 * Memory_t structure for memory allocation
 */
typedef struct {
    ulong Sram;
    ulong SramEnd;
    ulong Ssram;
    ulong SsramEnd;
    ulong Sdram;
    ulong SdramEnd;
    ulong Phy;
    ulong PhyEnd;
    ulong Shr;
    ulong ShrEnd;
} Memory_t, *pMemory_t;
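Each start/end pair in Memory_t describes a region from which blocks can be carved sequentially. The guide does not show the allocation routine itself, so the following C sketch is only one plausible way to do it (a simple bump allocator); the function name, the alignment parameter, and the return convention are assumptions.

```c
#include <assert.h>

/*
 * Hypothetical helper: carve 'size' bytes, aligned to 'align' (a power
 * of two), from the region [*cur, end). On success the start address of
 * the block is returned and *cur is advanced past it. Returns 0 if the
 * region cannot satisfy the request.
 */
static unsigned long region_alloc(unsigned long *cur, unsigned long end,
                                  unsigned long size, unsigned long align)
{
    unsigned long start;

    assert(align != 0 && (align & (align - 1)) == 0);
    start = (*cur + align - 1) & ~(align - 1);   /* round up to alignment */
    if (start > end || size > end - start)
        return 0;                                /* region exhausted      */
    *cur = start + size;
    return start;
}
```

A caller would pass the address of a region's start field and its end field, for example &Memory.Ssram and Memory.SsramEnd, so that successive allocations move the start pointer toward the end of the region.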
The size and location of the ATMizer code is determined by the compiler
variables “_ftext” and “_end”. The size of the code is computed as
shown in Figure 10.14.
Figure 10.14 ATMizer Code Size Calculation
#define CODE_SIZE (_end - _ftext)
The Memory_t variables are initialized to the starting and end points of
the memory available for allocation as shown in Figure 10.15.
Figure 10.15 Memory_t Variables Initialization
/*
 * Memory_t Initialization for memory allocation
 */
Memory_t Memory;
Memory.Sram     = APU_BASE_SEC + SRAM_OFFS + SRAM_PMON_OFF;
Memory.SramEnd  = APU_BASE_SEC + SRAM_OFFS + SRAM_SIZE;
Memory.Ssram    = APU_BASE_SEC + SSRAM_OFFS;
Memory.SsramEnd = APU_BASE_SEC + SSRAM_OFFS + SSRAM_SIZE;
Memory.Sdram    = APU_BASE_SEC + SDRAM_;
Memory.SdramEnd = APU_BASE_SEC + SDRAM_OFFS + SDRAM_SIZE;
Memory.Phy      = APU_BASE_SEC + PHY_OFFS;
Memory.PhyEnd   = APU_BASE_SEC + PHY_OFFS ;
Memory.Shr      = (ulong) MapForAPU((void *) &pPCI->Buff);
Memory.ShrEnd   = APU_BASE_PCI + PCI_SIZE;
Since the memory allocation starts after the ATMizer code in the
Secondary memory, the Memory pointers are updated based on the
location of the code in the Secondary memory as shown in Figure 10.16.
Figure 10.16 Updating Memory Pointers
switch ( ((ulong) _ftext >> 20) & 0xf ) {
    case 2: Memory.Phy += CODE_SPACE;
            break;
    case 4: Memory.Sram += CODE_S
            break;
    case 6: Memory.Ssram += CODE_SPACE;
            break;
    case 8: Memory.Sdram += CODE_SPACE;
            break;
    default : break;
}
This allows the software to be compiled to SSRAM or SDRAM without
modifying the initialization routines. The initialization of the secondary
memory is done by the InitSEC routine in the host code which initializes
the SCD calendar pointers, the VCD pointer, the BFD pointers and the
ACD pointer. These pointers are passed to the ATMizer II+ chip through
the configuration structure and are used in programming the hardware
registers.
The BFD numbers are allocated based on the EDMA_TxBFD_Far,
EDMA_TxBFD_Copy, EDMA_RxBFD_Far and EDMA_RxBFD_Copy bits
in the EDMA_Ctrl register. The allocation of the BFD numbers is done to
optimize the usage of memory space in the primary and secondary
memories as shown in Table 10.6.
Table 10.6 BFD Number Allocation
TxBFD Location   RxBFD Location   TxBFD Number     RxBFD Number
Local            Local            1                TxBFDCount + 1
Local            Far              1                1
Local            Copy             RxBFDCount + 1   1
Far              Local            1                1
Far              Far              1                TxBFDCount + 1
Far              Copy             RxBFDCount + 1   1
Copy             Local            1                TxBFDCount + 1
Copy             Far              1                TxBFDCount + 1
Copy             Copy             1                TxBFDCount + 1
When the TxBFDs are in local memory (secondary memory) and the
RxBFDs are in the far memory (primary PCI memory), then the BFD
numbers for both can start at 1. On the other hand, if the RxBFD
numbers start at (TxBFDCount + 1) as in the case when both Tx and
RxBFDs are in the secondary memory, then the memory space
corresponding to BFD numbers 1 to TxBFDCount is not used in the far
memory.
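The allocation rule in Table 10.6 can be sketched as a small lookup function. The BfdLoc enum, the BfdStart structure, and the FirstBfdNumbers function are hypothetical names introduced for illustration; only the starting numbers themselves come from the table.

```c
#include <assert.h>

/* Hypothetical names for illustration; values follow Table 10.6. */
typedef enum { BFD_LOC_LOCAL = 0, BFD_LOC_FAR = 1, BFD_LOC_COPY = 2 } BfdLoc;

typedef struct {
    unsigned long FirstTx;  /* first TxBFD number */
    unsigned long FirstRx;  /* first RxBFD number */
} BfdStart;

/* Return the first TxBFD and RxBFD numbers for a given pair of locations. */
BfdStart FirstBfdNumbers(BfdLoc tx, BfdLoc rx,
                         unsigned long TxBFDCount,
                         unsigned long RxBFDCount)
{
    BfdStart s = { 1, 1 };

    assert(tx <= BFD_LOC_COPY && rx <= BFD_LOC_COPY);

    if (tx != BFD_LOC_COPY && rx == BFD_LOC_COPY) {
        /* Local/Copy and Far/Copy rows: TxBFDs follow the RxBFDs. */
        s.FirstTx = RxBFDCount + 1;
    } else if (tx == BFD_LOC_COPY || tx == rx) {
        /* Copy/any, Local/Local, and Far/Far rows: RxBFDs follow the TxBFDs. */
        s.FirstRx = TxBFDCount + 1;
    }
    /* Local/Far and Far/Local rows: both ranges start at 1. */
    return s;
}
```

With 2 K TxBFDs and both lists in local memory, the RxBFD numbers start at 2049; with the TxBFDs local and the RxBFDs far, both start at 1.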
10.7.1 Receive Direction
For the receive direction, the buffers, the RxBFDs, and the RxRing
should be in shared memory (primary memory). If the RxBFDs are in
packet mode, a copy of the buffers and RxBFDs may also be in
secondary memory (SDRAM).
RxBFDs - The BFD size is 16 bytes, so 32 Kbytes (16 x 2 K) of primary
and secondary memory are required.
Buffer Pool - In the receive direction, 64 bytes are required for small
buffers and 256 bytes for large buffers. So, the two available buffer pools
are:
Small buffers: 64 x 1 K = 64 Kbytes of primary and secondary
memory
Large buffers: 256 x 1 K = 256 Kbytes of primary and secondary
memory
RxRing - The RxRing contains 32 buffer numbers. RX_RING_SIZE,
therefore, is defined as 32. One more word is needed for TxRing credits,
so the RxRing requires (4 x RX_RING_SIZE) + 4 bytes or 132 bytes of
primary memory.
10.7.2 Transmit Direction
For the transmit direction, the buffers and the BFDs are located in
primary memory and the TxRing is in Cell Buffer Memory. If the ATMizer
II+ chip is in packet mode, a copy of the buffers and TxBFDs may
also be in secondary memory.
TxBFDs - The BFD is 16 bytes, so 32 Kbytes (16 x 2 K) of primary and
secondary memory are required.
Buffer Pool - To send buffer data of 2 Kbytes and open all 1 K
connections at initialization would require 2 Mbytes of primary memory
for the pool. However, since the contents of the transmitted buffers is not
important, they are overlapped in memory. The 1 K buffers are
overlapped every 16 bytes, i.e., buffer n+1 starts 16 bytes after the
beginning address of buffer n.
The space required then is:
(16 x 2 K) + 1024 - 16 = 33 Kbytes of primary and secondary
memory
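The overlapped pool size can be reproduced with the short sketch below. OverlappedPoolBytes is a hypothetical helper introduced for illustration; it assumes, as the formula above does, 2 K buffers of 1024 bytes each with consecutive buffers starting 16 bytes apart.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes spanned by `count` buffers of `size` bytes each when buffer n+1
 * starts `stride` bytes after buffer n. A stride smaller than the buffer
 * size means that the buffers overlap.
 */
size_t OverlappedPoolBytes(size_t count, size_t size, size_t stride)
{
    assert(count > 0);
    return (count - 1) * stride + size;
}
```

OverlappedPoolBytes(2048, 1024, 16) evaluates to (2047 x 16) + 1024 = 33,776 bytes, which matches (16 x 2 K) + 1024 - 16 and rounds to 33 Kbytes.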
TxRing - The TxRing is located in CBM on the ATMizer II+ chip for fast
access. The ring contains 32 buffer numbers. TX_RING_SIZE,
therefore, is defined as 32. One more word is needed for RxRing credits,
so the TxRing requires (4 x TX_RING_SIZE) + 4 bytes or 132 bytes of
CBM.
10.7.3 Connection Descriptors
VCDs (32 bytes), ACDs (32 bytes), and SCDs (4 bytes) should be
located in secondary memory. The memory allocation is as follows:
VCDs: 32 x 1 K x 2 = 64 Kbytes in secondary memory
ACDs: 32 x 1 K = 32 Kbytes in secondary memory
SCDs: 4 x 1 K = 4 Kbytes in secondary memory
The host maintains an array containing one Host Connection Descriptor
per requested connection in its private memory. The structure of the HCD
is described in Section 1.3.1.3. The HCDs require 64 x 1 K = 64 Kbytes
of host private memory.
10.7.4 Buffer Descriptors
The buffer pointers in the BFDs indicate whether the BFDs are in cell
mode or in packet mode. The pBuffData.SEC and pBuffData.PCI fields
of the BFD are initialized in the InitBFD routine. The BFD_FreeList field
of the BFD is used in the Rx direction to support six free lists. The
software can take advantage of this field in the Tx direction for supporting
up to eight free lists. Each list can then be put in cell or packet mode
with different buffer sizes, enabling a more sophisticated buffer
management scheme in the Tx direction. Similarly, eight BFD lists can
be initialized for the PreAttach BFDs.
When the BFD list is in cell mode with the buffers in secondary memory
or in packet mode, the secondary memory for the buffers can be chosen
to be in SSRAM, SDRAM or in SRAM. Furthermore, the starting address
of the buffer location can be selected to be 0, 1, 2 or 3. Therefore,
off-word boundary buffers can be supported by this initialization scheme.
The location of the buffers of the BFD list is determined by the
configuration variables Loc_BuffPCI and Loc_BuffSec. The format of the
variables is shown in Figure 10.17.
Figure 10.17 Loc_BuffPCI and Loc_BuffSec Format
Bits     Field
31:24    PreAttach 7 through PreAttach 0 (bit 31 = list 7, bit 24 = list 0)
23:22    Reserved
21:16    Rx Large 5 through Rx Large 0 (bit 21 = list 5, bit 16 = list 0)
15:14    Reserved
13:8     Rx Small 5 through Rx Small 0 (bit 13 = list 5, bit 8 = list 0)
7:0      Tx BFD7 through Tx BFD0 (bit 7 = list 7, bit 0 = list 0)
For each list, if the corresponding bit is set in Loc_BuffPCI, then the
buffer is located in PCI memory and the list is in cell mode in PCI
memory. Similarly, if the bit is set in Loc_BuffSec, then the buffer is
located in secondary memory. If both bits corresponding to a BFD list
are set, then the BFD list is in packet mode.
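The decoding rule for one list can be sketched as follows. The BuffMode enum, the shift macros, and the ListBuffMode function are hypothetical names introduced for illustration; the bit positions are taken from Figure 10.17, and the case in which neither bit is set is not described in this guide and is reported separately.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result codes for illustration. */
typedef enum {
    BUFF_NONE = 0,   /* neither bit set (not described in the text) */
    BUFF_CELL_PCI,   /* cell mode, buffer in PCI memory */
    BUFF_CELL_SEC,   /* cell mode, buffer in secondary memory */
    BUFF_PACKET      /* packet mode, both bits set */
} BuffMode;

/* Bit position of list 0 in each group, per Figure 10.17. */
#define TX_BFD_SHIFT     0   /* bits 7:0,   Tx BFD lists 0-7   */
#define RX_SMALL_SHIFT   8   /* bits 13:8,  Rx Small lists 0-5 */
#define RX_LARGE_SHIFT   16  /* bits 21:16, Rx Large lists 0-5 */
#define PREATTACH_SHIFT  24  /* bits 31:24, PreAttach lists 0-7 */

/* Decode the buffer mode of the list that owns bit position `bit`. */
BuffMode ListBuffMode(uint32_t Loc_BuffPCI, uint32_t Loc_BuffSec, unsigned bit)
{
    int pci, sec;

    assert(bit < 32);
    pci = (int)((Loc_BuffPCI >> bit) & 1u);
    sec = (int)((Loc_BuffSec >> bit) & 1u);

    if (pci && sec) return BUFF_PACKET;
    if (pci)        return BUFF_CELL_PCI;
    if (sec)        return BUFF_CELL_SEC;
    return BUFF_NONE;
}
```

For example, ListBuffMode(Loc_BuffPCI, Loc_BuffSec, RX_SMALL_SHIFT + 2) decodes the mode of Rx Small list 2.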
If a BFD list is in cell mode with the buffer in secondary memory or in
packet mode, then the location of the buffer in secondary memory can
be chosen. To do this, Sec_BuffLoc1 and Sec_BuffLoc0 are used. The
format of these variables is the same as above. For determining the
location of the buffers of TxBFD list 0, bit 0 of Sec_BuffLoc1 and
Sec_BuffLoc0 is used as shown in Table 10.7.
Table 10.7 Buffer Location in Secondary Memory
Sec_BuffLoc1 bit 0 / Sec_BuffLoc0 bit 0   Buffer Location
00                                        N/A
01                                        SSRAM
10                                        SDRAM
11                                        SRAM
Similarly, for other lists, the corresponding bits from Sec_BuffLoc1 and
Sec_BuffLoc0 determine the location of the buffer in secondary memory.
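The two-bit selection in Table 10.7 can be sketched as shown below. The SecBuffLoc enum and the ListSecLocation function are hypothetical names introduced for illustration; the encoding is taken from the table, with the Sec_BuffLoc1 bit as the high-order bit of the pair.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical codes for illustration; values match the rows of Table 10.7. */
typedef enum {
    SEC_LOC_NA    = 0,  /* 00: not applicable */
    SEC_LOC_SSRAM = 1,  /* 01 */
    SEC_LOC_SDRAM = 2,  /* 10 */
    SEC_LOC_SRAM  = 3   /* 11 */
} SecBuffLoc;

/* Combine bit `bit` of Sec_BuffLoc1 (high) and Sec_BuffLoc0 (low). */
SecBuffLoc ListSecLocation(uint32_t Sec_BuffLoc1, uint32_t Sec_BuffLoc0,
                           unsigned bit)
{
    unsigned hi, lo;

    assert(bit < 32);
    hi = (unsigned)((Sec_BuffLoc1 >> bit) & 1u);
    lo = (unsigned)((Sec_BuffLoc0 >> bit) & 1u);
    return (SecBuffLoc)((hi << 1) | lo);
}
```

For TxBFD list 0, ListSecLocation(Sec_BuffLoc1, Sec_BuffLoc0, 0) returns the SDRAM code when bit 0 of Sec_BuffLoc1 is set and bit 0 of Sec_BuffLoc0 is cleared.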
The offset of the buffer in PCI memory and secondary memory is
determined in a similar manner for all the BFD lists using the variables
Off_BuffPCI1 and Off_BuffPCI0 for PCI memory offset, and
Off_BuffSec1 and Off_BuffSec0 for secondary memory offset. Note that
in case of packet mode BFDs, the PCI offset and secondary offset
should be the same.
10.7.5 Data Exchanging Blocks
When issuing the open connection and get statistics commands,
the host and the APU need to share a common fixed location in primary
memory to exchange parameters and data.
open connection – When the host sends an open connection
command to the APU, the host copies the first 64 bytes of the HCD from
its HCD table in its private memory to a fixed location in primary memory.
Refer to Section 1.3.1.3 for details. The space required for the
host-to-APU connection parameters is 256 bytes (64 x 4 bytes) in primary
memory.
get statistics – When the host requests statistics from the APU, the
APU copies the relevant data to a fixed location in primary memory and
sends an acknowledgment to the host. The data is described in
Table 10.8. The space required for the statistics results is 256 bytes (64
x 4 bytes) in primary memory.
10.7.6 Related Issues
The following sections discuss other issues related to memory.
10.7.6.1 Cacheable and Noncacheable
In general, if multiple modules can access the same data structure, this
structure should be located in the noncacheable memory area to ensure
that the APU and the host can always fetch the updated values. Only the
APU and the host internally manipulated data structures (e.g., ACDs and
HCDs) should be located in the cacheable area. All the other data
structures (e.g., BFDs, VCDs, SCDs, Rings, Credits, acknowledgments,
Statistics Results, and host-to-APU parameters) should reside in the
noncacheable area.
10.7.6.2 Memory Access
APU accesses to secondary memory depend on the following factors:
• The number of connections and the connection rates determine
  whether ACDs are in the data cache or a cache line needs to be
  fetched.
• Connection QoS determines ACD size. CBR and UBR traffic typically
  require less ACD access (fewer bytes). ABR traffic requires full ACD
  access (32 bytes) for RM cells (typically 2 out of 32 cells).
• Connection lookup can be done by masking and shifting the cell
  headers and does not require any memory access.
The number of SCD accesses to secondary memory depends on the
scheduler mode (Flat or Priority). Priority mode execution time is variable
because of the dependence on the calendar table connection linked-list
length. Flat mode has constant connection-searching time.
10.7.6.3 Caching Policy
ACDs can be allocated to SSRAM, so the write-through cache policy is used.
10.8 Hardware Registers Initialization
Hardware registers should be initialized correctly to make each hardware
module work properly. Various hardware register configuration options
are described in this section.
Refer to Appendix A, “Register Summary,” of the L64364 ATMizer II+
ATM-SAR Chip Technical Manual for references to register layout and
content information.
Table 10.8 lists and describes all of the hardware registers that need to
be initialized.
Table 10.8 ATMizer II+ Hardware Registers to be Initialized
Register           Description
EDMA_Ctrl          EDMA Control fields
EDMA_BFD_FBase     Buffer Descriptor table in primary memory
EDMA_BFD_LBase     Buffer Descriptor table in secondary memory
EDMA_SBuffSize     Size of small buffers for Rx data
EDMA_LBuffSize     Size of large buffers for Rx data
EDMA_SBuff0        Head of small free buffer list 0
EDMA_LBuff0        Head of large free buffer list 0
EDMA_SBuff1        Head of small free buffer list 1
EDMA_LBuff1        Head of large free buffer list 1
EDMA_SBuff2        Head of small free buffer list 2
EDMA_LBuff2        Head of large free buffer list 2
EDMA_SBuff3        Head of small free buffer list 3
EDMA_LBuff3        Head of large free buffer list 3
EDMA_SBuff4        Head of small free buffer list 4
EDMA_LBuff4        Head of large free buffer list 4
EDMA_SBuff5        Head of small free buffer list 5
EDMA_LBuff5        Head of large free buffer list 5
EDMA_VCD_Base      Virtual Connection Descriptor table base
SCD_Ctrl           Scheduler Control register
SCD_CalBase1       Base address of calendar 1
SCD_CalBase2       Base address of calendar 2
SCD_CalBase3       Base address of calendar 3
SCD_CalSize0       Number of cell slots in calendar 0
SCD_CalSize1       Number of cell slots in calendar 1
SCD_CalSize2       Number of cell slots in calendar 2
SCD_CalSize3       Number of cell slots in calendar 3
ACI_Ctrl           ACI Control field
ACI_RxMask         Rx polling mask
ACI_FreeList       Beginning of the free cell list
ACI_TxSize         Maximum number of cells in Tx FIFO
ACI_RxSize         Maximum number of cells in Rx FIFO
ACI_RxLimit        Threshold in Rx FIFO to generate an interrupt
ACI_TxLimit        Threshold in Tx FIFO to generate an interrupt
ACI_TxTimer        Cell holding time in Tx FIFO
TM_TimeStamp       Timestamp Counter
TM_Timer1          Timer 1 value
TM_TimerInit1      Timer 1 initialization value
TM_Timer2          Timer 2 value
TM_TimerInit2      Timer 2 initialization value
TM_Timer3          Timer 3 value
TM_TimerInit3      Timer 3 initialization value
TM_Timer4          Timer 4 value
TM_TimerInit4      Timer 4 initialization value
TM_Timer5          Timer 5 value
TM_TimerInit5      Timer 5 initialization value
TM_Timer6          Timer 6 value
TM_TimerInit6      Timer 6 initialization value
TM_Timer7          Timer 7 value
TM_TimerInit7      Timer 7 initialization value
TM_Enable          Time-out enable
TM_ClockSel        Timer clock selection
10.8.1 EDMA Registers
The EDMA registers are all prefixed with EDMA_ and are described in
the following paragraphs.
10.8.1.1 EDMA_Ctrl Register
Bits in this register determine the data and BFD transfer modes as
described in Table 10.9.
Table 10.9 Data and BFD Transfer Modes
Mode Type     Description
Cell Mode     Individual cells are exchanged between CBM and primary or
              secondary memory.
Packet Mode   Complete packets are exchanged between primary memory
              and secondary memory.
Far Mode      BFDs are located in primary memory.
Local Mode    BFDs are copied between secondary memory and primary
              memory.
Since write operations through the PCI Bus of the ATMizer II+ chip are
always faster than read operations, the best modes are Cell mode and
Local mode. In these, the host writes transmit cells and BFDs to
secondary memory and the ATMizer II+ EDMA writes received cells and
BFDs back to primary memory. This saves PCI Bus transmission time if
the host also has DMA capability.
If the host does not have DMA capability, use Packet and Far modes and
let the ATMizer II+ EDMA exchange the data and BFDs between primary
memory and secondary memory. The exchanges are then transparent to
the APU and the host.
The pBuffData.SEC and pBuffData.PCI fields of the BFDs point to
secondary memory and primary memory respectively. If both fields are
nonzero, then the BFD is in Packet mode. If pBuffData.SEC is zero, then
the BFD is in Cell mode with the buffer in primary memory. On the other
hand, if pBuffData.PCI is zero, then the BFD is in Cell mode with the
buffer in secondary memory.
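The pointer test described above can be sketched as a small classifier. The BfdMode enum and the BfdModeFromPointers function are hypothetical names introduced for illustration, and the two arguments stand for the pBuffData.SEC and pBuffData.PCI fields of a BFD. The case in which both fields are zero is not covered by the text and is reported as a separate value.

```c
#include <stdint.h>

/* Hypothetical result codes for illustration. */
typedef enum {
    BFD_MODE_UNSET,           /* both pointers zero (not described in the text) */
    BFD_MODE_CELL_PRIMARY,    /* cell mode, buffer in primary memory */
    BFD_MODE_CELL_SECONDARY,  /* cell mode, buffer in secondary memory */
    BFD_MODE_PACKET           /* packet mode, both pointers nonzero */
} BfdMode;

/* Classify a BFD from its pBuffData.SEC and pBuffData.PCI values. */
BfdMode BfdModeFromPointers(uint32_t sec, uint32_t pci)
{
    if (sec != 0 && pci != 0) return BFD_MODE_PACKET;
    if (pci != 0)             return BFD_MODE_CELL_PRIMARY;   /* SEC is zero */
    if (sec != 0)             return BFD_MODE_CELL_SECONDARY; /* PCI is zero */
    return BFD_MODE_UNSET;
}
```

A BFD whose SEC field is zero and whose PCI field holds a buffer address is therefore classified as Cell mode with the buffer in primary memory.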
The configuration of the following bits in the EDMA_Ctrl register
determines where the BFDs are located:
• EDMA_TxBFD_Far
• EDMA_TxBFD_Copy
• EDMA_RxBFD_Far
• EDMA_RxBFD_Copy
Far mode is selected when the Far bits are set and Local mode is
selected when the Copy bits are set. The EDMA disregards the Far bits
when the Copy bits are set.
10.8.1.2 EDMA_BFD_Base Registers
BFDs in either primary or secondary memory are referenced by adding
the Buffer Number times the size of the BFD to the value in the
EDMA_BFD_FBase (Far or primary memory BFD base address) register
or to the value in the EDMA_BFD_LBase (Local or secondary memory
BFD base address) register. The registers are selected by the EDMA
based on the settings of the Far and Copy bits in the EDMA_Ctrl register.
10.8.1.3 EDMA_Buff Registers
The EDMA_LBuffSize and EDMA_SBuffSize registers specify the sizes
of large and small buffers to be used when a buffer is linked from a free
buffer list. Both EDMA_LBuffSize and EDMA_SBuffSize must be equal
to or larger than 48 for correct EDMA operation. The EDMA_LBuff and
EDMA_SBuff registers point to the beginning of large and small free
buffer lists.
10.8.1.4 EDMA_VCD_Base Register
The EDMA_VCD_Base register is used to calculate the VCD address by
adding its contents to the connection number multiplied by the size of the
VCD.
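The descriptor address calculations in Sections 10.8.1.2 and 10.8.1.4 can be sketched as follows. BfdAddress and VcdAddress are hypothetical helpers introduced for illustration; they assume that the base values passed in are byte addresses, and they use the 16-byte BFD size and 32-byte VCD size given in Section 10.7. The encoding of the base registers themselves is described in the Technical Manual and is not modeled here.

```c
#include <stdint.h>

#define SIZEOF_BFD 16u  /* BFD size in bytes (Section 10.7.1) */
#define SIZEOF_VCD 32u  /* VCD size in bytes (Section 10.7.3) */

/* Address of a BFD: base address plus Buffer Number times the BFD size. */
uint32_t BfdAddress(uint32_t bfdBase, uint32_t bufferNumber)
{
    return bfdBase + bufferNumber * SIZEOF_BFD;
}

/* Address of a VCD: base address plus connection number times the VCD size. */
uint32_t VcdAddress(uint32_t vcdBase, uint32_t conNum)
{
    return vcdBase + conNum * SIZEOF_VCD;
}
```

With a BFD base of 0x0080.0000, Buffer Number 1 maps to 0x0080.0010; with a VCD base of 0x0060.0000, connection 3 maps to 0x0060.0060.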
10.8.1.5 EDMA Registers Initialization Code
In the following initialization code, Packet mode was chosen for data
transfers and Local mode for BFD maintenance. The C code illustrates
how to initialize the related EDMA registers (Figure 10.18).
Figure 10.18 Initializing EDMA Registers
/*
 * EDMA related parameters
 */
#define EDMA_BFD_FBase  RxBFD_FBase_APU
        /* Buffer Descriptor table in primary memory */
#define EDMA_BFD_LBase  RxBFD_LBase_APU
        /* Buffer Descriptor table in secondary memory */
#define EDMA_VCD_Base   VCD_Base
        /* Virtual Connection Descriptor table base */
#define TxBFD_Copy      1
        /* Tx BFD local mode (1) or far mode (0) */
#define RxBFD_Copy      1
        /* Rx BFD local mode (1) or far mode (0) */
#define TxBFD_Far       0
        /* Tx BFD are in Far base (1) or Local base (0) */
#define RxBFD_Far       0
        /* Rx BFD are in Far base (1) or Local base (0) */
#define ConReAct        0
        /* Enable connection reactivation in Tx direction */
#define ByteSwap        0
        /* Ctrl byte swapping for cell transferring */
#define OrHdr           0
        /* Ctrl the generation and extraction of the cell header */

#define EDMA_Ctrl ( (ConReAct   << EDMA_ConReAct)   |\
                    (ByteSwap   << EDMA_ByteSwap)   |\
                    (OrHdr      << EDMA_OrHdr)      |\
                    (TxBFD_Copy << EDMA_TxBFD_Copy) |\
                    (TxBFD_Far  << EDMA_TxBFD_Far)  |\
                    (RxBFD_Copy << EDMA_RxBFD_Copy) |\
                    (RxBFD_Far  << EDMA_RxBFD_Far) )
        /* EDMA control fields */

/* EDMA related initialization */
Hdr->EDMA.SBuff     = (ushort)Head_Rx_SBuff;
Hdr->EDMA.LBuff     = (ushort)Head_Rx_LBuff;
Hdr->EDMA.VCD_Base  = (ushort)EDMA_VCD_Base;
Hdr->EDMA.BFD_LBase = (ushort)EDMA_BFD_LBase;
Hdr->EDMA.BFD_FBase = (ushort)EDMA_BFD_FBase;
Hdr->EDMA.Ctrl      = EDMA_Ctrl;
10.8.2 Scheduler Registers
The two Scheduler registers that need to be initialized are described in
the following paragraphs.
10.8.2.1 SCD_Ctrl Register
The Scheduler Control register, SCD_Ctrl, provides information about the
calendar table base address and the Scheduler mode of operation. The
Scheduler operates in the Flat mode when the SCD_FlatMode bit in the
register is set and in the Priority mode when the SCD_FlatMode bit is
cleared. Flat mode gives all connections equal service priority; Priority
mode services the connections with lower class-of-service values first.
The SCD_VCDinCB field in the SCD_Ctrl register determines the
location of VCDs. All VCDs containing connection numbers equal to or
less than the value in SCD_VCDinCB are located in CBM. The
addresses of VCDs containing connection numbers greater than the
value in SCD_VCDinCB are computed by adding the value in the
EDMA_VCD_Base register to their connection numbers.
10.8.2.2 SCD_CalSize Register
The SCD_CalSize register is used to program the size of the calendar
table in units of cell slots. The memory required to store the calendar
table is calculated as:
Equation 10.1
Memory (bytes) = SCD_CalSize * (2 + (2 * SCD_FlatMode))
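Equation 10.1 can be evaluated with a one-line helper such as the sketch below. CalendarBytes is a hypothetical name introduced for illustration; the flatMode argument stands for the value of the SCD_FlatMode bit (0 or 1).

```c
#include <assert.h>
#include <stdint.h>

/* Calendar table memory in bytes, per Equation 10.1. */
uint32_t CalendarBytes(uint32_t calSize, unsigned flatMode)
{
    assert(flatMode <= 1);
    return calSize * (2u + 2u * flatMode);
}
```

For a calendar of 1024 cell slots, the equation gives 2048 bytes with SCD_FlatMode cleared and 4096 bytes with SCD_FlatMode set.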
10.8.2.3 Scheduler Registers Initialization Code
The following C code illustrates how to initialize the Scheduler registers
(Figure 10.19):
Figure 10.19 Initializing Scheduler Registers
/*
 * Scheduler related parameters
 */
#define SCD_CalBase     (SCD_Base >> 9)
        /* Calendar base address */
#define SCD_FlatMode    0
        /* Scheduler's operating mode: flat (1) or priority (0) */
#define SCD_VCDinCB     0
        /* Number of VCDs in cell buffer */
#define SCD_Cal_Size    MAX_CON_NUM
        /* Number of cell slots in calendar */
#define SCD_Ctrl \
        ( (SCD_VCDinCB << 24) | (SCD_FlatMode << 23) | \
          (SCD_CalBase & 0xfffff) )
        /* Scheduler control fields */

/* Scheduler related initialization */
Hdr->SCD.Ctrl    = SCD_Ctrl;
Hdr->SCD.CalSize = SCD_Cal_Size;
10.8.3 ACI Registers
The ACI registers determine the operation of the APU and ACI in relation
to the Utopia Bus. Those that require initialization are described in the
following paragraphs.
10.8.3.1 ACI_Ctrl Register
All of the assigned bits and fields in the ACI_Ctrl register must be
initialized. The initialization code provided here sets the bits and fields as
shown in Table 10.10. Refer to the L64364 ATMizer II+ ATM-SAR Chip
Technical Manual for a detailed description of the register.
Table 10.10 ACI Control Register Initialization
Field/Bit            Initialization
ACI_PHY field        PHY address to respond to in slave mode
ACI_Loopback bit     Set to enable on-chip loopback
ACI_Parity bit       Set to enable Utopia parity generation and error
                     detection
ACI_CellSize field   Depends on application as follows:
                     0b00 = 52/53 bytes
                     0b01 = 56/57 bytes
                     0b10 = 60/61 bytes
                     0b11 = 64/65 bytes
ACI_HEC bit          Set to generate and verify HEC
ACI_TxIdle bit       Set to generate idle cells when the Tx FIFO is empty
ACI_FixedPr bit      Set to enable a fixed priority scheme in the receive
                     direction (port 0 has highest priority and port 23
                     lowest priority)
ACI_Slave bit        Depends on application. When set, the APU is a
                     Utopia Bus slave and responds to the address in the
                     ACI_PHY field. When cleared, the APU is the Utopia
                     Bus master.
ACI_DirectPoll bit   Depends on application and ACI_Slave bit. When set,
                     the APU uses a direct polling scheme and supports up
                     to four slave devices on the UTOPIA Bus. When
                     cleared, the APU assigns PHY addresses [3:0] to the
                     CLAV[3:0] lines of the UTOPIA Bus and supports
                     multiplexed polling of up to 24 slave devices. See
                     also the ACI_RxMask register description.
ACI_Reset bit        Set to place the ACI Transmitter and Receiver state
                     machines in their idle state.
10.8.3.2 ACI_RxMask Register
The ACI_RxMask register contains 24 bits, one for each PHY device
supported in multiplexed polling. When bit N in the register is set, the
ACI receiver includes PHY device N in its polling; otherwise, the device
is skipped.
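A polling mask can be built one device at a time, as in the sketch below. RxMaskInclude and RxMaskIncludes are hypothetical helpers introduced for illustration; they rely only on the rule stated above that bit N of the mask selects PHY device N, for N from 0 to 23.

```c
#include <assert.h>
#include <stdint.h>

#define ACI_RX_PHY_COUNT 24u  /* PHY devices supported in multiplexed polling */

/* Return `mask` with PHY device `phy` added to the polling set. */
uint32_t RxMaskInclude(uint32_t mask, unsigned phy)
{
    assert(phy < ACI_RX_PHY_COUNT);
    return mask | ((uint32_t)1 << phy);
}

/* Return nonzero if PHY device `phy` is polled under `mask`. */
int RxMaskIncludes(uint32_t mask, unsigned phy)
{
    assert(phy < ACI_RX_PHY_COUNT);
    return (int)((mask >> phy) & 1u);
}
```

Including all 24 devices yields 0x00ffffff, the value used for ACI_RxMask in Figure 10.20.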
10.8.3.3 ACI_FreeList Register
The ACI_FreeList register is used only at initialization to set the
beginning of the free cell list. The register is 16 bits wide. The calculation
is:
Equation 10.2
ACI_FreeList = CBM_base + (SCD_VCDinCB * sizeof(VCD))
               + (ResvCB * sizeof(long)) + sizeof(TxRing)
where
CBM_base is the base address of CBM.
SCD_VCDinCB is the total number of VCDs allocated to CBM.
ResvCB is the reserved space and might be 0. Cell number 0 is
always reserved. It is used for idle cell generation when that feature
is enabled. If the feature is disabled, the cell location may be used
as regular cell memory.
TxRing is the transmit ring for messaging between the host and the
APU.
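Equation 10.2 can be worked through with the sketch below. AciFreeList is a hypothetical helper introduced for illustration; it assumes a 32-byte VCD (Section 10.7.3), a 4-byte long word on the APU, and a TxRing size supplied in bytes, for example the 132 bytes computed in Section 10.7.2.

```c
#include <stdint.h>

#define SIZEOF_VCD   32u  /* VCD size in bytes (Section 10.7.3) */
#define SIZEOF_WORD   4u  /* assumed size of a long word on the APU */

/* Start of the free cell list in CBM, per Equation 10.2. */
uint32_t AciFreeList(uint32_t cbmBase, uint32_t vcdInCB,
                     uint32_t resvCB, uint32_t txRingBytes)
{
    return cbmBase
         + vcdInCB * SIZEOF_VCD    /* VCDs held in CBM */
         + resvCB * SIZEOF_WORD    /* reserved words */
         + txRingBytes;            /* transmit ring and its credit word */
}
```

With no VCDs in CBM, no reserved words, and a 132-byte TxRing, the free cell list starts 132 bytes above the CBM base; four VCDs and two reserved words move it to offset 268.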
10.8.3.4 ACI_TxSize Register
The 8-bit ACI_TxSize register is used to set the maximum size of the
transmit FIFO to guarantee sufficient free cell locations for the receive
FIFO, since both FIFOs share the same area in CBM. If the total number
of transmit cells in CBM reaches ACI_TxSize, the CBM manager returns
cell number 0 when the APU requests a free cell location.
10.8.3.5 ACI_RxSize Register
The 8-bit ACI_RxSize register is used to set the maximum size of the
receive FIFO to guarantee sufficient free cell locations for the transmit
FIFO, since both FIFOs share the same area in CBM.
10.8.3.6 ACI_Limit Registers
The ACI_TxLimit and ACI_RxLimit registers are used to program the
threshold for the number of cells in the transmit or receive FIFO that will
generate an interrupt. When the actual number of cells exceeds the
ACI_RxLimit or drops below the ACI_TxLimit, an interrupt is delivered to
the APU (when enabled). Each register is eight bits wide.
10.8.3.7 ACI_TxTimer Register
The ACI_TxTimer register is used to set the cell holding time in the
transmit FIFO depending on the selected timer. The register is eight bits
wide.
10.8.3.8 ACI_FreeCount Register
The ACI_FreeCount register is used to set the free cell count at
initialization. The register is eight bits wide.
10.8.3.9 ACI Registers Initialization Code
The following C code illustrates how to initialize the ACI registers
(Figure 10.20):
Figure 10.20 Initializing ACI Registers
/*
 * ACI related parameters
 */
#define ACI_TxSize      16
        /* Maximum number of cells in Tx Fifo */
#define ACI_RxSize      16
        /* Maximum number of cells in Rx Fifo */
#define ACI_RxLimit     4
        /* Threshold in RxFifo to generate an interrupt */
#define ACI_TxLimit     8
        /* Threshold in TxFifo to generate an interrupt */
#define ACI_RxMask      0x00ffffff
        /* Rx polling mask */
#define ACI_Phy         0
        /* Phy physical address to respond to in slave mode */
#define ACI_LoopBack    1
        /* set for on-chip loop back */
#define ACI_Parity      0
        /* enable for Utopia parity generation and error detection */
#define ACI_CellSize    00
        /* the cell size, 00-52/53, 01-56/57, 10-60/61, 11-64/65 */
#define ACI_HEC         1
        /* set to generate or verify the HEC bit */
#define ACI_TxIdle      0
        /* set to generate idle cells when Tx Fifo is empty */
#define ACI_FixedPr     0
        /* set to enable the priority of Phy device in Rx direction */
#define ACI_Slave       0
        /* control the master/slave operation of the Utopia bus */
#define ACI_DirectPoll  0
        /* enable direct or multiplexed polling scheme */
#define ACI_Reset       0
        /* set the Tx and Rx state machines to idle state */
#define ACI_Ctrl \
        (ACI_Phy | (ACI_LoopBack << 5) | \
         (ACI_Parity << 6) | (ACI_CellSize << 8) | \
         (ACI_HEC << 10) | (ACI_TxIdle << 11) | \
         (ACI_FixedPr << 12) | (ACI_Slave << 13) | \
         (ACI_DirectPoll << 14) | (ACI_Reset << 15))
        /* ACI control fields */

#define ResvCB          0
        /* in words, needed to calculate ACI_freelist */

#define ACI_FreeList \
        ( (SCD_VCDinCB << 5) + (ResvCB << 2) + \
          TxRing_Size * SizeOf_Ring_Entry + \
          SizeOf_Ring_Credit)
#define ACI_FreeCount \
        ( CELL_COUNT )

#define CellBuffSize \
        ( (SizeOf_CBM - ACI_FreeList) / SizeOf_Cell )

/* ACI related initialization */
Hdr->ACI.TxSize    = ACI_TxSize;
Hdr->ACI.TxLimit   = ACI_TxLimit;
Hdr->ACI.RxLimit   = ACI_RxLimit;
Hdr->ACI.RxMask    = ACI_RxMask;
Hdr->ACI.Ctrl      = ACI_Ctrl;
Hdr->ACI.FreeList  = ACI_FreeList;
Hdr->ACI.FreeCount = ACI_FreeCount;
10.8.4 Timer Registers
The ATMizer II+ Timer Unit implements a set of eight hardware timers
and a Timestamp Counter in registers to provide the APU with real-time
events. The 32-bit TM_TimeStamp counter is incremented at each input
clock event. It should be initialized to zero.
There are eight, 8-bit, general-purpose timers, TM_Timer1-7. They are
individually initialized to the values in the corresponding TM_TimerInit1-7
registers. A timer, if enabled by the associated bit in the TM_Enable
register, is decremented at each input clock event selected by the
TM_ClockSel register. A timer time-out event occurs when a timer
reaches zero. It then reloads the value in the corresponding
TM_TimerInit1-7 register.
The eight timers may be cascaded to achieve higher counts. Time-out
events for the Timestamp Counter, Timers 1 through 3 and 8 are
registered in the APU_Status register and may generate an interrupt.
Timers 4 through 7 can be used only as part of a wider, cascaded
timer.
Figure 10.21 shows how to cascade timers to enlarge the value of a
watchdog timeout event.
Figure 10.21 Cascading Timers for a Long Watchdog Timeout
/*
 * To keep the APU from hanging when it is stalled, enable the
 * Watchdog timer with a large value. If the timer expires,
 * something must be wrong.
 */
/* enable Timers */
/*
Hdr->TM.ClockSel = 0x65432100;
Hdr->TM.Timer[0].Value = 0xff;
Hdr->TM.Timer[0].Init = 0xff;
Hdr->TM.Timer[1].Value = 0xff;
Hdr->TM.Timer[1].Init = 0xff;
Hdr->TM.Timer[2].Value = 0xff;
Hdr->TM.Timer[2].Init = 0xff;
Hdr->TM.Timer[3].Value = 0xff;
Hdr->TM.Timer[3].Init = 0xff;
Hdr->TM.Timer[4].Value = 0xff;
Hdr->TM.Timer[4].Init = 0xff;
Hdr->TM.Timer[5].Value = 0xff;
Hdr->TM.Timer[5].Init = 0xff;
Hdr->TM.Timer[6].Value = 0xff;
Hdr->TM.Timer[6].Init = 0xff;
*/
/* enable watchdog timer on Timer2 */
Hdr->APU.Watchdog = 0x40ff;
10.8.5 APU Registers
The APU_VIntEnable register is cleared at system reset. Setting a bit in
the register enables the corresponding interrupt and clearing the bit
disables (masks) the interrupt. The bit number in the register
corresponds to the interrupt number shown in Table 10.11. Interrupt
number 0 has the lowest priority.
Table 10.11 External Vectored Interrupts
Interrupt           Number   Description
IntEDMA_ComplFull   15       Completion Queue is full (Tx, Rx or Buff)
IntACI_RxFull       14       ACI Rx FIFO is full
IntRxMbx            13       Rx Mailbox FIFO not empty
IntEDMA_Move        12       EDMA Move is complete
IntEDMA_RxCell      11       RxCell Completion Queue not empty
IntACI_Rx           10       ACI Rx FIFO exceeds threshold (ACI_RxLimit)
IntEDMA_TxCell      9        TxCell Completion Queue not empty
IntEDMA_Buff        8        Buff Completion Queue not empty
IntACI_Err          7        Timeout, parity, or short-cell error
IntACI_Tx           6        ACI Tx FIFO is below threshold (ACI_TxLimit)
IntExt1-0           5-4      External interrupt inputs (user defined)
IntTim3-1           3-1      Timers 3-1 timeout
IntTim0             0        Timer 8 timeout
The contents of the APU_VIntBase register are used as bits [31:7] and
the interrupt number as bits [6:3] for the vectored interrupt handler
routine address. Bits [2:0] of the address are set to zero.
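The handler address composition can be sketched as follows. VectorAddress is a hypothetical helper introduced for illustration; it assumes that the value read from APU_VIntBase already holds the address bits in positions [31:7], and it masks off any lower bits before inserting the interrupt number.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the vectored interrupt handler for interrupt `intNum`. */
uint32_t VectorAddress(uint32_t vintBase, unsigned intNum)
{
    assert(intNum < 16);
    return (vintBase & 0xFFFFFF80u)     /* bits [31:7] from APU_VIntBase */
         | ((uint32_t)intNum << 3);     /* bits [6:3] = interrupt number */
                                        /* bits [2:0] remain zero */
}
```

Because bits [2:0] are zero, consecutive interrupt numbers map to entry points 8 bytes apart, and the 16 vectors occupy a 128-byte block that starts at the base address.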
The APU_Reset bit in the APU_AddrMap register is set when the
hardware PCI_RSTn signal is asserted. All hardware modules remain in
an idle state as long as this bit is set. After the APU initializes all
hardware registers and memory resident data structures, the APU_Reset
bit should be cleared, as shown in Figure 10.22, to activate all the
hardware modules on the ATMizer II+ chip:
Figure 10.22 Clearing the APU_Reset Bit
Hdr->APU.AddrMap &= 0x7fff00ff;
10.9 Data Structures Initialization
This section describes initialization of the following data structures:
• Virtual Connection Descriptors (VCDs) and APU Connection
  Descriptors (ACDs)
• Buffer Descriptors (BFDs)
• The Calendar Table
• The Tx and Rx Rings
• The Free Cell List
10.9.1 VCD and ACD Initialization
During the initialization period, the SDP application code clears all fields
of all VCDs, as shown in Figure 10.23.
Figure 10.23 Clearing VCD Fields
/* clear all VCDs */
vcd = (ulong*)VCD;
for (i = 0; i < (MAX_CON_NUM * SizeOf_VCD / 4); i++)
    *vcd++ = 0;
The APU initializes the corresponding ACD and VCD when it receives an
open connection command from the host. At the same time, the host
also passes the initial address of a block of signaled parameters for the
connection to the APU. This block is the first 32 bytes of the Host
Connection Descriptor (HCD). Refer to Section 1.3.2.1, “Mailbox,” for how
to issue the open connection command and pass the required
parameters.
Based on these parameters, the APU calculates its own ACD for that
connection. Table 10.12 lists the signaled parameters. Refer to the ATM
Forum Traffic Management Specification 4.0, for the detailed meaning of
each parameter. The calculation of the ACD is described in Table 10.13.
In the Initialization column of that table, “defined” means that the
parameter is predefined in the header file (ConPar.h) with a default value,
which may be overridden if the parameter is signaled.
The remaining parameters are calculated when the connection is opened.
Refer to Chapter 3, “Scheduling,” for more detail about the usage of ACDs.
Table 10.12 Required Open Connection Parameters
Name      Address  Class  Description
ConNum    0        All    Connection Number
Reserved  4        All    Cell header (to be implemented later)
Class     8        All    Class of traffic
PCR       12       All    Peak Cell Rate in Cells/Sec units, 24-bit integer
SCR       16       VBR    Sustained Cell Rate in Cells/Sec units, 24-bit integer
MCR       16       ABR    Minimum Cell Rate in Cells/Sec units, 24-bit integer
MBS       20       VBR    Maximum Burst Size
ICR       20       ABR    Initial Cell Rate in Cells/Sec units, 24-bit integer
TBE       24       ABR    Transient Buffer Exposure
FRTT      28       ABR    Fixed Round-Trip Time
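One way to picture the 32-byte parameter block of Table 10.12 is as a C structure. The sketch below is an illustration only: the type name OpenConParams, the field names, and the assumption that every entry occupies one 32-bit word are not taken from the ATMizer II+ header files. Only the byte offsets and the sharing of offsets 16 and 20 between the VBR and ABR parameters come from the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the open-connection parameter block.
 * Offsets follow Table 10.12; all names are assumptions. */
typedef struct {
    uint32_t con_num;                             /* offset  0: ConNum             */
    uint32_t reserved;                            /* offset  4: cell header        */
    uint32_t traffic_class;                       /* offset  8: Class              */
    uint32_t pcr;                                 /* offset 12: PCR                */
    union { uint32_t scr; uint32_t mcr; } rate;   /* offset 16: SCR (VBR), MCR (ABR) */
    union { uint32_t mbs; uint32_t icr; } burst;  /* offset 20: MBS (VBR), ICR (ABR) */
    uint32_t tbe;                                 /* offset 24: TBE (ABR)          */
    uint32_t frtt;                                /* offset 28: FRTT (ABR)         */
} OpenConParams;

/* Returns nonzero when the compiler laid the structure out as in the table. */
static int open_con_params_layout_ok(void)
{
    return sizeof(OpenConParams) == 32
        && offsetof(OpenConParams, pcr) == 12
        && offsetof(OpenConParams, rate) == 16
        && offsetof(OpenConParams, burst) == 20
        && offsetof(OpenConParams, tbe) == 24
        && offsetof(OpenConParams, frtt) == 28;
}

/* The rate parameters are described as 24-bit integers; this helper
 * rejects values that do not fit. */
static uint32_t rate24(uint32_t cells_per_sec)
{
    assert(cells_per_sec <= 0xFFFFFFu);
    return cells_per_sec;
}
```

A host could fill such a structure in its private memory and copy it to the fixed primary-memory location before issuing the open connection command.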
Table 10.13 ACD Field Calculations
Name                                          Class     Initialization
ICG (Intercell Gap)                           CBR, UBR  LCR (Line Cell Rate) / PCR (Peak Cell Rate)
ICG                                           ABR       LCR/ICR (Initial Cell Rate)
ICG_PCR                                       VBR       LCR/PCR
ICG                                           VBR       ICG_PCR
Bucket                                        VBR       0, Variable
Increment                                     VBR       LCR/SCR
Limit                                         VBR       (MBS - 1)(Increment - ICG_PCR)
ThTxTime                                      All       ICG
NRM (maximum number of cells a source may     ABR       32, defined
  send for each Forward Resource
  Management cell)
ICR                                           ABR       min(PCR, TBE/FRTT)
LastTimeFRM (last time a Forward Resource     ABR       Now - NRM/ICR
  Management cell was sent)
logRIF (Rate Increase Factor)                 ABR       4, RIF = 1/16, defined
logRDF (Rate Decrease Factor)                 ABR       4, RDF = 1/16, defined
CRM (limit of FRM cells in the absence of a   ABR       TBE/NRM
  Backward Resource Management cell)
FRM_SinceBRM (the count of FRM cells since    ABR       0
  the last Backward Resource Management
  cell)
InRateCell (the count of In-Rate cells        ABR       NRM
  since the last FRM cell)
ACR (Allowed Cell Rate)                       ABR       ICR
MCR                                           ABR       0
PCR                                           ABR       PCR
LastWasFRM (last RM cell sent was an FRM      ABR       0
  cell)
PresBRM (presenting BRM cell)                 ABR       0
BRM_NI (BRM No Increase)                      ABR       0
BRM_C (BRM Congestion Indicator)              ABR       0
BRM_BN (BRM Backward Explicit Congestion      ABR       0
  Notification cell)
logCDF (Cutoff Decrease Factor)               ABR       4, CDF = 1/16, defined
ADTF (ACR Decrease Time Factor)               ABR       (ulong) (0.5 * LCR), defined
BRM_ER (BRM Explicit Rate)                    ABR       0
BRM_CCR (BRM Current Cell Rate)               ABR       0
BRM_MCR (BRM Minimum Cell Rate)               ABR       0
TRM (the upper bound on the time between      ABR       100 ms, defined
  FRM cells)
TCR (Total Cell Rate)                         ABR       10, defined
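The CBR/UBR and VBR rows of Table 10.13 can be worked through in a few lines of C. This is a sketch under stated assumptions: the VbrAcd type and the function names are hypothetical, and plain integer division stands in for whatever fixed-point or floating-point representation the real APU code uses for the intercell gap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the VBR-related ACD fields of Table 10.13. */
typedef struct {
    uint32_t icg_pcr;     /* LCR/PCR                         */
    uint32_t icg;         /* starts at ICG_PCR               */
    uint32_t bucket;      /* starts at 0, varies afterwards  */
    uint32_t increment;   /* LCR/SCR                         */
    uint32_t limit;       /* (MBS - 1)(Increment - ICG_PCR)  */
    uint32_t th_tx_time;  /* ThTxTime starts at ICG          */
} VbrAcd;

/* CBR and UBR: the intercell gap is the line rate over the peak rate. */
static uint32_t cbr_icg(uint32_t lcr, uint32_t pcr)
{
    assert(pcr > 0u);
    return lcr / pcr;
}

/* VBR: leaky-bucket parameters derived from PCR, SCR and MBS. */
static VbrAcd vbr_acd_init(uint32_t lcr, uint32_t pcr, uint32_t scr, uint32_t mbs)
{
    VbrAcd a;

    assert(pcr > 0u && scr > 0u && scr <= pcr && mbs >= 1u);
    a.icg_pcr    = lcr / pcr;
    a.icg        = a.icg_pcr;
    a.bucket     = 0u;
    a.increment  = lcr / scr;
    a.limit      = (mbs - 1u) * (a.increment - a.icg_pcr);
    a.th_tx_time = a.icg;
    return a;
}
```

For example, with LCR = 1000, PCR = 100, SCR = 50 and MBS = 11 (round numbers chosen for illustration), ICG_PCR is 10 cell slots, Increment is 20, and Limit is 10 x 10 = 100.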
10.9.2 BFD Initialization
All TxBFDs and RxBFDs are initialized before starting to send or receive
data. The NextBFD, pBuffData, and BuffSize fields of the BFDs are set
correctly and the other fields are cleared. Figure 10.24 shows an
example of C code to initialize all BFDs.
Figure 10.24 Initializing BFDs
TmpAddr = (ulong *) TxBuffPool;
for (n = 1; n < Tx_BuffCount; n++, TmpAddr += TxBuffSize) {
    TxBFD[n].NextBFD   = n+1;              /* points to next BFD */
    RxBFD[n].NextBFD   = n+1;
    TxBFD[n].pBuffData = (ulong)TmpAddr;
    TxBFD[n].BuffSize  = TxBuffSize;
    if (n <= MAX_CON_NUM)
        RxBFD[n].pBuffData = (ulong)RxSmallBuffPool[n-1];
    else
        RxBFD[n].pBuffData = (ulong)RxLargeBuffPool[n-1-MAX_CON_NUM];
}
TxBFD[2*MAX_CON_NUM].NextBFD = 0;   /* last Tx BFD in the list */
RxBFD[MAX_CON_NUM-1].NextBFD = 0;   /* last small Rx BFD */
RxBFD[2*MAX_CON_NUM].NextBFD = 0;   /* last large Rx BFD */
NextFreeTxBFD = 1;                  /* first available Tx BFD */
There are two ways to initialize BFDs. One way is to let the host and/or
the APU exchange buff commands with each other to attach BFDs to
VCDs. For the RxBFDs, before sending the buff command to the APU,
the host:
1. sets the BuffFree bit in all BFDs,
2. sets the BuffLarge bit in large BFDs, and
3. clears the BuffLarge bit in small BFDs.
The APU passes the command to the EDMA, and the EDMA automatically
puts the BFDs in the corresponding buffer free list.
For TxBFDs, the APU clears the BuffFree bit and ignores the BuffLarge
bit if only one-size buffers are used. When the host receives the buff
command, it puts the BFD in its own free buffer list.
The other way to initialize BFDs is to simply let the host or the APU
create the free BFD lists. In the ADP, the host initializes all the free BFD
lists.
In the BFD copy mode, the BFDs are located in both primary and
secondary memory since they are copied back and forth. When
accessing the BFDs, the EDMA uses either the Far BFD base
(Fbase - BFD base address in primary memory) or the Local BFD base
(Lbase - BFD base address in secondary memory) as follows:
• attach
  – read BFD
        if (TxBFD_Copy) use Fbase
        else if (TxBFD_Far) use Fbase
        else use Lbase
  – write partial BFD from VCD[tailBFD]
        if (TxBFD_Copy) use Lbase
        else if (TxBFD_Far) use Fbase
        else use Lbase
• free
        if (RxBFD_Copy) use Lbase
        else if (RxBFD_Far) use Fbase
        else use Lbase
From the above it can be seen that, in the BFD copy mode, the RxBFDs
should be initialized in secondary memory and the TxBFDs should be
initialized in primary memory.
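The base-selection rules listed above can be condensed into one C function, which makes it easy to check each case. This is only a model of the decision: the enum and function names are hypothetical, and the two flag arguments stand for the TxBFD_Copy/TxBFD_Far bits on attach and the RxBFD_Copy/RxBFD_Far bits on free.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { BFD_BASE_FAR, BFD_BASE_LOCAL } BfdBase;   /* Fbase, Lbase */

typedef enum {
    BFD_ATTACH_READ,           /* attach: read BFD                         */
    BFD_ATTACH_WRITE_PARTIAL,  /* attach: write partial BFD from tail BFD  */
    BFD_FREE                   /* free                                     */
} BfdAccess;

/* Hypothetical model of the EDMA base choice for one BFD access. */
static BfdBase bfd_base_select(BfdAccess op, bool copy_bit, bool far_bit)
{
    assert(op == BFD_ATTACH_READ || op == BFD_ATTACH_WRITE_PARTIAL || op == BFD_FREE);
    if (copy_bit) {
        /* Copy mode: only the attach read goes to far (primary) memory. */
        return (op == BFD_ATTACH_READ) ? BFD_BASE_FAR : BFD_BASE_LOCAL;
    }
    /* Not in copy mode: the Far bit alone decides. */
    return far_bit ? BFD_BASE_FAR : BFD_BASE_LOCAL;
}
```

In copy mode the attach read is the only access that uses Fbase, which matches the statement that TxBFDs are initialized in primary memory while RxBFDs are initialized in secondary memory.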
If not in the BFD copy mode, the BFDs should be located in the far
(primary) memory or the local (secondary) memory per the states of the
TxBFD_Far and RxBFD_Far bits in the EDMA_Ctrl register. Refer to the
L64364 ATMizer II+ ATM-SAR Chip Technical Manual for more detail.
10.9.3 Calendar Table Initialization
The Calendar Table is a cell-slot array managed by the Scheduler. Each
entry in the Calendar Table corresponds to one cell slot and contains
connection numbers of virtual connections to be serviced in that slot. All
slots in the table are cleared initially to indicate that there are no
connections scheduled. The example code in Figure 10.25 clears the
Calendar Table. Refer to Chapter 3, “Scheduling” and the L64364
ATMizer II+ ATM-SAR Chip Technical Manual for more detail about the
Calendar Table and Scheduler.
Figure 10.25 Clearing the Calendar Table
/* clear Calendar Table */
Calendar = (ulong*)SCD_Base;
for (i = 0; i < (MAX_CON_NUM * SizeOf_SCD / 4); i++)
    *Calendar++ = 0;
10.9.4 Ring Initialization
To minimize the traffic on the PCI Bus, both the APU and the host keep
a separate set of pointers for the rings. The initialization for the host sets
the RxRing count to the RxRing size and clears the TxRing count. It also
points the TxRing credit and the RxRing starting pointer to primary
memory, and points the TxRing starting pointer and the RxRing credit to
CBM. The routine is shown in Figure 10.26.
Figure 10.26 Initializing Host Rings
Ring_Desc_t TxRing, RxRing;

TxRing.Ptr     = &Host_APU[0];              /* points to CBM */
TxRing.Size    = HOST_APU_RING_SIZE;
TxRing.End     = TxRing.Ptr + TxRing.Size;
TxRing.Count   = 0;
TxRing.Credit  = &Host_APU_Credit;          /* points to primary memory */

RxRing.Ptr     = &APU_Host[0];              /* points to primary memory */
RxRing.Size    = APU_HOST_RING_SIZE;
RxRing.End     = RxRing.Ptr + RxRing.Size;
RxRing.Count   = RxRing.Size;
RxRing.Credit  = &APU_Host_Credit;          /* points to CBM */
*RxRing.Credit = RxRing.Size;
Similarly, the initialization for the APU sets the TxRing count to the
TxRing size and clears the RxRing count. It also points the TxRing credit
and the RxRing starting pointer to primary memory, and points the
TxRing starting pointer and the RxRing credit to CBM. The routine is
shown in Figure 10.27.
Figure 10.27 Initializing APU Rings
Ring_Desc_t TxRing, RxRing;

RxRing.Ptr     = &APU_Host[0];              /* points to primary memory */
RxRing.Size    = APU_HOST_RING_SIZE;
RxRing.End     = RxRing.Ptr + RxRing.Size;
RxRing.Count   = 0;
RxRing.Credit  = &APU_Host_Credit;          /* points to CBM */

TxRing.Ptr     = &Host_APU[0];              /* points to CBM */
TxRing.Size    = HOST_APU_RING_SIZE;
TxRing.End     = TxRing.Ptr + TxRing.Size;
TxRing.Count   = TxRing.Size;
TxRing.Credit  = &Host_APU_Credit;          /* points to primary memory */
*TxRing.Credit = TxRing.Size;
10.9.5 Free Cell List
Cell Buffer Memory (CBM) is a 4-Kbyte on-chip memory. It is mainly used
for the Free Cell List, the Transmit FIFO, and the Receive FIFO, and
occasionally may contain other data structures.
The Cell Buffer Manager is responsible for the management of the CBM.
The Cell Buffer Manager maintains the Free Cell List through the
ACI_FreeList register. This register is initialized as the first address of the
Free Cell List and is described in Section 10.8.3, “ACI Registers.” At
initialization, the APU builds a list of free cells. Refer to the L64364
ATMizer II+ ATM-SAR Chip Technical Manual for more detail of the Cell
Descriptor format and usage. The example code shown in Figure 10.28
builds a Free Cell List.
Figure 10.28 Free Cell List Initialization
/* CBM Freelist initialization */
CellBuff = (pCell_t)((ulong)CBM + ACI_FreeList);
for (i = 1; i < CellBuffSize - 1; i++)
    CellBuff[i].CDS = (i * sizeof(Cell_t) + ACI_FreeList) << 18;
CellBuff[CellBuffSize - 1].CDS = 0;
10.9.6 Miscellaneous Data Structures
The remaining variables and structures should be correctly set or
cleared. For instance, all statistics should be cleared, and the
global variable TimeNow should be initialized to 0.
Chapter 11
Operating Software
This chapter describes operating software for an ATMizer II+ system. The
sections in this chapter are:
• Section 11.1, “Top Level Structure”
• Section 11.2, “APU Program”
• Section 11.3, “Host Program”
11.1 Top Level Structure
Software running on the ATMizer II+ chip interacts with each hardware
module to realize the traffic flow control mechanism defined by the ATM
Forum. The Segmentation and Reassembly (SAR) process can be split
into four separate subprocesses or threads:
• RxHAS
• RxCRT
• TxHAS
• TxCRT
First, the receive and transmit directions are handled by independent
threads. Second, host-to-ATM Processing Unit (APU) signalling is
handled independently of cell receiving and transmitting. The Host-to-APU
Signalling thread (HAS) involves the EDMA buff command and the
EDMA move command (for packet mode only). The Cell Receive and
Transmit thread (CRT) involves the EDMA cell command. The HAS
thread is triggered by host commands and EDMA buffer completion
events, while the CRT thread is triggered by cell arrival or Scheduler/timer
events.
11.2 APU Program
The MIPS processor core in the APU may be considered the main
control unit of the ATMizer II+ architecture. The APU is responsible for
traffic management, host messaging, OAM cell processing, and statistics
collection. All other hardware processing modules are slaves of the APU
and execute commands when appropriate hardware registers are written.
From the software perspective, the hardware accelerators appear as
predefined routines that execute faster than equivalent processor code
and, more importantly, in parallel with the main processor. The hardware
modules include:
• a full AAL5 segmentation and reassembly engine with built-in
  memory management
• a calendar-based Scheduler unit
• a floating point accelerator for ATM Forum 15-bit floating point format
11.2.1 Cell Operation Flow
Following is the typical transmit/receive cell flow. Refer to the L64364
ATMizer II+ ATM-SAR Chip Technical Manual for details.
TxCRT – Transmit Cell Thread
1. The APU reads the ACI_TxFree register. The Cell Buffer Manager
returns a free cell location.
2. The APU issues the SCD_Serv command. The Scheduler returns the
connection number to be serviced. If the connection number is zero,
no connection is scheduled to be serviced during this cell slot; in that
case, skip steps 3 and 4.
3. The APU computes a new Intercell Gap (ICG).
4. The APU issues the SCD_sched command with the connection
number and the computed service time
(ServTime = ThTxTime + ICG).
5. The APU issues the SCD_tic command.
6. The APU issues the TxCell command.
7. The EDMA executes the TxCell command and puts the cell in the
TxFIFO.
8. The Cell Buffer Manager sends the cell out and links the cell to the
free cell list.
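The APU's part of the transmit steps above (steps 2 through 6) can be modeled in C against a simulated hardware state. Everything in this sketch is a stand-in: SimHw, SimAcd and tx_crt_slot are hypothetical names, the hardware commands are reduced to counters and stored values, and the intercell gap is simply reused unchanged where real ABR or VBR code would recompute it. The sketch follows the step list literally, so the SCD_tic and TxCell commands are issued in every slot, including idle ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated view of the Scheduler and EDMA (all fields are assumptions). */
typedef struct {
    uint32_t next_con;    /* connection SCD_Serv will return, 0 = idle slot */
    uint32_t sched_con;   /* connection last passed to SCD_sched            */
    uint32_t sched_time;  /* service time last passed to SCD_sched          */
    unsigned tics;        /* SCD_tic commands issued                        */
    unsigned tx_cells;    /* TxCell commands issued                         */
} SimHw;

/* The two ACD fields this sketch needs. */
typedef struct {
    uint32_t th_tx_time;  /* ThTxTime */
    uint32_t icg;         /* ICG      */
} SimAcd;

/* One cell slot of the transmit cell thread, steps 2 to 6. */
static void tx_crt_slot(SimHw *hw, const SimAcd *acd_table)
{
    uint32_t con;

    assert(hw != NULL && acd_table != NULL);
    con = hw->next_con;                           /* step 2: SCD_Serv      */
    if (con != 0u) {
        uint32_t icg = acd_table[con].icg;        /* step 3: new ICG       */
        hw->sched_con  = con;                     /* step 4: SCD_sched     */
        hw->sched_time = acd_table[con].th_tx_time + icg;
    }
    hw->tics++;                                   /* step 5: SCD_tic       */
    hw->tx_cells++;                               /* step 6: TxCell        */
}
```

For a connection with ThTxTime = 100 and ICG = 10, the model reschedules the connection at service time 110.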
RxCRT – Receive Cell Thread
1. The ACI places a cell in the ACI RxFIFO.
2. The APU reads the ACI_RxRead register to get the cell address. The
ACI removes the cell from RxFIFO.
3. The APU reads the cell descriptor and cell header from the cell buffer
and computes the connection number.
4. The APU issues the RxCell command.
5. The EDMA executes the RxCell command and returns the cell to a
free list.
6. The Cell Buffer Manager links the cell to a free list.
11.2.2 Buffer Operation Flow
The buffer operation flow in the ATMizer II+ chip is described
below. Refer to the L64364 ATMizer II+ ATM-SAR Chip Technical
Manual for details.
TxHAS – Transmit Buffer Host-APU Signalling Thread
1. The host writes a BuffNum into the TxRing.
2. The APU reads this BuffNum from the TxRing and writes it into the
Buff Request Queue if the queue is not full.
3. The EDMA gets the BuffNum from the Buff Request Queue, copies
the BFD into secondary memory, links it to the VCD, and optionally
invokes the Move processor to copy the buffer contents from primary
memory to secondary memory (through the EDMA_TxBFD_Copy
and EDMA_TxBuffCopy control bits) if it is in the packet mode.
4. The EDMA completes the buffer and places the BuffNum in the Buff
Completion Queue.
5. The APU reads the BuffNum from the completion queue.
6. The APU writes the BuffNum and two control bits (BFS_BuffLarge
and BFS_BuffFree) into the RxRing.
7. The host may free its own buffer since this packet has been sent out.
RxHAS – Receive Buffer Host-APU Signalling Thread
1. The RxCell processor completes a buffer, places the BuffNum in the
EDMA Completion Queue, copies the BFD to primary memory, and
optionally invokes the Move processor to copy the buffer contents to
primary memory (through the EDMA_RxBFD_Copy and
EDMA_RxBuffCopy control bits) if it is in packet mode.
2. The APU retrieves the BuffNum from the completion queue and
places the BuffNum in the RxRing.
3. The host consumes data (some time later).
4. The host writes the BuffNum into the TxRing to free it.
5. The APU retrieves the BuffNum from the TxRing.
6. The APU issues a buff command to let the Buff processor link this
BFD to a free list.
11.2.3 Pseudocode
The ATMizer II+ Application Pseudocode contains a pseudo-code
example to illustrate the APU software necessary to implement SAR
functionality. The pseudo-code performs the following tasks:
• host and APU messaging
  – command to segment a buffer (host->APU)
  – return of segmented (sent) buffer (APU->host)
  – notification of received buffer (APU->host)
  – return of a buffer to a free list (host->APU)
  – request to open a connection (host->APU)
  – request to close a connection (host->APU)
  – request to copy statistics vector (host->APU)
• receive cell header lookup
• scheduling connections for transmit
• ABR rate computations
• VBR leaky bucket computations
• collecting statistics
  – number of received and transmitted cells
  – number of received and transmitted PDUs
  – number and type of errors (CRC32, lost or misinserted cells, etc.)
11.3 Host Program
The host program allows you to send commands to the ATMizer II+ chip
and to display the results of these commands. This involves opening
connections, transmitting and receiving data, and displaying statistics
such as effective rate, errors received, etc.
11.3.1 Setting up a Configuration File
The host program accepts commands only before execution begins. It
accepts several initialization commands after the program starts, but
once the go command is issued it no longer scans for user input.
There is no data consistency checking on the received
channel.
The initialization commands allow you to:
• set the size of all the buffers transmitted to the ATMizer II+ chip.
• request the ATMizer II+ chip to open connections with specific
  parameters.
  – the number of connections to open
  – class: CBR, UBR
  – PCR: rate to request for that connection
The go command starts the dialog between the host and the ATMizer II+
chip. According to the previous initialization commands, it will open the
requested connections and start transmitting and receiving data. See
Section 11.3.2.2, “Read Command Line Options,” for syntax details.
Due to the sequencing of the commands (initialization commands first,
and then go), it is easy to write all of these commands in a configuration
(script) file. You will then be able to send this file through the
communications program you are using (e.g., tip in Unix and Crosstalk,
Procomm Plus, etc., in DOS) instead of typing all the commands one by
one.
This file will simply be a list of the commands to issue. A typical example
is shown in Figure 11.1.
Figure 11.1 A Typical Configuration File
# Comments start with a “#”
# buffsize buffer_size_in_bytes : set the buffer size
buffsize 1024
# open connections connection_class rate : open the different
# connections with specified class and rate
open 1-3,8 CBR 25e6
open 3-5 CBR 12.5e6
open 6,7 CBR 0.1e6
# run !
go
Again, see Section 11.3.2.2, “Read Command Line Options,” for syntax
details.
11.3.2 Host Tasks
The different tasks the host will have to perform during a demonstration
are described in the following paragraphs.
11.3.2.1 Initialize the Data Structure in Primary Memory
The host will have to initialize the following structures:
TxRing – The TxRing holds the buffer numbers sent by the host to the
ATMizer II+ chip for transmission. It is located in CBM and has a
credit-type flow control. At initialization, this ring is filled with zeros.
Later the ring will contain buffer numbers that will allow the host to
identify the corresponding BFD in the TxBFDList. It is maintained by a
Ring_Desc_t structure.
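To illustrate what credit-type flow control means for a ring such as the TxRing, the following C sketch shows one generic way to implement it. It is not the ATMizer II+ ring code: the CreditRing type, its fields, and ring_put are hypothetical, and the real Ring_Desc_t (with its Ptr, Size, End, Count and Credit fields) may track credits differently. In this sketch the producer counts the entries it has written, the consumer publishes a running count of entries it has consumed, and the producer writes only while the difference is smaller than the ring size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical producer-side view of a credit-controlled ring. */
typedef struct {
    uint32_t *slots;                    /* ring storage                      */
    uint32_t size;                      /* number of entries in the ring     */
    uint32_t head;                      /* next entry the producer writes    */
    uint32_t sent;                      /* total entries written so far      */
    const volatile uint32_t *consumed;  /* total consumed, consumer-written  */
} CreditRing;

/* Writes one buffer number; returns 1 on success, 0 when no credit is left. */
static int ring_put(CreditRing *r, uint32_t buff_num)
{
    assert(r != NULL && r->size > 0u);
    if (r->sent - *r->consumed >= r->size)
        return 0;                       /* ring full: wait for more credit   */
    r->slots[r->head] = buff_num;
    r->head = (r->head + 1u) % r->size; /* wrap at the end of the ring       */
    r->sent++;
    return 1;
}
```

Because each counter has a single writer, the producer never has to modify the credit word, and it needs to read it across the bus only when its cached view says the ring is full.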
TxBFDList – This is the array containing the BFDs used for
transmission. Each BFD has a field (pBuffData) pointing to an actual
buffer in the primary memory.
At initialization when the list is created, each NextBFD field is updated to
point to the following BFD in the array, and each pBuffData field points
to a buffer in the primary memory. The BuffSize field needs to be set
according to the size you defined for the transmit buffers (buffsize user
command).
Tx Buffers – Tx Buffers are the buffers actually transmitted on the line.
They are filled with random data at initialization.
You define the size of the Tx Buffers, and that size applies to all of the
connections that are opened. It is a fixed value less than or equal to
1024 bytes. As explained in Section 10.7, “Memory Allocation,” the
transmit buffers are overlapped every 16 bytes.
RxRing
The RxRing holds buffer numbers identifying BFDs from either the
RxSmallBFDList or RxLargeBFDList, depending on the value of the
BuffSize field.
This ring also has a credit-type flow control. It is located in the primary
memory. The ring is filled with zeros at initialization.
RxSmallBFDList and RxLargeBFDList
RxSmallBFDList and RxLargeBFDList are two arrays containing BFDs
pointing to small and large buffers in the primary memory. These buffers
are used by the ATMizer II+ chip to store the data it receives
from the line and reassembles.
At initialization, the host builds these arrays the same way it creates the
TxBFDList, by assigning small (64 bytes) or large (256 bytes) buffers to
each BFD. Even though these lists are initialized by the host (they are in
the primary memory), only the ATMizer II+ chip maintains them.
ConnectionList
This list is located in the host’s private memory and it describes the
different parameters associated with each connection. There is one entry
per connection requested, and each descriptor is created according to
what you specified in the open connection command. See Section
1.3.1.1, “Connection Numbers,” for more details on the contents of the
list.
11.3.2.2 Read Command Line Options
This procedure analyzes the command line arguments, interprets them,
and executes the corresponding functions. Any line beginning with a # is
a comment line and is ignored. The commands are described in the
following paragraphs.
Set transmit buffer size:
buffsize buffer_size
where buffer_size is the size in bytes to use for all the buffers
transmitted to the ATMizer II+ chip.
Open connection:
open connections class class_fields [min_buffer_size
max_buffer_size]
where:
connections is the list of connections to open. The format
is a,c,e-h to open connections a, c, e, f, g, h (e to h).
class is ABR, CBR, VBR, or UBR.
class_fields are fields depending on the class type. See
Section 1.3.1.3, “Host Connection Descriptors,” for details.
min_buffer_size max_buffer_size - These parameters
will be available in a future version of the software. They set
the size of the transmit buffers to use for that connection. If
max = min, this size will be used for all the buffers for that
connection. If max > min, the software will use a random
value between min and max.
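The connection list format used by the open, close and hold commands (for example a,c,e-h) can be parsed with a short C routine. The sketch below is not taken from the host program: the function name parse_con_list and its error conventions are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical parser for a connection list such as "1-3,8".
 * Fills out[] with the expanded connection numbers and returns how many
 * were written, or -1 on a syntax error or when out[] is too small. */
static int parse_con_list(const char *s, unsigned *out, int max_out)
{
    int n = 0;

    assert(s != NULL && out != NULL && max_out >= 0);
    while (*s != '\0') {
        char *end;
        unsigned long lo, hi;

        if (!isdigit((unsigned char)*s))
            return -1;
        lo = strtoul(s, &end, 10);
        hi = lo;
        s = end;
        if (*s == '-') {                     /* range "lo-hi" */
            s++;
            if (!isdigit((unsigned char)*s))
                return -1;
            hi = strtoul(s, &end, 10);
            if (hi < lo)
                return -1;
            s = end;
        }
        for (; lo <= hi; lo++) {
            if (n >= max_out)
                return -1;
            out[n++] = (unsigned)lo;
        }
        if (*s == ',') {
            s++;
            if (*s == '\0')
                return -1;                   /* trailing comma */
        } else if (*s != '\0') {
            return -1;                       /* unexpected character */
        }
    }
    return n;
}
```

Given the list 1-3,8 from Figure 11.1, the routine returns 4 and fills the output array with 1, 2, 3 and 8.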
Close connection:
close connections
where:
connections is the list of connections to close. The
format is a,c,e-h to close connections a, c, e, f, g, h (e to h).
The connections have to be closed. A close_connection
message is sent to the ATMizer II+ chip and no more
buffers for the connections are transmitted.
Hold connections:
hold connections
where:
connections is the list of connections to hold. The format
is a,c,e-h to hold connections a, c, e, f, g, h (e to h).
This command stops the transfer of buffers to the
connections but keeps them open.
Refresh statistics display:
stats S
where:
S is the number of seconds.
This sets the time interval between two statistics screen updates.
11.3.2.3 Send Messages to the APU
The messages the host sends to the APU are:
• open connection
• close connection
• get statistics
See Section 1.3.2.1, “Mailbox,” for details on the messages.
The SendMsgToSAR procedure simply writes the messages (32-bit wide
words) to the memory-mapped mailbox register. No flow control is
performed when accessing the mailbox since the host sends only one
message at a time and waits for an acknowledge before sending another
message.
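A procedure of the kind described above can be very small. The following C sketch is an assumption-laden illustration rather than the actual SendMsgToSAR source: the function name and signature are hypothetical, and the mailbox is represented by a pointer the caller supplies instead of a real memory-mapped address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: writes a message, one 32-bit word at a time, to a
 * memory-mapped mailbox register. The volatile qualifier keeps the
 * compiler from merging or dropping the individual stores. */
static void send_msg_to_sar(volatile uint32_t *mailbox,
                            const uint32_t *words, size_t count)
{
    size_t i;

    assert(mailbox != NULL && words != NULL);
    for (i = 0; i < count; i++)
        *mailbox = words[i];    /* each store hands one word to the mailbox */
}
```

No status check appears between the stores because, as stated above, the host sends one message at a time and waits for the acknowledge before sending the next.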
11.3.2.4 Receive Messages from the APU
This procedure scans the APU-to-host mailbox for the presence of new
messages. As the host has to scan the mailbox regularly (for example to
wait for the acknowledge to the Get Statistics message), it is better to
have the mailbox located in the primary memory rather than to use the
PCI mapped APU-to-host mailbox. Indeed, if the mailbox is located in the
primary memory, writes to the mailbox by the APU will use the PCI Bus
and then consume some bandwidth but they will not occur very often. On
the other hand, reads from the mailbox will be done regularly by the host,
but they won’t need access to the PCI Bus.
Since the messages sent by the APU are only acknowledgments of the
host commands (only one message at a time – no chance to overwrite
a previous message), the APU-to-host mailbox can be a 32-bit location
at a fixed address in the primary memory. The procedure ReadMsg
returns the message content.
11.3.2.5 Open Connections
All connections are opened at startup. The ConnectionList in the host’s
private memory holds one Host Connection Descriptor (HCD) per
connection. This descriptor contains data to be used by the APU to open
the connection and also data to be used by the host to maintain
connection statistics. Since the APU needs only the first 32 bytes of the
connection descriptor, the host copies the relevant bytes from its private
memory to a fixed location in the primary memory. It then sends a
message to the ATMizer II+ chip with the address of the connection
descriptor in the primary memory and sets the HCD Status field to
REQ_OPEN.
The host then waits for the acknowledge from the ATMizer II+ chip before
sending data to that particular connection. When the acknowledge is
received, the host updates the status field in the HCD from REQ_OPEN
to OPEN. It is then possible to send data for that connection.
The host has to execute the following (Figure 11.2):
Figure 11.2 Opening Connections
For (n = 1 to Number of connections)
    Send message to SAR (OPEN_CONNECTION)
    Wait until (Read message from SAR == ACK_OPEN_CONNECTION)
    Connection[n].Status = OPEN
11.3.2.6 Close Connections
This command sends a “CLOSE_CONNECTION n” message to the
APU. Once the message is sent, the HCD status field is changed from
OPEN to REQ_CLOSE. From then on, the host sends no more data for the
connection to the ATMizer II+ chip, but buffers received on it are still
taken into account.
When the acknowledge with the correct connection number(s) is received
from the ATMizer II+ chip, the status field is changed to CLOSED and
any buffers corresponding to the closed connection are discarded by the
host.
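The HCD status handling described in Sections 11.3.2.5 and 11.3.2.6 amounts to a four-state machine. The C sketch below models it under assumptions: the enum and function names are hypothetical, the event names are invented for the example, and a connection that has never been opened is assumed to start in the CLOSED state.

```c
#include <assert.h>

typedef enum { HCD_CLOSED, HCD_REQ_OPEN, HCD_OPEN, HCD_REQ_CLOSE } HcdStatus;

typedef enum {
    EV_SEND_OPEN,    /* host sent the open connection message        */
    EV_ACK_OPEN,     /* APU acknowledged the open                    */
    EV_SEND_CLOSE,   /* host sent the close connection message       */
    EV_ACK_CLOSE     /* APU acknowledged the close                   */
} HcdEvent;

/* Returns the next status; events that do not apply leave it unchanged. */
static HcdStatus hcd_next_status(HcdStatus s, HcdEvent ev)
{
    switch (ev) {
    case EV_SEND_OPEN:  return (s == HCD_CLOSED)    ? HCD_REQ_OPEN  : s;
    case EV_ACK_OPEN:   return (s == HCD_REQ_OPEN)  ? HCD_OPEN      : s;
    case EV_SEND_CLOSE: return (s == HCD_OPEN)      ? HCD_REQ_CLOSE : s;
    case EV_ACK_CLOSE:  return (s == HCD_REQ_CLOSE) ? HCD_CLOSED    : s;
    }
    assert(0 && "unknown event");
    return s;
}

/* The host sends data only on connections whose status is OPEN. */
static int hcd_may_transmit(HcdStatus s)
{
    return s == HCD_OPEN;
}
```

Keeping the transmit check separate from the transitions makes it explicit that no data is sent while a connection is in REQ_OPEN or REQ_CLOSE.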
11.3.2.7 Transmit Buffers to the ATMizer II+ Chip
After memory initialization at startup and when all the open connection
requests have been acknowledged by the APU, the host sends two
buffers per connection to the TxRing. From then on, each time the
host receives a TxDone notification from the APU in the RxRing, it sends
the buffer that has just been “Done” back to the TxRing.
Note that it is necessary to send more than one buffer per connection at
startup to be sure that there will always be data to send to the line
interface. Also, each time a buffer is sent, the HCD BytesSent field is
incremented with the size of the buffer.
11.3.2.8 Receive Buffers from the ATMizer II+ Chip
Each time the host receives a new buffer from the APU in the RxRing, it:
• extracts the connection number from the BFD,
• checks the status bits of the BFD_Ctrl field and updates the statistics
  field for the corresponding connection number (“BadBufs”),
• increments the number of bytes received for that connection
  (“BytesRec”), and
• writes the buffer number and three control bits (valid, BFS_BuffLarge
  and BFS_BuffFree) into the RxMbx to free it.
No error checking is done on the data received.
Note that, to maintain a constant flow, the received buffers are not looped
back to the ATMizer II+ chip by the host. Only the Tx buffers are
resent, as soon as the TxDone notification is found in the RxRing (see
Section 11.3.2.7).
11.3.2.9 Request Statistics from the ATMizer II+ Chip
To request statistics from the ATMizer II+ chip, the host sends a Get
Statistics command with a pointer to the memory location where the
data will be copied by the APU. See Section 1.3.2.1, “Mailbox.”
This command is issued regularly by the host so that the statistics
display in real time.
11.3.2.10 Display the Statistics
When the host receives the acknowledge to the Get Statistics
command from the APU, this procedure calculates and displays the
following values:
• effective rate per connection
• global rate, all connections considered
• bad buffers per connection
• bytes sent/received per connection
• cells sent/received per connection
• PDUs sent/received per connection
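The guide does not define how the effective rate is computed. A common approach, shown here purely as an assumption, is to take the change in a connection's byte counter between two statistics updates and divide by the update interval. The function name and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: effective rate in bits per second, from the byte
 * counter values at two successive statistics updates and the time
 * between those updates in seconds. */
static double effective_rate_bps(uint64_t bytes_prev, uint64_t bytes_now,
                                 double interval_s)
{
    assert(interval_s > 0.0 && bytes_now >= bytes_prev);
    return (double)(bytes_now - bytes_prev) * 8.0 / interval_s;
}
```

With a one-second interval (stats 1), a counter that advances by 1,250,000 bytes corresponds to 10 Mbit/s. The global rate would be the sum of this value over all connections.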
Customer Feedback
We would appreciate your feedback on this document. Please copy the
following page, add your comments, and fax it to us at the number
shown.
If appropriate, please also fax copies of any marked-up pages from this
document.
Important:
Please include your name, phone number, fax number, and
company address so that we may contact you directly for
clarification or additional information.
Thank you for your help in improving the quality of our documents.
Reader’s Comments
Fax your comments to:
LSI Logic Corporation
Technical Publications
M/S E-198
Fax: 408.433.4333
Please tell us how you rate this document: L64364 ATMizer® II+
ATM-SAR Chip Programming Guide. Place a check mark in the appropriate blank for each category.
                                          Excellent  Good  Average  Fair  Poor
Completeness of information               ____       ____  ____     ____  ____
Clarity of information                    ____       ____  ____     ____  ____
Ease of finding information               ____       ____  ____     ____  ____
Technical content                         ____       ____  ____     ____  ____
Usefulness of examples and illustrations  ____       ____  ____     ____  ____
Overall manual                            ____       ____  ____     ____  ____
What could we do to improve this document?
If you found errors in this document, please specify the error and page
number. If appropriate, please fax a marked-up copy of the page(s).
Please complete the information below so that we may contact you
directly for clarification or additional information.
Name
Telephone
Title
Department
Company Name
Street
City, State, Zip
Date
Fax
Mail Stop
U.S. Distributors
by State
A. E.
Avnet Electronics
http://www.hh.avnet.com
B. M.
Bell Microproducts,
Inc. (for HAB’s)
http://www.bellmicro.com
I. E.
Insight Electronics
http://www.insight-electronics.com
W. E.
Wyle Electronics
http://www.wyle.com
Alabama
Daphne
I. E.
Tel: 334.626.6190
Huntsville
A. E.
Tel: 256.837.8700
I. E.
Tel: 256.830.1222
W. E. Tel: 800.964.9953
Alaska
A. E.
Tel: 800.332.8638
Arkansas
W. E. Tel: 972.235.9953
Arizona
Phoenix
A. E.
Tel: 480.736.7000
B. M.
Tel: 602.267.9551
W. E. Tel: 800.528.4040
Tempe
I. E.
Tel: 480.829.1800
Tucson
A. E.
Tel: 520.742.0515
California
Agoura Hills
B. M.
Tel: 818.865.0266
Irvine
A. E.
Tel: 949.789.4100
B. M.
Tel: 949.470.2900
I. E.
Tel: 949.727.3291
W. E. Tel: 800.626.9953
Los Angeles
A. E.
Tel: 818.594.0404
W. E. Tel: 800.288.9953
Sacramento
A. E.
Tel: 916.632.4500
W. E. Tel: 800.627.9953
San Diego
A. E.
Tel: 858.385.7500
B. M.
Tel: 858.597.3010
I. E.
Tel: 800.677.6011
W. E. Tel: 800.829.9953
San Jose
A. E.
Tel: 408.435.3500
B. M.
Tel: 408.436.0881
I. E.
Tel: 408.952.7000
Santa Clara
W. E. Tel: 800.866.9953
Woodland Hills
A. E.
Tel: 818.594.0404
Westlake Village
I. E.
Tel: 818.707.2101
Colorado
Denver
A. E.
Tel: 303.790.1662
B. M.
Tel: 303.846.3065
W. E. Tel: 800.933.9953
Englewood
I. E.
Tel: 303.649.1800
Connecticut
Cheshire
A. E.
Tel: 203.271.5700
I. E.
Tel: 203.272.5843
Wallingford
W. E. Tel: 800.605.9953
Delaware
North/South
A. E.
Tel: 800.526.4812
Tel: 800.638.5988
B. M.
Tel: 302.328.8968
W. E. Tel: 856.439.9110
Florida
Altamonte Springs
B. M.
Tel: 407.682.1199
I. E.
Tel: 407.834.6310
Boca Raton
I. E.
Tel: 561.997.2540
Clearwater
I. E.
Tel: 727.524.8850
Fort Lauderdale
A. E.
Tel: 954.484.5482
W. E. Tel: 800.568.9953
Miami
B. M.
Tel: 305.477.6406
Orlando
A. E.
Tel: 407.657.3300
W. E. Tel: 407.740.7450
Tampa
W. E. Tel: 800.395.9953
St. Petersburg
A. E.
Tel: 727.507.5000
Georgia
Atlanta
A. E.
Tel: 770.623.4400
B. M.
Tel: 770.980.4922
W. E. Tel: 800.876.9953
Duluth
I. E.
Tel: 678.584.0812
Hawaii
A. E.
Tel: 800.851.2282
Idaho
A. E.
Tel: 801.365.3800
W. E. Tel: 801.974.9953
Illinois
North/South
A. E.
Tel: 847.797.7300
Tel: 314.291.5350
Chicago
B. M.
Tel: 847.413.8530
W. E. Tel: 800.853.9953
Schaumburg
I. E.
Tel: 847.885.9700
Indiana
Fort Wayne
I. E.
Tel: 219.436.4250
W. E. Tel: 888.358.9953
Indianapolis
A. E.
Tel: 317.575.3500
Iowa
W. E. Tel: 612.853.2280
Cedar Rapids
A. E.
Tel: 319.393.0033
Kansas
W. E. Tel: 303.457.9953
Kansas City
A. E.
Tel: 913.663.7900
Lenexa
I. E.
Tel: 913.492.0408
Kentucky
W. E. Tel: 937.436.9953
Central/Northern/Western
A. E.
Tel: 800.984.9503
Tel: 800.767.0329
Tel: 800.829.0146
Louisiana
W. E. Tel: 713.854.9953
North/South
A. E.
Tel: 800.231.0253
Tel: 800.231.5575
Maine
A. E.
Tel: 800.272.9255
W. E. Tel: 781.271.9953
Maryland
Baltimore
A. E.
Tel: 410.720.3400
W. E. Tel: 800.863.9953
Columbia
B. M.
Tel: 800.673.7461
I. E.
Tel: 410.381.3131
Massachusetts
Boston
A. E.
Tel: 978.532.9808
W. E. Tel: 800.444.9953
Burlington
I. E.
Tel: 781.270.9400
Marlborough
B. M.
Tel: 508.480.9099
Woburn
B. M.
Tel: 781.933.9010
Michigan
Brighton
I. E.
Tel: 810.229.7710
Detroit
A. E.
Tel: 734.416.5800
W. E. Tel: 888.318.9953
Minnesota
Champlin
B. M.
Tel: 800.557.2566
Eden Prairie
B. M.
Tel: 800.255.1469
Minneapolis
A. E.
Tel: 612.346.3000
W. E. Tel: 800.860.9953
St. Louis Park
I. E.
Tel: 612.525.9999
Mississippi
A. E.
Tel: 800.633.2918
W. E. Tel: 256.830.1119
Missouri
W. E. Tel: 630.620.0969
St. Louis
A. E.
Tel: 314.291.5350
I. E.
Tel: 314.872.2182
Montana
A. E.
Tel: 800.526.1741
W. E. Tel: 801.974.9953
Nebraska
A. E.
Tel: 800.332.4375
W. E. Tel: 303.457.9953
Nevada
Las Vegas
A. E.
Tel: 800.528.8471
W. E. Tel: 702.765.7117
New Hampshire
A. E.
Tel: 800.272.9255
W. E. Tel: 781.271.9953
New Jersey
North/South
A. E.
Tel: 201.515.1641
Tel: 609.222.6400
Mt. Laurel
I. E.
Tel: 609.222.9566
Pine Brook
W. E. Tel: 800.862.9953
Parsippany
I. E.
Tel: 973.299.4425
Wayne
W. E. Tel: 973.237.9010
New Mexico
W. E. Tel: 480.804.7000
Albuquerque
A. E.
Tel: 505.293.5119
New York
Hauppauge
I. E.
Tel: 516.761.0960
Long Island
A. E.
Tel: 516.434.7400
W. E. Tel: 800.861.9953
Rochester
A. E.
Tel: 716.475.9130
I. E.
Tel: 716.242.7790
W. E. Tel: 800.319.9953
Smithtown
B. M.
Tel: 800.543.2008
Syracuse
A. E.
Tel: 315.449.4927
North Carolina
Raleigh
A. E.
Tel: 919.859.9159
I. E.
Tel: 919.873.9922
W. E. Tel: 800.560.9953
North Dakota
A. E.
Tel: 800.829.0116
W. E. Tel: 612.853.2280
Ohio
Cleveland
A. E.
Tel: 216.498.1100
W. E. Tel: 800.763.9953
Dayton
A. E.
Tel: 614.888.3313
I. E.
Tel: 937.253.7501
W. E. Tel: 800.575.9953
Strongsville
B. M.
Tel: 440.238.0404
Valley View
I. E.
Tel: 216.520.4333
Oklahoma
W. E. Tel: 972.235.9953
Tulsa
A. E.
Tel: 918.459.6000
I. E.
Tel: 918.665.4664
Oregon
Beaverton
B. M.
Tel: 503.524.0787
I. E.
Tel: 503.644.3300
Portland
A. E.
Tel: 503.526.6200
W. E. Tel: 800.879.9953
Pennsylvania
Mercer
I. E.
Tel: 412.662.2707
Pittsburgh
A. E.
Tel: 412.281.4150
W. E. Tel: 440.248.9996
Philadelphia
A. E.
Tel: 800.526.4812
B. M.
Tel: 215.741.4080
W. E. Tel: 800.871.9953
Rhode Island
A. E.
Tel: 800.272.9255
W. E. Tel: 781.271.9953
South Carolina
A. E.
Tel: 919.872.0712
W. E. Tel: 919.469.1502
South Dakota
A. E.
Tel: 800.829.0116
W. E. Tel: 612.853.2280
Tennessee
W. E. Tel: 256.830.1119
East/West
A. E.
Tel: 800.241.8182
Tel: 800.633.2918
Texas
Austin
A. E.
Tel: 512.219.3700
B. M.
Tel: 512.258.0725
I. E.
Tel: 512.719.3090
W. E. Tel: 800.365.9953
Dallas
A. E.
Tel: 214.553.4300
B. M.
Tel: 972.783.4191
W. E. Tel: 800.955.9953
El Paso
A. E.
Tel: 800.526.9238
Houston
A. E.
Tel: 713.781.6100
B. M.
Tel: 713.917.0663
W. E. Tel: 800.888.9953
Richardson
I. E.
Tel: 972.783.0800
Rio Grande Valley
A. E.
Tel: 210.412.2047
Stafford
I. E.
Tel: 281.277.8200
Utah
Centerville
B. M.
Tel: 801.295.3900
Murray
I. E.
Tel: 801.288.9001
Salt Lake City
A. E.
Tel: 801.365.3800
W. E. Tel: 800.477.9953
Vermont
A. E.
Tel: 800.272.9255
W. E. Tel: 716.334.5970
Virginia
A. E.
Tel: 800.638.5988
W. E. Tel: 301.604.8488
Washington
Kirkland
I. E.
Tel: 425.820.8100
Seattle
A. E.
Tel: 425.882.7000
W. E. Tel: 800.248.9953
West Virginia
A. E.
Tel: 800.638.5988
Wisconsin
Milwaukee
A. E.
Tel: 414.513.1500
W. E. Tel: 800.867.9953
Wauwatosa
I. E.
Tel: 414.258.5338
Wyoming
A. E.
Tel: 800.332.9326
W. E. Tel: 801.974.9953
Sales Offices and Design
Resource Centers
LSI Logic Corporation
Corporate Headquarters
Tel: 408.433.8000
Fax: 408.433.8989
NORTH AMERICA
California
Costa Mesa - Mint Technology
Tel: 949.752.6468
Fax: 949.752.6868
Irvine
♦ Tel: 949.809.4600
Fax: 949.809.4444
Pleasanton Design Center
Tel: 925.730.8800
Fax: 925.730.8700
San Diego
Tel: 858.467.6981
Fax: 858.496.0548
Silicon Valley
♦ Tel: 408.433.8000
Fax: 408.954.3353
Wireless Design Center
Tel: 858.350.5560
Fax: 858.350.0171
Colorado
Boulder
♦ Tel: 303.447.3800
Fax: 303.541.0641
Colorado Springs
Tel: 719.533.7000
Fax: 719.533.7020
Fort Collins
Tel: 970.223.5100
Fax: 970.206.5549
Florida
Boca Raton
Tel: 561.989.3236
Fax: 561.989.3237
Georgia
Alpharetta
Tel: 770.753.6146
Fax: 770.753.6147
Illinois
Oakbrook Terrace
Tel: 630.954.2234
Fax: 630.954.2235
Kentucky
Bowling Green
Tel: 270.793.0010
Fax: 270.793.0040
Maryland
Bethesda
Tel: 301.897.5800
Fax: 301.897.8389
Massachusetts
Waltham
♦ Tel: 781.890.0180
Fax: 781.890.6158
Burlington - Mint Technology
Tel: 781.685.3800
Fax: 781.685.3801
Minnesota
Minneapolis
♦ Tel: 612.921.8300
Fax: 612.921.8399
New Jersey
Red Bank
Tel: 732.933.2656
Fax: 732.933.2643
Cherry Hill - Mint Technology
Tel: 609.489.5530
Fax: 609.489.5531
New York
Fairport
Tel: 716.218.0020
Fax: 716.218.9010
North Carolina
Raleigh
Tel: 919.785.4520
Fax: 919.783.8909
Oregon
Beaverton
Tel: 503.645.0589
Fax: 503.645.6612
Texas
Austin
Tel: 512.388.7294
Fax: 512.388.4171
Plano
♦ Tel: 972.244.5000
Fax: 972.244.5001
Houston
Tel: 281.379.7800
Fax: 281.379.7818
Canada
Ontario
Ottawa
♦ Tel: 613.592.1263
Fax: 613.592.3253
INTERNATIONAL
France
Paris
LSI Logic S.A.
Immeuble Europa
♦ Tel: 33.1.34.63.13.13
Fax: 33.1.34.63.13.19
Germany
Munich
LSI Logic GmbH
♦ Tel: 49.89.4.58.33.0
Fax: 49.89.4.58.33.108
Stuttgart
Tel: 49.711.13.96.90
Fax: 49.711.86.61.428
Italy
Milano
LSI Logic S.P.A.
♦ Tel: 39.039.687371
Fax: 39.039.6057867
Japan
Tokyo
LSI Logic K.K.
♦ Tel: 81.3.5463.7821
Fax: 81.3.5463.7820
Osaka
♦ Tel: 81.6.947.5281
Fax: 81.6.947.5287
Korea
Seoul
LSI Logic Corporation of
Korea Ltd
Tel: 82.2.528.3400
Fax: 82.2.528.2250
The Netherlands
Eindhoven
LSI Logic Europe Ltd
Tel: 31.40.265.3580
Fax: 31.40.296.2109
Singapore
Singapore
LSI Logic Pte Ltd
Tel: 65.334.9061
Fax: 65.334.4749
Tel: 65.835.5040
Fax: 65.732.5047
Sweden
Stockholm
LSI Logic AB
♦ Tel: 46.8.444.15.00
Fax: 46.8.750.66.47
Taiwan
Taipei
LSI Logic Asia, Inc.
Taiwan Branch
Tel: 886.2.2718.7828
Fax: 886.2.2718.8869
United Kingdom
Bracknell
LSI Logic Europe Ltd
♦ Tel: 44.1344.426544
Fax: 44.1344.481039
♦ Sales Offices with
Design Resource Centers
International Distributors
Australia
New South Wales
Reptechnic Pty Ltd
♦ Tel: 612.9953.9844
Fax: 612.9953.9683
Belgium
Acal nv/sa
Tel: 32.2.7205983
Fax: 32.2.7251014
China
Beijing
LSI Logic International
Services Inc.
Tel: 86.10.6804.2534
Fax: 86.10.6804.2521
France
Rungis Cedex
Azzurri Technology France
Tel: 33.1.41806310
Fax: 33.1.41730340
Germany
Haar
EBV Elektronik
Tel: 49.89.4600980
Fax: 49.89.46009840
Munich
Avnet Emg GmbH
Tel: 49.89.45110102
Fax: 49.89.42.27.75
Wuennenberg-Haaren
Peacock AG
Tel: 49.2957.79.1692
Fax: 49.2957.79.9341
Hong Kong
Hong Kong
AVT Industrial Ltd
Tel: 852.2428.0008
Fax: 852.2401.2105
EastEle
Tel: 852.2798.8860
Fax: 852.2305.0640
India
Bangalore
Spike Technologies India
Private Ltd
♦ Tel: 91.80.664.5530
Fax: 91.80.664.9748
Israel
Tel Aviv
Eastronics Ltd
Tel: 972.3.6458777
Fax: 972.3.6458666
Japan
Tokyo
Global Electronics
Corporation
Tel: 81.3.3260.1411
Fax: 81.3.3260.7100
Technical Center
Tel: 81.471.43.8200
Yokohama-City
Macnica Corporation
Tel: 81.45.939.6140
Fax: 81.45.939.6141
The Netherlands
Eindhoven
Acal Nederland b.v.
Tel: 31.40.2.502602
Fax: 31.40.2.510255
Switzerland
Brugg
LSI Logic Sulzer AG
Tel: 41.32.3743232
Fax: 41.32.3743233
Taiwan
Taipei
Avnet-Mercuries
Corporation, Ltd
Tel: 886.2.2516.7303
Fax: 886.2.2505.7391
Lumax International
Corporation, Ltd
Tel: 886.2.2788.3656
Fax: 886.2.2788.3568
Prospect Technology
Corporation, Ltd
Tel: 886.2.2721.9533
Fax: 886.2.2773.3756
Serial Semiconductor
Corporation, Ltd
Tel: 886.2.2579.5858
Fax: 886.2.2570.3123
United Kingdom
Maidenhead
Azzurri Technology Ltd
Tel: 44.1628.826826
Fax: 44.1628.829730
Swindon
EBV Elektronik
Tel: 44.1793.849933
Fax: 44.1793.859555
♦ Sales Offices with
Design Resource Centers