ACT-7000ASC 64-Bit Superscalar Microprocessor

Standard Products
October 9, 2009
www.aeroflex.com/Avionics
FEATURES
■ Full militarized PMC-Sierra RM7000A microprocessor
■ Dual Issue symmetric superscalar microprocessor with
  instruction prefetch optimized for system level
  price/performance
  ● 225, 300, 350 MHz operating frequency
  ● Consult Factory for latest speeds
  ● MIPS IV Superset Instruction Set Architecture
■ High performance interface (RM52xx compatible)
  ● 800 MB per second peak throughput
  ● 100 MHz max. freq., multiplexed address/data
  ● Supports 1/2 clock multipliers (2, 2.5, 3, 3.5, 4,
    4.5, 5, 6, 7, 8, 9)
  ● IEEE 1149.1 JTAG (TAP) boundary scan
■ Integrated primary and secondary caches: all are 4-way
  set associative with 32 byte line size
  ● 16KB instruction
  ● 16KB data: non-blocking and write-back or
    write-through
  ● 256KB on-chip secondary: unified, non-blocking,
    block writeback
■ MIPS IV instruction set
  ● Data PREFETCH instruction allows the processor to
    overlap cache miss latency and instruction execution
  ● Floating point combined multiply-add instruction
    increases performance in signal processing and
    graphics applications
  ● Conditional moves reduce branch frequency
  ● Index address modes (register + register)
■ Embedded supply de-coupling capacitors and additional
  PLL filter components
■ Integrated memory management unit (ACT52xx compatible)
  ● Fully associative joint TLB (shared by I and D
    translations)
  ● 48 dual entries map 96 pages
  ● 4 entry DTLB and 4 entry ITLB
  ● Variable page size (4KB to 16MB in 4x increments)
■ Embedded application enhancements
  ● Specialized DSP integer Multiply-Accumulate
    instruction (MAD/MADU) and three-operand multiply
    instruction (MUL/U)
  ● Per line cache locking in primaries and secondary
  ● Bypass secondary cache option
  ● I&D Test/Break-point (Watch) registers for
    emulation & debug
  ● Performance counter for system and software
    tuning & debug
  ● Ten fully prioritized vectored interrupts: 6 external,
    2 internal, 2 software
  ● Fast Hit-Writeback-Invalidate and Hit-Invalidate
    cache operations for efficient cache management
■ High-performance floating point unit: 700M FLOPS
  maximum
  ● Single cycle repeat rate for common single-precision
    operations and some double-precision operations
  ● Single cycle repeat rate for single-precision
    combined multiply-add operations
  ● Two cycle repeat rate for double-precision multiply
    and double-precision combined multiply-add operations
■ Fully static CMOS design with dynamic power down logic
  ● Standby reduced power mode with WAIT instruction
  ● 3 watts typical @ 1.8V Int., 3.3V I/O, 300MHz
■ 208-lead CQFP, cavity-up package (F17)
■ 208-lead CQFP, inverted footprint (F24), with the same
  pin rotation as the commercial PMC-Sierra RM5261A
[Figure: block diagram. Blocks shown: on-chip 256K byte
secondary cache, 4-way set associative, with secondary tags
for sets A through D; primary instruction cache and primary
data cache, each 4-way set associative, with ITag, DTag,
ITLB and DTLB; store, write, read, pad, prefetch and address
buffers on the A/D bus and pad bus; instruction dispatch unit
with F pipe and M pipe registers and buses; integer register
file, integer control, two adder/shifter/logicals units, load
aligner and integer multiply/divide/MADD unit; floating-point
register file, floating-point control, packer/unpacker,
comparator, multiplier array and MultAdd/Add/Sub/Cvt/Div/Sqrt
unit; joint TLB and co-processor 0; program counter, PC
incrementer and branch PC adder; system/memory control;
PLL/clocks.]
Block Diagram
SCD7000A Rev C 9/9/09
Aeroflex Plainview
DESCRIPTION
The ACT 7000ASC is a highly integrated symmetric
superscalar microprocessor capable of issuing two
instructions each processor cycle. It has two high
performance 64-bit integer units as well as a high
throughput, fully pipelined 64-bit floating point unit. To
keep its multiple execution units running efficiently, the
ACT 7000ASC integrates not only 16KB 4-way set
associative instruction and data caches but backs them up
with an integrated 256KB 4-way set associative secondary
cache as well. For maximum efficiency, the data and secondary
caches are writeback and nonblocking. An RM52XX family
compatible, operating system friendly memory
management unit with a 64/48-entry fully associative TLB
and a high-performance 64-bit system interface supporting
hardware prioritized and vectored interrupts round out the
main features of the processor.
The ACT 7000ASC is ideally suited for high end
embedded control applications such as avionics upgrades,
unmanned aerial/land/underwater vehicle guidance
systems, flight computers, digital mapping systems and
smart munitions. The multiply-accumulate operation is the
core primitive of almost all signal processing algorithms,
allowing the ACT 7000ASC to eliminate the need for a
separate DSP engine in many embedded applications.
HARDWARE OVERVIEW
The ACT 7000ASC offers a high level of integration
targeted at high-performance embedded applications. The
key elements of the ACT 7000ASC are briefly described
below.
CPU Registers
Like all MIPS ISA processors, the ACT 7000ASC CPU
has a simple, clean user visible state consisting of 32
general purpose registers, or GPRs, two special purpose
registers for integer multiplication and division, and a
program counter; there are no condition code bits. Figure 1
shows the user visible state.
Superscalar Dispatch
The ACT 7000ASC has an efficient symmetric
superscalar dispatch unit which allows it to issue up to two
instructions per cycle. For purposes of instruction issue, the
ACT 7000ASC defines four classes of instructions:
integer, load/store, branches, and floating-point. There are
two logical pipelines, the function, or F, pipeline and the
memory, or M, pipeline. Note however that the M pipe can
execute integer as well as memory type instructions.
Table 1 – Instruction Issue Rules
F Pipe                              M Pipe
one of:                             one of:
integer, branch, floating-point,    integer, load/store
integer mul, div
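The issue rules of Table 1 can be read as a simple pairing test: two instructions can share a cycle only if one of them fits the F pipe and the other fits the M pipe. The sketch below is a minimal model of that reading. The class names and the function `can_dual_issue` are illustrative, not part of the processor documentation, and the model deliberately ignores data dependencies and resource conflicts, which the datasheet says can also prevent dual issue.

```python
# Instruction classes each pipe accepts, as listed in Table 1.
F_PIPE = {"integer", "branch", "floating-point", "integer mul/div"}
M_PIPE = {"integer", "load/store"}


def can_dual_issue(first, second):
    """Return True if the two instruction classes can issue together.

    Only the class rules of Table 1 are modelled; dependencies and
    resource conflicts are ignored (an assumption of this sketch).
    """
    for cls in (first, second):
        if cls not in F_PIPE | M_PIPE:
            raise ValueError("unknown instruction class: " + cls)
    # One instruction must go down the F pipe and the other down the M pipe.
    return (first in F_PIPE and second in M_PIPE) or (
        first in M_PIPE and second in F_PIPE
    )


if __name__ == "__main__":
    print(can_dual_issue("floating-point", "load/store"))
    print(can_dual_issue("load/store", "load/store"))
```

Because integer instructions appear in both columns of Table 1, two integer instructions pair up under this model, while two load/store or two floating-point instructions do not.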
[Figure: General Purpose Registers r0 (shown as 0), r1, r2,
... r29, r30, r31; Multiply/Divide Registers HI and LO;
Program Counter PC. Each register is 64 bits wide
(bits 63 to 0).]
Figure 1 – CPU Registers
[Figure: blocks shown are the Instruction Cache, the Dispatch
Unit, the F Pipe IBus and M Pipe IBus, and the four execution
paths FP F Pipe, FP M Pipe, Integer F Pipe and Integer M Pipe.]
Figure 2 – Instruction Issue Paradigm
Figure 2 is a simplification of the pipeline section and
illustrates the basics of the instruction issue mechanism.
The figure illustrates that one F pipe instruction and one
M pipe instruction can be issued concurrently but that two
M pipe or two F pipe instructions cannot be issued. Table 2
specifies more completely the instructions within each
class.
Table 2 – Dual Issue Instruction Classes
integer            load/store        floating-point    branch
add, sub, or,      lw, sw, ld, sd,   fadd, fsub,       beq, bne,
xor, shift, etc.   ldc1, sdc1,       fmult, fmadd,     bCzT, bCzF,
                   mov, movc,        fdiv, fcmp,       j, etc.
                   fmov, etc.        fsqrt, etc.
The symmetric superscalar capability of the
ACT 7000ASC, in combination with its low latency integer
execution units and high-throughput fully pipelined
floating-point execution unit, provides unparalleled
price/performance in computationally intensive embedded
applications.
Pipeline
The logical length of both the F and M pipelines is five
stages with state committing in the register write, or W,
pipe stage. The physical length of the floating-point
execution pipeline is actually seven stages but this is
completely transparent to the user.
Figure 3 shows instruction execution within the
ACT 7000ASC when instructions are issuing
simultaneously down both pipelines. As illustrated in the
figure, up to ten instructions can be executing
simultaneously. This figure presents a somewhat simplistic
view of the processor's operation, however, since the
out-of-order completion of loads, stores, and long latency
floating-point operations can result in there being even
more instructions in process than what is shown.
Note that instruction dependencies, resource conflicts,
and branches result in some of the instruction slots being
occupied by NOPs.
[Figure: instructions I0 through I9 issue in pairs, one pair
per cycle. Each instruction passes through the I, R, A, D and
W stages; each stage takes one cycle and is split into two
phases (1I 2I 1R 2R 1A 2A 1D 2D 1W 2W).]
Phase    Activity
1I-1R:   Instruction cache access
2I:      Instruction virtual to physical address translation
2R:      Register file read, Bypass calculation, Instruction
         decode, Branch address calculation
1A:      Issue or slip decision, Branch decision
1A:      Data virtual address calculation
1A-2A:   Integer add, logical, shift
2A:      Store Align
2A-2D:   Data cache access and load align
1D:      Data virtual to physical address translation
2W:      Register file write
Figure 3 – Pipeline
Table 3 – ALU Operations
Unit       F Pipe                M Pipe
Adder      add, sub              add, sub, data address add
Logic      logic, moves, zero    logic, moves, zero
           shifts (nop)          shifts (nop)
Shifter    non zero shift        non zero shift, store align
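The figure of ten instructions executing simultaneously in the Pipeline section follows from two instructions issuing per cycle into a five-stage logical pipeline. The sketch below reproduces that count under ideal conditions. The function `in_flight` and its zero-based cycle numbering are assumptions made for illustration; slips, NOP slots and the longer floating-point pipeline are not modelled.

```python
# Logical pipe stages named in the Pipeline section.
STAGES = ["I", "R", "A", "D", "W"]
ISSUE_WIDTH = 2  # one F pipe and one M pipe instruction per cycle


def in_flight(cycle, issue_width=ISSUE_WIDTH):
    """List the instruction numbers occupying a stage in the given cycle.

    Assumes ideal issue: a full pair enters the I stage every cycle
    and nothing ever slips or stalls.
    """
    active = []
    for stage_index in range(len(STAGES)):
        issue_cycle = cycle - stage_index  # cycle this stage's pair entered I
        if issue_cycle >= 0:
            first = issue_cycle * issue_width
            active.extend(range(first, first + issue_width))
    return sorted(active)


if __name__ == "__main__":
    # From cycle 4 onward the pipeline is full.
    print(in_flight(4))
```

In cycle 4 the model holds instructions 0 through 9, matching the I0 to I9 labels in Figure 3.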
Integer Unit
Like the ACT 52xx family, the ACT 7000ASC
implements the MIPS IV Instruction Set Architecture, and
is therefore fully upward compatible with applications that
run on processors such as the R4650 and R4700 that
implement the earlier generation MIPS III Instruction Set
Architecture. Additionally, the ACT 7000ASC includes
two implementation specific instructions not found in the
baseline MIPS IV ISA, but that are useful in the embedded
market place. Described in detail in a later section of this
datasheet, these instructions are integer
multiply-accumulate and three-operand integer multiply.
The ACT 7000ASC integer unit includes thirty-two
general purpose 64-bit registers, the HI/LO result registers
for the two-operand integer multiply/divide
operations, and the program counter, or PC. There are two
separate execution units, one of which can execute
function, or F, type instructions and one which can execute
memory, or M, type instructions. See above for a
description of the instruction types and the issue rules. As
a special case, integer multiply/divide instructions as well
as their corresponding MFHi and MFLo instructions can
only be executed in the F type execution unit. Within each
execution unit the operational characteristics are the same
as on previous MIPS designs with single cycle ALU
operations (add, sub, logical, shift), one cycle load delay,
and an autonomous multiply/divide unit.
Integer Multiply/Divide
The ACT 7000ASC has a single dedicated integer
multiply/divide unit optimized for high-speed multiply and
multiply-accumulate operations. The multiply/divide unit
resides in the F type execution unit. Table 4 shows the
performance of the multiply/divide unit on each operation.
Table 4 – Integer Multiply / Divide Operations
Opcode           Operand Size   Latency   Repeat Rate   Stall Cycles
MULT/U, MAD/U    16 bit         4         3             0
                 32 bit         5         4             0
MUL              16 bit         4         3             2
                 32 bit         5         4             3
DMULT, DMULTU    any            9         8             0
DIV, DIVU        any            36        36            0
DDIV, DDIVU      any            68        68            0
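The latency and repeat rate columns of Table 4 can be combined into a rough cycle estimate for a run of back-to-back operations. The sketch below assumes the usual pipelining reading, in which the repeat rate is the spacing between successive issues of the same operation and the latency is the time until a result is available. That interpretation, the function name `back_to_back_cycles`, and the dictionary layout are assumptions of this example; only the numbers come from Table 4.

```python
# (latency, repeat rate) in processor cycles for MAD/MADU, from Table 4,
# keyed by operand size in bits.
MAD_TIMING = {
    16: (4, 3),
    32: (5, 4),
}


def back_to_back_cycles(operand_bits, count):
    """Estimate cycles for `count` consecutive MAD operations.

    Model: the first result arrives after `latency` cycles and each
    further operation adds one repeat interval.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    latency, repeat = MAD_TIMING[operand_bits]
    return latency + (count - 1) * repeat


if __name__ == "__main__":
    for bits in (16, 32):
        print(bits, back_to_back_cycles(bits, 8))
```

Under this model an eight-tap accumulation costs fewer cycles with 16-bit operands than with 32-bit operands, which is the benefit of the operand-size detection the datasheet describes.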
The baseline MIPS IV ISA specifies that the results of a
multiply or divide operation be placed in the Hi and Lo
registers. These values can then be transferred to the
general purpose register file using the Move-from-Hi and
Move-from-Lo (MFHI/MFLO) instructions.
In addition to the baseline MIPS IV integer multiply
instructions, the ACT 7000ASC also implements the
3-operand multiply instruction, MUL. This instruction
specifies that the multiply result go directly to the integer
register file rather than the Lo register. The portion of the
multiply that would have normally gone into the Hi register
is discarded. For applications where it is known that the
upper half of the multiply result is not required, using the
MUL instruction eliminates the necessity of executing an
explicit MFLO instruction.
Also included in the ACT 7000ASC are the
multiply-add instructions MAD/MADU. These instructions
multiply two operands and add the resulting product to
the current contents of the Hi and Lo registers. The
multiply-accumulate operation is the core primitive of
almost all signal processing algorithms, allowing the
ACT 7000ASC to eliminate the need for a separate DSP
engine in many embedded applications.
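The difference between the three-operand multiply and the multiply-accumulate can be illustrated with a small software model. The sketch below assumes 32-bit signed operands and treats Hi and Lo as the upper and lower 32-bit halves of a 64-bit accumulator. Those widths are assumptions made for illustration, since this section does not state them, and the functions `mad`, `mul` and `dot_product` are hypothetical helpers rather than an instruction-accurate simulator.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(value):
    """Interpret the low 32 bits of value as a two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def mad(hi, lo, rs, rt):
    """Add the signed product rs * rt to the Hi:Lo pair; return (hi, lo)."""
    acc = ((hi & MASK32) << 32) | (lo & MASK32)
    acc = (acc + to_signed32(rs) * to_signed32(rt)) & MASK64
    return (acc >> 32) & MASK32, acc & MASK32


def mul(rs, rt):
    """Three-operand multiply: keep the low half, discard the Hi portion."""
    return (to_signed32(rs) * to_signed32(rt)) & MASK32


def dot_product(xs, ys):
    """Accumulate a dot product with one multiply-accumulate per element."""
    hi = lo = 0
    for x, y in zip(xs, ys):
        hi, lo = mad(hi, lo, x, y)
    return (hi << 32) | lo


if __name__ == "__main__":
    print(dot_product([1, 2, 3], [4, 5, 6]))
    print(mul(3, 7))
```

The `mul` helper returns its result directly, mirroring how MUL removes the need for a separate MFLO, while `dot_product` shows the accumulate loop that MAD/MADU are intended to speed up.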
Register File
The ACT 7000ASC has thirty-two general purpose
registers with register location (r0) hard wired to zero
value. These registers are used for scalar integer operations
and address calculation. In order to service the two integer
execution units, the register file has four read ports and two
write ports and is fully bypassed both within and between
the two execution units to minimize operation latency in
the pipeline.
ALU
The ACT 7000ASC has two complete integer ALUs,
each consisting of an integer adder/subtractor, a logic unit,
and a shifter. Table 3 shows the functions performed by the
ALUs for each execution unit. Each of these units is
optimized to perform all operations in a single processor
cycle.
By pipelining the multiply-accumulate function and
dynamically determining the size of the input operands, the
ACT 7000ASC is able to maximize throughput while still
using an area efficient implementation.
Floating-Point Coprocessor
The ACT 7000ASC incorporates a high-performance
fully pipelined floating-point coprocessor which includes
a floating-point register file and autonomous execution
units for multiply/add/convert and divide/square root. The
floating-point coprocessor is a tightly coupled
co-execution unit, decoding and executing instructions in
parallel with, and in the case of floating-point loads and
stores, in cooperation with the M pipe of the integer unit.
As described earlier, the superscalar capabilities of the
ACT 7000ASC allow floating-point computation
instructions to issue concurrently with integer instructions.
Table 5 – Floating Point Latencies and Repeat Rates
Operation      Latency           Repeat Rate
               (single/double)   (single/double)
fadd           4                 1
fsub           4                 1
fmult          4/5               1/2
fmadd          4/5               1/2
fmsub          4/5               1/2
fdiv           21/36             19/34
fsqrt          21/36             19/34
frecip         21/36             19/34
frsqrt         38/68             36/66
fcvt.s.d       4                 1
fcvt.s.w       6                 3
fcvt.s.l       6                 3
fcvt.d.s       4                 1
fcvt.d.w       4                 1
fcvt.d.l       4                 1
fcvt.w.s       4                 1
fcvt.w.d       4                 1
fcvt.l.s       4                 1
fcvt.l.d       4                 1
fcmp           1                 1
fmov, fmovc    1                 1
fabs, fneg     1                 1
Floating-Point Unit
The ACT 7000ASC floating-point execution unit
supports single and double precision arithmetic, as
specified in the IEEE Standard 754. The execution unit is
broken into a separate divide/square root unit and a
pipelined multiply/add unit. Overlap of divide/square root
and multiply/add is supported.
The ACT 7000ASC maintains fully precise
floating-point exceptions while allowing both overlapped
and pipelined operations. Precise exceptions are extremely
important in object-oriented programming environments
and highly desirable for debugging in any environment.
The floating-point unit’s operation set includes
floating-point add, subtract, multiply, multiply-add, divide,
square root, reciprocal, reciprocal square root, conditional
moves, conversion between fixed-point and floating-point
format, conversion between floating-point formats, and
floating-point compare. Table 5 gives the latencies of the
floating-point instructions in internal processor cycles.
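The 700M FLOPS maximum quoted in the feature list is consistent with the repeat rates in Table 5 if a combined multiply-add is counted as two floating-point operations. That counting convention is an assumption of the sketch below, not something the datasheet states, and the function name `peak_mflops` is illustrative.

```python
def peak_mflops(clock_mhz, repeat_rate_cycles, flops_per_instruction=2):
    """Peak MFLOPS for a stream of identical floating-point instructions.

    flops_per_instruction defaults to 2, counting a multiply-add as one
    multiply plus one add (an assumption of this sketch).
    """
    return clock_mhz * flops_per_instruction / repeat_rate_cycles


if __name__ == "__main__":
    # Single-precision fmadd: repeat rate 1; double-precision: repeat rate 2.
    print(peak_mflops(350, 1))
    print(peak_mflops(350, 2))
```

With the 350 MHz speed grade and the single-cycle repeat rate of single-precision fmadd, the model gives the quoted peak; the two-cycle repeat rate of double-precision fmadd halves it.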
Floating-Point General Register File
The floating-point general register file, FGR, is made up
of thirty-two 64-bit registers. With the floating-point load
and store double instructions, LDC1 and SDC1, the
floating-point unit can take advantage of the 64-bit wide
data cache and issue a floating-point coprocessor load or
store double-word instruction in every cycle.
The floating-point control register file contains two
registers; one for determining configuration and revision
information for the coprocessor and one for control and
status information. These registers are primarily used for
diagnostic software, exception handling, state saving and
restoring, and control of rounding modes.
To support superscalar operations, the FGR has four read
ports and two write ports, and is fully bypassed to minimize
operation latency in the pipeline. Three of the read ports
and one write port are used to support the combined
multiply-add instruction while the fourth read and second
write port allows a concurrent floating-point load or store
and conditional moves.
System Control Coprocessor (CP0)
The system control coprocessor (CP0) in the MIPS
architecture is responsible for the virtual memory
sub-system, the exception control system, and the
diagnostics capability of the processor. In the MIPS
architecture, the system control coprocessor (and thus the
kernel software) is implementation dependent. For memory
management, the ACT 7000ASC CP0 is logically identical
to that of the RM5200 Family and R5000. For interrupt
exceptions and diagnostics, the ACT 7000ASC is a
superset of the RM5200 Family and R5000 implementing
additional features described later in the sections on
Interrupts, the Test/Breakpoint facility, and the
Performance Counter facility.
The memory management unit controls the virtual
memory system page mapping. It consists of an instruction
address translation buffer, or ITLB, a data address
translation buffer, or DTLB, a Joint TLB, or JTLB, and
coprocessor registers used by the virtual memory mapping
sub-system.
System Control Coprocessor Registers
The ACT 7000ASC incorporates all system control
coprocessor (CP0) registers internally. These registers
provide the path through which the virtual memory
system’s page mapping is examined and modified,
exceptions are handled, and operating modes are controlled
(kernel vs. user mode, interrupts enabled or disabled, cache
features). In addition, the ACT 7000ASC includes
registers to implement a real-time cycle counting facility,
to aid in cache and system diagnostics, and to assist in data
error detection.
To support the non-blocking caches and enhanced
interrupt handling capabilities of the ACT 7000ASC, both
the data and control register spaces of CP0 are supported by
the ACT 7000ASC. In the data register space, that is the
space accessed using the MFC0 and MTC0 instructions,
the ACT 7000ASC supports the same registers as found in
the RM5200, R4000 and R5000 families. In the control
space, that is the space accessed by the previously unused
CTC0 and CFC0 instructions, the ACT 7000ASC supports
five new registers. The first three of these new 32-bit
registers support the enhanced interrupt handling
capabilities and are the Interrupt Control, Interrupt Priority
Level Lo (IPLLO), and Interrupt Priority Level Hi (IPLHI)
registers. These registers are described further in the
section on interrupt handling. The other two registers,
Imprecise Error 1 and Imprecise Error 2, have been added
to help diagnose bus errors which occur on non-blocking
memory references.
Figure 4 shows the CP0 registers.
[Figure: CP0 registers, listed here by register number* and
name. The figure also marks each register as used for memory
management or used for exception processing.]
Data space registers:
  0  Index             1  Random          2  EntryLo0
  3  EntryLo1          4  Context         5  PageMask
  6  Wired             7  Info            8  BadVAddr
  9  Count            10  EntryHi        11  Compare
 12  Status           13  Cause          14  EPC
 15  PRid             16  Config         17  LLAddr
 18  Watch1           19  Watch2         20  Xcontext
 22  Perf Ctr Cntrl   24  Watch Mask     25  Perf Counter
 26  ECC              27  CacheErr       28  TagLo
 29  TagHi            30  ErrorEPC
Control Space Registers:
 18  IPLLO            19  IPLHI          20  IntControl
 26  Imp Error 1      27  Imp Error 2
TLB: entries 0 to 47, including a range of entries protected
from TLBWR.
* Register number
Figure 4 – CP0 Registers
Virtual to Physical Address Mapping
The ACT 7000ASC provides three modes of virtual
addressing:
• user mode
• supervisor mode
• kernel mode
This mechanism is available to system software to
provide a secure environment for user processes. Bits in the
CP0 Status register determine which virtual addressing
mode is used. In the user mode, the ACT 7000ASC
provides a single, uniform virtual address space of 256GB
(2GB in 32-bit mode).
When operating in the kernel mode, four distinct virtual
address spaces, totalling 1024GB (4GB in 32-bit mode),
are simultaneously available and are differentiated by the
high-order bits of the virtual address.
The ACT 7000ASC processor also supports a
supervisor mode in which the virtual address space is
256.5GB (2.5GB in 32-bit mode), divided into three
regions based on the high-order bits of the virtual address.
Figure 5 shows the address space layout for 32-bit
operation.
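The 32-bit kernel mode layout of Figure 5 amounts to a lookup on the high-order bits of the virtual address. The sketch below encodes the five address ranges from that figure as a table and classifies an address against it. The function `classify` and the tuple layout are illustrative choices; only the segment names, address ranges and mapped/unmapped attributes are taken from the figure.

```python
# (low, high, segment, mapped) for each region of Figure 5, 32-bit mode.
SEGMENTS_32BIT = [
    (0x00000000, 0x7FFFFFFF, "kuseg", True),   # user virtual, 2.0GB
    (0x80000000, 0x9FFFFFFF, "kseg0", False),  # cached kernel physical
    (0xA0000000, 0xBFFFFFFF, "kseg1", False),  # uncached kernel physical
    (0xC0000000, 0xDFFFFFFF, "ksseg", True),   # supervisor virtual
    (0xE0000000, 0xFFFFFFFF, "kseg3", True),   # kernel virtual
]


def classify(vaddr):
    """Return (segment name, mapped?) for a 32-bit virtual address."""
    if not 0 <= vaddr <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    for low, high, name, mapped in SEGMENTS_32BIT:
        if low <= vaddr <= high:
            return name, mapped
    raise AssertionError("segment table does not cover the address space")


if __name__ == "__main__":
    for address in (0x00400000, 0x80001000, 0xA0001000, 0xFFFF0000):
        print(hex(address), classify(address))
```

Mapped segments are translated through the TLBs, while the two unmapped segments give the kernel a cached and an uncached window that bypass translation.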
Address range              Segment   Description                              Mapping    Size
0xE0000000 - 0xFFFFFFFF    kseg3     Kernel virtual address space             Mapped     0.5GB
0xC0000000 - 0xDFFFFFFF    ksseg     Supervisor virtual address space         Mapped     0.5GB
0xA0000000 - 0xBFFFFFFF    kseg1     Uncached kernel physical address space   Unmapped   0.5GB
0x80000000 - 0x9FFFFFFF    kseg0     Cached kernel physical address space     Unmapped   0.5GB
0x00000000 - 0x7FFFFFFF    kuseg     User virtual address space               Mapped     2.0GB
Figure 5 – Kernel Mode Virtual Addressing
(32-bit mode)
When the ACT 7000ASC is configured for 64-bit
addressing, the virtual address space layout is an upward
compatible extension of the 32-bit virtual address space
layout.
Joint TLB
For fast virtual-to-physical address translation, the
ACT 7000ASC uses a large, fully associative TLB that
maps virtual pages to their corresponding physical
addresses. As indicated by its name, the joint TLB (JTLB)
is used for both instruction and data translations. The JTLB
is organized as pairs of even/odd entries, and maps a virtual
address and address space identifier into the large, 64GB
physical address space. By default, the JTLB is configured
as 48 pairs of even/odd entries. The 64 even/odd entry
optional configuration is set at boot time.
Two mechanisms are provided to assist in controlling the
amount of mapped space, and the replacement
characteristics of various memory regions. First, the page
size can be configured, on a per-entry basis, to use page
sizes in the range of 4KB to 16MB (in 4X multiples). A
CP0 register, PageMask, is loaded with the desired page
size of a mapping, and that size is stored into the TLB along
with the virtual address when a new entry is written. Thus,
operating systems can create special purpose maps; for
example, a typical frame buffer can be memory mapped
using only one TLB entry.
The second mechanism controls the replacement
algorithm when a TLB miss occurs. The ACT 7000ASC
provides a random replacement algorithm to select a TLB
entry to be written with a new mapping; however, the
processor also provides a mechanism whereby a system
specific number of mappings can be locked into the TLB,
thereby avoiding random replacement. This mechanism
allows the operating system to guarantee that certain pages
are always mapped for performance reasons and for
deadlock avoidance. This mechanism also facilitates the
design of real-time systems by allowing deterministic
access to critical software.
The JTLB also contains information that controls the
cache coherency protocol for each page. Specifically, each
page has attribute bits to determine whether the coherency
algorithm is: uncached, write-back, write-through with
write-allocate, write-through without write-allocate,
write-back with secondary bypass. Note that both of the
write-through protocols bypass the secondary cache since it
does not support writes of less than a complete cache line.
These protocols are used for both code and data on the
ACT 7000ASC with data using write-back or
write-through depending on the application. The
write-through modes support the same efficient frame
buffer handling as the RM5200 Family, R4700 and R5000.
Instruction TLB
The ACT 7000ASC uses a 4-entry instruction TLB
(ITLB) to minimize contention for the JTLB, to eliminate
the critical path of translating through a large associative
array, and to save power. Each ITLB entry maps a 4KB
page. The ITLB improves performance by allowing
instruction address translation to occur in parallel with data
address translation. When a miss occurs on an instruction
address translation by the ITLB, the least-recently used
ITLB entry is filled from the JTLB. The operation of the
ITLB is completely transparent to the user.
Data TLB
The ACT 7000ASC uses a 4-entry data TLB (DTLB) for
the same reasons cited above for the ITLB. Each DTLB
entry maps a 4KB page. The DTLB improves performance
by allowing data address translation to occur in parallel
with instruction address translation. When a miss occurs on
a data address translation by the DTLB, the DTLB is filled
from the JTLB. The DTLB refill is pseudo-LRU: the least
recently used entry of the least recently used pair of entries
is filled. The operation of the DTLB is completely
transparent to the user.
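The joint TLB figures quoted in the Joint TLB section (48 or 64 even/odd pairs, page sizes from 4KB to 16MB in 4x steps) determine how much memory can be mapped at once. The sketch below enumerates the page sizes and computes that reach. The helper names `page_sizes` and `jtlb_reach` are illustrative, and the reach calculation assumes every entry uses the same page size, which the hardware does not require.

```python
KB = 1024
MB = 1024 * KB


def page_sizes(smallest=4 * KB, largest=16 * MB, step=4):
    """Enumerate the supported page sizes, each 4x the previous one."""
    sizes = []
    size = smallest
    while size <= largest:
        sizes.append(size)
        size *= step
    return sizes


def jtlb_reach(pairs, page_size):
    """Bytes mapped by a JTLB of even/odd pairs at one uniform page size."""
    return 2 * pairs * page_size


if __name__ == "__main__":
    print([size // KB for size in page_sizes()])  # sizes in KB
    print(jtlb_reach(48, 4 * KB) // KB, "KB")
    print(jtlb_reach(48, 16 * MB) // MB, "MB")
```

The spread between the smallest and largest reach shows why per-entry page sizes matter: a large region such as a frame buffer can be covered by a single large-page entry instead of consuming many 4KB entries.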
Cache Memory
In order to keep the ACT 7000ASC’s superscalar
pipeline full and operating efficiently, the ACT 7000ASC
has integrated primary instruction and data caches with
single cycle access as well as a large unified secondary
cache with a three cycle miss penalty from the primaries.
Each primary cache has a 64-bit read path, a 128-bit write
path, and both caches can be accessed simultaneously. The
primary caches provide the integer and floating-point units
with an aggregate bandwidth of 3.6 GB per second at an
internal clock frequency of 225 MHz. During an instruction
or data primary cache refill, the secondary cache can
provide a 64-bit datum every cycle following the initial
three cycle latency for a peak bandwidth of 2.4 GB per
second at 300 MHz (1.8 GB per second at 225 MHz).
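The bandwidth figures in the Cache Memory section follow from path width multiplied by clock rate. The sketch below reproduces them; the function name `bandwidth_mb_per_s` is illustrative and decimal megabytes are assumed. Note that the 3.6 GB per second aggregate corresponds to two 64-bit read paths at 225 MHz, whereas a single 64-bit path delivers 2.4 GB per second only at a 300 MHz clock.

```python
def bandwidth_mb_per_s(path_bits, clock_mhz, paths=1):
    """Peak bandwidth in MB/s: bytes per transfer times transfers per second."""
    return paths * (path_bits // 8) * clock_mhz


if __name__ == "__main__":
    # Both primary caches read 64 bits per cycle at 225 MHz.
    print(bandwidth_mb_per_s(64, 225, paths=2))
    # Secondary cache refill: one 64-bit datum per cycle.
    print(bandwidth_mb_per_s(64, 300))
    print(bandwidth_mb_per_s(64, 225))
    # System interface: 64 bits at the 100 MHz maximum bus frequency.
    print(bandwidth_mb_per_s(64, 100))
```

The last case matches the 800 MB per second peak throughput quoted for the system interface in the feature list.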
Instruction Cache
The ACT 7000ASC has an integrated 16KB, four-way
set associative instruction cache and, even though
instruction address translation is done in parallel with the
cache access, the combination of 4-way set associativity
and 16KB size results in a cache which is virtually indexed
and physically tagged. Since the effective physical index
eliminates the potential for virtual aliases in the cache, it is
possible that some operating system code can be simplified
as compared with the RM5200 Family, R5000 and R4000
class processors.
The data array portion of the instruction cache is 64 bits
wide and protected by word parity while the tag array holds
a 24-bit physical address, 14 housekeeping bits, a valid bit,
and a single bit of parity protection.
By accessing 64 bits per cycle, the instruction cache is
able to supply two instructions per cycle to the superscalar
dispatch unit. For signal processing, graphics, and other
numerical code sequences where a floating-point load or
store and a floating-point computation instruction are being
issued together in a loop, the entire bandwidth available
from the instruction cache will be consumed by instruction
issue. For typical integer code mixes, where instruction
dependencies and other resource constraints restrict the
achievable parallelism, the extra instruction cache
bandwidth is used to fetch both the taken and non-taken
branch paths to minimize the overall penalty for branches.
A 32-byte (eight instruction) line size is used to maximize
the communication efficiency between the instruction
cache and the secondary cache, or memory system.
The ACT 7000ASC is the first MIPS RISC
microprocessor to support cache locking on a per line basis.
The contents of each line of the cache can be locked by
setting a bit in the Tag. Locking the line prevents its
contents from being overwritten by a subsequent cache
miss. Refill will occur only into unlocked cache lines. This
mechanism allows the programmer to lock critical code
into the cache thereby guaranteeing deterministic behavior
for the locked code sequence.
Data Cache
The ACT 7000ASC has an integrated 16KB, four-way
set associative data cache, and even though data address
translation is done in parallel with the cache access, the
combination of 4-way set associativity and 16KB size
results in a cache which is physically indexed and
physically tagged. Since the effective physical index
eliminates the potential for virtual aliases in the cache, it is
possible that some operating system code can be simplified
compared to the RM5200 Family, R5000 and R4000 class
processors. The data cache is non-blocking; that is, a miss
in the data cache will not necessarily stall the processor
pipeline. As long as no instruction is encountered which is
dependent on the data reference which caused the miss, the
pipeline will continue to advance. Once there are two cache
misses outstanding, the processor will stall if it encounters
another load or store instruction. A 32-byte (eight word)
line size is used to maximize the communication efficiency
between the data cache and the secondary cache or memory
system. The data array portion of the data cache is 64 bits
wide and protected by byte parity while the tag array holds
a 24-bit physical address, 3 housekeeping bits, a two bit
cache state field, and has two bits of parity protection. The
normal write policy is write-back, which means that a store
to a cache line does not immediately cause memory to be
updated. This increases system performance by reducing
bus traffic and eliminating the bottleneck of waiting for
each store operation to finish before issuing a subsequent
memory operation. Software can, however, select
write-through on a per-page basis when appropriate, such
as for frame buffers. Cache protocols supported for the data
cache are:
1. Uncached. Reads to addresses in a memory area
   identified as uncached will not access the cache.
   Writes to such addresses will be written directly to
   main memory without updating the cache.
2. Write-back. Loads and instruction fetches will first
   search the cache, reading the next memory hierarchy
   level only if the desired data is not cache resident. On
   data store operations, the cache is first searched to
   determine if the target address is cache resident. If it
   is resident, the cache contents will be updated, and
   the cache line marked for later write-back. If the
   cache lookup misses, the target line is first brought
   into the cache and then the write is performed as
   above.
3. Write-through with write allocate. Loads and
   instruction fetches will first search the cache, reading
   from memory only if the desired data is not cache
   resident; write-through data is never cached in the
   secondary cache. On data store operations, the cache
   is first searched to determine if the target address is
   cache resident. If it is resident, the primary cache
   contents will be updated and main memory will also
   be written leaving the write-back bit of the cache line
   unchanged; no writes will occur into the secondary.
   If the cache lookup misses, the target line is first
   brought into the cache and then the write is
   performed as above.
4. Write-through without write allocate. Loads and
   instruction fetches will first search the cache, reading
   from memory only if the desired data is not cache
   resident; write-through data is never cached in the
   secondary. On data store operations, the cache is first
   searched to determine if the target address is cache
   resident. If it is resident, the cache contents will be
   updated and main memory will also be written
   leaving the write-back bit of the cache line
   unchanged; no writes will occur into the secondary.
   If the cache lookup misses, then only main memory
   is written.
5. Write-back with secondary bypass. Loads and
   instruction fetches first search the primary cache,
   reading from memory only if the desired data is not
   resident; the secondary is not searched. On data store
   operations, the primary cache is first searched to
   determine if the target address is resident. If it is
   resident, the cache contents are updated, and the
   cache line marked for later write-back. If the cache
   lookup misses, the target line is first brought into the
   cache and then the write is performed as above.
Associated with the Data Cache is the store buffer. When
the ACT 7000ASC executes a STORE instruction, this
single-entry buffer gets written with the store data while the
tag comparison is performed. If the tag matches, then the
data is written into the Data Cache in the next cycle that the
Data Cache is not accessed (the next non-load cycle). The
store buffer allows the ACT 7000ASC to execute a store
every processor cycle and to perform back-to-back stores
without penalty. In the event of a store immediately
followed by a load to the same address, a combined merge
and cache write will occur such that no penalty is incurred.
Secondary Cache
The ACT 7000ASC has an integrated 256KB, four-way
set associative, block write-back secondary cache. The
secondary has the same line size as the primaries, 32 bytes,
is logically 64-bits wide matching the system interface and
primary widths, and is protected with doubleword parity.
The secondary tag array holds a 20-bit physical address, 2
housekeeping bits, a three bit cache state field, and two
parity bits.
By integrating a secondary cache, the ACT 7000ASC is
able to dramatically decrease the latency of a primary cache
miss without dramatically increasing the number of pins
and the amount of power required by the processor. From a
technology point of view, integrating a secondary cache
maximally leverages CMOS semiconductor technology by
using silicon to build the structures that are most amenable
to silicon technology; silicon is being used to build very
dense, low power memory arrays rather than large power
hungry I/O buffers.
Further benefits of an integrated secondary are flexibility
in the cache organization and management policies that are
not practical with an external cache. Two previously
mentioned examples are the 4-way associativity and
write-back cache protocol.
A third management policy for which integration affords
flexibility is cache hierarchy management. With multiple
levels of cache, it is necessary to specify a policy for
dealing with cases where two cache lines at level n of the
hierarchy would, if possible, be sharing an entry in level
n+1 of the hierarchy. The policy followed by the
ACT 7000ASC is motivated by the desire to get maximum
cache utility and results in the ACT 7000ASC allowing
entries in the primaries which do not necessarily have a
corresponding entry in the secondary; the ACT 7000ASC
does not force the primaries to be a subset of the secondary.
For example, if primary cache line A is being filled and a
cache line already exists in the secondary for primary cache
line B at the location where primary A’s line would reside
then that secondary entry will be replaced by an entry
corresponding to primary cache line A and no action will
occur in the primary for cache line B. This operation will
create the aforementioned scenario where the primary
cache line which initially had a corresponding secondary
entry will no longer have such an entry. Such a primary line
is called an orphan. In general, cache lines at level n+1 of
the hierarchy are called parents of level n’s children.
Another ACT 7000ASC cache management
optimization occurs for the case of a secondary cache line
replacement where the secondary line is dirty and has a
corresponding dirty line in the primary. In this case, since
it is permissible to leave the dirty line in the primary, it is
not necessary to write the secondary line back to main
memory. Taking this scenario one step further, a final
optimization occurs when the aforementioned dirty
primary line is replaced by another line and must be written
back; in this case, it will be written directly to memory,
bypassing the secondary cache.
Secondary Caching Protocols
Unlike the primary data cache, the secondary cache
supports only uncached and block write-back. As noted
earlier, cache lines managed with either of the
write-through protocols will not be placed in the secondary
cache. A new caching attribute, write-back with secondary
bypass, allows the secondary to be bypassed entirely.
When this attribute is selected, the secondary will not be
filled on load misses and will not be written on dirty
write-backs from the primary.
Table 6 – Cache Attributes
Attribute                Instruction           Data                                 Secondary
Size                     16KB                  16KB                                 256KB
Associativity            4-way                 4-way                                4-way
Replacement algorithm    cyclic                cyclic                               cyclic
Line size                32 byte               32 byte                              32 byte
Index                    vAddr 11..0           vAddr 11..0                          pAddr 15..0
Tag                      pAddr 35..12          pAddr 35..12                         pAddr 35..16
Write policy             n.a.                  write-back, write-through            block write-back, bypass
Read policy              n.a.                  non-blocking (2 outstanding)         non-blocking (data only, 2 outstanding)
Read order               critical word first   critical word first                  critical word first
Write order              n.a.                  sequential                           sequential
Miss restart following:  complete line         first double (if waiting for data)   n.a.
Parity                   per word              per byte                             per doubleword
Cache Locking
The ACT 7000ASC allows critical code or data
fragments to be locked into the primary and secondary
caches. The user has complete control over what locking is
performed with cache line granularity. For instruction and
data fragments in the primaries, locking is accomplished by
setting either or both of the cache lock enable bits in the
CP0 ECC register, specifying the set via a field in the CP0
ECC register, and then executing either a load instruction
or a Fill_I cache operation for data or instructions
respectively. Only two sets are lockable within each cache:
set A and set B. Locking within the secondary works
identically to the primaries using a separate secondary lock
enable bit and the same set selection field. As with the
primaries, only two sets are lockable: sets A and B. Table 7
summarizes the cache locking capabilities.
Table 7 – Cache Locking Control
Cache       Lock Enable   Set Select                 Activate
Primary I   ECC[27]       ECC[28]=0→A, ECC[28]=1→B   Fill_I
Primary D   ECC[26]       ECC[28]=0→A, ECC[28]=1→B   Load/Store
Secondary   ECC[25]       ECC[28]=0→A, ECC[28]=1→B   Fill_I or Load/Store
Table 8 – Penalty Cycle
Operation                  Condition   Penalty: ACT 7000ASC   Penalty: R4000/R5000
Hit-Writeback-Invalidate   Miss        0                      7
                           Hit-Clean   3                      12
                           Hit-Dirty   3+n                    14+n
Hit-Invalidate             Miss        0                      7
                           Hit         2                      9
For the Hit-Dirty case of Hit-Writeback-Invalidate, if the
writeback buffer is full from some previous cache eviction
then n is the number of cycles required to empty the
write-back buffer. If the buffer is empty then n is zero.
The penalty value is the number of processor cycles
beyond the one cycle required to issue the instruction that
is required to implement the operation.
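As a rough illustration of the cache locking controls in Table 7, the sketch below composes a CP0 ECC register image for a lock request. The bit positions are those listed in Table 7; the function and dictionary names are hypothetical, and writing the value to CP0 and issuing the subsequent load or Fill_I operation are outside the scope of the sketch.

```python
# Bit positions taken from Table 7; all names here are hypothetical.
ECC_LOCK_ENABLE = {"primary_i": 27, "primary_d": 26, "secondary": 25}
ECC_SET_SELECT_BIT = 28  # 0 selects set A, 1 selects set B


def ecc_lock_value(caches, lock_set):
    """Return an ECC register image that enables locking for the given caches."""
    if lock_set not in ("A", "B"):
        raise ValueError("only sets A and B are lockable")
    value = 0
    for cache in caches:
        value |= 1 << ECC_LOCK_ENABLE[cache]
    if lock_set == "B":
        value |= 1 << ECC_SET_SELECT_BIT
    return value


print(hex(ecc_lock_value(["primary_i"], "A")))               # 0x8000000
print(hex(ecc_lock_value(["primary_d", "secondary"], "B")))  # 0x16000000
```

Because a single set selection bit is shared by all three caches, one ECC value can only direct locks to set A or to set B at a time.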
Primary Write Buffer
Writes to secondary cache or external memory, whether
cache miss write-backs or stores to uncached or
write-through addresses, use the integrated primary write
buffer. The write buffer holds up to four 64-bit address and
data pairs. The entire buffer is used for a data cache
write-back and allows the processor to proceed in parallel
with memory update. For uncached and write-through
stores, the write buffer significantly increases performance
by decoupling the SysAD bus transfers from the instruction
execution stream.
Cache Management
To improve the performance of critical data movement
operations in the embedded environment, the
ACT 7000ASC significantly improves the speed of
operation of certain critical cache management operations
as compared with the R5000 and R4000 families. In
particular, the speed of the Hit-Writeback-Invalidate and
Hit-Invalidate cache operations has been improved in some
cases by an order of magnitude over that of the earlier
families. Table 8 compares the ACT 7000ASC with the
R4000 and R5000 processors.
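To give a feel for what the Table 8 penalties mean in practice, the following sketch totals the cycles needed to apply a cache operation to a run of lines. The penalty figures are those of Table 8 and each operation is charged one issue cycle plus its penalty, as the table note describes; the function name and the 4KB buffer are illustrative assumptions.

```python
# Penalty cycles from Table 8 as (ACT 7000ASC, R4000/R5000).
PENALTY = {
    ("Hit-Writeback-Invalidate", "Miss"): (0, 7),
    ("Hit-Writeback-Invalidate", "Hit-Clean"): (3, 12),
    ("Hit-Writeback-Invalidate", "Hit-Dirty"): (3, 14),  # plus n, see below
    ("Hit-Invalidate", "Miss"): (0, 7),
    ("Hit-Invalidate", "Hit"): (2, 9),
}


def total_cycles(operation, condition, lines=1, n=0):
    """Cycles for `lines` cache operations: one issue cycle plus the penalty each."""
    act, older = PENALTY[(operation, condition)]
    if condition == "Hit-Dirty":
        # n is the number of cycles needed to empty a full write-back buffer.
        act += n
        older += n
    return lines * (act + 1), lines * (older + 1)


# Hypothetical example: a 4KB buffer of clean, resident 32-byte lines.
lines = 4096 // 32
print(total_cycles("Hit-Writeback-Invalidate", "Hit-Clean", lines))  # (512, 1664)
```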
System Interface
The ACT 7000ASC provides a high-performance 64-bit
system interface which is compatible with the RM5200
Family and R5000. Unlike the R4000 and R5000 family
processors which provide only an integral multiplication
factor between SysClock and the pipeline clock, the
ACT 7000ASC also allows half-integral multipliers,
thereby providing greater granularity in the designer’s
choice of pipeline and system interface frequencies.
The interface consists of a 64-bit Address/Data bus with
8 check bits and a 9-bit command bus. In addition, there are
six handshake signals and six interrupt inputs. The
interface has a simple timing specification and is capable of
transferring data between the processor and memory at a
peak rate of 600 MB/sec with a 75 MHz SysClock.
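The peak rate quoted above follows directly from the bus width and clock: 8 bytes per SysClock cycle at 75 MHz. The sketch below reproduces that arithmetic and also estimates the sustained rate for a D/x data rate pattern, assuming (as an interpretation, not a statement from this document) that each D is a cycle carrying one doubleword and each x is an idle cycle. The function names are hypothetical.

```python
def peak_mb_per_s(sysclock_mhz, bus_bytes=8):
    """Peak transfer rate: one full-width transfer on every SysClock cycle."""
    return sysclock_mhz * bus_bytes


def pattern_mb_per_s(sysclock_mhz, pattern, bus_bytes=8):
    """Sustained rate for a data pattern such as 'DDxxDDxx' (D = data, x = idle)."""
    data_cycles = pattern.count("D")
    return sysclock_mhz * bus_bytes * data_cycles / len(pattern)


print(peak_mb_per_s(75))                 # 600
print(pattern_mb_per_s(75, "DDxxDDxx"))  # 300.0
```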
Figure 6 shows a typical embedded system using the
ACT 7000ASC. This example shows a system with a bank
of DRAMs, and an interface ASIC which provides DRAM
control as well as an I/O port.
System Address/Data Bus
The 64-bit System Address Data (SysAD) bus is used to
transfer addresses and data between the ACT 7000ASC
and the rest of the system. It is protected with an 8-bit parity
check bus, SysADC.
The system interface is configurable to allow easy
interfacing to memory and I/O systems of varying
frequencies. The data rate and the bus frequency at which
the ACT 7000ASC transmits data to the system interface
are programmable via boot time mode control bits. Also,
the rate at which the processor receives data is fully
controlled by the external device. Therefore, either a low
cost interface requiring no read or write buffering or a
faster, high-performance interface can be designed to
communicate with the ACT 7000ASC. Again, the system
designer has the flexibility to make these
price/performance trade-offs.
System Command Bus
The ACT 7000ASC interface has a 9-bit System
Command (SysCmd) bus. The command bus indicates
whether the SysAD bus carries an address or data. If the
SysAD bus carries an address, then the SysCmd bus also
indicates what type of transaction is to take place (for
example, a read or write). If the SysAD bus carries data,
then the SysCmd bus also gives information about the data
(for example, this is the last data word transmitted, or the
data contains an error). The SysCmd bus is bidirectional to
support both processor requests and external requests to the
ACT 7000ASC. Processor requests are initiated by the
ACT 7000ASC and responded to by an external device.
External requests are issued by an external device and
require the ACT 7000ASC to respond.
The ACT 7000ASC supports one to eight byte and
32-byte block transfers on the SysAD bus. In the case of a
sub-double-word transfer, the 3 low-order address bits give
the byte address of the transfer, and the SysCmd bus
indicates the number of bytes being transferred.
Handshake Signals
There are six handshake signals on the system interface.
Two of these, RdRdy* and WrRdy*, are used by an
external device to indicate to the ACT 7000ASC whether it
can accept a new read or write transaction. The
ACT 7000ASC samples these signals before deasserting
the address on read and write requests.
ExtRqst* and Release* are used to transfer control of
the SysAD and SysCmd buses from the processor to an
external device. When an external device needs to control
the interface, it asserts ExtRqst*. The ACT 7000ASC
responds by asserting Release* to release the system
interface to slave state.
ValidOut* and ValidIn* are used by the
ACT 7000ASC and the external device respectively to
indicate that there is a valid command or data on the SysAD
and SysCmd buses. The ACT 7000ASC asserts ValidOut*
when it is driving these buses with a valid command or
data, and the external device drives ValidIn* when it has
control of the buses and is driving a valid command or data.
System Interface Operation
The ACT 7000ASC can issue read and write requests to
an external device, while an external device can issue null
and write requests to the ACT 7000ASC.
For processor reads, the ACT 7000ASC asserts
ValidOut* and simultaneously drives the address and read
command on the SysAD and SysCmd buses. If the system
interface has RdRdy* asserted, then the processor tristates
its drivers and releases the system interface to slave state by
asserting Release*. The external device can then begin
sending data to the ACT 7000ASC.
(Block diagram not reproduced. Labeled elements: ACT 7000ASC, SysAD Bus (72), SysCmd, Memory I/O Controller, DRAM, Address, Control, Latch, Flash / Boot ROM, PCI Bus.)
Figure 6 – Typical Embedded System Block Diagram
Figure 7 shows a processor block read request and the
external agent read response for a system with a
transaction.
The read latency is 4 cycles (ValidOut* to ValidIn*),
and the response data pattern is DDxxDD. Figure 8 shows
a processor block write where the processor was
programmed with write-back data rate boot code 2, or
DDxxDDxx.
Data Prefetch
The ACT 7000ASC supports the MIPS IV integer data
prefetch (PREF) and floating-point data prefetch (PREFX)
instructions. These instructions are used by the compiler or
by an assembly language programmer when it is known or
suspected that an upcoming data reference is going to miss
in the cache. By appropriately placing a prefetch
instruction, the memory latency can be hidden under the
execution of other instructions. If the execution of a
prefetch instruction would cause a memory management or
address error exception the prefetch is treated as a NOP.
The “Hint” field of the data prefetch instruction is used
to specify the action taken by the instruction. The
instruction can operate normally (that is, fetching data as if
for a load operation) or it can allocate and fill a cache line
with zeroes on a primary data cache miss.
Enhanced Write Modes
The ACT 7000ASC implements two enhancements to
the original R4000 write mechanism: Write Reissue and
Pipeline Writes. In write reissue mode, a write rate of one
write every two bus cycles can be achieved. A write issues
if WrRdy* is asserted two cycles earlier and is still
asserted during the issue cycle. If it is not still asserted then
the last write will reissue. Pipelined writes have the same
two bus cycle write repeat rate, but can issue one additional
write following the deassertion of WrRdy*.
External Requests
The ACT 7000ASC can respond to certain requests
issued by an external device. These requests take one of
two forms: Write requests and Null requests. An external
device executes a write request when it wishes to update
one of the processor’s writable resources such as the internal
interrupt register. A null request is executed when the
external device wishes the processor to reassert ownership
of the processor external interface. Typically a null request
will be executed after an external device, that has acquired
control of the processor interface via ExtRqst*, has
completed an independent transaction between itself and
system memory in a system where memory is connected
directly to the SysAD bus. Normally this transaction would
be a DMA read or write from the I/O system.
Test / Breakpoint Registers
To increase both observability and controllability of the
processor thereby easing hardware and software
debugging, a pair of Test/Break-point, or Watch, registers,
Watch1 and Watch2, have been added to the
ACT 7000ASC. Each Watch register can be separately
enabled to watch for a load address, a store address, or an
instruction address. All address comparisons are done on
physical addresses. An associated register, Watch Mask,
has also been added so that either or both of the Watch
registers can compare against an address range rather than
a specific address. The range granularity is limited to a
power of two.
When enabled, a match of either Watch register results
in an exception. If the Watch is enabled for a load or store
address then the exception is the Watch exception as
defined for the R4000 with Cause exception code
twenty-three. If the Watch is enabled for instruction
addresses then a newly defined Instruction Watch
exception is taken and the Cause code is sixteen. The
Watch register which caused the exception is indicated by
Cause bits 25..24.
Table 9 summarizes a Watch operation.
Table 9 – Watch Control Register
Register     Bit Field/Function
             63      62     61      60:36   35:2   1:0
Watch1, 2    Store   Load   Instr   0       Addr   0
             31:2    1              0
Watch Mask   Mask    Mask Watch 2   Mask Watch 1
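The Watch comparison described in the Test / Breakpoint Registers text can be sketched as follows. This is only an illustration under stated assumptions: the address field occupies bits 35:2 as in Table 9, a set bit in the Watch Mask value is assumed to mean "ignore this address bit" (the polarity is not stated here), and the function names are hypothetical. The exception codes are the ones given in the text.

```python
ADDR_FIELD = ((1 << 36) - 1) & ~0x3  # address bits 35:2 per Table 9


def watch_hit(paddr, watch_addr, mask=0):
    """Compare a physical address with a Watch address, ignoring masked bits.

    Assumption: a 1 in `mask` excludes that address bit from the comparison,
    which yields a naturally aligned power-of-two range.
    """
    care = ADDR_FIELD & ~mask
    return (paddr & care) == (watch_addr & care)


def watch_exception_code(access):
    """Cause exception code: 23 for load/store watches, 16 for instruction watches."""
    return 16 if access == "instr" else 23


# Watch the 4KB range starting at 0x1000 by masking address bits 11:2.
print(watch_hit(0x1ABC, 0x1000, mask=0xFFC))  # True
print(watch_hit(0x2000, 0x1000, mask=0xFFC))  # False
print(watch_exception_code("store"))          # 23
```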
(Timing diagram not reproduced. Signals shown: SysClock; SysAD carrying Addr, Data0, Data1, Data2, Data3; SysCmd carrying Read, nData, NEOD; ValidOut*; ValidIn*; RdRdy*; WrRdy*; Release*.)
Figure 7 – Processor Block Read
(Timing diagram not reproduced. Signals shown: SysClock; SysAD carrying Addr, Data0, Data1, Data2, Data3; SysCmd carrying Write, NData, NEOD; ValidOut*; ValidIn*; RdRdy*; WrRdy*; Release*.)
Figure 8 – Processor Block Write
Performance Counters
Like the Test/Break-point capability described above,
the Performance Counter feature has been added to
improve the observability and controllability of the
processor thereby easing system debug and, especially in
the case of the performance counters, easing system tuning.
The Performance Counter feature is implemented using
two new CP0 registers, PerfCount and PerfControl. The
PerfCount register is a 32-bit writable counter which
causes an interrupt when bit 31 is set. The PerfControl
register is a 32-bit register containing a five bit field which
selects one of twenty-two event types as well as a handful
of bits which control the overall counting function. Note
that only one event type can be counted at a time and that
counting can occur for user code, kernel code, or both. The
event types and control bits are listed in Table 10.
Table 10 – Performance Counter Control
PerfControl Field   Description
4..0                Event Type
                    00: Clock cycles
                    01: Total instructions issued
                    02: Floating-point instructions issued
                    03: Integer instructions issued
                    04: Load instructions issued
                    05: Store instructions issued
                    06: Dual issued pairs
                    07: Branch prefetches
                    08: External Cache Misses
                    09: Stall cycles
                    0A: Secondary cache misses
                    0B: Instruction cache misses
                    0C: Data cache misses
                    0D: Data TLB misses
                    0E: Instruction TLB misses
                    0F: Joint TLB instruction misses
                    10: Joint TLB data misses
                    11: Branches taken
                    12: Branches issued
                    13: Secondary cache writebacks
                    14: Primary cache writebacks
                    15: Dcache miss stall cycles (cycles where both cache miss tokens taken and a third address is requested)
                    16: Cache misses
                    17: FP possible exception cycles
                    18: Slip cycles due to multiplier busy
                    19: Coprocessor 0 slip cycles
                    1A: Slip cycles due to pending non-blocking loads
                    1B: Write buffer full stall cycles
                    1C: Cache instruction stall cycles
                    1D: Multiplier stall cycles
                    1E: Stall cycles due to pending non-blocking loads - stall start of exception
7..5                Reserved (must be zero)
8                   Count in Kernel Mode
                    0: Disable
                    1: Enable
9                   Count in User Mode
                    0: Disable
                    1: Enable
10                  Count Enable
                    0: Disable
                    1: Enable
31..11              Reserved (must be zero)
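The PerfControl layout in Table 10 can be turned into a small encoder, sketched below. The field positions and event codes come from Table 10; the function name and its keyword arguments are hypothetical, and the sketch only builds the register value, it does not write it to CP0.

```python
def perf_control(event, kernel=False, user=False, enable=True):
    """Build a PerfControl value from the fields listed in Table 10."""
    if not 0x00 <= event <= 0x1E:
        raise ValueError("event type not listed in Table 10")
    value = event                # bits 4..0: event type
    value |= int(kernel) << 8    # bit 8: count in kernel mode
    value |= int(user) << 9      # bit 9: count in user mode
    value |= int(enable) << 10   # bit 10: count enable
    return value                 # bits 7..5 and 31..11 stay zero


# Count data cache misses (event 0C) in user mode only.
print(hex(perf_control(0x0C, user=True)))  # 0x60c
```

Since only one event type can be selected at a time, measuring several events means reprogramming PerfControl and rerunning the workload once per event.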
The performance counter interrupt will only occur when
interrupts are enabled in the Status register, IE=1, and
Interrupt Mask bit 13 (IM[13]) of the coprocessor 0
interrupt control register is not set.
Since the performance counter can be set up to count
clock cycles, it can be used as either a) a second timer or b)
a watchdog interrupt. A watchdog interrupt can be used as
an aid in debugging system or software “hangs.” Typically
the software is set up to periodically update the count so that
no interrupt will occur. When a hang occurs the interrupt
ultimately triggers thereby breaking free from the hang-up.
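For the watchdog use just described, the value that software reloads into PerfCount determines the timeout. Since the interrupt is raised when bit 31 becomes set, a preload of 2^31 minus the desired number of clock cycles gives that timeout, assuming the counter increments by one per counted clock cycle. The helper name and the 300 MHz, 1 ms figures below are illustrative assumptions.

```python
def watchdog_preload(timeout_cycles):
    """PerfCount value that reaches bit 31 after `timeout_cycles` counted cycles."""
    if not 0 < timeout_cycles <= 1 << 31:
        raise ValueError("timeout must fit below bit 31")
    return (1 << 31) - timeout_cycles


# Hypothetical example: a 1 ms watchdog on a 300 MHz pipeline clock.
cycles = 300_000_000 // 1000
print(hex(watchdog_preload(cycles)))  # 0x7ffb6c20
```

Software would rewrite PerfCount with this value more often than the timeout period so that bit 31 is never reached during normal operation.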
Interrupt Handling
In order to provide better real time interrupt handling, the
ACT 7000ASC provides an extended set of hardware
interrupts each of which can be separately prioritized and
separately vectored.
As described above, the performance counter is also a
hardware interrupt source, IP[13]. Also, whereas the
R4000 and R5000 family processors map the timer
interrupt onto IP[7], the ACT 7000ASC provides a
separate interrupt, IP[12], for this purpose.
All of these interrupts, IP[13..0], the Performance
Counter, and the Timer, have corresponding interrupt mask
bits, IM[13..0], and interrupt pending bits, IP[13..0], in the
Status, Interrupt Control, and Cause registers. The bit
assignments for the Interrupt Control and Cause registers
are shown in Table 11 and Table 12 below. The Status
register has not changed from the RM5200 Family and
R5000, and is not shown.
The IV bit in the Cause register is the global enable bit
for the enhanced interrupt features. If this bit is clear then
interrupt operation is compatible with the RM5200 Family
and R5000. Although not related to the interrupt
mechanism, note that the W1 and W2 bits indicate which
Watch register caused a particular Watch exception.
In the Interrupt Control register, the interrupt vector
spacing is controlled by the Spacing field as described
below. The Interrupt Mask field (IM[15..8]) contains the
interrupt mask for interrupts eight through thirteen.
IM[15..14] are reserved for future use.
The Timer Enable (TE) bit is used to gate the Timer
Interrupt to the Cause Register. If TE is set to 0, the Timer
Interrupt is not gated to IP12. If TE is set to 1, the Timer
Interrupt is gated to IP12.
The setting for Mode Bit 11 is used to determine if the
Timer Interrupt replaces the external interrupt (INT5*) as
an input to IP7 in the Cause Register. If Mode Bit 11 is set
to 0, the Timer Interrupt is gated to IP7. If Mode Bit 11 is
set to 1, external INT5* is gated to IP7.
In order to utilize both the external Interrupt (INT5*)
and the internal Timer Interrupt, Mode Bit 11 must be set
to 1, and TE must be set to 1. In this case, the Timer
Interrupt will utilize IP12, and INT5* will utilize IP7.
Please also reference the logic diagram for interrupt signals
in the RM7000 User Manual.
Priority of the interrupts is set via two new coprocessor
0 registers called Interrupt Priority Level Lo, IPLLO, and
Interrupt Priority Level Hi, IPLHI.
These two registers contain a four-bit field
corresponding to each interrupt thereby allowing each
interrupt to be programmed with a priority level from 0 to
13 inclusive. The priorities can be set in any manner
including having all the priorities set exactly the same.
Priority 0 is the highest level and priority 15 the lowest. The
format of the priority level registers is shown in Table 13
and Table 14 below. The priority level registers are located
in the coprocessor 0 control register space. For further
details about the control space see the section describing
coprocessor 0.
In addition to programmable priority levels, the
ACT 7000ASC also permits the spacing between interrupt
vectors to be programmed. For example, the minimum
spacing between two adjacent vectors is 0x20 while the
maximum is 0x200. This programmability allows the user
to either set up the vectors as jumps to the actual interrupt
routines or, if interrupt latency is paramount, to include the
entire interrupt routine at the vector. Table 15 illustrates the
complete set of vector spacing selections along with the
coding as required in the Interrupt Control register bits 4:0.
In general, the active interrupt priority combined with
the spacing setting generates a vector offset which is then
added to the interrupt base address of 0x200 to generate the
interrupt exception offset. This offset is then added to the
exception base to produce the final interrupt vector address.
Table 15 – Interrupt Vector Spacing
ICR[4..0]   Spacing
0x0         0x000
0x1         0x020
0x2         0x040
0x4         0x080
0x8         0x100
0x10        0x200
others      reserved
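The vector address calculation outlined under Interrupt Handling can be sketched with the Table 15 encodings. One point is an assumption rather than something stated in this document: the vector offset is taken here to be the active priority level multiplied by the spacing. The exception base value in the example is purely illustrative, and the function name is hypothetical.

```python
# ICR[4..0] encodings from Table 15; other values are reserved.
SPACING = {0x0: 0x000, 0x1: 0x020, 0x2: 0x040, 0x4: 0x080, 0x8: 0x100, 0x10: 0x200}
INTERRUPT_BASE = 0x200  # interrupt base address given in the text


def interrupt_vector(exception_base, priority, icr_spacing):
    """Final vector address = exception base + 0x200 + vector offset.

    Assumption: vector offset = priority level * spacing.
    """
    if icr_spacing not in SPACING:
        raise ValueError("reserved ICR[4..0] encoding")
    vector_offset = priority * SPACING[icr_spacing]
    return exception_base + INTERRUPT_BASE + vector_offset


# Illustrative exception base; priority 3 with 0x40 spacing.
print(hex(interrupt_vector(0x80000000, 3, 0x2)))  # 0x800002c0
```

With a spacing encoding of 0x0 every priority resolves to the same address, so a single handler would have to dispatch in software.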
Standby Mode
The ACT 7000ASC provides a means to reduce the
amount of power consumed by the internal core when the
CPU would not otherwise be performing any useful
operations. This state is known as Standby Mode.
Executing the WAIT instruction enables interrupts and
enters Standby Mode. When the WAIT instruction
completes the W pipe stage, if the SysAD bus is currently
idle, the internal processor clocks will stop thereby freezing
the pipeline. The phase lock loop, or PLL, internal timer/counter,
and the “wake up” input pins: IP[5:0]*, NMI*,
ExtRqst*, Reset*, and ColdReset* continue to operate in
their normal fashion. If the SysAD bus is not idle when the
WAIT instruction completes the W pipe stage, then the
WAIT is treated as a NOP. Once the processor is in
Standby, any interrupt, including the internally generated
timer interrupt, will cause the processor to exit Standby and
resume operation where it left off. The WAIT instruction
is typically inserted in the idle loop of the operating system
or real time executive.
Table 11 – Cause Register
Bits    31   30   29,28   27   26   25   24   23..8       7   6..2   1,0
Field   BD   0    CE      0    W2   W1   IV   IP[15..0]   0   EXC    0
Table 12 – Interrupt Control Register
Bits    31..16   15..8       7    6..5   4..0
Field   0        IM[15..8]   TE   0      Spacing
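The interaction between the TE bit of the Interrupt Control register and boot Mode Bit 11, as described under Interrupt Handling, amounts to a small truth table. The sketch below expresses it as a function; the function name and the dictionary it returns are hypothetical conveniences.

```python
def interrupt_routing(te, mode_bit_11):
    """Return the source gated to IP7 and to IP12 for a TE / Mode Bit 11 setting."""
    return {
        # Mode Bit 11 = 0: Timer Interrupt on IP7; 1: external INT5* on IP7.
        "IP7": "INT5*" if mode_bit_11 else "Timer",
        # TE = 1 gates the Timer Interrupt to IP12; TE = 0 leaves it ungated.
        "IP12": "Timer" if te else None,
    }


# Using both INT5* and the internal timer requires Mode Bit 11 = 1 and TE = 1.
print(interrupt_routing(te=1, mode_bit_11=1))  # {'IP7': 'INT5*', 'IP12': 'Timer'}
```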
Table 13 – IPLLO Register
Bits    31..28   27..24   23..20   19..16   15..12   11..8   7..4   3..0
Field   IPL7     IPL6     IPL5     IPL4     IPL3     IPL2    IPL1   IPL0
Table 14 – IPLHI Register
Bits    31..28   27..24   23..20   19..16   15..12   11..8   7..4   3..0
Field   0        0        IPL13    IPL12    IPL11    IPL10   IPL9   IPL8
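Tables 13 and 14 pack one four-bit priority field per interrupt, eight to a register. A minimal sketch of that packing follows; the function name is hypothetical, and the input is assumed to be a list of fourteen priority values indexed by interrupt number.

```python
def pack_ipl(priorities):
    """Pack 14 four-bit priority levels into (IPLLO, IPLHI) per Tables 13 and 14."""
    if len(priorities) != 14:
        raise ValueError("expected one priority per interrupt 0..13")
    if any(not 0 <= p <= 0xF for p in priorities):
        raise ValueError("each priority must fit in a four-bit field")
    ipllo = sum(p << (4 * i) for i, p in enumerate(priorities[:8]))
    iplhi = sum(p << (4 * i) for i, p in enumerate(priorities[8:]))
    return ipllo, iplhi


# Example: interrupt k is given priority level k.
lo, hi = pack_ipl(list(range(14)))
print(hex(lo), hex(hi))  # 0x76543210 0xdcba98
```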
JTAG Interface
The ACT 7000ASC interface supports JTAG boundary scan in conformance with IEEE 1149.1. The JTAG interface is
especially helpful for checking the integrity of the processor’s pin connections.
Boot-Time Options
Fundamental operational modes for the processor are initialized by the boot-time mode control interface. The boot-time
mode control interface is a serial interface operating at a very low frequency (SysClock divided by 256). The low
frequency operation allows the initialization information to be kept in a low cost EPROM; alternatively the twenty or so
bits could be generated by the system interface ASIC.
Immediately after the VccOK signal is asserted, the processor reads a serial bit stream of 256 bits to initialize all the
fundamental operational modes. ModeClock runs continuously from the assertion of VccOK.
Boot-Time Modes
The boot-time serial mode stream is defined in Table 16. Bit 0 is the bit presented to the processor when VccOK is
deasserted; bit 255 is the last.
Table 16 – Boot Time Mode Stream
Mode bit   Description
0          Reserved: Must be zero
4..1       Write-back data rate
           0: DDDD
           1: DDxDDx
           2: DDxxDDxx
           3: DxDxDxDx
           4: DDxxxDDxxx
           5: DDxxxxDDxxxx
           6: DxxDxxDxxDxx
           7: DDxxxxxxDDxxxxxx
           8: DxxxDxxxDxxxDxxx
           9-15: Reserved
7..5       SysClock to Pclock Multiplier
           Mode bit 20 = 0 / Mode bit 20 = 1
           0: Multiply by 2/x
           1: Multiply by 3/x
           2: Multiply by 4/x
           3: Multiply by 5/2.5
           4: Multiply by 6/x
           5: Multiply by 7/3.5
           6: Multiply by 8/x
           7: Multiply by 9/4.5
8          Specifies byte ordering. Logically ORed with BigEndian input signal.
           0: Little endian
           1: Big endian
10..9      Non-Block Write Control
           00: R4000 compatible non-block writes
           01: Reserved
           10: pipelined non-block writes
           11: non-block write re-issue
11         Timer Interrupt Enable/Disable
           0: Internal Timer interrupt gated to IP[7]
           1: External INT5* gated to IP[7]
12         Reserved: Must be zero
14..13     Output driver strength - 100% = fastest
           00: 67% strength
           01: 50% strength
           10: 100% strength
           11: 83% strength
15         Reserved: Must be zero
17..16     System configuration identifiers - software visible in processor Config[21..20] register
19..18     Reserved: Must be zero
20         Pclock to SysClock multipliers.
           0: Integer multipliers (2,3,4,5,6,7,8,9)
           1: Half integer multipliers (2.5,3.5,4.5)
23..21     Reserved: Must be zero
24         JTLB Size.
           0: 48 dual-entry
           1: 64 dual-entry
25         On-chip secondary cache control.
           0: Disable
           1: Enable
255..26    Reserved: Must be zero
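As an illustration of how the 256-bit boot-time mode stream might be assembled from the Table 16 fields, the sketch below builds the bit list in presentation order (bit 0 first) and derives the resulting pipeline clock. Field positions and encodings follow Table 16; the function names, parameter names, and default values are hypothetical choices made for the example.

```python
INTEGER_MULT = [2, 3, 4, 5, 6, 7, 8, 9]  # mode bit 20 = 0
HALF_MULT = {3: 2.5, 5: 3.5, 7: 4.5}     # mode bit 20 = 1


def boot_mode_stream(wb_rate=0, mult_code=0, half_integer=False,
                     big_endian=False, nonblock_write=0b00,
                     int5_on_ip7=False, drive=0b10, sys_id=0,
                     jtlb_64=False, secondary_cache=True):
    """Return the 256 mode bits as a list, index 0 being mode bit 0."""
    if not 0 <= wb_rate <= 8:
        raise ValueError("write-back data rate codes 9-15 are reserved")
    if not 0 <= mult_code <= 7:
        raise ValueError("multiplier code must be 0..7")
    if half_integer and mult_code not in HALF_MULT:
        raise ValueError("no half-integer multiplier for this code")
    if nonblock_write not in (0b00, 0b10, 0b11):
        raise ValueError("non-block write control 01 is reserved")
    if not 0 <= drive <= 3 or not 0 <= sys_id <= 3:
        raise ValueError("two-bit field out of range")
    word = (wb_rate << 1) | (mult_code << 5) | (int(big_endian) << 8)
    word |= (nonblock_write << 9) | (int(int5_on_ip7) << 11)
    word |= (drive << 13) | (sys_id << 16) | (int(half_integer) << 20)
    word |= (int(jtlb_64) << 24) | (int(secondary_cache) << 25)
    # Bits 0, 12, 15, 19..18, 23..21 and 255..26 are reserved and stay zero.
    return [(word >> bit) & 1 for bit in range(256)]


def pclock_mhz(sysclock_mhz, mult_code, half_integer=False):
    """Pipeline clock obtained from SysClock and the selected multiplier."""
    mult = HALF_MULT[mult_code] if half_integer else INTEGER_MULT[mult_code]
    return sysclock_mhz * mult


stream = boot_mode_stream(wb_rate=2, mult_code=2, big_endian=True)
print(len(stream), pclock_mhz(75, 2))  # 256 300
```

Only the first 26 bits carry information, which is consistent with the remark that the twenty or so meaningful bits could be generated by the system interface ASIC instead of an EPROM.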
PLL Analog Power Filtering
The ACT 7000ASC includes extra PLL Analog Power Filtering circuitry designed to provide low noise,
temperature stable filtering for the VccP and VssP signals. The included circuitry consists of several passive
components located at the closest possible point to the RM7000A die and is configured as shown in Figure 9.
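(Schematic not reproduced. Labeled elements: VccP (pin 64), VssP (pin 65), two 5Ω resistors, a .01 µF capacitor, a 1000 pF capacitor, and the RM7000A die.)
Figure 9 – ACT 7000ASC Including PLL Filter Circuit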
Additional board level PLL filtering is also required. The recommended configuration is shown in Figure 10.
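(Schematic not reproduced. Labeled elements: VccInt, VssInt, VccP (pin 64), VssP (pin 65), two 5Ω resistors, a 10 µF capacitor, a .1 µF capacitor, and a 1000 pF capacitor.)
Figure 10 – Recommended Board Level PLL Filter Circuit for the ACT 7000ASC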
Absolute Maximum Rating (Note 1)
Symbol   Parameter                              Limits                  Units
VTERM    Terminal Voltage with respect to VSS   -0.5 (Note 2) to +3.9   V
TC       Case Operating Temperature             -55 to +125             °C
TSTG     Storage Temperature                    -65 to +150             °C
IIN      DC Input Current                       20 (Note 3)             mA
IOUT     DC Output Current (Note 4)             ±20                     mA
Note 1: Stresses greater than those listed under ABSOLUTE MAXIMUM RATINGS may cause permanent damage to the device. This is a stress rating
only and functional operation of the device at these or any other conditions above those indicated in the operational sections of this specification is
not implied. Exposure to absolute maximum rating conditions for extended periods may affect reliability.
Note 2: VIN minimum = -2.0V for pulse width less than 15ns. VIN should not exceed 3.9 Volts.
Note 3: When VIN < 0V or VIN > VCCIO
Note 4: Not more than one output should be shorted at a time. Duration of the short should not exceed 30 seconds.
Recommended Operating Conditions
CPU Speed      Temperature             Vss   VccInt        VccIO          VccP
225 - 350MHz   -55°C to +125°C (TC)    0V    1.8V ±50mV    3.3V ±150mV    1.8V ±150mV
Note: VccIO should not exceed VccInt by greater than 2.0V during the power-up sequence.
Note: Applying a logic high state to any I/O pin before VccInt becomes stable is not recommended.
Note: As specified in IEEE 1149.1 (JTAG), the JTMS pin must be held low during reset to avoid entering JTAG test mode. Refer to the RM7000
Family Users Manual, Appendix E.
DC Electrical Characteristics
Parameter   Minimum        Maximum        Conditions
VOL         -              0.2V           |IOUT| = 100µA
VOH         VCCIO - 0.2V   -              |IOUT| = 100µA
VOL         -              0.4V           |IOUT| = 2mA
VOH         2.4V           -              |IOUT| = 2mA
VIL         -0.3V          0.8V
VIH         2.0V           VCCIO + 0.3V
IIN         -              ±15µA          VIN = 0
            -              ±15µA          VIN = VCCIO
CIN         -              10pF
COUT        -              10pF
Power Consumption
Parameter: VccInt Power (mWatts)
                                                     CPU Clock Speed
                                                     225 MHz        300 MHz        350 MHz
Condition                                            Max (Note 2)   Max (Note 2)   Max (Note 2)
Standby   -                                          865            865            925
Active    Maximum with no FPU operation (Note 2)     2350           2750           3550
Active    Maximum worst case instruction mix         2500           3000           4000
Notes
1. Worst case supply voltage (maximum VCCINT) with worst case temperature (maximum TCase).
2. Dhrystone 2.1 instruction mix.
3. I/O supply power is application dependent, but typically <20% of VCCINT.
AC Electrical Characteristics – Clock Parameters
                                                             CPU Clock Speed
                                                             225MHz        300MHz        350MHz
Parameter                  Symbol      Test Condition        Min   Max     Min   Max     Min   Max     Units
SysClock High              tSCHIGH     Transition < 5ns      3     -       3     -       3     -       ns
SysClock Low               tSCLOW      Transition < 5ns      3     -       3     -       3     -       ns
SysClock Frequency                                           25    75      25    75      25    70      MHz
SysClock Period            tSCP                              -     40      -     40      -     40      ns
Clock Jitter for SysClock  tJITTERIN                         -     ±200    -     ±150    -     ±150    ps
SysClock Rise Time         tSCRISE                           -     2       -     2       -     2       ns
SysClock Fall Time         tSCFALL                           -     2       -     2       -     2       ns
ModeClock Period           tMODECKP                          -     256     -     256     -     256     tSCP
JTAG Clock Period          tJTAGCKP                          -     4       -     4       -     4       tSCP
Note: Operation of the ACT 7000ASC is only guaranteed with the Phase Lock Loop enabled.
System Interface Parameters
                                                                   225MHz       300MHz       350MHz
Parameter (Note 1)         Sym   Test Conditions                   Min   Max    Min   Max    Min   Max    Units
Data Output (Notes 2, 3)   tDO   mode14..13 = 10 (fastest)         1.0   4.5    1.0   4.5    1.0   4.5    ns
                                 mode14..13 = 01 (slowest)         1.0   5.5    1.0   5.5    1.0   5.5    ns
Data Setup (Note 4)        tDS   trise = see above table           2.5   -      2.5   -      2.5   -      ns
Data Hold (Note 4)         tDH   tfall = see above table           1.0   -      1.0   -      1.0   -      ns
Notes:
1. Timings are measured from 1.5V of the clock to 1.5V of the signal.
2. Capacitive load for all output timings is 50pF.
3. Data Output timing applies to all signal pins whether tristate I/O or output only.
4. Setup and Hold parameters apply to all signal pins whether tristate I/O or input only.
Boot-Time Interface Parameters
Parameter         Symbol   Test Conditions   Min   Max   Units
Mode Data Setup   tDS      -                 4     -     SysClock cycles
Mode Data Hold    tDH      -                 0     -     SysClock cycles
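Clock Timing
(Timing diagram not reproduced. SysClock Timing: shows tSCP, tSCRise, tSCFall, and ±tJitterIn on SysClock.)
System Interface Timing (SysAD, SysCmd, ValidIn*, ValidOut*, etc.)
(Timing diagram not reproduced. Input Timing: shows tDS and tDH for Data relative to SysClock.)
(Timing diagram not reproduced. Output Timing: shows tDO and tDO MIN for Data relative to SysClock.)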
Pin Descriptions
The following is a list of control, data, clock, interrupt, and miscellaneous pins of the ACT 7000ASC.
Pin Name        Type            Description
System interface:
ExtRqst*        Input           External request
                                Signals that the system interface is submitting an external request.
Release*        Output          Release interface
                                Signals that the processor is releasing the system interface to slave state.
RdRdy*          Input           Read Ready
                                Signals that an external agent can now accept a processor read.
WrRdy*          Input           Write Ready
                                Signals that an external agent can now accept a processor write request.
ValidIn*        Input           Valid Input
                                Signals that an external agent is now driving a valid address or data on the
                                SysAD bus and a valid command or data identifier on the SysCmd bus.
ValidOut*       Output          Valid output
                                Signals that the processor is now driving a valid address or data on the SysAD
                                bus and a valid command or data identifier on the SysCmd bus.
SysAD(63:0)     Input/Output    System address/data bus
                                A 64-bit address and data bus for communication between the processor and an
                                external agent.
SysADC(7:0)     Input/Output    System address/data check bus
                                An 8-bit bus containing parity check bits for the SysAD bus during data cycles.
SysCmd(8:0)     Input/Output    System command/data identifier bus
                                A 9-bit bus for command and data identifier transmission between the processor
                                and an external agent.
SysCmdP         Input/Output    System Command/Data Identifier Bus Parity
                                For the RM7000A, unused on input and zero on output.
Clock/Control interface:
SysClock        Input           System clock
                                Master clock input used as the system interface reference clock. All output
                                timings are relative to this input clock. Pipeline operation frequency is derived by
                                multiplying this clock up by the factor selected during boot initialization.
VccP            Input           Vcc for PLL
                                Quiet VccInt for the internal phase locked loop. Must be connected to VccInt.
                                See Figure 10 for additional PLL filtering information.
VssP            Input           Vss for PLL
                                Quiet Vss for the internal phase locked loop. Must be connected to Vss.
                                See Figure 10 for additional PLL filtering information.
Interrupt interface:
Int*(5:0)       Input           Interrupt
                                Six general processor interrupts, bit-wise ORed with bits 5:0 of the interrupt
                                register.
NMI*            Input           Non-maskable interrupt
                                Non-maskable interrupt, ORed with bit 15 of the interrupt register (bit 6 in R5000
                                compatibility mode).
JTAG interface:
JTDI            Input           JTAG data in
                                JTAG serial data in.
JTCK            Input           JTAG clock input
                                JTAG serial clock input.
JTDO            Output          JTAG data out
                                JTAG serial data out.
JTMS            Input           JTAG command
                                JTAG command signal. Indicates that the incoming serial data is command data.
Initialization interface:
BigEndian       Input           Big Endian / Little Endian Control
                                Allows the system to change the processor addressing mode without rewriting
                                the mode ROM.
VccOK           Input           Vcc is OK
                                When asserted, this signal indicates to the ACT-7000ASC that the VCCINT power
                                supply has been above the recommended value for more than 100 milliseconds
                                and will remain stable. The assertion of VccOK initiates the reading of the
                                boot-time mode control serial stream.
ColdReset*      Input           Cold Reset
                                This signal must be asserted for a power on reset or a cold reset. ColdReset*
                                must be de-asserted synchronously with SysClock.
Reset*          Input           Reset
                                This signal must be asserted for any reset sequence. It may be asserted
                                synchronously or asynchronously for a cold reset, or synchronously to initiate a
                                warm reset. Reset* must be de-asserted synchronously with SysClock.
ModeClock       Output          Boot Mode Clock
                                Serial boot-mode data clock output at the system clock frequency divided by 256.
ModeIn          Input           Boot Mode Data In
                                Serial boot-mode data input.
For additional detailed information regarding operation, see the latest PMC-Sierra datasheet for
the RM7000A 64-Bit Superscalar Microprocessor with On-Chip Secondary Cache (doc. # PMC-2002227), Issue No. 5:
August, 2002.
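The SysADC(7:0) description above says only that the bus carries parity check bits for SysAD during data cycles. The sketch below shows one way such check bits could be computed in a test bench. It assumes even parity and assumes that check bit i covers byte lane i of the 64-bit data word; neither detail is stated in this datasheet, so both should be confirmed against the RM7000A documentation before use. The function name is hypothetical.

```python
# Illustrative sketch only. ASSUMPTIONS (not from this datasheet):
#   - even parity per byte lane
#   - check bit i covers data bits (8*i+7 : 8*i)

def sysadc_parity(data64, even=True):
    """Return 8 check bits for a 64-bit data word, one per byte lane."""
    if not 0 <= data64 < (1 << 64):
        raise ValueError("data64 must fit in 64 bits")
    check = 0
    for lane in range(8):
        byte = (data64 >> (8 * lane)) & 0xFF
        # With even parity the check bit is 1 when the byte has an odd
        # number of ones, so that the nine bits together have an even count.
        bit = bin(byte).count("1") & 1
        if not even:
            bit ^= 1
        check |= bit << lane
    return check


if __name__ == "__main__":
    print(hex(sysadc_parity(0x0123456789ABCDEF)))
```

Passing `even=False` gives the complementary odd-parity bits, which makes it easy to switch conventions once the actual scheme is confirmed.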
Package Information – "F17" – CQFP 208 Leads
[Figure: package outline drawing with Detail "A". The lid, the Pin 1 chamfer and corner pins 1, 52, 53, 104, 105, 156, 157 and 208 are marked.]
Units: Inches (Millimeters)
Dimension callouts on the drawing:
1.131 (28.727) SQ / 1.109 (28.169) SQ
.0236 (.51) / .0158 (.49)
.010R REF (two places)
.015 (.381) / .009 (.229)
.130 (3.302) MAX
1.009 (25.63) / .9998 (25.37)
51 Spaces at .0197 (51 Spaces at .50)
0°±5°
.090 (2.286) REF
.010 (.253) / .007 (.178)
.050 (1.27) / .030 (.762)
.960 (24.384) SQ REF
.055 (1.397) REF
.005 (.127) / .008 (.202)
1.331 (33.807) / 1.269 (32.233)
.115 (2.921) MAX
.055 (1.397) / .045 (1.143)
Note: Pin rotation is opposite of PMC-Sierra PQUAD due to cavity-up construction.
Package Information – "F24" – Inverted CQFP 208 Leads
[Figure: package outline drawing with Detail "A". The lid, the Pin 1 chamfer and corner pins 1, 52, 53, 104, 105, 156, 157 and 208 are marked.]
Units: Inches (Millimeters)
Dimension callouts on the drawing:
1.131 (28.727) SQ / 1.109 (28.169) SQ
.0236 (.51) / .0158 (.49)
.055 (1.397) / .045 (1.143)
.012R REF (two places)
1.009 (25.63) / .9998 (25.37)
51 Spaces at .0197 (51 Spaces at .50)
.055 (1.397) REF
0°±5°
.115 (2.921) MAX
.010 (.253) / .007 (.178)
.060 (1.524) / .040 (1.016)
.139 (3.531) MAX
.005 (.127) / .008 (.202)
.960 (24.384) REF
.024 (.610) / .010 (.253)
1.331 (33.807) / 1.291 (32.791)
Note: Pin rotation is identical to PMC-Sierra PQUAD due to cavity-down construction.
ACT 7000ASC Microprocessor CQFP Pinouts – "F17" & "F24"
Pin #  Function     Pin #  Function     Pin #  Function     Pin #  Function
1      VccIO        53     NC           105    VccIO        157    NC
2      NC           54     NC           106    NMI*         158    NC
3      NC           55     NC           107    ExtRqst*     159    NC
4      VccIO        56     VccIO        108    Reset*       160    NC
5      Vss          57     Vss          109    ColdReset*   161    VccIO
6      SysAD4       58     ModeIn       110    VccOK        162    Vss
7      SysAD36      59     RdRdy*       111    BigEndian    163    SysAD28
8      SysAD5       60     WrRdy*       112    VccIO        164    SysAD60
9      SysAD37      61     ValidIn*     113    Vss          165    SysAD29
10     VccInt       62     ValidOut*    114    SysAD16      166    SysAD61
11     Vss          63     Release*     115    SysAD48      167    VccInt
12     SysAD6       64     VccP         116    VccInt       168    Vss
13     SysAD38      65     VssP         117    Vss          169    SysAD30
14     VccIO        66     SysClock     118    SysAD17      170    SysAD62
15     Vss          67     VccInt       119    SysAD49      171    VccIO
16     SysAD7       68     Vss          120    SysAD18      172    Vss
17     SysAD39      69     VccIO        121    SysAD50      173    SysAD31
18     SysAD8       70     Vss          122    VccIO        174    SysAD63
19     SysAD40      71     VccInt       123    Vss          175    SysADC2
20     VccInt       72     Vss          124    SysAD19      176    SysADC6
21     Vss          73     SysCmd0      125    SysAD51      177    VccInt
22     SysAD9       74     SysCmd1      126    VccInt       178    Vss
23     SysAD41      75     SysCmd2      127    Vss          179    SysADC3
24     VccIO        76     SysCmd3      128    SysAD20      180    SysADC7
25     Vss          77     VccIO        129    SysAD52      181    VccIO
26     SysAD10      78     Vss          130    SysAD21      182    Vss
27     SysAD42      79     SysCmd4      131    SysAD53      183    SysADC0
28     SysAD11      80     SysCmd5      132    VccIO        184    SysADC4
29     SysAD43      81     VccIO        133    Vss          185    VccInt
30     VccInt       82     Vss          134    SysAD22      186    Vss
31     Vss          83     SysCmd6      135    SysAD54      187    SysADC1
32     SysAD12      84     SysCmd7      136    VccInt       188    SysADC5
33     SysAD44      85     SysCmd8      137    Vss          189    SysAD0
34     VccIO        86     SysCmdP      138    SysAD23      190    SysAD32
35     Vss          87     VccInt       139    SysAD55      191    VccIO
36     SysAD13      88     Vss          140    SysAD24      192    Vss
37     SysAD45      89     VccInt       141    SysAD56      193    SysAD1
38     SysAD14      90     Vss          142    VccIO        194    SysAD33
39     SysAD46      91     VccIO        143    Vss          195    VccInt
40     VccInt       92     Vss          144    SysAD25      196    Vss
41     Vss          93     Int0*        145    SysAD57      197    SysAD2
42     SysAD15      94     Int1*        146    VccInt       198    SysAD34
43     SysAD47      95     Int2*        147    Vss          199    SysAD3
44     VccIO        96     Int3*        148    SysAD26      200    SysAD35
45     Vss          97     Int4*        149    SysAD58      201    VccIO
46     ModeClock    98     Int5*        150    SysAD27      202    Vss
47     JTDO         99     VccIO        151    SysAD59      203    NC
48     JTDI         100    Vss          152    VccIO        204    NC
49     JTCK         101    NC           153    Vss          205    NC
50     JTMS         102    NC           154    NC           206    NC
51     VccIO        103    NC           155    NC           207    VccIO
52     Vss          104    NC           156    Vss          208    Vss
Sample Ordering Information
Part Number            Screening                 Speed (MHz)   Package
ACT-7000ASC-300F17I    Industrial Temperature    300           208 Lead CQFP
ACT-7000ASC-300F17C    Commercial Temperature    300           208 Lead CQFP
ACT-7000ASC-300F17T    Military Temperature      300           208 Lead CQFP
ACT-7000ASC-300F17M    Military Screening        300           208 Lead CQFP
Part Number Breakdown
ACT – 7000A SC – 225 F17 M
ACT = Aeroflex-Plainview
7000A = Base Processor Type
SC = Cache Style
    SC = Secondary Cache
225 = Maximum Pipeline Freq.
    225 = 225MHz
    300 = 300MHz
    350 = 350MHz
    400 = 400MHz (Future Option)
F17 = Package Type & Size (Surface Mount Package)
    F17 = 1.120" SQ 208 Lead CQFP
    F24 = 1.120" SQ Inverted 208 Lead CQFP
M = Screening
    C = Commercial Temp, 0°C to +70°C
    I = Industrial Temp, -40°C to +85°C
    T = Military Temp, -55°C to +125°C
    M = Military Temp, -55°C to +125°C, Screened *
    Q = MIL-PRF-38534 Compliant/SMD if applicable
* Screened to the individual test methods of MIL-STD-883
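The breakdown above can be applied mechanically to an ordering code. The following minimal Python sketch splits a part number such as ACT-7000ASC-300F17I into its fields and looks up the screening and package descriptions listed above. The function name, the regular expression and the returned dictionary keys are hypothetical, and the sketch accepts only the field values that appear in this section.

```python
# Illustrative sketch only: a hypothetical parser for the ordering codes above.
import re

SCREENING = {
    "C": "Commercial Temp, 0°C to +70°C",
    "I": "Industrial Temp, -40°C to +85°C",
    "T": "Military Temp, -55°C to +125°C",
    "M": "Military Temp, -55°C to +125°C, Screened",
    "Q": "MIL-PRF-38534 Compliant/SMD if applicable",
}
PACKAGE = {
    "F17": '1.120" SQ 208 Lead CQFP',
    "F24": '1.120" SQ Inverted 208 Lead CQFP',
}
_PATTERN = re.compile(r"ACT-(7000A)(SC)-(225|300|350|400)(F17|F24)([CITMQ])")


def parse_part_number(text):
    """Split an ACT 7000ASC ordering code into its documented fields."""
    # Tolerate the spaces and en dashes used in the printed breakdown.
    normalized = text.replace("–", "-").replace(" ", "").upper()
    match = _PATTERN.fullmatch(normalized)
    if match is None:
        raise ValueError("unrecognized part number: " + text)
    base, cache, speed, package, screening = match.groups()
    return {
        "base_processor": base,
        "cache_style": cache,
        "speed_mhz": int(speed),
        "package": PACKAGE[package],
        "screening": SCREENING[screening],
    }


if __name__ == "__main__":
    print(parse_part_number("ACT-7000ASC-300F17I"))
```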
The QED logo and RISCMark are trademarks of PMC-Sierra, Inc.
MIPS is a registered trademark of MIPS Technologies, Inc. All other trademarks are the respective property of the trademark holders.
EXPORT CONTROL:
This product is controlled for export under the International Traffic in
Arms Regulations (ITAR). A license from the U.S. Department of
State is required prior to the export of this product from the United
States.
EXPORT WARNING:
Aeroflex’s military and space products are controlled for export under
the International Traffic in Arms Regulations (ITAR) and may not be
sold or proposed or offered for sale to certain countries. (See ITAR
126.1 for complete information.)
PLAINVIEW, NEW YORK
Toll Free: 800-THE-1553
Fax: 516-694-6715
INTERNATIONAL
Tel: 805-778-9229
Fax: 805-778-1980
NORTHEAST
Tel: 603-888-3975
Fax: 603-888-4585
SE AND MID-ATLANTIC
Tel: 321-951-4164
Fax: 321-951-4254
WEST COAST
Tel: 949-362-2260
Fax: 949-362-2266
CENTRAL
Tel: 719-594-8017
Fax: 719-594-8468
www.aeroflex.com
[email protected]
Aeroflex Microelectronic Solutions reserves the right to change at
any time without notice the specifications, design, function, or form
of its products described herein. All parameters must be validated for
each customer's application by engineering. No liability is assumed
as a result of use of this product. No patent licenses are implied. All
trademarks are acknowledged. Parent company Aeroflex, Inc. 2003.
SCD7000A Rev C 10/9/09