Intel387™ SX MATH COPROCESSOR

- New Automatic Power Management
  - Low Power Consumption
  - Typically 100 mA in Dynamic Mode, and 4 mA in Idle Mode
- Socket Compatible with Intel387 Family of Math CoProcessors
  - Hardware and Software Compatible
  - Supported by Over 2100 Commercial Software Packages
  - 10% to 15% Performance Increase on Whetstone and Livermore Benchmarks
- Compatible with the Intel386™ SX Microprocessor
  - Extends CPU Instruction Set to Include Trigonometric, Logarithmic, and Exponential
- High Performance 80-Bit Internal Architecture
- Implements ANSI/IEEE Standard 754-1985 for Binary Floating-Point Arithmetic
- Available in a 68-Pin PLCC Package
  See Intel Packaging Specification, Order #231369
The Intel387™ SX Math CoProcessor is an extension to the Intel386™ SX microprocessor architecture. The combination of the Intel387 SX with the Intel386 SX microprocessor dramatically increases the processing speed of computer application software that utilizes high performance floating-point operations. An internal Power Management Unit enables the Intel387 SX to perform these floating-point operations while maintaining very low power consumption for portable and desktop applications. The internal Power Management Unit effectively reduces power consumption by 95% when the device is idle.
The Intel387 SX Math CoProcessor is available in a 68-pin PLCC package, and is manufactured on Intel's advanced 1.0 micron CHMOS IV technology.
Intel386 and Intel387 are trademarks of Intel Corporation.
*Other brands and names are the property of their respective owners.
Information in this document is provided in connection with Intel products. Intel assumes no liability whatsoever, including infringement of any patent or
copyright, for sale and use of Intel products except as provided in Intel’s Terms and Conditions of Sale for such products. Intel retains the right to make
changes to these specifications at any time, without notice. Microcomputer Products may have minor variations to this specification known as errata.
January 1994
COPYRIGHT © INTEL CORPORATION, 1995
Order Number: 240225-009
CONTENTS
1.0 PIN ASSIGNMENT
  1.1 Pin Description Table
2.0 FUNCTIONAL DESCRIPTION
  2.1 Feature List
  2.2 Math CoProcessor Architecture
  2.3 Power Management
    2.3.1 Dynamic Mode
    2.3.2 Idle Mode
  2.4 Compatibility
  2.5 Performance
3.0 PROGRAMMING INTERFACE
  3.1 Instruction Set
    3.1.1 Data Transfer Instructions
    3.1.2 Arithmetic Instructions
    3.1.3 Comparison Instructions
    3.1.4 Transcendental Instructions
    3.1.5 Load Constant Instructions
    3.1.6 Processor Instructions
  3.2 Register Set
    3.2.1 Status Word (SW) Register
    3.2.2 Control Word (CW) Register
    3.2.3 Data Register
    3.2.4 Tag Word (TW) Register
    3.2.5 Instruction and Data Pointers
  3.3 Data Types
  3.4 Interrupt Description
  3.5 Exception Handling
  3.6 Initialization
  3.7 Processing Modes
  3.8 Programming Support
4.0 HARDWARE SYSTEM INTERFACE
  4.1 Signal Description
    4.1.1 Intel386 CPU Clock 2 (CPUCLK2)
    4.1.2 Intel387 Math CoProcessor Clock 2 (NUMCLK2)
    4.1.3 Clocking Mode (CKM)
    4.1.4 System Reset (RESETIN)
    4.1.5 Processor Request (PEREQ)
    4.1.6 Busy Status (BUSY#)
    4.1.7 Error Status (ERROR#)
    4.1.8 Data Pins (D15–D0)
    4.1.9 Write/Read Bus Cycle (W/R#)
    4.1.10 Address Strobe (ADS#)
    4.1.11 Bus Ready Input (READY#)
    4.1.12 Ready Output (READYO#)
    4.1.13 Status Enable (STEN)
    4.1.14 Math CoProcessor Select 1 (NPS1#)
    4.1.15 Math CoProcessor Select 2 (NPS2)
    4.1.16 Command (CMD0#)
    4.1.17 System Power (VCC)
    4.1.18 System Ground (VSS)
  4.2 System Configuration
  4.3 Math CoProcessor Architecture
    4.3.1 Bus Control Logic
    4.3.2 Data Interface and Control Unit
    4.3.3 Floating Point Unit
    4.3.4 Power Management Unit
  4.4 Bus Cycles
    4.4.1 Intel387 SX Math CoProcessor Addressing
    4.4.2 CPU/Math CoProcessor Synchronization
    4.4.3 Synchronous/Asynchronous Modes
    4.4.4 Automatic Bus Cycle Termination
5.0 BUS OPERATION
  5.1 Non-pipelined Bus Cycles
    5.1.1 Write Cycle
    5.1.2 Read Cycle
  5.2 Pipelined Bus Cycles
  5.3 Mixed Bus Cycles
  5.4 BUSY# and PEREQ Timing Relationship
6.0 PACKAGE SPECIFICATIONS
  6.1 Mechanical Specifications
  6.2 Thermal Specifications
7.0 ELECTRICAL CHARACTERISTICS
  7.1 Absolute Maximum Ratings
  7.2 D.C. Characteristics
  7.3 A.C. Characteristics
8.0 Intel387 SX MATH COPROCESSOR INSTRUCTION SET
APPENDIX A - Intel387 SX MATH COPROCESSOR COMPATIBILITY
  A.1 8087/80287 Compatibility
    A.1.1 General Differences
    A.1.2 Exceptions
APPENDIX B - COMPATIBILITY BETWEEN THE 80287 AND 8087 MATH COPROCESSOR
FIGURES
Figure 1-1   Intel387 SX Math CoProcessor Pinout
Figure 2-1   Intel387 SX Math CoProcessor Block Diagram
Figure 3-1   Intel386 SX CPU and Intel387 Math CoProcessor Register Set
Figure 3-2   Status Word
Figure 3-3   Control Word
Figure 3-4   Tag Word Register
Figure 3-5   Instruction and Data Pointer Image in Memory, 32-Bit Protected Mode Format
Figure 3-6   Instruction and Data Pointer Image in Memory, 16-Bit Protected Mode Format
Figure 3-7   Instruction and Data Pointer Image in Memory, 32-Bit Real Mode Format
Figure 3-8   Instruction and Data Pointer Image in Memory, 16-Bit Real Mode Format
Figure 4-1   Intel386 SX CPU and Intel387 SX Math CoProcessor System Configuration
Figure 5-1   Bus State Diagram
Figure 5-2   Non-Pipelined Read and Write Cycles
Figure 5-3   Fastest Transition to and from Pipelined Cycles
Figure 5-4   Pipelined Cycles with Wait States
Figure 5-5   BUSY# and PEREQ Timing Relationship
Figure 7-1a  Typical Output Valid Delay vs Load Capacitance at Max Operating Temperature
Figure 7-1b  Typical Output Slew Time vs Load Capacitance at Max Operating Temperature
Figure 7-1c  Maximum ICC vs Frequency
Figure 7-2   CPUCLK2/NUMCLK2 Waveform and Measurement Points for Input/Output
Figure 7-3   Output Signals
Figure 7-4   Input and I/O Signals
Figure 7-5   RESET Signal
Figure 7-6   Float from STEN
Figure 7-7   Other Parameters
TABLES
Table 1-1   Pin Cross Reference - Functional Grouping
Table 3-1   Condition Code Interpretation
Table 3-2   Condition Code Interpretation after FPREM and FPREM1 Instructions
Table 3-3   Condition Code Resulting from Comparison
Table 3-4   Condition Code Defining Operand Class
Table 3-5   Mapping Condition Codes to Intel386 CPU Flag Bits
Table 3-6   Intel387 SX Math CoProcessor Data Type Representation in Memory
Table 3-7   CPU Interrupt Vectors Reserved for Math CoProcessor
Table 3-8   Intel387 SX Math CoProcessor Exceptions
Table 4-1   Pin Summary
Table 4-2   Output Pin Status during Reset
Table 4-3   Bus Cycle Definition
Table 6-1   Thermal Resistances (°C/Watt) θJC and θJA
Table 6-2   Maximum TA at Various Airflows
Table 7-1   D.C. Specifications
Table 7-2a  Timing Requirements of the Bus Interface Unit
Table 7-2b  Timing Requirements of the Execution Unit
Table 7-2c  Other AC Parameters
Table 8-1   Instruction Formats
1.0 PIN ASSIGNMENT
The Intel387 SX Math CoProcessor pinout as viewed from the top side of the component is shown in Figure 1-1. VCC and VSS (GND) connections must be made to multiple pins. The circuit board should include VCC and VSS planes for power distribution, and all VCC and VSS pins must be connected to the appropriate plane.
NOTE:
Pins identified as N.C. should remain completely unconnected.
Figure 1-1. Intel387 SX Math CoProcessor Pinout
Table 1-1. Pin Cross Reference - Functional Grouping

Signal     Pin(s)
BUSY#      36
PEREQ      56
ERROR#     35

ADS#       47
CMD0#      48
NPS1#      44
NPS2       45
STEN       40
W/R#       41

READY#     49
READYO#    57

CKM        59
CPUCLK2    54
NUMCLK2    53

RESETIN    51

D00        19
D01        20
D02        23
D03        8
D04        7
D05        6
D06        3
D07        2
D08        24
D09        28
D10        29
D11        30
D12        16
D13        15
D14        12
D15        11

VCC        4, 9, 13, 22, 26, 31, 33, 37, 39, 43, 46, 50, 58, 62, 64
VSS        5, 14, 21, 25, 27, 32, 34, 38, 42, 55, 60, 61, 63, 66
N.C.       1, 10, 17, 18, 52, 65, 67, 68
1.1 Pin Description Table
The following table briefly describes each pin on the Intel387 SX Math CoProcessor. For a more complete description refer to Section 4.1, Signal Description. The following definitions are used in these descriptions:
#     The signal is active LOW.
I     Input Signal
O     Output Signal
I/O   Input and Output Signal

Symbol    Type  Name and Function
ADS#      I     ADDRESS STROBE indicates that the address and bus cycle definition is valid.
BUSY#     O     BUSY indicates that the Math CoProcessor is currently executing an instruction.
CKM       I     CLOCKING MODE is used to select synchronous or asynchronous clock modes.
CMD0#     I     COMMAND determines whether an opcode or an operand is being sent to the Math CoProcessor. During a read cycle it indicates which register group is being read.
CPUCLK2   I     CPU CLOCK input provides the timing for the bus interface unit and the execution unit in synchronous mode.
D15–D0    I/O   DATA BUS is used to transfer instructions and data between the Math CoProcessor and CPU.
ERROR#    O     ERROR signals that an unmasked exception has occurred.
N.C.      -     NO CONNECT should always remain unconnected. Connection of an N.C. pin may cause the Math CoProcessor to malfunction or be incompatible with future steppings.
NPS1#     I     NPX SELECT 1 is used to select the Math CoProcessor.
NPS2      I     NPX SELECT 2 is used to select the Math CoProcessor.
NUMCLK2   I     NUMERICS CLOCK is used in asynchronous mode to drive the Floating Point Execution Unit.
PEREQ     O     PROCESSOR EXTENSION REQUEST signals the CPU that the Math CoProcessor is ready for data transfer to/from its FIFO.
READY#    I     READY indicates that the bus cycle is being terminated.
READYO#   O     READY OUT signals the CPU that the Math CoProcessor is terminating the bus cycle.
RESETIN   I     SYSTEM RESET terminates any operation in progress and forces the Math CoProcessor to enter a dormant state.
STEN      I     STATUS ENABLE serves as a master chip select for the Math CoProcessor. When inactive, this pin forces all outputs and bi-directional pins into a floating state.
W/R#      I     WRITE/READ indicates whether the CPU bus cycle in progress is a read or a write cycle.
VCC       I     SYSTEM POWER provides the +5V nominal D.C. supply input.
VSS       I     SYSTEM GROUND provides the 0V connection from which all inputs and outputs are measured.
2.0 FUNCTIONAL DESCRIPTION
The Intel387 SX Math CoProcessor is designed to support the Intel386 SX Microprocessor and effectively extend the CPU architecture by providing fast execution of arithmetic instructions and transcendental functions. This component contains internal power management circuitry for reduced active power dissipation and an automatic idle mode.
2.1 Feature List
- New power saving design provides low power dissipation in active and idle modes.
- Higher Performance, 10%–25% higher benchmark performance than the original Intel387 SX Math CoProcessor.
- High Performance 84-bit Internal Architecture
- Eight 80-bit Numeric Registers, usable as individually addressable general registers or as a register stack.
- Full-range transcendental operations for SINE, COSINE, TANGENT, ARCTANGENT, and LOGARITHM.
- Programmable rounding modes and notification of rounding effects.
- Exception reporting either by software polling or hardware interrupts.
- Fully compatible with the SX Microprocessors.
- Expands Intel386 SX CPU data types to include 32-bit, 64-bit, and 80-bit Floating Point; 32-bit and 64-bit Integers; and 18 Digit BCD Operands.
- Directly extends the Intel386 SX CPU Instruction Set to trigonometric, logarithmic, exponential, and arithmetic functions for all data types.
- Operates independently of Real, Protected, and Virtual-86 Modes of the Intel386 SX Microprocessors.
- Fully compatible with the Intel387 SL Mobile and DX Math CoProcessors. Implements all Intel387 Math CoProcessor architectural enhancements over 8087 and 80287.
- Implements ANSI/IEEE Standard 754-1985 for binary floating point arithmetic.
- Upward Object Code compatible from 8087 and 80287.
2.2 Math CoProcessor Architecture
As shown in Figure 2-1, the Intel387 SX Math CoProcessor is internally divided into four sections: the Bus Control Logic, the Data Interface and Control Unit, the Floating Point Unit, and the Power Management Unit. The Bus Control Logic is responsible for the CPU bus tracking and interface. The Data Interface and Control Unit latches data and decodes instructions. The Floating Point Unit executes the mathematical instructions. The Power Management Unit is new to the Intel387 family and is the nucleus of the static architecture. It is responsible for shutting down idle sections of the device to save power.
Figure 2-1. Intel387 SX Math CoProcessor Block Diagram
Microprocessor/Math CoProcessor Interface
The Intel386 CPU interprets the pattern 11011B in the most significant five bits of an instruction as an opcode intended for a math coprocessor. Instructions thus marked are called ESCAPE or ESC instructions. Upon decoding the instruction as an ESC instruction, the Intel386 CPU transfers the opcode to the math coprocessor through an I/O write cycle at a dedicated address (8000F8H) outside the normal programmed I/O address range. The math coprocessor has dedicated output signals for controlling the data transfer and for notifying the CPU that the Math CoProcessor is busy or that a floating point error has occurred.
2.3 Power Management
The Intel387 SX Math CoProcessor offers two modes of power management: dynamic and idle.
2.3.1 DYNAMIC MODE
The device is in Dynamic Mode while it is executing an instruction. Using Intel's CHMOS IV technology, the Intel387 SX Math CoProcessor draws considerably less power than its predecessor. The active power supply current is reduced to approximately 100 mA at 20 MHz, which keeps case temperatures low.
2.3.2 IDLE MODE
When an instruction is not being executed, the Intel387 SX Math CoProcessor will automatically change to Idle Mode. Three clocks after completion of the previous instruction, the internal power manager shuts down the floating point execution unit and all non-essential circuitry. Only portions of the Bus Interface Unit remain active to monitor the CPU bus activity and to accept the next instruction when it is transferred. When the CPU transfers the next instruction to the Math CoProcessor, the Intel387 SX Math CoProcessor accepts the instruction and ramps the internal core within one clock, so there is no impact on performance or throughput. In idle mode, the Intel387 SX Math CoProcessor typically draws 4 mA of current and reduces case temperature to near ambient.
NOTE:
In asynchronous clock mode (CKM = 0), the internal idle mode is disabled.
2.4 Compatibility
The Intel387 SX Math CoProcessor is compatible with the Intel387 SL Mobile Math CoProcessor. Due to the increased performance and internal pipelining effects, diagnostic programs should never use instruction execution time for test purposes.
2.5 Performance
The increased performance of floating point calculations can be attributed to the 84-bit architecture and floating point processor. Without a math coprocessor, the CPU must execute floating point calculations through very long software emulation routines, with reduced resolution and accuracy. The performance of the Intel387 SX Math CoProcessor has been further enhanced through improvements in the internal microcode and through internal architectural changes. These refinements improve Whetstone benchmark results by approximately 10% to 25% over the original Intel387 SX Math CoProcessor.
Real performance, however, should be measured
with application software. Depending upon software
coding, system overhead, and percentage of floating
point instructions, performance can vary significantly.
3.0 PROGRAMMING INTERFACE
The Intel387 SX Math CoProcessor effectively extends to an Intel386 Microprocessor system additional instructions, registers, data types, and interrupts specifically designed to facilitate high-speed
floating point processing. All communication between the CPU and the Math CoProcessor is transparent to applications software. The CPU automatically controls the Math CoProcessor whenever a
numerics instruction is executed. All physical memory and virtual memory of the CPU are available for
storage of the instructions and operands of programs that use the Math CoProcessor. All memory
addressing modes, including use of displacement,
base register, index register, and scaling are available for addressing numerical operands.
The Intel387 SX Math CoProcessor is software compatible with the Intel387 DX Math CoProcessors and
supports all applications written for the Intel386 CPU
and Intel387 Math CoProcessors.
3.1 Instruction Set
The Intel386 CPU interprets the pattern 11011B in the most significant five bits of an instruction as an opcode intended for a math coprocessor. Instructions thus marked are called ESCAPE or ESC instructions.
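The ESC pattern test can be illustrated with a short sketch. It assumes the five-bit pattern is checked against the first opcode byte; the function name `is_esc_opcode` is hypothetical and is not part of any Intel tool.

```python
def is_esc_opcode(first_byte: int) -> bool:
    """Return True if the top five bits of the byte equal 11011B."""
    return ((first_byte & 0xFF) >> 3) == 0b11011


# Collect every byte value whose upper five bits match the ESC pattern.
esc_bytes = [b for b in range(256) if is_esc_opcode(b)]
print([hex(b) for b in esc_bytes])
```

Under this assumption the pattern selects exactly eight byte values, D8H through DFH, because the low three bits are free.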
The typical Math CoProcessor instruction accepts
one or two operands and produces one or sometimes two results. In two-operand instructions, one
operand is the contents of the Math CoProcessor
register, while the other may be a memory location.
The operands of some instructions are predefined;
for example, FSQRT always takes the square root of
the number in the top stack element.
The Intel387 SX Math CoProcessor instruction set can be divided into six groups. The following sections give a brief description of each instruction. Section 8.0 defines the instruction format and byte fields. Further details can be obtained from the Intel387 User's Manual, Programmer's Reference, Order #231917.
3.1.1 DATA TRANSFER INSTRUCTIONS
This class includes the operations that load, store, and convert operands of any supported data type.
Real Transfers
FLD       Load Real (single, double, extended)
FST       Store Real (single, double)
FSTP      Store Real and pop (single, double, extended)
FXCH      Exchange registers
Integer Transfers
FILD      Load (convert from) Integer (word, short, long)
FIST      Store (convert to) Integer (word, short)
FISTP     Store (convert to) Integer and pop (word, short, long)
Packed Decimal Transfers
FBLD      Load (convert from) packed decimal
FBSTP     Store packed decimal and pop
3.1.2 ARITHMETIC INSTRUCTIONS
This class of instructions provides variations on the basic add, subtract, multiply, and divide operations and a number of other basic arithmetic operations. Operands may reside in registers, or one operand may reside in memory.
Addition
FADD      Add Real
FADDP     Add Real and pop
FIADD     Add Integer
Subtraction
FSUB      Subtract Real
FSUBP     Subtract Real and pop
FISUB     Subtract Integer
FSUBR     Subtract Real reversed
FSUBRP    Subtract Real reversed and pop
FISUBR    Subtract Integer reversed
Multiplication
FMUL      Multiply Real
FMULP     Multiply Real and pop
FIMUL     Multiply Integer
Division
FDIV      Divide Real
FDIVP     Divide Real and pop
FIDIV     Divide Integer
FDIVR     Divide Real reversed
FDIVRP    Divide Real reversed and pop
FIDIVR    Divide Integer reversed
Other Operations
FSQRT     Square Root
FSCALE    Scale
FPREM     Partial Remainder
FPREM1    IEEE standard partial remainder
FRNDINT   Round to Integer
FXTRACT   Extract Exponent and Significand
FABS      Absolute Value
FCHS      Change sign
3.1.3 COMPARISON INSTRUCTIONS
Instructions of this class allow comparison of numbers of all supported real and integer data types. Each of these instructions analyzes the top stack element, often in relationship to another operand, and reports the result as a condition code in the status word.
FCOM      Compare Real
FCOMP     Compare Real and pop
FCOMPP    Compare Real and pop twice
FUCOM     Unordered compare Real
FUCOMP    Unordered compare Real and pop
FUCOMPP   Unordered compare Real and pop twice
FICOM     Compare Integer
FICOMP    Compare Integer and pop
FTST      Test
FXAM      Examine
3.1.4 TRANSCENDENTAL INSTRUCTIONS
This group of the Intel387 operations includes trigonometric, inverse trigonometric, logarithmic and exponential functions. The transcendental instructions operate on the top one or two stack elements, and they return their results to the stack. The trigonometric operations assume their arguments are expressed in radians. The logarithmic and exponential operations work in base 2.
FSIN      Sine
FCOS      Cosine
FSINCOS   Sine and cosine
FPTAN     Tangent
FPATAN    Arctangent of ST(1)/ST
F2XM1     2^X - 1
FYL2X     Y * log2(X)
FYL2XP1   Y * log2(X + 1)
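Because the logarithmic and exponential operations work in base 2, other bases are reached by scaling. The sketch below models F2XM1 and FYL2X with ordinary double-precision Python functions, only to show the arithmetic identities. The function names are illustrative, and the model ignores operand range limits, extended precision, and the register stack.

```python
import math


def f2xm1(x: float) -> float:
    """Arithmetic model of F2XM1: 2**x - 1."""
    return 2.0 ** x - 1.0


def fyl2x(x: float, y: float) -> float:
    """Arithmetic model of FYL2X: y * log2(x)."""
    return y * math.log2(x)


def natural_log(x: float) -> float:
    # ln(x) = ln(2) * log2(x); ln(2) is the constant loaded by FLDLN2.
    return fyl2x(x, math.log(2.0))


def common_log(x: float) -> float:
    # log10(x) = log10(2) * log2(x); log10(2) is the constant loaded by FLDLG2.
    return fyl2x(x, math.log10(2.0))


def two_to_the(x: float) -> float:
    # 2**x is recovered by adding 1 back to the F2XM1 result.
    return f2xm1(x) + 1.0


print(natural_log(10.0), common_log(1000.0), two_to_the(0.5))
```

The scaling factors are the same constants that the load constant instructions provide, which is why those constants sit alongside a base-2 transcendental set.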
3.1.5 LOAD CONSTANT INSTRUCTIONS
Each of these instructions loads (pushes) a commonly used constant onto the stack. The constants have extended real values nearest to the infinitely precise numbers. The only error that can be generated is an Invalid Exception if a stack overflow occurs.
FLDZ      Load +0.0
FLD1      Load +1.0
FLDPI     Load π
FLDL2T    Load log2(10)
FLDL2E    Load log2(e)
FLDLG2    Load log10(2)
FLDLN2    Load loge(2)
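For orientation, the constants listed above can be tabulated numerically. The values below are double-precision approximations computed with Python's `math` module. They are not the extended real values held by the Math CoProcessor, and the dictionary name is illustrative.

```python
import math

# Double-precision approximations of the constants named in the list above.
LOAD_CONSTANTS = {
    "FLDZ": 0.0,
    "FLD1": 1.0,
    "FLDPI": math.pi,
    "FLDL2T": math.log2(10.0),
    "FLDL2E": math.log2(math.e),
    "FLDLG2": math.log10(2.0),
    "FLDLN2": math.log(2.0),
}

for mnemonic, value in LOAD_CONSTANTS.items():
    print(f"{mnemonic:7s} {value:.15f}")
```

Two pairs are reciprocals of each other: log2(10) with log10(2), and log2(e) with loge(2).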
3.1.6 PROCESSOR INSTRUCTIONS (ADMINISTRATIVE)
FINIT     Initialize Math CoProcessor
FLDCW     Load Control Word
FSTCW     Store Control Word
FSTSW     Store Status Word
FSTSW AX  Store Status Word to AX register
FCLEX     Clear Exceptions
FSTENV    Store Environment
FLDENV    Load Environment
FSAVE     Save State
FRSTOR    Restore State
FINCSTP   Increment Stack pointer
FDECSTP   Decrement Stack pointer
FFREE     Free Register
FNOP      No Operation
FWAIT     Report Math CoProcessor Error
3.2 Register Set
Figure 3-1 shows the Intel387 SX Math CoProcessor
register set. When a Math CoProcessor is present in
a system, programmers may use these registers in
addition to the registers normally available on the
CPU.
[Figure 3-1 shows two register groups. Intel386 Microprocessor registers: the 32-bit general registers EAX, EBX, ECX, EDX, ESI, EDI, EBP, and ESP (with their 16-bit and 8-bit parts AX/AH/AL, BX/BH/BL, CX/CH/CL, DX/DH/DL, SI, DI, BP, and SP); the 16-bit segment registers CS, SS, DS, ES, FS, and GS; and the 32-bit EIP and EFLAGS. Intel387 Math CoProcessor registers: eight data registers R0–R7, each with a Sign bit (bit 79), an Exponent (bits 78–64), a Significand (bits 63–0), and a 2-bit Tag Field; the 16-bit Control Register, Status Register, and Tag Word; and the 48-bit Instruction Pointer and Data Pointer (in CPU).]
Figure 3-1. Intel386 CPU and Intel387 Math CoProcessor Register Set
3.2.1 STATUS WORD (SW) REGISTER
The 16-bit status word (in the status register) shown in Figure 3-2 reflects the overall state of the Math CoProcessor. It can be read and inspected by programs using the FSTSW memory or FSTSW AX instructions.
Bit 15, the Busy bit (B), is included for 8087 compatibility only. It always has the same value as the Error Summary bit (ES, bit 7 of the status word); it does not indicate the status of the BUSY# output of the Math CoProcessor.
Bits 13–11 (TOP) serve as the pointer to the Math CoProcessor data register that is the current Top-Of-Stack. The significance of the stack top is described in Section 3.2.3, Data Register.
The four numeric condition code bits (C3–C0, bits 14 and 10–8) are similar to the flags in a CPU; instructions that perform arithmetic operations update these bits to reflect the outcome. The effects of the instructions on the condition code are summarized in Tables 3-1 through 3-4. These condition code bits are used principally for conditional branching. The FSTSW AX instruction stores the Math CoProcessor status word directly to the CPU AX register, allowing the condition codes to be inspected efficiently by Intel386 CPU code. The Intel386 CPU SAHF instruction can copy C3–C0 directly to the flag bits to simplify conditional branching. Table 3-5 shows the mapping of these bits to the Intel386 CPU flag bits.
Bit 7 is the error summary (ES) status bit. This bit is set if any unmasked exception bit is set; it is clear otherwise. If this bit is set, the ERROR# signal is asserted.
Bit 6 is the stack flag (SF). This bit is used to distinguish invalid operations due to stack overflow or underflow from other kinds of invalid operations. When SF is set, bit 9 (C1) distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0).
Bits 5–0 are the six exception flags of the status word. Each is set to indicate that, during an instruction execution, the Math CoProcessor has detected one of six possible exception conditions since these status bits were last cleared or reset. Section 3.5, Exception Handling, explains how they are set and used.
The exception flags are "sticky" bits and can only be cleared by the instructions FINIT, FCLEX, FLDENV, FSAVE, and FRSTOR. Note that when a new value is loaded into the status word by the FLDENV or FRSTOR instruction, the values of ES (bit 7) and B (bit 15) are not derived from the values loaded from memory but rather depend on the values of the exception flags (bits 5–0) in the status word and their corresponding masks in the control word. If ES is set in such a case, the ERROR# output of the Math CoProcessor is activated immediately.
ES is set if any unmasked exception bit is set; cleared otherwise. See Table 3-1 for interpretation of condition code.
TOP values:
000 = Register 0 is Top of Stack
001 = Register 1 is Top of Stack
...
111 = Register 7 is Top of Stack
For definitions of exceptions, refer to the section entitled "Exception Handling".
Figure 3-2. Status Word
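The bit positions described for the status word can be unpacked as in the following sketch. The field positions come from the text above. The individual exception flag names for bits 5–0 (IE, DE, ZE, OE, UE, PE) follow the usual x87 ordering and are an assumption here, since this section names only IE and PE. The names `StatusWord` and `decode_status_word` are illustrative.

```python
from typing import NamedTuple, Tuple

# Assumed names for status word bits 0 through 5 (usual x87 ordering).
EXCEPTION_FLAGS = ("IE", "DE", "ZE", "OE", "UE", "PE")


class StatusWord(NamedTuple):
    busy: int                    # bit 15, mirrors ES
    c3: int                      # bit 14
    top: int                     # bits 13-11
    c2: int                      # bit 10
    c1: int                      # bit 9
    c0: int                      # bit 8
    es: int                      # bit 7, error summary
    sf: int                      # bit 6, stack flag
    exceptions: Tuple[str, ...]  # names of the set flags in bits 5-0


def decode_status_word(sw: int) -> StatusWord:
    """Split a 16-bit status word value into its fields."""
    flags = tuple(name for bit, name in enumerate(EXCEPTION_FLAGS) if sw & (1 << bit))
    return StatusWord(
        busy=(sw >> 15) & 1,
        c3=(sw >> 14) & 1,
        top=(sw >> 11) & 0b111,
        c2=(sw >> 10) & 1,
        c1=(sw >> 9) & 1,
        c0=(sw >> 8) & 1,
        es=(sw >> 7) & 1,
        sf=(sw >> 6) & 1,
        exceptions=flags,
    )


# Example value: B, ES, SF and IE set, TOP = 7, C1 = 0 (a stack underflow).
example = decode_status_word(0xB8C1)
print(example)
```

In the example value, SF = 1 with C1 = 0 corresponds to the stack underflow case described for bit 6.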
Table 3-1. Condition Code Interpretation

FPREM, FPREM1 (see Table 3-2)
  C0 (S), C3 (Z), C1 (A): Three least significant bits of quotient (Q2, Q1, Q0 respectively; see Table 3-2); C1 may instead report O/U#
  C2 (C): Reduction (0 = complete, 1 = incomplete)

FCOM, FCOMP, FCOMPP, FTST, FUCOM, FUCOMP, FUCOMPP, FICOM, FICOMP
  C0 (S), C3 (Z): Result of comparison (see Table 3-3)
  C1 (A): Zero or O/U#
  C2 (C): Operand is not comparable (Table 3-3)

FXAM
  C0 (S), C3 (Z): Operand class (see Table 3-4)
  C1 (A): Sign or O/U#
  C2 (C): Operand class (Table 3-4)

FCHS, FABS, FXCH, FINCSTP, FDECSTP, Constant loads, FXTRACT, FLD, FILD, FBLD, FSTP (ext real)
  C0 (S), C3 (Z): UNDEFINED
  C1 (A): Zero or O/U#
  C2 (C): UNDEFINED

FIST, FBSTP, FRNDINT, FST, FSTP, FADD, FMUL, FDIV, FDIVR, FSUB, FSUBR, FSCALE, FSQRT, FPATAN, F2XM1, FYL2X, FYL2XP1
  C0 (S), C3 (Z): UNDEFINED
  C1 (A): Roundup or O/U#
  C2 (C): UNDEFINED

FPTAN, FSIN, FCOS, FSINCOS
  C0 (S), C3 (Z): UNDEFINED
  C1 (A): Roundup or O/U#, undefined if C2 = 1
  C2 (C): Reduction (0 = complete, 1 = incomplete)

FLDENV, FRSTOR
  C0, C3, C1, C2: Each bit loaded from memory

FLDCW, FSTENV, FSTCW, FSTSW, FCLEX, FINIT, FSAVE
  C0, C3, C1, C2: UNDEFINED

O/U#       When both IE and SF bits of status word are set, indicating a stack exception, this bit distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0).
Reduction  If FPREM or FPREM1 produces a remainder that is less than the modulus, reduction is complete. When reduction is incomplete the value at the top of the stack is a partial remainder, which can be used as input to further reduction. For FPTAN, FSIN, FCOS, and FSINCOS, the reduction bit is set if the operand at the top of the stack is too large. In this case the original operand remains at the top of the stack.
Roundup    When the PE bit of the status word is set, this bit indicates whether the last rounding in the instruction was upward.
UNDEFINED  Do not rely on finding any specific value in these bits.
Table 3-2. Condition Code Interpretation after FPREM and FPREM1 Instructions

C2   C3 (Q1)   C1 (Q0)   C0 (Q2)   Q MOD 8   Interpretation after FPREM and FPREM1
1    X         X         X         -         Incomplete Reduction: further iteration required for complete reduction
0    0         0         0         0         Complete Reduction: C0, C3, C1 contain three least significant bits of quotient
0    0         1         0         1
0    1         0         0         2
0    1         1         0         3
0    0         0         1         4
0    0         1         1         5
0    1         0         1         6
0    1         1         1         7
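The Table 3-2 encoding can be read back from a stored status word as sketched below. The bit positions (C0 = bit 8, C1 = bit 9, C2 = bit 10, C3 = bit 14) are those given in Section 3.2.1; the function name is illustrative.

```python
from typing import Optional


def fprem_quotient_mod8(sw: int) -> Optional[int]:
    """Return the quotient modulo 8 after FPREM/FPREM1.

    Returns None when C2 = 1, meaning the reduction is incomplete
    and the instruction must be repeated.
    """
    c0 = (sw >> 8) & 1   # Q2
    c1 = (sw >> 9) & 1   # Q0
    c2 = (sw >> 10) & 1  # 1 = incomplete reduction
    c3 = (sw >> 14) & 1  # Q1
    if c2:
        return None
    return (c0 << 2) | (c3 << 1) | c1


print(fprem_quotient_mod8(0x4100))  # C3 = 1, C0 = 1: Q2 = 1, Q1 = 1, Q0 = 0
```

A caller would typically loop on the remainder instruction until the function stops returning None, then use the three quotient bits.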
Table 3-3. Condition Code Resulting from Comparison

Order           C3   C2   C0
TOP > Operand   0    0    0
TOP < Operand   0    0    1
TOP = Operand   1    0    0
Unordered       1    1    1
Table 3-4. Condition Code Defining Operand Class

C3   C2   C1   C0   Value at TOP
0    0    0    0    + Unsupported
0    0    0    1    + NaN
0    0    1    0    - Unsupported
0    0    1    1    - NaN
0    1    0    0    + Normal
0    1    0    1    + Infinity
0    1    1    0    - Normal
0    1    1    1    - Infinity
1    0    0    0    + 0
1    0    0    1    + Empty
1    0    1    0    - 0
1    0    1    1    - Empty
1    1    0    0    + Denormal
1    1    1    0    - Denormal
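Table 3-4 separates into a sign (C1) and a class selected by C3, C2, and C0. A lookup along those lines might be sketched as follows; the names are illustrative, and combinations not listed in the table return None.

```python
from typing import Optional

# (C3, C2, C0) -> operand class, as listed in Table 3-4.
FXAM_CLASSES = {
    (0, 0, 0): "Unsupported",
    (0, 0, 1): "NaN",
    (0, 1, 0): "Normal",
    (0, 1, 1): "Infinity",
    (1, 0, 0): "0",
    (1, 0, 1): "Empty",
    (1, 1, 0): "Denormal",
}


def classify_fxam(sw: int) -> Optional[str]:
    """Interpret the condition code left in the status word by FXAM."""
    c0 = (sw >> 8) & 1
    c1 = (sw >> 9) & 1   # sign of the value at TOP
    c2 = (sw >> 10) & 1
    c3 = (sw >> 14) & 1
    name = FXAM_CLASSES.get((c3, c2, c0))
    if name is None:
        return None      # combination not listed in Table 3-4
    return ("-" if c1 else "+") + " " + name


print(classify_fxam(0x0700))  # C2 = C1 = C0 = 1
```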
Table 3-5. Mapping Condition Codes to Intel386 CPU Flag Bits
[Bit-mapping diagram not reproduced.]
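The FSTSW AX and SAHF sequence described in Section 3.2.1 can be modeled in a few lines. The sketch relies on the standard Intel386 behavior that SAHF copies AH bits 0, 2, and 6 into CF, PF, and ZF, so status word bits 8 (C0), 10 (C2), and 14 (C3) land on those flags. That mapping is stated here as background knowledge rather than taken from the diagram, and the function names are illustrative.

```python
# Bit masks within AH for the flags that SAHF loads.
CF_MASK, PF_MASK, ZF_MASK = 0x01, 0x04, 0x40


def flags_after_sahf(sw: int) -> dict:
    """Model FSTSW AX followed by SAHF for the three relevant flags."""
    ah = (sw >> 8) & 0xFF  # upper byte of the status word ends up in AH
    return {
        "CF": bool(ah & CF_MASK),  # C0
        "PF": bool(ah & PF_MASK),  # C2
        "ZF": bool(ah & ZF_MASK),  # C3
    }


def comparison_result(sw: int) -> str:
    """Interpret the flags according to Table 3-3."""
    flags = flags_after_sahf(sw)
    if flags["PF"]:
        return "unordered"
    if flags["ZF"]:
        return "equal"
    if flags["CF"]:
        return "less"      # TOP < Operand
    return "greater"       # TOP > Operand


print(comparison_result(0x0100))
```

Testing the unordered case first matters because that case sets all three bits.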
3.2.2 CONTROL WORD (CW) REGISTER
The Math CoProcessor provides the programmer
with several processing options that are selected by
loading a control word from memory into the control
register. Figure 3-3 show the format and encoding of
fields in the control word.
The low-order byte of the control word register is
used to configure the exception masking. Bits 5 – 0
of the control word contain individual masks for each
of the six exceptions that the Math CoProcessor recognizes. See Section 3.5, Exception Handling, for
further explanation on the exception control and definition.
The high-order byte of the control word is used to
configure the Math CoProcessor operating mode, including precision, rounding and infinity control.
# The rounding control (RC) field (bits 11–10) provide for directed rounding and true chop, as well
as the unbiased round to nearest even mode
specified in the IEEE standard. Rounding control
affects only those instructions that perform
rounding at the end of the operation (and thus
can generate a precision exception); namely,
FST, FSTP, FIST, all arithmetic instructions (except FPREM, FPREM1, FXTRACT, FABS, and
FCHS) and all transcendental instructions.
# The precision control (PC) field (bits 9 – 8) can be
used to set the Math CoProcessor internal operating precision of the significand at less than the
default of 64 bits (extended precision). This can
be useful in providing compatibility with early generation arithmetic processors of smaller precision. PC affects only the instructions FADD,
FSUB(R), FMUL, FDIV(R), and FSQRT. For all
other instructions, either the precision is determined by the opcode or extended precision is
used.
# The ‘‘infinity control bit’’ (bit 12) is not meaningful
to the Intel387 SX Math CoProcessor and programs must ignore its value. To maintain compatibility with the 8087 and 80287 (non-387 core),
this bit can be programmed; however, regardless
of its value the Intel387 SX Math CoProcessor
always treats infinity in the affine sense (−∞ < +∞). This bit is initialized to zero both after a
hardware reset and after an FINIT instruction.
All other bits are reserved and should not be programmed, to assure compatibility with future processors.
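The field layout described above can be illustrated with a short decoding sketch. It is not part of the device documentation: the function names are hypothetical, and the order of the six mask bits (IM, DM, ZM, OM, UM, PM from bit 0 upward) is assumed from the usual x87 convention rather than stated in this section.

```python
# Sketch only: names are hypothetical; mask-bit order is an assumption
# (IM, DM, ZM, OM, UM, PM from bit 0 upward, the usual x87 convention).
MASK_NAMES = ("IM", "DM", "ZM", "OM", "UM", "PM")
PRECISION = {0b00: "24 bits (single)", 0b01: "reserved",
             0b10: "53 bits (double)", 0b11: "64 bits (extended)"}
ROUNDING = {0b00: "nearest or even", 0b01: "down (toward -inf)",
            0b10: "up (toward +inf)", 0b11: "chop (toward zero)"}


def decode_control_word(cw):
    """Split a 16-bit control word into mask, PC and RC fields."""
    masked = [name for bit, name in enumerate(MASK_NAMES) if cw & (1 << bit)]
    return {
        "masked": masked,                         # bits 5-0
        "precision": PRECISION[(cw >> 8) & 0b11],  # bits 9-8
        "rounding": ROUNDING[(cw >> 10) & 0b11],   # bits 11-10
    }


def set_rounding(cw, rc):
    """Return cw with the RC field (bits 11-10) replaced by rc (0..3)."""
    return (cw & ~(0b11 << 10) & 0xFFFF) | ((rc & 0b11) << 10)


if __name__ == "__main__":
    print(decode_control_word(0x037F))
    print(hex(set_rounding(0x037F, 0b11)))
```

Decoding the value 037FH (the control word value quoted in Section 3.6) gives all six exceptions masked, 64-bit precision, and round to nearest.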
Precision Control (PC, bits 9–8)
00 = 24 bits (single precision)
01 = (reserved)
10 = 53 bits (double precision)
11 = 64 bits (extended precision)

Rounding Control (RC, bits 11–10)
00 = Round to nearest or even
01 = Round down (toward −∞)
10 = Round up (toward +∞)
11 = Chop (truncate toward zero)

Figure 3-3. Control Word
3.2.3 DATA REGISTER
The Intel387 SX Math CoProcessor data register set
consists of eight registers (R0–R7) which are treated as both a stack and a general register file. Each
of these data registers in the Math CoProcessor is
80 bits wide and is divided into fields corresponding
to the Math CoProcessor’s extended-precision real
data type, which is used for internal calculations.
The Math CoProcessor register set can be accessed
either as a stack, with instructions operating on the
top one or two stack elements, or as individually addressable registers. The TOP field in the status word
identifies the current top-of-stack register. A ‘‘push’’
operation decrements TOP by one and loads a value
into the new top register. A ‘‘store and pop’’ operation stores the value from the current top register
into memory and then increments TOP by one. The
Math CoProcessor register stack grows ‘‘down’’
toward lower-addressed registers.
Most of the Intel387 SX Math CoProcessor operations use the register stack as the operand(s) and/or
as a place to store the result. Instructions may address the data register either implicitly or explicitly.
Many instructions operate on the register at the top
of the stack. These instructions implicitly address
the register at which TOP points. Other instructions
allow the programmer to explicitly specify which register to use. Explicit register addressing is also relative to TOP: ST denotes the current stack top
and ST(i) refers to the i’th register from ST in the
stack, so the real register address is computed as
ST + i.
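The TOP-relative addressing described above can be modelled in a few lines. This is a toy sketch with hypothetical names, assuming that register numbers wrap modulo 8 (consistent with Section 3.6, where the first push after TOP = 0 selects register seven). It does not model tag checking or stack overflow/underflow detection.

```python
class RegisterStack:
    """Toy model of the eight data registers addressed relative to TOP."""

    def __init__(self):
        self.regs = [None] * 8   # physical registers R0..R7
        self.top = 0             # TOP field of the status word

    def physical(self, i=0):
        """Physical register number of ST(i): (TOP + i) modulo 8."""
        return (self.top + i) % 8

    def push(self, value):
        """Decrement TOP, then load the value into the new top register."""
        self.top = (self.top - 1) % 8
        self.regs[self.top] = value

    def pop(self):
        """Read the top register, then increment TOP."""
        value = self.regs[self.top]
        self.regs[self.top] = None
        self.top = (self.top + 1) % 8
        return value


if __name__ == "__main__":
    s = RegisterStack()
    s.push(1.0)
    s.push(2.0)
    print(s.top, s.physical(1), s.regs[s.physical(1)])
```

After two pushes from the initial state, ST is physical register 6 and ST(1) is physical register 7.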
3.2.4 TAG WORD (TW) REGISTER
The tag word marks the content of each numeric
data register, as Figure 3-4 shows. Each two-bit tag
represents one of the eight data registers. The principal
function of the tag word is to optimize the Math
CoProcessor’s performance and stack handling by
making it possible to distinguish between empty and
non-empty register locations. It also enables exception handlers to identify special values (e.g. NaNs or
denormals) in the contents of a stack location without the need to perform complex decoding of the
actual data.
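As an illustration of how a program might read the tag word, the sketch below extracts the two-bit tag of a physical register and of a TOP-relative stack element. The function names are hypothetical; the tag encodings and the placement of TAG(0) in the two low-order bits follow Figure 3-4.

```python
# Tag encodings as listed in Figure 3-4.
TAG_MEANING = {0b00: "valid", 0b01: "zero", 0b10: "special", 0b11: "empty"}


def tag_of_physical(tag_word, r):
    """Two-bit tag of physical register r (TAG(0) occupies bits 1-0)."""
    return (tag_word >> (2 * r)) & 0b11


def tag_of_st(tag_word, top, i=0):
    """Tag of ST(i); tag indices are not top-relative, so TOP is added first."""
    return tag_of_physical(tag_word, (top + i) % 8)


if __name__ == "__main__":
    # Tag word FFFFH marks every register empty.
    print([TAG_MEANING[tag_of_physical(0xFFFF, r)] for r in range(8)])
    # With TOP = 7 and only physical register 7 holding a valid value:
    print(TAG_MEANING[tag_of_st(0x3FFF, top=7, i=0)])
```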
3.2.5 INSTRUCTION AND DATA POINTERS
Because the Math CoProcessor operates in parallel
with the CPU, any exceptions detected by the Math
CoProcessor may be reported after the CPU has executed the ESC instruction which caused it. To allow
identification of the numeric instruction which
caused the exception, the Intel386 Microprocessor
contains registers that aid in diagnosis. These registers supply the address of the failing instruction and
the address of its numeric memory operand (if appropriate).
The instruction and data pointers are provided for
user-written exception handlers. These registers are
located in the CPU, but appear to be located in the
Math CoProcessor because they are accessed by
the ESC instructions FLDENV, FSTENV, FSAVE,
and FRSTOR, which transfer the values between
the registers and memory. Whenever the CPU executes a new ESC instruction (except administrative
instructions), it saves the address of the instruction
(including any prefixes that may be present), the address of the operand (if present) and the opcode.
The instruction and data pointers appear in one of
four formats depending on the operating mode of
the CPU (protected mode or real-address mode)
and depending on the operand size attribute in effect (32-bit operand or 16-bit operand). (See Figures
3-5, 3-6, 3-7, and 3-8.) Note that the value of the
data pointer is undefined if the prior ESC instruction
did not have a memory operand.
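As a sketch of how an exception handler might interpret one of these images, the following decodes the 14-byte 16-bit real-address mode format of Figure 3-8. The function name and the sample values are hypothetical, and little-endian byte order is assumed (as used by the CPU).

```python
import struct


def parse_env16_real(image):
    """Decode a 14-byte image in the 16-bit real-address mode format."""
    # Seven little-endian 16-bit words at byte offsets 0, 2, 4, 6, 8, A, C.
    cw, sw, tw, ip_lo, w8, dp_lo, wc = struct.unpack("<7H", image[:14])
    return {
        "control_word": cw,
        "status_word": sw,
        "tag_word": tw,
        # Offset 8: IP 19..16 in bits 15-12, opcode 10..0 in bits 10-0.
        "instruction_pointer": ((w8 >> 12) << 16) | ip_lo,
        "opcode": w8 & 0x7FF,
        # Offset C: operand pointer 19..16 in bits 15-12, other bits zero.
        "operand_pointer": ((wc >> 12) << 16) | dp_lo,
    }


if __name__ == "__main__":
    # Hypothetical image, built only to exercise the decoder.
    sample = struct.pack("<7H", 0x037F, 0x0000, 0xFFFF,
                         0x2345, 0x11C1, 0x6789, 0xA000)
    env = parse_env16_real(sample)
    print(hex(env["instruction_pointer"]), hex(env["opcode"]),
          hex(env["operand_pointer"]))
```

The other three formats differ only in field widths and offsets, so the same approach applies with a different unpack layout.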
Bits:   15–14   13–12   11–10   9–8     7–6     5–4     3–2     1–0
Field:  TAG(7)  TAG(6)  TAG(5)  TAG(4)  TAG(3)  TAG(2)  TAG(1)  TAG(0)

NOTE:
The index i of tag(i) is not top-relative. A program typically uses the ‘‘top’’ field of Status Word to determine which tag(i)
field refers to logical top of stack.

TAG VALUES:
00 = Valid
01 = Zero
10 = QNaN, SNaN, Infinity, Denormal and Unsupported Formats
11 = Empty

Figure 3-4. Tag Word Register
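32-BIT PROTECTED MODE FORMAT
(fields listed from bit 31 down to bit 0)

Byte Offset   Contents
0H            RESERVED (31–16) | CONTROL WORD (15–0)
4H            RESERVED (31–16) | STATUS WORD (15–0)
8H            RESERVED (31–16) | TAG WORD (15–0)
CH            IP OFFSET (31–0)
10H           00000 | OPCODE 10..0 | CS SELECTOR (15–0)
14H           DATA OPERAND OFFSET (31–0)
18H           RESERVED (31–16) | OPERAND SELECTOR (15–0)

Figure 3-5. Instruction and Data Pointer Image in Memory, 32-Bit Protected-Mode Format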
16-BIT PROTECTED MODE FORMAT

Byte Offset   Contents (bits 15–0)
0H            CONTROL WORD
2H            STATUS WORD
4H            TAG WORD
6H            IP OFFSET
8H            CS SELECTOR
AH            OPERAND OFFSET
CH            OPERAND SELECTOR

Figure 3-6. Instruction and Data Pointer Image in Memory, 16-Bit Protected-Mode Format
32-BIT REAL-ADDRESS MODE FORMAT
(fields listed from bit 31 down to bit 0)

Byte Offset   Contents
0H            RESERVED (31–16) | CONTROL WORD (15–0)
4H            RESERVED (31–16) | STATUS WORD (15–0)
8H            RESERVED (31–16) | TAG WORD (15–0)
CH            RESERVED (31–16) | INSTRUCTION POINTER 15..0
10H           0000 | INSTRUCTION POINTER 31..16 | 0 | OPCODE 10..0
14H           RESERVED (31–16) | OPERAND POINTER 15..0
18H           0000 | OPERAND POINTER 31..16 | 0000 00000000

Figure 3-7. Instruction and Data Pointer Image in Memory, 32-Bit Real-Mode Format
16-BIT REAL-ADDRESS MODE AND VIRTUAL 8086 MODE FORMAT
(fields listed from bit 15 down to bit 0)

Byte Offset   Contents
0H            CONTROL WORD
2H            STATUS WORD
4H            TAG WORD
6H            INSTRUCTION POINTER 15..0
8H            IP 19..16 | 0 | OPCODE 10..0
AH            OPERAND POINTER 15..0
CH            DP 19..16 | 0 0 0 0 0 0 0 0 0 0 0 0

Figure 3-8. Instruction and Data Pointer Image in Memory, 16-Bit Real-Mode Format
3.3 Data Types
Table 3-6 lists the seven data types that the Math
CoProcessor supports and presents the format for
each type. Operands are stored in memory with the
least significant digit at the lowest memory address.
Programs retrieve these values by generating the
lowest address. For maximum system performance,
all operands should start at physical-memory addresses that correspond to the word size of the
CPU; operands may begin at any other addresses,
but will require extra memory cycles to access the
entire operand.
The data type formats can be divided into three
classes: binary integer, decimal integer, and binary
real. These formats, however, exist in memory only.
Internally, the Math CoProcessor holds all numbers
in the extended-precision real format. Instructions
that load operands from memory automatically convert operands represented in memory as 16, 32, or
64-bit integers, 32 or 64-bit floating point numbers,
or 18 digit packed BCD numbers into extended-precision real format. Instructions that store operands in
memory perform the inverse type conversion.
In addition to the typical real and integer data values,
the Intel387 SX Math CoProcessor data formats encompass encodings for a variety of special values.
These special values have significance and can express relevant information about the computations
or operations that produced them. The various types
of special values are denormal real numbers, zeros,
positive and negative infinity, NaNs (Not-a-Number),
Indefinite, and unsupported formats. For further information on data types and formats, see the Intel387 Programmer’s Reference Manual.
3.4 Interrupt Description
CPU interrupts are used to report errors or exceptional conditions while executing numeric programs
in either real or protected mode. Table 3-7 shows
these interrupts and their functions.
3.5 Exception Handling
The Math CoProcessor detects six different exception conditions that occur during instruction execution. Table 3-8 lists the exception conditions in order
of precedence, showing for each the cause and the
default action taken by the Math CoProcessor if the
exception is masked by its corresponding mask bit in
the control word.
Any exception that is not masked by the control
word sets the corresponding exception flag of the
status word, sets the ES bit of the status word, and
asserts the ERROR# signal. When the CPU attempts to execute another ESC instruction or WAIT,
exception 16 occurs. The exception condition must
be resolved via an interrupt service routine. The return address pushed onto the CPU stack upon entry
to the service routine does not necessarily point to
the failing instruction nor to the following instruction.
The CPU saves the address of the floating-point instruction that caused the exception and the address
of any memory operand required by that instruction.
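The relationship between exception flags and mask bits can be sketched as a simple bitwise test. The names below are hypothetical, and the sketch assumes the usual x87 convention that the six exception flags occupy bits 5–0 of the status word in the same order as the mask bits in the control word.

```python
# Assumed bit order (bit 0 upward), matching the usual x87 convention.
EXCEPTIONS = ("invalid", "denormal", "zero_divide",
              "overflow", "underflow", "precision")


def unmasked_pending(status_word, control_word):
    """Names of exceptions that are flagged in SW and not masked in CW."""
    flags = status_word & 0x3F
    masks = control_word & 0x3F
    pending = flags & ~masks
    return [name for bit, name in enumerate(EXCEPTIONS) if pending & (1 << bit)]


if __name__ == "__main__":
    # Zero-divide flag set, all exceptions masked: nothing is reported.
    print(unmasked_pending(0x0004, 0x037F))
    # Same flag with the zero-divide mask cleared: the exception is pending.
    print(unmasked_pending(0x0004, 0x037B))
```

A non-empty result corresponds to the case in which ES is set and ERROR# is asserted.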
Table 3-6. Intel387 TM SX Math CoProcessor Data Type Representation in Memory
NOTES:
1. S = Sign bit (0 = positive, 1 = negative)
2. dn = Decimal digit (two per byte)
3. X = Bits have no significance; Math CoProcessor ignores when loading, zeros when storing
4. U = Position of implicit binary point
5. I = Integer bit of significand; stored in temporary real, implicit in single and double precision
6. Exponent Bias (normalized values):
Single: 127 (7FH)
Double: 1023 (3FFH)
Extended Real: 16383 (3FFFH)
7. Packed BCD: (−1)^S (D17..D0)
8. Real: (−1)^S (2^(E−BIAS)) (F0.F1...)
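Note 8 can be made concrete for the extended-precision format. The sketch below, with a hypothetical function name, evaluates (−1)^S × 2^(E−BIAS) × (F0.F1...) for an 80-bit value supplied as an integer, assuming the sign in bit 79, the exponent in bits 78–64, and the significand (with its explicit integer bit) in bits 63–0. Only zero and normalized values are handled; denormals, infinities, NaNs and unsupported formats are deliberately left out.

```python
from fractions import Fraction

EXTENDED_BIAS = 16383  # 3FFFH, from note 6


def decode_extended(raw):
    """Exact value of a zero or normalized 80-bit extended real."""
    sign = (raw >> 79) & 1
    exponent = (raw >> 64) & 0x7FFF
    significand = raw & 0xFFFFFFFFFFFFFFFF  # includes the explicit integer bit
    if exponent == 0x7FFF:
        raise ValueError("infinity or NaN encoding is not handled here")
    if exponent == 0 and significand != 0:
        raise ValueError("denormal encoding is not handled here")
    # F0.F1F2...: the binary point sits just below bit 63 of the significand.
    magnitude = (Fraction(significand, 1 << 63)
                 * Fraction(2) ** (exponent - EXTENDED_BIAS))
    return -magnitude if sign else magnitude


if __name__ == "__main__":
    print(decode_extended(0x3FFF8000000000000000))
    print(decode_extended(0xC000A000000000000000))
```

Exact rational arithmetic is used so that the result does not depend on the host's own floating-point precision.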
Table 3-7. CPU Interrupt Vectors Reserved for Math CoProcessor
Interrupt
Number     Cause of Interrupt

7          An ESC instruction was encountered when EM or TS of CPU control register zero (CR0) was
           set. EM = 1 indicates that software emulation of the instruction is required. When TS is set,
           either an ESC or WAIT instruction causes interrupt 7. This indicates that the current Math
           CoProcessor context may not belong to the current task.

9          In a protected-mode system, an operand of a coprocessor instruction wrapped around an
           addressing limit (0FFFFH for expand-up segments, zero for expand-down segments) and
           spanned inaccessible addresses (see Note 1). The failing numerics instruction is not restartable. The
           address of the failing numerics instruction and data operand may be lost; an FSTENV does not
           return reliable addresses. The segment overrun exception should be handled by executing an
           FNINIT instruction (i.e., an FINIT without a preceding WAIT). The exception can be avoided by
           never allowing numerics operands to cross the end of a segment.

13         In a protected-mode system, the first word of a numeric operand is not entirely within the limit of
           its segment. The return address pushed onto the stack of the exception handler points at the
           ESC instruction that caused the exception, including any prefixes. The Math CoProcessor has
           not executed this instruction; the instruction pointer and data pointer register refer to a previous,
           correctly executed instruction.

16         The previous numerics instruction caused an unmasked exception. The address of the faulty
           instruction and the address of its operand are stored in the instruction pointer and data pointer
           registers. Only ESC and WAIT instructions can cause this interrupt. The CPU return address
           pushed onto the stack of the exception handler points to a WAIT or ESC instruction (including
           prefixes). This instruction can be restarted after clearing the exception condition in the Math
           CoProcessor. FNINIT, FNCLEX, FNSTSW, FNSTENV, and FNSAVE cannot cause this interrupt.
NOTE:
1. An operand may wrap around an addressing limit when the segment limit is near an addressing limit and the operand is
near the largest valid address in the segment. Because of the wrap-around, the beginning and ending addresses of such an
operand will be at opposite ends of the segment. There are two ways that such an operand may also span inaccessible
addresses: 1) if the segment limit is not equal to the addressing limit (e.g. addressing limit is FFFFH and segment limit is
FFFDH) the operand will span addresses that are not within the segment (e.g. an 8-byte operand that starts at valid offset
FFFCH will span addresses FFFC–FFFFH and 0000-0003H; however addresses FFFEH and FFFFH are not valid, because
they exceed the limit); 2) if the operand begins and ends in present and accessible segments but intermediate bytes of the
operand fall in a not-present page or in a segment or page to which the procedure does not have access rights.
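The first case in the note can be checked with a small calculation. The sketch below uses hypothetical function names and models only an expand-up segment whose addressing limit has the form 2^n − 1; it reproduces the example of an 8-byte operand at offset FFFCH with a segment limit of FFFDH.

```python
def spanned_offsets(start, length, addressing_limit=0xFFFF):
    """Offsets touched by an operand, wrapping at the addressing limit.

    The bitwise AND acts as a modulo, so addressing_limit must be 2**n - 1.
    """
    return [(start + k) & addressing_limit for k in range(length)]


def invalid_offsets(start, length, segment_limit, addressing_limit=0xFFFF):
    """Spanned offsets that lie above the segment limit (expand-up segment)."""
    return [off for off in spanned_offsets(start, length, addressing_limit)
            if off > segment_limit]


if __name__ == "__main__":
    print([hex(o) for o in spanned_offsets(0xFFFC, 8)])
    print([hex(o) for o in invalid_offsets(0xFFFC, 8, segment_limit=0xFFFD)])
```

For this example the operand spans FFFCH–FFFFH and 0000H–0003H, and offsets FFFEH and FFFFH exceed the segment limit.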
Table 3-8. Intel387 TM SX Math CoProcessor Exceptions
Exception: Invalid Operation
Cause: Operation on a signalling NaN, unsupported format, indeterminate form (0 × ∞, 0/0, (+∞) + (−∞), etc.), or stack overflow/underflow (SF is also set).
Default Action (if exception is masked): Result is a quiet NaN, integer indefinite, or BCD indefinite.

Exception: Denormalized Operand
Cause: At least one of the operands is denormalized, i.e., it has the smallest exponent but a nonzero significand.
Default Action (if exception is masked): Normal processing continues.

Exception: Zero Divisor
Cause: The divisor is zero while the dividend is a noninfinite, nonzero number.
Default Action (if exception is masked): Result is ∞.

Exception: Overflow
Cause: The result is too large in magnitude to fit in the specified format.
Default Action (if exception is masked): Result is largest finite value or ∞.

Exception: Underflow
Cause: The true result is nonzero but too small to be represented in the specified format, and, if underflow exception is masked, denormalization causes the loss of accuracy.
Default Action (if exception is masked): Result is denormalized or zero.

Exception: Inexact Result (Precision)
Cause: The true result is not exactly representable in the specified format (e.g. 1/3); the result is rounded according to the rounding mode.
Default Action (if exception is masked): Normal processing continues.
3.6 Initialization
After FNINIT or RESET, the control word contains
the value 037FH (all exceptions masked, precision
control 64 bits, rounding to nearest), the same values
as in an Intel287 after RESET. For compatibility with
the 8087 and Intel287, the bit that used to indicate
infinity control (bit 12) is set to zero; however, regardless of its setting, infinity is treated in the affine
sense. After FNINIT or RESET, the status word is
initialized as follows:
# All exceptions are set to zero.
# Stack TOP is zero, so that after the first push the
stack top will be register seven (111B).
# The condition code C3 –C0 is undefined.
# The B-bit is zero.
The tag word contains FFFFH (all stack locations
are empty).
The Intel386 Microprocessor and Intel387 Math CoProcessor initialization software must execute a
FNINIT instruction (i.e., FINIT without a preceding
WAIT) after RESET. The FNINIT is not strictly required for the Intel386 software, but Intel recommends its use to help ensure upward compatibility
with other processors. After a hardware RESET, the
ERROR# output is asserted to indicate that an Intel387 Math CoProcessor is present. To accomplish
this, the IE (Invalid Exception) and ES (Error Summary) bits of the status word are set, and the IM bit
(Invalid Exception Mask) in the control word is
cleared. After FNINIT, the status word and the control word have the same values as in an Intel287
Math CoProcessor after RESET.
3.7 Processing Modes
The Intel387 SX Math CoProcessor works the same
whether the CPU is executing in real-address
mode, protected mode, or virtual-8086 mode. All references to memory for numerics data or status information are performed by the CPU, and therefore
obey the memory-management and protection rules
of the CPU mode currently in effect. The Intel387 SX
Math CoProcessor merely operates on instructions
and values passed to it by the CPU and therefore is not sensitive to the processing mode of the
CPU.
In real-address mode and virtual-8086 mode, the
Intel387 SX Math CoProcessor is completely upward
compatible with software for the 8086/8087 and
80286/80287 real-address mode systems.
In protected mode, the Intel387 SX Math CoProcessor is completely upward compatible with software
for the 80286/80287 protected mode system.
The only difference of operation that may appear
when 8086/8087 programs are ported to protected mode (not using virtual-8086 mode) is in the
format of operands for the administrative instructions FLDENV, FSTENV, FRSTOR, and FSAVE.
3.8 Programming Support
Using the Intel387 SX Math CoProcessor requires
no special programming tools, because all new instructions and data types are directly supported by
the assembler and compilers for high-level languages. All Intel386 Microprocessor development
tools that support Intel387 Math CoProcessor programs can also be used to develop software for the
Intel386 SX Microprocessors and Intel387 SX Math
CoProcessors. All 8086/8088 development tools
that support the 8087 can also be used to develop
software for the CPU and Math CoProcessor in real-address mode or virtual-8086 mode. All 80286 development tools that support the Intel287 Math CoProcessor can also be used to develop software for
the Intel386 CPU and Intel387 Math CoProcessor.
4.0 HARDWARE SYSTEM INTERFACE
In the following description of hardware interface,
the # symbol at the end of a signal name indicates
that the active or asserted state occurs when the
signal is at a low voltage. When no # is present after
the signal name, the signal is asserted when at the
high voltage level.
4.1 Signal Description
In the following signal descriptions, the Intel387 SX
Math CoProcessor pins are grouped by function as
shown by Table 4-1. Table 4-1 lists every pin by its
identifier, gives a brief description and lists some of
its characteristics (Refer to Figure 1-1 and Table 1-1
for pin configuration).
All output signals can be tri-stated by driving STEN
inactive. The output buffers of the bi-directional data
pins D15–D0 are also tri-state; they only leave the
floating state during read cycles when the Math CoProcessor is selected.
4.1.1 Intel386 CPU CLOCK 2 (CPUCLK2)
This input uses the CLK2 signal of the CPU to time
the bus control logic. Several other Math CoProcessor signals are referenced to the rising edge of this
signal. When CKM = 1 (synchronous mode) this pin
also clocks the data interface and control unit and
the floating point unit of the Math CoProcessor. This
pin requires CMOS-level input. The signal on this pin
is divided by two to produce the internal clock signal
CLK.
4.1.2 Intel387 MATH COPROCESSOR CLOCK 2
(NUMCLK2)
When CKM = 0 (asynchronous mode), this pin provides the clock for the data interface and control unit
and the floating point unit of the Math CoProcessor.
In this case, the ratio of the frequency of NUMCLK2
to the frequency of CPUCLK2 must lie within the
range 10:16 to 14:10 and the maximum frequency
must not exceed the device specifications. When
CKM = 1 (synchronous mode), signals on this pin
are ignored: CPUCLK2 is used instead for the data
interface and control unit and the floating point unit.
This pin requires CMOS level input and should be
tied low if not used.
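The frequency-ratio constraint for asynchronous mode can be checked as follows. This is a minimal sketch with hypothetical names and example frequencies; it tests only the 10:16 to 14:10 ratio window and does not check the maximum frequency given in the device specifications.

```python
from fractions import Fraction

RATIO_MIN = Fraction(10, 16)   # lowest allowed NUMCLK2 : CPUCLK2
RATIO_MAX = Fraction(14, 10)   # highest allowed NUMCLK2 : CPUCLK2


def numclk2_ratio_ok(numclk2_hz, cpuclk2_hz):
    """True if NUMCLK2/CPUCLK2 lies within 10:16 .. 14:10 (inclusive)."""
    ratio = Fraction(numclk2_hz, cpuclk2_hz)
    return RATIO_MIN <= ratio <= RATIO_MAX


if __name__ == "__main__":
    # Example frequencies are illustrative only.
    print(numclk2_ratio_ok(40_000_000, 32_000_000))  # ratio 1.25
    print(numclk2_ratio_ok(50_000_000, 32_000_000))  # ratio 1.5625
```

Integer frequencies in hertz are used with exact fractions so that the boundary ratios compare without rounding error.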
Table 4-1. Pin Summary
Pin Name    Function                       Active State   Input/Output   Referenced To . . .

Execution Control
CPUCLK2     Microprocessor Clock2                         I
NUMCLK2     Math CoProcessor Clock2                       I
CKM         Math CoProcessor Clock Mode                   I
RESETIN     System Reset                   High           I              CPUCLK2

Math CoProcessor Handshake
PEREQ       Processor Request              High           O              CPUCLK2
BUSY#       Busy Status                    Low            O              CPUCLK2
ERROR#      Error Status                   Low            O              NUMCLK2

Bus Interface
D15–D0      Data Pins                                     I/O            CPUCLK2
W/R#        Write/Read Bus Cycle           High/Low       I              CPUCLK2
ADS#        Address Strobe                 Low            I              CPUCLK2
READY#      Bus Ready Input                Low            I              CPUCLK2
READYO#     Ready Output                   Low            O              CPUCLK2

Chip/Port Select
STEN        Status Enable                  High           I              CPUCLK2
NPS1#       Numerics Select #1             Low            I              CPUCLK2
NPS2        Numerics Select #2             High           I              CPUCLK2
CMD0#       Command                        Low            I              CPUCLK2

Power and Ground
VCC         System Power
VSS         System Ground
4.1.3 CLOCKING MODE (CKM)
This pin is a strapping option. When it is strapped to
VCC (HIGH), the Math CoProcessor operates in synchronous mode; when strapped to VSS (LOW), the
Math CoProcessor operates in asynchronous mode.
These modes relate to clocking of the internal data
interface and control unit and the floating point unit
only; the bus control logic always operates synchronously with respect to the CPU.
Synchronous mode requires the use of only one
clock, the CPU’s CLK2. Use of synchronous mode
eliminates one clock generator from the board
design and is recommended for all designs. Synchronous mode also allows the internal Power Management Unit to enable the idle and standby power
saving modes.
Asynchronous mode can provide higher performance of the floating point unit by running a faster
clock on NUMCLK2. (The CPU’s CLK2 must still be
connected to the CPUCLK2 input.) This allows the floating point unit to run up to 40% faster than in synchronous mode. Internal power management is disabled in asynchronous mode.
4.1.4 SYSTEM RESET (RESETIN)
A LOW to HIGH transition on this pin causes the
Math CoProcessor to terminate its present activity
and to enter a dormant state. RESETIN must remain
active (HIGH) for at least 40 CPUCLK2 (NUMCLK2 if
CKM = 0) periods.
The HIGH to LOW transitions of RESETIN must be
synchronous with CPUCLK2, so that the phase of
the internal clock of the bus control logic (which is
the CPUCLK2 divided by two) is the same as
the phase of the internal clock of the CPU. After
RESETIN goes LOW, at least 50 CPUCLK2
(NUMCLK2 if CKM = 0) periods must pass before
the first Math CoProcessor instruction is written into
the Math CoProcessor. This pin should be connected to the CPU RESET pin. Table 4-2 shows the
status of the output pins during the reset sequence.
After a reset, all output pins return to their inactive
state except for ERROR# which remains active (for
CPU recognition) until cleared.
Table 4-2. Output Pin Status during Reset
Pin Value        Pin Name
HIGH             READYO#, BUSY#
LOW              PEREQ, ERROR#
Tri-State OFF    D15–D0
4.1.5 PROCESSOR REQUEST (PEREQ)
When active, this pin signals to the CPU that the
Math CoProcessor is ready for data transfer to/from
its data FIFO. When all data is written to or read
from the data FIFO, PEREQ is deactivated. This signal always goes inactive before BUSY# goes inactive. This signal is referenced to CPUCLK2. It should
be connected to the CPU PEREQ input pin.
4.1.6 BUSY STATUS (BUSY#)
When active, this pin signals to the CPU that the
Math CoProcessor is currently executing an instruction. This signal is referenced to CPUCLK2. It should
be connected to the CPU BUSY# input pin.
4.1.7 ERROR STATUS (ERROR#)
This pin reflects the ES bit of the status register.
When active, it indicates that an unmasked exception has occurred. This signal can be changed to the
inactive state only by the following instructions (without a preceding WAIT): FNINIT, FNCLEX,
FNSTENV, FNSAVE, FLDCW, FLDENV, and
FRSTOR. ERROR# is driven active during RESET
to indicate to the CPU that the Math CoProcessor is
present. This pin is referenced to NUMCLK2 (or
CPUCLK2 if CKM = 1). It should be connected to
the ERROR# pin of the CPU.
4.1.8 DATA PINS (D15 – D0)
These bi-directional pins are used to transfer data
and opcodes between the CPU and Math CoProcessor. They are normally connected directly to the corresponding CPU data pins. HIGH state indicates a
value of one. D0 is the least significant data bit. Timings are referenced to the rising edge of CPUCLK2.
4.1.9 WRITE/READ BUS CYCLE (W/R#)
This signal indicates to the Math CoProcessor
whether the CPU bus cycle in progress is a read or a
write cycle. This pin should be connected directly to
the CPU’s W/R# pin. HIGH indicates a write cycle
to the Math CoProcessor; LOW a read cycle from
the Math CoProcessor. This input is ignored if any of
the signals STEN, NPS1#, or NPS2 are inactive.
Setup and hold times are referenced to CPUCLK2.
4.1.10 ADDRESS STROBE (ADS#)
This input, in conjunction with the READY# input,
indicates when the Math CoProcessor bus control
logic may sample W/R# and the chip select signals.
Setup and hold times are referenced to CPUCLK2.
This pin should be connected to the ADS# pin of
the CPU.
4.1.11 BUS READY INPUT (READY#)
This input indicates to the Math CoProcessor when
a CPU bus cycle is to be terminated. It is used by the
bus control logic to trace bus activities. Bus cycles
can be extended indefinitely until terminated by
READY#. This input should be connected to the
same signal that drives the CPU’s READY# input.
Setup and hold times are referenced to CPUCLK2.
4.1.12 READY OUTPUT (READYO#)
This pin is activated at such a time that write cycles
are terminated after two clocks (except FLDENV
and FRSTOR) and read cycles after three clocks. In
configurations where no extra wait states are required, this pin must directly or indirectly drive the
READY# input of the CPU. Refer to the section entitled ‘‘BUS OPERATION’’ for details. This pin is activated only during bus cycles that select the Math
CoProcessor. This signal is referenced to CPUCLK2.
(FLDENV and FRSTOR require data transfers larger
than the FIFO. Therefore, PEREQ is activated for
the duration of transferring 2 words of 32 bits and
then deactivated until the FIFO is ready to accept
two additional words. The length of the write cycles
of the last operand word in each transfer as well as
the first operand word transfer of the entire instruction is 3 clocks instead of 2 clocks. This is done to
give the Intel386 CPU enough time to sample
PEREQ and to notice that the Intel387 is not ready
for additional transfers.)
4.1.13 STATUS ENABLE (STEN)
This pin serves as a chip select for the Math CoProcessor. When inactive, this pin forces BUSY#,
PEREQ, ERROR# and READYO# outputs into a
floating state. D15–D0 are normally floating and will
leave the floating state only if STEN is active and
additional conditions are met (read cycle). STEN
also causes the chip to recognize its other chip select inputs. STEN makes it easier to do on-board
testing (using the overdrive method) of other chips in
systems containing the Math CoProcessor. STEN
should be pulled up with a resistor so that it can be
pulled down when testing. In boards that do not use
on-board testing, STEN should be connected to VCC.
Setup and hold times are relative to CPUCLK2. Note
that STEN must maintain the same setup and hold
times as NPS1#, NPS2, and CMD0# (i.e., if STEN
changes state during a Math CoProcessor bus cycle,
it must change state during the same CLK period as
the NPS1#, NPS2, and CMD0# signals).
4.1.14 MATH COPROCESSOR SELECT 1 (NPS1#)
When active (along with STEN and NPS2) in the first
period of a CPU bus cycle, this signal indicates that
the purpose of the bus cycle is to communicate with
the Math CoProcessor. This pin should be connected directly to the M/IO# pin of the CPU, so that the
Math CoProcessor is selected only when the CPU
performs I/O cycles. Setup and hold times are referenced to the rising edge of CPUCLK2.
4.1.15 MATH COPROCESSOR SELECT 2 (NPS2)
When active (along with STEN and NPS1#) in the
first period of a CPU bus cycle, this signal indicates
that the purpose of the bus cycle is to communicate
with the Math CoProcessor. This pin should be connected directly to the A23 pin of the CPU, so that the
Math CoProcessor is selected only when the CPU
issues one of the I/O addresses reserved for the
Math CoProcessor (8000F8h, 8000FCh, or 8000FEh
which is treated as 8000FCh by the Math CoProcessor). Setup and hold times are referenced to the rising edge of CPUCLK2.
4.1.16 COMMAND (CMD0#)
During a write cycle, this signal indicates whether an
opcode (CMD0# active low) or data (CMD0# inactive high) is being sent to the Math CoProcessor.
During a read cycle, it indicates whether the control
or status register (CMD0# active) or a data register
(CMD0# inactive) is being read. CMD0# should be connected directly to the A2 output of the CPU. Setup and
hold times are referenced to the rising edge of
CPUCLK2 at the end of PH2.
4.1.17 SYSTEM POWER (VCC)
System power provides the 5V DC supply input.
All VCC pins should be tied together on the circuit
board and local decoupling capacitors should be
used between VCC and VSS.
4.1.18 SYSTEM GROUND (VSS)
System ground provides the 0V connection from
which all inputs and outputs are measured. All VSS
pins should be tied together on the circuit board and
local decoupling capacitors should be used between
VCC and VSS.
4.2 System Configuration
The Intel387 SX Math CoProcessor is designed to
interface with the Intel386 SX Microprocessor as
shown by Figure 4-1. A dedicated communication
protocol makes possible high-speed transfer of opcodes and operands between the CPU and Math
CoProcessor. The Intel387 SX Math CoProcessor is
designed so that no additional components are required for interface with the CPU. Most control pins
of the Math CoProcessor are connected directly to
pins of the CPU.
The interface between the Math CoProcessor and
the CPU has these characteristics:
# The Math CoProcessor shares the local bus of
the Intel386 SX Microprocessor.
# The CPU and Math CoProcessor share the same
reset signals. They may also share the same
clock input; however, for greatest performance,
an external oscillator may be needed.
# The corresponding BUSY#, ERROR#, and
PEREQ pins are connected together.
# The Math CoProcessor NPS1# and NPS2 inputs
are connected to the latched CPU M/IO# and
A23 outputs respectively. For Math CoProcessor
cycles, M/IO# is always LOW and A23 always
HIGH.
# The Math CoProcessor input CMD0# is connected
to the latched A2 output. The Intel386 SX Microprocessor generates address 8000F8H when
writing a command and address 8000FCH or
8000FEH (treated as 8000FCH by the Intel387
SX Math CoProcessor) when writing or reading
data. It does not generate any other addresses
during Math CoProcessor bus cycles.
Figure 4-1. Intel386 TM SX CPU and Intel387 TM SX Math CoProcessor System Configuration
4.3 Math CoProcessor Architecture
As shown in Figure 2-1 Block Diagram, the Intel387
SX Math CoProcessor is internally divided into four
sections: the Bus Control Logic (BCL), the Data Interface and Control Unit, the Floating Point Unit
(FPU), and the Power Management Unit (PMU). The
Bus Control Logic is responsible for the CPU bus
tracking and interface. The BCL is the only unit in
the Math CoProcessor that must run synchronously
with the CPU; the rest of the Math CoProcessor can
run asynchronously with respect to the CPU. The
Data Interface and Control Unit is responsible for the
data flow to and from the FPU and the control registers, for receiving the instructions, decoding them,
sequencing the microinstructions, and for handling
some of the administrative instructions. The Floating
Point Unit (with the support of the control unit which
contains the sequencer and other support units) executes the mathematical instructions. The Power
Management Unit is new to the Intel387 family. It is responsible for shutting down idle sections of the device to
save power.
4.3.1 BUS CONTROL LOGIC
The BCL communicates solely with the CPU using
I/O bus cycles. The BCL appears to the CPU as a
special peripheral device. It is special in two respects: the CPU initiates I/O automatically when it
encounters ESC instructions, and the CPU uses reserved I/O addresses to communicate with the BCL.
The BCL does not communicate directly with memory. The CPU performs all memory access, transferring input operands from the memory to the Math
CoProcessor and transferring outputs from the Math
CoProcessor to memory.
4.3.2 DATA INTERFACE AND CONTROL UNIT
The data interface and control unit latches the data
and, subject to BCL control, directs the data to the
FIFO or the instruction decoder. The instruction decoder decodes the ESC instructions sent to it by the
CPU and generates controls that direct the data flow
in the FIFO. It also triggers the microinstruction sequencer that controls execution of each instruction.
If the ESC instruction is FINIT, FCLEX, FSTSW,
FSTSW AX, FSTCW, FSETPM, or FRSTPM, the
control unit executes it independently of the FPU
and the sequencer. The data interface and control
unit is the unit that generates the BUSY#, PEREQ,
and ERROR# signals that synchronize the Math
CoProcessor activities with the CPU.
4.3.3 FLOATING POINT UNIT
The FPU executes all instructions that involve the
register stack, including arithmetic, logical, transcendental, constant, and data transfer instructions. The
data path in the FPU is 84 bits wide (68 significant
bits, 15 exponent bits, and a sign bit) which allows
internal operand transfers to be performed at very
high speeds.
4.3.4 POWER MANAGEMENT UNIT
The Power Management Unit (PMU) controls all internal power savings circuits. When the Math CoProcessor is not executing an instruction, the PMU
disables the internal clock to the FPU, Control Unit,
and Data Interface within three clocks. The Bus
Control Logic remains enabled to accept the next
instruction. Upon decode of a valid Math CoProcessor bus cycle, the PMU enables the internal clock to
all circuits. No loss in performance occurs.
4.4 Bus Cycles
All bus cycles are initiated by the CPU. The pins STEN, NPS1#, NPS2, CMD0#, and W/R# identify bus cycles for the Math CoProcessor. Table 4-3 defines the types of Math CoProcessor bus cycles.
Table 4-3. Bus Cycle Definition

STEN  NPS1#  NPS2  CMD0#  W/R#  Bus Cycle Type
 0     X      X     X      X    Math CoProcessor not selected and all outputs in floating state
 1     1      X     X      X    Math CoProcessor not selected
 1     X      0     X      X    Math CoProcessor not selected
 1     0      1     0      0    CW or SW read from Math CoProcessor
 1     0      1     0      1    Opcode write to Math CoProcessor
 1     0      1     1      0    Data read from Math CoProcessor
 1     0      1     1      1    Data write to Math CoProcessor
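The selection rules of Table 4-3 can be sketched in software, for example in a simulation or bus-monitor script. The Python fragment below is a minimal illustration only, not part of the device specification. The function name decode_bus_cycle and its argument names are hypothetical; each argument is the logic level (0 or 1) sampled on the corresponding pin.

```python
def decode_bus_cycle(sten: int, nps1_n: int, nps2: int, cmd0_n: int, wr_n: int) -> str:
    """Classify a bus cycle from pin levels, following Table 4-3.

    Argument names are illustrative; the "_n" suffix marks active-low pins.
    """
    if sten == 0:
        # STEN low: not selected, and all outputs float.
        return "not selected, outputs floating"
    if nps1_n == 1 or nps2 == 0:
        # Either select pin at its inactive level: not selected.
        return "not selected"
    # Selected: CMD0# and W/R# pick one of the four transfer types.
    cycle_types = {
        (0, 0): "CW or SW read",
        (0, 1): "opcode write",
        (1, 0): "data read",
        (1, 1): "data write",
    }
    return cycle_types[(cmd0_n, wr_n)]


if __name__ == "__main__":
    print(decode_bus_cycle(sten=1, nps1_n=0, nps2=1, cmd0_n=0, wr_n=1))
```

The two "not selected" rows are checked before the transfer type so that the don't-care (X) entries of the table never reach the lookup.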
4.4.1 INTEL387 SX MATH COPROCESSOR ADDRESSING
The NPS1#, NPS2, and CMD0# signals allow the Math CoProcessor to identify which bus cycles are intended for the Math CoProcessor. The Math CoProcessor responds to I/O cycles when the I/O address is 8000F8h, 8000FCh, or 8000FEh (treated as 8000FCh). The Math CoProcessor responds to I/O cycles when bit 23 of the I/O address is set. In other words, the Math CoProcessor acts as an I/O device in a reserved I/O address space.
Because A23 is used to select the Intel387 SX Math CoProcessor for data transfers, it is not possible for a program running on the CPU to address the Math CoProcessor with an I/O instruction. Only ESC instructions cause the CPU to communicate with the Math CoProcessor.
4.4.2 CPU/MATH COPROCESSOR SYNCHRONIZATION
The pins BUSY#, PEREQ, and ERROR# are used for various aspects of synchronization between the CPU and the Math CoProcessor.
BUSY# is used to synchronize instruction transfer from the CPU to the Math CoProcessor. When the Math CoProcessor recognizes an ESC instruction it asserts BUSY#. For most ESC instructions, the CPU waits for the Math CoProcessor to deassert BUSY# before sending the new opcode.
The Math CoProcessor uses the PEREQ pin of the CPU to signal that the Math CoProcessor is ready for data transfer to or from its data FIFO. The Math CoProcessor does not directly access memory; rather, the CPU provides memory access services for the Math CoProcessor. (For this reason, memory access on behalf of the Math CoProcessor always obeys the protection rules applicable to the current CPU mode.) Once the CPU initiates a Math CoProcessor instruction that has operands, the CPU waits for PEREQ signals that indicate when the Math CoProcessor is ready for operand transfer. Once all operands have been transferred (or if the instruction has no operands) the CPU continues program execution while the Math CoProcessor executes the ESC instruction.
In 8086/8087 systems, WAIT instructions may be required to achieve synchronization of both commands and operands. In the Intel386 Microprocessor and Intel387 Math CoProcessor systems, however, WAIT instructions are required only for operand synchronization; namely, after Math CoProcessor stores to memory (except FSTSW and FSTCW) or loads from memory. (In 80286/80287 systems, WAIT is required before FLDENV and FRSTOR.) Used this way, WAIT ensures that the value has already been written or read by the Math CoProcessor before the CPU reads or changes the value.
Once it has started to execute a numerics instruction and has transferred the operands from the CPU, the Math CoProcessor can process the instruction in parallel with and independent of the host CPU. When the Math CoProcessor detects an exception, it asserts the ERROR# signal, which causes a CPU interrupt.
4.4.3 SYNCHRONOUS/ASYNCHRONOUS MODES
The internal logic of the Math CoProcessor can operate either directly from the CPU clock (synchronous mode) or from a separate clock (asynchronous
mode). The two configurations are distinguished by
the CKM pin. In either case, the bus control logic
(BCL) of the Math CoProcessor is synchronized with
the CPU clock. Use of asynchronous mode allows
the BCL and the FPU section of the Math CoProcessor to run at different speeds. In this case, the ratio
of the frequency of NUMCLK2 to the frequency of
CPUCLK2 must lie within the range 10:16 to 14:10.
Use of synchronous mode eliminates one clock generator from the board design. The internal Power
Management Unit of the Intel387 SX Math CoProcessor is disabled in asynchronous mode.
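The asynchronous-mode frequency constraint can be checked with a few lines of code when choosing clock generators. The following Python sketch is illustrative only; the function name numclk2_ratio_ok is hypothetical, and treating both ends of the 10:16 to 14:10 range as allowed is an assumption, since the text says only that the ratio "must lie within the range".

```python
from fractions import Fraction

# Limits on NUMCLK2 : CPUCLK2 quoted in Section 4.4.3.
RATIO_MIN = Fraction(10, 16)
RATIO_MAX = Fraction(14, 10)


def numclk2_ratio_ok(numclk2_hz: int, cpuclk2_hz: int) -> bool:
    """Return True if the NUMCLK2/CPUCLK2 frequency ratio is within range.

    Frequencies are integers in Hz so the comparison is exact.
    Endpoints are treated as inclusive (an assumption).
    """
    ratio = Fraction(numclk2_hz, cpuclk2_hz)
    return RATIO_MIN <= ratio <= RATIO_MAX


if __name__ == "__main__":
    # Example: 50 MHz NUMCLK2 against a 40 MHz CPUCLK2 (ratio 1.25).
    print(numclk2_ratio_ok(50_000_000, 40_000_000))
```

Exact rational arithmetic is used so that a ratio sitting exactly on a limit is not rejected by floating-point rounding.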
4.4.4 AUTOMATIC BUS CYCLE TERMINATION
In configurations where no extra wait states are required, READYO# can drive the CPU's READY# input and the Math CoProcessor's READY# input. If wait states are required, this pin should be connected to the logic that ORs all READY outputs from peripheral devices on the CPU bus. READYO# is asserted by the Math CoProcessor only during I/O cycles that select the Math CoProcessor. Refer to Section 5.0 Bus Operation for details.
5.0 BUS OPERATION
With respect to bus interface, the Intel387 SX Math CoProcessor is fully synchronous with the CPU. Both operate at the same rate because each generates its internal CLK signal by dividing CPUCLK2 by two. Furthermore, both internal CLK signals are in phase, because they are synchronized by the same RESETIN signal.
A bus cycle for the Math CoProcessor starts when the CPU activates ADS# and drives new values on the address and cycle definition lines (W/R#, M/IO#, etc.). The Math CoProcessor examines the address and cycle definition lines in the same CLK period during which ADS# is activated. This CLK period is considered the first CLK of the bus cycle.
During this first CLK period, the Math CoProcessor also examines the W/R# input signal to determine whether the cycle is a read or a write cycle and examines the CMD0# input to determine whether an opcode, operand, or control/status register transfer is to occur.
The Intel387 SX Math CoProcessor supports both pipelined (i.e., overlapped) and non-pipelined bus cycles. A non-pipelined cycle is one for which the CPU asserts ADS# when no other bus cycle is in progress. A pipelined bus cycle is one for which the CPU asserts ADS# and provides valid next address and control signals before the prior Math CoProcessor cycle terminates. The CPU may do this as early as the second CLK period after asserting ADS# for the prior cycle. Pipelining increases the availability of the bus by at least one CLK period. The Intel387 SX Math CoProcessor supports pipelined bus cycles in order to optimize address pipelining by the CPU for memory cycles.
Bus operation is described in terms of an abstract
state machine. Figure 5-1 illustrates the states and
state transitions for Math CoProcessor bus cycles:
• TI is the idle state. This is the state of the bus logic after RESET, the state to which bus logic returns after every non-pipelined bus cycle, and the state to which bus logic returns after a series of pipelined cycles.
• TRS is the READY# sensitive state. Different types of bus cycles may require a minimum of one or two successive TRS states. The bus logic remains in TRS state until READY# is sensed, at which point the bus cycle terminates. Any number of wait states may be implemented by delaying READY#, thereby causing additional successive TRS states.
• TP is the first state for every pipelined bus cycle. This state is not used by non-pipelined cycles.
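The three bus states and the transitions described in Sections 5.0 through 5.3 can be modeled as a small state machine. The Python sketch below is a simplified illustration under stated assumptions, not a cycle-accurate model: the names BusState and next_state are hypothetical, ADS# and READY# are passed as booleans that are True when the signal is active (low), and the minimum number of TRS states per cycle type is not enforced.

```python
from enum import Enum


class BusState(Enum):
    TI = "TI"    # idle
    TRS = "TRS"  # READY#-sensitive state
    TP = "TP"    # first state of a pipelined cycle


def next_state(state: BusState, ads_active: bool, ready_active: bool) -> BusState:
    """Return the bus state for the next CLK period.

    ads_active / ready_active are True when ADS# / READY# is sampled active.
    """
    if state is BusState.TI:
        # A new cycle starts when ADS# is activated; otherwise stay idle.
        return BusState.TRS if ads_active else BusState.TI
    if state is BusState.TP:
        # One clock period after TP, the logic always returns to TRS.
        return BusState.TRS
    # state is TRS
    if not ready_active:
        return BusState.TRS  # wait state: remain in TRS until READY#
    # READY# active: pipelined if ADS# is also active, else back to idle.
    return BusState.TP if ads_active else BusState.TI


if __name__ == "__main__":
    state = BusState.TI
    # (ADS#, READY#) samples: start, one wait state, then a pipelined hand-off.
    for ads, ready in [(True, False), (False, False), (True, True), (False, False)]:
        state = next_state(state, ads, ready)
        print(state.value)
```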
Note that the bus logic tracks bus state regardless of the values on the chip/port select pins. The READYO# output of the Math CoProcessor indicates when a Math CoProcessor bus cycle may be terminated if no extra wait states are required. For all write cycles (except those for the instructions FLDENV and FRSTOR), READYO# is always asserted during the first TRS state, regardless of the number of wait states. For all read cycles (and write cycles for FLDENV and FRSTOR), READYO# is always asserted in the second TRS state, regardless of the number of wait states. These rules apply to both pipelined and non-pipelined cycles. Systems designers may use READYO# in one of the following ways:
1. Connect it (directly or through logic that ORs READY# signals from other devices) to the READY# inputs of the CPU and Math CoProcessor.
2. Use it as one input to a wait-state generator.
The following sections illustrate different types of
Intel387 SX Math CoProcessor bus cycles. Because
different instructions have different amounts of overhead before, between, and after operand transfer
cycles, it is not possible to represent in a few diagrams all of the combinations of successive operand
transfer cycles. The following bus cycle diagrams
show memory cycles between Math CoProcessor
operand transfer cycles. Note however that, during
FRSTOR, some consecutive accesses to the Math
CoProcessor do not have intervening memory accesses. For the timing relationship between operand
transfer cycles and opcode write or other overhead
activities, see Figure 7-7 "Other Parameters".
5.1 Non-Pipelined Bus Cycles
Figure 5-2 illustrates bus activity for consecutive
non-pipelined bus cycles.
At the second clock of the bus cycle, the Math CoProcessor enters the TRS state. During this state, it samples the READY# input and stays in this state as long as READY# is inactive.
5.1.1 WRITE CYCLE
In write cycles, the Math CoProcessor drives the READYO# signal for one CLK period during the second CLK period of the cycle (i.e., the first TRS state); therefore, the fastest write cycle takes two CLK periods (see cycle 2 of Figure 5-2). For the instructions FLDENV and FRSTOR, however, the Math CoProcessor forces a wait state by delaying the activation of READYO# to the second TRS state (not shown in Figure 5-2).
The Math CoProcessor samples the D15–D0 inputs into data latches at the falling edge of CLK as long as it stays in TRS state.
When READY# is asserted, the Math CoProcessor returns to the idle state. Simultaneously with the Math CoProcessor entering the idle state, the CPU may assert ADS# again, signaling the beginning of yet another cycle.

Figure 5-1. Bus State Diagram

Figure 5-2. Non-Pipelined Read and Write Cycles
Cycles 1 & 2 represent part of the operand transfer cycle for instructions involving either 4-byte or 8-byte operand loads.
Cycles 3 & 4 represent part of the operand transfer cycle for a store operation.
*Cycles 1 & 2 could repeat here or TI states for various non-operand transfer cycles and overhead.
5.1.2 READ CYCLE
At the rising edge of CLK in the second CLK period
of the cycle (i.e., the first TRS state), the Math CoProcessor starts to drive the D15–D0 outputs and
continues to drive them as long as it stays in TRS
state.
At least one wait state must be inserted to ensure that the CPU latches the correct data. Because the Math CoProcessor starts driving the data bus only at the rising edge of CLK in the second clock period of the bus cycle, not enough time is left for the data signals to propagate and be latched by the CPU before the next falling edge of CLK. Therefore, the Math CoProcessor does not drive the READYO# signal until the third CLK period of the cycle. Thus, if the READYO# output drives the CPU's READY# input, one wait state is automatically inserted.
Because one wait state is required for Math CoProcessor reads, the minimum length of a Math CoProcessor read cycle is three CLK periods, as cycle 3 of Figure 5-2 shows.
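The minimum cycle lengths stated for non-pipelined cycles can be summarized in a short helper. The Python sketch below is illustrative only. The function name min_cycle_clks and its parameters are hypothetical. The three-CLK figure for FLDENV and FRSTOR writes is an inference from READYO# being delayed to the second TRS state for those instructions; it is not a value stated directly in the text.

```python
def min_cycle_clks(is_read: bool,
                   is_fldenv_or_frstor: bool = False,
                   extra_wait_states: int = 0) -> int:
    """Length of a non-pipelined Math CoProcessor bus cycle in CLK periods.

    Ordinary writes: READYO# in the first TRS state, so 2 CLKs minimum.
    Reads: READYO# in the second TRS state, so 3 CLKs minimum.
    FLDENV/FRSTOR writes are assumed to behave like reads (inferred).
    """
    if extra_wait_states < 0:
        raise ValueError("extra_wait_states must be non-negative")
    base = 3 if (is_read or is_fldenv_or_frstor) else 2
    # Each externally added wait state contributes one more TRS state.
    return base + extra_wait_states


if __name__ == "__main__":
    print(min_cycle_clks(is_read=False))  # ordinary write
    print(min_cycle_clks(is_read=True))   # read
```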
When READY# is asserted, the Math CoProcessor returns to the idle state. Simultaneously with the Math CoProcessor's entering the idle state, the CPU may assert ADS# again, signaling the beginning of yet another cycle. The transition from TRS state to idle state causes the Math CoProcessor to put the D15–D0 outputs into the floating state, allowing another device to drive the data bus.
5.2 Pipelined Bus Cycles
Because all the activities of the Math CoProcessor bus interface occur either during the TRS state or during the transitions to or from that state, the only difference between a pipelined and a non-pipelined cycle is the manner of changing from one state to another. The exact activities during each state are detailed in the previous section "Non-pipelined Bus Cycles".
When the CPU asserts ADS# before the end of a bus cycle, both ADS# and READY# are active during a TRS state. This condition causes the Math CoProcessor to change to a different state named TP. One clock period after a TP state, the Math CoProcessor always returns to the TRS state. In consecutive pipelined cycles, the Math CoProcessor bus logic uses only the TRS and TP states.
Figure 5-3 shows the fastest transitions into and out of the pipelined bus cycles. Cycle 1 in the figure represents a non-pipelined cycle. (Non-pipelined writes are always followed by another non-pipelined cycle, because READY# is asserted before the earliest possible assertion of ADS# for the next cycle.)
Figure 5-4 shows pipelined write and read cycles with one additional TRS state beyond the minimum required. Delaying the assertion of READY# requires external logic.
5.3 Mixed Bus Cycles
When the Math CoProcessor bus logic is in the TRS state, it distinguishes between non-pipelined and pipelined cycles according to the behavior of ADS# and READY#. In a non-pipelined cycle, only READY# is activated, and the transition is from the TRS state to the idle state. In a pipelined cycle, both READY# and ADS# are active, and the transition is first from TRS state to TP state, then, after one clock period, back to TRS state.

Figure 5-3. Fastest Transitions to and from Pipelined Cycles
Cycle 1 – Cycle 4 represent the operand transfer cycle for an instruction involving a transfer of two 32-bit loads in total. The opcode write cycles and other overhead are not shown.
Note that the next cycle will be a pipelined cycle if both READY# and ADS# are sampled active at the end of a TRS state of the current cycle.

Figure 5-4. Pipelined Cycles with Wait States
NOTE:
1. Cycles between operand write to the Math CoProcessor and storing result.

5.4 BUSY# and PEREQ Timing Relationship
Figure 5-5 shows the activation of BUSY# at the beginning of instruction execution and its deactivation upon completion of the instruction. PEREQ is activated within this interval. If ERROR# is ever asserted, it would be asserted at least six CPUCLK2 periods after the deactivation of PEREQ and would be deasserted at least six CPUCLK2 periods before the deactivation of BUSY#.

Figure 5-5. STEN, BUSY#, and PEREQ Timing Relationships
NOTES:
1. Instruction dependent.
2. PEREQ is an asynchronous input to the Intel386 TM Microprocessor; it may not be asserted (instruction dependent).
3. More operand transfers.
4. Memory read (operand) cycle is not shown.

6.0 PACKAGE SPECIFICATIONS
6.1 Mechanical Specifications
The Intel387 SX Math CoProcessor is packaged in a 68-pin PLCC package. Detailed mechanical specifications can be found in the Intel Packaging Specification, Order Number 231369.
6.2 Thermal Specifications
The Intel387 SX Math CoProcessor is specified for operation when the case temperature is within the range of 0°C to 100°C. The case temperature (TC) may be measured in any environment to determine whether the Intel387 SX Math CoProcessor is within the specified operating range. The case temperature should be measured at the center of the top surface.
The ambient temperature (TA) is guaranteed as long as TC is not violated. The ambient temperature can be calculated from θJC (thermal resistance from the transistor junction to the case) and θJA (thermal resistance from junction to ambient) using the following equations:
Junction Temperature   TJ = TC + P * θJC
Ambient Temperature    TA = TJ − P * θJA
Case Temperature       TC = TA + P * (θJA − θJC)
Values for θJA and θJC are given in Table 6-1 for the 68-pin PLCC package. θJA is given at various airflows. Table 6-2 shows the maximum TA allowable without exceeding TC at various airflows. Note that TA can be improved further by attaching a heat sink to the package. P is calculated by using the maximum hot ICC and maximum VCC.
Table 6-1. Thermal Resistances (°C/Watt) θJC and θJA

                      θJA versus Airflow - ft/min (m/sec)
Package       θJC   0 (0)   200 (1.01)   400 (2.03)   600 (3.04)   800 (4.06)   1000 (5.07)
68-Pin PLCC    8     30        25           20          15.5          13            12
Table 6-2. Maximum TA at Various Airflows

                      TA (°C) versus Airflow - ft/min (m/sec)
Package       0 (0)   200 (1.01)   400 (2.03)   600 (3.04)   800 (4.06)   1000 (5.07)
68-Pin PLCC   84.9      88.3         91.8         94.8         96.6          97.2

Maximum TA is calculated at maximum VCC and maximum ICC.
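The thermal relationship between case and ambient temperature can be checked numerically against the two tables. The Python sketch below is a worked example, not vendor-supplied material. The names are hypothetical, and the power value P = 0.6875 W (125 mA at 5.5 V) is an assumption chosen because it reproduces the entries of Table 6-2 to within rounding; the data sheet does not state which ICC value was used.

```python
# Thermal resistances from Table 6-1 (degrees C per watt).
THETA_JC = 8.0
THETA_JA = {0: 30.0, 200: 25.0, 400: 20.0, 600: 15.5, 800: 13.0, 1000: 12.0}

T_CASE_MAX_C = 100.0       # upper limit of the specified case temperature
ASSUMED_POWER_W = 0.6875   # assumption: 0.125 A * 5.5 V, inferred from Table 6-2


def max_ambient_c(airflow_ft_min: int, power_w: float = ASSUMED_POWER_W) -> float:
    """Maximum ambient temperature for a given airflow.

    Rearranges TC = TA + P * (thetaJA - thetaJC) to solve for TA at TC max.
    """
    return T_CASE_MAX_C - power_w * (THETA_JA[airflow_ft_min] - THETA_JC)


if __name__ == "__main__":
    for airflow in sorted(THETA_JA):
        print(f"{airflow:5d} ft/min: TA max = {max_ambient_c(airflow):.1f} C")
```

A design with a different worst-case power can pass its own value as power_w to obtain the corresponding ambient limit.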
7.0 ELECTRICAL CHARACTERISTICS
The following specifications represent the targets of the design effort. They are subject to change without notice. Contact your Intel representative to get the most up-to-date values.
7.1 Absolute Maximum Ratings*
Case Temperature TC Under Bias ............... 0°C to +100°C
Storage Temperature .......................... −65°C to +150°C
Voltage on Any Pin with Respect to Ground .... −0.5 to VCC + 0.5
Power Dissipation ............................ 0.8W
NOTICE: This is a production data sheet. The specifications are subject to change without notice.
*WARNING: Stressing the device beyond the "Absolute Maximum Ratings" may cause permanent damage. These are stress ratings only. Operation beyond the "Operating Conditions" is not recommended and extended exposure beyond the "Operating Conditions" may affect device reliability.
7.2 D.C. Characteristics
Table 7-1. D.C. Specifications TC = 0°C to +100°C, VCC = 5V ±10%

Symbol  Parameter                    Min         Max         Units  Test Conditions
VIL     Input LO Voltage             −0.3        +0.8        V      (Note 1)
VIH     Input HI Voltage             2.0         VCC + 0.3   V      (Note 1)
VCL     CPUCLK2 and NUMCLK2
        Input LO Voltage             −0.3        +0.8        V
VCH     CPUCLK2 and NUMCLK2
        Input HI Voltage             VCC − 0.8   VCC + 0.8   V
VOL     Output LO Voltage                        0.45        V      (Note 2)
VOH     Output HI Voltage            2.4                     V      (Note 3)
VOH     Output HI Voltage            VCC − 0.8               V      (Note 4)
ICC     Power Supply Current
        Dynamic Mode
          Freq. = 33 MHz (5)                     150         mA     ICC typ. = 135 mA
          Freq. = 25 MHz (5)                     150         mA     ICC typ. = 130 mA
          Freq. = 20 MHz (5)                     125         mA     ICC typ. = 110 mA
          Freq. = 16 MHz (5)                     100         mA     ICC typ. = 90 mA
          Freq. = 1 MHz (5)                      20          mA     ICC typ. = 5 mA
        Idle Mode (6)                            7           mA     ICC typ. = 4 mA
ILI     Input Leakage Current                    ±15         µA     0V ≤ VIN ≤ VCC
ILO     I/O Leakage Current                      ±15         µA     0.45V ≤ VO ≤ VCC
CIN     Input Capacitance                        10          pF     fc = 1 MHz
CO      I/O Capacitance                          12          pF     fc = 1 MHz
CCLK    Clock Capacitance                        20          pF     fc = 1 MHz
NOTES:
1. This parameter is for all inputs, excluding the clock inputs.
2. This parameter is measured at IOL as follows:
Data = 4.0 mA
READYO#, ERROR#, BUSY#, PEREQ = 25 mA
3. This parameter is measured at IOH as follows:
Data = 1.0 mA
READYO#, ERROR#, BUSY#, PEREQ = 0.6 mA
4. This parameter is measured at IOH as follows:
Data = 0.2 mA
READYO#, ERROR#, BUSY#, PEREQ = 0.12 mA
5. Synchronous Clock Mode (CKM = 1). ICC is measured at steady state, maximum capacitive loading on the outputs, and worst-case D.C. level at the inputs.
6. Intel387 SX Math CoProcessor Internal Idle Mode. Synchronous clock mode, clock and control inputs are active but the Math CoProcessor is not executing an instruction. Outputs driving CMOS inputs.
7.3 A.C. Characteristics
Table 7-2a. Timing Requirements of the Bus Interface Unit
TC = 0°C to +100°C, VCC = 5V ±10% (All measurements made at 1.5V unless otherwise specified)
Pin
Symbol
Parameter
CPUCLK2
CPUCLK2
CPUCLK2
CPUCLK2
CPUCLK2
CPUCLK2
CPUCLK2
t1
t2a
t2b
t3a
t3b
t4
t5
READYO#
PEREQ
BUSY#
ERROR#
16 MHz –
25 MHz
33 MHz
Test
Conditions
Refer to
Figure
7.2
4
4
2.0V
2.0V
VCC b 0.8V
2.0V
0.8V
From VCC b 0.8V to 0.8V
From 0.8V to VCC b 0.8V
4
4
4
4
17
21
21
23
CL
CL
CL
CL
50 pF
50 pF
50 pF
50 pF
7.3
37
CL e 50 pF
7.4
24
0
8
8
6
40
40
40
40
1
1
1
1
30
30
30
30
Min
(ns)
Max
(ns)
Min
(ns)
Max
(ns)
Period
High Time
High Time
Low Time
Low Time
Fall Time
Rise Time
20
6
3
6
4
DC
15
6.25
4.5
6.25
4.5
DC
t7a
t7b
t7c
t7d
Out Delay
Out Delay
Out Delay
Out Delay
4
4
4
4
25
23
23
23
D15–D0
D15–D0
D15–D0
D15–D0
t8
t10
t11
t12*
Out Delay
Setup Time
Hold Time
Float Time
1
11
11
6
45
READYO#
PEREQ
BUSY#
ERROR#
t13a*
t13b*
t13c*
t13d*
Float Time
Float Time
Float Time
Float Time
1
1
1
1
ADS#
ADS#
W/R#
W/R#
t14a
t15a
t14b
t15b
Setup Time
Hold Time
Setup Time
Hold Time
15
4
15
4
13
4
13
4
7.4
READY#
READY#
CMD0#
CMD0#
NPS1#, NPS2
NPS1#, NPS2
STEN
STEN
t16a
t17a
t16b
t17b
t16c
t17c
t16d
t17d
Setup Time
Hold Time
Setup Time
Hold Time
Setup Time
Hold Time
Setup Time
Hold Time
9
4
16
2
16
2
15
2
7
4
13
2
13
2
13
2
7.4
RESETIN
RESETIN
t18
t19
Setup Time
Hold Time
8
3
5
2
7.5
7
7
e
e
e
e
19
7.6
NOTE:
*Float condition occurs when maximum output current becomes less than ILO in magnitude. Float delay is not tested.
Table 7-2b. Timing Requirements of the Execution Unit (Asynchronous Mode CKM = 0)
Pin
Symbol
Parameter
NUMCLK2
NUMCLK2
NUMCLK2
NUMCLK2
NUMCLK2
NUMCLK2
NUMCLK2
t1
t2a
t2b
t3a
t3b
t4
t5
Period
High Time
High Time
Low Time
Low Time
Fall Time
Rise Time
NUMCLK2/
CPUCLK2
16 MHz –
25 MHz
33 MHz
Min
(ns)
Max
(ns)
Min
(ns)
Max
(ns)
20
6
3
6
4
500
15
6.25
4.5
6.25
4.5
500
7
7
Ratio
10/16
14/10
6
6
10/16
Test
Conditions
Refer to
Figure
2.0V
2.0V
VCC b 0.8V
2.0V
0.8V
From VCC b 0.8V to 0.8V
From 0.8V to VCC b 0.8V
7.2
14/10
NOTE:
If not used (CKM = 1) tie NUMCLK2 low.
Table 7-2c. Other A.C. Parameters

Pin             Symbol  Parameter                                Min  Max  Units
RESETIN         t30     Duration                                 40        NUMCLK2
RESETIN         t31     RESETIN Inactive to 1st Opcode Write     50        NUMCLK2
BUSY#           t32     Duration                                 6         CPUCLK2
BUSY#, ERROR#   t33     ERROR# (In)Active to BUSY# Inactive      6         CPUCLK2
PEREQ, ERROR#   t34     PEREQ Inactive to ERROR# Active          6         CPUCLK2
READY#, BUSY#   t35     READY# Active to BUSY# Active            0    4    CPUCLK2
READY#          t36     Minimum Time from Opcode Write to
                        Opcode/Operand Write                     4         CPUCLK2
READY#          t37     Minimum Time from Operand Write to
                        Operand Write                            4         CPUCLK2
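Because several of these parameters are given as counts of CPUCLK2 periods, converting them to time requires the clock frequency in use. The Python sketch below shows one way to do the conversion. It is illustrative only; the names are hypothetical, and the only relationship taken from the data sheet is that the internal CLK is CPUCLK2 divided by two (Section 5.0).

```python
# Minimum values from Table 7-2c that are expressed in CPUCLK2 periods.
MIN_CPUCLK2_PERIODS = {"t32": 6, "t33": 6, "t34": 6, "t36": 4, "t37": 4}


def cpuclk2_period_ns(clk_mhz: float) -> float:
    """CPUCLK2 period in ns for a given internal CLK frequency in MHz.

    CPUCLK2 runs at twice the CLK frequency.
    """
    return 1000.0 / (2.0 * clk_mhz)


def min_time_ns(symbol: str, clk_mhz: float) -> float:
    """Minimum duration of a Table 7-2c parameter in nanoseconds."""
    return MIN_CPUCLK2_PERIODS[symbol] * cpuclk2_period_ns(clk_mhz)


if __name__ == "__main__":
    # Example: a 25 MHz CLK, i.e. a 50 MHz CPUCLK2 with a 20 ns period.
    for name in sorted(MIN_CPUCLK2_PERIODS):
        print(name, min_time_ns(name, 25.0), "ns")
```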

Figure 7-1a. Typical Output Valid Delay vs Load Capacitance at Max Operating Temperature
NOTE:
*Typical part under worst-case conditions.

Figure 7-1b. Typical Output Slew Time vs Load Capacitance at Max Operating Temperature
NOTE:
*Typical part under worst-case conditions.

Figure 7-1c. Maximum ICC vs Frequency

Figure 7-2. CPUCLK2/NUMCLK2 Waveform and Measurement Points for Input/Output

Figure 7-3. Output Signals

Figure 7-4. Input and I/O Signals

Figure 7-5. RESET Signal
NOTE:
The second internal processor phase following RESET high to low transition is PH2.

Figure 7-6. Float from STEN

Figure 7-7. Other Parameters
*In NUMCLK2's
**or last operand
NOTE:
1. Memory read (operand) cycle is not shown.

8.0 INTEL387 SX MATH COPROCESSOR INSTRUCTION SET
Instructions for the Intel387 SX Math CoProcessor assume one of the five forms shown in Table 8-1. In all cases, instructions are at least two bytes long and begin with the bit pattern 11011B, which identifies the ESCAPE class of instruction. Instructions that refer to memory operands specify addresses using the CPU's addressing modes.
MOD (Mode field) and R/M (Register/Memory specifier) have the same interpretation as the corresponding fields of CPU instructions (refer to Programmer's Reference Manual for the CPU). SIB (Scale Index Base) byte and DISP (displacement) are optionally present in instructions that have MOD and R/M fields. Their presence depends on the values of MOD and R/M, as for instructions of the CPU.
The instruction summaries that follow in Table 8-2 assume that the instruction has been prefetched, decoded, and is ready for execution; that bus cycles do not require wait states; that there are no local bus HOLD requests delaying processor access to the bus; and that no exceptions are detected during instruction execution. If the instruction has MOD and R/M fields that call for both base and index registers, add one clock.
Table 8-1. Instruction Formats

                      Instruction                         Optional
        First Byte              Second Byte               Fields
1       11011   OPA       1     MOD   1     OPB    R/M    SIB  DISP
2       11011   MF        OPA   MOD   OPB*         R/M    SIB  DISP
3       11011   d    P    OPA   1  1  OPB*         ST(i)
4       11011   0    0    1     1  1  1     OP
5       11011   0    1    1     1  1  1     OP
Bits:   15–11   10   9    8     7  6  5     4 3 2 1 0

OP = Instruction opcode, possibly split into two fields OPA and OPB
MF = Memory Format
    00 - 32-bit real
    01 - 32-bit integer
    10 - 64-bit real
    11 - 16-bit integer
d = Destination
    0 - Destination is ST(0)
    1 - Destination is ST(i)
R XOR d = 0 - Destination (op) Source
R XOR d = 1 - Source (op) Destination
*In FSUB and FDIV, the low-order bit of OPB is the R (reversed) bit
P = POP
    0 - Do not pop stack
    1 - Pop stack after operation
ESC = 11011
ST(i) = Register stack element i
    000 = Stack top
    001 = Second stack element
    .
    .
    .
    111 = Eighth stack element
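The field layout of Table 8-1 can be exercised with a few bit operations. The Python sketch below is a minimal illustration, not a complete decoder. The function names are hypothetical, and memory_format applies only to instructions of form 2, where bits 10–9 of the instruction (bits 2–1 of the first byte) carry the MF field.

```python
MEMORY_FORMATS = {
    0b00: "32-bit real",
    0b01: "32-bit integer",
    0b10: "64-bit real",
    0b11: "16-bit integer",
}


def is_esc(first_byte: int) -> bool:
    """True if the top five bits of the first byte are the ESC pattern 11011."""
    return (first_byte >> 3) == 0b11011


def memory_format(first_byte: int) -> str:
    """Decode the MF field (form 2 of Table 8-1) from the first byte."""
    if not is_esc(first_byte):
        raise ValueError("not an ESC instruction")
    return MEMORY_FORMATS[(first_byte >> 1) & 0b11]


def stack_index(second_byte: int) -> int:
    """Extract the ST(i) field (low three bits of the second byte, form 3)."""
    return second_byte & 0b111


if __name__ == "__main__":
    # 0xDD = 11011 10 1: ESC with MF = 10, a 64-bit real memory operand.
    print(is_esc(0xDD), memory_format(0xDD))
```

The caller is expected to know which of the five forms applies; the second-byte MOD field and the opcode bits determine that, and this sketch does not attempt it.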
Encoding
Instruction
Clock Count Range
32-Bit
64-Bit
Integer
Real
Byte 0
Byte 1
Optional
Bytes 2 – 6
32-Bit
Real
DATA TRANSFER
FLD e Loada
Integer/real memory to ST(0)
ESC MF 1
MOD 000 R/M
SIB/DISP
11 – 20
Long integer memory to ST(0)
ESC 111
MOD 101 R/M
SIB/DISP
Extended real memory to ST(0)
ESC 011
MOD 101 R/M
SIB/DISP
16 – 47
BCD memory to ST(0)
ESC 111
MOD 100 R/M
SIB/DISP
49 – 101
ST(i) to ST(0)
ESC 001
11000 ST(i)
ST(0) to integer/real memory
ESC MF 1
MOD 010 R/M
ST(0) to ST(i)
ESC 101
11010 ST(i)
28 – 44
20 – 27
16-Bit
Integer
42 – 53
30 – 58
7 – 12
FST e Store
SIB/DISP
27 – 45
59 – 78
59
58 – 76
59
58 – 76
7 – 11
FSTP e Store and Pop
ST(0) to integer/real memory
ESC MF 1
MOD 011 R/M
SIB/DISP
ST(0) to long integer memory
ESC 111
MOD 111 R/M
SIB/DISP
27 – 45
59 – 78
ST(0) to extended real memory
ESC 011
MOD 111 R/M
SIB/DISP
50 – 56
ST(0) to BCD memory
ESC 111
MOD 110 R/M
SIB/DISP
116 – 194
ST(0) to ST(i)
ESC 101
11011 ST (i)
7 – 11
ESC 001
11001 ST(i)
10 – 17
Integer/real memory to ST(0)
ESC MF 0
MOD 010 R/M
ST(i) to ST(0)
ESC 000
11010 ST(i)
Integer/real memory to ST(0)
ESC MF 0
MOD 011 R/M
ST(i) to ST(0)
ESC 000
11011 ST(i)
64 – 86
FXCH e Exchange
ST(i) and ST(0)
COMPARISON
FCOM e Compare
SIB/DISP
15 – 27
36 – 54
18 – 31
39 – 62
13 – 21
FCOMP e Compare and pop
SIB/DISP
15 – 27
36 – 54
18 – 31
39 – 62
13 – 21
FCOMPP e Compare and pop twice
ESC 110
1101 1001
13 – 21
FTST e Test ST(0)
ST(1) to ST(0)
ESC 001
1110 0100
17 – 25
FUCOM e Unordered compare
ESC 101
11100 ST(i)
13 – 21
FUCOMP e Unordered compare
ESC 101
11101 ST(i)
13 – 21
FUCOMPP e Unordered compare
and pop twice
and pop
ESC 010
1110 1001
13 – 21
FXAM e Examine ST(0)
ESC 001
1110 0101
24-37
Shaded areas indicate instructions not available in 8087/80287.
NOTE:
a. When loading single or double precision zero from memory, add 5 clocks.
Encoding
Instruction
Clock Count Range
Optional
Bytes 2 – 6
32-Bit
Real
32-Bit
Integer
64-Bit
Real
16-Bit
Integer
14 – 31
36 – 58
19 – 38
38 – 64
Byte 0
Byte 1
Integer/real memory to ST(0)
ESC MF 0
MOD 000 R/M
SIB/DISP
ST(i) and ST(0)
ESC d P 0
11000 ST(i)
SIB/DISP
ESC MF 0
MOD 10 R R/M
SIB/DISP
ARITHMETIC
FADD e Add
12 – 26b
FSUB e Subtract
Integer/real memory with ST(0)
ST(i) to ST(0)
14 – 31
36 – 58
19 – 38
38 – 64c
12 – 26d
ESC d P 0
1110 R R/M
Integer/real memory with ST(0)
ESC MF 0
MOD 001 R/M
ST(i) and ST(0)
ESC d P 0
1100 1 R/M
Integer/real memory with ST(0)
ESC MF 0
MOD 11 R R/M
ST(i) and ST(0)
ESC d P 0
1111 R R/M
77 – 80h
FSQRT i e Square root
ESC 001
1111 1010
97 – 111
FSCALE e Scale ST(0) by ST(1)
ESC 001
1111 1101
44 – 82
FPREM e Partial remainder
ESC 001
1111 1000
56 – 140
FPREM1 e Partial remainder (IEEE)
ESC 001
1111 0101
81 – 168
FRNDINT e Round ST(0) to integer
ESC 001
1111 1100
41 – 62
FMUL e Multiply
SIB/DISP
21 – 33
45 – 73
27 – 57
46 – 74
17 – 50e
FDIV e Divide
SIB/DISP
79 – 87
103 – 116f
85 – 95
FXTRACT e Extract components
of ST(0)
ESC 001
1111 0100
42 – 63
FABS e Absolute value of ST(0)
ESC 001
1110 0001
14 – 21
FCHS e Change sign of ST(0)
ESC 001
1110 0000
17 – 24
105 – 124g
TRANSCENDENTAL
Instruction                                      Byte 0    Byte 1      Clock Count Range
FCOS (k) = Cosine of ST(0)                       ESC 001   1111 1111   122 – 680
FPTAN (k) = Partial tangent of ST(0)             ESC 001   1111 0010   162 – 430 (j)
FPATAN = Partial arctangent of ST(0)             ESC 001   1111 0011   250 – 420
FSIN (k) = Sine of ST(0)                         ESC 001   1111 1110   121 – 680
FSINCOS (k) = Sine and cosine of ST(0)           ESC 001   1111 1011   150 – 650
F2XM1 (l) = 2^ST(0) − 1                          ESC 001   1111 0000   167 – 410
FYL2X (m) = ST(1) * log2(ST(0))                  ESC 001   1111 0001   99 – 436
FYL2XP1 (n) = ST(1) * log2[ST(0) + 1.0]          ESC 001   1111 1001   210 – 447
Shaded areas indicate instructions not available in 8087/80287.
NOTES:
b. Add 3 clocks to the range when d = 1.
c. Add 1 clock to each range when R = 1.
d. Add 3 clocks to the range when d = 0.
e. typical = 52 (When d = 0, 46–54, typical = 49).
f. Add 1 clock to the range when R = 1.
g. 135 – 141 when R = 1.
h. Add 3 clocks to the range when d = 1.
i. −0 ≤ ST(0) ≤ +∞.
j. These timings hold for operands in the range |x| < π/4. For operands not in this range, up to 76 additional clocks may be needed to reduce the operand.
k. 0 ≤ ST(0) < 2^63.
l. −1.0 ≤ ST(0) ≤ 1.0.
m. 0 ≤ ST(0) < ∞, −∞ < ST(1) < +∞.
n. 0 ≤ |ST(0)| < [2 − SQRT(2)]/2, −∞ < ST(1) < +∞.
Encoding
Instruction
Byte 0
Byte 1
Clock Count Range
Optional
Bytes 2 – 6
32-Bit
Real
32-Bit
Integer
64-Bit
Real
16-Bit
Integer
CONSTANTS
Instruction                             Byte 0    Byte 1      Clock Count Range
FLDZ = Load +0.0 to ST(0)               ESC 001   1110 1110   10 – 17
FLD1 = Load +1.0 to ST(0)               ESC 001   1110 1000   15 – 22
FLDPI = Load π to ST(0)                 ESC 001   1110 1011   26 – 36
FLDL2T = Load log2(10) to ST(0)         ESC 001   1110 1001   26 – 36
FLDL2E = Load log2(e) to ST(0)          ESC 001   1110 1010   26 – 36
FLDLG2 = Load log10(2) to ST(0)         ESC 001   1110 1100   25 – 35
FLDLN2 = Load loge(2) to ST(0)          ESC 001   1110 1101   26 – 38
PROCESSOR CONTROL
FINIT e Initialize Math CoProcessor
ESC 011
1110 0011
FLDCW e Load control word from memory
ESC 001
MOD 101 R/M
SIB/DISP
19
33
FSTCW e Store control word to memory
ESC 001
MOD 111 R/M
SIB/DISP
15
FSTSW e Store status word to memory
ESC 101
MOD 111 R/M
SIB/DISP
15
FSTSW AX e Store status word to AX
ESC 111
1110 0000
FCLEX e Clear exceptions
ESC 011
1110 0010
FSTENV e Store environment
ESC 001
MOD 110 R/M
SIB/DISP
117 – 118
FLDENV e Load environment
ESC 001
MOD 100 R/M
SIB/DISP
85
13
11
FSAVE e Save state
ESC 101
MOD 110 R/M
SIB/DISP
402 – 403
FRSTOR e Restore state
ESC 101
MOD 100 R/M
SIB/DISP
415
FINCSTP e Increment stack pointer
ESC 001
1111 0111
FDECSTP e Decrement stack pointer
ESC 001
1111 0110
22
FFREE e Free ST(i)
ESC 101
1100 0 ST(i)
18
FNOP e No operations
ESC 001
1101 0000
12
21
APPENDIX A
INTEL387 SX MATH COPROCESSOR
COMPATIBILITY
A.1 8087/80287 Compatibility
This section summarizes the differences between the Intel387 SX Math CoProcessor and the 80287 Math
CoProcessor. Any migration from the 8087 directly to the Intel387 SX Math CoProcessor must also take into
account the differences between the 8087 and the 80287 Math CoProcessor as listed in Appendix B.
Many changes have been designed into the Intel387 SX Math CoProcessor to directly support the IEEE
standard in hardware. These changes result in increased performance by eliminating the need for software
that supports the standard.
A.1.1 GENERAL DIFFERENCES
The Intel387 SX Math CoProcessor supports only affine closure for infinity arithmetic, not projective closure.
Operands for FSCALE and FPATAN are no longer restricted in range (except for ±∞); F2XM1 and FPTAN
accept a wider range of operands.
Rounding control is in effect for FLD constant.
Software cannot change entries of the tag word to values (other than empty) that differ from actual register
contents.
After reset, FINIT, and incomplete FPREM, the Intel387 SX Math CoProcessor resets to zero the condition
code bits C3–C0 of the status word.
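For readers inspecting a stored status word, the following Python sketch extracts the condition code bits. It assumes the conventional status word layout (C0 in bit 8, C1 in bit 9, C2 in bit 10, C3 in bit 14, and the stack top field in bits 11 to 13); that layout is not restated in this section, and the function names are hypothetical.

```python
# Assumed bit positions within the 16-bit status word
C0_BIT, C1_BIT, C2_BIT, C3_BIT = 8, 9, 10, 14
CC_MASK = (1 << C0_BIT) | (1 << C1_BIT) | (1 << C2_BIT) | (1 << C3_BIT)

def condition_codes(status_word: int) -> tuple:
    """Return (C3, C2, C1, C0) as a tuple of 0/1 values."""
    return tuple((status_word >> bit) & 1 for bit in (C3_BIT, C2_BIT, C1_BIT, C0_BIT))

def stack_top(status_word: int) -> int:
    """Return the three-bit stack top field (bits 11 to 13)."""
    return (status_word >> 11) & 0b111

def clear_condition_codes(status_word: int) -> int:
    """Model the effect described above: C3 to C0 forced to zero."""
    return status_word & ~CC_MASK & 0xFFFF

if __name__ == "__main__":
    sw = 0x7F00  # arbitrary example value with all four condition bits set
    print(condition_codes(sw), stack_top(sw))
    print(hex(clear_condition_codes(sw)))
```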
In conformance with the IEEE standard, the Intel387 SX Math CoProcessor does not support the special data
formats pseudo-zero, pseudo-NaN, pseudo-infinity, and unnormal.
The denormal exception has a different purpose on the Intel387 SX Math CoProcessor. A system that uses the
denormal exception handler solely to normalize denormal operands should instead mask the denormal
exception on the Intel387 SX Math CoProcessor, because the Intel387 SX Math CoProcessor automatically
normalizes denormal operands when the denormal exception is masked.
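To show what normalizing a denormal operand means numerically, the Python sketch below takes a single-precision denormal bit pattern and rewrites its value as a significand in [1, 2) times a power of two. It assumes the extended format's smallest normal exponent is −16382, so every single-precision denormal fits as a normal extended value. The helper names are hypothetical and the code models only the arithmetic, not the coprocessor's internal datapath.

```python
import math
import struct

EXTENDED_MIN_NORMAL_EXP = -16382  # assumed smallest normal exponent, extended format

def single_from_bits(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def is_single_denormal(bits: int) -> bool:
    """A single is denormal when its exponent field is 0 and its fraction is not."""
    return (bits >> 23) & 0xFF == 0 and (bits & 0x7FFFFF) != 0

def normalize(value: float) -> tuple:
    """Return (significand in [1, 2), unbiased exponent) for a nonzero value."""
    mantissa, exponent = math.frexp(value)  # mantissa magnitude lies in [0.5, 1)
    return mantissa * 2.0, exponent - 1

if __name__ == "__main__":
    bits = 0x00000001  # smallest positive single-precision denormal
    significand, exponent = normalize(single_from_bits(bits))
    print(is_single_denormal(bits), significand, exponent)
    print(exponent >= EXTENDED_MIN_NORMAL_EXP)  # representable as a normal extended value
```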
A.1.2 EXCEPTIONS
A number of differences exist due to changes in the IEEE standard and to functional improvements to the
architecture of the Intel387 SX Math CoProcessor:
1. When the overflow or underflow exception is masked, the Intel387 SX Math CoProcessor differs from the
80287 in rounding when overflow or underflow occurs. The Intel387 SX Math CoProcessor produces
results that are consistent with the rounding mode.
2. When the underflow exception is masked, the Intel387 SX Math CoProcessor sets its underflow flag only if
there is also a loss of accuracy during denormalization.
3. Fewer invalid-operation exceptions arise from denormal operands, because the instructions FSQRT, FDIV,
FPREM, and conversions to BCD or to integer normalize denormal operands before proceeding.
4. The FSQRT, FBSTP, and FPREM instructions may cause underflow, because they support denormal
operands.
5. The denormal exception can occur during the transcendental instructions and the FXTRACT instruction.
6. The denormal exception no longer takes precedence over all other exceptions.
7. When the denormal exception is masked, the Intel387 SX Math CoProcessor automatically normalizes
denormal operands. The 8087/80287 performs unnormal arithmetic, which might produce an unnormal
result.
8. When the operand is zero, the FXTRACT instruction reports a zero-divide exception and leaves −∞ in
ST(1).
9. The status word has a new bit (SF) that signals when invalid-operation exceptions are due to stack
underflow or overflow.
10. FLD extended precision no longer reports denormal exceptions, because the instruction is not numeric.
11. FLD single/double precision when the operand is denormal converts the number to extended precision
and signals the denormal operand exception. When loading a signaling NaN, FLD single/double precision
signals an invalid-operation exception.
12. The Intel387 SX Math CoProcessor only generates quiet NaNs (as on the 80287); however, the Intel387
SX Math CoProcessor distinguishes between quiet NaNs and signaling NaNs. Signaling NaNs trigger
exceptions when they are used as operands; quiet NaNs do not (except for FCOM, FIST, and FBSTP
which also raise IE for quiet NaNs).
13. When stack overflow occurs during FPTAN and overflow is masked, both ST(0) and ST(1) contain quiet
NaNs. The 80287/8087 leaves the original operand in ST(1) intact.
14. When the scaling factor is ±∞, the FSCALE instruction behaves as follows:
• FSCALE (0, ∞) generates the invalid operation exception.
• FSCALE (finite, −∞) generates zero with the same sign as the scaled operand.
• FSCALE (finite, +∞) generates ∞ with the same sign as the scaled operand.
The 8087/80287 returns zero in the first case and raises the invalid-operation exception in the other
cases.
15. The Intel387 SX Math CoProcessor returns signed infinity/zero as the unmasked response to massive
overflow/underflow. The 8087 and 80287 support a limited range for the scaling factor; within this range,
massive overflow/underflow either does not occur or produces undefined results.
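The FSCALE cases in item 14 can be modeled in a few lines. The Python sketch below is a reading of that item only, not a description of the hardware: it assumes the first case applies to a zero operand scaled by +∞, and it treats a zero operand scaled by −∞ under the second case, since zero is finite. The exception class `InvalidOperation` and the function `fscale_infinite` are hypothetical names.

```python
import math

class InvalidOperation(ArithmeticError):
    """Stands in for the invalid-operation exception (hypothetical name)."""

def fscale_infinite(operand: float, scale: float) -> float:
    """Model item 14: scale a zero or finite operand by an infinite factor."""
    if not math.isinf(scale):
        raise ValueError("this sketch covers infinite scale factors only")
    if math.isinf(operand) or math.isnan(operand):
        raise ValueError("this sketch covers zero and finite operands only")
    if operand == 0.0 and scale > 0:
        # Assumption: the unsigned 'infinity' in the first case means +infinity
        raise InvalidOperation("FSCALE (0, infinity)")
    if scale < 0:
        return math.copysign(0.0, operand)      # zero with the operand's sign
    return math.copysign(math.inf, operand)     # infinity with the operand's sign

if __name__ == "__main__":
    print(fscale_infinite(3.0, -math.inf))
    print(fscale_infinite(-3.0, -math.inf))
    print(fscale_infinite(-2.0, math.inf))
```

Using `math.copysign` keeps the sign of a negative operand on the zero result, which a plain `0.0` literal would lose.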
APPENDIX B
COMPATIBILITY BETWEEN THE 80287
AND 8087 MATH COPROCESSOR
The 80286/80287 operating in Real Address mode will execute 8086/8087 programs without major modification. However, because of differences in the handling of numeric exceptions by the 80287 Math CoProcessor
and the 8087 Math CoProcessor, exception handling routines may need to be changed. This appendix summarizes the differences between the 80287 Math CoProcessor and the 8087 Math CoProcessor, and provides
details showing how 8086/8087 programs can be ported to the 80286/80287.
1. The Math CoProcessor signals exceptions through a dedicated ERROR# line to the 80286. The Math
CoProcessor error signal does not pass through an interrupt controller (the 8087 INT signal does). Therefore, any interrupt controller oriented instructions in numeric exception handlers for the 8086/8087 should
be deleted.
2. The 8087 instructions FENI and FDISI perform no useful function in the 80287. If the 80287 encounters one
of these opcodes in its instruction stream, the instruction will effectively be ignored; none of the 80287
internal states will be updated. While 8086/8087 programs containing the instruction may be executed on
the 80286/80287, it is unlikely that the exception handling routines containing these instructions will be
completely portable to the 80287.
3. Interrupt vector 16 must point to the numeric exception handling routine.
4. The ESC instruction address saved in the 80287 includes any leading prefixes before the ESC opcode. The
corresponding address saved in the 8087 does not include leading prefixes.
5. In Protected Address mode, the format of the 80287’s saved instruction and address pointers differs
from that of the 8087. The instruction opcode is not saved in Protected mode; exception handlers will have to
retrieve the opcode from memory if needed.
6. Interrupt 7 will occur in the 80286 when executing ESC instructions with either the TS (task switched) or EM
(emulation) bit of the 80286 MSW set (TS = 1 or EM = 1). If TS is set, then a WAIT instruction will also cause
interrupt 7. An exception handler should be included in 80286/80287 code to handle these situations.
7. Interrupt 9 will occur if the second or subsequent words of a floating point operand fall outside a segment’s
size. Interrupt 13 will occur if the starting address of a numeric operand falls outside a segment’s size. An
exception handler should be included in 80286/80287 code to report these programming errors.
8. Except for the processor control instructions, all of the 80287 numeric instructions are automatically synchronized by the 80286 CPU; the 80286 CPU automatically tests the BUSY# line from the 80287 to ensure
that the 80287 has completed its previous instruction before executing the next ESC instruction. No explicit
WAIT instructions are required to assure this synchronization. For the 8087 used with 8086 and 8088
processors, explicit WAITs are required before each numeric instruction to ensure synchronization. Although 8086/8087 programs having explicit WAIT instructions will execute perfectly on the 80286/80287
without reassembly, these WAIT instructions are unnecessary.
9. Since the 80287 does not require WAIT instructions before each numeric instruction, the ASM286 assembler does not automatically generate these WAIT instructions. The ASM86 assembler, however, automatically precedes every ESC instruction with a WAIT instruction. Although numeric routines generated using
the ASM86 assembler will generally execute correctly on the 80286/80287, reassembly using ASM286
may result in a more compact code image.
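The size difference described in items 8 and 9 can be estimated with a short Python sketch. It assumes the WAIT instruction is the single byte 9B hex and uses three example numeric instructions (FLDL2T, FLDL2E, and FADDP ST(1),ST) chosen purely for illustration; the function names are hypothetical and do not reproduce either assembler.

```python
from typing import Sequence

WAIT = 0x9B  # assumed one-byte WAIT opcode

def emit_with_wait(instructions: Sequence[bytes]) -> bytes:
    """Precede every ESC instruction with WAIT, as item 9 describes for ASM86."""
    return b"".join(bytes([WAIT]) + ins for ins in instructions)

def emit_without_wait(instructions: Sequence[bytes]) -> bytes:
    """Emit the ESC instructions alone, as item 9 describes for ASM286."""
    return b"".join(instructions)

if __name__ == "__main__":
    routine = [b"\xd9\xe9", b"\xd9\xea", b"\xde\xc1"]  # illustrative sequence
    long_form = emit_with_wait(routine)
    short_form = emit_without_wait(routine)
    print(long_form.hex(" ").upper(), len(long_form))
    print(short_form.hex(" ").upper(), len(short_form))
```

For this sequence the WAIT-prefixed image is one byte longer per numeric instruction, which is the saving item 9 attributes to reassembly.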
The processor control instructions for the 80287 may be coded using either a WAIT or No-WAIT form of
mnemonic. The WAIT forms of these instructions cause ASM286 to precede the ESC instructions with a CPU
WAIT instruction, exactly as ASM86 does.
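The interrupt conditions in items 6 and 7 of this appendix can be summarized as a small decision model. The Python sketch below is a simplification under stated assumptions: `limit` is taken to be the last valid byte offset of the segment, the "starting address" check is applied to the operand's first 16-bit word, and both function names are hypothetical. It is not a description of the 80286's actual protection logic.

```python
from typing import Optional

def interrupt_for_instruction(kind: str, ts: bool, em: bool) -> Optional[int]:
    """Item 6: ESC traps when TS or EM is set; WAIT traps when TS is set."""
    if kind == "ESC" and (ts or em):
        return 7
    if kind == "WAIT" and ts:
        return 7
    return None

def interrupt_for_operand(offset: int, size: int, limit: int) -> Optional[int]:
    """Item 7: classify a numeric memory operand against the segment limit.

    offset: starting byte offset of the operand
    size:   operand length in bytes
    limit:  last valid byte offset in the segment (assumption)
    """
    if offset + 1 > limit:
        return 13  # the first word already falls outside the segment
    if offset + size - 1 > limit:
        return 9   # a second or subsequent word falls outside the segment
    return None

if __name__ == "__main__":
    print(interrupt_for_instruction("ESC", ts=False, em=True))
    print(interrupt_for_instruction("WAIT", ts=False, em=True))
    print(interrupt_for_operand(0xFFFC, 8, 0xFFFF))
    print(interrupt_for_operand(0x0100, 8, 0x00FF))
```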