
Am486® Microprocessor Software User’s Manual
Rev. 1, 1994
ADVANCED MICRO DEVICES
© 1994 Advanced Micro Devices, Inc.
Advanced Micro Devices reserves the right to make changes in its products
without notice in order to improve design or performance characteristics.
This publication neither states nor implies any warranty of any kind, including but not limited to implied warranties of merchantability or fitness for
a particular application. AMD® assumes no responsibility for the use of any circuitry other than the circuitry in an AMD product.
The information in this publication is believed to be accurate in all respects at the time of publication, but is subject to change without notice. AMD
assumes no responsibility for any errors or omissions, and disclaims responsibility for any consequences resulting from the use of the
information included herein. Additionally, AMD assumes no responsibility for the functioning of undescribed features or parameters.
Trademarks
AMD, Am486, and Am386 are registered trademarks of Advanced Micro Devices, Inc.
Microsoft is a registered trademark of Microsoft Corporation. Windows is a trademark of Microsoft Corporation.
Product names used in this publication are for identification purposes only and may be trademarks of their respective companies.
INTRODUCTION
The Am486® Microprocessor Software User’s Manual is designed to support system software engineers developing BIOS and application software for use with products from the
Am486 microprocessor family. Because such engineers are typically already familiar with
basic personal computer system programming requirements, this book focuses on the basic processor instruction set and the programmable registers.
Each chapter begins with an overview diagram of registers or instructions organized by
operational category with cross-references to the detailed description page for each item.
The detailed descriptions are listed alphabetically on the subsequent pages in the chapter.
Supplementary information is provided in Appendices A through J. A glossary of terms is
included after the appendices. For convenience, a basic ASCII cross-reference is on the
inside back cover of this manual.
TABLE OF CONTENTS
Introduction
Chapter 1 Am486 Microprocessor Register Set
1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1
1.2 Detailed Register Descriptions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1
1.3 AH Processor General Register 8 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
1.4 AL Processor General Register 8 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4
1.5 AX Processor General Register 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
1.6 BH Processor General Register 8 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6
1.7 BL Processor General Register 8 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-7
1.8 BP Processor General Register/Base Pointer 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-8
1.9 BX Processor General Register 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-9
1.10 CH Processor General Register 8 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10
1.11 CL Processor General Register 8 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-11
1.12 CR0 Control Register 0 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-12
1.13 CR1 Control Register 1 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-14
1.14 CR2 Control Register 2 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-15
1.15 CR3 Control Register 3 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-16
1.16 CS Code Segment Register 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-17
1.17 CX Processor General Register 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-18
1.18 DH Processor General Register 8 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-19
1.19 DI Processor General Register — Data Index 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-20
1.20 DL Processor General Register 8 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-21
1.21 DR0 Linear Breakpoint Address 0 Debug Register 32 bits . . . . . . . . . . . . . . . . . . . . . . . . 1-22
1.22 DR1 Linear Breakpoint Address 1 Debug Register 32 bits . . . . . . . . . . . . . . . . . . . . . . . . 1-23
1.23 DR2 Linear Breakpoint Address 2 Debug Register 32 bits . . . . . . . . . . . . . . . . . . . . . . . . 1-24
1.24 DR3 Linear Breakpoint Address 3 Debug Register 32 bits . . . . . . . . . . . . . . . . . . . . . . . . 1-25
1.25 DR4 Debug Register 4 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-26
1.26 DR5 Debug Register 5 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-27
1.27 DR6 Breakpoint Status Debug Register 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-28
1.28 DR7 Breakpoint Control Debug Register 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-29
1.29 DS Data Segment Register 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-31
1.30 DX Processor General Register 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-32
1.31 EAX Processor General Register 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-33
1.32 EBP Processor General Register — Base Pointer 32 bits. . . . . . . . . . . . . . . . . . . . . . . . . 1-34
1.33 EBX Processor General Register 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-35
1.34 ECX Processor General Register 32 bits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-36
1.35 EDI Processor General Register — Data Index 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-37
1.36 EDX Processor General Register 32 bits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-38
1.37 EFLAGS Extended Flags Register 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-39
1.38 EIP Extended Instruction Pointer Register 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-40
1.39 ES Data Segment Register 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-41
1.40 ESI Processor General Register — Stack Index 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . 1-42
1.41 ESP Processor General Register — Stack Pointer 32 bits . . . . . . . . . . . . . . . . . . . . . . . . 1-43
1.42 FLAGS Flags Register 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-44
1.43 FPUCR FPU Control Register 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-45
1.44 FPUDP FPU Data Pointer 32 or 64 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-46
1.45 FPUIP FPU Instruction Pointer 32 or 64 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-47
1.46 FPUSR FPU Status Register 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-48
1.47 FPUTWR FPU Tag Word Register 16 bits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-50
1.48 FS Data Segment Register 16 bits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-51
1.49 GDTR Global Descriptor Table Register 48 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-52
1.50 GS Data Segment Register 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-53
1.51 IDTR Interrupt Descriptor Table Register 48 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-54
1.52 IP Instruction Pointer 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-55
1.53 LDTR Local Descriptor Table Register 48 bits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-56
1.54 R0–R7 FPU Data Registers 0–7 80 bits each . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-57
1.55 SI Processor General Register — Stack Index 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-58
1.56 SP Processor General Register — Stack Pointer 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . 1-59
1.57 SS Stack Segment Register 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-60
1.58 TR Task Register 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-61
1.59 TR3 Cache Test Data Register 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-62
1.60 TR4 Cache Test Status Register 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-63
1.61 TR5 Cache Test Control Register 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-64
1.62 TR6 TLB Test Control Register 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-65
1.63 TR7 TLB Test Status Register 32 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-66
Chapter 2 Am486 Microprocessor Instruction Set
2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
2.2 Detailed Instruction Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
2.3 AAA ASCII Adjusts AL after Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4
2.4 AAD ASCII Adjusts AX before Division . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5
2.5 AAM ASCII Adjusts AX after Multiply . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
2.6 AAS ASCII Adjusts AL after Subtraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-7
2.7 ADC Adds Integers with Carry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-8
2.8 ADD Adds Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-9
2.9 AND Logical AND Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-10
2.10 ARPL Adjusts RPL Field of Selector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-11
2.11 BOUND Checks Array Index Against Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-12
2.12 BSF Bit Scan Forward . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-13
2.13 BSR Bit Scan Reverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-14
2.14 BSWAP Byte Swap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-15
2.15 BT Bit Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-16
2.16 BTC Bit Test and Complement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-17
2.17 BTR Bit Test And Reset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-18
2.18 BTS Bit Test And Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-19
2.19 CALL Calls Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-20
2.20 CBW Converts Byte to Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-25
2.21 CDQ Converts Doubleword to Quadword . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-26
2.22 CLC Clears Carry Flag. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-27
2.23 CLD Clears Direction Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-28
2.24 CLI Clears Interrupt-Enable Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-29
2.25 CLTS Clears Task-Switched Flag in CR0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-30
2.26 CMC Complements Carry Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-31
2.27 CMP Compares Two Operands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-32
2.28 CMPS/CMPSB/CMPSD/CMPSW Compares Two String Operands . . . . . . . . . . . . . . . . . 2-33
2.29 CMPXCHG Compares And Exchanges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-35
2.30 CWD Converts Word to Doubleword Using DX:AX Register Pair . . . . . . . . . . . . . . . . . . . 2-36
2.31 CWDE Converts Word to Doubleword Using EAX Register . . . . . . . . . . . . . . . . . . . . . . . 2-37
2.32 DAA Decimal Adjusts AL after Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-38
2.33 DAS Decimal Adjusts AL after Subtraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-39
2.34 DEC Decrements by 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-40
2.35 DIV Unsigned Divide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-41
2.36 ENTER Makes Stack Frame for Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-42
2.37 F2XM1 Computes 2ˣ–1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-43
2.38 FABS Absolute Value. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-44
2.39 FADD Adds Floating Point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-45
2.40 FADDP Adds Floating Point and Pops FPU Stack Top . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-46
2.41 FBLD Loads Binary Coded Decimal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-47
2.42 FBSTP Stores Binary Coded Decimal and Pops FPU Stack Top . . . . . . . . . . . . . . . . . . . 2-48
2.43 FCHS Changes Sign . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-49
2.44 FCLEX Clears Exceptions after Checking for FPU Error . . . . . . . . . . . . . . . . . . . . . . . . . . 2-50
2.45 FCOM Compares Real. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-51
2.46 FCOMP Compares Real and Pops FPU Stack Top. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-52
2.47 FCOMPP Compares Real and Pops FPU Stack Top Twice . . . . . . . . . . . . . . . . . . . . . . . 2-53
2.48 FCOS Cosine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-54
2.49 FDECSTP Decrements Top-of-Stack Pointer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-55
2.50 FDIV Divides Real . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-56
2.51 FDIVP Divides Real and Pops FPU Stack Top . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-57
2.52 FDIVR Reverse Divides Real. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-58
2.53 FDIVRP Reverse Divides Real and Pops FPU Stack Top. . . . . . . . . . . . . . . . . . . . . . . . . 2-59
2.54 FFREE Free Floating-Point Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-60
2.55 FIADD Adds Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-61
2.56 FICOM Compares Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-62
2.57 FICOMP Compares Integer and Pops FPU Stack Top . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-63
2.58 FIDIV Divides Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-64
2.59 FIDIVR Reverse Divides Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-65
2.60 FILD Loads Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-66
2.61 FIMUL Multiplies Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-67
2.62 FINCSTP Increments Top-of-Stack Pointer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-68
2.63 FINIT Initializes FPU after Checking for Unmasked FPU Error . . . . . . . . . . . . . . . . . . . . 2-69
2.64 FIST Stores Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-70
2.65 FISTP Stores Integer and Pops FPU Stack Top . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-71
2.66 FISUB Subtracts Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-72
2.67 FISUBR Reverse Subtracts Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-73
2.68 FLD Loads Real . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-74
2.69 FLD1 Loads Constant +1.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-75
2.70 FLDCW Loads Control Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-76
2.71 FLDENV Loads FPU Environment. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-77
2.72 FLDL2E Loads Constant log₂e . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-78
2.73 FLDL2T Loads Constant log₂10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-79
2.74 FLDLG2 Loads Constant log₁₀2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-80
2.75 FLDLN2 Loads Constant logₑ2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-81
2.76 FLDPI Loads Constant π . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-82
2.77 FLDZ Loads Constant +0.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-83
2.78 FMUL Multiplies Real . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-84
2.79 FMULP Multiplies Real and Pops FPU Stack Top . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-85
2.80 FNCLEX Clears Exceptions without Checking for FPU Error . . . . . . . . . . . . . . . . . . . . . . 2-86
2.81 FNINIT Initializes FPU without Checking for Unmasked FPU Error. . . . . . . . . . . . . . . . . . 2-87
2.82 FNOP No Operation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-88
2.83 FNSAVE Stores FPU State w/o Checking for Unmasked FPU Error. . . . . . . . . . . . . . . . . 2-89
2.84 FNSTCW Stores Control Word without Checking for FPU Error . . . . . . . . . . . . . . . . . . . . 2-90
2.85 FNSTENV Stores FPU Environment w/o Checking for FPU Error. . . . . . . . . . . . . . . . . . . 2-91
2.86 FNSTSW Stores Status Word w/o Checking for Unmasked FPU Error. . . . . . . . . . . . . . . 2-92
2.87 FPATAN Partial Arctangent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-93
2.88 FPREM Partial Remainder (Non-IEEE 754 compliant) . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-94
2.89 FPREM1 Partial Remainder (IEEE 754 compliant) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-95
2.90 FPTAN Partial Tangent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-96
2.91 FRNDINT Rounds to Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-97
2.92 FRSTOR Restores FPU State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-98
2.93 FSAVE Stores FPU State after Checking for Unmasked FPU Error . . . . . . . . . . . . . . . . . 2-99
2.94 FSCALE Scales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-100
2.95 FSIN Sine. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-101
2.96 FSINCOS Sine and Cosine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-102
2.97 FSQRT Square Root . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-103
2.98 FST Stores Real. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-104
2.99 FSTCW Stores Control Word after Checking for FPU Error . . . . . . . . . . . . . . . . . . . . . . 2-105
2.100 FSTENV Stores FPU Environment after Checking for FPU Error . . . . . . . . . . . . . . . . . 2-106
2.101 FSTP Stores Real and Pops the FPU Stack Top. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-107
2.102 FSTSW Stores Status Word after Checking for Unmasked FPU Error . . . . . . . . . . . . . 2-108
2.103 FSUB Subtracts Real . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-109
2.104 FSUBP Subtracts Real and Pops FPU Stack Top . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-110
2.105 FSUBR Reverse Subtracts Real . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-111
2.106 FSUBRP Reverse Subtracts and Pops FPU Stack Top . . . . . . . . . . . . . . . . . . . . . . . . 2-112
2.107 FTST Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-113
2.108 FUCOM Unordered Compare Real . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-114
2.109 FUCOMP Unordered Compare Real and Pop FPU Stack Top . . . . . . . . . . . . . . . . . . . 2-115
2.110 FUCOMPP Unordered Compare Real and Pop FPU Stack Top Twice. . . . . . . . . . . . . 2-116
2.111 FWAIT Wait . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-117
2.112 FXAM Examine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-118
2.113 FXCH Exchanges Stack Register Contents. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-119
2.114 FXTRACT Extracts Exponent and Significand. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-120
2.115 FYL2X Computes y ⋅ log₂x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-121
2.116 FYL2XP1 Computes y ⋅ log₂(x+1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-122
2.117 HLT Halt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-123
2.118 IDIV Signed Divide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-124
2.119 IMUL Signed Multiply . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-125
2.120 IN Inputs Data from Port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-126
2.121 INC Increments by One . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-127
2.122 INS/INSB/INSD/INSW Inputs Data from Port to String . . . . . . . . . . . . . . . . . . . . . . . . . 2-128
2.123 INT/INTO Call to Interrupt Procedure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-130
2.124 INVD Invalidates Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-134
2.125 INVLPG Invalidates TLB Entry. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-135
2.126 IRET/IRETD Interrupt Return . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-136
2.127 JA Jumps If Above (see also JNBE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-140
2.128 JAE Jumps If Above or Equal (see also JNB and JNC). . . . . . . . . . . . . . . . . . . . . . . . . 2-141
2.129 JB Jumps If Below (see also JC and JNAE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-142
2.130 JBE Jumps If Below or Equal (see also JNA) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-143
2.131 JC Jumps If Carry (see also JB and JNAE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-144
2.132 JCXZ Jumps Short If CX Register is 0 (see also JECXZ) . . . . . . . . . . . . . . . . . . . . . . . 2-145
2.133 JE Jumps Short If Equal (see also JZ). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-146
2.134 JECXZ Jumps Short If ECX Register is 0 (see also JCXZ) . . . . . . . . . . . . . . . . . . . . . . 2-147
2.135 JG Jumps If Greater (see also JNLE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-148
2.136 JGE Jumps If Greater or Equal (see also JNL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-149
2.137 JL Jumps If Less (see also JNGE). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-150
2.138 JLE Jumps If Less or Equal (see also JNG) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-151
2.139 JMP Jump . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-152
2.140 JNA Jumps If Not Above (see also JBE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-156
2.141 JNAE Jumps If Not Above or Equal (see also JB and JC). . . . . . . . . . . . . . . . . . . . . . . 2-157
2.142 JNB Jumps If Not Below (see also JAE and JNC) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-158
2.143 JNBE Jumps If Not Below or Equal (see also JA) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-159
2.144 JNC Jumps If Not Carry (see also JAE and JNB) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-160
2.145 JNE Jumps If Not Equal (see also JNZ). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-161
2.146 JNG Jumps If Not Greater (see also JLE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-162
2.147 JNGE Jumps If Not Greater or Equal (see also JL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-163
2.148 JNL Jumps If Not Less (see also JGE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-164
2.149 JNLE Jumps If Not Less or Equal (see also JG) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-165
2.150 JNO Jumps If Not Overflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-166
2.151 JNP Jumps If Not Parity (see also JPO) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-167
2.152 JNS Jumps If Not Sign . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-168
2.153 JNZ Jumps If Not Zero (see also JNE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-169
2.154 JO Jumps If Overflow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-170
2.155 JP Jumps If Parity (see also JPE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-171
2.156 JPE Jumps If Parity Even (see also JP). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-172
2.157 JPO Jumps if Parity Odd (see also JNP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-173
2.158 JS Jumps If Sign . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-174
2.159 JZ Jumps If 0 (see also JE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-175
2.160 LAHF Loads Flags into AH. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-176
2.161 LAR Loads Access Rights Byte . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-177
2.162 LDS Loads Pointer Using DS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-178
2.163 LEA Loads Effective Address. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-179
2.164 LEAVE High Level Procedure Exit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-180
2.165 LES Loads Pointer Using ES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-181
2.166 LFS Loads Pointer Using FS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-182
2.167 LGDT Loads GDTR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-183
2.168 LGS Loads Pointer Using GS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-184
2.169 LIDT Loads IDTR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-185
2.170 LLDT Loads LDTR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-186
2.171 LMSW Loads Machine Status Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-187
2.172 LOCK Asserts LOCK Signal Prefix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-188
2.173 LODS/LODSB/LODSD/LODSW Loads String Operand . . . . . . . . . . . . . . . . . . . . . . . 2-189
2.174 LOOP/LOOPE/LOOPNE/LOOPNZ/LOOPZ Loop Control CX Counter . . . . . . . . . . . . . 2-191
2.175 LSL Loads Segment Limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-192
2.176 LSS Loads Pointer Using SS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-193
2.177 LTR Loads Task Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-194
2.178 MOV Moves Data/Registers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-195
2.179 MOVS/MOVSB/MOVSD/MOVSW Moves Data from String to String . . . . . . . . . . . . . . 2-198
2.180 MOVSX Moves with Sign Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-200
2.181 MOVZX Moves with Zero Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-201
2.182 MUL Unsigned Multiply . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-202
2.183 NEG Two’s Complement Negation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-203
2.184 NOP No Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-204
2.185 NOT One’s Complement Negation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-205
2.186 OR Logical Inclusive OR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-206
2.187 OUT Outputs to Port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-207
2.188 OUTS/OUTSB/OUTSD/OUTSW Output String to Port . . . . . . . . . . . . . . . . . . . . . . . . . 2-208
2.189 POP Pops Word from Stack. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-210
2.190 POPA Pops All 16-Bit General Registers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-212
2.191 POPAD Pops All 32-Bit General Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-213
2.192 POPF/POPFD Pops Stack into FLAGS or EFLAGS Register . . . . . . . . . . . . . . . . . . . . 2-214
2.193 PUSH Pushes Operand onto Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-215
2.194 PUSHA Pushes All 16-Bit General Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-217
2.195 PUSHAD Pushes All 32-Bit General Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-218
2.196 PUSHF/PUSHFD Pushes FLAGS Register onto the Stack . . . . . . . . . . . . . . . . . . . . . 2-219
2.197 RCL Rotates through Carry Left. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-220
2.198 RCR Rotates through Carry Right . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-221
2.199 REP/REPE/REPNE/REPNZ/REPZ Repeats Specified String Operation . . . . . . . . . . . 2-222
2.200 RET Returns from Procedure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-224
2.201 ROL Rotates Left . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-228
2.202 ROR Rotates Right . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-229
2.203 SAHF Stores AH into Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-230
2.204 SAL Shifts Arithmetic Left . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-231
2.205 SAR Shifts Arithmetic Right . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-232
2.206 SBB Integer Subtract with Borrow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-233
2.207 SCAS/SCASB/SCASD/SCASW Compares String Data . . . . . . . . . . . . . . . . . . . . . . . . 2-234
2.208 SETcc Sets Byte on Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-236
2.209 SGDT Store Global Descriptor Table Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-237
2.210 SHL Shift Left . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-238
2.211 SHLD Double Precision Shift Left . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-239
2.212 SHR Shift Right . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-241
2.213 SHRD Double Precision Shift Right . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-242
2.214 SIDT Stores Interrupt Descriptor Table Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-244
2.215 SLDT Stores Local Descriptor Table Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-245
2.216 SMSW Stores Machine Status Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-246
2.217 STC Sets Carry Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-247
2.218 STD Sets Direction Flag. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-248
2.219 STI Sets Interrupt-Enable Flag. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-249
2.220 STOS/STOSB/STOSD/STOSW Stores String Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-250
2.221 STR Stores Task Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-252
2.222 SUB Integer Subtraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-253
2.223 TEST Logical Compare . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-254
2.224 VERR/VERW Verifies Segment for Read/Write. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-255
2.225 WAIT Wait . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-256
2.226 WBINVD Writes Back and Invalidates Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-257
2.227 XADD Exchanges and Adds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-258
2.228 XCHG Exchange . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-259
2.229 XLAT/XLATB Table Look-Up Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-260
2.230 XOR Logical Exclusive OR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-261
Appendices
A
General Guidelines for Programming
A.1 General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-1
A.1.1 BIOS Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-1
A.1.2 OS Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-2
A.1.3 Application Software. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-2
A.1.4 Software Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-2
A.2 Basic Programming Model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-2
A.2.1 Operating Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-3
A.2.2 Memory Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-3
A.2.2.1 Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-4
A.2.2.1.1 Simple Memory Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-4
A.2.2.1.2 Partial Segmentation Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-4
A.2.2.1.3 Full Segmentation Implementation . . . . . . . . . . . . . . . . . . . . . . . . . .A-4
A.2.2.2 Paging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-4
A.2.2.3 Selecting a Segmentation Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-5
A.2.2.3.1 Flat Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-5
A.2.2.3.2 Protected Flat Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-6
A.2.2.3.3 Multisegment Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-7
A.2.2.4 Segment Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-8
A.2.2.4.1 Segment Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-10
A.2.2.4.2 Segment Selectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-11
A.2.2.4.3 Segment Descriptors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-12
A.2.2.4.4 Segment Descriptor Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-15
A.2.2.4.5 Descriptor Table Base Registers. . . . . . . . . . . . . . . . . . . . . . . . . . .A-16
A.2.2.5 Page Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-17
A.2.2.5.1 PG Bit Enables Paging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-18
A.2.2.5.2 Linear Address . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-18
A.2.2.5.3 Page Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-19
A.2.2.5.4 Page Table Entries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-20
A.2.2.5.5 Page Frame Address. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-20
A.2.2.5.6 Present Bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-20
A.2.2.5.7 Accessed and Dirty Bits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-21
A.2.2.5.8 Read/Write and User/Supervisor Bits . . . . . . . . . . . . . . . . . . . . . . .A-21
A.2.2.5.9 Page-Level Cache Control Bits . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-21
A.2.2.5.10 Translation Lookaside Buffer (TLB). . . . . . . . . . . . . . . . . . . . . . . .A-21
A.2.2.6 Combining Segment and Page Translation . . . . . . . . . . . . . . . . . . . . . . . . . .A-22
A.2.2.6.1 Flat Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-22
A.2.2.6.2 Segments Spanning Several Pages . . . . . . . . . . . . . . . . . . . . . . . .A-22
A.2.2.6.3 Pages Spanning Several Segments . . . . . . . . . . . . . . . . . . . . . . . .A-23
A.2.2.6.4 Non-Aligned Page and Segment Boundaries . . . . . . . . . . . . . . . . .A-23
A.2.2.6.5 Aligned Page and Segment Boundaries . . . . . . . . . . . . . . . . . . . . .A-23
A.2.2.6.6 Page-Table Per Segment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-23
A.2.3 Internal System Protection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-24
A.2.3.1 Segment-Level Protection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-24
A.2.3.2 Segment Descriptors and Protection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-24
A.2.3.2.1 Type Checking. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-26
A.2.3.2.2 Limit Checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-27
A.2.3.2.3 Privilege Levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-28
A.2.3.3 Restricting Access to Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-29
A.2.3.4 Restricting Control Transfers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-30
A.2.3.5 Gate Descriptors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-32
A.2.3.5.1 Stack Switching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-34
A.2.3.5.2 Returning from a Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-37
A.2.3.6 Instructions Reserved for the Operating System . . . . . . . . . . . . . . . . . . . . . .A-38
A.2.3.6.1 Privileged Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-38
A.2.3.6.2 Sensitive Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-38
A.2.3.7 Instructions for Pointer Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-38
A.2.3.7.1 Descriptor Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-40
A.2.3.7.2 Pointer Integrity and RPL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-40
A.2.3.8 Page-Level Protection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-41
A.2.3.8.1 Page-Table Entries Hold Protection Parameters. . . . . . . . . . . . . . .A-41
A.2.3.8.2 Combining Protection of Both Levels of Page Tables . . . . . . . . . . .A-42
A.2.3.8.3 Overrides to Page Protection . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-43
A.2.3.9 Combining Page and Segment Protection . . . . . . . . . . . . . . . . . . . . . . . . . . .A-43
A.2.4 Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-43
A.2.4.1 Data Types in Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-43
A.2.4.2 Operand Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-46
A.2.5 Application Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-47
A.2.5.1 General Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-49
A.2.5.2 Segment Registers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-49
A.2.5.3 Status and Control Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-52
A.2.5.3.1 Flags Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-53
A.2.5.3.2 Instruction Pointer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-54
A.2.5.4 FPU Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-55
A.2.5.4.1 FPU Register Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-55
A.2.5.4.2 FPU Status and Control Registers . . . . . . . . . . . . . . . . . . . . . . . . .A-56
A.2.5.4.3 Control Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-58
A.2.5.4.4 FPU Tag Word. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-59
A.2.5.4.5 Numeric Instruction and Data Pointers . . . . . . . . . . . . . . . . . . . . . .A-59
A.2.5.4.6 Opcode Field of Last Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . .A-62
A.2.6 Instruction Format. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-62
A.2.6.1 Instruction Prefixes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-66
A.2.6.2 Opcode Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-67
A.2.6.3 Address Specifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-67
A.2.6.4 Immediate Operand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-68
A.2.7 Operand Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-68
A.2.7.1 Immediate Operands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-69
A.2.7.2 Register Operands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-69
A.2.7.3 Memory Operands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-70
A.2.7.3.1 Segment Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-70
A.2.7.3.2 Effective-Address Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-71
A.2.8 Interrupts and Exceptions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-72
A.2.9 Input/Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-74
A.2.9.1 I/O Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-75
A.2.9.1.1 I/O Address Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-75
A.2.9.1.2 Memory-Mapped I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-76
A.2.9.2 I/O Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-77
A.2.9.3 Register I/O Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-77
A.2.9.4 Block I/O Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-77
A.2.9.5 Protection and I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-78
A.2.9.5.1 I/O Privilege Level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-78
A.2.9.5.2 I/O Permission Bit Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-79
A.3 Debugging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-80
A.3.1 Debugging Support. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-81
A.3.2 Debug Registers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-81
A.3.2.1 Debug Address Registers (DR3–DR0) . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-81
A.3.2.2 Debug Control Register (DR7) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-81
A.3.2.3 Debug Status Register (DR6) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-83
A.3.2.4 Breakpoint Field Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-83
A.3.3 Debug Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-84
A.3.3.1 Interrupt 1—Debug Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-84
A.3.3.1.1 Instruction-Breakpoint Fault. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-85
A.3.3.1.2 Data-Breakpoint Trap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-85
A.3.3.1.3 General-Detect Fault . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-85
A.3.3.1.4 Single-Step Trap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-86
A.3.3.1.5 Task-Switch Trap. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-86
A.3.3.2 Interrupt 3—Breakpoint Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-86
A.4 Caching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-86
A.4.1 Introduction to Caching. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-87
A.4.2 Operation of the Internal Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-88
A.4.2.1 Cache Disabling Bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-88
A.4.2.2 Cache Management Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-88
A.4.2.3 Self-Modifying Code. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-88
A.4.3 Page-Level Cache Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-89
A.4.3.1 PCD Bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-89
A.4.3.2 PWT Bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-89
B
Opcode Map
B.1 General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .B-1
B.2 Key to Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .B-1
B.3 Codes for Addressing Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .B-1
B.4 Codes for Operand Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .B-1
B.5 Register Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .B-1
C
Flag Cross-Reference
C.1 Key to Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .C-1
D
Condition Codes
D.1 Condition Codes for Conditional Jump and Set Instructions . . . . . . . . . . . . . . . . . . . . . . . . .D-1
E
Instruction Format and Timing
E.1 Instruction Encoding and Clock Count Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-1
E.2 Factors that Affect Instruction Clock Counts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-1
E.3 General Instruction Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-36
E.4 Encoding of Floating-Point Instruction Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-40
F
Numeric Exception Summary
G
Code Optimization
G.1 Addressing Modes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .G-1
G.2 Prefetch Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .G-2
G.3 Cache and Code Alignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .G-2
G.4 NOP Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .G-3
G.5 Integer Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .G-3
G.6 Condition Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .G-4
G.7 String Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .G-5
G.8 Floating-Point Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .G-5
G.9 Prefix Opcodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .G-6
G.10 Overlapped Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .G-6
G.11 Miscellaneous Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .G-6
H
BIOS Data Area Map
I
Typical CMOS RAM Map
J
Standard I/O Port Addressing
Glossary
LIST OF FIGURES
Figure A-1 Flat Memory Model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-5
Figure A-2 Protected Flat Memory Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-6
Figure A-3 Multisegment Memory Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-7
Figure A-4 TI Bit Selects Descriptor Table. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-9
Figure A-5 Segment Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-10
Figure A-6 Segment Registers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-10
Figure A-7 Segment Selector. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-11
Figure A-8 Segment Descriptor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-12
Figure A-9 Segment Descriptor (Segment Not Present) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-15
Figure A-10 Descriptor Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-16
Figure A-11 Pseudo-Descriptor Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-16
Figure A-12 Linear Address Format. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-18
Figure A-13 Page Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-19
Figure A-14 Page Table Entry Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-20
Figure A-15 Page Table Entry Format for a Not-Present Page . . . . . . . . . . . . . . . . . . . . . . . . .A-20
Figure A-16 Combining Segment and Page Address Translation . . . . . . . . . . . . . . . . . . . . . . .A-22
Figure A-17 Separate Page Tables for Each Segment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-23
Figure A-18 Descriptor Fields Used for Protection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-25
Figure A-19 Protection Rings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-28
Figure A-20 Privilege Check for Data Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-29
Figure A-21 Privilege Check for Control Transfer Without Gate . . . . . . . . . . . . . . . . . . . . . . . .A-31
Figure A-22 Call Gate Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-32
Figure A-23 Call Gate Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-33
Figure A-24 Privilege Check for Control Transfer with Call Gate. . . . . . . . . . . . . . . . . . . . . . . .A-33
Figure A-25 Initial Stack Pointers in a TSS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-35
Figure A-26 Stack Frame During Interlevel CALL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-36
Figure A-27 Protection Holds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-41
Figure A-28 Data Types in Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-43
Figure A-29 Bytes, Words, and Doublewords in Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-44
Figure A-30 Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-45
Figure A-31 Application Register Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-48
Figure A-32 Unsegmented Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-50
Figure A-33 Segmented Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-50
Figure A-34 Stacks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-52
Figure A-35 EFLAGS Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-53
Figure A-36 Am486 Microprocessor FPU Register Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-55
Figure A-37 FPU Status Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-56
Figure A-38 FPU Control Word Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-58
Figure A-39 Tag Word Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-59
Figure A-40 Protected Mode Numeric Instruction and Data Pointer Image in Memory,
32-Bit Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-60
Figure A-41 Real Mode Numeric Instruction and Data Pointer Image in Memory,
32-Bit Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-60
Figure A-42 Protected Mode Numeric Instruction and Data Pointer Image in Memory,
16-Bit Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-61
Figure A-43 Real Mode Numeric Instruction and Data Pointer Image in Memory,
16-Bit Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-61
Figure A-44 Opcode Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-62
Figure A-45 General Instruction Format. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-63
Figure A-46 Floating-Point Instruction Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-64
Figure A-47 mod R/M and s-i-b Byte Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-67
Figure A-48 Effective Address Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-71
Figure A-49 Memory Mapped I/O. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-76
Figure A-50 I/O Permission Bit Map. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-79
Figure A-51 Debug Registers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-82
Figure E-1 General Instruction Format. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-36
Figure E-2 Floating-Point Instruction Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-40
LIST OF TABLES
Table A-1 Application Segment Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-13
Table A-2 System Segment and Gate Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-26
Table A-3 Interlevel Return Checks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-37
Table A-4 Valid Descriptor Types for LSL Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-39
Table A-5 Combined Page Directory and Page Table Protection . . . . . . . . . . . . . . . . . . . . . . .A-42
Table A-6 Real Number Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-47
Table A-7 Register Names. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-49
Table A-8 Status Flags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-53
Table A-9 Condition Code Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-57
Table A-10 Correspondence between FPU Flags and Processor Flag Bits . . . . . . . . . . . . . . .A-57
Table A-11 Address Mode Field (mod/rm) Definitions (no s-i-b present). . . . . . . . . . . . . . . . . .A-64
Table A-12 Scale Field (ss) Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-65
Table A-13 Index Field (index) Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-65
Table A-14 Base Field (base) Definitions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-66
Table A-15 Default Segment Selection Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-70
Table A-16 Exceptions and Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-73
Table A-17 Breakpoint Examples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-84
Table A-18 Debug Exception Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-84
Table A-19 Cache Operating Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-88
Table E-1 Instruction Clock Count Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-2
Table E-2 Instruction Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-36
Table E-3 Operand Length Field (w) Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-37
Table E-4 Direction Field (d) Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-37
Table E-5 Sign-Extend Field (s) Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-37
Table E-6 General Register Field (reg) Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-37
Table E-7 Address Mode Field (mod/rm) Definitions (no s-i-b present). . . . . . . . . . . . . . . . . . .E-38
Table E-8 Scale Field (ss) Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-39
Table E-9 Index Field (index) Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-39
Table E-10 Base Field (base) Definitions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-39
Table F-1 Exception Summary for Floating-Point Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . F-1
Table H-1 BIOS Map Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .H-1
Table I-1 Example CMOS RAM Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . I-1
Table J-1 Standard I/O Port Addresses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . J-1
INTRODUCTION
The Am486® Microprocessor Software User’s Manual is designed to support system software engineers developing BIOS and application software for use with products from the
Am486 microprocessor family. Because, typically, such engineers are already familiar with
basic personal computer system programming requirements, this book focuses on providing information about the basic processor instruction set and the programmable registers.
Each chapter begins with an overview diagram of registers or instructions organized by
operational category with cross-references to the detailed description page for each item.
The detailed descriptions are listed alphabetically on the subsequent pages in the chapter.
Supplementary information is provided in Appendices A through J. A glossary of terms is
included after the appendices. For convenience, a basic ASCII cross-reference is on the
inside back cover of this manual.
CHAPTER 1
Am486 MICROPROCESSOR REGISTER SET
1.1 OVERVIEW
The Am486 microprocessor register set follows the same basic system architecture as
other 486-based microprocessors. Page 1-2 provides a roadmap to these registers using
functional categories. For each register, the roadmap lists the page on which the detailed
register description appears. In the detailed description section that follows the Am486
microprocessor register roadmap, the registers appear in alphabetical order using the name
listed in the roadmap.
1.2 DETAILED REGISTER DESCRIPTIONS
Register descriptions begin on page 1-3, using the following format:
Register Name/s
General Description (Bit Size)
Bit(s)   Bit Set Name   Description
nn–xx    XXX            Function
Addressing
Description of register addressing method.
Default Value
Factory/default register setting.
Functional Description
Verbal description of register function by bit or bit set.
Note: Standard compiler programs convert the register names into opcodes. This chapter
references the registers by name. Appendix E includes the opcodes used to address the
registers as part of the ‘Instruction Format and Timing’ descriptions.
Am486 Microprocessor Register Set
1-1
AMD
Am486 Microprocessor Register Roadmap

General
Register   Page
AH         1-3
AL         1-4
AX         1-5
BH         1-6
BL         1-7
BP         1-8
BX         1-9
CH         1-10
CL         1-11
CX         1-18
DH         1-19
DI         1-20
DL         1-21
DX         1-32
EAX        1-33
EBP        1-34
EBX        1-35
ECX        1-36
EDI        1-37
EDX        1-38
ESI        1-42
ESP        1-43
SI         1-58
SP         1-59

Segment
Register   Page
CS         1-17
DS         1-31
ES         1-41
FS         1-51
GS         1-53
SS         1-60

Memory Management
Register   Page
GDTR       1-52
IDTR       1-54
LDTR       1-56
TR         1-61

Status and Control
Register   Page
CR0        1-12
CR1        1-14
CR2        1-15
CR3        1-16
EFLAGS     1-39
EIP        1-40
FLAGS      1-44
IP         1-55

Debug
Register   Page
DR0        1-22
DR1        1-23
DR2        1-24
DR3        1-25
DR4        1-26
DR5        1-27
DR6        1-28
DR7        1-29

Test
Register   Page
TR3        1-62
TR4        1-63
TR5        1-64
TR6        1-65
TR7        1-66

FPU
Register   Page
FPUCR      1-45
FPUDP      1-46
FPUIP      1-47
FPUSR      1-48
FPUTWR     1-50
R0         1-57
R1         1-57
R2         1-57
R3         1-57
R4         1-57
R5         1-57
R6         1-57
R7         1-57
1.3 AH
Processor General Register (8 bits)
Bit(s)   Bit Set Name   Description
7–0      AH Register    Processor general register, high byte of AX.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 8-bit general processor registers used to hold 8-bit operands for logical
and arithmetic operations.
1.4 AL
Processor General Register (8 bits)
Bit(s)   Bit Set Name   Description
7–0      AL Register    Processor general register, low byte of AX.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 8-bit general processor registers used to hold 8-bit operands for logical
and arithmetic operations.
1.5 AX
Processor General Register (16 bits)
Bit(s)   Bit Set Name   Description
15–0     AX Register    Processor general register, low word of EAX; see also AL, AH.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 16-bit general processor registers used to hold 16-bit operands for logical
and arithmetic operations.
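The relationship among EAX, AX, AH, and AL can be illustrated with a short C sketch. It is an illustration only: the function names are hypothetical, and the code models the registers as fields of an ordinary 32-bit value rather than accessing processor registers.

```c
#include <stdint.h>

/* Hypothetical helpers: treat EAX as a plain 32-bit value and extract
   the sub-registers described in this chapter. */

/* AX is the low word (bits 15-0) of EAX. */
uint16_t low_word(uint32_t eax)
{
    return (uint16_t)(eax & 0xFFFFu);
}

/* AL is the low byte (bits 7-0) of AX. */
uint8_t low_byte(uint16_t ax)
{
    return (uint8_t)(ax & 0xFFu);
}

/* AH is the high byte (bits 15-8) of AX. */
uint8_t high_byte(uint16_t ax)
{
    return (uint8_t)((ax >> 8) & 0xFFu);
}

/* Writing AH replaces bits 15-8 of AX and leaves AL unchanged. */
uint16_t set_high_byte(uint16_t ax, uint8_t ah)
{
    return (uint16_t)((ax & 0x00FFu) | ((uint16_t)ah << 8));
}
```

The same pattern applies to the BX, CX, and DX registers and their high and low bytes.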
1.6 BH
Processor General Register (8 bits)
Bit(s)   Bit Set Name   Description
7–0      BH Register    Processor general register, high byte of BX.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 8-bit general processor registers used to hold 8-bit operands for logical
and arithmetic operations.
1.7 BL
Processor General Register (8 bits)
Bit(s)   Bit Set Name   Description
7–0      BL Register    Processor general register, low byte of BX.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 8-bit general processor registers used to hold 8-bit operands for logical
and arithmetic operations.
1.8 BP
Processor General Register/Base Pointer (16 bits)
Bit(s)   Bit Set Name   Description
15–0     BP Register    Processor general register, base pointer register.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 16-bit general processor registers used to hold 16-bit operands for logical
and arithmetic operations. When using 16-bit addressing, you can copy the stack pointer
(SP — see page 1-59) into BP before pushing anything onto the stack, and access data
structures using fixed offsets from the BP value.
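The following C sketch models the base-pointer technique described above. It is a simplified simulation, not processor code: the Stack16 type and its functions are hypothetical names, and the model assumes a 256-byte stack area with words stored low byte first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: a small stack area plus 16-bit SP and BP values. */
typedef struct {
    uint8_t  mem[256];  /* stack contents */
    uint16_t sp;        /* stack pointer (offset into mem) */
    uint16_t bp;        /* base pointer (offset into mem) */
} Stack16;

void stack_init(Stack16 *s)
{
    memset(s->mem, 0, sizeof s->mem);
    s->sp = (uint16_t)sizeof s->mem;  /* empty stack: SP just past the top */
    s->bp = 0;
}

/* Push a word: decrement SP by 2, then store the word low byte first. */
void push16(Stack16 *s, uint16_t value)
{
    assert(s->sp >= 2);
    s->sp = (uint16_t)(s->sp - 2);
    s->mem[s->sp]     = (uint8_t)(value & 0xFFu);
    s->mem[s->sp + 1] = (uint8_t)(value >> 8);
}

/* Read the word at a fixed signed offset from BP. */
uint16_t word_at_bp(const Stack16 *s, int offset)
{
    int addr = (int)s->bp + offset;
    assert(addr >= 0 && addr + 1 < (int)sizeof s->mem);
    return (uint16_t)(s->mem[addr] | (s->mem[addr + 1] << 8));
}
```

If SP is copied into BP before two words are pushed, the first word is found at offset -2 from BP and the second at offset -4, regardless of how far SP moves afterward.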
1.9 BX
Processor General Register (16 bits)
Bit(s)   Bit Set Name   Description
15–0     BX Register    Processor general register, low word of EBX; see also BH and BL.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 16-bit general processor registers used to hold 16-bit operands for logical
and arithmetic operations.
1.10 CH
Processor General Register (8 bits)
Bit(s)   Bit Set Name   Description
7–0      CH Register    Processor general register, high byte of CX.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 8-bit general processor registers used to hold 8-bit operands for logical
and arithmetic operations.
1.11 CL
Processor General Register (8 bits)
Bit(s)   Bit Set Name   Description
7–0      CL Register    Processor general register, low byte of CX.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 8-bit general processor registers used to hold 8-bit operands for logical
and arithmetic operations.
1.12 CR0
Control Register 0 (32 bits)
Bit(s)   Bit Set Name   Description
31       PG             0 = Paging disabled.
                        1 = Paging enabled.
30       CD             0 = Internal cache enabled.
                        1 = Internal cache disabled.
29       NW             0 = Enables write-throughs and invalidation cycles.
                        1 = Disables write-throughs and invalidation cycles.
28–19    N/A            Reserved
18       AM             0 = Alignment checking disabled.
                        1 = Alignment checking allowed.
17       N/A            Reserved
16       WP             0 = Supervisor process can write read-only user-level pages.
                        1 = User-level pages write-protected against supervisor-mode access.
15–6     N/A            Reserved
5        NE             0 = Standard numeric error reporting disabled.
                        1 = Standard numeric error reporting enabled.
4        ET             0 = No 387 coprocessor support.
                        1 = 387 coprocessor support.
3        TS             0 = No task switch since last clear.
                        1 = Task switched.
2        EM             0 = No emulation.
                        1 = Numeric emulation.
1        MP             0 = No coprocessor.
                        1 = Coprocessor present.
0        PE             0 = No protection.
                        1 = Segment-level protection.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
CR0 configures several system-level controls, as follows:
■ PG (bit 31) enables paging when set and disables paging when clear.
■ CD (bit 30) enables the internal cache when clear and disables the cache when set.
When the bit is set, cache misses do not cause cache line fills. Cache hits are not disabled;
you must flush the cache to disable it completely.
■ NW (bit 29) enables write-throughs and cache invalidation cycles when clear. When set,
it disables invalidation cycles and write-throughs that hit in the cache. Disabling
write-throughs can allow stale data to appear in the cache.
■ AM (bit 18) allows alignment checking when set and disables alignment checking when
clear. Alignment checking occurs only when this bit is set, the AC flag is set, and CPL
is 3 (user mode).
■ WP (bit 16) write-protects user-level pages against supervisor-mode access when set.
When clear, a supervisor process can write read-only user-level pages. Setting WP is useful
for implementing the copy-on-write method of creating a new process (forking) used by
some operating systems, such as UNIX.
■ NE (bit 5) enables the standard mechanism for reporting floating-point errors when set.
When NE is clear and the IGNNE input is active, numeric errors are ignored. When NE
is clear and IGNNE is inactive, a numeric error causes the processor to stop and wait for
an interrupt from the FERR pin.
■ ET (bit 4) is set to support 387 coprocessor functions.
■ TS (bit 3) is set whenever a task switch occurs. The processor checks this bit when
interpreting floating-point arithmetic instructions so that saving and restoring the numeric
context can be delayed until the numeric data is actually used. The CLTS instruction clears
this bit.
■ EM (bit 2), when set, is used along with TS to generate a coprocessor-not-available
exception (exception 7) when a WAIT or numeric instruction is executed. When clear, the
bit does not cause the exception.
■ MP (bit 1) indicates, when set, that a coprocessor is present. When clear, the
floating-point capability is not present.
■ PE (bit 0) enables segment-level protection when set. Clearing this bit removes the
protection.
The remaining bits are undefined and reserved.
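The bit positions in this entry translate directly into mask constants. The C sketch below is one possible way to decode a CR0 value that system software has already read. The macro and function names are hypothetical, and the code only inspects an ordinary 32-bit value; it does not read or write the register itself.

```c
#include <stdint.h>

/* CR0 bit masks, taken from the bit positions listed in this entry. */
#define CR0_PE (1u << 0)   /* segment-level protection */
#define CR0_MP (1u << 1)   /* coprocessor present */
#define CR0_EM (1u << 2)   /* numeric emulation */
#define CR0_TS (1u << 3)   /* task switched */
#define CR0_ET (1u << 4)   /* 387 coprocessor support */
#define CR0_NE (1u << 5)   /* standard numeric error reporting */
#define CR0_WP (1u << 16)  /* write protect */
#define CR0_AM (1u << 18)  /* alignment checking allowed */
#define CR0_NW (1u << 29)  /* write-throughs and invalidations disabled */
#define CR0_CD (1u << 30)  /* internal cache disabled */
#define CR0_PG (1u << 31)  /* paging enabled */

/* Hypothetical helper: nonzero when the internal cache is enabled
   (CD clear). */
int cr0_cache_enabled(uint32_t cr0)
{
    return (cr0 & CR0_CD) == 0;
}

/* Hypothetical helper: alignment checking occurs only when AM is set,
   the AC flag is set, and CPL is 3. */
int alignment_check_active(uint32_t cr0, int ac_flag, int cpl)
{
    return (cr0 & CR0_AM) && ac_flag && cpl == 3;
}
```

Keeping each control as a named mask lets a routine test or change one bit without disturbing the reserved bits.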
1.13 CR1
Control Register 1 (32 bits)
Bit(s)   Bit Set Name   Description
31–0     CR1            Reserved
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
Register reserved.
1.14 CR2
Control Register 2 (32 bits)
Bit(s)   Bit Set Name   Description
31–0     CR2            Page fault linear address
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
When an exception occurs during paging, CR2 stores the 32-bit linear address that caused
the exception.
1.15 CR3
Control Register 3 (32 bits)
Bit(s)   Bit Set Name   Description
31–12    PDBR           Page directory base register. Contains the 20 most significant bits
                        of the page directory (first-level page table) address.
11–5     N/A            Reserved
4        PCD            Page-level cache disable bit. 1 = Caching disabled; 0 = Caching enabled.
3        PWT            Page-level writes transparent. 1 = Write-through to external cache
                        enabled; 0 = Write-through disabled.
2–0      N/A            Reserved
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
CR3 configures some of the page-level controls, as follows:
■ PDBR (bits 31–12) is the page directory table address system control register. It contains
the 20 most significant bits of the page directory (first-level page table) address. Because
the page directory must be aligned to a page boundary, the lower 12 address bits are
ignored.
■ PCD (bit 4) is driven on the PCD pin during bus cycles that are not paged, such as
interrupt acknowledge cycles, when paging is enabled. It is driven on all bus cycles when
paging is not enabled. The PCD pin is one of the write-through cache controls for external
cache and is used on a cycle-by-cycle basis.
■ PWT (bit 3) is driven on the PWT pin during bus cycles that are not paged, such as
interrupt acknowledge cycles, when paging is enabled. It is driven on all bus cycles when
paging is not enabled. The PWT pin is one of the write-through cache controls for external
cache and is used on a cycle-by-cycle basis.
Bits 11–5 and 2–0 are undefined and are reserved.
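The CR3 field layout can be expressed as a pair of small C helpers. This is a minimal sketch with hypothetical names; it composes and decodes a 32-bit value only and does not load the register.

```c
#include <stdint.h>

#define CR3_PWT       (1u << 3)     /* page-level writes transparent */
#define CR3_PCD       (1u << 4)     /* page-level cache disable */
#define CR3_PDBR_MASK 0xFFFFF000u   /* bits 31-12: page directory base */

/* Hypothetical helper: recover the page directory address from a CR3
   value. The lower 12 bits are always zero because the page directory
   is aligned to a page boundary. */
uint32_t cr3_page_directory_base(uint32_t cr3)
{
    return cr3 & CR3_PDBR_MASK;
}

/* Hypothetical helper: build a CR3 value. The lower 12 bits of the
   supplied address are dropped, and the reserved bits stay clear. */
uint32_t cr3_make(uint32_t page_directory_address, int pcd, int pwt)
{
    uint32_t value = page_directory_address & CR3_PDBR_MASK;
    if (pcd)
        value |= CR3_PCD;
    if (pwt)
        value |= CR3_PWT;
    return value;
}
```

For example, an address of 0x00123456 with PCD set and PWT clear yields 0x00123010: the low 12 address bits are discarded and only bit 4 is added.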
1.16 CS
Code Segment Register (16 bits)
Bit(s)   Bit Set Name   Description
15–0     CS             Code segment register. Holds the segment selector for the code segment,
                        the area of memory containing the instructions being executed.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
The processor organizes memory into segments as one of the possible ways to access
memory. Six segments (regions of memory) are accessible at a time through the segment
registers. Each register holds the segment selector from which the processor derives the
base address of its segment. The segment containing the instructions being executed is
called the code segment. Its segment selector is stored in the CS register. The processor
fetches instructions from the code segment, using the contents of the EIP or IP register
as an offset into the segment. The CS register value changes as a result of interrupts,
exceptions, and instructions that transfer control between segments (see CALL, IRET,
and JMP instructions).
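As a worked illustration of using IP as an offset into the code segment, the C sketch below computes an instruction fetch address. It assumes real-address mode, where the segment base is the 16-bit CS value shifted left four bits; in protected mode the base comes from a segment descriptor instead, which this sketch does not model. The function names are hypothetical.

```c
#include <stdint.h>

/* Real-address mode assumption: segment base = selector * 16. */
uint32_t real_mode_segment_base(uint16_t selector)
{
    return (uint32_t)selector << 4;
}

/* Hypothetical helper: address of the next instruction fetch, formed
   from the code segment base plus the IP offset. */
uint32_t instruction_fetch_address(uint16_t cs, uint16_t ip)
{
    return real_mode_segment_base(cs) + ip;
}
```

With CS = 0x1234 and IP = 0x0010, the segment base is 0x12340 and the fetch address is 0x12350.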
1.17 CX
Processor General Register (16 bits)
Bit(s)   Bit Set Name   Description
15–0     CX Register    Processor general register, low word of ECX; see also CL, CH.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 16-bit general processor registers used to hold 16-bit operands for logical
and arithmetic operations.
1.18 DH
Processor General Register (8 bits)
Bit(s)   Bit Set Name   Description
7–0      DH Register    Processor general register, high byte of DX.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 8-bit general processor registers used to hold 8-bit operands for logical
and arithmetic operations.
1.19 DI
Processor General Register/Destination Index (16 bits)
Bit(s)   Bit Set Name   Description
15–0     DI             Processor general register; used as a destination index for
                        string operations.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight 16-bit general processor registers used to hold 16-bit operands for logical
and arithmetic operations. For string operations, the DI register points to destination
operands and increments or decrements between operations, depending on the DF setting in
the EFLAGS register (see page 1-39). The DI register can only point to operands in the
memory space specified by the ES segment register.
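The stepping behavior of DI during a string operation can be modeled in C. The sketch below is a simplified simulation of a repeated byte store, under the assumption that DI increments when DF is 0 and decrements when DF is 1. The function name and the array standing in for the ES segment are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a repeated byte-store string operation.
   Writes `value` to es_segment[di] `count` times, stepping DI up
   when df is 0 and down when df is nonzero. Returns the final DI. */
uint16_t store_bytes(uint8_t *es_segment, size_t segment_size,
                     uint16_t di, uint8_t value, unsigned count, int df)
{
    while (count-- > 0) {
        assert(di < segment_size);      /* stay inside the modeled segment */
        es_segment[di] = value;
        di = (uint16_t)(df ? di - 1 : di + 1);
    }
    return di;
}
```

After each store, DI already points at the next destination operand, so consecutive string operations fill memory without any explicit pointer arithmetic.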
1.20 DL
Processor General Register (8 bits)
Bit(s)   Bit Set Name   Description
7–0      DL Register    Processor general register, low byte of DX.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 8-bit general processor registers used to hold 8-bit operands for logical
and arithmetic operations.
1.21 DR0
Linear Breakpoint Address 0 Debug Register (32 bits)
Bit(s)   Bit Set Name   Description
31–0     DR0            Stores the address of a debug breakpoint.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
Access the built-in debugging features of the microprocessor through the eight Debug
Registers. The linear breakpoint address registers (DR0 to DR3) store addresses for as
many as four breakpoints. These breakpoints invoke debugging software. Whenever an
operation accesses one of these addresses, it generates an exception that initiates the
referenced debugging subroutine. You must specify the form of memory access that triggers
the breakpoint; for example, select an instruction fetch or a doubleword write operation.
The debug registers support instruction breakpoints and data breakpoints.
1.22
DR1
Linear Breakpoint Address 1 Debug Register
Bit(s)
Bit Set Name
Description
31–0
DR1
Stores the address of a debug breakpoint.
32 bits
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
Access the built-in debugging features of the microprocessor through the eight Debug
Registers. The linear breakpoint address registers (DR0 to DR3) store addresses for as
many as four breakpoints. These breakpoints invoke debugging software. Whenever an
operation accesses one of these addresses, it generates an exception that initiates the
referenced debugging subroutine. You must specify the form of memory access that triggers
the breakpoint; for example, select an instruction fetch or a doubleword write operation.
The debug registers support instruction breakpoints and data breakpoints.
1.23
DR2
Linear Breakpoint Address 2 Debug Register
Bit(s)
Bit Set Name
Description
31–0
DR2
Stores the address of a debug breakpoint.
32 bits
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
Access the built-in debugging features of the microprocessor through the eight Debug
Registers. The linear breakpoint address registers (DR0 to DR3) store addresses for as
many as four breakpoints. These breakpoints invoke debugging software. Whenever an
operation accesses one of these addresses, it generates an exception that initiates the
referenced debugging subroutine. You must specify the form of memory access that triggers
the breakpoint; for example, select an instruction fetch or a doubleword write operation.
The debug registers support instruction breakpoints and data breakpoints.
1.24
DR3
Linear Breakpoint Address 3 Debug Register
Bit(s)
Bit Set Name
Description
31–0
DR3
Stores the address of a debug breakpoint.
32 bits
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
Access the built-in debugging features of the microprocessor through the eight Debug
Registers. The linear breakpoint address registers (DR0 to DR3) store addresses for as
many as four breakpoints. These breakpoints invoke debugging software. Whenever an
operation accesses one of these addresses, it generates an exception that initiates the
referenced debugging subroutine. You must specify the form of memory access that triggers
the breakpoint; for example, select an instruction fetch or a doubleword write operation.
The debug registers support instruction breakpoints and data breakpoints.
1.25
DR4
Debug Register 4
Bit(s)
Bit Set Name
Description
31–0
DR4
Reserved
32 bits
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
Not currently used.
1.26
DR5
Debug Register 5
Bit(s)
Bit Set Name
Description
31–0
DR5
Reserved
32 bits
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
Not currently used.
1.27
DR6
Breakpoint Status Debug Register
32 bits
Bit(s)
Bit Set Name
Description
31–16
N/A
Reserved, always 0000 0000 0000 0000
15
BT
0 = Default, no setting condition detected.
1 = Switch to task with TSS that has debug trap bit (T) set.
14
BS
0 = Default, no setting condition detected.
1 = Trap flag (TF) set.
13
BD
0 = Default, no setting condition detected.
1 = Next instruction reads or writes a debug register that is
in use by in-circuit emulation.
12–4
N/A
Reserved, always 0 0000 0000
3
B3
0 = No debug exception generated for breakpoint 3.
1 = Debug exception generated for breakpoint 3.
2
B2
0 = No debug exception generated for breakpoint 2.
1 = Debug exception generated for breakpoint 2.
1
B1
0 = No debug exception generated for breakpoint 1.
1 = Debug exception generated for breakpoint 1.
0
B0
0 = No debug exception generated for breakpoint 0.
1 = Debug exception generated for breakpoint 0.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
The Breakpoint Status Debug Register stores the current breakpoint exception status. If
an exception occurs, read this register to determine which breakpoint caused the exception,
or whether one of three other possible triggering events occurred.
■ The BT bit (15) indicates that the exception was generated by switching to a task for
which the TSS had the T bit (debug trap) set.
■ The BS bit (14) indicates that the exception was a single-step trap, generated because
the trap flag (TF) in the EFLAGS register is set.
■ The BD bit (13) indicates that the debug registers are in use by in-circuit emulation and
that the next instruction writes to or reads from one of the registers.
■ B3 (bit 3), B2 (bit 2), B1 (bit 1), and B0 (bit 0) specify, when set, that the corresponding
breakpoint exception occurred.
Note: The processor never clears the contents of DR6. When writing a debug handler
routine, always make sure that the program clears DR6 before returning.
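The status bits in the table above can be decoded with a few mask operations. The following Python sketch is only an illustration: the names `DR6_FLAGS` and `decode_dr6` are hypothetical, and the function works on a 32-bit value that a debug handler is assumed to have already copied out of the Breakpoint Status Debug Register.

```python
# Bit position -> bit set name, taken from the table above.
DR6_FLAGS = {
    15: "BT",  # switch to a task whose TSS has the debug trap (T) bit set
    14: "BS",  # trap flag (TF) set
    13: "BD",  # next instruction accesses a debug register in use by ICE
    3: "B3",
    2: "B2",
    1: "B1",
    0: "B0",
}


def decode_dr6(value):
    """Return the names of the status bits set in value, highest bit first."""
    return [name
            for bit, name in sorted(DR6_FLAGS.items(), reverse=True)
            if value & (1 << bit)]
```

A handler written along these lines would inspect the returned names to decide which breakpoint or trap condition to service, then write zero back to the register before returning.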
1.28
DR7
Breakpoint Control Debug Register
32 bits
Bit(s)
Bit Set Name
Description
31–30
LEN3
00 = Breakpoint 3 is one byte.
01 = Breakpoint 3 is word (two bytes).
10 = Reserved, undefined.
11 = Breakpoint 3 is doubleword (four bytes).
29–28
R/W3
00 = Breakpoint 3 breaks on instruction execution only.
01 = Breakpoint 3 breaks on data writes only.
10 = Reserved, undefined.
11 = Breakpoint 3 breaks on data reads or writes, but not instructions.
27–26
LEN2
00 = Breakpoint 2 is one byte.
01 = Breakpoint 2 is word (two bytes).
10 = Reserved, undefined.
11 = Breakpoint 2 is doubleword (four bytes).
25–24
R/W2
00 = Breakpoint 2 breaks on instruction execution only.
01 = Breakpoint 2 breaks on data writes only.
10 = Reserved, undefined.
11 = Breakpoint 2 breaks on data reads or writes, but not instructions.
23–22
LEN1
00 = Breakpoint 1 is one byte.
01 = Breakpoint 1 is word (two bytes).
10 = Reserved, undefined.
11 = Breakpoint 1 is doubleword (four bytes).
21–20
R/W1
00 = Breakpoint 1 breaks on instruction execution only.
01 = Breakpoint 1 breaks on data writes only.
10 = Reserved, undefined.
11 = Breakpoint 1 breaks on data reads or writes, but not instructions.
19–18
LEN0
00 = Breakpoint 0 is one byte.
01 = Breakpoint 0 is word (two bytes).
10 = Reserved, undefined.
11 = Breakpoint 0 is doubleword (four bytes).
17–16
R/W0
00 = Breakpoint 0 breaks on instruction execution only.
01 = Breakpoint 0 breaks on data writes only.
10 = Reserved, undefined.
11 = Breakpoint 0 breaks on data reads or writes, but not instructions.
15–10
N/A
Reserved, always 0000 00
9
GE
Global enable, not used.
8
LE
Local enable, not used.
7
G3
0 = Global disable of breakpoint 3.
1 = Global enable of breakpoint 3.
6
L3
0 = Local disable of breakpoint 3.
1 = Local enable of breakpoint 3.
5
G2
0 = Global disable of breakpoint 2.
1 = Global enable of breakpoint 2.
4
L2
0 = Local disable of breakpoint 2.
1 = Local enable of breakpoint 2.
3
G1
0 = Global disable of breakpoint 1.
1 = Global enable of breakpoint 1.
2
L1
0 = Local disable of breakpoint 1.
1 = Local enable of breakpoint 1.
1
G0
0 = Global disable of breakpoint 0.
1 = Global enable of breakpoint 0.
0
L0
0 = Local disable of breakpoint 0.
1 = Local enable of breakpoint 0.
Addressing
Specify by name as instruction operand.
Default Value
00000000h
Functional Description
The Breakpoint Control Debug Register (DR7) configures the breakpoints. For each breakpoint,
the High word of the register defines the type of access that triggers the breakpoint
(R/W3, R/W2, R/W1, and R/W0) and the length of the monitored location (LEN3, LEN2, LEN1, and LEN0).
Note: For each LENn and R/Wn pair, if the breakpoint is defined as an instruction breakpoint
(R/Wn = 00), set LENn = 00. The instruction break is only defined for byte lengths; the
operation of an instruction break with any other length is undefined.
The lowest byte of DR7 enables or disables each breakpoint at either of two levels:
global (G3, G2, G1, G0) or local (L3, L2, L1, L0). If a breakpoint is enabled at the global
level (Gn = 1), it is enabled for all tasks. If a breakpoint is disabled at the global level,
it can still be enabled for a single task with the local enable bit Ln. This acts as a temporary
enable that exists while the specified task runs. When a task switch occurs, the processor clears the
Ln enable bit for the associated breakpoint.
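The field layout in the DR7 table can be turned into a small encoder. The sketch below is a hedged illustration in Python: `dr7_set_breakpoint`, `LEN_CODES`, and `RW_CODES` are hypothetical names, and the function only computes the 32-bit value; moving that value into the register is left to privileged code that is not shown.

```python
LEN_CODES = {1: 0b00, 2: 0b01, 4: 0b11}          # LENn encodings (10 is reserved)
RW_CODES = {"execute": 0b00, "write": 0b01, "read_write": 0b11}  # R/Wn encodings


def dr7_set_breakpoint(dr7, n, access, length=1, local=True, global_=False):
    """Return dr7 with breakpoint n (0-3) reconfigured as requested."""
    if n not in range(4):
        raise ValueError("breakpoint number must be 0-3")
    if access == "execute" and length != 1:
        # Instruction breakpoints are only defined for LENn = 00.
        raise ValueError("instruction breakpoints require length 1")
    shift = 16 + 4 * n                   # R/Wn sits at bits shift+1..shift
    dr7 &= ~(0xF << shift)               # clear the old R/Wn and LENn fields
    dr7 &= ~(0b11 << (2 * n))            # clear the old Ln and Gn bits
    dr7 |= RW_CODES[access] << shift
    dr7 |= LEN_CODES[length] << (shift + 2)
    if local:
        dr7 |= 1 << (2 * n)              # Ln
    if global_:
        dr7 |= 1 << (2 * n + 1)          # Gn
    return dr7 & 0xFFFFFFFF
```

For example, a locally enabled doubleword write breakpoint in slot 0 sets R/W0 = 01, LEN0 = 11, and L0 = 1. Rejecting a non-byte length for instruction breakpoints mirrors the note above about undefined behavior.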
1.29
DS
Data Segment Register
16 bits
Bit(s)
Bit Set Name
Description
15–0
DS
A segment register that holds the selector for one of the four data
segments of memory available to the program currently executing.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
The processor organizes memory into segments as one of the possible ways to access
memory. There are six segments (regions of memory) accessed through the segment
registers. Each register stores the selector that identifies its segment; in Real mode, the segment base address is the register value shifted left by four bits. There are four data segments that can contain data used by a program being executed. The segment selectors
for these segments are stored in the DS, ES, FS, and GS registers. The
processor fetches data from a data segment, using an offset into the segment. The data
segment register value changes as a result of interrupts, exceptions, and instructions that
transfer control between segments (see CALL, IRET, and JMP instructions).
1.30
DX
Processor General Register
16 bits
Bit(s)
Bit Set Name
Description
15–0
DX Register
Processor general register, Low word of EDX; see also DL, DH.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 16-bit general processor registers used to hold 16-bit operands for logical
and arithmetic operations.
1.31
EAX
Processor General Register
Bit(s)
Bit Set Name
Description
31–0
EAX
Processor general register; see also AX.
32 bits
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 32-bit general processor registers used to hold 32-bit operands for logical
and arithmetic operations.
1.32
EBP
Processor General Register — Base Pointer
32 bits
Bit(s)
Bit Set Name
Description
31–0
EBP
Processor general register; base pointer register; see also BP.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 32-bit general processor registers used to hold 32-bit operands for logical
and arithmetic operations. When using 32-bit addressing, copy the stack pointer (ESP —
see page 1-43) into EBP before pushing anything onto the stack, and access data structures
using fixed offsets from the EBP value.
1.33
EBX
Processor General Register
Bit(s)
Bit Set Name
Description
31–0
EBX
Processor general register; see also BX.
32 bits
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 32-bit general processor registers used to hold 32-bit operands for logical
and arithmetic operations.
1.34
ECX
Processor General Register
Bit(s)
Bit Set Name
Description
31–0
ECX
Processor general register; see also CX.
32 bits
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 32-bit general processor registers used to hold 32-bit operands for logical
and arithmetic operations.
1.35
EDI
Processor General Register — Data Index
32 bits
Bit(s)
Bit Set Name
Description
31–0
EDI
Processor general register; data index register, used as a 32-bit destination index for string operations; see also DI.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 32-bit general processor registers used to hold 32-bit operands for logical
and arithmetic operations. For string operations, the EDI register points to destination
operands and increments or decrements between operations, depending on the DF setting
in the EFLAGS register (see page 1-39). The EDI register can only point to operands in
the memory space specified by the ES segment register.
1.36
EDX
Processor General Register
Bit(s)
Bit Set Name
Description
31–0
EDX
Processor general register; see also DX.
32 bits
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 32-bit general processor registers used to hold 32-bit operands for logical
and arithmetic operations.
1.37
EFLAGS
Extended Flags Register
32 bits
Bit(s)
Bit Set Name
Description
31–19
N/A
Reserved, always 0000 0000 0000 0
18
AC
0 = Alignment Check mode not enabled.
1 = Alignment Check mode enabled.
17
VM
0 = Normal processing mode.
1 = Virtual-8086 mode.
16
RF
0 = Normal operation.
1 = Debug exceptions disabled to allow debugger program to run
without causing another exception, Resume Flag set.
15
N/A
Reserved, always 0
14
NT
0 = Current task is not nested below another task.
1 = Current task is nested below another task.
13–12
IOPL
00 = Highest I/O access privilege level; typically operating system.
01 = Second highest I/O access privilege level; system services.
10 = Third highest I/O access privilege level; system services.
11 = Lowest I/O access privilege level; application software.
11
OF
0 = Arithmetic result within limits.
1 = Arithmetic result not in positive/negative range, Overflow Flag set.
10
DF
0 = Forward direction, addressing increments.
1 = Backward direction, addressing decrements, Direction Flag set.
9
IF
0 = Maskable interrupts disabled.
1 = Maskable interrupts enabled, Interrupt Flag set.
8
TF
0 = Normal operation.
1 = Trap Flag set, processor enters single-step mode for debugging;
each instruction generates a debug exception.
7
SF
0 = Arithmetic result is not negative (≥0); sign is +.
1 = Arithmetic result is negative (<0); Sign Flag set, sign is –.
6
ZF
0 = Arithmetic result is not zero.
1 = Arithmetic result is zero, Zero Flag set.
5
N/A
Reserved, always 0
4
AF
0 = No BCD carry.
1 = BCD carry from bit position 3, Auxiliary Flag set.
3
N/A
Reserved, always 0
2
PF
0 = Result Low byte has odd parity.
1 = Result Low byte has even parity, Parity Flag set.
1
N/A
Reserved, always 1
0
CF
0 = No carry from MSB of result.
1 = Carry from MSB of result, Carry Flag set.
Addressing
Specify by bit/set names or by using the special flag instructions (BT, BTR, BTS, CLC, CLD,
LAHF, POPF, POPFD, PUSHF, PUSHFD, SAHF, STC, STD, STI) described in Chapter 2.
Default Value
00000002h
Functional Description
The 32-bit EFLAGS register has system flags, status flags, and a control flag.
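To show how the bit assignments in the table above fit together, here is a minimal Python sketch that splits a 32-bit EFLAGS image into the names of the set flags and the two-bit IOPL field. The names `EFLAGS_BITS` and `decode_eflags` are hypothetical, and the input is assumed to be a value already saved to memory (for example with PUSHFD).

```python
# Single-bit flags from the table above, keyed by bit position.
EFLAGS_BITS = {
    18: "AC", 17: "VM", 16: "RF", 14: "NT", 11: "OF", 10: "DF",
    9: "IF", 8: "TF", 7: "SF", 6: "ZF", 4: "AF", 2: "PF", 0: "CF",
}


def decode_eflags(value):
    """Return (list of set flag names, IOPL value) for an EFLAGS image."""
    flags = [name
             for bit, name in sorted(EFLAGS_BITS.items(), reverse=True)
             if (value >> bit) & 1]
    iopl = (value >> 12) & 0b11      # bits 13-12
    return flags, iopl
```

Applied to the default value 00000002h, the function reports no flags set and IOPL 0, because only reserved bit 1 is set.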
1.38
EIP
Extended Instruction Pointer Register
32 bits
Bit(s)
Bit Set Name
Description
31–0
EIP
Extended Instruction Pointer, the offset that points to the next
instruction within the current code segment.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
The EIP register contains the 32-bit offset that points to the next instruction within the current
code segment. The control-transfer instructions, such as JMP or RET, and interrupts and
exceptions control the contents of this register implicitly. The contents of this register advance from one instruction boundary to the next. Because of instruction prefetching, its
value is only an approximate indication of the bus activity loading instructions into the
processor. The IP register (see page 1-55) is the lower word of the EIP register.
1.39
ES
Data Segment Register
16 bits
Bit(s)
Bit Set Name
Description
15–0
ES
A segment register that holds the selector for one of the four data
segments of memory available to the program currently executing.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
The processor organizes memory into segments as one of the possible ways to access
memory. There are six segments (regions of memory) accessed through the segment
registers. Each register stores the selector that identifies its segment; in Real mode, the segment base address is the register value shifted left by four bits. There are four data segments that can contain data used by a program being executed. The segment selectors
for these segments are stored in the DS, ES, FS, and GS registers. The
processor fetches data from a data segment, using an offset into the segment. The data
segment register value changes as a result of interrupts, exceptions, and instructions that
transfer control between segments (see CALL, IRET, and JMP instructions).
1.40
ESI
Processor General Register — Source Index
32 bits
Bit(s)
Bit Set Name
Description
31–0
ESI
Processor 32-bit general register, also used as a 32-bit source index
register for string operations; see also SI.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 32-bit general processor registers used to hold 32-bit operands for logical
and arithmetic operations. String operations can use ESI as the source index register. The
value in ESI represents the offset into a memory space defined by one of the segment
registers. The default segment register is DS, but a segment override prefix allows a string
instruction to use CS, SS, ES, FS, or GS. When used by string instructions, ESI automatically increments or decrements (based on the value of DF in the EFLAGS register — see
page 1-39). This feature allows sequential string operations to operate on a set of string
values without having to specify a new ESI value for each instruction.
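The automatic stepping described above can be sketched as a short simulation. This Python example is an illustration only: `copy_bytes` is a hypothetical name, a single flat `bytearray` stands in for memory, and segment registers and override prefixes are deliberately not modeled.

```python
def copy_bytes(memory, esi, edi, count, df=0):
    """Model a run of byte-sized string moves from [ESI] to [EDI].

    Returns the final (esi, edi) pair so the caller can see how far
    each index register advanced.
    """
    step = -1 if df else 1                 # DF selects the direction
    for _ in range(count):
        memory[edi] = memory[esi]
        esi = (esi + step) & 0xFFFFFFFF    # registers are 32 bits wide
        edi = (edi + step) & 0xFFFFFFFF
    return esi, edi
```

Because both indexes advance after every element, the caller supplies starting offsets once and does not reload ESI between elements.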
1.41
ESP
Processor General Register — Stack Pointer
32 bits
Bit(s)
Bit Set Name
Description
31–0
ESP
Processor general 32-bit register; also used as the Stack Pointer
register.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight, 32-bit general processor registers used to hold 32-bit operands for logical
and arithmetic operations. When used as the Stack Pointer, the register holds the offset
value that points to the current top-of-stack (TOS) location within the memory segment
specified by the Stack Segment (SS) register (see page 1-60). When a program PUSHes
a value onto the stack, the processor decrements the value in the ESP register, and then
writes the value to the new TOS specified by ESP. To POP a value, the processor copies
it from the current address specified by ESP and then increments the ESP value.
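The PUSH and POP sequence described above can be modeled in a few lines. The following Python sketch is a simplified illustration: the class name `Stack32` is hypothetical, every operand is assumed to be a 32-bit doubleword stored in little-endian byte order, and the SS segment is represented by a plain `bytearray`.

```python
class Stack32:
    """Toy model of a stack segment addressed through ESP."""

    def __init__(self, size, esp):
        self.memory = bytearray(size)   # stands in for the SS segment
        self.esp = esp                  # offset of the current top of stack

    def push(self, value):
        # Decrement first, then write to the new top of stack.
        self.esp = (self.esp - 4) & 0xFFFFFFFF
        self.memory[self.esp:self.esp + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")

    def pop(self):
        # Read from the current top of stack, then increment.
        value = int.from_bytes(self.memory[self.esp:self.esp + 4], "little")
        self.esp = (self.esp + 4) & 0xFFFFFFFF
        return value
```

The order of operations matters: the decrement precedes the write on a push, and the read precedes the increment on a pop, so ESP always points at the most recently pushed value.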
1.42
FLAGS
Flags Register
16 bits
Bit(s)
Bit Set Name
Description
15
N/A
Reserved, always 0
14
NT
0 = Current task is not nested below another task.
1 = Current task is nested below another task.
13–12
IOPL
00 = Highest I/O access privilege level; typically operating system.
01 = Second highest I/O access privilege level; system services.
10 = Third highest I/O access privilege level; system services.
11 = Lowest I/O access privilege level; application software.
11
OF
0 = Arithmetic result within limits.
1 = Arithmetic result not in positive/negative range, Overflow Flag set.
10
DF
0 = Forward direction, addressing increments.
1 = Backward direction, addressing decrements, Direction Flag set.
9
IF
0 = Maskable interrupts disabled.
1 = Maskable interrupts enabled, Interrupt Flag set.
8
TF
0 = Normal operation.
1 = Trap Flag set, processor enters single-step mode for debugging;
each instruction generates a debug exception.
7
SF
0 = Arithmetic result is not negative (≥0); sign is +.
1 = Arithmetic result is negative (<0); Sign Flag set, sign is –.
6
ZF
0 = Arithmetic result is not zero.
1 = Arithmetic result is zero, Zero Flag set.
5
N/A
Reserved, always 0
4
AF
0 = No BCD carry.
1 = BCD carry from bit position 3, Auxiliary Flag set.
3
N/A
Reserved, always 0
2
PF
0 = Result Low byte has odd parity.
1 = Result Low byte has even parity, Parity Flag set.
1
N/A
Reserved, always 1
0
CF
0 = No carry from MSB of result.
1 = Carry from MSB of result, Carry Flag set.
Addressing
Specify by bit/set names or by using the special flag instructions (BT, BTR, BTS, CLC, CLD,
LAHF, POPF, POPFD, PUSHF, PUSHFD, SAHF, STC, STD, STI) described in Chapter 2.
Default Value
0002h
Functional Description
The 16-bit FLAGS register has system flags, status flags, and a control flag, described
above. FLAGS is the lower word of EFLAGS (see page 1-39).
1.43
FPUCR
FPU Control Register
Bit(s)
Bit Set Name
Description
15–13
N/A
Reserved, undefined.
12
Infinity Control
Not used.
11–10
Rounding Control (RC)
00 = Round to nearest or even value.
01 = Round down (toward –∞).
10 = Round up (toward +∞).
11 = Chop (truncate toward 0).
9–8
Precision Control (PC)
00 = 24 bits (single precision).
01 = Not used/reserved.
10 = 53 bits (double precision).
11 = 64 bits (extended precision).
7–6
N/A
Reserved, undefined.
5
Precision Exception Mask
0 = Exception not masked.
1 = Exception masked.
4
Underflow Exception Mask
0 = Exception not masked.
1 = Exception masked.
3
Overflow Exception Mask
0 = Exception not masked.
1 = Exception masked.
2
Zero Divide Exception Mask
0 = Exception not masked.
1 = Exception masked.
1
Denormalized Operand Exception Mask
0 = Exception not masked.
1 = Exception masked.
0
Invalid Operation Exception Mask
0 = Exception not masked.
1 = Exception masked.
16 bits
Addressing
Use the appropriate instruction (FLDCW, FNSTCW, or FSTCW) to address the contents
of this register. See Chapter 2 for a description of these instructions.
Default Value
Undefined
Functional Description
The FPUCR stores the current FPU Control Word value. The Control Word allows configuration of Rounding and Precision Control values and masking of the six exception types
described above. No direct writing to or reading from this register is possible. Load a value
from memory using the FLDCW instruction to write to the register. Store a copy to memory
using the FNSTCW or FSTCW instruction to read the register contents.
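Once a copy of the Control Word has been stored to memory, its fields can be separated with shifts and masks. The Python sketch below is illustrative only; `decode_fpucr` and the lookup tables are hypothetical names, and the mask names are shortened forms of the bit set names in the table above.

```python
ROUNDING = {0b00: "nearest", 0b01: "down", 0b10: "up", 0b11: "chop"}
PRECISION = {0b00: 24, 0b10: 53, 0b11: 64}     # 01 is not used/reserved
# Exception mask names in bit order, bit 0 first.
MASK_NAMES = ["invalid", "denormal", "zero_divide",
              "overflow", "underflow", "precision"]


def decode_fpucr(cw):
    """Split a 16-bit FPU Control Word into its documented fields."""
    return {
        "rounding": ROUNDING[(cw >> 10) & 0b11],            # bits 11-10
        "precision_bits": PRECISION.get((cw >> 8) & 0b11),  # bits 9-8, None if reserved
        "masked": [name for bit, name in enumerate(MASK_NAMES)
                   if cw & (1 << bit)],                     # bits 5-0
    }
```

A sample value such as 037Fh decodes to round-to-nearest, 64-bit precision, and all six exceptions masked.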
1.44
FPUDP
FPU Data Pointer
32 or 64 bits
Bit(s)
Bit Set Name
Description
32-bit Format in Protected Mode (64-bit field):
63–48
N/A
Reserved
47–32
Operand Selector
Stores the value loaded into the segment register to select the
data segment.
31–0
Data Operand Offset
Stores the data offset value within the specified segment.
32-bit Format in Real or Virtual-8086 Mode (64-bit field):
63–60
N/A
Reserved, always 0000
59–44
Operand Pointer
(bits 31–16)
Upper word of the operand address.
43–32
N/A
Reserved, always 0000 0000 0000
31–16
N/A
Reserved, undefined.
15–0
Operand Pointer
(bits 15–0)
Lower word of the operand address.
16-bit Format in Protected Mode (32-bit field):
31–16
Operand Selector
Stores the value loaded into the segment register to select the
data segment.
15–0
Data Operand Offset
Stores the data offset value within the specified segment.
16-bit Format in Real or Virtual-8086 Mode (32-bit field):
31–28
Operand Pointer
(bits 19–16)
Upper four bits of the operand address.
27–16
N/A
Reserved, always 0000 0000 0000
15–0
Operand Pointer
(bits 15–0)
Lower 16 bits of the operand address.
Addressing
Direct addressing of the register contents is not possible. Use the instructions FLDENV,
FNSAVE, FNSTENV, FRSTOR, FSAVE, and FSTENV to write to or read from the register.
The save (FNSAVE, FSAVE) and store (FNSTENV and FSTENV) instructions write the
contents of all the FPU registers to memory. The FPU Data Pointer starts at offset 14h from
the base address in 32-bit format, or offset Ah from the base address in 16-bit format, within
the stored ENVironment data.
Default Value
Undefined
Functional Description
The data pointer stores the address of the last data operand that caused a floating-point
exception. The format of the pointer varies depending on the addressing format (32-bit or
16-bit) and mode (Protected or Real/Virtual), as described above.
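As one worked case, the 16-bit Real or Virtual-8086 mode format splits a 20-bit operand address across a 32-bit field. The Python sketch below reassembles it; `operand_address_16bit_real` is a hypothetical name, and the input is assumed to be the 32-bit field already read from the stored environment data.

```python
def operand_address_16bit_real(field):
    """Rebuild the 20-bit operand address from the 32-bit pointer field.

    Bits 31-28 of the field hold address bits 19-16, and
    bits 15-0 of the field hold address bits 15-0.
    """
    low = field & 0xFFFF
    high = (field >> 28) & 0xF
    return (high << 16) | low
```

The reserved bits 27 through 16 are ignored, so the function returns the same address whatever those bits contain.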
1.45
FPUIP
FPU Instruction Pointer
32 or 64 bits
Bit(s)
Bit Set Name
Description
32-bit Format in Protected Mode (64-bit field):
63–48
N/A
Reserved
47–32
CS Selector
Stores the value loaded into the code segment register.
31–0
IP Offset
Stores the instruction pointer offset value.
32-bit Format in Real or Virtual-8086 Mode (64-bit field):
63–60
N/A
Reserved, always 0000
59–44
Instruction Pointer
(bits 31–16)
Upper word of the instruction pointer address.
43
N/A
Reserved, always 0
42–32
Opcode
Stores the 11-bit opcode value.
31–16
N/A
Reserved, undefined.
15–0
Instruction Pointer
(bits 15–0)
Lower word of the instruction pointer address.
16-bit Format in Protected Mode (32-bit field):
31–16
CS Selector
Stores the value loaded into the code segment register.
15–0
IP Offset
Stores the instruction pointer offset value.
16-bit Format in Real or Virtual-8086 Mode (32-bit field):
31–28
Instruction Pointer
(bits 19–16)
Upper four bits of the instruction pointer address.
27
N/A
Reserved, always 0
26–16
Opcode
Stores the 11-bit opcode value.
15–0
Instruction Pointer
(bits 15–0)
Lower 16 bits of the instruction pointer address.
Addressing
Direct addressing of the register contents is not possible. Use the instructions FLDENV,
FNSAVE, FNSTENV, FRSTOR, FSAVE, and FSTENV to write to or read from the register.
The save (FNSAVE, FSAVE) and store (FNSTENV and FSTENV) instructions write the
contents of all the FPU registers to memory. The FPU Instruction Pointer starts at offset
Ch from the base address in 32-bit format, or offset 6h from the base address in 16-bit
format, within the stored ENVironment data.
Default Value
Undefined
Functional Description
The instruction pointer stores the address of the last instruction that caused a floating-point
exception. In Real or Virtual-8086 mode, the opcode field stores the opcode value for the
last non-control FPU instruction. The format of the pointer varies depending on the addressing format (32-bit or 16-bit) and mode (Protected or Real/Virtual), as described above.
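For the 16-bit Real or Virtual-8086 mode format, the same 32-bit field carries both the 20-bit instruction address and the 11-bit opcode. The Python sketch below separates the two; `decode_fpuip_16bit_real` is a hypothetical name, and the input is assumed to be the 32-bit field already read from the stored environment data.

```python
def decode_fpuip_16bit_real(field):
    """Return (instruction address, opcode) from the 32-bit pointer field.

    Bits 31-28 hold address bits 19-16, bits 26-16 hold the 11-bit
    opcode, and bits 15-0 hold address bits 15-0.
    """
    address = (((field >> 28) & 0xF) << 16) | (field & 0xFFFF)
    opcode = (field >> 16) & 0x7FF
    return address, opcode
```

Masking with 7FFh keeps exactly 11 bits, so reserved bit 27 never leaks into the opcode value.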
1.46
FPUSR
FPU Status Register
16 bits
Bit(s)
Bit Set Name
Description
15
B
0 = FPU not busy
1 = FPU busy
14
C3
Condition flag C3, value varies depending on floating-point instruction.
For compare and test instructions, 0 = result not zero
and 1 = result zero.
FXAM uses C3, C2, and C0 to generate a result code (see FXAM).
For FPREM and FPREM1, C3 is the second least significant bit of the quotient.
13–11
TOP
000 = R0 is top of stack.
001 = R1 is top of stack.
010 = R2 is top of stack.
011 = R3 is top of stack.
100 = R4 is top of stack.
101 = R5 is top of stack.
110 = R6 is top of stack.
111 = R7 is top of stack.
10
C2
Condition flag C2, value varies depending on floating-point instruction.
For compare and test instructions:
0 = operand is comparable and 1 = operand is not comparable.
FXAM uses C3, C2, and C0 to generate a result code (see FXAM).
For FPREM and FPREM1:
0 = reduction complete and 1 = reduction incomplete.
9
C1
Condition flag C1, value varies depending on floating point instruction.
If the instruction generates a stack fault exception:
0 = stack underflow and 1 = stack overflow.
If there is no exception:
For FXAM, 0 = value is ≥0, sign is +; 1 = value is < 0, sign is –.
For FPREM and FPREM1, C1 is the least significant bit of the
quotient.
For arithmetic instructions:
0 = last rounding down and 1 = last rounding up.
8
C0
Condition flag C0, value varies depending on floating point instruction.
For compare and test instructions:
0 = result did not generate carry and 1 = result generated carry.
FXAM uses C3, C2, and C0 to generate a result code (see FXAM).
For FPREM and FPREM1, C0 is the third least significant bit of the quotient.
7
ES
0 = No exception generated.
1 = Exception generated.
6
SF
0 = No exception generated.
1 = Stack fault exception generated.
5
PE
0 = No exception generated.
1 = Precision exception generated.
4
UE
0 = No exception generated.
1 = Underflow exception generated.
3
OE
0 = No exception generated.
1 = Overflow exception generated.
2
ZE
0 = No exception generated.
1 = Divide by zero exception generated.
1
DE
0 = No exception generated.
1 = Denormalized operand exception generated.
0
IE
0 = No exception generated.
1 = Invalid operation exception generated.
Addressing
Use the FNSTSW or FSTSW instruction to read the contents of
this register. See Chapter 2 for a description of these instructions.
Default Value
Undefined
Functional Description
The FPUSR stores the current FPU Status Word value. The Status Word allows monitoring
of the current status of the FPU. Direct addressing of the register contents is not possible.
The register can be written only as part of the FPU state loaded by the FLDENV or FRSTOR instruction. Read the
current value by storing a copy to memory or to the AX register using the FNSTSW or FSTSW instruction. The
interaction between the FPU instructions and the Status Word is discussed in detail in
Chapter 2 as part of the individual instruction descriptions.
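The fields in the Status Word table can be pulled apart as follows. This Python sketch is an illustration under stated assumptions: `decode_fpusr` is a hypothetical name, and the input is assumed to be a 16-bit Status Word copy obtained with FNSTSW or FSTSW.

```python
# Exception flag names in bit order, bit 0 first (bits 7-0 of the table).
EXCEPTION_NAMES = ["IE", "DE", "ZE", "OE", "UE", "PE", "SF", "ES"]


def decode_fpusr(sw):
    """Split a 16-bit FPU Status Word into its documented fields."""
    return {
        "busy": bool(sw & 0x8000),          # B, bit 15
        "top": (sw >> 11) & 0b111,          # TOP, bits 13-11
        "c3": (sw >> 14) & 1,
        "c2": (sw >> 10) & 1,
        "c1": (sw >> 9) & 1,
        "c0": (sw >> 8) & 1,
        "exceptions": [name for bit, name in enumerate(EXCEPTION_NAMES)
                       if sw & (1 << bit)],
    }
```

How the condition flags should be interpreted still depends on the instruction that produced them, as the table describes; the function only extracts the raw bits.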
1.47
FPUTWR
FPU Tag Word Register
16 bits
Bit(s)
Bit Set Name
Description
15–14
TAG(7)
00 = R7 contents valid.
01 = R7 contents are zero.
10 = R7 contents special: invalid (NaN or unsupported), infinity, or denormal.
11 = R7 empty.
13–12
TAG(6)
00 = R6 contents valid.
01 = R6 contents are zero.
10 = R6 contents special: invalid (NaN or unsupported), infinity, or denormal.
11 = R6 empty.
11–10
TAG(5)
00 = R5 contents valid.
01 = R5 contents are zero.
10 = R5 contents special: invalid (NaN or unsupported), infinity, or denormal.
11 = R5 empty.
9–8
TAG(4)
00 = R4 contents valid.
01 = R4 contents are zero.
10 = R4 contents special: invalid (NaN or unsupported), infinity, or denormal.
11 = R4 empty.
7–6
TAG(3)
00 = R3 contents valid.
01 = R3 contents are zero.
10 = R3 contents special: invalid (NaN or unsupported), infinity, or denormal.
11 = R3 empty.
5–4
TAG(2)
00 = R2 contents valid.
01 = R2 contents are zero.
10 = R2 contents special: invalid (NaN or unsupported), infinity, or denormal.
11 = R2 empty.
3–2
TAG(1)
00 = R1 contents valid.
01 = R1 contents are zero.
10 = R1 contents special: invalid (NaN or unsupported), infinity, or denormal.
11 = R1 empty.
1–0
TAG(0)
00 = R0 contents valid.
01 = R0 contents are zero.
10 = R0 contents special: invalid (NaN or unsupported), infinity, or denormal.
11 = R0 empty.
Addressing
Direct addressing of the register contents is not possible. Use the instructions FLDENV,
FNSAVE, FNSTENV, FRSTOR, FSAVE, and FSTENV to write to or read from the register.
The save (FNSAVE, FSAVE) and store (FNSTENV and FSTENV) instructions write the
contents of all the FPU registers to memory. The FPU Tag Word starts at offset 8h from the
base address in 32-bit format, or offset 4h from the base address in 16-bit format, within
the stored ENVironment data.
Default Value
Undefined
Functional Description
The FPUTWR stores the Tag Word for the eight FPU data registers (R0–R7). The Tag Word
describes the current status for each of these registers, as described above. Because the
FPU instructions refer to the registers indirectly through the stack register notation as ST(0)
through ST(7) and the actual associated registers change as the stack pointer changes,
use the value for TOP (bits 13–11 in the FPU Status Word — see page 1-48) to associate
the tag values with the relative stack registers.
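The association described above can be sketched in Python. This is only an illustrative model: the function names and the sample register state are assumptions, not part of the Am486 documentation. It decodes a 16-bit Tag Word and uses TOP to report the tag of each stack-relative register ST(i).

```python
TAG_NAMES = {0b00: "valid", 0b01: "zero", 0b10: "special", 0b11: "empty"}

def physical_tags(tag_word):
    """Return the 2-bit tags of R0..R7; TAG(n) occupies bits 2n+1..2n."""
    return [(tag_word >> (2 * n)) & 0b11 for n in range(8)]

def stack_tags(tag_word, status_word):
    """Return the tag names of ST(0)..ST(7).

    TOP is bits 13-11 of the FPU Status Word; ST(i) is R((TOP + i) mod 8).
    """
    top = (status_word >> 11) & 0b111
    tags = physical_tags(tag_word)
    return [TAG_NAMES[tags[(top + i) % 8]] for i in range(8)]

# Hypothetical state: R7 valid, R6 zero, all other registers empty, TOP = 6.
example_tag_word = 0b00_01_11_11_11_11_11_11   # TAG(7) ... TAG(0)
example_status = 6 << 11
print(stack_tags(example_tag_word, example_status)[:2])  # ST(0), ST(1)
```

With TOP = 6, ST(0) maps to R6 and ST(1) maps to R7, so the two tags printed are those of R6 and R7.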
1-50
Am486 Microprocessor Register Set
AMD
1.48
FS
Data Segment Register
16 bits
Bit(s)
Bit Set Name
Description
15–0
FS
A segment register that holds the base address for one of the four data
segments of memory, available to the program currently executing.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
The processor organizes memory into segments as one of the possible ways to access
memory. There are six segments (tables within memory) accessed through the segment
registers. Each register stores the base address for its segment. There are four data segments that can contain data used by a program being executed. The segment selectors
(base addresses) for these segments are stored in the DS, ES, FS, and GS registers. The
processor fetches data from a data segment, using an offset into the segment. The data
segment register value changes as a result of interrupts, exceptions, and instructions that
transfer control between segments (see CALL, IRET, and JMP instructions).
1-51
1.49
GDTR
Global Descriptor Table Register
48 bits
Bit(s)
Bit Set Name
Description
47–16
GDT Base Address
Stores the base address for the Global Descriptor Table location.
15–0
GDT Segment Limit
Stores the limit for the Global Descriptor Table segment.
Addressing
Direct addressing of the register contents is not possible. Write to the register using the
LGDT instruction. Read the contents of the register into memory using the SGDT instruction.
Both instructions require the highest privilege level generally accorded only to operating
system software.
Default Value
Undefined; BIOS and operating system software define the contents of this register.
Functional Description
The register holds the 32-bit base address and 16-bit segment limit for the Global Descriptor
Table (GDT). The referenced GDT contains the segment descriptors for the memory available to any general operation. The table can vary in size from a minimum of 8 bytes to a
maximum of 64K bytes. Each memory segment descriptor requires 8 bytes, so the GDT
can store as many as 8192 segment descriptors. The first 8 bytes of the GDT are, however,
reserved as the null descriptor to define a null pointer value. Load the null value into unused
segment registers to initialize them.
The GDT contains selectors for all of the defined Local Descriptor Tables (LDTs) but should
exclude segments defined for use by the system services (interrupts and traps). The system
services segments are included as part of the Interrupt Descriptor Table (IDT). A detailed
description of the descriptor tables is included as part of Appendix A.
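As a rough illustration of the sizes quoted above, the following Python sketch packs and unpacks a 48-bit GDTR value using the bit positions from the table (base in bits 47–16, limit in bits 15–0). The function names are hypothetical. The sketch also assumes that the limit holds the offset of the last valid byte (table size minus 1) and, as a simplification, that the table size is a multiple of 8 bytes.

```python
DESCRIPTOR_SIZE = 8  # bytes per segment descriptor

def pack_gdtr(base, table_bytes):
    """Build the 48-bit GDTR value: base in bits 47-16, limit in bits 15-0.

    Assumes the limit is the offset of the last valid byte (size - 1).
    """
    if not 8 <= table_bytes <= 0x10000 or table_bytes % DESCRIPTOR_SIZE:
        raise ValueError("table must be 8..65536 bytes, a multiple of 8")
    if not 0 <= base <= 0xFFFFFFFF:
        raise ValueError("base must fit in 32 bits")
    return (base << 16) | (table_bytes - 1)

def unpack_gdtr(value):
    """Return (base, limit, descriptor_count) for a 48-bit GDTR value."""
    base = (value >> 16) & 0xFFFFFFFF
    limit = value & 0xFFFF
    return base, limit, (limit + 1) // DESCRIPTOR_SIZE

gdtr = pack_gdtr(0x00100000, 0x10000)  # hypothetical base, maximum table size
base, limit, count = unpack_gdtr(gdtr)
print(hex(base), hex(limit), count)
```

A 64-Kbyte table yields a limit of FFFFh and room for 8192 descriptors, the first of which is the null descriptor.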
1-52
1.50
GS
Data Segment Register
16 bits
Bit(s)
Bit Set Name
Description
15–0
GS
A segment register that holds the base address for one of the four data
segments of memory, available to the program currently executing.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
The processor organizes memory into segments as one of the possible ways to access
memory. There are six segments (tables within memory) accessed through the segment
registers. Each register stores the base address for its segment. There are four data segments that can contain data used by a program being executed. The segment selectors
(base addresses) for these segments are stored in the DS, ES, FS, and GS registers. The
processor fetches data from a data segment, using an offset into the segment. The data
segment register value changes as a result of interrupts, exceptions, and instructions that
transfer control between segments (see CALL, IRET, and JMP instructions).
1-53
1.51
IDTR
Interrupt Descriptor Table Register
48 bits
Bit(s)
Bit Set Name
Description
47–16
IDT Base Address
Stores the base address for the Interrupt Descriptor Table location.
15–0
IDT Segment Limit
Stores the limit for the Interrupt Descriptor Table segment.
Addressing
Direct addressing of the register contents is not possible. Write to the register using the
LIDT instruction. Read the contents of the register into memory using the SIDT instruction.
Both instructions require the highest privilege level generally accorded only to operating
system software.
Default Value
Undefined; BIOS and operating system software define the contents of this register.
Functional Description
The register holds the 32-bit base address and 16-bit segment limit for the Interrupt Descriptor Table (IDT). The referenced IDT contains the segment descriptors for the memory
available to system service (interrupt and trap) operations. The table can vary in size from
a minimum of 8 bytes to a maximum of 64K bytes. Each memory segment descriptor
requires 8 bytes, so the IDT can store as many as 8192 segment descriptors. To protect
them from use by other tasks, exclude the system services segments from the Global
Descriptor Table (GDT). A detailed description of the descriptor table is included as part of
Appendix A.
1-54
1.52
IP
Instruction Pointer
16 bits
Bit(s)
Bit Set Name
Description
15–0
IP
Instruction Pointer, contains the 16-bit offset into the current code
segment for the next instruction.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
The IP register contains the 16-bit offset that points to the next instruction within the current
code segment, when operating using 16-bit addressing. The control-transfer instructions,
such as JMP or RET, and interrupts and exceptions control the contents of this register
implicitly. The contents of this register advance from one instruction boundary to the next.
Because of instruction prefetching, its value is only an approximate indication of the bus
activity loading instructions into the processor. The IP register is the lower word of the EIP
register (see page 1-40).
1-55
1.53
LDTR
Local Descriptor Table Register
48 bits
Bit(s)
Bit Set Name
Description
47–16
LDT Base Address
Stores the base address for the Local Descriptor Table location.
15–0
LDT Segment Limit
Stores the limit for the Local Descriptor Table segment.
Addressing
Direct addressing of the register contents is not possible. Write to the register using the
LLDT instruction. Read the contents of the register into memory using the SLDT instruction.
Both instructions require the highest privilege level generally accorded only to operating
system software.
Default Value
Undefined; BIOS and operating system software define the contents of this register.
Functional Description
The register holds the 32-bit base address and 16-bit segment limit for the current Local
Descriptor Table (LDT) used by a referenced segment register (CS, DS, ES, FS, GS, or
SS). By using the segment registers, a task can access as many as six different memory
segments simultaneously. The referenced LDT contains the segment descriptors for the
memory available to a specific task. The table can vary in size from a minimum of 8 bytes
to a maximum of 64 Kbytes. Each memory segment descriptor requires 8 bytes, so each
LDT can store as many as 8192 segment descriptors.
The LDT should exclude segments defined for use by the system services (interrupts and
traps). The system services segments are included as part of the Interrupt Descriptor Table
(IDT). A detailed description of the descriptor table is included as part of Appendix A.
1-56
1.54
R0–R7
FPU Data Registers 0–7
80 bits each
Bit(s)
Bit Set Name
Description
79
Sign
0 = +
1 = –
78–64
Exponent
Exponent value
63–0
Significand
Significand value
Addressing
Address the registers through the stack address ST(n). ST(0) is the top of the FPU stack.
Bits 13–11 in the FPU Status Word indicate which data register is at the top of the stack
(see page 1-48).
Default Value
00000000000000000000h
Functional Description
The FPU data registers store data for processing by the FPU. Numeric instructions address
the data registers relative to the register at the top of the FPU stack. At any point in time,
the register at the top of the stack (R0–R7) is indicated by the TOP field in the FPU status
word. Load or push operations decrement TOP by one and load a value into the new TOP
register. A store-and-pop operation stores the value from the current TOP register and then
increments TOP by 1. The FPU register stack, similar to stack operations in memory, grows
down toward lower-numbered registers. Some numeric operations allow operating on registers as an offset of the stack top.
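The push and pop behavior described above can be modeled with a short Python sketch. The class and method names are hypothetical, Python floats stand in for the 80-bit register contents, and tag checking and stack overflow/underflow detection are deliberately left out.

```python
class FpuStack:
    """Toy model of the eight-register FPU stack and its TOP pointer."""

    def __init__(self):
        self.regs = [0.0] * 8   # physical registers R0..R7
        self.top = 0            # 3-bit TOP field of the FPU status word

    def push(self, value):
        """Load/push: decrement TOP (mod 8), then write the new TOP register."""
        self.top = (self.top - 1) % 8
        self.regs[self.top] = value

    def pop(self):
        """Store-and-pop: read the TOP register, then increment TOP (mod 8)."""
        value = self.regs[self.top]
        self.top = (self.top + 1) % 8
        return value

    def st(self, i):
        """Return ST(i), the register i places away from the stack top."""
        return self.regs[(self.top + i) % 8]

fpu = FpuStack()
fpu.push(1.5)            # TOP: 0 -> 7, value lands in R7
fpu.push(2.5)            # TOP: 7 -> 6, value lands in R6
print(fpu.top, fpu.st(0), fpu.st(1))
```

After two pushes from TOP = 0, the stack has grown down to R6, so ST(0) is R6 and ST(1) is R7.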
1-57
1.55
SI
Processor General Register — Source Index
16 bits
Bit(s)
Bit Set Name
Description
15–0
SI
Processor general 16-bit register; also used as the Source Index
register.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight 16-bit general processor registers used to hold 16-bit operands for logical
and arithmetic operations. String operations can use SI as the source index register for 16-bit addressing. The value in SI represents the offset into a memory space defined by one
of the segment registers. The default segment register is DS, but a segment override prefix
allows a string instruction to use CS, SS, ES, FS, or GS. When used by string instructions,
SI automatically increments or decrements (based on the value of DF in the EFLAGS
register — see page 1-39). This feature allows sequential string operations to operate on
a set of string values without having to specify a new SI value for each instruction.
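A minimal Python model of this automatic stepping is shown below. The function name and the sample data are assumptions made for illustration. The model covers byte-sized string loads only (word and doubleword forms would step SI by 2 or 4), with SI incrementing when DF = 0 and decrementing when DF = 1.

```python
def load_string_bytes(memory, si, count, df=0):
    """Model `count` byte-sized string loads starting at offset SI.

    After each load SI moves by one: up when DF = 0, down when DF = 1.
    SI is a 16-bit register, so it wraps modulo 0x10000.
    Returns the bytes read and the final SI value.
    """
    step = -1 if df else 1
    data = []
    for _ in range(count):
        data.append(memory[si])
        si = (si + step) & 0xFFFF
    return bytes(data), si

segment = b"Am486 string"        # hypothetical data segment contents
print(load_string_bytes(segment, 0, 5))          # forward, DF = 0
print(load_string_bytes(segment, 4, 5, df=1))    # backward, DF = 1
```

The caller sets SI once; each subsequent load picks up where the previous one stopped.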
1-58
1.56
SP
Processor General Register — Stack Pointer
16 bits
Bit(s)
Bit Set Name
Description
15–0
SP
Processor general 16-bit register, also used as Stack Pointer register
for 16-bit addressing modes.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
One of the eight 16-bit general processor registers used to hold 16-bit operands for logical
and arithmetic operations. When used as the Stack Pointer in 16-bit addressing mode, the
register holds the offset value that points to the current top-of-stack (TOS) location within
the memory segment specified by the Stack Segment (SS) register (see page 1-60). When
a program PUSHes a value onto the stack, the processor decrements the value in the SP
register, and then writes the value to the new TOS specified by SP. To POP a value, the
processor copies it from the current address specified by SP and then increments the SP
value.
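The PUSH and POP sequence described above can be sketched as follows. The class name, the default SP value, and the segment size are illustrative assumptions. The model handles 16-bit operands only, so SP moves by 2 on each operation.

```python
class Stack16:
    """Toy model of a stack segment addressed through a 16-bit SP."""

    def __init__(self, size=0x10000, sp=0x0100):
        self.mem = bytearray(size)  # bytes of the stack segment
        self.sp = sp                # offset of the current top of stack

    def push(self, word):
        """Decrement SP by 2, then store the word at the new top of stack."""
        self.sp = (self.sp - 2) & 0xFFFF
        self.mem[self.sp:self.sp + 2] = word.to_bytes(2, "little")

    def pop(self):
        """Read the word at the top of stack, then increment SP by 2."""
        word = int.from_bytes(self.mem[self.sp:self.sp + 2], "little")
        self.sp = (self.sp + 2) & 0xFFFF
        return word

stack = Stack16()
stack.push(0x1234)
stack.push(0xBEEF)
print(hex(stack.sp), hex(stack.pop()), hex(stack.pop()), hex(stack.sp))
```

Values come back in the reverse of the order in which they were pushed, and SP returns to its starting offset.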
1-59
1.57
SS
Stack Segment Register
16 bits
Bit(s)
Bit Set Name
Description
15–0
SS
Stack segment register holds the base address for the stack segment
in memory.
Addressing
Specify by name as instruction operand.
Default Value
Undefined
Functional Description
The processor organizes memory into segments as one of the possible ways to access
memory. There are six segments (tables within memory) accessed through the segment
registers. Each register stores the base address for its segment. The segment containing
the temporary user space for the program being executed is called the stack segment. Its
segment selector (base address) is stored in the SS register. The processor writes to and
fetches dynamically stored information from the stack segment, using the contents of the
SP register as a top-of-stack pointer and the BP register as offset into the segment. The SS
register value changes as a result of interrupts, exceptions, and instructions that transfer
control between tasks (see CALL, IRET, and JMP instructions).
1-60
1.58
TR
Task Register
16 bits
Bit(s)
Bit Set Name
Description
15–0
Selector
The selector value used to access/index a TSS descriptor in the GDT.
Addressing
Access by using the load instruction LTR or the store instruction STR.
Default Value
Design dependent; loaded during system initialization and then modified by task switching.
Functional Description
The Task Register points to the current Task State Segment (TSS). The register consists of a visible 16-bit selector
that points to the TSS descriptor in the GDT for the current task, and an invisible base
address and segment limit copied from that descriptor. The processor maintains the invisible
part of the TR so that it can address the TSS directly through the register, which makes
execution more efficient. The LTR instruction requires the highest privilege level
(CPL = 0) because changes to the register must be restricted to initialization and operating
system task switches to prevent unpredictable results. The STR instruction has no privilege restriction.
1-61
1.59
TR3
Cache Test Data Register
32 bits
Bit(s)
Bit Set Name
Description
31–0
Data
Data storage for internal cache testing.
Addressing
Specify by name as instruction operand.
Default Value
00000000h
Functional Description
TR3 is the cache test data register. This register contains either a doubleword to write to
the cache or a doubleword read from the cache read buffer. The fill and read buffers each
store four doublewords that pass through TR3 one at a time. Select a specific doubleword
in either buffer by using the 2-bit Entry Select field (bits 2 and 3) of TR5 (see page 1-64).
1-62
1.60
TR4
Cache Test Status Register
32 bits
Bit(s)
Bit Set Name
Description
31–11
TAG
The address that becomes the tag on a cache write.
10
VALID
0 = Not valid.
1 = Valid.
On a cache lookup, this bit is a copy of one of bits 6–3; on a cache write, it
becomes the new valid bit for the selected entry.
9–7
LRU
On a cache lookup, this is the three LRU bits of the accessed set; the
LRU bits in the cache are updated by the pseudo-LRU cache replacement algorithm.
On a cache write, these bits are ignored.
6–3
VALID
On a cache lookup, these are the four Valid bits of the accessed set.
2–0
N/A
Reserved; always 000.
Addressing
Specify by name as instruction operand.
Default Value
00000000h
Functional Description
TR4 is the Cache Test Status Register. It holds the Valid bits, the LRU bits, and a
tag.
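As an illustration of the field layout in the table above, the following Python sketch splits a 32-bit TR4 value into its fields. The function name and the dictionary keys are hypothetical; only the bit positions come from the table.

```python
def decode_tr4(value):
    """Split a 32-bit TR4 value into the fields of the Cache Test Status Register."""
    return {
        "tag": (value >> 11) & 0x1FFFFF,     # bits 31-11: tag address
        "valid": (value >> 10) & 1,          # bit 10: valid bit of the entry
        "lru": (value >> 7) & 0b111,         # bits 9-7: LRU bits of the set
        "set_valid": (value >> 3) & 0b1111,  # bits 6-3: four valid bits of the set
    }

# Hypothetical value as it might be read back after a cache lookup.
sample = (0x12345 << 11) | (1 << 10) | (0b101 << 7) | (0b1001 << 3)
print(decode_tr4(sample))
```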
1-63
1.61
TR5
Cache Test Control Register
32 bits
Bit(s)
Bit Set Name
Description
31–11
N/A
Not used.
10–4
SET SELECT
Selects one of the 128 available sets.
3–2
ENTRY SELECT
During a cache read or write, selects one of four entries in the set
addressed by the Set Select; during cache-fill-buffer writes or read-buffer reads, selects one of the four doublewords in a line.
1–0
CONTROL
00 = Write to cache fill buffer, or read from cache read buffer.
01 = Perform cache write.
10 = Perform cache read.
11 = Flush the cache (mark all entries as invalid).
Addressing
Specify by name as instruction operand.
Default Value
00000000h
Functional Description
TR5 is the Cache Test Control Register. The register defines the section (set and entry) of
the cache to test and the operation to perform.
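The following Python sketch shows one way to assemble a TR5 value from the fields in the table above. The function name and the operation labels in `CONTROL_OPS` are hypothetical; the bit positions and control encodings are taken from the table.

```python
CONTROL_OPS = {
    "buffer": 0b00,  # write to cache fill buffer, or read from cache read buffer
    "write": 0b01,   # perform cache write
    "read": 0b10,    # perform cache read
    "flush": 0b11,   # flush the cache (mark all entries as invalid)
}

def encode_tr5(op, set_select=0, entry_select=0):
    """Build a TR5 value: Set Select in bits 10-4, Entry Select in bits 3-2,
    Control in bits 1-0. Bits 31-11 are not used and are left at zero."""
    if not 0 <= set_select < 128:
        raise ValueError("set_select must be 0..127")
    if not 0 <= entry_select < 4:
        raise ValueError("entry_select must be 0..3")
    return (set_select << 4) | (entry_select << 2) | CONTROL_OPS[op]

print(hex(encode_tr5("read", set_select=127, entry_select=3)))
print(hex(encode_tr5("flush")))
```

For buffer accesses (control 00), the same Entry Select bits choose one of the four doublewords in a line rather than one of the four entries in a set.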
1-64
1.62
TR6
TLB Test Control Register
32 bits
Bit(s)
Bit Set Name
Description
31–12
Linear Address
On a write, the TLB entry is allocated to this linear address.
On a TLB lookup, the TLB is interrogated with this value.
11
V
0 = TLB not valid.
1 = TLB data valid.
10–9
D, D#
00 = Undefined
01 = Match on lookup; clear D on write.
10 = Match on lookup; set D on write.
11 = Undefined
8–7
U, U#
00 = Undefined
01 = Match on lookup; clear U on write.
10 = Match on lookup; set U on write.
11 = Undefined
6–5
W, W#
00 = Undefined
01 = Match on lookup; clear W on write.
10 = Match on lookup; set W on write.
11 = Undefined
4–1
N/A
Reserved, always 0000
0
C
0 = TLB write enabled.
1 = TLB lookup enabled.
Addressing
Specify by name as instruction operand.
Default Value
00000000h
Functional Description
The Am486 processor uses a translation lookaside buffer (TLB) to translate linear addresses
to physical addresses in the cache. Each TLB data entry contains the 20 high-order bits of a physical
address used as a base address for a memory page. The 12 low-order bits (the offset into
the page) are the same in both a linear and physical address. Corresponding to the block
of data entries is a block of valid, attribute, and tag entries. Each tag entry consists of the 17
high-order bits of the linear address (31–15). The processor uses the middle-order bits
(14–12) to address eight sets and then checks the four tags of a selected set for a match
with the high-order bits. If a match is found among the tags of the selected set and the
corresponding valid bit is set to 1, the linear address is translated by replacing its high-order
20 bits with the 20 bits of the corresponding data entry. Three LRU bits are included in each
set to track the use of data in each set. The LRU bits are checked when a new entry is
needed and none of the entries in the set is invalid; a pseudo-LRU replacement algorithm
modifies the LRU when required.
Testing of the TLB uses two registers, TR6 and TR7. TR6 is the Test Control Register. TR7
contains test data read from or written to the TLB.
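The address fields used by the TLB can be illustrated with a short Python sketch. The function names and sample addresses are assumptions; the bit ranges (tag 31–15, set index 14–12, page offset 11–0) follow the description above.

```python
def split_linear_address(addr):
    """Return (tag, set_index, offset) for a 32-bit linear address."""
    tag = (addr >> 15) & 0x1FFFF       # bits 31-15: 17-bit tag
    set_index = (addr >> 12) & 0b111   # bits 14-12: selects one of eight sets
    offset = addr & 0xFFF              # bits 11-0: unchanged by translation
    return tag, set_index, offset

def translate(addr, page_frame):
    """Replace the high-order 20 bits of addr with a 20-bit TLB data entry."""
    return ((page_frame & 0xFFFFF) << 12) | (addr & 0xFFF)

tag, set_index, offset = split_linear_address(0x12345678)
print(hex(tag), set_index, hex(offset))
print(hex(translate(0x12345678, 0xABCDE)))  # hypothetical page frame
```

Eight sets of four entries give 32 TLB entries in total; the tag comparison happens only within the set chosen by bits 14–12.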
1-65
1.63
TR7
TLB Test Status Register
32 bits
Bit(s)
Bit Set Name
Description
31–12
Physical Address
This is the data field of the TLB. On a write to the TLB, the TLB entry allocated to the
Linear Address in TR6 is set to this value. On a TLB lookup, the physical
address is loaded from the TLB to this field.
11
PCD
The page-level cache-disable (PCD) bit of a page table entry.
10
PWT
The page-level write-through (PWT) bit of a page table entry.
9–7
LRU
The LRU values before a TLB lookup. TLB lookups that result in hits
and TLB writes change the value of these bits.
6–5
N/A
Reserved, always 00
4
PL
0 = On a write, the internal pointer of the paging unit selects the TLB
block to load.
On a TLB lookup, this value indicates a miss.
1 = On a write, the REP field selects which associative block of the
TLB to load.
On a TLB lookup, this value indicates a hit.
3–2
REP
If PL = 0, REP is undefined.
If PL = 1, then:
For a TLB write, REP indicates which block to write.
For a TLB lookup, REP reports in which of the associative blocks the
tag was found.
1–0
N/A
Reserved, always 00
Addressing
Specify by name as instruction operand.
Default Value
00000000h
Functional Description
The Am486 processor uses a translation lookaside buffer (TLB) to translate linear addresses
to physical addresses in the cache. Each TLB data entry contains the 20 high-order bits of a physical
address used as a base address for a memory page. The 12 low-order bits (the offset into
the page) are the same in both a linear and physical address. Corresponding to the block
of data entries is a block of valid, attribute, and tag entries. Each tag entry consists of the 17
high-order bits of the linear address (31–15). The processor uses the middle-order bits
(14–12) to address eight sets and then checks the four tags of a selected set for a match
with the high-order bits. If a match is found among the tags of the selected set and the
corresponding valid bit is set to 1, the linear address is translated by replacing its high-order
20 bits with the 20 bits of the corresponding data entry. Three LRU bits are included in each
set to track the use of data in each set. The LRU bits are checked when a new entry is
needed and none of the entries in the set is invalid; a pseudo-LRU replacement algorithm
modifies the LRU when required.
Testing of the TLB uses two registers, TR6 and TR7. TR6 is the Test Control Register. TR7
contains test data read from or written to the TLB.
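A TR7 value read back after a TLB lookup can be decoded as in the Python sketch below. The function names and dictionary keys are hypothetical; the bit positions come from the TR7 table, and the hit test relies on the statement that bit 4 (PL) reads as 1 on a lookup hit.

```python
def decode_tr7(value):
    """Split a 32-bit TR7 value into the fields of the TLB Test Status Register."""
    return {
        "physical_page": (value >> 12) & 0xFFFFF,  # bits 31-12
        "pcd": (value >> 11) & 1,                  # bit 11: page-level cache disable
        "pwt": (value >> 10) & 1,                  # bit 10: page-level write-through
        "lru": (value >> 7) & 0b111,               # bits 9-7
        "pl": (value >> 4) & 1,                    # bit 4: hit indication on lookup
        "rep": (value >> 2) & 0b11,                # bits 3-2: associative block
    }

def lookup_hit_block(value):
    """Return the block that matched on a lookup, or None on a miss."""
    fields = decode_tr7(value)
    return fields["rep"] if fields["pl"] else None

# Hypothetical lookup result: hit in block 3, page frame ABCDEh, PCD set.
sample = (0xABCDE << 12) | (1 << 11) | (0b010 << 7) | (1 << 4) | (0b11 << 2)
print(decode_tr7(sample), lookup_hit_block(sample))
```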
1-66
CHAPTER
2
2.1
Am486 MICROPROCESSOR INSTRUCTION SET
OVERVIEW
The Am486 microprocessor instruction set uses the same basic instructions as other 486-based microprocessors. Pages 2-2 and 2-3 provide a roadmap to these instructions using
functional categories. For each instruction, the roadmap lists the page on which the detailed
instruction description appears. In the detailed description section that follows the instruction roadmap, the instructions appear in alphabetical order using the roadmap name.
2.2
DETAILED INSTRUCTION DESCRIPTIONS
Note: If you are unfamiliar with the instruction notation used in this chapter, refer to Appendix
A for a detailed explanation of instructions and their use in application programming.
Instruction descriptions begin on page 2-4, using the following format:
INSTRUCTION NAME/S
General Description
Opcode
Instruction
Clocks
Concurrent
Execution*
Description
nn xx
XXX
nn
nn
Some FPU
instructions
Function
Operation
Algorithmic description using a notation similar to Algol or Pascal language.
Description
Verbal description of code operation.
[FPU] Flags Affected
Description of changes made to system flags (or FPU flags C0, C1, C2, and C3).
Numeric Exceptions (floating-point operations only)
List of possible FPU exceptions.
Protected Mode Exceptions
Description of exceptions generated in Protected Mode.
Real Address Mode Exceptions
Description of exceptions generated in Real Address Mode.
Virtual 8086 Mode Exceptions
Description of exceptions generated in Virtual 8086 Mode.
*shaded column not included for all instructions.
Am486 Microprocessor Instruction Set
2-1
Instruction Roadmap
Binary Arithmetic
AAA       2-4
AAD       2-5
AAM       2-6
AAS       2-7
ADC       2-8
ADD       2-9
CMP       2-32
DAA       2-38
DAS       2-39
DEC       2-40
DIV       2-41
IDIV      2-124
IMUL      2-125
INC       2-127
MUL       2-202
NEG       2-203
SBB       2-233
SUB       2-253
Block Structured Language
ENTER     2-42
LEAVE     2-180
Data Movement
CBW       2-25
CDQ       2-26
CWD       2-36
CWDE      2-37
MOV       2-195
POP       2-210
POPA      2-212
POPAD     2-213
PUSH      2-215
PUSHA     2-217
PUSHAD    2-218
XCHG      2-259
Data Pointer
LDS       2-178
LES       2-181
LFS       2-182
LGS       2-184
LSS       2-193
Control Transfer
CALL      2-20
IRET      2-136
IRETD     2-136
JA        2-140
JAE       2-141
JB        2-142
JBE       2-143
JC        2-144
JCXZ      2-145
JE        2-146
JECXZ     2-147
JG        2-148
JGE       2-149
JL        2-150
JLE       2-151
JMP       2-152
JNA       2-156
JNAE      2-157
JNB       2-158
JNBE      2-159
JNC       2-160
JNE       2-161
JNG       2-162
JNGE      2-163
JNL       2-164
JNLE      2-165
JNO       2-166
JNP       2-167
JNS       2-168
JNZ       2-169
JO        2-170
JP        2-171
JPE       2-172
JPO       2-173
JS        2-174
JZ        2-175
LOOP      2-191
LOOPE     2-191
LOOPNE    2-191
LOOPNZ    2-191
LOOPZ     2-191
RET       2-224
Flag Control
CLC       2-27
CLD       2-28
CLI       2-29
CLTS      2-30
CMC       2-31
LAHF      2-176
POPF      2-214
POPFD     2-214
PUSHF     2-219
PUSHFD    2-219
SAHF      2-230
STC       2-247
STD       2-248
STI       2-249
Logical Operation
AND       2-10
BSF       2-13
BSR       2-14
BT        2-16
BTC       2-17
BTR       2-18
BTS       2-19
NOT       2-205
OR        2-206
XOR       2-261
Input/Output (I/O)
IN        2-126
OUT       2-207
Interrupt Control
BOUND     2-12
INT       2-130
INTO      2-130
Protection Control
ARPL      2-11
LAR       2-177
LGDT      2-183
LIDT      2-185
LLDT      2-186
LMSW      2-187
LOCK      2-188
LSL       2-192
LTR       2-194
SGDT      2-237
SIDT      2-244
SLDT      2-245
SMSW      2-246
STR       2-252
VERR      2-255
VERW      2-255
Process Control
HLT       2-123
INVD      2-134
INVLPG    2-135
WAIT      2-256
WBINVD    2-257
Shift and Rotate
RCL       2-220
RCR       2-221
ROL       2-228
ROR       2-229
SAL       2-231
SAR       2-232
SHL       2-238
SHLD      2-239
SHR       2-241
SHRD      2-242
Miscellaneous
BSWAP     2-15
CMPXCHG   2-35
LEA       2-179
NOP       2-204
TEST      2-254
XADD      2-258
XLAT      2-260
XLATB     2-260
2-2
Set Register
SETA      2-236
SETAE     2-236
SETB      2-236
SETBE     2-236
SETC      2-236
SETE      2-236
SETG      2-236
SETGE     2-236
SETL      2-236
SETLE     2-236
SETNA     2-236
SETNAE    2-236
SETNB     2-236
SETNBE    2-236
SETNC     2-236
SETNE     2-236
SETNG     2-236
SETNGE    2-236
SETNL     2-236
SETNLE    2-236
SETNO     2-236
SETNP     2-236
SETNS     2-236
SETNZ     2-236
SETO      2-236
SETP      2-236
SETPE     2-236
SETPO     2-236
SETS      2-236
SETZ      2-236
String Operations
CMPS      2-33
CMPSB     2-33
CMPSD     2-33
CMPSW     2-33
INS       2-128
INSB      2-128
INSD      2-128
INSW      2-128
LODS      2-189
LODSB     2-189
LODSD     2-189
LODSW     2-189
MOVS      2-198
MOVSB     2-198
MOVSD     2-198
MOVSW     2-198
MOVSX     2-200
MOVZX     2-201
OUTS      2-208
OUTSB     2-208
OUTSD     2-208
OUTSW     2-208
REP       2-222
REPE      2-222
REPNE     2-222
REPNZ     2-222
REPZ      2-222
SCAS      2-234
SCASB     2-234
SCASD     2-234
SCASW     2-234
STOS      2-250
STOSB     2-250
STOSD     2-250
STOSW     2-250
Floating-Point Operations
F2XM1     2-43
FABS      2-44
FADD      2-45
FADDP     2-46
FBLD      2-47
FBSTP     2-48
FCHS      2-49
FCLEX     2-50
FCOM      2-51
FCOMP     2-52
FCOMPP    2-53
FCOS      2-54
FDECSTP   2-55
FDIV      2-56
FDIVP     2-57
FDIVR     2-58
FDIVRP    2-59
FFREE     2-60
FIADD     2-61
FICOM     2-62
FICOMP    2-63
FIDIV     2-64
FIDIVR    2-65
FILD      2-66
FIMUL     2-67
FINCSTP   2-68
FINIT     2-69
FIST      2-70
FISTP     2-71
FISUB     2-72
FISUBR    2-73
FLD1      2-75
FLD       2-74
FLDCW     2-76
FLDENV    2-77
FLDL2E    2-78
FLDL2T    2-79
FLDLG2    2-80
FLDLN2    2-81
FLDPI     2-82
FLDZ      2-83
FMUL      2-84
FMULP     2-85
FNCLEX    2-86
FNINIT    2-87
FNOP      2-88
FNSAVE    2-89
FNSTCW    2-90
FNSTENV   2-91
FNSTSW    2-92
FPATAN    2-93
FPREM     2-94
FPREM1    2-95
FPTAN     2-96
FRNDINT   2-97
FRSTOR    2-98
FSAVE     2-99
FSCALE    2-100
FSIN      2-101
FSINCOS   2-102
FSQRT     2-103
FST       2-104
FSTCW     2-105
FSTENV    2-106
FSTP      2-107
FSTSW     2-108
FSUB      2-109
FSUBP     2-110
FSUBR     2-111
FSUBRP    2-112
FTST      2-113
FUCOM     2-114
FUCOMP    2-115
FUCOMPP   2-116
FWAIT     2-117
FXAM      2-118
FXCH      2-119
FXTRACT   2-120
FYL2X     2-121
FYL2XP1   2-122
2-3
2.3
AAA
ASCII Adjusts AL after Addition
Opcode
Instruction
Clocks
Description
37
AAA
3
ASCII adjusts after addition.
Operation
IF ((AL and 0Fh) > 9) OR (AF = 1)
THEN
AL ← (AL + 6) and 0Fh;
AH ← AH + 1;
AF ← 1;
CF ← 1;
ELSE
CF ← 0;
AF ← 0;
FI
Description
Use the AAA instruction after an ADD instruction that leaves a byte result in the AL register.
The lower nibbles of the operands of the ADD instruction should be in the range 0–9 (BCD
digits). The AAA instruction adjusts the AL register to contain the correct decimal digit result.
If the addition produced a decimal carry, AAA increments the AH register and sets the Carry
and Auxiliary-carry Flags (CF and AF). If there is no decimal carry, AAA clears CF and AF
and leaves the AH register unchanged. AAA sets the top nibble of the AL register to 0. To
convert the AL register to an ASCII result, use an OR AL, 30h instruction after the AAA
instruction.
Flags Affected
For a decimal carry, AAA sets AF and CF. AAA clears AF and CF when there is no carry.
OF, SF, ZF, and PF are not affected by this instruction.
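The adjustment can be modeled in Python as a sketch. The function name and calling convention are hypothetical. The model follows the Description in clearing the top nibble of AL on both paths, whereas the Operation listing shows the mask only in the THEN branch.

```python
def aaa(al, ah, af):
    """Model AAA on 8-bit AL and AH; returns (al, ah, af, cf)."""
    if (al & 0x0F) > 9 or af:
        al = (al + 6) & 0x0F     # adjust the digit and clear the top nibble
        ah = (ah + 1) & 0xFF     # propagate the decimal carry into AH
        return al, ah, 1, 1
    return al & 0x0F, ah, 0, 0   # no decimal carry: only clear the top nibble

# ASCII '8' + '7': ADD leaves AL = 6Fh with AF = 0 (8 + 7 does not carry out of bit 3).
al, ah, af, cf = aaa(0x38 + 0x37, 0, 0)
print(ah, al, chr(al | 0x30))    # tens digit, units digit, ASCII units digit
```

Here AH:AL ends up as 1 and 5, the unpacked BCD form of 15, and OR with 30h turns the units digit back into ASCII.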
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2-4
2.4
AAD
ASCII Adjusts AX before Division
Opcode
Instruction
Clocks
Description
D5 0A
AAD
14
ASCII adjusts AX before division.
Operation
AL ← AH ⋅ 10 + AL ; 10 is decimal
AH ← 0
Description
AAD prepares two unpacked BCD digits (the least-significant digit in the AL register and
the most-significant digit in the AH register) for a division operation that yields an unpacked
result. The instruction sets the AL register to AL + (10 ⋅ AH) and then clears the AH register.
The AX register then equals the binary equivalent of the original unpacked two-digit number.
Flags Affected
The result determines the SF, ZF, and PF settings. This instruction does not affect OF, AF,
and CF.
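A minimal Python sketch of the adjustment follows; the function name is hypothetical and the flag updates are not modeled. It converts two unpacked BCD digits to binary so that an ordinary division can follow.

```python
def aad(al, ah):
    """Model AAD: AL receives AH * 10 + AL (8-bit result) and AH is cleared."""
    al = (ah * 10 + al) & 0xFF
    return al, 0

al, ah = aad(al=7, ah=3)      # unpacked BCD digits 3 and 7, that is, 37
print(al, ah)
print(al // 5, al % 5)        # a following division by 5: quotient, remainder
```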
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2-5
2.5
AAM
ASCII Adjusts AX after Multiply
Opcode
Instruction
Clocks
Description
D4 0A
AAM
15
ASCII adjusts AX after multiply.
Operation
AH ← AL / 10
AL ← AL MOD 10
Description
Use AAM only after executing the MUL instruction between two unpacked BCD operands
with the result in the AX register. Because the result is less than 100, it resides entirely in
the AL register. AAM unpacks the AL result by dividing AL by 10, leaving the quotient (most-significant digit) in AH and the remainder (least-significant digit) in AL.
Flags Affected
The result determines the SF, ZF, and PF settings. This instruction does not affect OF, AF,
and CF.
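The unpacking step can be sketched in Python as follows; the function name is hypothetical and the flag updates are not modeled.

```python
def aam(al):
    """Model AAM: returns (al, ah) with AH = AL / 10 and AL = AL MOD 10."""
    return al % 10, al // 10

product = 7 * 8               # MUL of two unpacked BCD digits leaves 56 in AL
al, ah = aam(product)
print(ah, al)                 # most-significant digit, least-significant digit
```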
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2-6
2.6
AAS
ASCII Adjusts AL after Subtraction
Opcode
Instruction
Clocks
Description
3F
AAS
3
ASCII adjusts AL after subtract.
Operation
IF ((AL and 0Fh) > 9) OR (AF = 1)
THEN
AL ← AL – 6;
AL ← AL and 0Fh;
AH ← AH – 1;
AF ← 1;
CF ← 1;
ELSE
CF ← 0;
AF ← 0;
FI
Description
Use AAS only after a SUB instruction that leaves the byte result in AL. The lower nibbles
of the operands of the SUB instruction must be in the range 0–9 (BCD). AAS adjusts AL so that it contains
the correct decimal result. If the subtraction produced a decimal carry, AAS decrements
AH and sets CF and AF. If there is no decimal carry, AAS clears CF and AF and leaves AH
unchanged. AAS sets the top nibble of AL to 0. Use OR AL, 30h after AAS to convert
AL to an ASCII result.
Flags Affected
For a decimal carry, AAS sets AF and CF. AAS clears AF and CF when there is no carry.
OF, SF, ZF, and PF are not affected by this instruction.
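The adjustment can be modeled in Python as a sketch. The function name and calling convention are hypothetical. As in the Description, the model clears the top nibble of AL on both paths.

```python
def aas(al, ah, af):
    """Model AAS on 8-bit AL and AH; returns (al, ah, af, cf)."""
    if (al & 0x0F) > 9 or af:
        al = (al - 6) & 0x0F     # adjust the digit and clear the top nibble
        ah = (ah - 1) & 0xFF     # take the decimal carry out of AH
        return al, ah, 1, 1
    return al & 0x0F, ah, 0, 0   # no decimal carry: only clear the top nibble

# Unpacked 13 (AH = 1, AL = '3') minus '8': SUB leaves AL = FBh with AF = 1.
al_after_sub = (0x33 - 0x38) & 0xFF
print(aas(al_after_sub, 1, 1))   # AL = 5, AH = 0: 13 - 8 = 5
```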
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2-7
2.7
ADC
Adds Integers with Carry
Opcode    Instruction        Clocks  Description
14 ib     ADC AL, imm8       1       Adds immediate byte to AL with carry.
15 iw     ADC AX, imm16      1       Adds immediate word to AX with carry.
15 id     ADC EAX, imm32     1       Adds immediate doubleword to EAX with carry.
80 /2 ib  ADC r/m8, imm8     1/3     Adds immediate byte to r/m byte with carry.
81 /2 iw  ADC r/m16, imm16   1/3     Adds immediate word to r/m word with carry.
81 /2 id  ADC r/m32, imm32   1/3     Adds immediate doubleword to r/m doubleword with carry.
83 /2 ib  ADC r/m16, imm8    1/3     Adds sign-extended immediate byte to r/m word with carry.
83 /2 ib  ADC r/m32, imm8    1/3     Adds sign-extended immediate byte to r/m doubleword with carry.
10 /r     ADC r/m8, r8       1/3     Adds byte register to r/m byte with carry.
11 /r     ADC r/m16, r16     1/3     Adds word register to r/m word with carry.
11 /r     ADC r/m32, r32     1/3     Adds doubleword register to r/m doubleword with carry.
12 /r     ADC r8, r/m8       1/2     Adds r/m byte to byte register with carry.
13 /r     ADC r16, r/m16     1/2     Adds r/m word to word register with carry.
13 /r     ADC r32, r/m32     1/2     Adds r/m doubleword to doubleword register with carry.
Operation
DEST ← DEST + SRC + CF
Description
ADC performs an integer addition of the two operands DEST and SRC and the Carry
Flag (CF). ADC assigns the result to DEST and sets the flags accordingly. ADC
is typically part of a multibyte or multiword addition operation. ADC sign-extends immediate
byte values to the appropriate size before adding to a word or doubleword operand.
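The multiword use of ADC can be illustrated with a small Python model. The function names are hypothetical and the operands are assumed to be unsigned 64-bit values. The low doublewords are added first with no carry in (as ADD would do), and the high doublewords are then added together with the carry out of the first step (as ADC does).

```python
def add32_with_carry(dest, src, cf=0):
    """Return (32-bit result, carry out) of DEST + SRC + CF."""
    total = dest + src + cf
    return total & 0xFFFFFFFF, total >> 32

def add64(a, b):
    """Add two unsigned 64-bit values using two 32-bit steps."""
    lo, cf = add32_with_carry(a & 0xFFFFFFFF, b & 0xFFFFFFFF)  # ADD: no carry in
    hi, cf = add32_with_carry(a >> 32, b >> 32, cf)            # ADC: carry in
    return (hi << 32) | lo, cf

result, carry = add64(0x00000001FFFFFFFF, 1)
print(hex(result), carry)
```

The carry out of the low doubleword is what turns 00000001h into 00000002h in the high doubleword.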
Flags Affected
The result determines the OF, SF, ZF, AF, CF, and PF settings.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2-8
2.8
ADD
Adds Integers
Opcode    Instruction        Clocks  Description
04 ib     ADD AL, imm8       1       Adds immediate byte to AL.
05 iw     ADD AX, imm16      1       Adds immediate word to AX.
05 id     ADD EAX, imm32     1       Adds immediate doubleword to EAX.
80 /0 ib  ADD r/m8, imm8     1/3     Adds immediate byte to r/m byte.
81 /0 iw  ADD r/m16, imm16   1/3     Adds immediate word to r/m word.
81 /0 id  ADD r/m32, imm32   1/3     Adds immediate doubleword to r/m doubleword.
83 /0 ib  ADD r/m16, imm8    1/3     Adds sign-extended immediate byte to r/m word.
83 /0 ib  ADD r/m32, imm8    1/3     Adds sign-extended immediate byte to r/m doubleword.
00 /r     ADD r/m8, r8       1/3     Adds byte register to r/m byte.
01 /r     ADD r/m16, r16     1/3     Adds word register to r/m word.
01 /r     ADD r/m32, r32     1/3     Adds doubleword register to r/m doubleword.
02 /r     ADD r8, r/m8       1/2     Adds r/m byte to byte register.
03 /r     ADD r16, r/m16     1/2     Adds r/m word to word register.
03 /r     ADD r32, r/m32     1/2     Adds r/m doubleword to doubleword register.
Operation
DEST ← DEST + SRC
Description
ADD performs an integer addition of the two operands DEST and SRC. ADD assigns the
result to DEST and sets the flags accordingly. ADD sign-extends immediate byte values to
the appropriate size before adding to a word or doubleword operand.
Flags Affected
The result determines the OF, SF, ZF, AF, CF, and PF settings.
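As an informal model of how the result determines the flags, the following Python sketch computes an ADD at a chosen operand size and derives each flag from the operands and the result. It also shows the sign extension applied to an immediate byte. The function names and the dictionary of flags are assumptions for this example. The sketch also computes AF (the carry out of bit 3), which the processor updates on an addition.

```python
def sign_extend8(imm8, size_bits):
    """Sign-extend an 8-bit immediate to size_bits (16 or 32)."""
    imm8 &= 0xFF
    if imm8 & 0x80:
        imm8 -= 0x100                      # interpret as a negative value
    return imm8 & ((1 << size_bits) - 1)

def add_flags(dest, src, size_bits=8):
    """Return (result, flags) for DEST <- DEST + SRC at the given operand size."""
    mask = (1 << size_bits) - 1
    sign = 1 << (size_bits - 1)
    dest &= mask
    src &= mask
    total = dest + src
    result = total & mask
    flags = {
        "CF": int(total > mask),                                     # unsigned carry out
        "OF": int(((dest ^ result) & (src ^ result) & sign) != 0),   # signed overflow
        "SF": int((result & sign) != 0),                             # sign bit of result
        "ZF": int(result == 0),
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),           # even parity, low byte
        "AF": int(((dest ^ src ^ result) & 0x10) != 0),              # carry out of bit 3
    }
    return result, flags
```

For example, adding 1 to 7Fh as bytes yields 80h with OF and SF set and CF clear, while adding 1 to 0FFh yields 0 with CF and ZF set and OF clear.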
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.9
AND
Logical AND Function
Opcode      Instruction         Clocks   Description
24 ib       AND AL, imm8        1        ANDs immediate byte to AL.
25 iw       AND AX, imm16       1        ANDs immediate word to AX.
25 id       AND EAX, imm32      1        ANDs immediate doubleword to EAX.
80 /4 ib    AND r/m8, imm8      1/3      ANDs immediate byte to r/m byte.
81 /4 iw    AND r/m16, imm16    1/3      ANDs immediate word to r/m word.
81 /4 id    AND r/m32, imm32    1/3      ANDs immediate doubleword to r/m doubleword.
83 /4 ib    AND r/m16, imm8     1/3      ANDs sign-extended immediate byte to r/m word.
83 /4 ib    AND r/m32, imm8     1/3      ANDs sign-extended immediate byte to r/m doubleword.
20 /r       AND r/m8, r8        1/3      ANDs byte register to r/m byte.
21 /r       AND r/m16, r16      1/3      ANDs word register to r/m word.
21 /r       AND r/m32, r32      1/3      ANDs doubleword register to r/m doubleword.
22 /r       AND r8, r/m8        1/2      ANDs r/m byte to byte register.
23 /r       AND r16, r/m16      1/2      ANDs r/m word to word register.
23 /r       AND r32, r/m32      1/2      ANDs r/m doubleword to doubleword register.
Operation
DEST ← DEST AND SRC
CF ← 0
OF ← 0
Description
AND computes the bitwise logical AND of the two operands. Each bit of the result is 1 only
if the corresponding bits of both operands are 1; otherwise the result bit is 0. The
result replaces the first operand.
Flags Affected
AND clears CF and OF. The result determines the SF, ZF, and PF settings.
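A minimal Python sketch of this behavior follows: CF and OF are forced to 0, and the remaining flags are derived from the result. The function name and the flag dictionary are assumptions made for illustration.

```python
def and_flags(dest, src, size_bits=8):
    """Return (result, flags) for DEST <- DEST AND SRC at the given operand size."""
    mask = (1 << size_bits) - 1
    result = dest & src & mask
    flags = {
        "CF": 0,                                             # always cleared
        "OF": 0,                                             # always cleared
        "SF": (result >> (size_bits - 1)) & 1,               # sign bit of result
        "ZF": int(result == 0),
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),   # even parity, low byte
    }
    return result, flags
```

A common use is masking: ANDing with a constant that has 1s only in the bit positions of interest, then testing ZF.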
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.10
ARPL
Adjusts RPL Field of Selector
Opcode      Instruction         Clocks   Description
63 /r       ARPL r/m16, r16     9/9      Adjusts RPL of r/m16 to no less than the RPL of r16.
Operation
IF RPL bits(0,1) of DEST < RPL bits(0,1) of SRC
THEN
ZF ← 1;
RPL bits(0,1) of DEST ← RPL bits(0,1) of SRC;
ELSE
ZF ← 0;
FI
Description
ARPL has two operands. The first (r/m16) is a 16-bit memory variable or word register that
contains the selector value. The second (r16) is a word register. If the RPL field (“requested
privilege level” — bits 0 and 1) of the first operand is less than the RPL field of the second
operand, ARPL sets ZF and increases the RPL field of the first operand to equal the RPL
field of the second operand. If the first operand RPL field is equal to or greater than the
second operand RPL field, ARPL clears ZF and does not change the first operand.
Typically, ARPL appears in operating system software and not application programs. Its
use guarantees that a selector parameter to a subroutine does not request a higher privilege
level than allowed to the caller. The second operand used by ARPL is normally a register
that contains the CS selector value of the caller.
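The adjustment described above can be modeled as follows. This is a sketch only; the function name arpl and the (selector, ZF) return convention are assumptions for the example.

```python
def arpl(dest_selector, src_selector):
    """Model ARPL r/m16, r16; returns (new_dest_selector, ZF)."""
    dest_rpl = dest_selector & 0x3        # RPL is bits 0 and 1 of the selector
    src_rpl = src_selector & 0x3
    if dest_rpl < src_rpl:
        # Raise the destination RPL to match the source RPL and set ZF.
        return (dest_selector & 0xFFFC) | src_rpl, 1
    # Destination RPL already equal or greater: leave it alone and clear ZF.
    return dest_selector & 0xFFFF, 0
```

In the typical operating-system use, src_selector would be the caller's CS selector, so a selector passed in by a privilege-level-3 caller ends up with an RPL of 3 regardless of what the caller supplied.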
Flags Affected
ARPL sets ZF to 1 if the first operand RPL field is less than the second operand RPL field.
ARPL resets the ZF to 0 if the first operand RPL field is greater than or equal to the second
operand RPL field.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
Invalid Opcode (6); Real Address Mode does not recognize ARPL.
Virtual 8086 Mode Exceptions
Invalid Opcode (6); Virtual 8086 Mode does not recognize ARPL. Page Fault (14) indicates
a page fault. If CPL is 3, Alignment Check (17) indicates there is an unaligned memory
reference.
2.11
BOUND
Checks Array Index Against Bounds
Opcode      Instruction         Clocks   Description
62 /r       BOUND r16, m16&16   7        Checks to see if r16 is within bounds (passes test).
62 /r       BOUND r32, m32&32   7        Checks to see if r32 is within bounds (passes test).
Operation
IF (LeftSRC < [RightSRC] OR LeftSRC > [RightSRC + OperandSize/8])
THEN BOUND Range Exceeded Exception;
FI
Description
BOUND ensures that a signed array index is within the limits specified by a block of memory
between an upper and lower bound. The register size determines whether the operation
uses words or doublewords. The first operand (from the specified register) must be greater
than or equal to the lower bound value, but not greater than the upper bound. The lower
bound value is stored at the address specified by the second operand. The upper bound
value is stored at a consecutive higher memory address (+2 for word operations; +4 for
doubleword operations). If the first operand is out of the specified bounds, BOUND generates
Interrupt 5 (BOUND Range Exceeded). The return EIP points to the BOUND instruction.
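The check can be illustrated with a short Python model in which a bytes object stands in for memory and a Python exception stands in for the BOUND Range Exceeded (5) exception. The names bound and BoundRangeExceeded, and the use of a bytes object as memory, are assumptions for this sketch.

```python
class BoundRangeExceeded(Exception):
    """Stands in for the BOUND Range Exceeded (5) exception."""

def bound(index, memory, address, size_bits=16):
    """Check a signed index against the lower/upper bounds stored at address."""
    size = size_bits // 8                      # 2 for word, 4 for doubleword
    lower = int.from_bytes(memory[address:address + size], "little", signed=True)
    upper = int.from_bytes(memory[address + size:address + 2 * size],
                           "little", signed=True)
    if index < lower or index > upper:
        raise BoundRangeExceeded(f"{index} not in [{lower}, {upper}]")
```

Both bounds are inclusive, and both are read as signed values, so a lower bound of -2 and an upper bound of 9 accept any index from -2 through 9.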
Flags Affected
None
Protected Mode Exceptions
If the test fails, BOUND generates a BOUND Range Exceeded (5) exception. General
Protection Fault (13) indicates an illegal memory-operand effective address in the code or
data segments. Stack Fault (12) indicates an illegal SS segment address. Page Fault (14)
indicates a page fault. If CPL is 3, Alignment Check (17) indicates there is an unaligned
memory reference. Invalid Opcode (6) occurs if BOUND uses a register as the second
operand.
Real Address Mode Exceptions
BOUND Range Exceeded (5) indicates the test failed. Invalid Opcode (6) indicates the
second operand is a register. General Protection Fault (13) indicates that part of the operand
lies outside the effective address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
BOUND Range Exceeded (5) indicates the test failed. Invalid Opcode (6) indicates the
second operand is a register. General Protection Fault (13) indicates that part of the operand
lies outside the effective address space: 0 to 0FFFFh. Page Fault (14) indicates a page
fault. If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
2.12
BSF
Bit Scan Forward
Opcode      Instruction         Clocks       Description
0F BC       BSF r16, r/m16      6–42/7–43    Performs a forward bit scan on r/m word.
0F BC       BSF r32, r/m32      6–42/7–43    Performs a forward bit scan on r/m doubleword.
Operation
IF r/m = 0
THEN
   ZF ← 1;
   register ← UNDEFINED;
ELSE
   temp ← 0;
   ZF ← 0;
   WHILE BIT[r/m, temp] = 0
   DO
      temp ← temp + 1;
   OD;
   register ← temp;
FI
Description
BSF scans the bits in the second word or doubleword operand starting with bit 0. If all the
bits are 0, BSF sets ZF. If any bit is not 0, BSF clears ZF and loads the destination register
with the bit index of the first set bit.
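A small Python model of the forward scan is shown below. The function name and the use of None to represent the undefined destination register are assumptions for this illustration.

```python
def bsf(value, size_bits=32):
    """Model BSF; returns (ZF, index). index is None when the source is 0."""
    value &= (1 << size_bits) - 1
    if value == 0:
        return 1, None                 # ZF set; destination register undefined
    index = 0
    while ((value >> index) & 1) == 0: # scan upward from bit 0
        index += 1
    return 0, index
```

Because the destination is undefined when the source is 0, code that uses the result should test ZF first.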
Flags Affected
ZF is set if all bits are 0. If any bit is 1, BSF clears ZF.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.13
BSR
Bit Scan Reverse
Opcode      Instruction         Clocks         Description
0F BD       BSR r16, r/m16      6–103/7–104    Performs a reverse bit scan on r/m word.
0F BD       BSR r32, r/m32      6–103/7–104    Performs a reverse bit scan on r/m doubleword.
Operation
IF r/m = 0
THEN
   ZF ← 1;
   register ← UNDEFINED;
ELSE
   temp ← OperandSize – 1;
   ZF ← 0;
   WHILE BIT[r/m, temp] = 0
   DO
      temp ← temp – 1;
   OD;
   register ← temp;
FI
Description
BSR scans the bits in the second word or doubleword operand from the most-significant
bit to the least-significant bit. If all the bits are 0, BSR sets ZF. If any bit is not 0, BSR clears
ZF and loads the destination register with the bit index of the first set bit found when scanning
in the reverse direction.
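The reverse scan can be modeled the same way, starting at the most-significant bit and moving downward. As before, the function name and the None placeholder for the undefined register are assumptions for the sketch.

```python
def bsr(value, size_bits=32):
    """Model BSR; returns (ZF, index). index is None when the source is 0."""
    value &= (1 << size_bits) - 1
    if value == 0:
        return 1, None                 # ZF set; destination register undefined
    index = size_bits - 1
    while ((value >> index) & 1) == 0: # scan downward from the top bit
        index -= 1
    return 0, index
```

For a nonzero source, the returned index is the position of the highest set bit, which is also the integer part of the base-2 logarithm of the value.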
Flags Affected
ZF is set if all bits are 0. If any bit is 1, BSR clears ZF.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.14
BSWAP
Byte Swap
Opcode      Instruction         Clocks   Description
0F C8/r     BSWAP r32           1        Swaps bytes to convert little/big endian data in a
                                         32-bit register to big/little endian form.
Operation
TEMP ← r32
r32(7..0) ← TEMP(31..24)
r32(15..8) ← TEMP(23..16)
r32(23..16) ← TEMP(15..8)
r32(31..24) ← TEMP(7..0)
Description
BSWAP reverses the byte order of a 32-bit register, converting a value in little/big endian
form to big/little endian form. Applying BSWAP to a 16-bit operand leaves an undefined
result in the destination register.
Flags Affected
None
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The BSWAP instruction is not available on 386DX or SX microprocessors. If you are
writing code that must be compatible with these systems, you must use functionally equivalent
386 code to perform this operation.
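The following Python sketch models the byte reversal and, as one possible functionally equivalent approach, a three-step sequence built from operations that exist on the 386: XCHG AL, AH, then ROL EAX, 16, then XCHG AL, AH. The manual does not prescribe a particular replacement sequence, so this choice, along with all function names here, is an assumption for illustration.

```python
def bswap32(value):
    """Model BSWAP r32: reverse the four bytes of a 32-bit value."""
    return int.from_bytes((value & 0xFFFFFFFF).to_bytes(4, "little"), "big")

def xchg_al_ah(eax):
    """Swap the two low-order bytes of EAX (the AL and AH registers)."""
    low = eax & 0xFFFF
    swapped = ((low << 8) | (low >> 8)) & 0xFFFF
    return (eax & 0xFFFF0000) | swapped

def rol32(value, count):
    """Rotate a 32-bit value left by count bits."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF

def bswap32_386(eax):
    """Byte-reverse EAX using only XCHG and ROL steps (assumed 386 sequence)."""
    eax = xchg_al_ah(eax & 0xFFFFFFFF)   # swap the low byte pair
    eax = rol32(eax, 16)                 # exchange the upper and lower words
    return xchg_al_ah(eax)               # swap the new low byte pair
```

Both functions map 12345678h to 78563412h; the second simply takes more steps to get there.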
2.15
BT
Bit Test
Opcode        Instruction         Clocks   Description
0F A3         BT r/m16, r16       3/8      Saves bit in Carry Flag.
0F A3         BT r/m32, r32       3/8      Saves bit in Carry Flag.
0F BA /4 ib   BT r/m16, imm8      3/8      Saves bit in Carry Flag.
0F BA /4 ib   BT r/m32, imm8      3/8      Saves bit in Carry Flag.
Operation
CF ← BIT[LeftSRC, RightSRC]
Description
BT saves the value of the bit indicated by the base (first operand) and the bit offset (second
operand) into CF.
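For a register base, the operation reduces to a shift and a mask. In the Python sketch below, the bit offset wraps modulo the operand size (16 or 32); for doubleword operands this matches the modulo-32 behavior described in the note for this instruction, while the modulo-16 case for word operands is an assumption of the model. The function name is likewise hypothetical.

```python
def bt(base, offset, size_bits=32):
    """Model BT with a register base; returns the value loaded into CF."""
    bit = offset % size_bits           # offset wraps within the operand
    return (base >> bit) & 1
```

BT leaves the base operand unchanged; only CF receives the selected bit.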
Flags Affected
CF contains the value of the selected bit.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Note: You can indicate the bit index by using a general register value or an immediate
8-bit constant. The operand is taken modulo 32, so the range of immediate bit offsets is
0–31. This allows you to select any bit in a word or doubleword register. For memory bit
strings, you can support longer fields by using the immediate bit offset field in combination
with the memory-operand displacement field. The Low order 3 to 5 bits of the immediate
bit offset are stored in the immediate bit offset field, and the High order 27 to 29 bits are
shifted and combined with the byte displacement in the addressing mode. When accessing
a bit in memory, you can make the processor access two (16-bit operand) or four (32-bit
operand) bytes from the starting address using:
Effective Address + ([2 or 4] ⋅ (BitOffset DIV [16 or 32]))
You may use this form even if the processor only needs to access one byte to reach the
given bit. When using this form, avoid referencing areas close to address space holes, and
in particular, avoid references to memory-mapped I/O registers. Use MOV instructions to
load from or store to these addresses, and use the register form of these instructions to
manipulate the data.
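The address calculation above can be worked through in Python. The sketch treats the bit offset as a signed value and uses floor division, so a negative offset selects a unit below the effective address; that treatment, the function names, and the bytes object standing in for memory are assumptions for this example.

```python
def bit_location(effective_address, bit_offset, size_bits=32):
    """Return (unit_address, bit_within_unit) for a memory bit-string access."""
    unit_bytes = size_bits // 8                        # 2 or 4
    unit_address = effective_address + unit_bytes * (bit_offset // size_bits)
    return unit_address, bit_offset % size_bits

def bt_memory(memory, effective_address, bit_offset, size_bits=32):
    """Read the addressed word or doubleword and return the selected bit."""
    address, bit = bit_location(effective_address, bit_offset, size_bits)
    unit = int.from_bytes(memory[address:address + size_bits // 8], "little")
    return (unit >> bit) & 1
```

With a 32-bit operand, bit offset 37 from address 1000h resolves to bit 5 of the doubleword at 1004h. The model reads the whole doubleword even though a single byte would contain the bit, which is the behavior the caution about memory-mapped I/O registers is concerned with.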
2.16
BTC
Bit Test and Complement
Opcode        Instruction         Clocks   Description
0F BB         BTC r/m16, r16      6/13     Saves bit in Carry Flag and complements it.
0F BB         BTC r/m32, r32      6/13     Saves bit in Carry Flag and complements it.
0F BA /7 ib   BTC r/m16, imm8     6/8      Saves bit in Carry Flag and complements it.
0F BA /7 ib   BTC r/m32, imm8     6/8      Saves bit in Carry Flag and complements it.
Operation
CF ← BIT[LeftSRC, RightSRC]
BIT[LeftSRC, RightSRC] ← NOT BIT[LeftSRC, RightSRC]
Description
BTC saves the value of the bit indicated by the base (first operand) and the bit offset (second
operand) into CF and complements the bit.
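A register-base model of BTC in Python is sketched below. The function name, the (CF, new value) return convention, and wrapping the offset modulo the operand size are assumptions made for the example.

```python
def btc(base, offset, size_bits=32):
    """Model BTC with a register base; returns (CF, new_base)."""
    mask = (1 << size_bits) - 1
    bit = offset % size_bits
    cf = (base >> bit) & 1                 # save the original bit in CF
    return cf, (base ^ (1 << bit)) & mask  # then invert that bit
```

Applying BTC twice with the same offset restores the original value, with CF reflecting the bit as it stood before the second toggle.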
Flags Affected
CF contains the value of the selected bit.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Note: You can indicate the bit index by using a general register value or an immediate
8-bit constant. The operand is taken modulo 32, so the range of immediate bit offsets is
0–31. This allows you to select any bit in a word or doubleword register. For memory bit
strings, you can support longer fields by using the immediate bit offset field in combination
with the memory-operand displacement field. The Low order 3 to 5 bits of the immediate
bit offset are stored in the immediate bit offset field, and the High order 27 to 29 bits are
shifted and combined with the byte displacement in the addressing mode. When accessing
a bit in memory, you can make the processor access two (16-bit operand) or four (32-bit
operand) bytes from the starting address using:
Effective Address + ([2 or 4] ⋅ (BitOffset DIV [16 or 32]))
You may use this form even if the processor only needs to access one byte to reach the
given bit. When using this form, avoid referencing areas close to address space holes, and
in particular, avoid references to memory-mapped I/O registers. Use MOV instructions to
load from or store to these addresses, and use the register form of these instructions to
manipulate the data.
2.17
BTR
Bit Test And Reset
Opcode        Instruction         Clocks   Description
0F B3         BTR r/m16, r16      6/13     Saves bit in Carry Flag and resets it.
0F B3         BTR r/m32, r32      6/13     Saves bit in Carry Flag and resets it.
0F BA /6 ib   BTR r/m16, imm8     6/8      Saves bit in Carry Flag and resets it.
0F BA /6 ib   BTR r/m32, imm8     6/8      Saves bit in Carry Flag and resets it.
Operation
CF ← BIT[LeftSRC, RightSRC]
BIT[LeftSRC, RightSRC] ← 0
Description
BTR saves the value of the bit indicated by the base (first operand) and the bit offset (second
operand) into CF and resets the bit to 0.
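A corresponding Python sketch for BTR with a register base follows. As with the other bit-test models, the function name, return convention, and modulo wrapping of the offset are assumptions.

```python
def btr(base, offset, size_bits=32):
    """Model BTR with a register base; returns (CF, new_base)."""
    mask = (1 << size_bits) - 1
    bit = offset % size_bits
    cf = (base >> bit) & 1                  # save the original bit in CF
    return cf, base & ~(1 << bit) & mask    # then clear that bit
```

A CF of 1 indicates that the bit was set before the instruction cleared it, which lets software test and clear a status bit in one step.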
Flags Affected
CF contains the value of the selected bit.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Note: You can indicate the bit index by using a general register value or an immediate
8-bit constant. The operand is taken modulo 32, so the range of immediate bit offsets is
0–31. This allows you to select any bit in a word or doubleword register. For memory bit
strings, you can support longer fields by using the immediate bit offset field in combination
with the memory-operand displacement field. The Low order 3 to 5 bits of the immediate
bit offset are stored in the immediate bit offset field, and the High order 27 to 29 bits are
shifted and combined with the byte displacement in the addressing mode. When accessing
a bit in memory, you can make the processor access two (16-bit operand) or four (32-bit
operand) bytes from the starting address using:
Effective Address + ([2 or 4] ⋅ (BitOffset DIV [16 or 32]))
You may use this form even if the processor only needs to access one byte to reach the
given bit. When using this form, avoid referencing areas close to address space holes, and
in particular, avoid references to memory-mapped I/O registers. Use MOV instructions to
load from or store to these addresses, and use the register form of these instructions to
manipulate the data.
2.18
BTS
Bit Test And Set
Opcode        Instruction         Clocks   Description
0F AB         BTS r/m16, r16      6/13     Saves bit in Carry Flag and sets it to 1.
0F AB         BTS r/m32, r32      6/13     Saves bit in Carry Flag and sets it to 1.
0F BA /5 ib   BTS r/m16, imm8     6/8      Saves bit in Carry Flag and sets it to 1.
0F BA /5 ib   BTS r/m32, imm8     6/8      Saves bit in Carry Flag and sets it to 1.
Operation
CF ← BIT[LeftSRC, RightSRC]
BIT[LeftSRC, RightSRC] ← 1
Description
BTS saves the value of the bit indicated by the base (first operand) and the bit offset (second
operand) into CF and sets the bit to 1.
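The Python sketch below models BTS with a register base and shows one way the saved bit might be used: claiming a slot in an allocation bitmap. The bitmap scenario and the names bts and claim_slot are hypothetical and serve only to illustrate the test-then-set pattern.

```python
def bts(base, offset, size_bits=32):
    """Model BTS with a register base; returns (CF, new_base)."""
    mask = (1 << size_bits) - 1
    bit = offset % size_bits
    cf = (base >> bit) & 1                 # save the original bit in CF
    return cf, (base | (1 << bit)) & mask  # then set that bit

def claim_slot(bitmap, slot):
    """Return (claimed, new_bitmap); claimed is True if the slot was free."""
    cf, new_bitmap = bts(bitmap, slot)
    return cf == 0, new_bitmap
```

If CF comes back as 1, the slot was already taken and the bitmap is unchanged in effect, since setting an already-set bit is harmless.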
Flags Affected
CF contains the value of the selected bit.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Note: You can indicate the bit index by using a general register value or an immediate
8-bit constant. The operand is taken modulo 32, so the range of immediate bit offsets is
0–31. This allows you to select any bit in a word or doubleword register. For memory bit
strings, you can support longer fields by using the immediate bit offset field in combination
with the memory-operand displacement field. The Low order 3 to 5 bits of the immediate
bit offset are stored in the immediate bit offset field, and the High order 27 to 29 bits are
shifted and combined with the byte displacement in the addressing mode. When accessing
a bit in memory, you can make the processor access two (16-bit operand) or four (32-bit
operand) bytes from the starting address using:
Effective Address + ([2 or 4] ⋅ (BitOffset DIV [16 or 32]))
You may use this form even if the processor only needs to access one byte to reach the
given bit. When using this form, avoid referencing areas close to address space holes, and
in particular, avoid references to memory-mapped I/O registers. Use MOV instructions to
load from or store to these addresses, and use the register form of these instructions to
manipulate the data.
2.19
CALL
Calls Procedure
Opcode      Instruction         Clocks          Description
E8 cw       CALL rel16          3               Calls near, displacement relative to next instruction.
FF /2       CALL r/m16          5/5             Calls near, register indirect/memory indirect.
9A cd       CALL ptr16:16       18, pm = 20     Calls far to full pointer given.
9A cd       CALL ptr16:16       pm = 35         Calls gate, same privilege.
9A cd       CALL ptr16:16       pm = 69         Calls gate, more privilege, no parameters.
9A cd       CALL ptr16:16       pm = 77+4x      Calls gate, more privilege, x parameters.
9A cd       CALL ptr16:16       pm = 37+ts*     Calls to task.
FF /3       CALL m16:16         17, pm = 20     Calls far to address at r/m word.
FF /3       CALL m16:16         pm = 35         Calls gate, same privilege.
FF /3       CALL m16:16         pm = 69         Calls gate, more privilege, no parameters.
FF /3       CALL m16:16         pm = 77+4x      Calls gate, more privilege, x parameters.
FF /3       CALL m16:16         pm = 37+ts*     Calls to task.
E8 cd       CALL rel32          3               Calls near, displacement relative to next instruction.
FF /2       CALL r/m32          5/5             Calls near, register indirect/memory indirect.
9A cp       CALL ptr16:32       18, pm = 20     Calls far to full pointer given.
9A cp       CALL ptr16:32       pm = 35         Calls gate, same privilege.
9A cp       CALL ptr16:32       pm = 69         Calls gate, more privilege, no parameters.
9A cp       CALL ptr16:32       pm = 77+4x      Calls gate, more privilege, x parameters.
9A cp       CALL ptr16:32       pm = 37+ts*     Calls to task.
FF /3       CALL m16:32         17, pm = 20     Calls far to address at r/m doubleword.
FF /3       CALL m16:32         pm = 35         Calls gate, same privilege.
FF /3       CALL m16:32         pm = 69         Calls gate, more privilege, no parameters.
FF /3       CALL m16:32         pm = 77+4x      Calls gate, more privilege, x parameters.
FF /3       CALL m16:32         pm = 37+ts*     Calls to task.
*ts = 199 for 486TSS, 180 for 286TSS, or 177 for VM TSS.
Operation
IF rel16 or rel32 type of call
THEN (* near relative call *)
   IF OperandSize = 16
   THEN
      Push(IP);
      EIP ← (EIP + rel16) AND 0000FFFFh;
   ELSE (* OperandSize = 32 *)
      Push(EIP);
      EIP ← EIP + rel32;
   FI;
FI;
IF r/m16 or r/m32 type of call
THEN (* near absolute call *)
   IF OperandSize = 16
   THEN
      Push(IP);
      EIP ← [r/m16] AND 0000FFFFh;
   ELSE (* OperandSize = 32 *)
      Push(EIP);
      EIP ← [r/m32];
   FI;
FI;
IF (PE = 0 OR (PE = 1 AND VM = 1)) (* Real Address or Virtual 8086 Mode *)
   AND operand type = [m16:16, m16:32, ptr16:16, or ptr16:32]
THEN
   IF OperandSize = 16
   THEN
      Push(CS);
      Push(IP);
   ELSE
      Push(CS);
      Push(EIP);
   FI;
   IF operand type is m16:16 or m16:32
   THEN (* indirect far call *)
      IF OperandSize = 16
      THEN
         CS:IP ← [m16:16];
         EIP ← EIP AND 0000FFFFh; (* clear upper 16 bits *)
      ELSE (* OperandSize = 32 *)
         CS:EIP ← [m16:32];
      FI;
   FI;
   IF operand type is ptr16:16 or ptr16:32
   THEN (* direct far call *)
      IF OperandSize = 16
      THEN
         CS:IP ← ptr16:16;
         EIP ← EIP AND 0000FFFFh; (* clear upper 16 bits *)
      ELSE (* OperandSize = 32 *)
         CS:EIP ← ptr16:32;
      FI;
   FI;
FI;
IF (PE = 1 AND VM = 0) (* Protected Mode, not V86 Mode *)
   AND instruction = far CALL
THEN
   If indirect, then check access of EA doubleword;
      General Protection Fault if limit violation;
   New CS selector must not be null ELSE General Protection Fault;
   Check that new CS selector index is within its descriptor table limits;
      ELSE General Protection Fault(new CS selector);
   Examine AR byte of selected descriptor for various legal values;
      depending on value:
         go to CONFORMING-CODE-SEGMENT;
         go to NONCONFORMING-CODE-SEGMENT;
         go to CALL-GATE;
         go to TASK-GATE;
         go to TASK-STATE-SEGMENT;
      ELSE General Protection Fault(code segment selector);
FI;
CONFORMING-CODE-SEGMENT:
   DPL must be ≤ CPL ELSE General Protection Fault(code segment selector);
   Segment must be present
      ELSE Segment Not Present Exception(code segment selector);
   Stack must be big enough for return address ELSE Stack Fault (12);
   Instruction pointer must be in code segment limit
      ELSE General Protection Fault;
   Load code segment descriptor into CS register;
   Load CS with new code segment selector;
   Load EIP with zero-extend(new offset);
   IF OperandSize = 16 THEN EIP ← EIP AND 0000FFFFh; FI;
NONCONFORMING-CODE-SEGMENT:
   RPL must be ≤ CPL ELSE General Protection Fault(code segment selector)
   DPL must be = CPL ELSE General Protection Fault(code segment selector)
   Segment must be present
      ELSE Segment Not Present Exception(code segment selector)
   Stack must be big enough for return address ELSE Stack Fault(0)
   Instruction pointer must be in code segment limit
      ELSE General Protection Fault
   Load code segment descriptor into CS register
   Load CS with new code segment selector
   Set RPL of CS to CPL
   Load EIP with zero-extend(new offset);
   IF OperandSize = 16 THEN EIP ← EIP AND 0000FFFFh; FI;
CALL-GATE:
   Call gate DPL must be ≥ CPL
      ELSE General Protection Fault(call gate selector)
   Call gate DPL must be ≥ RPL
      ELSE General Protection Fault(call gate selector)
   Call gate must be present
      ELSE Segment Not Present (11)(call gate selector)
   Examine code segment selector in call gate descriptor:
      Selector must not be null ELSE General Protection Fault
      Selector must be within its descriptor table limits
         ELSE General Protection Fault(code segment selector)
   AR byte of selected descriptor must indicate code
      segment ELSE General Protection Fault(code segment selector)
   DPL of selected descriptor must be ≤ CPL
      ELSE General Protection Fault(code segment selector)
   IF non-conforming code segment AND DPL < CPL
   THEN go to MORE-PRIVILEGE
   ELSE go to SAME-PRIVILEGE; FI;
MORE-PRIVILEGE:
   Get new SS selector for new privilege level from TSS
   Check selector and descriptor for new SS:
      Selector must not be null ELSE Invalid TSS Exception(0)
      Selector index must be within its descriptor
         table limits ELSE Invalid TSS Exception(SS selector)
      Selector’s RPL must equal DPL of code segment
         ELSE Invalid TSS Exception(SS selector)
      Stack segment DPL must equal DPL of code
         segment ELSE Invalid TSS Exception(SS selector)
      Descriptor must indicate writable data segment
         ELSE Invalid TSS Exception(SS selector)
      Segment present ELSE Stack Fault(SS selector)
   IF OperandSize = 32
   THEN
      New stack must have room for parameters plus 16 bytes
         ELSE Invalid TSS Exception(SS selector)
      EIP must be in code segment limit ELSE General Protection Fault
      Load new SS:eSP value from TSS
      Load new CS:EIP value from gate
   ELSE
      New stack must have room for parameters plus 8 bytes
         ELSE Stack Fault (12)(SS selector)
      IP must be in code segment limit ELSE General Protection Fault
      Load new SS:eSP value from TSS
      Load new CS:IP value from gate;
   FI;
   Load CS descriptor
   Load SS descriptor
   Push long pointer of old stack onto new stack
   Get word count from call gate, mask to 5 bits
   Copy parameters from old stack onto new stack
   Push return address onto new stack
   Set CPL to stack segment DPL
   Set RPL of CS to CPL
SAME-PRIVILEGE:
   IF OperandSize = 32
   THEN
      Stack must have room for 6-byte return address (padded to 8 bytes)
         ELSE Stack Fault
      EIP must be within code segment limit ELSE General Protection Fault
      Load CS:EIP from gate
   ELSE
      Stack must have room for 4-byte return address ELSE Stack Fault
      IP must be within code segment limit ELSE General Protection Fault
      Load CS:IP from gate
   FI;
   Push return address onto stack
   Load code segment descriptor into CS register
   Set RPL of CS to CPL
TASK-GATE:
   Task gate DPL must be ≥ CPL ELSE Invalid TSS Exception(gate selector)
   Task gate DPL must be ≥ RPL ELSE Invalid TSS Exception(gate selector)
   Task Gate must be present
      ELSE Segment Not Present Exception(gate selector)
   Examine selector to TSS, given in Task Gate descriptor:
      Must specify global in the local/global bit
         ELSE Invalid TSS Exception(TSS selector)
      Index must be within GDT limits
         ELSE Invalid TSS Exception(TSS selector)
      TSS descriptor AR byte must specify nonbusy TSS
         ELSE Invalid TSS Exception(TSS selector)
      Task State Segment must be present
         ELSE Segment Not Present (11)(TSS selector)
   SWITCH-TASKS (with nesting) to TSS
   IP must be in code segment limit ELSE Invalid TSS Exception(0)
TASK-STATE-SEGMENT:
   TSS DPL must be ≥ CPL ELSE Invalid TSS Exception(TSS selector)
   TSS DPL must be ≥ RPL ELSE Invalid TSS Exception(TSS selector)
   TSS descriptor AR byte must specify available TSS
      ELSE Invalid TSS Exception(TSS selector)
   Task State Segment must be present
      ELSE Segment Not Present (11)(TSS selector)
   SWITCH-TASKS (with nesting) to TSS
   IP must be in code segment limit ELSE Invalid TSS Exception(0)
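The privilege checks in the CALL-GATE steps above can be condensed into a short Python sketch. It covers only the DPL, CPL, and RPL comparisons and the 5-bit word count mask; the presence, null-selector, and limit checks are omitted. The function names, the exception class, and the argument layout are assumptions made for this illustration.

```python
class GeneralProtectionFault(Exception):
    """Stands in for General Protection Fault (13)."""

def call_gate_transfer(cpl, rpl, gate_dpl, code_dpl, conforming):
    """Return 'MORE-PRIVILEGE' or 'SAME-PRIVILEGE' for a call through a gate."""
    # The gate must be accessible from both the CPL and the selector's RPL.
    if gate_dpl < cpl or gate_dpl < rpl:
        raise GeneralProtectionFault("call gate selector")
    # The target code segment may not be less privileged than the caller.
    if code_dpl > cpl:
        raise GeneralProtectionFault("code segment selector")
    # Only a nonconforming, numerically lower DPL changes the privilege level.
    if not conforming and code_dpl < cpl:
        return "MORE-PRIVILEGE"
    return "SAME-PRIVILEGE"

def parameter_count(word_count_field):
    """Mask the call gate word count to 5 bits, giving 0 to 31 parameters."""
    return word_count_field & 0x1F
```

Note that numerically smaller privilege levels are more privileged, so a gate with DPL 3 is reachable from any CPL, while a call from CPL 3 to a nonconforming segment with DPL 0 takes the MORE-PRIVILEGE path and switches stacks.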
Description
CALL exits the current instruction sequence and executes the procedure named in the
operand. A return at the end of the CALLed procedure exits the procedure and starts
execution at the instruction following the CALL instruction.
A CALL with a destination of r/m16, r/m32, rel16, or rel32 is a near CALL. It stays within
the current code segment and leaves the CS register unchanged. The CALL rel16 and CALL
rel32 forms add a signed offset to the address of the next instruction to determine the
destination. The rel16 form applies when the operand-size attribute of the CALL instruction
is 16 bits (word); the rel32 form applies when the operand-size attribute is 32 bits
(doubleword). CALL stores the result in the 32-bit EIP register. With rel16,
CALL clears the upper word of the EIP register, resulting in an offset whose value does not
exceed 16 bits. CALL r/m16 and CALL r/m32 specify a register or memory location from
which the absolute segment offset is fetched. CALL r/m16 fetches a 16-bit offset for a word
operand; CALL r/m32 fetches a 32-bit offset for a doubleword operand. CALL pushes the
offset of the next instruction in sequence onto the stack. The near RET instruction in the
procedure pops the instruction offset when it returns control.
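The near relative forms can be modeled as follows. In this Python sketch a list stands in for the stack and eip_next is the offset of the instruction following the CALL; both, along with the function name, are assumptions for the example.

```python
def call_near_relative(eip_next, rel, stack, operand_size=32):
    """Model CALL rel16/rel32; pushes the return offset and returns the new EIP."""
    if operand_size == 16:
        stack.append(eip_next & 0xFFFF)          # Push(IP)
        return (eip_next + rel) & 0x0000FFFF     # upper word of EIP is cleared
    stack.append(eip_next & 0xFFFFFFFF)          # Push(EIP)
    return (eip_next + rel) & 0xFFFFFFFF
```

Because the displacement is signed, a negative rel value transfers control backward, and with a 16-bit operand size the target wraps within the low 64 Kbytes of the segment.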
The far calls, CALL ptr16:16 and CALL ptr16:32, use a 4-byte or 6-byte operand as a long
pointer to the called procedure. The CALL m16:16 and m16:32 forms fetch the long pointer
from the memory location specified (indirection). In Real Address Mode or Virtual 8086
Mode, the long pointer provides 16 bits for the CS register and 16 or 32 bits for the EIP
register (depending on the operand-size attribute). These forms of the instruction push both
the CS and IP or EIP registers as a return address.
In Protected Mode, both long pointer forms consult the AR byte in the descriptor indexed
by the selector part of the long pointer. Depending on the value of the AR byte, the call will
perform one of the following types of control transfers:
• A far call to the same protection level
• An inter-protection level far call
• A task switch
A CALL-indirect-through-memory, using the stack pointer (ESP) as a base register, references memory before the CALL. The base is the value of the ESP before the instruction
executes.
Flags Affected
All flags are affected if a task switch occurs; no flags are affected if a task switch does not
occur.
Protected Mode Exceptions
For far calls: General Protection Fault (13), Segment Not Present (11), Stack Fault (12),
and Invalid TSS (10), as indicated in Appendix A.
For near direct calls: General Protection Fault (13) if procedure location is beyond the code
segment limits; Stack Fault (12) if pushing the return address exceeds the bounds of the
stack segment; Page Fault Exception (14) for a page fault; Alignment Check (17) for unaligned memory reference if the current privilege level is 3.
For a near indirect call: General Protection Fault (13) for an illegal memory-operand effective
address in the code or data segments; Stack Fault (12) for an illegal SS segment address;
General Protection Fault (13) if the indirect offset obtained is beyond the code segment
limits; Page Fault Exception (14) for a page fault; Alignment Check (17) for unaligned
memory reference if the current privilege level is 3.
Real Address Mode Exceptions
General Protection Fault (13) if any part of the operand would lie outside of the effective
address space from 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) if any part of the operand would lie outside of the effective
address space from 0 to 0FFFFh. Page Fault Exception (14) for a page fault; Alignment
Check (17) for unaligned memory reference if the current privilege level is 3.
Note: Any far call from a 32-bit code segment to a 16-bit code segment should be made
from the first 64 Kbytes of the 32-bit code segment, because the operand-size attribute of
the instruction is set to 16, allowing only a 16-bit return address offset to be saved.
2.20
CBW
Converts Byte to Word
Opcode    Instruction    Clocks    Description
98        CBW            3         AX ← sign-extend of AL
Operation
IF OperandSize = 16
THEN AX ← SignExtend (AL)
Description
The CBW instruction converts the signed byte in the AL register to a signed word in the AX
register by extending the most-significant bit of the AL register (the sign bit) into all of the
bits of the AH register.
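As a rough illustration of the sign extension just described, the following Python sketch models CBW on plain integers. The function name cbw is an assumption for the example; register values are treated as unsigned bit patterns.

```python
def cbw(al):
    """Model of CBW: return the AX bit pattern produced from AL."""
    al &= 0xFF
    # The sign bit of AL (bit 7) is copied into every bit of AH.
    ah = 0xFF if al & 0x80 else 0x00
    return (ah << 8) | al
```

A non-negative byte such as 7Fh leaves AH clear, while 80h (the byte value –128) becomes FF80h (the word value –128).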
Flags Affected
None
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2.21
CDQ
Converts Doubleword to Quadword
Opcode    Instruction    Clocks    Description
99        CDQ            3         EDX:EAX ← sign-extend of EAX
Operation
IF OperandSize = 32
THEN
   IF EAX < 0
   THEN EDX ← 0FFFFFFFFh;
   ELSE EDX ← 0;
   FI;
FI
Description
The CDQ instruction converts the signed doubleword in the EAX register to a signed 64-bit
integer in the register pair EDX:EAX by extending the most-significant bit of the EAX
register (the sign bit) into all the bits of the EDX register.
Flags Affected
None
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2.22
CLC
Clears Carry Flag
Opcode    Instruction    Clocks    Description
F8        CLC            2         Clears Carry Flag.
Operation
CF ← 0
Description
CLC clears CF. It does not affect other flags or registers.
Flags Affected
CF is cleared.
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2.23
CLD
Clears Direction Flag
Opcode    Instruction    Clocks    Description
FC        CLD            2         Clears Direction Flag to make the Source Index (SI or ESI) and/or
                                   the Destination Index (DI or EDI) Registers increment.
Operation
DF ← 0
Description
The CLD instruction clears the Direction Flag, causing all subsequent string operations to
increment the index registers on which they operate: SI and/or DI (16-bit address size) or
ESI and/or EDI (32-bit address size).
Flags Affected
DF is cleared. No other flags or registers are affected.
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2.24
CLI
Clears Interrupt-Enable Flag
Opcode    Instruction    Clocks    Description
FA        CLI            5         Clears Interrupt-enable Flag: maskable interrupts disabled.
Operation
IF ← 0
Description
The CLI instruction clears IF if the current privilege level is at least as privileged as IOPL.
No other flags are affected. External interrupts are not recognized at the end of the CLI
instruction or from that point on until the IF flag is set.
Flags Affected
IF is cleared.
Protected Mode Exceptions
General Protection Fault (13) if the current privilege level is greater (has less privilege) than
the I/O privilege level in the FLAGS register. The I/O privilege level specifies the least
privileged level at which I/O can be performed.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
General Protection Fault (13) if the current privilege level is greater (has less privilege) than
the I/O privilege level in the FLAGS register. The I/O privilege level specifies the least
privileged level at which I/O can be performed.
2.25
CLTS
Clears Task-Switched Flag in CR0
Opcode    Instruction    Clocks    Description
0F 06     CLTS           7         Clears Task-Switched flag.
Operation
TS Flag in CR0 ← 0
Description
The CLTS instruction clears the Task-Switched (TS) flag in the CR0 register. This flag is
set by the microprocessor every time a task switch occurs. The TS flag is used to manage
microprocessor extensions as follows:
• Every execution of an ESC instruction is trapped if the TS flag is set.
• Execution of a WAIT instruction is trapped if the MP flag and the TS flag are both set.
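The two trapping rules in the list above can be written as a small predicate. This is a sketch only: the function name traps_on_ts and the string labels "ESC" and "WAIT" are hypothetical conveniences, and the model covers only the TS and MP conditions stated here.

```python
def traps_on_ts(instruction, ts, mp):
    """Return True if the instruction is trapped under the TS-flag rules.

    instruction -- "ESC" or "WAIT" (hypothetical labels for this sketch)
    ts, mp      -- values of the TS and MP flags in CR0 (0 or 1)
    """
    if instruction == "ESC":
        return ts == 1                  # every ESC traps while TS is set
    if instruction == "WAIT":
        return ts == 1 and mp == 1      # WAIT traps only if MP and TS are both set
    raise ValueError("unknown instruction label")
```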
If a task switch occurs after an ESC instruction begins execution, you may need to save
the floating-point unit’s context before issuing a new ESC instruction. The fault handler
saves the context and clears the TS flag.
The CLTS instruction appears in operating system software, not in application programs.
It is a privileged instruction that only executes at privilege level 0.
Flags Affected
The TS flag is cleared (the TS flag is in the CR0 register, not the FLAGS or EFLAGS register).
Protected Mode Exceptions
General Protection Fault (13) if the CLTS instruction is executed with a current privilege
level other than 0.
Real Address Mode Exceptions
None (valid in Real Address Mode to allow initialization for Protected Mode).
Virtual 8086 Mode Exceptions
General Protection Fault (13) if the CLTS instruction is executed with a current privilege
level other than 0.
2.26
CMC
Complements Carry Flag
Opcode    Instruction    Clocks    Description
F5        CMC            2         Complements the Carry Flag.
Operation
CF ← NOT CF
Description
The CMC instruction reverses the setting of CF. No other flags are affected.
Flags Affected
CF contains the complement of its original value.
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2.27
CMP
Compares Two Operands
Opcode      Instruction         Clocks   Description
3C ib       CMP AL,imm8         1        Compares immediate byte to AL.
3D iw       CMP AX,imm16        1        Compares immediate word to AX.
3D id       CMP EAX,imm32       1        Compares immediate doubleword to EAX.
80 /7 ib    CMP r/m8,imm8       1/2      Compares immediate byte to r/m byte.
81 /7 iw    CMP r/m16,imm16     1/2      Compares immediate word to r/m word.
81 /7 id    CMP r/m32,imm32     1/2      Compares immediate doubleword to r/m doubleword.
83 /7 ib    CMP r/m16,imm8      1/2      Compares sign-extended immediate byte to r/m word.
83 /7 ib    CMP r/m32,imm8      1/2      Compares sign-extended immediate byte to r/m doubleword.
38 /r       CMP r/m8,r8         1/2      Compares byte register to r/m byte.
39 /r       CMP r/m16,r16       1/2      Compares word register to r/m word.
39 /r       CMP r/m32,r32       1/2      Compares doubleword register to r/m doubleword.
3A /r       CMP r8,r/m8         1/2      Compares r/m byte to byte register.
3B /r       CMP r16,r/m16       1/2      Compares r/m word to word register.
3B /r       CMP r32,r/m32       1/2      Compares r/m doubleword to doubleword register.
Operation
LeftSRC – SignExtend(RightSRC);
(* CMP does not store a result; its purpose is to set the flags *)
Description
CMP subtracts the second operand from the first, but does not store the result; CMP only
changes the flag settings. The CMP instruction is typically used in conjunction with conditional jumps and the conditional SET instructions. (Refer to Appendix D for the list of signed
and unsigned flag tests provided.) If an operand greater than one byte is compared to an
immediate byte, the byte value is first sign-extended.
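To make the flag behavior concrete, here is a minimal Python sketch of the subtraction CMP performs and the six flags it reports. The names cmp_flags and sign_extend_imm8 are assumptions for the example; the flag rules used are the standard ones for an x86 subtraction, and the model stores no result, just as CMP does not.

```python
def sign_extend_imm8(imm8, bits):
    """Sign-extend an immediate byte to a 16- or 32-bit operand."""
    imm8 &= 0xFF
    signed = imm8 - 0x100 if imm8 & 0x80 else imm8
    return signed & ((1 << bits) - 1)


def cmp_flags(left, right, bits):
    """Return the flags set by CMP left,right for an operand of `bits` bits."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    left &= mask
    right &= mask
    result = (left - right) & mask      # computed, then discarded by CMP
    return {
        "CF": int(left < right),                                    # unsigned borrow
        "ZF": int(result == 0),
        "SF": int(bool(result & sign)),
        "OF": int(bool((left ^ right) & (left ^ result) & sign)),   # signed overflow
        "AF": int((left & 0xF) < (right & 0xF)),                    # borrow out of bit 3
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),          # even parity, low byte
    }
```

A conditional jump that follows would test these flags; for instance, an unsigned "below" test looks at CF, while a signed "less" test compares SF with OF.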
Flags Affected
The result determines the OF, SF, ZF, AF, PF, and CF settings.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.28
CMPS/CMPSB/CMPSD/CMPSW
Compares Two String Operands
Opcode    Instruction      Clocks   Description
A6        CMPS m8,m8       8        Compares bytes ES:(E)DI (second operand) with (E)SI (first operand).
A7        CMPS m16,m16     8        Compares words ES:(E)DI (second operand) with (E)SI (first operand).
A7        CMPS m32,m32     8        Compares doublewords ES:(E)DI (second operand) with
                                    (E)SI (first operand).
A6        CMPSB            8        Compares bytes ES:(E)DI with DS:(E)SI.
A7        CMPSD            8        Compares doublewords ES:(E)DI with DS:(E)SI.
A7        CMPSW            8        Compares words ES:(E)DI with DS:(E)SI.
Operation
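IF OperandSize = 8 (* byte *)
THEN
   [source-index] – [destination-index]; (* compare bytes; only the flags change *)
   IF DF = 0 THEN IncDec ← 1 ELSE IncDec ← –1; FI;
FI;
IF OperandSize = 16 (* word *)
THEN
   [source-index] – [destination-index]; (* compare words; only the flags change *)
   IF DF = 0 THEN IncDec ← 2 ELSE IncDec ← –2; FI;
FI;
IF OperandSize = 32 (* doubleword *)
THEN
   [source-index] – [destination-index]; (* compare doublewords; only the flags change *)
   IF DF = 0 THEN IncDec ← 4 ELSE IncDec ← –4; FI;
FI;
source-index ← source-index + IncDec;
destination-index ← destination-index + IncDec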
Note: If AddressSize = 16, SI = Source Index and DI = Destination Index.
If AddressSize = 32, ESI = Source Index and EDI = Destination Index.
Description
CMPS compares the byte, word, or doubleword pointed to by the SI (16-bit address size)
or ESI (32-bit address size) register with the byte, word, or doubleword pointed to by the DI
(16-bit address size) or EDI (32-bit address size) register. You must preload the registers
before executing CMPS.
CMPS subtracts the (E)DI indexed operand from the (E)SI indexed operand. This is the
reverse of the usual AMD convention in which the left operand is the destination and the
right operand is the source. No result is stored; only the flags reflect the change. The
operand size determines whether bytes, words, or doublewords are compared. The first
operand (SI or ESI) uses the DS register unless a segment override byte is present. The
second operand (DI or EDI) must be addressable from the ES register; no segment override
is possible. After the comparison, both the source-index register and the destination-index
register are automatically advanced. If DF is 0, the registers increment according to the
operand size (byte = 1; word = 2; doubleword = 4); if DF is 1, the registers decrement.
CMPSB, CMPSD, and CMPSW instructions are synonymous with the byte, doubleword,
and word CMPS instructions, respectively.
The CMPS instruction can be preceded by the REPE or REPNE prefix for block comparison
of CX or ECX bytes, words, or doublewords. Refer to the description of the REP instruction
for more information on this operation.
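The compare-and-advance behavior, including a REPE-style block comparison, can be sketched as follows. This is a simplified model under stated assumptions: memory is a flat Python bytes object, segment registers are ignored, and the names cmps_step and repe_cmps are invented for the example.

```python
def cmps_step(mem, si, di, size, df):
    """One CMPS iteration; return (zf, cf, new_si, new_di).

    mem  -- flat bytes-like memory (segments are ignored in this sketch)
    size -- operand size in bytes: 1, 2, or 4
    df   -- Direction Flag (0 = increment, 1 = decrement)
    """
    first = int.from_bytes(mem[si:si + size], "little")    # (E)SI operand
    second = int.from_bytes(mem[di:di + size], "little")   # (E)DI operand
    step = -size if df else size
    # CMPS subtracts the (E)DI operand from the (E)SI operand.
    return int(first == second), int(first < second), si + step, di + step


def repe_cmps(mem, si, di, count, size=1, df=0):
    """Repeat CMPS while equal; return (remaining_count, si, di, zf).

    zf is None when count is 0, because no comparison is performed.
    """
    zf = None
    while count:
        zf, _cf, si, di = cmps_step(mem, si, di, size, df)
        count -= 1
        if not zf:          # REPE stops at the first mismatch
            break
    return count, si, di, zf
```

Note that the index registers have already moved past the mismatching element when the loop stops, so code that needs its address must step back by one element.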
Flags Affected
OF, SF, ZF, AF, PF, and CF are set according to the result.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.29
CMPXCHG
Compares And Exchanges
Opcode     Instruction           Clocks                      Description
0F B0 /r   CMPXCHG r/m8,r8       6/7 if equal; 6/10 if not   Compares AL with r/m byte. If equal, sets ZF and loads byte
                                                             register into r/m byte; otherwise, clears ZF and loads r/m
                                                             byte into AL.
0F B1 /r   CMPXCHG r/m16,r16     6/7 if equal; 6/10 if not   Compares AX with r/m word. If equal, sets ZF and loads word
                                                             register into r/m word; otherwise, clears ZF and loads r/m
                                                             word into AX.
0F B1 /r   CMPXCHG r/m32,r32     6/7 if equal; 6/10 if not   Compares EAX with r/m doubleword. If equal, sets ZF and
                                                             loads doubleword register into r/m doubleword; otherwise,
                                                             clears ZF and loads r/m doubleword into EAX.
Operation
IF accumulator = DEST
THEN
   ZF ← 1;
   DEST ← SRC;
ELSE
   ZF ← 0;
   accumulator ← DEST;
FI
Description
CMPXCHG compares the accumulator (AL, AX, or EAX register) with DEST. If they are
equal, SRC is loaded into DEST. Otherwise, DEST is loaded into the accumulator.
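The compare-and-exchange rule can be expressed as a small pure function. This sketch is only a model of the data movement: the name cmpxchg and the tuple it returns are assumptions for the example, and bus locking is not represented.

```python
def cmpxchg(accumulator, dest, src):
    """Return (accumulator, dest, zf) after CMPXCHG dest,src."""
    if accumulator == dest:
        return accumulator, src, 1      # equal: SRC is loaded into DEST, ZF set
    return dest, dest, 0                # not equal: DEST is loaded into the accumulator
```

A typical use is an update loop: load the old value into the accumulator, compute the new value, execute CMPXCHG, and retry if ZF is clear, since the accumulator then already holds the value another agent wrote.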
Flags Affected
CF, PF, AF, SF, and OF are affected as if a CMP instruction had been executed with DEST
and the accumulator as operands. ZF is set if the destination operand and the accumulator
are equal; otherwise it is cleared.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Note: This instruction can be used with a LOCK prefix. In order to simplify the interface to
the microprocessor’s bus, the destination operand receives a write cycle without regard to
the result of the comparison. DEST is written back if the comparison fails, and SRC is
written into the destination otherwise. (The microprocessor never produces a locked read
without also producing a locked write.) This instruction is not supported by 386 processors.
2.30
CWD
Converts Word to Doubleword Using DX:AX Register Pair
Opcode    Instruction    Clocks    Description
99        CWD            3         DX:AX ← sign-extend of AX
Operation
IF OperandSize = 16
THEN
   IF AX < 0
   THEN DX ← 0FFFFh;
   ELSE DX ← 0;
   FI;
FI
Description
The CWD instruction converts the signed word in the AX register to a signed doubleword
in the DX:AX register pair by extending the most-significant bit of the AX register into all
the bits of the DX register.
Flags Affected
None
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2.31
CWDE
Converts Word to Doubleword Using EAX Register
Opcode    Instruction    Clocks    Description
98        CWDE           3         EAX ← sign-extend of AX
Operation
IF OperandSize = 32
THEN EAX ← SignExtend (AX)
Description
The CWDE instruction converts the signed word in the AX register to a doubleword in the
EAX register by extending the most-significant bit of the AX register into the two
most-significant bytes of the EAX register.
Note: The CWDE instruction is different from the CWD instruction. The CWD instruction
uses the DX:AX register pair rather than the EAX register as a destination.
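The difference between the two destinations can be seen side by side in a short Python sketch. The function names cwd and cwde are assumptions for the example, and registers are modeled as unsigned bit patterns.

```python
def cwd(ax):
    """Model of CWD: return (DX, AX); AX itself is unchanged."""
    ax &= 0xFFFF
    dx = 0xFFFF if ax & 0x8000 else 0x0000   # sign bit of AX fills DX
    return dx, ax


def cwde(ax):
    """Model of CWDE: return EAX with the sign of AX extended into its upper word."""
    ax &= 0xFFFF
    return ax | (0xFFFF0000 if ax & 0x8000 else 0)
```

Both produce the same 32-bit value for a given AX; only the place where the upper 16 bits land differs (DX for CWD, the upper half of EAX for CWDE).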
Flags Affected
None
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2.32
DAA
Decimal Adjusts AL after Addition
Opcode    Instruction    Clocks    Description
27        DAA            2         Decimal adjusts AL after addition.
Operation
IF ((AL AND 0Fh) > 9) OR (AF = 1)
THEN
AL ← AL + 6;
AF ← 1;
ELSE
AF ← 0;
FI;
IF (AL > 9Fh) OR (CF = 1)
THEN
AL ← AL + 60h;
CF ← 1;
ELSE CF ← 0;
FI
Description
Execute the DAA instruction only after executing an ADD instruction that leaves a
two-BCD-digit byte result in the AL register. The ADD operands should consist of two packed
BCD digits. The DAA instruction adjusts the AL register to contain the correct two-digit
packed decimal result.
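The adjustment can be traced with a short Python model that follows the Operation listing step by step. The names daa and add_packed_bcd are assumptions for the example; the helper derives AF and CF the way a byte ADD would before handing the sum to the adjustment.

```python
def daa(al, cf=0, af=0):
    """Model of DAA following the Operation listing; return (AL, CF, AF)."""
    if (al & 0x0F) > 9 or af:
        al += 6
        af = 1
    else:
        af = 0
    if al > 0x9F or cf:
        al += 0x60
        cf = 1
    else:
        cf = 0
    return al & 0xFF, cf, af


def add_packed_bcd(a, b):
    """ADD two packed-BCD bytes, then apply DAA; return (AL, CF, AF)."""
    total = a + b
    af = int((a & 0x0F) + (b & 0x0F) > 0x0F)   # carry out of the low nibble
    cf = int(total > 0xFF)                     # carry out of the byte
    return daa(total & 0xFF, cf, af)
```

For example, adding 38h and 45h gives the binary sum 7Dh, which the adjustment turns into 83h, the packed form of decimal 83. Adding 99h and 01h yields 00h with CF set, representing 100.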
Flags Affected
AF and CF are set if there is a decimal carry, cleared if there is no decimal carry; SF, ZF
and PF are set according to the result. OF is undefined.
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2.33
DAS
Decimal Adjusts AL after Subtraction
Opcode    Instruction    Clocks    Description
2F        DAS            2         Decimal adjusts AL after subtraction.
Operation
IF (AL AND 0Fh) > 9 OR AF = 1
THEN
AL ← AL – 6;
AF ← 1;
ELSE
AF ← 0;
FI;
IF (AL > 9Fh) OR (CF = 1)
THEN
AL ← AL – 60h;
CF ← 1;
ELSE CF ← 0;
FI
Description
Execute the DAS instruction only after a subtraction instruction that leaves a two-BCD digit
byte result in the AL register. The operands should consist of two packed BCD digits. The
DAS instruction adjusts the AL register to contain the correct packed two-digit decimal
result.
Flags Affected
AF and CF are set if there is a decimal carry, cleared if there is no decimal carry; SF, ZF
and PF are set according to the result. OF is undefined.
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2.34
DEC
Decrements by 1
Opcode     Instruction    Clocks   Description
FE /1      DEC r/m8       1/3      Decrements r/m byte by 1.
FF /1      DEC r/m16      1/3      Decrements r/m word by 1.
FF /1      DEC r/m32      1/3      Decrements r/m doubleword by 1.
48 + rw    DEC r16        1        Decrements word register by 1.
48 + rd    DEC r32        1        Decrements doubleword register by 1.
Operation
DEST ← DEST – 1
Description
The DEC instruction subtracts 1 from the operand. The DEC instruction does not change
CF. To affect CF, use the SUB instruction with an immediate operand of 1.
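The practical consequence of DEC leaving CF alone shows up when the operand wraps from 0. The following sketch contrasts the two instructions; the names dec and sub1 are assumptions for the example, and only CF is modeled.

```python
def dec(value, bits, cf):
    """Model of DEC: return (result, CF). CF is passed through unchanged."""
    mask = (1 << bits) - 1
    return (value - 1) & mask, cf


def sub1(value, bits):
    """Model of SUB value,1: return (result, CF). CF reports the borrow."""
    mask = (1 << bits) - 1
    value &= mask
    return (value - 1) & mask, int(value == 0)
```

This is why a loop counter can be decremented with DEC inside a multi-precision ADC or SBB loop without disturbing the carry chain.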
Flags Affected
OF, SF, ZF, AF, and PF are set according to the result.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.35
DIV
Unsigned Divide
Opcode    Instruction       Clocks   Description
F6 /6     DIV AL,r/m8       16/16    Unsigned division of AX by r/m byte (AL = Quo, AH = Rem).
F7 /6     DIV AX,r/m16      24/24    Unsigned division of DX:AX by r/m word (AX = Quo, DX = Rem).
F7 /6     DIV EAX,r/m32     40/40    Unsigned division of EDX:EAX by r/m doubleword (EAX = Quo,
                                     EDX = Rem).
Operation
temp ← dividend / divisor;
IF temp does not fit in quotient
THEN Divide By Zero Exception 0;
ELSE
quotient ← temp;
remainder ← dividend MOD (r/m);
FI
Note: Divisions are unsigned. The divisor is given by the r/m operand. The dividend,
quotient, and remainder use implicit registers. Refer to the table under ‘Description.’
Description
The DIV instruction performs an unsigned division. The dividend is implicit; only the divisor
is given as an operand. The remainder is always less than the divisor. The type of the
divisor determines which registers to use as follows:
Size          Dividend    Divisor    Quotient    Remainder
byte          AX          r/m8       AL          AH
word          DX:AX       r/m16      AX          DX
doubleword    EDX:EAX     r/m32      EAX         EDX
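The division rule and its failure cases can be sketched in Python as follows. This is an illustrative model: the exception class DivideError and the function div are names invented for the example, and the dividend is passed as one integer rather than as a register pair.

```python
class DivideError(Exception):
    """Stands in for the Divide By Zero Exception (0) in this sketch."""


def div(dividend, divisor, size):
    """Model of DIV; return (quotient, remainder).

    dividend -- the full implicit dividend (AX, DX:AX, or EDX:EAX) as one integer
    divisor  -- the r/m operand
    size     -- divisor size in bits: 8, 16, or 32
    """
    if divisor == 0:
        raise DivideError("divisor is 0")
    quotient, remainder = divmod(dividend, divisor)
    if quotient >= 1 << size:
        # The quotient must fit in AL, AX, or EAX.
        raise DivideError("quotient does not fit in the quotient register")
    return quotient, remainder
```

For example, dividing AX = 0123h by 10h gives AL = 12h and AH = 03h, while dividing AX = 1000h by 10h raises the exception because the quotient 100h does not fit in AL.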
Flags Affected
OF, SF, ZF, AF, PF, and CF are undefined.
Protected Mode Exceptions
Divide By Zero Exception 0 if the quotient is too big to fit in the designated register (AL,
AX, or EAX), or if the divisor is 0. General Protection Fault (13) indicates either that the
result is in a non-writable segment or there is an illegal memory-operand effective address
in the code or data segments. Stack Fault (12) indicates an illegal SS segment address.
Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates there is
an unaligned memory reference.
Real Address Mode Exceptions
Divide By Zero Exception 0 if the quotient is too big to fit in the designated register (AL,
AX, or EAX), or if the divisor is 0. General Protection Fault (13) indicates that part of the
operand lies outside the effective address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
Divide By Zero Exception 0 if the quotient is too big to fit in the designated register (AL,
AX, or EAX), or if the divisor is 0. General Protection Fault (13) indicates that part of the
operand lies outside the effective address space: 0 to 0FFFFh. Page Fault (14) indicates
a page fault. If CPL is 3, Alignment Check (17) indicates there is an unaligned memory
reference.
2.36
ENTER
Makes Stack Frame for Procedure
Opcode      Instruction          Clocks     Description
C8 iw 00    ENTER imm16,0        14         Makes procedure stack frame.
C8 iw 01    ENTER imm16,1        17         Makes stack frame for procedure parameters.
C8 iw ib    ENTER imm16,imm8     17 + 3n    Makes stack frame for procedure parameters.
Operation
level ← level MOD 32
IF OperandSize = 16 THEN Push(BP) ELSE Push (EBP) FI;
(* Save stack pointer *)
frame-ptr ← eSP
IF level > 0
THEN (* level is rightmost parameter *)
FOR i ← 1 TO level – 1
DO
IF OperandSize = 16
THEN
BP ← BP – 2;
Push [BP]
ELSE (* OperandSize = 32 *)
EBP ← EBP – 4;
Push[EBP]; FI;
OD;
Push(frame-ptr)
FI;
IF OperandSize = 16 THEN BP ← frame-ptr ELSE EBP ← frame-ptr; FI;
IF StackAddrSize = 16
THEN SP ← SP – First operand;
ELSE ESP ← ESP – ZeroExtend (First operand); FI
Description
ENTER creates the stack frame required by most block-structured high-level languages.
The first operand specifies the number of allocated dynamic storage bytes. The second
operand gives the lexical nesting level (0–31) of the routine within the high-level language
source code and determines the number of stack frame pointers copied into the new stack
frame from the preceding frame. The processor uses the BP (word) or EBP (doubleword)
register as the frame pointer and the SP (word) or ESP (doubleword) register as the stack
pointer. If the second operand is 0, ENTER pushes the frame pointer onto the stack, sets the
frame pointer to the resulting stack-pointer value, and then subtracts the first operand from
the stack pointer.
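The Operation listing for the 32-bit case can be followed in a small Python model. This is a sketch under assumptions: the stack is a dictionary keyed by address, the function name enter and its parameter names are invented for the example, and no stack-limit checks are made.

```python
def enter(mem, esp, ebp, alloc, level):
    """Model of ENTER alloc,level with 32-bit operand and stack sizes.

    mem   -- dict mapping stack addresses to doubleword values
    alloc -- first operand: bytes of dynamic storage to allocate
    level -- second operand: lexical nesting level
    Returns the new (ESP, EBP).
    """
    level %= 32
    esp -= 4
    mem[esp] = ebp                  # Push(EBP)
    frame_ptr = esp                 # save stack pointer
    if level > 0:
        for _ in range(1, level):   # FOR i <- 1 TO level - 1
            ebp -= 4
            esp -= 4
            mem[esp] = mem[ebp]     # copy a frame pointer from the preceding frame
        esp -= 4
        mem[esp] = frame_ptr        # Push(frame-ptr)
    ebp = frame_ptr
    esp -= alloc                    # reserve the dynamic storage
    return esp, ebp
```

With level 0 the effect is the familiar three-step prologue: save the old frame pointer, point the frame pointer at the saved value, and move the stack pointer down by the allocation size.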
Flags Affected
None
Protected Mode Exceptions
Stack Fault (12) if SP or ESP exceeds the stack limit. Page Fault (14) indicates a page fault.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2.37
F2XM1
Computes 2^X – 1
Opcode    Instruction    Clocks           Concurrent Execution    Description
D9 F0     F2XM1          242 (140–279)    2                       Replaces ST with (2^ST – 1).
Operation
ST ← (2^ST – 1)
Description
F2XM1 replaces the contents of ST with (2^ST – 1). ST must lie in the range –1 < ST < 1.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Denormalized Operand, Invalid Operation, Stack
Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: If the operand is outside the acceptable range, the result of F2XM1 is undefined.
The F2XM1 instruction is designed to produce a very accurate result even when the operand
is close to zero. Larger errors are incurred for operands with magnitudes very close to 1.
Values other than 2 can be exponentiated using the formula:
x^y = 2^(y · log_2 x)
The instructions FLDL2T and FLDL2E load the constants log_2 10 and log_2 e, respectively.
FYL2X can be used to calculate y · log_2 x for arbitrary positive x.
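The exponentiation identity in the note can be tried out numerically with a short Python sketch. It is only an analogy built on Python's math module using 64-bit floats: the function names f2xm1 and power are assumptions, and splitting the exponent into an integer and a fraction is one possible way (not taken from this manual) to keep the argument inside the range F2XM1 accepts.

```python
import math


def f2xm1(x):
    """Return 2**x - 1 for -1 < x < 1, staying accurate for x near zero."""
    if not -1.0 < x < 1.0:
        raise ValueError("operand outside the range -1 < x < 1")
    return math.expm1(x * math.log(2.0))


def power(x, y):
    """Return x**y for positive x via x**y = 2**(y * log2(x))."""
    t = y * math.log2(x)        # the quantity FYL2X would supply
    n = round(t)                # integer part of the exponent
    f = t - n                   # fraction, always within [-0.5, 0.5]
    return math.ldexp(f2xm1(f) + 1.0, n)   # (2**f) * (2**n)
```

Computing 2**f as f2xm1(f) + 1 and scaling by the integer power afterwards keeps the F2XM1 step within its documented operand range for any exponent.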
2.38
FABS
Absolute Value
Opcode    Instruction    Clocks    Description
D9 E1     FABS           3         Replaces ST with its absolute value.
Operation
sign bit of ST ← 0
Description
The absolute value instruction clears the sign bit of ST. This operation leaves a positive
value unchanged, or replaces a negative value with a positive value of equal magnitude.
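Clearing the sign bit can be demonstrated on an ordinary IEEE double. This sketch is an analogy only: it uses the 64-bit format available through Python's struct module rather than the 80-bit extended format held in ST, and the name fabs_bits is an assumption for the example.

```python
import struct


def fabs_bits(x):
    """Return |x| by clearing the sign bit of the 64-bit IEEE encoding of x."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    bits &= 0x7FFFFFFFFFFFFFFF              # sign bit (bit 63) <- 0
    (result,) = struct.unpack("<d", struct.pack("<Q", bits))
    return result
```

Because only one bit changes, the operation is exact and also turns negative zero into positive zero.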
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: The invalid-operation exception is raised only on stack underflow, even if the operand
is signaling NaN or is in an unsupported format.
2.39
FADD
Adds Floating Point
Opcode     Instruction       Clocks       Concurrent Execution    Description
D8 /0      FADD m32real      10 (8–20)    7 (5–17)                Adds m32real to ST.
DC /0      FADD m64real      10 (8–20)    7 (5–17)                Adds m64real to ST.
D8 C0+i    FADD ST,ST(i)     10 (8–20)    7 (5–17)                Adds ST(i) to ST.
DC C0+i    FADD ST(i),ST     10 (8–20)    7 (5–17)                Adds ST to ST(i).
Operation
DEST ← DEST + SRC
Description
The addition instructions add the source and destination operands and return the sum to
the destination. The operand at the stack top can be doubled by coding:
FADD ST, ST(0)
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Denormalized Operand, Invalid Operation,
Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
2.40
FADDP
Adds Floating Point and Pops FPU Stack Top
Opcode     Instruction        Clocks       Concurrent Execution    Description
DE C0+i    FADDP ST(i),ST     10 (8–20)    7 (5–17)                Adds ST to ST(i) and pops ST.
DE C1      FADDP              10 (8–20)    7 (5–17)                Adds ST to ST(1) and pops ST.
Operation
DEST ← DEST + SRC;
pop ST
Description
The addition instructions add the source and destination operands, return the sum to the
destination, and pop the stack.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Denormalized Operand, Invalid Operation,
Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
2.41
FBLD
Loads Binary Coded Decimal
Opcode    Instruction     Clocks         Concurrent Execution    Description
DF /4     FBLD m80dec     75 (70–103)    7.7 (2–8)               Pushes m80dec onto the FPU stack.
Operation
Decrement FPU top-of-stack pointer;
ST(0) ← SRC
Description
FBLD converts the BCD source operand into extended-real format and pushes it onto the
FPU stack.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: The source is loaded without rounding error. The sign of the source is preserved,
including the case where the value is negative zero. The packed decimal digits are assumed
to be in the range 0–9. The instruction does not check for invalid digits (A–Fh) and the
result of attempting to load an invalid encoding is undefined. ST(7) must be empty to avoid
causing an invalid-operation exception.
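The conversion that FBLD performs can be illustrated with a minimal Python sketch. It assumes the usual x87 packed-decimal layout (ten bytes, two digits per byte with the least-significant pair first, and the sign in bit 7 of the last byte). The function name decode_packed_bcd is hypothetical, and the sketch rejects invalid digits, whereas the processor leaves that result undefined.

```python
def decode_packed_bcd(raw: bytes):
    """Return (negative, magnitude) for a 10-byte packed-decimal operand.

    Assumed layout: bytes 0-8 hold 18 BCD digits (low byte first),
    bit 7 of byte 9 holds the sign.
    """
    if len(raw) != 10:
        raise ValueError("m80dec operand must be exactly 10 bytes")
    magnitude = 0
    # Walk from the most-significant digit pair down to the least.
    for byte in reversed(raw[:9]):
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            # The hardware result is undefined here; this sketch refuses.
            raise ValueError("invalid BCD digit")
        magnitude = magnitude * 100 + high * 10 + low
    negative = bool(raw[9] & 0x80)
    return negative, magnitude
```

Returning the sign separately keeps negative zero distinguishable from positive zero, matching the note above.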
2.42
FBSTP
Stores Binary Coded Decimal and Pops FPU Stack Top
Opcode   Instruction    Clocks          Description
DF /6    FBSTP m80dec   175 (172–176)   Stores ST in m80dec and pops ST.
Operation
DEST ← ST(0);
pop ST
Description
FBSTP converts the value in ST into a packed decimal integer, stores the result at the
destination in memory, and pops ST. Non-integral values are first rounded according to the
RC field of the control word.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
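The rounding and packing that FBSTP performs can be sketched in Python as follows. This is an illustration only: encode_packed_bcd and round_by_rc are hypothetical names, the RC encodings (0 nearest-even, 1 down, 2 up, 3 truncate) are the standard x87 control-word values, the byte layout is the usual packed-decimal format, and a Python float stands in for the extended-real value in ST. Values too large for 18 digits raise an error here instead of modeling the invalid-operation response.

```python
import math

def round_by_rc(x: float, rc: int) -> int:
    """Round x to an integer according to the 2-bit RC field."""
    if rc == 0:
        return round(x)        # nearest, ties to even
    if rc == 1:
        return math.floor(x)   # toward negative infinity
    if rc == 2:
        return math.ceil(x)    # toward positive infinity
    return math.trunc(x)       # toward zero

def encode_packed_bcd(x: float, rc: int = 0) -> bytes:
    """Pack the rounded value into 10 bytes, low digit pair first."""
    magnitude = abs(round_by_rc(x, rc))
    if magnitude >= 10 ** 18:
        raise ValueError("value does not fit in 18 decimal digits")
    out = bytearray(10)
    for i in range(9):
        magnitude, pair = divmod(magnitude, 100)
        out[i] = ((pair // 10) << 4) | (pair % 10)
    # Sign is taken from the source value (assumption of this sketch).
    if math.copysign(1.0, x) < 0:
        out[9] = 0x80
    return bytes(out)
```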
2.43
FCHS
Changes Sign
Opcode   Instruction   Clocks   Description
D9 E0    FCHS          6        Replaces ST with a value of opposite sign.
Operation
sign bit of ST ← NOT (sign bit of ST)
Description
The FCHS instruction inverts the sign bit of ST. This operation replaces a positive value
with a negative value of equal magnitude, or vice versa.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: The invalid-operation exception is raised only on stack underflow, even if the operand
is a signaling NaN or is in an unsupported format.
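Because FCHS only inverts the sign bit, it behaves the same for zeros and NaNs as for ordinary values. A minimal Python sketch of that behavior follows; it uses a 64-bit double as a stand-in for the 80-bit extended-real register, and the function name fchs_model is hypothetical.

```python
import struct

def fchs_model(value: float) -> float:
    """Flip only the sign bit of a double, leaving all other bits intact."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    bits ^= 1 << 63  # bit 63 is the sign bit of a 64-bit double
    (result,) = struct.unpack("<d", struct.pack("<Q", bits))
    return result
```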
2.44
FCLEX
Clears Exceptions after Checking for FPU Error
Opcode     Instruction   Clocks             Description
9B DB E2   FCLEX         7 + 3+ for FWAIT   Clears floating-point exception flags after
                                            checking for floating-point error conditions.
Operation
SW[0–7] ← 0;
SW[15] ← 0
Description
FCLEX clears the exception flags, the exception status flag, and the busy flag of the FPU
status word after checking for floating-point error conditions.
FPU Flags Affected
C0, C1, C2, C3 undefined
Numeric Exceptions
None
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
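The Operation above amounts to a single mask on the 16-bit status word. A short Python sketch, with the hypothetical name fclex_model, shows which bits survive:

```python
# Bits 0-7 (exception flags, stack fault, error summary) and bit 15 (busy)
# are cleared; bits 8-14 (condition codes and TOP) pass through unchanged.
FCLEX_KEEP_MASK = 0x7F00

def fclex_model(status_word: int) -> int:
    """Return the status word as FCLEX would leave it."""
    return status_word & FCLEX_KEEP_MASK
```

The sketch models only the bit clearing, not the preceding check for pending floating-point errors.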
2.45
FCOM
Compares Real
Opcode    Instruction    Clocks   Description
D8 /2     FCOM m32real   4        Compares ST with m32real.
DC /2     FCOM m64real   4        Compares ST with m64real.
D8 D0+i   FCOM ST(i)     4        Compares ST with ST(i).
D8 D1     FCOM           4        Compares ST with ST(1).
Operation
CASE (relation of operands) OF
   Not comparable:   C3, C2, C0 ← 111;
   ST > SRC:         C3, C2, C0 ← 000;
   ST < SRC:         C3, C2, C0 ← 001;
   ST = SRC:         C3, C2, C0 ← 100;
CF ← C0;
PF ← C2;
ZF ← C3;
Description
FCOM compares the stack top to the source, which can be a register or a 32-bit or 64-bit
real memory operand. If no operand is encoded, ST is compared to ST(1). Following the
instruction, the condition codes reflect the relation between ST and the source operand.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are set as specified above.
Numeric Exceptions
Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If either operand is a NaN or is in an undefined format, or if a stack fault occurs, the
invalid-operation exception is raised and the condition bits are set to “unordered.” The sign
of zero is ignored, so that – 0.0 = + 0.0.
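The condition-code table in the Operation section can be expressed as a small Python sketch. The function names are hypothetical. The second function shows the CF, PF, and ZF mapping as software usually obtains it on x86 processors, by copying the high byte of the status word into the flags (FSTSW AX followed by SAHF); the status-word bit positions used (C0 = bit 8, C2 = bit 10, C3 = bit 14) are the standard x87 layout.

```python
import math

def fcom_condition_codes(st: float, src: float):
    """Return (C3, C2, C0) for a comparison of ST with SRC."""
    if math.isnan(st) or math.isnan(src):
        return (1, 1, 1)  # not comparable ("unordered")
    if st > src:
        return (0, 0, 0)
    if st < src:
        return (0, 0, 1)
    return (1, 0, 0)      # equal; -0.0 and +0.0 compare equal

def flags_from_status_word(status_word: int):
    """Map the status word's high byte onto CF, PF, and ZF."""
    ah = (status_word >> 8) & 0xFF
    return {"CF": ah & 1, "PF": (ah >> 2) & 1, "ZF": (ah >> 6) & 1}
```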
2.46
FCOMP
Compares Real and Pops FPU Stack Top
Opcode    Instruction     Clocks   Description
D8 /3     FCOMP m32real   4        Compares ST with m32real and pops ST.
DC /3     FCOMP m64real   4        Compares ST with m64real and pops ST.
D8 D8+i   FCOMP ST(i)     4        Compares ST with ST(i) and pops ST.
D8 D9     FCOMP           4        Compares ST with ST(1) and pops ST.
Operation
CASE (relation of operands) OF
   Not comparable:   C3, C2, C0 ← 111;
   ST > SRC:         C3, C2, C0 ← 000;
   ST < SRC:         C3, C2, C0 ← 001;
   ST = SRC:         C3, C2, C0 ← 100;
CF ← C0;
PF ← C2;
ZF ← C3;
pop ST;
Description
FCOMP compares the stack top to the source, which can be a register or a single-real or
double-real memory operand, and then pops the stack. If no operand is encoded, ST is
compared to ST(1). Following the instruction, the condition codes reflect the relation
between ST and the source operand.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are set as specified above.
Numeric Exceptions
Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If either operand is a NaN or is in an undefined format, or if a stack fault occurs, the
invalid-operation exception is raised, and the condition bits are set to “unordered.” The sign
of zero is ignored, so that – 0.0 = + 0.0.
2.47
FCOMPP Compares Real and Pops FPU Stack Top Twice
Opcode   Instruction   Clocks   Description
DE D9    FCOMPP        5        Compares ST with ST(1) and pops ST twice.
Operation
CASE (relation of operands) OF
   Not comparable:   C3, C2, C0 ← 111;
   ST > ST(1):       C3, C2, C0 ← 000;
   ST < ST(1):       C3, C2, C0 ← 001;
   ST = ST(1):       C3, C2, C0 ← 100;
CF ← C0;
PF ← C2;
ZF ← C3;
pop ST; pop ST;
Description
FCOMPP compares the stack top to ST(1). Following the instruction, the condition codes
reflect the relation between ST and ST(1).
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are set as specified above.
Numeric Exceptions
Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If either operand is a NaN or is in an undefined format, or if a stack fault occurs, the
invalid-operation exception is raised, and the condition bits are set to “unordered.” The sign
of zero is ignored, so that – 0.0 = + 0.0.
2.48
FCOS
Cosine
Opcode   Instruction   Clocks          Concurrent Execution   Description
D9 FF    FCOS          241 (193–279)   2                      Replaces ST with its cosine.
Operation
IF operand is in range
THEN
C2 ← 0;
ST ← cos (ST);
ELSE
C2 ← 1;
FI
Description
The cosine instruction replaces the contents of ST with cos (ST). ST, expressed in radians,
must lie in the range | θ | < 2^63.
FPU Flags Affected
If C2 = 0 (reduction complete), the result determines the C1 setting. If both the IE and SF
bits of the status word are set (indicating a stack exception), C1 distinguishes between
stack overflow (C1 = 1) and underflow (C1 = 0); if PE is set, C1 indicates whether the last
rounding was upward. If C2 = 1 (reduction incomplete), C1 is undefined. C0 and C3 are
undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Denormalized Operand, Invalid Operation, Stack
Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: If the operand is outside the acceptable range, the C2 flag is set and ST remains
unchanged. Reduce the operand to an absolute value smaller than 2^63 by subtracting an
appropriate integer multiple of 2π. For π, use the full 66-bit internal π used by the FPU:
4 ⋅ 0.C90FDAA22168C234Ch. This ensures that the results are consistent with the argument
reduction used by the FPU for trigonometric functions. You cannot represent this number
as an extended-real value, however. A suggested solution is to represent π as the sum of
a highπ (the 33 most-significant bits) and a lowπ (the 33 least-significant bits). The Am486
processor checks for interrupts while performing this instruction. It aborts execution to
service an interrupt.
If you need to compute sine and cosine, use FSINCOS for faster execution.
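The high/low split of π suggested in the note can be sketched in Python. The constant comes from the note; everything else (the names HIGH_PI, LOW_PI, reduce_argument, and fcos_model, and the use of 64-bit doubles in place of extended-real values) is an assumption made for illustration. The range check mirrors the Operation section: an out-of-range operand sets C2 and leaves the value unchanged.

```python
import math
from fractions import Fraction

# Internal pi from the note: 4 * 0.C90FDAA22168C234Ch (17 hex digits).
PI_FRACTION_HEX = 0xC90FDAA22168C234C
PI_EXACT = 4 * Fraction(PI_FRACTION_HEX, 16 ** 17)

# The two trailing bits are zero, leaving 66 significant bits,
# so pi = PI_66 * 2**-64. Split them 33 high / 33 low.
PI_66 = PI_FRACTION_HEX >> 2
HIGH_PI = float(PI_66 >> 33) * 2.0 ** -31
LOW_PI = float(PI_66 & ((1 << 33) - 1)) * 2.0 ** -64

def reduce_argument(x: float) -> float:
    """Subtract k * 2*pi in two steps, high part first, then low part."""
    k = math.floor(x / (2.0 * HIGH_PI))
    return (x - k * 2.0 * HIGH_PI) - k * 2.0 * LOW_PI

def fcos_model(st: float):
    """Return (C2, new ST): out-of-range operands are left unchanged."""
    if abs(st) >= 2.0 ** 63:
        return 1, st
    return 0, math.cos(st)
```

Each half has at most 33 significant bits, so both are exact in any format with a wider significand, and for moderate k the product with the high half is also exact.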
2.49
FDECSTP
Decrements Top-of-Stack Pointer
Opcode   Instruction   Clocks   Description
D9 F6    FDECSTP       3        Decrements top-of-stack pointer for FPU register stack.
Operation
IF TOP = 0
THEN TOP ← 7;
ELSE TOP ← TOP – 1;
FI
Description
FDECSTP subtracts one (without carry) from the 3-bit TOP field of the FPU status word.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
None
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: The effect of FDECSTP is to rotate the stack. It does not alter register tags or contents,
nor does it transfer data.
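The wraparound in the Operation section is a modulo-8 decrement of the TOP field. A minimal Python sketch, assuming the standard status-word layout with TOP in bits 11 through 13 and using the hypothetical name fdecstp_model:

```python
TOP_SHIFT = 11
TOP_MASK = 0b111 << TOP_SHIFT  # TOP occupies status-word bits 11-13

def fdecstp_model(status_word: int) -> int:
    """Decrement TOP modulo 8; all other status-word bits are untouched here."""
    top = (status_word & TOP_MASK) >> TOP_SHIFT
    top = (top - 1) & 0b111    # 0 wraps around to 7
    return (status_word & ~TOP_MASK & 0xFFFF) | (top << TOP_SHIFT)
```

The sketch models only the TOP field and ignores the condition-code effects listed under FPU Flags Affected.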
2.50
FDIV
Divides Real
Opcode    Instruction     Clocks   Concurrent Execution   Description
D8 /6     FDIV m32real    73       70                     Divides ST by m32real.
DC /6     FDIV m64real    73       70                     Divides ST by m64real.
D8 F0+i   FDIV ST,ST(i)   73       70                     Divides ST by ST(i).
DC F8+i   FDIV ST(i),ST   73       70                     Replaces ST(i) with ST ÷ ST(i).
Operation
DEST ← ST ÷ other operand
Description
The division instructions divide the stack top by the other operand and return the quotient
to the destination.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Divide By Zero, Denormalized Operand,
Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it automatically converts to the extended-real
format. The performance of division instructions depends on the PC (Precision Control)
field of the FPU control word. If PC specifies a precision of 53 bits, the division instruction
executes in 62 clocks. If the specified precision is 24 bits, the division instruction takes only
35 clocks.
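The effect of the PC field on a quotient can be illustrated with a Python sketch that rounds an exact quotient to a given number of significand bits. The names round_significand, fdiv_model, and FDIV_CLOCKS are hypothetical. The 62-clock and 35-clock figures come from the note above; pairing the 73 clocks in the table with 64-bit precision is an assumption. Rounding is round-to-nearest-even only, and exponent range limits are ignored.

```python
from fractions import Fraction

# Clock counts by precision in bits; the 64-bit entry is an assumption.
FDIV_CLOCKS = {24: 35, 53: 62, 64: 73}

def round_significand(value: Fraction, bits: int) -> Fraction:
    """Round value to `bits` significant bits, ties to even."""
    if value == 0:
        return value
    sign = -1 if value < 0 else 1
    mag = abs(value)
    # Find e such that 2**e <= mag < 2**(e + 1).
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    scale = Fraction(2) ** (bits - 1 - e)
    return sign * Fraction(round(mag * scale)) / scale

def fdiv_model(dividend: Fraction, divisor: Fraction, precision_bits: int):
    """Return (rounded quotient, clock count) for the chosen precision."""
    quotient = round_significand(dividend / divisor, precision_bits)
    return quotient, FDIV_CLOCKS[precision_bits]
```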
2.51
FDIVP
Divides Real and Pops FPU Stack Top
Opcode    Instruction      Clocks   Concurrent Execution   Description
DE F8+i   FDIVP ST(i),ST   73       70                     Replaces ST(i) with ST ÷ ST(i); pops ST.
DE F9     FDIVP            73       70                     Replaces ST(1) with ST ÷ ST(1); pops ST.
Operation
DEST ← ST ÷ other operand;
pop ST
Description
The division instructions divide the stack top by the other operand and return the quotient
to the destination.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Divide By Zero, Denormalized Operand,
Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it automatically converts to the extended-real format.
The performance of division instructions depends on the PC (Precision Control) field of the
FPU control word. If PC specifies a precision of 53 bits, the division instruction executes in
62 clocks. If the specified precision is 24 bits, the division instruction takes only 35 clocks.
2.52
FDIVR
Reverse Divides Real
Opcode    Instruction      Clocks   Concurrent Execution   Description
D8 /7     FDIVR m32real    73       70                     Replaces ST with m32real ÷ ST.
DC /7     FDIVR m64real    73       70                     Replaces ST with m64real ÷ ST.
D8 F8+i   FDIVR ST,ST(i)   73       70                     Replaces ST with ST(i) ÷ ST.
DC F0+i   FDIVR ST(i),ST   73       70                     Divides ST(i) by ST.
Operation
DEST ← other operand ÷ ST
Description
The division instructions divide the other operand by the stack top and return the quotient
to the destination.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Divide By Zero, Denormalized Operand,
Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it automatically converts to the extended-real
format. The performance of division instructions depends on the PC (Precision Control)
field of the FPU control word. If PC specifies a precision of 53 bits, the division instruction
executes in 62 clocks. If the specified precision is 24 bits, the division instruction takes only
35 clocks.
2.53
FDIVRP
Reverse Divides Real and Pops FPU Stack Top
Opcode    Instruction       Clocks   Concurrent Execution   Description
DE F0+i   FDIVRP ST(i),ST   73       70                     Divides ST(i) by ST and pops ST.
DE F1     FDIVRP            73       70                     Divides ST(1) by ST and pops ST.
Operation
DEST ← other operand ÷ ST;
pop ST
Description
The division instructions divide the other operand by the stack top and return the quotient
to the destination.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Divide By Zero, Denormalized Operand,
Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it automatically converts to the extended-real
format. The performance of division instructions depends on the PC (Precision Control)
field of the FPU control word. If PC specifies a precision of 53 bits, the division instruction
executes in 62 clocks. If the specified precision is 24 bits, the division instruction takes only
35 clocks.
2.54
FFREE
Free Floating-Point Register
Opcode    Instruction   Clocks   Description
DD C0+i   FFREE ST(i)   3        Tags ST(i) as empty.
Operation
TAG(i) ← 11B
Description
FFREE tags the destination register as empty.
FPU Flags Affected
C0, C1, C2, C3 undefined
Numeric Exceptions
None
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: FFREE does not affect the contents of the destination register. The floating-point
top-of-stack pointer (TOP) is also unaffected.
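The tag update can be sketched in Python, assuming the standard x87 tag word layout (two bits per physical register, with ST(i) residing in physical register (TOP + i) mod 8). The function name ffree_model is hypothetical.

```python
EMPTY_TAG = 0b11  # tag value that marks a register as empty

def ffree_model(tag_word: int, top: int, i: int) -> int:
    """Return the tag word after FFREE ST(i); TOP and contents are unchanged."""
    physical = (top + i) & 0b111           # stack-relative to physical index
    return tag_word | (EMPTY_TAG << (2 * physical))
```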
2.55
FIADD
Adds Integer
Opcode   Instruction    Clocks         Concurrent Execution   Description
DA /0    FIADD m32int   22.5 (19–32)   7 (5–17)               Adds m32int to ST.
DE /0    FIADD m16int   24 (20–35)     7 (5–17)               Adds m16int to ST.
Operation
DEST ← DEST + SRC
Description
The addition instructions add the source and destination operands and return the sum to
the destination.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Denormalized Operand, Invalid Operation,
Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it is automatically converted to the
extended-real format.
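The conversion-then-add sequence can be sketched in Python. The memory operand is read as a little-endian two's-complement integer of 2 or 4 bytes, converted to floating point, and added to ST. The function name fiadd_model is hypothetical, and a 64-bit double stands in for the extended-real register.

```python
import struct

INT_FORMATS = {2: "<h", 4: "<i"}  # m16int and m32int, little-endian signed

def fiadd_model(st: float, memory: bytes, size: int) -> float:
    """Convert the integer operand to floating point, then add it to ST."""
    (operand,) = struct.unpack(INT_FORMATS[size], memory[:size])
    return st + float(operand)
```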
2.56
FICOM
Compares Integer
Opcode   Instruction    Clocks         Concurrent Execution   Description
DE /2    FICOM m16int   18 (16–20)     1                      Compares ST with m16int.
DA /2    FICOM m32int   16.5 (15–17)   1                      Compares ST with m32int.
Operation
CASE (relation of operands) OF
   Not comparable:   C3, C2, C0 ← 111;
   ST > SRC:         C3, C2, C0 ← 000;
   ST < SRC:         C3, C2, C0 ← 001;
   ST = SRC:         C3, C2, C0 ← 100;
CF ← C0;
PF ← C2;
ZF ← C3;
Description
FICOM compares the stack top to the source. Following the instruction, the condition codes
reflect the relation between ST and the source operand.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. The values of C0, C2, and C3 are as specified above.
Numeric Exceptions
Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Note: The memory operand is converted to extended-real format before the comparison
is performed. If either operand is a NaN or is in an undefined format, or if a stack fault
occurs, the invalid-operation exception is raised and the condition bits are set to
“unordered.”
2.57
FICOMP
Compares Integer and Pops FPU Stack Top
Opcode   Instruction     Clocks         Concurrent Execution   Description
DE /3    FICOMP m16int   18 (16–20)     1                      Compares ST with m16int and pops ST.
DA /3    FICOMP m32int   16.5 (15–17)   1                      Compares ST with m32int and pops ST.
Operation
CASE (relation of operands) OF
   Not comparable:   C3, C2, C0 ← 111;
   ST > SRC:         C3, C2, C0 ← 000;
   ST < SRC:         C3, C2, C0 ← 001;
   ST = SRC:         C3, C2, C0 ← 100;
CF ← C0;
PF ← C2;
ZF ← C3;
pop ST;
Description
FICOMP compares the stack top to the source, then pops the stack top. Following the
instruction, the condition codes reflect the relation between ST and the source operand.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. The values of C0, C2, and C3 are as specified above.
Numeric Exceptions
Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Note: The memory operand is converted to extended-real format before the comparison
is performed. If either operand is a NaN or is in an undefined format, or if a stack fault
occurs, the invalid-operation exception is raised and the condition bits are set to
“unordered.”
2.58
FIDIV
Divides Integer
Opcode   Instruction    Clocks   Concurrent Execution   Description
DA /6    FIDIV m32int   73       70                     Divides ST by m32int.
DE /6    FIDIV m16int   73       70                     Divides ST by m16int.
Operation
DEST ← ST÷ other operand
Description
The division instructions divide the stack top by the other operand and return the quotient
to the destination.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Divide By Zero, Denormalized Operand,
Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it automatically converts to the extended-real
format. The performance of division instructions depends on the PC (Precision Control)
field of the FPU control word. If PC specifies a precision of 53 bits, the division instruction
executes in 62 clocks. If the specified precision is 24 bits, the division instruction takes only
35 clocks.
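The effect of the PC field on a quotient can be approximated in software. The following Python sketch is an illustration only and does not describe the FPU's internal algorithm. The function name fidiv and its pc_bits parameter are hypothetical, and the 64-bit extended precision is not modeled because a Python float carries only 53 significand bits.

```python
import struct

def fidiv(st, m_int, pc_bits=53):
    """Model DEST <- ST / integer operand, rounded to pc_bits of precision.

    pc_bits is a hypothetical stand-in for the PC field: 53 keeps the Python
    float (double) result as-is; 24 rounds it through single precision.
    """
    if pc_bits not in (24, 53):
        raise ValueError("this sketch models only 24- and 53-bit precision")
    # The integer operand is converted to real before the division.
    # Python raises ZeroDivisionError where the FPU reports Divide By Zero.
    quotient = st / float(m_int)
    if pc_bits == 24:
        # Round the double result to a 24-bit significand.
        quotient = struct.unpack("<f", struct.pack("<f", quotient))[0]
    return quotient
```

Calling fidiv(1.0, 3, 24) and fidiv(1.0, 3, 53) gives two different approximations of one third, which is the accuracy given up in exchange for the shorter clock counts listed in the note.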
2.59
FIDIVR
Reverse Divides Integer
Opcode   Instruction     Clocks   Concurrent Execution   Description
DA /7    FIDIVR m32int   73       70                     Replaces ST with m32int ÷ ST.
DE /7    FIDIVR m16int   73       70                     Replaces ST with m16int ÷ ST.
Operation
DEST ← other operand ÷ ST
Description
The division instructions divide the other operand by the stack top and return the quotient
to the destination.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Divide By Zero, Denormalized Operand,
Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it automatically converts to the extended-real
format. The performance of division instructions depends on the PC (Precision Control)
field of the FPU control word. If PC specifies a precision of 53 bits, the division instruction
executes in 62 clocks. If the specified precision is 24 bits, the division instruction takes only
35 clocks.
2.60
FILD
Loads Integer
Opcode   Instruction   Clocks         Concurrent Execution   Description
DF /0    FILD m16int   14.5 (13–16)   4                      Pushes m16int onto FPU stack.
DB /0    FILD m32int   11.5 (9–12)    4 (2–4)                Pushes m32int onto FPU stack.
DF /5    FILD m64int   16.8 (10–18)   7.8 (2–8)              Pushes m64int onto FPU stack.
Operation
Decrement FPU top-of-stack pointer;
ST(0) ← SRC
Description
FILD converts the source signed integer operand into extended-real format and pushes it
onto the FPU stack.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: The source is loaded without rounding error. ST(7) must be empty to avoid causing
an invalid-operation exception.
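The absence of rounding error follows from the width of the extended-real significand (64 bits), which is enough to hold any 64-bit signed integer. The Python sketch below checks this property; the helper name fits_exactly is hypothetical and the code models only representability, not the FPU itself.

```python
def fits_exactly(value, significand_bits):
    """Return True if the integer is representable with the given significand width."""
    magnitude = abs(value)
    if magnitude == 0:
        return True
    # Trailing zero bits are absorbed by the exponent, so strip them first.
    while magnitude % 2 == 0:
        magnitude //= 2
    return magnitude.bit_length() <= significand_bits

INT64_MIN, INT64_MAX = -2**63, 2**63 - 1
```

For comparison, fits_exactly(INT64_MAX, 53) is False: the same value would be rounded if it were converted to double-real format instead.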
2.61
FIMUL
Multiplies Integer
Opcode   Instruction    Clocks   Description
DA /1    FIMUL m32int   8        Multiplies ST by m32int.
DE /1    FIMUL m16int   8        Multiplies ST by m16int.
Operation
DEST ← DEST ⋅ SRC
Description
The multiplication instructions multiply the destination operand by the source operand and
return the product to the destination.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Denormalized Operand, Invalid Operation,
Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it is automatically converted to the extended-real
format.
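The conversion of the memory operand can be pictured as follows. This Python sketch is a model, not the FPU's implementation: the function name fimul and the use of a byte string to stand in for the memory operand are assumptions made for illustration.

```python
import struct

def fimul(st, mem):
    """Model DEST <- DEST * SRC for a 2-byte (m16int) or 4-byte (m32int) operand."""
    formats = {2: "<h", 4: "<i"}  # little-endian signed word / doubleword
    if len(mem) not in formats:
        raise ValueError("FIMUL accepts only m16int or m32int operands")
    (src,) = struct.unpack(formats[len(mem)], mem)
    # Converting a 16- or 32-bit integer to real is exact; only the
    # product itself may need rounding.
    return st * float(src)
```

For example, fimul(3.0, b"\xff\xff") treats the two bytes as the signed word -1 and returns -3.0.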
2.62
FINCSTP
Increments Top-of-Stack Pointer
Opcode   Instruction   Clocks   Description
D9 F7    FINCSTP       3        Increments top-of-stack pointer for FPU register stack.
Operation
IF TOP = 7
THEN TOP ← 0;
ELSE TOP ← TOP + 1;
FI
Description
FINCSTP adds one (without carry) to the 3-bit TOP field of the FPU status word.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
None
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: The effect of FINCSTP is to rotate the stack. It does not alter register tags or contents,
nor does it transfer data. It is not equivalent to popping the stack because it does not set
the tag of the old stack-top to empty.
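The difference between FINCSTP and a true pop can be shown with a small model. The Python class below is a hypothetical sketch that tracks only TOP and the eight register tags; it assumes the usual x87 tag encoding in which 11b marks an empty register.

```python
EMPTY, VALID = 0b11, 0b00  # assumed tag encodings: 11 = empty, 00 = valid

class FpuStack:
    """Hypothetical model holding only TOP and the eight register tags."""

    def __init__(self):
        self.top = 0
        self.tags = [EMPTY] * 8

    def push(self):
        self.top = (self.top - 1) & 7
        self.tags[self.top] = VALID

    def fincstp(self):
        # TOP <- TOP + 1, wrapping from 7 to 0; tags are left untouched.
        self.top = (self.top + 1) & 7

    def pop(self):
        # A true pop also marks the old stack top as empty.
        self.tags[self.top] = EMPTY
        self.top = (self.top + 1) & 7
```

After one push followed by fincstp, register 7 is still tagged valid even though TOP has moved back to 0; after one push followed by pop, register 7 is tagged empty.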
2.63
FINIT
Initializes FPU after Checking for Unmasked FPU Error
Opcode   Instruction   Clocks               Description
DB E3    FINIT         17 + 3+ for FWAIT    Initializes FPU after checking for unmasked
                                            floating-point error condition.
Operation
CW ← 037Fh;                   (* Control word *)
SW ← 0;                       (* Status word *)
TW ← FFFFh;                   (* Tag word *)
FEA ← 0; FDS ← 0;             (* Data pointer *)
FIP ← 0; FOP ← 0; FCS ← 0     (* Instruction pointer *)
Description
The initialization instructions set the FPU into a known state, unaffected by any previous
activity.
The FPU control word is set to 037Fh (round to nearest, all exceptions masked, 64-bit
precision). The status word is cleared (no exception flags set, stack register R0 = stack
top). The stack registers are all tagged as empty. The error pointers (both instruction and
data) are cleared.
FPU Flags Affected
C0, C1, C2, C3 cleared
Numeric Exceptions
None
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: FINIT leaves the FPU in the same state as that which results from a hardware RESET
signal. On the Am486 processor, unlike on the Intel 387 math coprocessor, FINIT clears the
error pointers.
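The meaning of the initial values 037Fh and FFFFh can be checked by decoding them. The Python sketch below assumes the standard x87 control-word layout (exception masks in bits 0-5, PC in bits 8-9, RC in bits 10-11) and two tag bits per register; the function names are hypothetical.

```python
def decode_control_word(cw):
    """Split an FPU control word into exception masks, precision, and rounding."""
    precision = {0b00: 24, 0b10: 53, 0b11: 64}   # PC field, bits 8-9
    rounding = {0b00: "nearest", 0b01: "down",
                0b10: "up", 0b11: "truncate"}     # RC field, bits 10-11
    return {
        "exception_masks": cw & 0x3F,             # bits 0-5, 1 = masked
        "precision_bits": precision.get((cw >> 8) & 0b11),  # None if reserved
        "rounding": rounding[(cw >> 10) & 0b11],
    }

FINIT_CW = 0x037F
FINIT_TW = 0xFFFF

def all_registers_empty(tw):
    """Each register has a 2-bit tag; 11b marks it empty."""
    return all(((tw >> (2 * i)) & 0b11) == 0b11 for i in range(8))
```

Decoding FINIT_CW yields all six exception masks set, 64-bit precision, and round to nearest, which matches the description of the initialized state.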
2.64
FIST
Stores Integer
Opcode   Instruction   Clocks         Description
DF /2    FIST m16int   33.4 (29–34)   Stores ST in m16int.
DB /2    FIST m32int   32.4 (28–34)   Stores ST in m32int.
Operation
DEST ← ST(0)
Description
FIST converts the value in ST into a signed integer according to the RC field of the control
word and transfers the result to the destination. ST remains unchanged. FIST accepts word
and short integer destinations.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. Coprocessor Not
Available (7) occurs if either EM or TS in CR0 is set. If CPL is 3, Alignment Check (17)
indicates there is an unaligned memory reference.
Note: Negative zero is stored with the same encoding (00...00) as positive zero. If the value
is too large to represent as an integer, an exception is raised. The masked response is to
write the most negative integer to memory.
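The rounding and overflow behavior described above can be modeled as follows. This Python sketch is illustrative only: the function name fist is hypothetical, the RC encodings assume the standard x87 assignment (00 nearest, 01 down, 10 up, 11 truncate), and treating NaN and infinity like an out-of-range value is an assumption not stated in the note.

```python
import math

ROUNDERS = {
    0b00: round,        # nearest, ties to even
    0b01: math.floor,   # down, toward negative infinity
    0b10: math.ceil,    # up, toward positive infinity
    0b11: math.trunc,   # chop, toward zero
}

def fist(st, bits, rc=0b00):
    """Return (stored_integer, invalid) for a store to an integer of `bits` width."""
    most_negative = -(1 << (bits - 1))
    if math.isnan(st) or math.isinf(st):
        return most_negative, True   # assumption: handled like an overflow
    value = ROUNDERS[rc](st)
    if not most_negative <= value <= -most_negative - 1:
        # Masked response to the invalid-operation exception.
        return most_negative, True
    return value, False
```

With these definitions, fist(40000.0, 16) reports the invalid condition and stores -32768, while the same value fits without error when the destination is 32 bits wide.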
2.65
FISTP
Stores Integer and Pops FPU Stack Top
Opcode   Instruction    Clocks         Description
DF /3    FISTP m16int   33.4 (29–34)   Stores ST in m16int and pops ST.
DB /3    FISTP m32int   33.4 (29–34)   Stores ST in m32int and pops ST.
DF /7    FISTP m64int   33.4 (29–34)   Stores ST in m64int and pops ST.
Operation
DEST ← ST(0);
pop ST
Description
FISTP converts the value in ST into a signed integer according to the RC field of the control
word, transfers the result to the destination, and then pops ST. FISTP accepts
word, short integer, and long integer destinations.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: Negative zero is stored with the same encoding (00...00) as positive zero. If the value
is too large to represent as an integer, an exception is raised. The masked response is to
write the most negative integer to memory.
2.66
FISUB
Subtracts Integer
Opcode   Instruction    Clocks         Concurrent Execution   Description
DA /4    FISUB m32int   22.5 (19–32)   7 (5–17)               Subtracts m32int from ST.
DE /4    FISUB m16int   24 (20–35)     7 (5–17)               Subtracts m16int from ST.
Operation
DEST ← ST – Other Operand
Description
The subtraction instructions subtract the other operand from the stack top and return the
difference to the destination.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Denormalized Operand, Invalid Operation,
Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it is automatically converted to the extended-real
format.
2.67
FISUBR
Reverse Subtracts Integer
Opcode   Instruction     Clocks         Concurrent Execution   Description
DA /5    FISUBR m32int   22.5 (19–32)   7 (5–17)               Replaces ST with m32int – ST.
DE /5    FISUBR m16int   24 (20–35)     7 (5–17)               Replaces ST with m16int – ST.
Operation
DEST ← Other Operand – ST
Description
The reverse subtraction instructions subtract the stack top from the other operand and
return the difference to the destination.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Denormalized Operand, Invalid Operation,
Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it is automatically converted to the extended-real
format.
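The only difference between FISUB and FISUBR is the operand order. The following Python sketch states the two operations side by side; the function names are hypothetical and rounding of the result is not modeled.

```python
def fisub(st, m_int):
    """FISUB: DEST <- ST - other operand."""
    return st - float(m_int)

def fisubr(st, m_int):
    """FISUBR: DEST <- other operand - ST."""
    return float(m_int) - st
```

Because fisubr(st, m) equals the negation of fisub(st, m), the reverse form can save a separate sign change when the memory integer is the minuend.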
2.68
FLD
Loads Real
Opcode    Instruction   Clocks   Description
D9 /0     FLD m32real   3        Pushes m32real onto the FPU stack.
DD /0     FLD m64real   3        Pushes m64real onto the FPU stack.
DB /5     FLD m80real   6        Pushes m80real onto the FPU stack.
D9 C0+i   FLD ST(i)     4        Pushes ST(i) onto the FPU stack.
Operation
Decrement FPU top-of-stack pointer;
ST(0) ← SRC
Description
FLD pushes the source operand onto the FPU stack. If the source is an FPU stack register,
the register number is computed from the top-of-stack pointer before it is decremented.
Because of this instruction characteristic, the following coding duplicates the stack top:
FLD ST(0)
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in single or double-real format, it is automatically converted
to the extended-real format. Loading an extended-real operand does not require conversion, so the I and D exceptions will not occur in this case. ST(7) must be empty to avoid
causing an invalid-operation exception.
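The ordering rule for FLD ST(i), in which the source register is resolved before the top-of-stack pointer is decremented, can be traced with a small model. The Python class below is a hypothetical sketch: it uses None in place of an empty tag and raises Python exceptions where the FPU would signal a stack fault.

```python
class RegisterStack:
    """Hypothetical model: eight physical registers plus the TOP pointer."""

    def __init__(self):
        self.regs = [None] * 8  # None stands in for an empty tag
        self.top = 0

    def st(self, i):
        return self.regs[(self.top + i) & 7]

    def fld(self, value):
        new_top = (self.top - 1) & 7
        if self.regs[new_top] is not None:
            raise OverflowError("stack fault: ST(7) is not empty")
        self.top = new_top
        self.regs[new_top] = value

    def fld_st(self, i):
        # Resolve ST(i) against the current TOP first, then push.
        source = self.st(i)
        if source is None:
            raise LookupError("stack fault: ST(i) is empty")
        self.fld(source)
```

After pushing 1.0, 2.0, and 3.0, a call to fld_st(0) leaves 3.0 in both ST(0) and ST(1), which is the duplication described for FLD ST(0).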
2.69
FLD1
Loads Constant +1.0
Opcode   Instruction   Clocks   Description
D9 E8    FLD1          4        Pushes +1.0 onto the FPU stack.
Operation
Decrement FPU top-of-stack pointer;
ST(0) ← +1.0
Description
FLD1 pushes a +1.0 (in extended-real format) onto the FPU stack.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: ST(7) must be empty to avoid an invalid-operation exception. An internal 66-bit constant is
used and rounded to extended-real format (as specified by the RC field of the control word).
The precision exception is not raised.
2.70
FLDCW
Loads Control Word
Opcode   Instruction    Clocks   Description
D9 /5    FLDCW m2byte   4        Loads the FPU control word from m2byte.
Operation
CW ← SRC
Description
FLDCW replaces the current value of the FPU control word with the value contained in the
specified memory word.
FPU Flags Affected
C0, C1, C2, C3 undefined
Numeric Exceptions
None, except for unmasking an existing exception.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: FLDCW is typically used to establish or change the FPU’s mode of operation. If an
exception bit in the status word is set, loading a new control word that unmasks that
exception will result in a floating-point error condition. When changing modes, the
recommended procedure is to clear any pending exceptions before loading the new control
word.
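The recommended procedure can be expressed in terms of the status and control words. The Python sketch below is a model under stated assumptions: exception flags and their masks occupy bits 0-5 of the status and control words respectively, and clearing pending exceptions is modeled on FCLEX/FNCLEX, which clears those flags together with SF, ES, and B. The function names are hypothetical.

```python
EXCEPTION_BITS = 0x3F  # IE, DE, ZE, OE, UE, PE occupy bits 0-5 of both words

def unmasked_pending(status_word, control_word):
    """Exception flags set in the status word whose mask bit is clear."""
    return status_word & ~control_word & EXCEPTION_BITS

def fldcw_safely(status_word, new_control_word):
    """Clear pending exception flags, then load the new control word.

    Returns the (status_word, control_word) pair after both steps.
    """
    status_word &= ~0x80FF & 0xFFFF   # clear exception flags, SF, ES, and B
    return status_word, new_control_word
```

With a pending zero-divide flag (status word 0004h), loading 037Bh directly would leave unmasked_pending nonzero, whereas fldcw_safely returns a pair for which it is zero.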
2.71
FLDENV
Loads FPU Environment
Opcode   Instruction         Clocks                Description
D9 /4    FLDENV m14/28byte   44 real or virtual/   Loads FPU environment from m14byte or m28byte.
                             34 protected
Operation
FPU environment ← SRC
Description
FLDENV reloads the FPU environment from the memory area defined by the source operand.
This data should have been written by a previous FSTENV or FNSTENV instruction. The FPU
environment consists of the FPU control word, status word, tag word, and error pointers
(both data and instruction). The environment layout in memory depends on both the operand
size and the current operating mode of the microprocessor. The USE attribute of the current
code segment determines the operand size: the 14-byte operand applies to a USE16
segment, and the 28-byte operand applies to a USE32 segment. FLDENV should be
executed in the same operating mode as the corresponding FSTENV or FNSTENV.
FPU Flags Affected
C0, C1, C2, C3 as loaded
Numeric Exceptions
None, except for loading an unmasked exception.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the environment image contains an unmasked exception, loading it will result in a
floating-point error condition.
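The dependence of the environment image on operand size can be sketched as follows. This Python example is illustrative and makes assumptions that this section does not state: the control, status, and tag words are taken to sit at byte offsets 0, 2, 4 in the 14-byte form and 0, 4, 8 in the 28-byte form, and the error pointers are not decoded. The function names are hypothetical.

```python
import struct

def environment_size(use32):
    """14 bytes for a USE16 code segment, 28 bytes for a USE32 segment."""
    return 28 if use32 else 14

def read_env_words(image):
    """Extract (control, status, tag) words from an environment image.

    Assumed offsets: 0, 2, 4 in the 14-byte form and 0, 4, 8 in the
    28-byte form. The error pointers are not decoded here.
    """
    if len(image) == 14:
        return struct.unpack_from("<HHH", image, 0)
    if len(image) == 28:
        return tuple(struct.unpack_from("<H", image, off)[0] for off in (0, 4, 8))
    raise ValueError("environment image must be 14 or 28 bytes")
```

A saved image could be inspected this way before it is reloaded, for example to check whether its status word holds an exception flag that its control word leaves unmasked.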
2.72
FLDL2E
Loads Constant log₂e
Opcode   Instruction   Clocks   Concurrent Execution   Description
D9 EA    FLDL2E        8        2                      Pushes log₂e onto the FPU stack.
Operation
Decrement FPU top-of-stack pointer;
ST(0) ← log₂e
Description
FLDL2E pushes log₂e (in extended-real format) onto the FPU stack.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: ST(7) must be empty to avoid an invalid-operation exception. An internal 66-bit constant is
used and rounded to extended-real format (as specified by the RC field of the control word).
The precision exception is not raised.
2.73
FLDL2T
Loads Constant log₂10
Opcode   Instruction   Clocks   Concurrent Execution   Description
D9 E9    FLDL2T        8        2                      Pushes log₂10 onto the FPU stack.
Operation
Decrement FPU top-of-stack pointer;
ST(0) ← log₂10
Description
FLDL2T pushes log₂10 (in extended-real format) onto the FPU stack.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: ST(7) must be empty to avoid an invalid-operation exception. An internal 66-bit constant is
used and rounded to extended-real format (as specified by the RC field of the control word).
The precision exception is not raised.
2.74
FLDLG2
Loads Constant log₁₀2
Opcode   Instruction   Clocks   Concurrent Execution   Description
D9 EC    FLDLG2        8                               Pushes log₁₀2 onto the FPU stack.
Operation
Decrement FPU top-of-stack pointer;
ST(0) ← log₁₀2
Description
FLDLG2 pushes log₁₀2 (in extended-real format) onto the FPU stack.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: ST(7) must be empty to avoid an invalid-operation exception. An internal 66-bit constant is
used and rounded to extended-real format (as specified by the RC field of the control word).
The precision exception is not raised.
2.75
FLDLN2
Loads Constant logₑ2
Opcode   Instruction   Clocks   Concurrent Execution   Description
D9 ED    FLDLN2        8        2                      Pushes logₑ2 onto the FPU stack.
Operation
Decrement FPU top-of-stack pointer;
ST(0) ← logₑ2
Description
FLDLN2 pushes logₑ2 (in extended-real format) onto the FPU stack.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: ST(7) must be empty to avoid an invalid-operation exception. An internal 66-bit constant is
used and rounded to extended-real format (as specified by the RC field of the control word).
The precision exception is not raised.
2.76
FLDPI
Loads Constant π
Opcode   Instruction   Clocks   Concurrent Execution   Description
D9 EB    FLDPI         8        2                      Pushes π onto the FPU stack.
Operation
Decrement FPU top-of-stack pointer;
ST(0) ← π
Description
FLDPI pushes π (in extended-real format) onto the FPU stack.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: ST(7) must be empty to avoid an invalid-operation exception. An internal 66-bit constant is
used and rounded to extended-real format (as specified by the RC field of the control word).
The precision exception is not raised.
2.77
FLDZ
Loads Constant +0.0
Opcode   Instruction   Clocks   Description
D9 EE    FLDZ          4        Pushes +0.0 onto the FPU stack.
Operation
Decrement FPU top-of-stack pointer;
ST(0) ← +0.0
Description
FLDZ pushes +0.0 (in extended-real format) onto the FPU stack.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and
underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: ST(7) must be empty to avoid an invalid exception. An internal 66-bit constant is
used and rounded to extended-real format (as specified by the RC field of the control word).
The precision exception is not raised.
2.78
FMUL
Multiplies Real
Opcode    Instruction      Clocks   Concurrent Execution   Description
D8 /1     FMUL m32real     11       8                      Multiplies ST by m32real.
DC /1     FMUL m64real     14       11                     Multiplies ST by m64real.
D8 C8+i   FMUL ST,ST(i)    16       13                     Multiplies ST by ST(i).
DC C8+i   FMUL ST(i),ST    16       13                     Multiplies ST(i) by ST.
Operation
DEST ← DEST ⋅ SRC
Description
The multiplication instructions multiply the destination operand by the source operand and
return the product to the destination.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Denormalized Operand, Invalid Operation,
Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it is automatically converted to the extended-real format.
2.79
FMULP
Multiplies Real and Pops FPU Stack Top
Opcode    Instruction       Clocks   Concurrent Execution   Description
DE C8+i   FMULP ST(i),ST    16       13                     Multiplies ST(i) by ST and pops ST.
DE C9     FMULP             16       13                     Multiplies ST(1) by ST and pops ST.
Operation
DEST ← DEST ⋅ SRC;
pop ST
Description
The multiplication instructions multiply the destination operand by the source operand and
return the product to the destination.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Denormalized Operand, Invalid Operation,
Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it is automatically converted to the extended-real format.
2.80
FNCLEX
Clears Exceptions without Checking for FPU Error
Opcode   Instruction   Clocks   Description
DB E2    FNCLEX        7        Clears floating-point exception flags without
                                checking for floating-point error conditions.
Operation
SW[0–7] ← 0;
SW[15] ← 0
Description
FNCLEX clears the exception flags, the exception status flag, and the busy flag of the FPU
status word.
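As a rough illustration of the Operation above, the following Python sketch models FNCLEX on a 16-bit status-word image. The function name fnclex and the constant names are hypothetical, and the bit names in the comments assume the standard x87 status-word layout, which this section does not spell out.

```python
# Hypothetical model of the FNCLEX operation on a 16-bit status-word image.
EXCEPTION_BITS = 0x00FF  # SW[0-7]: assumed IE, DE, ZE, OE, UE, PE, SF, ES
BUSY_BIT = 0x8000        # SW[15]: assumed busy flag (B)


def fnclex(sw: int) -> int:
    """Return the status word with SW[0-7] and SW[15] cleared."""
    return sw & ~(EXCEPTION_BITS | BUSY_BIT) & 0xFFFF
```

The remaining bits (condition codes and the top-of-stack field) pass through unchanged in this model; the manual itself lists C0 through C3 as undefined after FNCLEX.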
FPU Flags Affected
C0, C1, C2, C3 undefined
Numeric Exceptions
None
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
2.81
FNINIT Initializes FPU without Checking for Unmasked FPU Error
Opcode   Instruction   Clocks   Description
DB E3    FNINIT        17       Initializes FPU without checking for unmasked
                                floating-point error condition.
Operation
CW ← 037Fh;                    (* Control word *)
SW ← 0;                        (* Status word *)
TW ← FFFFh;                    (* Tag word *)
FEA ← 0; FDS ← 0;              (* Data pointer *)
FIP ← 0; FOP ← 0; FCS ← 0;     (* Instruction pointer *)
Description
The initialization instructions set the FPU into a known state, unaffected by any previous
activity.
The FPU control word is set to 037Fh (round to nearest, all exceptions masked, 64-bit
precision). The status word is cleared (no exception flags set, stack register R0 = stack
top). The stack registers are all tagged as empty. The error pointers (both instruction and
data) are cleared.
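To make the meaning of 037Fh concrete, here is a minimal Python sketch that decodes a control-word value into the three properties named above. The function name decode_control_word is hypothetical, and the field positions (exception masks in bits 0 to 5, PC in bits 8 and 9, RC in bits 10 and 11) are assumed from the standard x87 control-word layout rather than taken from this page.

```python
def decode_control_word(cw: int) -> dict:
    """Decode masks, precision control, and rounding control (assumed x87 layout)."""
    masks = cw & 0x3F            # bits 0-5: exception mask bits
    pc = (cw >> 8) & 0x3         # bits 8-9: precision control
    rc = (cw >> 10) & 0x3        # bits 10-11: rounding control
    return {
        "all_masked": masks == 0x3F,
        # PC = 01 is reserved, so it maps to None here.
        "precision_bits": {0: 24, 2: 53, 3: 64}.get(pc),
        "rounding": ("nearest", "down", "up", "chop")[rc],
    }
```

Under these assumptions, decoding 0x037F reports all exceptions masked, 64-bit precision, and round to nearest, which matches the description of the initialized state.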
FPU Flags Affected
C0, C1, C2, C3 cleared
Numeric Exceptions
None
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: FNINIT leaves the FPU in the same state as that which results from a hardware
RESET signal. Unlike the Intel 387 math coprocessor, FNINIT clears the error pointers in
the Am486 processor.
2.82
FNOP
No Operation
Opcode   Instruction   Clocks   Description
D9 D0    FNOP          3        No operation is performed.
Description
FNOP performs no operation. It affects only the instruction pointers.
FPU Flags Affected
C0, C1, C2, C3 undefined
Numeric Exceptions
None
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
2.83
FNSAVE
Stores FPU State w/o Checking for Unmasked FPU Error
Opcode   Instruction          Clocks                 Description
DD /6    FNSAVE m94/108byte   154 real or virtual/   Stores FPU state to m94byte or m108byte without
                              143 protected          checking for unmasked floating-point error
                                                     condition, and then reinitializes the FPU.
Operation
DEST ← FPU state;
initialize FPU; (* Equivalent to FNINIT *)
Description
FNSAVE writes the current FPU state (environment and register stack) to the specified
destination, and then reinitializes the FPU, without checking for unmasked floating-point
error conditions. The environment consists of the FPU control word, status word, tag word,
and error pointers (both data and instruction). The state layout in memory depends on both
the operand size and the current operating mode of the microprocessor. The USE attribute
of the current code segment determines the operand size: the 94-byte operand applies to
a USE16 segment, and the 108-byte operand applies to a USE32 segment. The stack registers,
ST(0) to ST(7), are in the 80 bytes immediately following the environment image.
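The two operand sizes follow directly from the layout just described: an environment image (14 bytes for USE16 or 28 bytes for USE32, as given for FNSTENV) followed by eight 10-byte stack registers. The short Python sketch below only checks that arithmetic; the names are hypothetical.

```python
# Environment image sizes by code-segment USE attribute.
ENV_BYTES = {"USE16": 14, "USE32": 28}
STACK_REGISTERS = 8
BYTES_PER_REGISTER = 10  # one 80-bit extended-real value


def state_image_bytes(use_attr: str) -> int:
    """Size of the saved state: environment image plus ST(0) to ST(7)."""
    return ENV_BYTES[use_attr] + STACK_REGISTERS * BYTES_PER_REGISTER
```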
FPU Flags Affected
C0, C1, C2, C3 cleared
Numeric Exceptions
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: FNSAVE does not store the FPU state until all FPU activity is complete; the saved
image reflects the state of the FPU after any previously decoded instruction is executed.
If a program must read from the memory image of the state after a save instruction, it must
issue an FWAIT instruction to ensure that the storage is complete. The save instructions
are typically used when an operating system needs to perform a context switch, or an
exception handler needs to use the FPU, or an application program wants to pass a “clean”
FPU to a subroutine.
2.84
FNSTCW
Stores Control Word without Checking for FPU Error
Opcode   Instruction     Clocks   Description
D9 /7    FNSTCW m2byte   3        Stores FPU control word to m2byte without checking for
                                  unmasked floating-point error condition.
Operation
DEST ← CW
Description
FNSTCW writes the current value of the FPU control word to the specified destination
without checking for an unmasked floating-point error condition.
FPU Flags Affected
C0, C1, C2, C3 undefined
Numeric Exceptions
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
2.85
FNSTENV
Stores FPU Environment w/o Checking for FPU Error
Opcode   Instruction          Clocks                Description
D9 /6    FNSTENV m14/28byte   67 real or virtual/   Stores FPU environment to m14byte or m28byte without
                              56 protected          checking for unmasked floating-point error condition.
                                                    Then masks all floating-point exceptions.
Operation
DEST ← FPU environment;
CW[0–5] ← 111111
Description
FNSTENV writes the current FPU environment to the specified destination, and then masks
all floating-point exceptions without checking for unmasked floating-point error conditions.
The FPU environment consists of the FPU control word, status word, tag word, and error
pointers (both data and instruction). The environment layout in memory depends on both
the operand size and the current operating mode of the microprocessor. The USE attribute
of the current code segment determines the operand size: the 14-byte operand applies to
a USE16 segment, and the 28-byte operand applies to a USE32 segment.
FPU Flags Affected
C0, C1, C2, C3 undefined
Numeric Exceptions
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: FNSTENV does not store the environment until all FPU activity is complete; the saved
environment reflects the state of the FPU after any previously decoded instruction has been
executed. The store environment instructions are often used by exception handlers
because they provide access to the FPU error pointers. The environment is typically saved
onto the memory stack. After saving the environment, FNSTENV sets all the exception
masks in the FPU control word. This prevents floating-point errors from interrupting the
exception handler.
2.86
FNSTSW Stores Status Word w/o Checking for Unmasked FPU Error
Opcode   Instruction     Clocks   Description
DF /7    FNSTSW m2byte   3        Stores FPU status word to m2byte without checking for
                                  unmasked floating-point error condition.
DF E0    FNSTSW AX       3        Stores FPU status word to AX register without checking
                                  for unmasked floating-point error condition.
Operation
DEST ← SW
Description
FNSTSW writes the current value of the FPU status word to the specified destination, which
can be either a 2-byte location in memory or the AX register, without checking for an
unmasked floating-point error condition.
FPU Flags Affected
C0, C1, C2, C3 undefined
Numeric Exceptions
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: FNSTSW is used primarily in conditional branching (after a comparison, FPREM,
FPREM1, or FXAM instruction). It can also invoke exception handlers (by polling the
exception bits) in environments that do not use interrupts. When FNSTSW AX is executed,
the AX register is updated before the Am486 microprocessor executes any further
instructions. The status stored is that from the completion of the prior ESC instruction.
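The conditional-branching use mentioned in this note comes down to testing condition-code bits in the stored status word. The Python sketch below is one way to model that decoding. It assumes the standard x87 bit positions (C0 = bit 8, C1 = bit 9, C2 = bit 10, C3 = bit 14) and the usual encoding of comparison results, neither of which is stated on this page; the function names are hypothetical.

```python
def condition_codes(sw: int) -> dict:
    """Extract C0 to C3 from a 16-bit status-word image (assumed x87 layout)."""
    return {
        "C0": (sw >> 8) & 1,
        "C1": (sw >> 9) & 1,
        "C2": (sw >> 10) & 1,
        "C3": (sw >> 14) & 1,
    }


def compare_result(sw: int) -> str:
    """Interpret C3, C2, C0 after a comparison (assumed standard encoding)."""
    cc = condition_codes(sw)
    outcomes = {
        (0, 0, 0): "ST > source",
        (0, 0, 1): "ST < source",
        (1, 0, 0): "ST = source",
        (1, 1, 1): "unordered",
    }
    return outcomes.get((cc["C3"], cc["C2"], cc["C0"]), "undefined")
```

In assembly code the same test is normally done by storing the status word to AX and examining AH; the sketch only shows which bits carry the information.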
2.87
FPATAN
Partial Arctangent
Opcode   Instruction   Clocks          Concurrent Execution   Description
D9 F3    FPATAN        289 (218–303)   5 (2–17)               Replaces ST(1) with arctan (ST(1) ÷ ST) and pops ST.
Operation
ST(1) ← arctan (ST(1) ÷ ST);
pop ST
Description
The partial arctangent instruction computes the arctangent of ST(1) ÷ ST and returns the
computed value, expressed in radians, to ST(1). It then pops ST. The result has the same
sign as the operand from ST(1) and a magnitude less than π.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Denormalized Operand, Invalid Operation, Stack
Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: There is no restriction on the range of arguments that FPATAN can accept. The fact
that FPATAN takes two arguments and computes the arctangent of their ratio simplifies the
calculation of other trigonometric functions. For instance, arcsin (x) (which is the arctangent
of x ÷ √(1 – x^2)) can be computed using the following sequence of operations: Push x onto
the FPU stack; compute √(1 – x^2) and push the resulting value onto the stack; execute
FPATAN. The Am486 processor checks for interrupts while performing this instruction. It
will abort this instruction to serve an interrupt.
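The arcsin identity in this note can be checked numerically on a host machine. The Python sketch below uses math.atan2 as a stand-in for the two-operand arctangent, with the first argument playing the role of ST(1) and the second the role of ST; it is not a model of FPATAN timing or exception behavior, and the function name is hypothetical.

```python
import math


def arcsin_via_two_arg_arctan(x: float) -> float:
    """Compute arcsin(x) as arctan(x / sqrt(1 - x^2)) for -1 <= x <= 1."""
    # First operand corresponds to ST(1) = x, second to ST = sqrt(1 - x^2).
    return math.atan2(x, math.sqrt(1.0 - x * x))
```

Because the two operands are passed separately, the case x = 1 (where the divisor is zero) needs no special handling.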
2.88
FPREM
Partial Remainder (Non-IEEE 754 compliant)
Opcode   Instruction   Clocks        Concurrent Execution   Description
D9 F8    FPREM         84 (70–138)   2 (2–8)                Replaces ST with the remainder
                                                            obtained when dividing ST by ST(1).
Operation
EXPDIF ← exponent(ST) – exponent(ST(1));
IF EXPDIF < 64
THEN
   Q ← integer obtained by chopping ST ÷ ST(1) toward zero;
   ST ← ST – (ST(1) ⋅ Q);
   C2 ← 0;
   C0, C3, C1 ← three least-significant bits of Q; (* Q2, Q1, Q0 *)
ELSE
   C2 ← 1;
   N ← a number between 32 and 63;
   QQ ← integer obtained by chopping (ST ÷ ST(1)) ÷ 2^(EXPDIF–N) toward zero;
   ST ← ST – (ST(1) ⋅ QQ ⋅ 2^(EXPDIF–N));
FI;
Description
FPREM computes the remainder of dividing ST by ST(1) using iterative subtraction and
leaves the result in ST. The remainder’s sign is the same as the sign of the original dividend
in ST. The magnitude of the remainder is less than that of the modulus.
FPU Flags Affected
If the IE and SF status word bits are set (stack exception), C1 indicates whether it is an
overflow (C1 = 1) or underflow (C1 = 0); otherwise, C0 = Q2, C3 = Q1, and C1 = Q0, the three
least-significant quotient bits. C2 indicates the reduction status: 0 = complete; 1 = incomplete.
Numeric Exceptions
Underflow, Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: FPREM produces an exact result with no precision (inexact) exception and no
rounding. FPREM does not comply with IEEE Std 754 (see FPREM1), but is compatible
with 8087 and 80287 coprocessors. A higher-priority interrupting routine can force the FPU
to switch context between the instructions in the remainder loop.
FPREM can reduce periodic function arguments. C3, C1, and C0 represent the three
least-significant quotient bits when execution is complete. This is important in argument
reduction for the tangent function (using a modulus of π/4), because it locates the original
angle within the correct sector of the unit circle.
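The following Python sketch models only the completed-reduction case (the EXPDIF < 64 branch of the Operation): a quotient chopped toward zero, a remainder that keeps the sign of the dividend, and the three low quotient bits that identify the sector. The function name fprem_model is hypothetical, double precision stands in for extended-real, and the sketch returns the bits as (Q2, Q1, Q0) without modeling how they are placed in the condition codes.

```python
import math
from fractions import Fraction


def fprem_model(dividend: float, modulus: float):
    """Return (remainder, (Q2, Q1, Q0)) for a truncating partial remainder."""
    # Exact rational division avoids a rounded quotient landing on the wrong integer.
    q = math.trunc(Fraction(dividend) / Fraction(modulus))
    # math.fmod is exact and gives the remainder the sign of the dividend.
    remainder = math.fmod(dividend, modulus)
    low = abs(q) & 0b111
    return remainder, ((low >> 2) & 1, (low >> 1) & 1, low & 1)
```

For example, reducing 2.5 radians with a modulus of π/4 yields quotient bits 0, 1, 1, which places the original angle in the fourth π/4 sector (sector index 3, counting from 0).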
2.89
FPREM1
Partial Remainder (IEEE 754 compliant)
Opcode   Instruction   Clocks          Concurrent Execution   Description
D9 F5    FPREM1        94.5 (72–167)   5.5 (2–18)             Replaces ST with the remainder
                                                              obtained when dividing ST by ST(1).
Operation
EXPDIF ← exponent(ST) – exponent(ST(1));
IF EXPDIF < 64
THEN
   Q ← integer obtained by rounding ST ÷ ST(1) to the nearest integer;
   ST ← ST – (ST(1) ⋅ Q);
   C2 ← 0;
   C0, C3, C1 ← three least-significant bits of Q; (* Q2, Q1, Q0 *)
ELSE
   C2 ← 1;
   N ← a number between 32 and 63;
   QQ ← integer obtained by chopping (ST ÷ ST(1)) ÷ 2^(EXPDIF–N) toward zero;
   ST ← ST – (ST(1) ⋅ QQ ⋅ 2^(EXPDIF–N));
FI;
Description
FPREM1 computes the remainder of dividing ST by ST(1) using iterative subtraction, and
leaves the result in ST. The magnitude of the remainder is no more than half that of the modulus.
FPU Flags Affected
If the IE and SF status word bits are set (stack exception), C1 indicates whether it is an
overflow (C1 = 1) or underflow (C1 = 0); otherwise, C0 = Q2, C3 = Q1, and C1 = Q0, the three
least-significant quotient bits. C2 indicates the reduction status: 0 = complete; 1 = incomplete.
Numeric Exceptions
Underflow, Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: FPREM1 produces an exact result with no precision (inexact) exception and no
rounding. FPREM1 complies with IEEE Std 754 (see also FPREM). A higher-priority
interrupting routine can force the FPU to switch context between the instructions in the
remainder loop.
FPREM1 can reduce periodic function arguments. C3, C1, and C0 represent the three
least-significant quotient bits when execution is complete. This is important in argument
reduction for the tangent function (using a modulus of π/4), because it locates the original
angle within the correct sector of the unit circle.
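The practical difference between the two remainder instructions can be seen with host library functions. In the Python sketch below, math.fmod stands in for the FPREM-style result (sign of the dividend) and math.remainder for the IEEE 754 remainder that FPREM1 is described as implementing. This is a comparison of the two definitions in double precision, not a model of the Am486 FPU, and the function name is hypothetical.

```python
import math


def compare_remainders(dividend: float, modulus: float):
    """Return (truncating remainder, IEEE 754 remainder) for the same operands."""
    truncating = math.fmod(dividend, modulus)   # quotient chopped toward zero
    ieee = math.remainder(dividend, modulus)    # quotient rounded to nearest
    return truncating, ieee
```

With operands 5.0 and 3.0 the truncating form gives 2.0, while the IEEE form gives -1.0, which is smaller in magnitude than half the modulus.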
2.90
FPTAN
Partial Tangent
Opcode   Instruction   Clocks          Concurrent Execution   Description
D9 F2    FPTAN         244 (200–273)   70                     Replaces ST with its tangent and
                                                              pushes 1.0 onto the FPU stack.
Operation
IF operand is in range
THEN
C2 ← 0;
ST ← tan (ST);
Decrement top-of-stack pointer;
ST ← 1.0;
ELSE
C2 ← 1;
FI
Description
FPTAN replaces the contents of ST with tan (ST), and then pushes 1.0 onto the FPU stack
to maintain 8087 and 80287 compatibility. ST, expressed in radians, must lie in the range
|θ| < 2^63.
FPU Flags Affected
If C2 = 0 (reduction complete), the result determines the C1 setting. If both the IE and SF
bits of the status word are set (indicating a stack exception), C1 distinguishes between
stack overflow (C1 = 1) and underflow (C1 = 0); if PE is set, C1 indicates whether the last
rounding was upward. If C2 = 1 (reduction incomplete), C1 is undefined. C0 and C3 are
always undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Denormalized Operand, Invalid Operation, Stack
Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: If the operand is outside the acceptable range, the C2 flag is set, and ST remains
unchanged. Reduce the operand to an absolute value smaller than 2^63 by subtracting an
appropriate integer multiple of 2π. For π, use the full 66-bit internal π used by the FPU:
4 ⋅ 0.C90FDAA22168C234Ch. This ensures that the results are consistent with the
argument reduction used by the FPU for trigonometric functions. You cannot represent this
number as an extended-real value, however. A suggested solution is to represent π as the
sum of a highπ (the 33 most-significant bits) and a lowπ (the 33 least-significant bits). The
Am486 processor can abort this instruction to service an interrupt.
ST(7) must be empty to avoid an invalid-operation exception.
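The suggested highπ/lowπ split can be worked through with exact rational arithmetic. The Python sketch below takes the hexadecimal constant quoted in the note, forms the 66-bit value, and separates it into two 33-bit pieces whose sum reproduces it exactly. All names are hypothetical, and the sketch shows only the splitting step, not a complete argument-reduction routine.

```python
from fractions import Fraction

# Hex fraction digits of 0.C90FDAA22168C234Ch (17 digits = 68 bits).
PI_FRACTION_DIGITS = 0xC90FDAA22168C234C
# 4 * 0.C90FDAA22168C234Ch, held exactly as a rational number.
PI_66 = Fraction(4 * PI_FRACTION_DIGITS, 16 ** 17)
# The same value as a 66-bit integer scaled by 2**64 (the two low bits are zero).
PI_BITS = PI_FRACTION_DIGITS >> 2


def split_pi():
    """Return (high_pi, low_pi): the upper and lower 33 bits of the 66-bit pi."""
    high = Fraction(PI_BITS >> 33, 2 ** 31)
    low = Fraction(PI_BITS & (2 ** 33 - 1), 2 ** 64)
    return high, low
```

Each piece needs at most 33 significant bits, so each fits in the 64-bit significand of an extended-real value even though their sum does not.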
2.91
FRNDINT
Rounds to Integer
Opcode   Instruction   Clocks         Concurrent Execution   Description
D9 FC    FRNDINT       29.1 (21–30)   7.4 (2–8)              Rounds ST to an integer.
Operation
ST ← rounded ST
Description
FRNDINT rounds the value in ST to an integer according to the RC field of the FPU control
word.
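As an illustration of how the RC field changes the result, the Python sketch below maps each rounding-control value to a host rounding function. The RC encoding (00 = nearest, 01 = down, 10 = up, 11 = chop) is assumed from the standard x87 control word and is not given on this page; the names are hypothetical, and the sketch ignores signed zeros, infinities, and NaNs.

```python
import math

# Assumed RC encoding: 00 nearest (ties to even), 01 down, 10 up, 11 chop.
ROUNDERS = {
    0b00: round,        # Python's round() uses ties-to-even for floats
    0b01: math.floor,
    0b10: math.ceil,
    0b11: math.trunc,
}


def frndint_model(st: float, rc: int) -> float:
    """Round a finite value to an integer according to the RC field."""
    return float(ROUNDERS[rc](st))
```

The value -2.5 shows all four behaviors: it rounds to -2.0 for nearest, up, and chop, and to -3.0 for down.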
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
2.92
FRSTOR
Restores FPU State
Opcode   Instruction          Clocks                 Description
DD /4    FRSTOR m94/108byte   131 real or virtual/   Loads FPU state from m94byte or m108byte.
                              120 protected
Operation
FPU state ← SRC;
Description
FRSTOR reloads the FPU state (environment and register stack) from the memory area
defined by the source operand. This data should have been written by a previous FSAVE
or FNSAVE instruction.
The FPU environment consists of the FPU control word, status word, tag word, and error
pointers (both data and instruction). The environment layout in memory depends on both
the operand size and the current operating mode of the microprocessor. The USE attribute
of the current code segment determines the operand size: the 14-byte operand applies to
a USE16 segment, and the 28-byte operand applies to a USE32 segment.
Figures 15-5 through 15-8 show the environment layouts for both operand sizes in both
Real Mode and Protected Mode. (In Virtual 8086 Mode, the Real Mode layout is used.) The
stack registers, beginning with ST and ending with ST(7), are in the 80 bytes that immediately follow the environment image. FRSTOR should be executed in the same operating
mode as the corresponding FSAVE or FNSAVE.
FPU Flags Affected
C0, C1, C2, C3 as loaded
Numeric Exceptions
None, except for loading an unmasked exception.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the state image contains an unmasked exception, loading it generates a floating-point error condition.
2.93
FSAVE
Stores FPU State after Checking for Unmasked FPU Error
Opcode     Instruction         Clocks                 Description
9B DD /6   FSAVE m94/108byte   154 real or virtual/   Stores FPU state to m94byte or m108byte after
                               143 protected          checking for unmasked floating-point error
                               (plus 3 or more for    condition. Reinitializes FPU.
                               FWAIT)
Operation
DEST ← FPU state;
initialize FPU; (* Equivalent to FNINIT *)
Description
FSAVE writes the current FPU state (environment and register stack) to the specified
destination and then reinitializes the FPU, after checking for unmasked floating-point
error conditions. The environment consists of the FPU control word, status word, tag word,
and error pointers (both data and instruction). The state layout in memory depends on both
the operand size and the current operating mode of the microprocessor. The USE attribute
of the current code segment determines the operand size: the 94-byte operand applies to
a USE16 segment, and the 108-byte operand applies to a USE32 segment. The stack registers,
ST(0) to ST(7), are in the 80 bytes immediately following the environment image.
FPU Flags Affected
C0, C1, C2, C3 cleared
Numeric Exceptions
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: FSAVE does not store the FPU state until all FPU activity is complete. The saved
image reflects the state of the FPU after any previously decoded instruction is executed.
If a program must read from the memory image of the state after a save instruction, it must
issue an FWAIT instruction to ensure that the storage is complete. The save instructions
are typically used when an operating system needs to perform a context switch, or an
exception handler needs to use the FPU, or an application program wants to pass a “clean”
FPU to a subroutine.
2.94
FSCALE
Scales
Opcode   Instruction   Clocks       Concurrent Execution   Description
D9 FD    FSCALE        31 (30–32)   2                      Scales ST by ST(1).
Operation
ST ← ST ⋅ 2^ST(1)
Description
The scale instruction interprets the value in ST(1) as an integer, and adds this integer to
the exponent of ST. Thus, FSCALE provides rapid multiplication or division by integral
powers of 2.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Denormalized Operand, Invalid Operation,
Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: FSCALE can be used as an inverse to FXTRACT. Since FSCALE does not pop the
exponent part, however, FSCALE must be followed by FSTP ST(1) in order to completely
undo the effect of a preceding FXTRACT. There is no limit on the range of the scale factor
in ST(1). If the value is not integral, FSCALE uses the nearest integer smaller in magnitude
(i.e., it chops the value toward 0). If the resulting integer is zero, the value in ST is not
changed.
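The chopping rule in the note can be illustrated with a short Python sketch. It models the arithmetic only, not the FPU: the function name `fscale` is hypothetical, Python doubles stand in for extended-real values, and `math.ldexp` raises `OverflowError` where the FPU would signal an overflow exception.

```python
import math

def fscale(st, st1):
    """Model ST <- ST * 2**trunc(ST(1)) for finite operands (hypothetical helper)."""
    # A non-integral scale factor is chopped toward zero.
    scale = math.trunc(st1)
    # ldexp adds `scale` to the binary exponent of `st`.
    return math.ldexp(st, scale)

print(fscale(3.0, 2.7))    # scale factor chops to 2
print(fscale(3.0, -1.5))   # scale factor chops to -1
print(fscale(5.0, 0.9))    # scale factor chops to 0, value unchanged
```

A scale factor between –1 and +1 chops to zero, which is why the last call leaves the value unchanged.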
2.95
FSIN
Sine
Opcode   Instruction   Clocks          Concurrent Execution   Description
D9 FE    FSIN          241 (193–279)   2                      Replaces ST with its sine.
Operation
IF operand is in range
THEN
C2 ← 0;
ST ← sin (ST);
ELSE
C2 ← 1;
FI;
Description
The sine instruction replaces the contents of ST with sin (ST). ST, expressed in radians,
must lie in the range |ST| < 2^63.
FPU Flags Affected
If C2 = 0 (reduction complete), the result determines the C1 setting. If both the IE and SF
bits of the status word are set (indicating a stack exception), C1 distinguishes between
stack overflow (C1 = 1) and underflow (C1 = 0); if PE is set, C1 indicates whether the last
rounding was upward. If C2 = 1 (reduction incomplete), C1 is undefined. C0 and C3 are
undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Denormalized Operand, Invalid Operation, Stack
Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: If the operand is outside the acceptable range, the C2 flag is set and ST remains
unchanged. Reduce the operand to an absolute value smaller than 2^63 by subtracting an
appropriate integer multiple of 2π. For π, use the full 66-bit internal π used by the FPU:
4 ⋅ 0.C90FDAA22168C234Ch. This ensures that the results are consistent with the
argument reduction used by the FPU for trigonometric functions. You cannot represent this
number as an extended-real value, however. A suggested solution is to represent π as the
sum of a highπ (the 33 most-significant bits) and a lowπ (the 33 least-significant bits). The
Am486 processor can abort this instruction to service an interrupt.
If you need to compute sine and cosine, use FSINCOS for faster execution.
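The suggested high/low split of π can be sketched in Python with exact integer arithmetic. This is an illustration under stated assumptions, not a prescribed implementation: the names `PI_MANTISSA`, `HIGH_PI`, and `LOW_PI` are hypothetical, and Python doubles stand in for the extended-real operands a real reduction routine would use.

```python
from fractions import Fraction

# 66-bit internal pi quoted in the note: pi = 4 * 0.C90FDAA22168C234Ch.
# The 17 hex digits hold 68 bits, of which the low 2 bits are zero.
PI_HEX_DIGITS = 0xC90FDAA22168C234C
PI_MANTISSA = PI_HEX_DIGITS >> 2           # 66 significant bits
PI_EXACT = Fraction(PI_MANTISSA, 2**64)    # equals 4 * digits / 16**17

# Split into the 33 most-significant and 33 least-significant bits.
hi_bits = PI_MANTISSA >> 33
lo_bits = PI_MANTISSA & (2**33 - 1)

# Each half has at most 33 significant bits, so each is exact as a double.
HIGH_PI = hi_bits / 2**31
LOW_PI = lo_bits / 2**64

print(HIGH_PI, LOW_PI)
```

Because each half is exactly representable, a reduction routine can subtract multiples of the two halves separately and lose no bits of the 66-bit constant.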
2.96
FSINCOS
Sine and Cosine
Opcode   Instruction   Clocks          Concurrent Execution   Description
D9 FB    FSINCOS       291 (243–329)   2                      Computes the sine and cosine of ST;
                                                              replaces ST with the sine, and then
                                                              pushes the cosine onto the FPU stack.
Operation
IF operand is in range
THEN
C2 ← 0;
TEMP ← cos (ST);
ST ← sin (ST);
Decrement FPU top-of-stack pointer;
ST ← TEMP;
ELSE
C2 ← 1;
FI;
Description
FSINCOS computes both sin (ST) and cos (ST), replaces ST with the sine, and then
pushes the cosine onto the FPU stack. ST, expressed in radians, must lie in the range
|ST| < 2^63.
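The stack effect of the Operation listing can be modeled with a short Python sketch. The function `fsincos` and the list-based stack (element 0 is ST) are hypothetical stand-ins; exceptions, rounding control, and stack overflow are not modeled.

```python
import math

def fsincos(stack):
    """Model FSINCOS on a list whose element 0 is ST. Returns the C2 flag."""
    x = stack[0]
    if abs(x) < 2.0**63:
        temp = math.cos(x)
        stack[0] = math.sin(x)   # ST <- sin(ST)
        stack.insert(0, temp)    # push: the cosine becomes the new ST
        return 0                 # C2 = 0, reduction complete
    return 1                     # C2 = 1, operand out of range, ST unchanged

regs = [0.0]
c2 = fsincos(regs)
print(c2, regs)
```

After the call the cosine is in ST and the sine is in ST(1), matching the order in which the listing stores and pushes the two results.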
FPU Flags Affected
If C2 = 0 (reduction complete), the result determines the C1 setting. If both the IE and SF
bits of the status word are set (indicating a stack exception), C1 distinguishes between
stack overflow (C1 = 1) and underflow (C1 = 0); if PE is set, C1 indicates whether the last
rounding was upward. If C2 = 1 (reduction incomplete), C1 is undefined. C0 and C3 are
undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Denormalized Operand, Invalid Operation, Stack
Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: If the operand is outside the acceptable range, the C2 flag is set, and ST remains
unchanged. Reduce the operand to an absolute value smaller than 2^63 by subtracting an
appropriate integer multiple of 2π. For π, use the full 66-bit internal π used by the FPU:
4 ⋅ 0.C90FDAA22168C234Ch. This ensures that the results are consistent with the
argument reduction used by the FPU for trigonometric functions. You cannot represent this
number as an extended-real value, however. A suggested solution is to represent π as the
sum of a highπ (the 33 most-significant bits) and a lowπ (the 33 least-significant bits). The
Am486 processor can abort this instruction to service an interrupt.
2.97
FSQRT
Square Root
Opcode   Instruction   Clocks         Concurrent Execution   Description
D9 FA    FSQRT         85.5 (83–87)   70                     Replaces ST with its square root.
Operation
ST ← square root of ST;
Description
The square root instruction replaces the value in ST with its square root.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: The square root of –0 is –0.
2.98
FST
Stores Real
Opcode    Instruction    Clocks   Description
D9 /2     FST m32real    7        Copies ST to m32real.
DD /2     FST m64real    8        Copies ST to m64real.
DD D0+i   FST ST(i)      3        Copies ST to ST(i).
Operation
DEST ← ST(0)
Description
FST copies the current value in the ST register to the destination, which can be another
register or a single or double real-memory operand.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Register destinations: Stack Fault
Single or double real destinations: Precision (Inexact Result), Underflow, Overflow, Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the destination is 32 bits or 64 bits, the significand is rounded to the width of the
destination according to the RC field of the control word, and the exponent is converted to
the width and bias of the destination format. The over/underflow condition is checked as
well. If ST contains zero, ±∞, or a NaN, then the significand is not rounded but chopped
(on the right) to fit the destination. The exponent of such a value is not converted; it too is
chopped on the right. These operations preserve the value's identity as ∞ or NaN (exponent
all ones). The invalid-operation exception is not raised when the destination is a nonempty
stack element.
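The effect of rounding the significand to a 32-bit destination can be seen in a small Python sketch. It is an approximation under stated assumptions: `store_m32real` is a hypothetical helper, the source value is a Python double rather than an 80-bit register, only round-to-nearest is modeled, and finite values too large for the single format raise `OverflowError` in Python instead of setting an exception flag.

```python
import struct

def store_m32real(value):
    """Round a Python double to single precision, as a 32-bit store would."""
    # struct converts with round-to-nearest, the default RC setting.
    return struct.unpack("<f", struct.pack("<f", value))[0]

print(store_m32real(1.5))            # exactly representable, unchanged
print(store_m32real(0.1))            # rounded to the nearest single
print(store_m32real(float("inf")))   # infinity keeps its identity
```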
2.99
FSTCW
Stores Control Word after Checking for FPU Error
Opcode     Instruction    Clocks             Description
9B D9 /7   FSTCW m2byte   3 + 3+ for FWAIT   Stores FPU control word to m2byte after checking for
                                             unmasked floating-point error condition.
Operation
DEST ← CW
Description
FSTCW writes the current value of the FPU control word to the specified destination, after
checking for an unmasked floating-point error condition.
FPU Flags Affected
C0, C1, C2, C3 undefined
Numeric Exceptions
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
2.100
FSTENV
Stores FPU Environment after Checking for FPU Error
Opcode     Instruction         Clocks                Description
9B D9 /6   FSTENV m14/28byte   67 real or virtual/   Stores FPU environment to m14byte or m28byte
                               56 protected +        after checking for unmasked floating-point
                               3+ for FWAIT          error condition; then masks all floating-point
                                                     exceptions.
Operation
DEST ← FPU environment;
CW[0–5] ← 111111
Description
After checking for unmasked floating-point error conditions, FSTENV writes the current FPU
environment to the specified destination and then masks all floating-point exceptions.
The FPU environment consists of the FPU control word, status word, tag word, and error
pointer (both data and instruction). The environment layout in memory depends on both
the operand size and the current operating mode of the microprocessor. The USE attribute
of the current code segment determines the operand size: the 14-byte operand applies to
a USE16 segment, and the 28-byte operand applies to a USE32 segment.
FPU Flags Affected
C0, C1, C2, C3 undefined
Numeric Exceptions
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: FSTENV does not store the FPU environment until all FPU activity is complete. The
saved environment reflects the state of the FPU after any previously decoded instruction
has been executed. The store-environment instructions are often used by exception
handlers because they provide access to the FPU error pointers. The FPU environment is
typically saved onto the memory stack. After saving the FPU environment, FSTENV sets
all the exception masks in the FPU control word. This prevents floating-point errors from
interrupting the exception handler.
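The two rules above (operand size by segment type, and masking through CW[0–5]) can be sketched in Python. The helper names `env_size` and `mask_all_exceptions` are hypothetical, and the sample control word 0340h is only an illustrative value.

```python
def env_size(use32):
    """Bytes written by FSTENV: 28 in a USE32 segment, 14 in a USE16 segment."""
    return 28 if use32 else 14

def mask_all_exceptions(control_word):
    """Model CW[0-5] <- 111111: set the six exception-mask bits."""
    return control_word | 0x3F

print(env_size(True), env_size(False))
print(hex(mask_all_exceptions(0x0340)))
```

Only the low six bits change; the remaining control-word fields pass through unmodified.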
2.101
FSTP
Stores Real and Pops the FPU Stack Top
Opcode    Instruction    Clocks   Description
D9 /3     FSTP m32real   7        Copies ST to m32real, then pops ST.
DD /3     FSTP m64real   8        Copies ST to m64real, then pops ST.
DB /7     FSTP m80real   6        Copies ST to m80real, then pops ST.
DD D8+i   FSTP ST(i)     3        Copies ST to ST(i), then pops ST.
Operation
DEST ← ST(0);
pop ST
Description
FSTP copies the current ST register value to the destination, which can be another register
or a single-, double-, or extended-real memory operand, and then pops ST. If the destination
is a register, the register number is used before the stack is popped.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Register or extended-real destinations: Stack Fault
Single or double real destinations: Precision (Inexact Result), Underflow, Overflow, Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates part of the operand lies outside the effective address
space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the destination is 32 bits or 64 bits, the significand is rounded to the width of the
destination according to the RC field of the control word, and the exponent is converted to
the width and bias of the destination format. The over/underflow condition is checked for
as well. If ST contains zero, ±∞, or a NaN, then the significand is not rounded but chopped
(on the right) to fit the destination. The exponent of such a value is not converted; it too is
chopped on the right. These operations preserve the value's identity as ∞ or NaN (exponent
all ones). The invalid-operation exception is not raised when the destination is a nonempty
stack element.
2.102
FSTSW
Stores Status Word after Checking for Unmasked FPU Error
Opcode     Instruction    Clocks             Description
9B DF /7   FSTSW m2byte   3 + 3+ for FWAIT   Stores FPU status word to m2byte after checking for
                                             unmasked floating-point error condition.
9B DF E0   FSTSW AX       3 + 3+ for FWAIT   Stores FPU status word to AX register after checking for
                                             unmasked floating-point error condition.
Operation
DEST ← SW
Description
FSTSW writes the current value of the FPU status word to the specified destination, which
can be either a 2-byte location in memory or the AX register, after checking for an unmasked
floating-point error condition.
FPU Flags Affected
C0, C1, C2, C3 undefined
Numeric Exceptions
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: FSTSW is used primarily in conditional branching (after a comparison, FPREM,
FPREM1, or FXAM instruction). It can also invoke exception handlers (by polling the
exception bits) in environments that do not use interrupts.
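Conditional branching on a stored status word amounts to extracting the condition-code bits. The Python sketch below assumes the standard status-word layout (C0 in bit 8, C1 in bit 9, C2 in bit 10, C3 in bit 14), which this section does not state; the function names are hypothetical, and the relation table is the one given in the Operation listings of the compare instructions.

```python
# Assumed bit positions of the condition codes in the FPU status word.
C0_BIT, C1_BIT, C2_BIT, C3_BIT = 8, 9, 10, 14

def condition_codes(status_word):
    """Return (C3, C2, C1, C0) extracted from a 16-bit status word."""
    return tuple((status_word >> bit) & 1
                 for bit in (C3_BIT, C2_BIT, C1_BIT, C0_BIT))

def relation(status_word):
    """Interpret C3, C2, C0 after a compare instruction."""
    c3, c2, _c1, c0 = condition_codes(status_word)
    return {(0, 0, 0): "ST > SRC", (0, 0, 1): "ST < SRC",
            (1, 0, 0): "ST = SRC", (1, 1, 1): "unordered"}.get((c3, c2, c0))

print(relation(0x0100))   # only C0 set
print(relation(0x4000))   # only C3 set
```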
2.103
FSUB
Subtracts Real
Opcode    Instruction     Clocks      Concurrent Execution   Description
D8 /4     FSUB m32real    10 (8–20)   7 (5–17)               Subtracts m32real from ST.
DC /4     FSUB m64real    10 (8–20)   7 (5–17)               Subtracts m64real from ST.
D8 E0+i   FSUB ST,ST(i)   10 (8–20)   7 (5–17)               Subtracts ST(i) from ST.
DC E8+i   FSUB ST(i),ST   10 (8–20)   7 (5–17)               Subtracts ST from ST(i).
Operation
DEST ← DEST – SRC
Description
The subtraction instructions subtract the source operand from the destination operand and
return the difference to the destination. For the memory-operand forms, the destination is ST.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Denormalized Operand, Invalid Operation,
Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it is automatically converted to the extended-real format.
2.104
FSUBP
Subtracts Real and Pops FPU Stack Top
Opcode    Instruction      Clocks      Concurrent Execution   Description
DE E8+i   FSUBP ST(i),ST   10 (8–20)   7 (5–17)               Subtracts ST from ST(i); pops ST.
DE E9     FSUBP            10 (8–20)   7 (5–17)               Subtracts ST from ST(1); pops ST.
Operation
DEST ← DEST – ST;
pop ST
Description
FSUBP subtracts the stack top from the destination register, returns the difference to the
destination, and pops ST.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Denormalized Operand, Invalid Operation,
Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it is automatically converted to the extended-real format.
2.105
FSUBR
Reverse Subtracts Real
Opcode    Instruction      Clocks      Concurrent Execution   Description
D8 /5     FSUBR m32real    10 (8–20)   7 (5–17)               Replaces ST with m32real – ST.
DC /5     FSUBR m64real    10 (8–20)   7 (5–17)               Replaces ST with m64real – ST.
D8 E8+i   FSUBR ST,ST(i)   10 (8–20)   7 (5–17)               Replaces ST with ST(i) – ST.
DC E0+i   FSUBR ST(i),ST   10 (8–20)   7 (5–17)               Replaces ST(i) with ST – ST(i).
Operation
DEST ← SRC – DEST
Description
The reverse subtraction instructions subtract the destination operand from the source operand
and return the difference to the destination. For the memory-operand forms, the destination is ST.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Denormalized Operand, Invalid Operation,
Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it is automatically converted to the extended-real format.
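The difference between FSUB and FSUBR is easiest to see on the forms whose destination is ST. The Python sketch below is a minimal model with hypothetical function names; the list stands in for the FPU register stack (element 0 is ST), and flags and exceptions are not modeled.

```python
def fsub_st_sti(stack, i):
    """FSUB ST,ST(i): ST <- ST - ST(i). Element 0 of the list is ST."""
    stack[0] = stack[0] - stack[i]

def fsubr_st_sti(stack, i):
    """FSUBR ST,ST(i): ST <- ST(i) - ST."""
    stack[0] = stack[i] - stack[0]

a = [5.0, 2.0]
fsub_st_sti(a, 1)
b = [5.0, 2.0]
fsubr_st_sti(b, 1)
print(a, b)
```

With ST = 5.0 and ST(1) = 2.0, the two instructions leave results of opposite sign in ST and do not modify ST(1).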
2.106
FSUBRP
Reverse Subtracts and Pops FPU Stack Top
Opcode    Instruction       Clocks      Concurrent Execution   Description
DE E0+i   FSUBRP ST(i),ST   10 (8–20)   7 (5–17)               Replaces ST(i) with ST – ST(i); pops ST.
DE E1     FSUBRP            10 (8–20)   7 (5–17)               Replaces ST(1) with ST – ST(1); pops ST.
Operation
DEST ← ST – DEST;
pop ST
Description
FSUBRP subtracts the destination register from the stack top, returns the difference to the
destination, and pops ST.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Denormalized Operand, Invalid Operation,
Stack Fault
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Coprocessor Not Available (7) occurs if either EM or TS in
CR0 is set. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: If the source operand is in memory, it is automatically converted to the extended-real format.
2.107
FTST
Test
Opcode   Instruction   Clocks   Concurrent Execution   Description
D9 E4    FTST          4        1                      Compares ST with 0.0.
Operation
CASE (relation of operands) OF
   Not comparable:   C3, C2, C0 ← 111;
   ST > SRC:         C3, C2, C0 ← 000;
   ST < SRC:         C3, C2, C0 ← 001;
   ST = SRC:         C3, C2, C0 ← 100;
CF ← C0;
PF ← C2;
ZF ← C3;
Description
FTST compares the stack top to 0.0. Following the instruction, the condition codes reflect
the result of the comparison.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are set as specified above.
Numeric Exceptions
Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: If ST contains a NaN or is in an undefined format, or if a stack fault occurs, the invalid-operation exception is raised, and the condition bits are set to “unordered.”
The sign of zero is ignored, so that –0.0 = +0.0.
2.108
FUCOM
Unordered Compare Real
Opcode
Instruction
Clocks
DD E0+1
DD E1
FUCOM ST(i)
FUCOM
4
4
Concurrent
Execution
Description
Compares ST with ST(i).
Compares ST with ST(1).
Operation
CASE (relation of operands) OF
   Not comparable:   C3, C2, C0 ← 111;
   ST > SRC:         C3, C2, C0 ← 000;
   ST < SRC:         C3, C2, C0 ← 001;
   ST = SRC:         C3, C2, C0 ← 100;
CF ← C0;
PF ← C2;
ZF ← C3;
Description
FUCOM compares the stack top to the source, which must be a register. If no operand is
encoded, ST is compared to ST(1). Following the instruction, the condition codes reflect
the relation between ST and the source operand.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are set as specified above.
Numeric Exceptions
Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: If either operand is an SNaN or is in an undefined format, or if a stack fault occurs, the
invalid-operation exception is raised, and the condition bits are set to “unordered.”
If either operand is a QNaN, the condition bits are set to “unordered.” Unlike the ordinary
compare instructions (FCOM, etc.), the unordered compare instructions do not raise the
invalid-operation exception if there is a QNaN operand.
The sign of zero is ignored, so that –0.0 = +0.0.
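The condition-code settings for an unordered compare can be modeled in Python. This is a sketch under stated assumptions: `fucom` is a hypothetical function, Python NaNs behave like QNaNs here (no exception is raised), and undefined formats and stack faults are not modeled.

```python
import math

def fucom(st, src):
    """Return (C3, C2, C0) for an unordered compare of two Python floats."""
    if math.isnan(st) or math.isnan(src):
        return (1, 1, 1)   # not comparable ("unordered")
    if st > src:
        return (0, 0, 0)
    if st < src:
        return (0, 0, 1)
    return (1, 0, 0)       # equal; -0.0 and +0.0 compare equal

print(fucom(2.0, 1.0), fucom(-0.0, 0.0), fucom(float("nan"), 1.0))
```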
2.109
FUCOMP
Unordered Compare Real and Pop FPU Stack Top
Opcode    Instruction    Clocks   Concurrent Execution   Description
DD E8+i   FUCOMP ST(i)   4        1                      Compares ST with ST(i) and pops ST.
DD E9     FUCOMP         4        1                      Compares ST with ST(1) and pops ST.
Operation
CASE (relation of operands) OF
   Not comparable:   C3, C2, C0 ← 111;
   ST > SRC:         C3, C2, C0 ← 000;
   ST < SRC:         C3, C2, C0 ← 001;
   ST = SRC:         C3, C2, C0 ← 100;
CF ← C0;
PF ← C2;
ZF ← C3;
pop ST
Description
FUCOMP compares the stack top to the source, which must be a register, then pops ST.
If no operand is encoded, ST is compared to ST(1). Following the instruction, the condition
codes reflect the relation between ST and the source operand.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are set as specified above.
Numeric Exceptions
Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: If either operand is an SNaN or is in an undefined format, or if a stack fault occurs, the
invalid-operation exception is raised and the condition bits are set to “unordered.”
If either operand is a QNaN, the condition bits are set to “unordered.” Unlike the ordinary
compare instructions (FCOM, etc.), the unordered compare instructions do not raise the
invalid-operation exception if there is a QNaN operand.
The sign of zero is ignored, so that –0.0 = +0.0.
2.110
FUCOMPP
Unordered Compare Real and Pop FPU Stack Top Twice
Opcode   Instruction   Clocks   Concurrent Execution   Description
DA E9    FUCOMPP       5        1                      Compares ST with ST(1) and pops ST twice.
Operation
CASE (relation of operands) OF
   Not comparable:   C3, C2, C0 ← 111;
   ST > SRC:         C3, C2, C0 ← 000;
   ST < SRC:         C3, C2, C0 ← 001;
   ST = SRC:         C3, C2, C0 ← 100;
CF ← C0;
PF ← C2;
ZF ← C3;
pop ST; pop ST
Description
FUCOMPP compares the stack top to ST(1) and pops ST twice. Following the instruction,
the condition codes reflect the relation between ST and the source operand.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are set as specified above.
Numeric Exceptions
Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: If either operand is an SNaN or is in an undefined format, or if a stack fault occurs, the
invalid-operation exception is raised and the condition bits are set to “unordered.”
If either operand is a QNaN, the condition bits are set to “unordered.” Unlike the ordinary
compare instructions (FCOM, etc.), the unordered compare instructions do not raise the
invalid-operation exception if there is a QNaN operand.
The sign of zero is ignored, so that –0.0 = +0.0.
2.111
FWAIT
Wait
Opcode   Instruction   Clocks   Description
9B       FWAIT         (1–3)    Alias for WAIT.
Description
FWAIT causes the microprocessor to check for pending unmasked numeric exceptions
before proceeding.
FPU Flags Affected
C0, C1, C2, C3 undefined
Numeric Exceptions
None
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if both MP and TS in CR0 are set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if both MP and TS in CR0 are set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if both MP and TS in CR0 are set.
Note: As its opcode shows, FWAIT is not actually an ESC instruction but an alternate
mnemonic for WAIT. Coding FWAIT after an ESC instruction ensures that any unmasked
floating-point exceptions caused by the instruction are handled before the processor
modifies the instruction’s results.
2.112
FXAM
Examine
Opcode   Instruction   Clocks   Description
D9 E5    FXAM          8        Reports the type of object in the ST register.
Operation
C1 ← sign bit of ST; (* 0 for positive, 1 for negative *)
CASE (type of object in ST) OF
   Unsupported: C3, C2, C0 ← 000;
   NaN:         C3, C2, C0 ← 001;
   Normal:      C3, C2, C0 ← 010;
   Infinity:    C3, C2, C0 ← 011;
   Zero:        C3, C2, C0 ← 100;
   Empty:       C3, C2, C0 ← 101;
   Denormal:    C3, C2, C0 ← 110;
CF ← C0;
PF ← C2;
ZF ← C3;
FI
Description
The examine instruction reports the type of object contained in the ST register by setting
the FPU Flags.
FPU Flags Affected
C0, C1, C2, C3 are set as shown above.
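The classification can be sketched in Python as follows. This is a model under stated assumptions, not the processor's logic: the name fxam is hypothetical, Python floats (IEEE doubles) stand in for extended-format registers, None stands in for an empty register, and the Unsupported class cannot occur for doubles.

```python
import math
import sys

FXAM_CLASSES = {
    (0, 0, 0): "Unsupported", (0, 0, 1): "NaN", (0, 1, 0): "Normal",
    (0, 1, 1): "Infinity", (1, 0, 0): "Zero", (1, 0, 1): "Empty",
    (1, 1, 0): "Denormal",
}

def fxam(value):
    """Return (C3, C2, C1, C0); None models an empty stack register."""
    if value is None:
        return (1, 0, 0, 1)                       # C1 is arbitrary in this model
    c1 = 1 if math.copysign(1.0, value) < 0 else 0  # sign bit of ST
    if math.isnan(value):
        c3, c2, c0 = 0, 0, 1
    elif math.isinf(value):
        c3, c2, c0 = 0, 1, 1
    elif value == 0.0:
        c3, c2, c0 = 1, 0, 0
    elif abs(value) < sys.float_info.min:         # below smallest normal double
        c3, c2, c0 = 1, 1, 0
    else:
        c3, c2, c0 = 0, 1, 0
    return (c3, c2, c1, c0)

c3, c2, c1, c0 = fxam(-0.0)
print(FXAM_CLASSES[(c3, c2, c0)], "sign bit:", c1)
```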
Numeric Exceptions
None
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
2.113
FXCH
Exchanges Stack Register Contents
Opcode    Instruction   Clocks   Description
D9 C8+i   FXCH ST(i)    4        Exchanges the contents of ST and ST(i).
D9 C9     FXCH          4        Exchanges the contents of ST and ST(1).
Operation
TEMP ← ST;
ST ← DEST;
DEST ← TEMP
Description
FXCH swaps the contents of the destination and stack top registers. If the destination is
not coded explicitly, ST(1) is used.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: Many numeric instructions operate only on the stack top; FXCH provides a simple
means for using these instructions on lower stack elements. For example, the following
sequence takes the square root of the third register from the top (assuming that ST is not
empty):
FXCH ST(3)
FSQRT
FXCH ST(3)
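The exchange, operate, exchange idiom can be modeled with a Python list in which index 0 plays the role of ST. The helper names fxch and sqrt_at are hypothetical, and the sketch ignores stack faults and tag words.

```python
import math

def fxch(stack, i=1):
    """Swap ST with ST(i); ST(1) is used when i is not given."""
    stack[0], stack[i] = stack[i], stack[0]

def sqrt_at(stack, i):
    """Apply a stack-top-only operation (square root) to ST(i)."""
    fxch(stack, i)                   # bring ST(i) to the stack top
    stack[0] = math.sqrt(stack[0])   # FSQRT operates on ST only
    fxch(stack, i)                   # restore the original ordering

regs = [1.0, 2.0, 3.0, 16.0]
sqrt_at(regs, 3)
print(regs)
```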
2.114
FXTRACT
Extracts Exponent and Significand
Opcode   Instruction   Clocks       Concurrent Execution
D9 F4    FXTRACT       19 (16–20)   4 (2–4)
Description: Separates ST into its exponent and significand; replaces ST with the
exponent and then pushes the significand onto the FPU stack.
Operation
TEMP ← significand of ST;
ST ← exponent of ST;
Decrement FPU top-of-stack pointer;
ST ← TEMP
Description
FXTRACT splits the value in ST into its exponent and significand. The exponent replaces
the original operand on the stack and the significand is pushed onto the stack. ST (the new
stack top) contains the value of the significand as a real number with the same sign, a 0
true (16,383 or 3FFFh biased) exponent, and identical significand as the original operand.
ST(1) contains the original operand’s true (unbiased) exponent expressed as a real number.
FPU Flags Affected
The result determines the C1 setting. If both the IE and SF bits of the status word are set
(indicating a stack exception), C1 distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0); otherwise, C1 = 0. C0, C2, and C3 are undefined.
Numeric Exceptions
Divide By Zero, Denormalized Operand, Invalid Operation, Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: FXTRACT (extract exponent and significand) performs a superset of the IEEE
recommended logb(x) function. It is useful for power and range scaling operations. Both
FXTRACT and F2XM1 are needed to perform a general power operation. You must use
FXTRACT with FBSTP when converting extended-format real numbers to decimal
representations to allow scaling that does not overflow the extended format range.
FXTRACT is also useful for debugging because it allows separate examination of a real
number’s exponent and significand.
If the original operand is zero, FXTRACT leaves –∞ in ST(1) (the exponent), assigns a
zero value with the same sign as the original operand to ST, and generates a zero divide
exception. ST(7) must be empty to avoid the invalid-operation exception.
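A minimal Python sketch of the split, assuming finite double-precision inputs in place of extended-format operands. The name fxtract and the returned tuple order (new ST(1), new ST) are illustrative choices; the zero divide exception for a zero operand is not modeled.

```python
import math

def fxtract(x):
    """Return (exponent, significand) as left in ST(1) and ST."""
    if x == 0.0:
        # Zero operand: exponent is minus infinity, significand keeps the sign.
        return (-math.inf, x)
    mantissa, exp = math.frexp(x)    # x = mantissa * 2**exp, 0.5 <= |mantissa| < 1
    # Rescale so that 1 <= |significand| < 2 with a true exponent.
    return (float(exp - 1), mantissa * 2.0)

print(fxtract(8.0))
print(fxtract(-0.375))
```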
2.115
FYL2X
Computes y ⋅ log2(x)
Opcode   Instruction   Clocks          Concurrent Execution
D9 F1    FYL2X         311 (196–329)   13
Description: Replaces ST(1) with ST(1) ⋅ log2(ST) and pops ST.
Operation
ST(1) ← ST(1) ⋅ log2(ST);
pop ST
Description
FYL2X computes the base-2 logarithm of ST, multiplies the logarithm by ST(1), and returns
the resulting value to ST(1). It then pops ST. The operand in ST cannot be negative.
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Overflow, Divide By Zero, Denormalized Operand,
Invalid Operation, Stack Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: The Am486 processor can abort this instruction to service an interrupt.
If the operand in ST is negative, the invalid-operation exception is raised. FYL2X has built-in
multiplication to optimize the calculation of logarithms with an arbitrary positive base:
log_b(x) = (log2(b))^–1 ⋅ log2(x)
The instructions FLDL2T and FLDL2E load the constants log2(10) and log2(e), respectively.
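The arbitrary-base identity can be checked with a short Python sketch. The names fyl2x and log_base are hypothetical, and the model is simplified: it rejects zero as well as negative operands rather than reproducing the processor's exception behavior.

```python
import math

def fyl2x(y, x):
    """Model of the FYL2X result: y * log2(x)."""
    if x <= 0:
        raise ValueError("operand must be positive in this simplified model")
    return y * math.log2(x)

def log_base(x, b):
    """log_b(x) computed as (log2(b))**-1 * log2(x)."""
    return fyl2x(1.0 / math.log2(b), x)

print(log_base(8.0, 2.0))
print(log_base(1000.0, 10.0))
```

Loading 1 / log2(b) as the multiplier lets a single FYL2X-style step produce the logarithm in base b.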
2.116
FYL2XP1
Computes y ⋅ log2(x+1)
Opcode   Instruction   Clocks          Concurrent Execution
D9 F9    FYL2XP1       313 (171–326)   13
Description: Replaces ST(1) with ST(1) ⋅ log2(ST+1.0) and pops ST.
Operation
ST(1) ← ST(1) ⋅ log2(ST + 1.0);
pop ST
Description
FYL2XP1 computes the base-2 logarithm of (ST + 1.0), multiplies the logarithm by ST(1),
and returns the resulting value to ST(1). It then pops ST. The operand in ST must be in the
range:
– (1 – (√2 / 2)) ≤ ST ≤ √2 –1
FPU Flags Affected
The result determines the C1 setting. If the PE bit of the status word is set, C1 represents
whether the last rounding in the instruction was upward or not. If both the IE and SF bits
of the status word are set (indicating a stack exception), C1 distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0). C0, C2, and C3 are undefined.
Numeric Exceptions
Precision (Inexact Result), Underflow, Denormalized Operand, Invalid Operation, Stack
Fault
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if either EM or TS in CR0 is set.
Note: If the operand in ST is outside the acceptable range, the result of FYL2XP1 is
undefined.
The FYL2XP1 instruction provides improved accuracy over FYL2X when computing the
logarithms of numbers very close to 1. When ε is small, more significant digits can be
retained by providing ε as an argument to FYL2XP1 than by providing 1 + ε as an argument
to FYL2X.
The Am486 processor can abort this instruction to service an interrupt.
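The accuracy point can be illustrated in Python, with math.log1p playing the role of the log(1 + x) kernel. The name fyl2xp1 is hypothetical, doubles stand in for extended precision, and the range check raises an error where the processor's result would simply be undefined.

```python
import math

LOW = -(1.0 - math.sqrt(2.0) / 2.0)   # lower bound of the valid ST range
HIGH = math.sqrt(2.0) - 1.0           # upper bound of the valid ST range

def fyl2xp1(y, x):
    """Model of the FYL2XP1 result: y * log2(x + 1.0)."""
    if not LOW <= x <= HIGH:
        raise ValueError("operand outside the supported range")
    return y * math.log1p(x) / math.log(2.0)

eps = 1e-18
print(fyl2xp1(1.0, eps))       # small but nonzero
print(math.log2(1.0 + eps))    # 1.0 + eps rounds to 1.0 before the logarithm
```

Passing ε directly avoids forming 1 + ε, which would discard ε entirely once ε is smaller than the rounding granularity near 1.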
2.117
HLT
Halt
Opcode   Instruction   Clocks   Description
F4       HLT           4        Halt
Operation
Enter Halt state
Description
The HLT instruction stops instruction execution and places the microprocessor in a HALT
state. An enabled interrupt, an NMI, or a reset resumes execution. If an interrupt (including
NMI) is used to resume execution after a HLT instruction, the saved CS:IP (or CS:EIP)
value points to the instruction following the HLT instruction.
Flags Affected
None
Protected Mode Exceptions
The HLT instruction is a privileged instruction; General Protection Fault (13) indicates the
current privilege level is not 0.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
General Protection Fault (13); the HLT instruction is a privileged instruction.
2.118
IDIV
Signed Divide
Opcode   Instruction      Clocks   Description
F6 /7    IDIV r/m8        19/20    Performs a signed divide of AX by r/m byte (AL = Quo, AH = Rem).
F7 /7    IDIV AX,r/m16    27/28    Performs a signed divide of DX:AX by r/m word (AX = Quo, DX = Rem).
F7 /7    IDIV EAX,r/m32   43/44    Performs a signed divide of EDX:EAX by r/m doubleword (EAX = Quo, EDX = Rem).
Operation
temp ← dividend / divisor;
IF temp does not fit in quotient
THEN Divide By Zero Exception 0;
ELSE
quotient ← temp;
remainder ← dividend MOD (r/m);
FI
Note: Divisions are signed.
Description
IDIV performs a signed division. The dividend, quotient, and remainder are implicitly allocated to fixed registers. The divisor is an explicit r/m operand. The divisor type determines
which registers to use as follows:
Size         Divisor   Quotient   Remainder   Dividend
byte         r/m8      AL         AH          AX
word         r/m16     AX         DX          DX:AX
doubleword   r/m32     EAX        EDX         EDX:EAX
Non-integral quotients are truncated toward 0. The remainder has the same sign as the
dividend and the remainder absolute value is always less than the divisor absolute value.
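The truncation and remainder-sign rules can be sketched in Python. The names idiv and DivideError are hypothetical, operands are passed as signed Python integers rather than register images, and the quotient range of –2^(bits–1) through 2^(bits–1) – 1 is an assumption of this sketch.

```python
class DivideError(Exception):
    """Stands in for the Divide By Zero (0) exception."""

def idiv(dividend, divisor, bits):
    """Return (quotient, remainder) for a divisor of the given width."""
    if divisor == 0:
        raise DivideError("divisor is 0")
    quotient = abs(dividend) // abs(divisor)   # truncate toward 0
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor  # takes the dividend's sign
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= quotient <= high:
        raise DivideError("quotient does not fit in the quotient register")
    return quotient, remainder

print(idiv(-7, 2, 8))
print(idiv(7, -2, 8))
```

Note that Python's own // operator rounds toward negative infinity, which is why the sketch divides the absolute values and restores the sign afterward.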
Flags Affected
OF, SF, ZF, AF, PF, CF are undefined.
Protected Mode Exceptions
Divide By Zero (0) indicates a quotient too large for the designated register (AL, AX, or EAX), or
a divisor of 0. General Protection Fault (13) indicates either that the result is in a non-writable segment or there is an illegal memory-operand effective address in the code or
data segments. Stack Fault (12) indicates an illegal SS segment address. Page Fault (14)
indicates a page fault. If CPL is 3, Alignment Check (17) indicates there is an unaligned memory
reference.
Real Address Mode Exceptions
Divide By Zero (0) indicates a quotient too large for the designated register (AL, AX, or EAX), or
a divisor of 0. General Protection Fault (13) indicates that part of the operand is outside
the effective address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
Divide By Zero (0) indicates a quotient too large for the designated register (AL, AX, or EAX), or
a divisor of 0. General Protection Fault (13) indicates that part of the operand is outside
the effective address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is
3, Alignment Check (17) indicates there is an unaligned memory reference.
2.119
IMUL
Signed Multiply
Opcode     Instruction            Clocks   Description
F6 /5      IMUL r/m8              13–18    AX ← AL ⋅ r/m byte
F7 /5      IMUL r/m16             13–26    DX:AX ← AX ⋅ r/m word
F7 /5      IMUL r/m32             13–42    EDX:EAX ← EAX ⋅ r/m doubleword
0F AF /r   IMUL r16,r/m16         13–26    word reg ← word reg ⋅ r/m word
0F AF /r   IMUL r32,r/m32         13–42    doubleword reg ← doubleword reg ⋅ r/m doubleword
6B /r ib   IMUL r16,r/m16,imm8    13–26    word reg ← r/m16 ⋅ sign-extended immediate byte
6B /r ib   IMUL r32,r/m32,imm8    13–42    doubleword reg ← r/m32 ⋅ sign-extended immediate byte
6B /r ib   IMUL r16,imm8          13–26    word reg ← word reg ⋅ sign-extended immediate byte
6B /r ib   IMUL r32,imm8          13–42    doubleword reg ← doubleword reg ⋅ sign-extended immediate byte
69 /r iw   IMUL r16,r/m16,imm16   13–26    word reg ← r/m16 ⋅ immediate word
69 /r id   IMUL r32,r/m32,imm32   13–42    doubleword reg ← r/m32 ⋅ immediate doubleword
69 /r iw   IMUL r16,imm16         13–26    word reg ← r/m16 ⋅ immediate word
69 /r id   IMUL r32,imm32         13–42    doubleword reg ← r/m32 ⋅ immediate doubleword
Actual clock count depends on the most-significant bit location in the optimizing multiplier. If the multiplier (m)
is 0, the clock count is 9; otherwise clock = max (ceiling(log2 |m|), 3) + 6. If m is a memory operand, add 3.
Operation
result ← multiplicand ⋅ multiplier
Description
IMUL performs signed multiplication. Some forms of the instruction use implicit register
operands.
Flags Affected
SF, ZF, AF, and PF are undefined. IMUL clears CF and OF under certain conditions and sets them
otherwise. If you use the accumulator form (IMUL r/m8, IMUL r/m16, or IMUL r/m32), IMUL clears the flags if
the result equals the sign-extended value of its low-order half (AL, AX, or EAX respectively). For IMUL r16,r/m16; IMUL r32,r/m32; IMUL r16,r/m16,imm16; or IMUL r32,r/m32,
imm32; IMUL clears the flags if the result fits exactly in the destination register.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Note: For the accumulator forms (IMUL r/m8, IMUL r/m16, or IMUL r/m32), the result of
the multiplication is available even if the Overflow Flag is set because the result is twice
the size of the multiplicand and multiplier. This is large enough to handle any possible result.
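The accumulator-form behavior can be sketched in Python. The names to_signed and imul_accumulator are hypothetical; operands are passed as raw register images, and the second return value models CF and OF together (1 when the double-width result is not the sign extension of its low half).

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def imul_accumulator(a, b, bits=8):
    """Return (double-width result image, CF/OF) for accumulator-form IMUL."""
    product = to_signed(a, bits) * to_signed(b, bits)
    low_half = to_signed(product, bits)
    carry_overflow = int(low_half != product)
    return product & ((1 << (2 * bits)) - 1), carry_overflow

result, flag = imul_accumulator(0xFE, 3)   # -2 * 3
print(hex(result), flag)
result, flag = imul_accumulator(100, 2)    # 200 does not fit in a signed byte
print(hex(result), flag)
```

In the second call the full product is still present in the double-width result even though the flags are set, which is the point made in the note above.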
2.120
IN
Inputs Data from Port
Opcode   Instruction   Description
E4 ib    IN AL,imm8    Inputs byte from immediate port into AL.
E5 ib    IN AX,imm8    Inputs word from immediate port into AX.
E5 ib    IN EAX,imm8   Inputs doubleword from immediate port into EAX.
EC       IN AL,DX      Inputs byte from port DX into AL.
ED       IN AX,DX      Inputs word from port DX into AX.
ED       IN EAX,DX     Inputs doubleword from port DX into EAX.
Clocks (all forms): rm = 14, vm = 27; pm = 8 if CPL ≤ IOPL; pm = 28 if CPL > IOPL.
Operation
IF (PE = 1) AND ((VM = 1) OR (CPL > IOPL))
THEN (* Virtual 8086 Mode, or Protected Mode with CPL > IOPL *)
IF NOT I/O-Permission (SRC, width (SRC))
THEN General Protection Fault (13);
FI;
FI;
DEST ← [SRC]; (* Reads from I/O address space *)
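The permission test in the operation above can be sketched in Python. The function name io_permitted and the bytearray representation of the TSS I/O permission bitmap are assumptions of this sketch; ports that fall beyond the end of the supplied bitmap are treated as denied.

```python
def io_permitted(pe, vm, cpl, iopl, bitmap, port, width):
    """Return True if the access proceeds, False if it would fault (13)."""
    if not pe or (not vm and cpl <= iopl):
        return True                      # no bitmap check is needed
    for p in range(port, port + width):  # one permission bit per byte port
        index, bit = divmod(p, 8)
        if index >= len(bitmap):
            return False                 # beyond the bitmap: denied
        if (bitmap[index] >> bit) & 1:
            return False                 # a set bit denies access
    return True

bitmap = bytearray(8192)                  # all ports allowed initially
bitmap[0x60 // 8] |= 1 << (0x60 % 8)      # deny port 60h
print(io_permitted(True, False, 3, 0, bitmap, 0x60, 1))
print(io_permitted(True, False, 3, 0, bitmap, 0x64, 1))
```

A word or doubleword access checks every byte port it covers, so a 2-byte read at port 5Fh is refused when port 60h is denied.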
Description
The IN instruction transfers a data byte, word, or doubleword from the port numbered by
the second operand into the register (AL, AX, or EAX) specified by the first operand. Access
any port from 0 to 65535 by placing the port number in the DX register and using an IN
instruction with the DX register as the second operand. The instruction can be
shortened by encoding an 8-bit port number as an immediate byte; in that case the upper eight
bits of the port address are 0, so only ports 0 to 255 can be addressed.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the current privilege level is larger (has less privilege) than the I/O privilege level and any of the corresponding I/O permission bits in TSS
equals 1.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that at least one of the corresponding I/O permission
bits in TSS equals 1.
2.121
INC
Increments by One
Opcode    Instruction   Clocks   Description
FE /0     INC r/m8      1/3      Increments r/m byte by 1.
FF /0     INC r/m16     1/3      Increments r/m word by 1.
FF /0     INC r/m32     1/3      Increments r/m doubleword by 1.
40 + rw   INC r16       1        Increments word register by 1.
40 + rd   INC r32       1        Increments doubleword register by 1.
Operation
DEST ← DEST + 1
Description
The INC instruction adds 1 to the operand. It does not change CF. To affect CF, use the
ADD instruction with a second operand of 1.
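The difference in carry handling can be shown with a small Python model of 8-bit operands. The names inc8 and add8 are hypothetical, and only CF and ZF are modeled.

```python
def inc8(value, cf):
    """Return (result, CF, ZF) for INC on a byte; CF passes through unchanged."""
    result = (value + 1) & 0xFF
    return result, cf, int(result == 0)

def add8(value, operand):
    """Return (result, CF, ZF) for ADD on bytes; CF reflects the carry out."""
    total = value + operand
    result = total & 0xFF
    return result, int(total > 0xFF), int(result == 0)

print(inc8(0xFF, 0))   # wraps to 0 without touching CF
print(add8(0xFF, 1))   # wraps to 0 and sets CF
```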
Flags Affected
OF, SF, ZF, AF, and PF are set according to the result.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.122
INS/INSB/INSD/INSW
Inputs Data from Port to String
Opcode   Instruction    Description
6C       INS r/m8,DX    Inputs byte from port DX into ES:DI.
6D       INS r/m16,DX   Inputs word from port DX into ES:DI.
6D       INS r/m32,DX   Inputs doubleword from port DX into ES:EDI.
6C       INSB           Inputs byte from port DX into ES:DI.
6D       INSD           Inputs doubleword from port DX into ES:EDI.
6D       INSW           Inputs word from port DX into ES:DI.
Clocks (all forms): rm = 17, vm = 30; pm = 10 if CPL ≤ IOPL; pm = 32 if CPL > IOPL.
Operation
IF (PE = 1) AND ((VM = 1) OR (CPL > IOPL))
THEN (* Virtual 8086 Mode, or Protected Mode with CPL > IOPL *)
IF NOT I/O-Permission (SRC, width(SRC))
THEN General Protection Fault (13);
FI;
FI;
IF OperandSize = 8 (* byte *)
THEN
   ES:DI ← [DX]; (* Reads byte at DX from I/O address space *)
   IF DF = 0 THEN IncDec ← 1 ELSE IncDec ← –1; FI;
FI;
IF OperandSize = 16 (* word *)
THEN
   ES:DI ← [DX]; (* Reads word at DX from I/O address space *)
   IF DF = 0 THEN IncDec ← 2 ELSE IncDec ← –2; FI;
FI;
IF OperandSize = 32 (* doubleword *)
THEN
   ES:EDI ← [DX]; (* Reads doubleword at DX from I/O address space *)
   IF DF = 0 THEN IncDec ← 4 ELSE IncDec ← –4; FI;
FI;
destination-index ← destination-index + IncDec (* DI or EDI; DX is unchanged *)
Description
INS transfers data from the input port numbered by the DX register to the memory byte,
word, or doubleword at ES:(E)DI. The memory operand must be addressable from the ES
register; no segment override is possible. The destination register is the DI register if the
address-size attribute of the instruction is 16 bits, or the EDI register if the address-size
attribute is 32 bits.
The INS instruction does not allow the specification of the port number as an immediate
value. You must address the port through the DX register value. Similarly, the destination
index register determines the destination address. You must load the port number
into the DX register and the correct index into the destination index register before executing
the INS instruction.
After the transfer is made, the DI or EDI register advances automatically. If DF is 0 (a CLD
instruction was executed), the DI or EDI register increments; if DF is 1 (an STD instruction
was executed), the DI or EDI register decrements. The register increments or decrements
by 1 if the input is a byte, by 2 if it is a word, or by 4 if it is a doubleword.
The INSB, INSW, and INSD instructions are synonyms of the byte, word, and doubleword
INS instructions. INS instructions can use the REP prefix for block input of (E)CX bytes,
words, or doublewords. Refer to the REP instruction for details of this operation.
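A block input with the REP prefix can be sketched in Python. The function rep_ins, its parameters, and the read_port callback are hypothetical; the sketch assumes a 16-bit address size (DI wraps at 64 Kbytes), a flat bytearray for memory, and no permission or limit checks.

```python
def rep_ins(memory, es_base, di, count, width, df, read_port):
    """Transfer `count` items of `width` bytes from a port to ES:DI.

    Returns the final (DI, CX) pair.
    """
    step = -width if df else width           # DF = 1 decrements, DF = 0 increments
    while count:
        value = read_port(width)             # one port read per item
        linear = es_base + di
        memory[linear:linear + width] = value.to_bytes(width, "little")
        di = (di + step) & 0xFFFF
        count -= 1
    return di, count

mem = bytearray(64)
data = iter([0x11, 0x22, 0x33])
final_di, final_cx = rep_ins(mem, 16, 0, 3, 1, 0, lambda width: next(data))
print(final_di, final_cx, mem[16:19].hex())
```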
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the current privilege level is numerically greater
than the I/O privilege level and any of the corresponding I/O permission bits in TSS equals
1. General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates either that one of the corresponding I/O permission
bits in TSS equals 1, or that part of the operand lies outside the effective address space:
0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17)
indicates there is an unaligned memory reference.
2.123
INT/INTO
Call to Interrupt Procedure
Opcode   Instruction   Clocks                Description
CC       INT 3         26                    Interrupt 3: trap to debugger
CC       INT 3         44                    Interrupt 3: Protected Mode, same privilege
CC       INT 3         71                    Interrupt 3: Protected Mode, more privilege
CC       INT 3         82                    Interrupt 3: from V86 Mode to PL 0
CC       INT 3         37 + ts*              Interrupt 3: Protected Mode, via task gate
CD ib    INT imm8      30                    Interrupt numbered by immediate byte
CD ib    INT imm8      44                    Interrupt: Protected Mode, same privilege
CD ib    INT imm8      71                    Interrupt: Protected Mode, more privilege
CD ib    INT imm8      86                    Interrupt: from V86 Mode to PL 0
CD ib    INT imm8      37 + ts*              Interrupt: Protected Mode, via task gate
CE       INTO          Pass = 28; Fail = 3   Interrupt 4: if Overflow Flag is 1
CE       INTO          46                    Interrupt 4: Protected Mode, same privilege
CE       INTO          73                    Interrupt 4: Protected Mode, more privilege
CE       INTO          84                    Interrupt 4: from V86 Mode to PL 0
CE       INTO          39 + ts*              Interrupt 4: Protected Mode, via task gate
*ts = 199 for 486 TSS, 180 for 286 TSS, or 177 for VM TSS
Operation
Note: The following operational description applies not only to the above instructions but
also to external interrupts and exceptions.
IF PE = 0
THEN GOTO REAL-ADDRESS-MODE;
ELSE GOTO PROTECTED-MODE;
FI;
REAL-ADDRESS-MODE:
Push (FLAGS);
IF ← 0; (* Clear Interrupt Flag *)
TF ← 0; (* Clear Trap Flag *)
Push(CS);
Push(IP);
(* No error codes are pushed *)
CS ← IDT[interrupt number ⋅ 4].selector;
IP ← IDT[Interrupt number ⋅ 4].offset;
(* Start execution in Real Address Mode *)
PROTECTED-MODE:
Interrupt vector must be within IDT table limits,
else General Protection Fault(vector number ⋅ 8 + 2 + EXT);
Descriptor AR byte must indicate interrupt gate, trap gate, or task gate,
else General Protection Fault(vector number ⋅ 8 + 2 + EXT);
IF software interrupt (* i.e. caused by INT n, INT 3, or INTO *)
THEN
IF gate descriptor DPL < CPL
THEN General Protection Fault(vector number ⋅ 8 + 2 + EXT);
FI;
FI;
Gate must be present,
ELSE Segment Not Present(vector number ⋅ 8 + 2 + EXT);
IF trap gate OR interrupt gate
THEN GOTO TRAP-GATE-OR-INTERRUPT-GATE;
ELSE GOTO TASK-GATE;
FI;
TRAP-GATE-OR-INTERRUPT-GATE:
Examine CS selector and descriptor given in the gate descriptor;
Selector must be non-null, else General Protection Fault(EXT);
Selector must be within its descriptor table limits
ELSE General Protection Fault(selector + EXT);
Descriptor AR byte must indicate code segment
ELSE General Protection Fault(selector + EXT);
Segment must be present, else Segment Not Present (11)(selector + EXT);
IF code segment is non-conforming AND DPL < CPL
THEN GOTO INTERRUPT-TO-INNER-PRIVILEGE;
ELSE
IF code segment is conforming OR code segment DPL = CPL
THEN GOTO INTERRUPT-TO-SAME-PRIVILEGE-LEVEL;
ELSE General Protection Fault(CS selector + EXT);
FI;
FI;
INTERRUPT-TO-INNER-PRIVILEGE:
Check selector and descriptor for new stack in current TSS;
Selector must be non-null, ELSE Invalid TSS(EXT);
Selector index must be within its descriptor table limits
ELSE Invalid TSS(SS selector+ EXT);
Selector’s RPL must equal DPL of code segment,
ELSE Invalid TSS(SS selector+ EXT);
Stack segment DPL must equal DPL of code segment,
ELSE Invalid TSS(SS selector+ EXT);
Descriptor must indicate writable data segment,
ELSE Invalid TSS(SS selector + EXT);
Segment must be present, else Stack Fault(SS selector+ EXT);
IF 32-bit gate
THEN New stack must have room for 20 bytes else Stack Fault
ELSE New stack must have room for 10 bytes else Stack Fault
FI;
Instruction pointer must be within CS segment boundaries
ELSE General Protection Fault;
IF VM = 1 in EFLAGS
THEN GOTO INTERRUPT-FROM-V86-MODE;
FI;
Load new SS and (E)SP value from TSS;
IF 32-bit gate
THEN CS:EIP ← selector:offset from gate;
ELSE CS:IP ← selector:offset from gate;
FI;
Load CS descriptor into invisible portion of CS register;
Load SS descriptor into invisible portion of SS register;
IF 32-bit gate
THEN
Push (long pointer to old stack) (* 3 words padded to 4 *);
Push (EFLAGS);
Push (long pointer to return location) (* 3 words padded to 4 *);
ELSE
Push (long pointer to old stack) (* 2 words *);
Push (FLAGS);
Push (long pointer to return location) (* 2 words *);
FI;
Set CPL to new code segment DPL;
Set RPL of CS to CPL;
IF interrupt gate THEN IF ← 0 (* Interrupt Flag to 0 (disabled) *); FI;
TF ← 0;
NT ← 0;
INTERRUPT-FROM-V86-MODE:
TempEFlags ← EFLAGS;
VM ← 0;
TF ← 0;
IF service through Interrupt Gate THEN IF ← 0; FI;
TempSS ← SS;
TempESP ← ESP;
SS ← TSS.SS0; (* Change to level 0 stack segment *)
ESP ← TSS.ESP0; (* Change to level 0 stack pointer *)
Push(GS); (* padded to two words *)
Push(FS); (* padded to two words *)
Push(DS); (* padded to two words *)
Push(ES); (* padded to two words *)
GS ← 0;
FS ← 0;
DS ← 0;
ES ← 0;
Push(TempSS); (* padded to two words *)
Push(TempESP);
Push(TempEFlags);
Push(CS); (* padded to two words *)
Push(EIP);
CS:EIP ← selector:offset from interrupt gate;
(* Starts execution of new routine in Protected Mode *)
INTERRUPT-TO-SAME-PRIVILEGE-LEVEL:
IF 32-bit gate
THEN Current stack limits must allow pushing 10 bytes, else Stack Fault (12);
ELSE Current stack limits must allow pushing 6 bytes, else Stack Fault (12);
FI;
IF interrupt was caused by exception with error code
THEN Stack limits must allow push of two more bytes, else Stack Fault (12);
FI;
Instruction pointer must be in CS limit, else General Protection Fault (13) (0);
IF 32-bit gate
THEN
Push (EFLAGS);
Push (long pointer to return location); (* 3 words padded to 4 *)
CS: EIP ← selector:offset from gate;
ELSE (* 16-bit gate *)
Push (FLAGS);
Push (long pointer to return location); (* 2 words *)
CS:IP ← selector:offset from gate;
FI;
Load CS descriptor into invisible portion of CS register;
Set the RPL field of CS to CPL;
Push (error code); (* if any *)
IF interrupt gate THEN IF ← 0; FI;
TF ← 0;
NT ← 0;
TASK-GATE:
Examine selector to TSS, given in task gate descriptor;
Must specify global in the local/global bit, else Invalid TSS (10)(TSS selector);
Index must be within GDT limits, else Invalid TSS (10)(TSS selector);
AR byte must specify available TSS (bottom bits 00001),
else Invalid TSS (10)(TSS selector);
TSS must be present, else Segment Not Present (11)(TSS selector);
SWITCH-TASKS with nesting to TSS;
IF interrupt was caused by fault with error code
THEN
Stack limits must allow push of two more bytes, else Stack Fault (12);
Push error code onto stack;
FI;
Instruction pointer must be in CS limit, else General Protection Fault (13)
Description
The INT n instruction generates a call to an interrupt handler via software. The immediate
operand, from 0 to 255, gives the index number into the Interrupt Descriptor Table (IDT) of
the called interrupt routine. In Protected Mode, the IDT consists of an array of 8-byte
descriptors; the invoked descriptor must indicate an interrupt gate, trap gate, or task gate.
In Real Address Mode, the IDT is an array of 4-byte pointers. In Protected and Real Address
Modes, the base linear address of the IDT is defined by the contents of the IDTR.
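The table indexing described above, together with the error-code form used in the operation (vector number ⋅ 8 + 2 + EXT), can be sketched in Python. The function names are hypothetical, and returning None for a vector outside the table limit is a convention of this sketch only.

```python
def idt_entry_address(idt_base, idt_limit, vector, protected_mode):
    """Return the linear address of the IDT entry for `vector`, or None."""
    entry_size = 8 if protected_mode else 4   # descriptor versus far pointer
    offset = vector * entry_size
    if offset + entry_size - 1 > idt_limit:
        return None                           # vector is outside the table limit
    return idt_base + offset

def idt_error_code(vector, ext=0):
    """Error code that identifies an IDT entry: vector * 8 + 2 + EXT."""
    return vector * 8 + 2 + ext

print(hex(idt_entry_address(0, 0x3FF, 0x21, False)))
print(hex(idt_entry_address(0x1000, 0x7FF, 0x21, True)))
print(hex(idt_error_code(0x21)))
```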
The INTO conditional software instruction is identical to the INT n interrupt instruction except
that the interrupt number is implicitly 4, and the interrupt is made only if the Am486 microprocessor Overflow Flag is set.
The first 32 interrupts are reserved for system use. Some of these interrupts are used for
internally generated exceptions.
The INT n instruction generally behaves like a far call except that the contents of the FLAGS
register are pushed onto the stack before the return address. Interrupt procedures return
via the IRET instruction, which pops the flags and return address from the stack.
In Real Address Mode, the INT n instruction pushes the flags, the CS register, and the
return IP onto the stack, in that order, then jumps to the long pointer indexed by the interrupt
number.
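The Real Address Mode sequence can be modeled in Python as shown below. The dictionary-based register file, the bytearray memory, and the function names are hypothetical; the sketch assumes the table starts at linear address 0 and that the ip field already holds the return address.

```python
IF_BIT = 1 << 9   # Interrupt Flag position in FLAGS
TF_BIT = 1 << 8   # Trap Flag position in FLAGS

def push16(cpu, mem, value):
    """Push one word at SS:SP."""
    cpu["sp"] = (cpu["sp"] - 2) & 0xFFFF
    addr = cpu["ss"] * 16 + cpu["sp"]
    mem[addr:addr + 2] = value.to_bytes(2, "little")

def real_mode_int(cpu, mem, vector):
    """Model INT n in Real Address Mode."""
    push16(cpu, mem, cpu["flags"])            # flags are pushed first
    cpu["flags"] &= ~(IF_BIT | TF_BIT)        # then IF and TF are cleared
    push16(cpu, mem, cpu["cs"])
    push16(cpu, mem, cpu["ip"])
    entry = vector * 4                        # 4-byte pointer: offset, then segment
    cpu["ip"] = int.from_bytes(mem[entry:entry + 2], "little")
    cpu["cs"] = int.from_bytes(mem[entry + 2:entry + 4], "little")

mem = bytearray(0x2000)
mem[0x84:0x88] = bytes([0x34, 0x12, 0x50, 0x00])   # vector 21h -> 0050:1234
cpu = {"cs": 0x1000, "ip": 0x0102, "flags": 0x0302, "ss": 0x0100, "sp": 0x0100}
real_mode_int(cpu, mem, 0x21)
print(hex(cpu["cs"]), hex(cpu["ip"]), hex(cpu["sp"]))
```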
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13), Segment Not Present (11), Stack Fault (12), and Invalid TSS
(10) can occur as indicated under ‘Operation’ above.
Real Address Mode Exceptions
None. However, if the SP or ESP register is 1, 3, or 5 when INT or INTO starts executing,
the processor shuts down due to insufficient stack space.
Virtual 8086 Mode Exceptions
General Protection Fault (13) occurs if IOPL is less than 3, for the INT n instruction only,
as part of the mode emulation; Interrupt 3 (0CCh) generates a breakpoint exception; the
INTO instruction generates an overflow exception if OF is set.
2.124
INVD
Invalidates Cache
Opcode   Instruction   Clocks   Description
0F 08    INVD          4        Invalidates entire cache.
Operation
FLUSH INTERNAL CACHE
SIGNAL EXTERNAL CACHE TO FLUSH
Description
The processor flushes the internal cache and issues a special-function bus cycle that
indicates that the external cache should be flushed. Any data held in write-back external
cache is discarded.
Flags Affected
None
Protected Mode Exceptions
The INVD instruction is a privileged instruction. General Protection Fault (13) indicates the
current privilege level is not 0.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
General Protection Fault (13); the INVD instruction is a privileged instruction.
Note: This instruction is implementation-dependent; its function may be implemented
differently on future AMD microprocessors. It is the responsibility of the designer to ensure
that the hardware responds to the external cache flush indication.
This instruction is not supported by Am386® microprocessors.
2.125
INVLPG
Invalidates TLB Entry
Opcode     Instruction   Clocks       Description
0F 01 /7   INVLPG m      12 for hit   Invalidates TLB entry.
Operation
INVALIDATE TLB ENTRY
Description
INVLPG invalidates a single entry in the TLB (the cache used for page table entries). If the
TLB contains a valid entry that maps the address of the memory operand, that TLB entry
is marked invalid.
Flags Affected
None
Protected Mode Exceptions
INVLPG is a privileged instruction; General Protection Fault (13) indicates the current
privilege level is not 0. An invalid-opcode exception is generated when used with a register
operand.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
An invalid-opcode exception is generated when used with a register operand. General
Protection Fault (13); the INVLPG instruction is a privileged instruction.
Note: This instruction is not supported on Am386 microprocessors.
2.126 IRET/IRETD Interrupt Return
Opcode   Instruction   Clocks     Description
CF       IRET          15         Interrupt return (far return and pop FLAGS)
CF       IRET          36         Interrupt return to lesser privilege
CF       IRET          32 + ts*   Interrupt return, different task (NT = 1)
CF       IRETD         15         Interrupt return (far return and pop FLAGS)
CF       IRETD         36         Interrupt return to lesser privilege
CF       IRETD         15         Interrupt return to V86 Mode
CF       IRETD         32 + ts*   Interrupt return, different task (NT = 1)
*ts = 199 for 486 TSS, 180 for 286 TSS, or 177 for VM TSS
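As a worked example of the footnote, the following Python sketch adds the ts term to the 32-clock base of the task-switch form. The dictionary and function names are hypothetical; the numbers are the ones given in the table and footnote above.

```python
# ts values from the footnote above
TS_CLOCKS = {"486 TSS": 199, "286 TSS": 180, "VM TSS": 177}


def iret_task_switch_clocks(tss_kind):
    """Clock count for IRET/IRETD to a different task (32 + ts)."""
    return 32 + TS_CLOCKS[tss_kind]


totals = {kind: iret_task_switch_clocks(kind) for kind in TS_CLOCKS}
```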
Operation
IF PE = 0
THEN (* Real Address Mode *)
IF OperandSize = 32 (* Instruction = IRETD *)
THEN EIP ← Pop();
ELSE (* Instruction = IRET *)
IP ← Pop();
FI;
CS ← Pop();
IF OperandSize = 32 (* Instruction = IRETD *)
THEN Pop(); EFLAGS ← Pop();
ELSE (* Instruction = IRET *)
FLAGS ← Pop();
FI;
ELSE (* Protected Mode *)
IF VM = 1
THEN General Protection Fault (13);
ELSE
IF NT = 1
THEN GOTO TASK-RETURN;
ELSE
IF VM = 1 in FLAGS image on stack
THEN GOTO STACK-RETURN-TO-V86;
ELSE GOTO STACK-RETURN;
FI;
FI;
FI;
FI;
STACK-RETURN-TO-V86: (* Interrupted procedure was in Virtual 8086 Mode *)
IF top 36 bytes of stack not within limits
THEN Stack Fault (12);
FI;
IF instruction pointer not within code segment limit
THEN General Protection Fault (13);
FI;
EFLAGS ← SS:[ESP + 8]; (* Sets VM in interrupted routine *)
EIP ← Pop();
CS ← Pop(); (* CS behaves as in 8086, due to VM = 1 *)
throwaway ← Pop(); (* pop away EFLAGS already read *)
TempESP ← Pop();
TempSS ← Pop();
ES ← Pop(); (* pop 2 words; throw away high-order word *)
DS ← Pop(); (* pop 2 words; throw away high-order word *)
FS ← Pop(); (* pop 2 words; throw away high-order word *)
GS ← Pop(); (* pop 2 words; throw away high-order word *)
SS:ESP ← TempSS:TempESP;
(* Resume execution in Virtual 8086 Mode *)
TASK-RETURN:
Examine Back Link Selector in TSS addressed by the current task
register:
Must specify global in the local/global bit,
ELSE Invalid TSS(new TSS selector);
Index must be within GDT limits, else Invalid TSS(new TSS selector);
AR byte must specify TSS, else Invalid TSS(new TSS selector);
New TSS must be busy, else Invalid TSS(new TSS selector);
TSS must be present, else Segment Not Present(new TSS selector);
SWITCH-TASKS without nesting to TSS specified by back link selector;
Mark the task just abandoned as NOT BUSY;
Instruction pointer must be within code segment limit
ELSE General Protection Fault;
STACK-RETURN:
IF OperandSize = 32
THEN Third word on stack must be within stack limits, else Stack Fault;
ELSE Second word on stack must be within stack limits, else Stack Fault;
FI;
Return CS selector RPL must be ≥ CPL,
ELSE General Protection Fault(Return selector);
IF return selector RPL = CPL
THEN GOTO RETURN-SAME-LEVEL;
ELSE GOTO RETURN-OUTER-LEVEL;
FI;
RETURN-SAME-LEVEL:
IF OperandSize = 32
THEN
Top 12 bytes on stack must be within limits, else Stack Fault;
Return CS selector (at eSP + 4) must be non-null,
ELSE General Protection Fault;
ELSE
Top 6 bytes on stack must be within limits, else Stack Fault;
Return CS selector (at eSP + 2) must be non-null,
ELSE General Protection Fault; FI;
Selector index must be within its descriptor table limits,
ELSE General Protection Fault(Return selector);
AR byte must indicate code segment,
ELSE General Protection Fault(Return selector);
IF non-conforming THEN code segment DPL must = CPL,
ELSE General Protection Fault(Return selector); FI;
IF conforming
THEN code segment DPL must be ≤ CPL,
ELSE General Protection Fault(Return selector); FI;
Segment must be present, else Segment Not Present (11)(Return selector);
Instruction pointer must be within code segment boundaries,
ELSE General Protection Fault;
IF OperandSize = 32
THEN
Load CS:EIP from stack;
Load CS-register with new code segment descriptor;
Load EFLAGS with third doubleword from stack;
Increment eSP by 12;
ELSE
Load CS:IP from stack;
Load CS-register with new code segment descriptor;
Load FLAGS with third word on stack;
Increment eSP by 6; FI;
RETURN-OUTER-LEVEL:
IF OperandSize = 32
THEN Top 20 bytes on stack must be within limits, else Stack Fault;
ELSE Top 10 bytes on stack must be within limits, else Stack Fault;
FI;
Examine return CS selector and associated descriptor:
Selector must be non-null, else General Protection Fault;
Selector index must be within its descriptor table limits,
ELSE General Protection Fault(Return selector);
AR byte must indicate code segment,
ELSE General Protection Fault (Return selector);
IF non-conforming
THEN code segment DPL must = CS selector RPL;
ELSE General Protection Fault(Return selector); FI;
IF conforming
THEN code segment DPL must be ≤ CS selector RPL;
ELSE General Protection Fault(Return selector); FI;
Segment must be present,
ELSE Segment Not Present(Return selector);
Examine return SS selector and associated descriptor:
Selector must be non-null, ELSE General Protection Fault;
Selector index must be within its descriptor table limits
ELSE General Protection Fault(SS selector);
Selector RPL must equal the RPL of the return CS selector
ELSE General Protection Fault(SS selector);
AR byte must indicate a writable data segment,
ELSE General Protection Fault(SS selector);
Stack segment DPL must equal the RPL of the return CS selector
ELSE General Protection Fault(SS selector);
SS must be present, else Segment Not Present(SS selector);
Instruction pointer must be within code segment limit
ELSE General Protection Fault;
IF OperandSize = 32
THEN
Load CS:EIP from stack;
Load EFLAGS with values at (eSP + 8);
ELSE
Load CS:IP from stack;
Load FLAGS with values at (eSP + 4) ;
FI;
Load SS:eSP from stack;
Set CPL to the RPL of the return CS selector;
Load the CS register with the CS descriptor;
Load the SS register with the SS descriptor;
FOR each of ES, FS, GS, and DS
DO;
IF the current value of the register is not valid for the outer level
THEN zero the register and clear the valid flag;
FI;
To be valid, the register setting must satisfy the following properties:
Selector index must be within descriptor table limits;
AR byte must indicate data or readable code segment;
IF segment is data or non-conforming code,
THEN DPL must be ≥ CPL, or DPL must be ≥ RPL;
OD;
Description
In Real Address Mode, the IRET instruction pops the instruction pointer, the CS register,
and the FLAGS register from the stack and resumes the interrupted routine.
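The Real Address Mode pop order for the 16-bit IRET form can be sketched in Python as follows. This is a simplified model, not a description of the hardware: the stack is represented as a list of 16-bit words with the top of stack first, and the function name and return layout are hypothetical.

```python
def iret_real_mode(stack_words):
    """Model the 16-bit IRET: pop IP, then CS, then FLAGS.

    stack_words holds 16-bit words, top of stack first (assumed layout).
    """
    ip, cs, flags = stack_words[0:3]
    # Three 16-bit pops advance the stack pointer by 6 bytes.
    return {"IP": ip, "CS": cs, "FLAGS": flags, "SP increment": 6}


frame = [0x0100, 0x2000, 0x0202, 0xBEEF]  # IP, CS, FLAGS, then older data
state = iret_real_mode(frame)
```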
In Protected Mode, the action of the IRET instruction depends on the setting of the Nested
Task flag (NT) bit in the EFLAGS register. When the new flag image is popped from the
stack, the IOPL bits in the EFLAGS register are changed only when CPL equals 0.
If the NT flag is cleared, the IRET instruction returns from an interrupt procedure without a
task switch. The code returned to must be equally or less privileged than the interrupt routine
(as indicated by the RPL bits of the CS selector popped from the stack). If the destination
code is less privileged, the IRET instruction also pops the stack pointer and SS from the
stack.
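The privilege comparison described above can be illustrated with a short Python sketch. It assumes the standard selector layout (RPL in bits 0 to 1, table indicator in bit 2, index in bits 3 to 15); the function names are hypothetical, and only the RPL-versus-CPL decision from the STACK-RETURN step is modeled, not the descriptor checks.

```python
def selector_fields(selector):
    """Split a 16-bit selector into index, table, and RPL."""
    return {
        "index": selector >> 3,
        "table": "LDT" if selector & 0x4 else "GDT",
        "rpl": selector & 0x3,
    }


def iret_stack_return_kind(return_cs, cpl):
    """Choose the return path from the RPL of the CS selector on the stack."""
    rpl = selector_fields(return_cs)["rpl"]
    if rpl < cpl:
        # Returning to more privileged code is not allowed.
        raise ValueError("General Protection Fault (13)")
    return "RETURN-SAME-LEVEL" if rpl == cpl else "RETURN-OUTER-LEVEL"
```

For example, with CPL = 0, a return CS selector of 0008h stays at the same level, while 001Bh (RPL = 3) takes the outer-level path and therefore also pops the stack pointer and SS.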
If the NT flag is set, the IRET instruction reverses the operation of a CALL or INT that
caused a task switch. The updated state of the task executing the IRET instruction is saved
in its task state segment. If the task is re-entered later, the code that follows the IRET
instruction is executed.
Flags Affected
All flags are affected; the FLAGS or EFLAGS register is popped from stack.
Protected Mode Exceptions
General Protection Fault (13), Segment Not Present (11), or Stack Fault (12), occurs as
indicated under ‘Operation’ above.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand being popped lies beyond
address 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) occurs if the I/O privilege level is less than 3, to permit
emulation.
2.127 JA Jumps If Above (see also JNBE)
Opcode        Instruction   Clocks                Description
77 cb         JA rel8       3 (true), 1 (false)   Jumps short if above (CF = 0 and ZF = 0).
0F 87 cw/cd   JA rel16/32   3 (true), 1 (false)   Jumps near if above (CF = 0 and ZF = 0).
Operation
IF CF = 0 AND ZF = 0
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JA tests the flag set by a previous instruction. ‘Above’ indicates an unsigned integer comparison. If the given condition is true, a jump is made to the location provided as the operand.
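To make the unsigned interpretation concrete, the Python sketch below models the CF and ZF results of a preceding compare (a subtraction that sets CF on borrow) and evaluates the unsigned jump conditions listed in this chapter. The function names are hypothetical and the model covers only these two flags.

```python
def cmp_flags_unsigned(a, b, bits=32):
    """Return (CF, ZF) as a compare of a with b would leave them."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    cf = 1 if a < b else 0   # borrow out of a - b
    zf = 1 if a == b else 0
    return cf, zf


def ja(cf, zf):
    return cf == 0 and zf == 0


def jae(cf):
    return cf == 0


def jb(cf):
    return cf == 1


def jbe(cf, zf):
    return cf == 1 or zf == 1
```

With these definitions, comparing 0FFFFFFFFh with 1 satisfies the JA condition, because the operands are treated as unsigned values rather than as -1 and 1.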
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.128 JAE Jumps If Above or Equal (see also JNB and JNC)
Opcode        Instruction    Clocks                Description
73 cb         JAE rel8       3 (true), 1 (false)   Jumps short if above or equal (CF = 0).
0F 83 cw/cd   JAE rel16/32   3 (true), 1 (false)   Jumps near if above or equal (CF = 0).
Operation
IF CF = 0
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JAE tests the flag set by a previous instruction. ‘Above’ indicates an unsigned integer
comparison. If the given condition is true, a jump is made to the location provided as the
operand.
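The target computation in the Operation listing above, which is shared by all conditional jumps in this chapter, can be sketched in Python. This is a minimal model under stated assumptions: `eip_next` stands for the address of the instruction following the jump, and the function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def jcc_target(eip_next, rel, rel_bits, operand_size, taken):
    """Return the new EIP for a conditional jump."""
    if not taken:
        return eip_next
    eip = (eip_next + sign_extend(rel, rel_bits)) & 0xFFFFFFFF
    if operand_size == 16:
        eip &= 0x0000FFFF  # clear the upper 16 bits
    return eip
```

For instance, a taken jump with rel8 = FEh from 1000h lands on 0FFEh, and with a 16-bit operand size a displacement of FCh from 0001h wraps to 0FFFDh within the 64-Kbyte offset range.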
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.129 JB Jumps If Below (see also JC and JNAE)
Opcode        Instruction   Clocks                Description
72 cb         JB rel8       3 (true), 1 (false)   Jumps short if below (CF = 1).
0F 82 cw/cd   JB rel16/32   3 (true), 1 (false)   Jumps near if below (CF = 1).
Operation
IF CF = 1
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JB tests the flag set by a previous instruction. ‘Below’ indicates an unsigned integer comparison. If the given condition is true, a jump is made to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.130 JBE Jumps If Below or Equal (see also JNA)
Opcode        Instruction    Clocks                Description
76 cb         JBE rel8       3 (true), 1 (false)   Jumps short if below or equal (CF = 1 or ZF = 1).
0F 86 cw/cd   JBE rel16/32   3 (true), 1 (false)   Jumps near if below or equal (CF = 1 or ZF = 1).
Operation
IF CF = 1 OR ZF = 1
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JBE tests the flag set by a previous instruction. ‘Below’ indicates an unsigned integer
comparison. If the given condition is true, a jump is made to the location provided as the
operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.131 JC Jumps If Carry (see also JB and JNAE)
Opcode        Instruction   Clocks                Description
72 cb         JC rel8       3 (true), 1 (false)   Jumps short if carry (CF = 1).
0F 82 cw/cd   JC rel16/32   3 (true), 1 (false)   Jumps near if carry (CF = 1).
Operation
IF CF = 1
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JC tests the flag set by a previous instruction. If the given condition is true, a jump is made
to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.132 JCXZ Jumps Short If CX Register is 0 (see also JECXZ)
Opcode   Instruction   Clocks                Description
E3 cb    JCXZ rel8     8 (true), 5 (false)   Jumps short if CX register is 0.
Operation
IF CX = 0
THEN
EIP ← EIP + SignExtend(rel8)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JCXZ tests the contents of the CX register rather than a flag. If CX is 0, a jump is
made to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: JCXZ takes longer to execute than a two-instruction sequence that compares the
count register to zero and jumps if the count is zero.
The instruction converts all branches into 16-byte code fetches regardless of jump address
or cacheability.
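The timing remark in the note can be checked with simple arithmetic, sketched here in Python. The JCXZ and JE clock counts come from the tables in this chapter; the 1-clock cost of the register compare is an assumption made for this example, since that figure is not given in this section.

```python
JCXZ_CLOCKS = {"taken": 8, "not taken": 5}  # from the JCXZ table
JE_CLOCKS = {"taken": 3, "not taken": 1}    # from the JE table
COMPARE_CLOCKS = 1  # ASSUMPTION: cost of comparing the count register to zero


def two_instruction_clocks(outcome):
    """Clocks for a compare followed by a conditional jump."""
    return COMPARE_CLOCKS + JE_CLOCKS[outcome]


savings = {
    outcome: JCXZ_CLOCKS[outcome] - two_instruction_clocks(outcome)
    for outcome in JCXZ_CLOCKS
}
```

Under that assumption the two-instruction sequence is faster in both the taken and the not-taken case, at the cost of modifying the flags, which JCXZ leaves unchanged.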
2.133 JE Jumps If Equal (see also JZ)
Opcode        Instruction   Clocks                Description
74 cb         JE rel8       3 (true), 1 (false)   Jumps short if equal (ZF = 1).
0F 84 cw/cd   JE rel16/32   3 (true), 1 (false)   Jumps near if equal (ZF = 1).
Operation
IF ZF = 1
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JE tests the flag set by a previous instruction. If the given condition is true, a jump is made
to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.134 JECXZ Jumps Short If ECX Register is 0 (see also JCXZ)
Opcode   Instruction   Clocks                Description
E3 cb    JECXZ rel8    8 (true), 5 (false)   Jumps short if ECX register is 0.
Operation
IF ECX = 0
THEN
EIP ← EIP + SignExtend(rel8)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JECXZ tests the contents of the ECX register rather than a flag. If ECX is 0, a jump is
made to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: JECXZ takes longer to execute than a two-instruction sequence that compares the
count register to zero and jumps if the count is zero.
The instruction converts all branches into 16-byte code fetches regardless of jump address
or cacheability.
2.135 JG Jumps If Greater (see also JNLE)
Opcode        Instruction   Clocks                Description
7F cb         JG rel8       3 (true), 1 (false)   Jumps short if greater (ZF = 0 and SF = OF).
0F 8F cw/cd   JG rel16/32   3 (true), 1 (false)   Jumps near if greater (ZF = 0 and SF = OF).
Operation
IF ZF = 0 AND SF = OF
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JG tests the flag set by a previous instruction. ‘Greater’ indicates a signed integer comparison. If the given condition is true, a jump is made to the location provided as the operand.
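The signed interpretation can be illustrated with a Python sketch that models the ZF, SF, and OF results of a preceding compare and then evaluates the signed conditions. The helper names are hypothetical, and the overflow rule used is the usual two's-complement one for subtraction (operands of different sign and a result whose sign differs from the first operand).

```python
def cmp_flags_signed(a, b, bits=32):
    """Return (ZF, SF, OF) as a compare of a with b would leave them."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    zf = int(result == 0)
    sf = int(bool(result & sign))
    # Signed overflow: a and b differ in sign, and the result's sign differs from a's.
    of = int(bool((a ^ b) & (a ^ result) & sign))
    return zf, sf, of


def jg(zf, sf, of):
    return zf == 0 and sf == of


def jge(sf, of):
    return sf == of


def jl(sf, of):
    return sf != of
```

Comparing -1 with 1 satisfies the JL condition here, whereas the same bit patterns compared as unsigned values would satisfy JA instead.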
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.136 JGE Jumps If Greater or Equal (see also JNL)
Opcode        Instruction    Clocks                Description
7D cb         JGE rel8       3 (true), 1 (false)   Jumps short if greater or equal (SF = OF).
0F 8D cw/cd   JGE rel16/32   3 (true), 1 (false)   Jumps near if greater or equal (SF = OF).
Operation
IF SF = OF
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JGE tests the flag set by a previous instruction. ‘Greater’ indicates a signed integer comparison. If the given condition is true, a jump is made to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.137 JL Jumps If Less (see also JNGE)
Opcode        Instruction   Clocks                Description
7C cb         JL rel8       3 (true), 1 (false)   Jumps short if less (SF ≠ OF).
0F 8C cw/cd   JL rel16/32   3 (true), 1 (false)   Jumps near if less (SF ≠ OF).
Operation
IF SF ≠ OF
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JL tests the flag set by a previous instruction. ‘Less’ indicates a signed integer comparison.
If the given condition is true, a jump is made to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.138 JLE Jumps If Less or Equal (see also JNG)
Opcode        Instruction    Clocks                Description
7E cb         JLE rel8       3 (true), 1 (false)   Jumps short if less or equal (ZF = 1 or SF ≠ OF).
0F 8E cw/cd   JLE rel16/32   3 (true), 1 (false)   Jumps near if less or equal (ZF = 1 or SF ≠ OF).
Operation
IF ZF = 1 OR SF ≠ OF
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JLE tests the flag set by a previous instruction. ‘Less’ indicates a signed integer comparison.
If the given condition is true, a jump is made to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.139 JMP Jump
Opcode   Instruction    Clocks        Description
EB cb    JMP rel8       3             Jumps short.
E9 cw    JMP rel16      3             Jumps near, displacement relative to next instruction.
FF /4    JMP r/m16      5/5           Jumps near indirect.
EA cd    JMP ptr16:16   17, pm = 19   Jumps far to 4-byte immediate address.
EA cd    JMP ptr16:16   32            Jumps to call gate, same privilege.
EA cd    JMP ptr16:16   42 + ts*      Jumps via task state segment.
EA cd    JMP ptr16:16   43 + ts*      Jumps via task gate.
FF /5    JMP m16:16     13, pm = 18   Jumps r/m16:16 indirect and far.
FF /5    JMP m16:16     31            Jumps to call gate, same privilege.
FF /5    JMP m16:16     41 + ts*      Jumps via task state segment.
FF /5    JMP m16:16     42 + ts*      Jumps via task gate.
E9 cd    JMP rel32      3             Jumps near with displacement relative to next instruction.
FF /4    JMP r/m32      5/5           Jumps near, indirect.
EA cp    JMP ptr16:32   13, pm = 18   Jumps far to 6-byte immediate address.
EA cp    JMP ptr16:32   31            Jumps to call gate, same privilege.
EA cp    JMP ptr16:32   42 + ts*      Jumps via task state segment.
EA cp    JMP ptr16:32   43 + ts*      Jumps via task gate.
FF /5    JMP m16:32     13, pm = 18   Jumps far to address in r/m doubleword.
FF /5    JMP m16:32     31            Jumps to call gate, same privilege.
FF /5    JMP m16:32     41 + ts*      Jumps via task state segment.
FF /5    JMP m16:32     42 + ts*      Jumps via task gate.
*ts = 199 for 486 TSS, 180 for 286 TSS, or 177 for VM TSS
Operation
IF instruction = relative JMP
(* i.e. operand is rel8, rel16, or rel32 *)
THEN
EIP ← EIP + rel8/16/32;
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI;
IF instruction = near indirect JMP
(* i.e. operand is r/m16 or r/m32 *)
THEN
IF OperandSize = 16
THEN
EIP ← [r/m16] AND 0000FFFFh;
ELSE (* OperandSize = 32 *)
EIP ← [r/m32];
FI;
FI;
IF (PE = 0 OR (PE = 1 AND VM = 1)) (* Real Mode or Virtual 8086 Mode *)
AND instruction = far JMP
(* i.e., operand type is m16:16, m16:32, ptr16:16, ptr16:32 *)
THEN GOTO REAL-OR-V86-MODE;
IF operand type = m16:16 or m16:32
THEN (* indirect *)
IF OperandSize = 16
THEN
CS:IP ← [m16:16];
EIP ← EIP AND 0000FFFFh; (* clear upper 16 bits *)
ELSE (* OperandSize = 32 *)
CS:EIP ← [m16:32];
FI;
FI;
IF operand type = ptr16:16 or ptr16:32
THEN
IF OperandSize = 16
THEN
CS:IP ← ptr16:16;
EIP ← EIP AND 0000FFFFh; (* clear upper 16 bits *)
ELSE (* OperandSize = 32 *)
CS:EIP ← ptr16:32;
FI;
FI;
FI;
IF (PE = 1 AND VM = 0) (* Protected Mode, not Virtual 8086 Mode *)
AND instruction = far JMP
THEN
IF operand type = m16:16 or m16:32
THEN (* indirect *)
check access of EA doubleword;
General Protection Fault or Stack Fault IF limit violation;
FI;
Destination selector is not null ELSE General Protection Fault;
Destination selector index is within its descriptor table limits ELSE
General Protection Fault(selector);
Depending on AR byte of destination descriptor:
GOTO CONFORMING-CODE-SEGMENT;
GOTO NONCONFORMING-CODE-SEGMENT;
GOTO CALL-GATE;
GOTO TASK-GATE;
GOTO TASK-STATE-SEGMENT;
ELSE General Protection Fault(selector); (* illegal AR in descriptor *)
FI;
CONFORMING-CODE-SEGMENT:
Descriptor DPL must be ≤ CPL ELSE General Protection Fault(selector);
Segment must be present ELSE Segment Not Present(selector);
Instruction pointer must be within code-segment limit
ELSE General Protection Fault;
IF OperandSize = 32
THEN Load CS:EIP from destination pointer;
ELSE Load CS:IP from destination pointer;
FI;
Load CS register with new segment descriptor;
NONCONFORMING-CODE-SEGMENT:
RPL of destination selector must be ≤ CPL
ELSE General Protection Fault (selector);
Descriptor DPL must = CPL ELSE General Protection Fault(selector);
Segment must be present ELSE Segment Not Present (11)(selector);
Instruction pointer must be within code-segment limit
ELSE General Protection Fault;
IF OperandSize = 32 THEN Load CS:EIP from destination pointer;
ELSE Load CS:IP from destination pointer;
FI;
Load CS register with new segment descriptor;
Set RPL field of CS register to CPL;
CALL-GATE:
Descriptor DPL must be ≥ CPL
ELSE General Protection Fault(gate selector);
Descriptor DPL must be ≥ gate selector RPL
ELSE General Protection Fault(gate selector);
Gate must be present ELSE Segment Not Present(gate selector);
Examine selector to code segment given in call gate descriptor:
Selector must not be null ELSE General Protection Fault;
Selector must be within its descriptor table limit
ELSE General Protection Fault(CS selector);
Descriptor AR byte must indicate code segment
ELSE General Protection Fault(CS selector);
IF non-conforming
THEN code-segment descriptor DPL must = CPL
ELSE General Protection Fault(CS selector); FI;
IF conforming
THEN code-segment descriptor DPL must be ≤ CPL
ELSE General Protection Fault(CS selector); FI;
Code segment must be present ELSE Segment Not Present(CS selector);
Instruction pointer must be within code-segment limit
ELSE General Protection Fault;
IF OperandSize = 32
THEN Load CS:EIP from call gate;
ELSE Load CS:IP from call gate; FI;
Load CS register with new code-segment descriptor;
Set RPL of CS to CPL;
TASK-GATE:
Gate descriptor DPL must be ≥ CPL
ELSE General Protection Fault(gate selector);
Gate descriptor DPL must be ≥ gate selector RPL
ELSE General Protection Fault(gate selector);
Task Gate must be present ELSE Segment Not Present(gate selector);
Examine selector to TSS, given in Task Gate descriptor:
Must specify global in the local/global bit
ELSE General Protection Fault(TSS selector);
Index must be within GDT limits
ELSE General Protection Fault(TSS selector);
Descriptor AR byte must specify available TSS (bottom bits 00001);
ELSE General Protection Fault(TSS selector);
Task State Segment must be present
ELSE Segment Not Present(TSS selector);
SWITCH-TASKS (without nesting) to TSS;
Instruction pointer must be within code-segment limit
ELSE General Protection Fault;
TASK-STATE-SEGMENT:
TSS DPL must be ≥ CPL ELSE General Protection Fault(TSS selector);
TSS DPL must be ≥ TSS selector RPL
ELSE General Protection Fault(TSS selector);
Descriptor AR byte must specify available TSS (bottom bits 00001)
ELSE General Protection Fault(TSS selector);
Task State Segment must be present
ELSE Segment Not Present(TSS selector);
SWITCH-TASKS (without nesting) to TSS;
Instruction pointer must be within code-segment limit
ELSE General Protection Fault;
Description
JMP transfers control to a different point in the instruction stream without recording return
information. The instruction has several different forms, as follows:
■ Near Indirect Jumps: The JMP r/m16 and JMP r/m32 forms specify a register or memory
location from which the absolute offset of the destination is fetched. The offset is 32 bits for
r/m32, or 16 bits for r/m16.
■ Near Direct Jumps: To determine the destination, the JMP rel16 and JMP rel32 forms
add an offset to the address of the instruction following the JMP. The rel16 form is used
for 16-bit operand-size attributes (segment-size attribute 16 only); rel32 is used for
32-bit operand-size attributes (segment-size attribute 32 only). The result is stored in the
32-bit EIP register. With rel16, the upper 16 bits of the EIP register are cleared, which
results in an offset that does not exceed 16 bits.
■ Far Jumps: The JMP ptr16:16 and ptr16:32 forms use a 4-byte or 6-byte operand as a
long pointer to the destination. The JMP m16:16 and m16:32 forms fetch the long pointer
from the specified memory location (indirection). In Real or Virtual 8086 Mode, the long
pointer provides 16 bits for the CS register and 16 or 32 bits for the EIP register
(depending on operand-size). In Protected Mode, both forms consult the Access Rights
(AR) byte in the descriptor indexed by the selector part of the long pointer. Depending
on the value of the AR byte, the jump performs one of the following control transfer types:
— A jump to a code segment at the same privilege level
— A task switch
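As an illustration of the long-pointer operand used by far jumps, the Python sketch below splits the 4-byte or 6-byte operand into its selector and offset parts. It assumes the usual little-endian encoding with the offset stored first and the 16-bit selector last; the function name is hypothetical.

```python
def decode_far_pointer(operand_bytes):
    """Split a ptr16:16 (4-byte) or ptr16:32 (6-byte) operand.

    Returns (selector, offset). The offset comes first in memory,
    followed by the 16-bit selector (assumed little-endian layout).
    """
    if len(operand_bytes) == 4:
        offset_size = 2
    elif len(operand_bytes) == 6:
        offset_size = 4
    else:
        raise ValueError("far pointer operand must be 4 or 6 bytes")
    offset = int.from_bytes(operand_bytes[:offset_size], "little")
    selector = int.from_bytes(operand_bytes[offset_size:], "little")
    return selector, offset
```

For example, the operand bytes F0 FF 00 F0 decode to selector F000h and offset FFF0h, and 78 56 34 12 08 00 decode to selector 0008h and offset 12345678h.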
Flags Affected
All if a task switch occurs; none if no task switch occurs.
Protected Mode Exceptions
Near direct jumps: General Protection Fault (13) indicates the procedure is outside the
code segment limits. If CPL is 3, Alignment Check (17) indicates there is an unaligned
memory reference.
Near indirect jumps: General Protection Fault (13) indicates either that the result is in a
non-writable segment or there is an illegal memory-operand effective address in the code
or data segments. Stack Fault (12) indicates an illegal SS segment address. General
Protection Fault (13) indicates the indirect offset is beyond the code segment limits. Page
Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates there is an
unaligned memory reference.
Far jumps: General Protection Fault (13), Segment Not Present (11), Stack Fault (12), and
Invalid TSS (10) occur as indicated under ‘Operation’ above.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand is outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Note: All branches are converted into 16-byte code fetches regardless of jump address or
cacheability.
2.140 JNA Jumps If Not Above (see also JBE)
Opcode        Instruction    Clocks                Description
76 cb         JNA rel8       3 (true), 1 (false)   Jumps short if not above (CF = 1 or ZF = 1).
0F 86 cw/cd   JNA rel16/32   3 (true), 1 (false)   Jumps near if not above (CF = 1 or ZF = 1).
Operation
IF CF = 1 OR ZF = 1
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JNA tests the flag set by a previous instruction. “Above” indicates an unsigned integer
comparison. If the given condition is true, a jump is made to the location provided as the
operand.
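Several mnemonics in this chapter are alternate names for the same encoding, as the "see also" cross-references indicate. The Python sketch below groups the short-form (rel8) opcodes listed in this part of the chapter to make the aliases visible; the dictionary and function names are hypothetical.

```python
# Short-form (rel8) opcodes as listed in the instruction tables
SHORT_OPCODES = {
    "JA": 0x77, "JAE": 0x73, "JB": 0x72, "JBE": 0x76, "JC": 0x72,
    "JE": 0x74, "JG": 0x7F, "JGE": 0x7D, "JL": 0x7C, "JLE": 0x7E,
    "JNA": 0x76, "JNAE": 0x72, "JNB": 0x73, "JNBE": 0x77, "JNC": 0x73,
    "JNE": 0x75, "JNG": 0x7E,
}


def aliases_by_opcode(table):
    """Group mnemonics that share an opcode, in alphabetical order."""
    groups = {}
    for mnemonic in sorted(table):
        groups.setdefault(table[mnemonic], []).append(mnemonic)
    return groups


groups = aliases_by_opcode(SHORT_OPCODES)
```

In this grouping, opcode 76h carries both JBE and JNA, and opcode 72h carries JB, JC, and JNAE, so an assembler is free to accept any of the names for the same instruction.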
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.141 JNAE Jumps If Not Above or Equal (see also JB and JC)
Opcode        Instruction     Clocks                Description
72 cb         JNAE rel8       3 (true), 1 (false)   Jumps short if not above or equal (CF = 1).
0F 82 cw/cd   JNAE rel16/32   3 (true), 1 (false)   Jumps near if not above or equal (CF = 1).
Operation
IF CF = 1
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JNAE tests the flag set by a previous instruction. “Above” indicates an unsigned integer
comparison. If the given condition is true, a jump is made to the location provided as the
operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.142 JNB Jumps If Not Below (see also JAE and JNC)
Opcode        Instruction    Clocks                Description
73 cb         JNB rel8       3 (true), 1 (false)   Jumps short if not below (CF = 0).
0F 83 cw/cd   JNB rel16/32   3 (true), 1 (false)   Jumps near if not below (CF = 0).
Operation
IF CF = 0
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JNB tests the flag set by a previous instruction. “Below” indicates an unsigned integer
comparison. If the given condition is true, a jump is made to the location provided as the
operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.143 JNBE Jumps If Not Below or Equal (see also JA)
Opcode        Instruction     Clocks                Description
77 cb         JNBE rel8       3 (true), 1 (false)   Jumps short if not below or equal (CF = 0 and ZF = 0).
0F 87 cw/cd   JNBE rel16/32   3 (true), 1 (false)   Jumps near if not below or equal (CF = 0 and ZF = 0).
Operation
IF CF = 0 AND ZF = 0
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JNBE tests the flag set by a previous instruction. ‘Below’ indicates an unsigned integer
comparison. If the given condition is true, a jump is made to the location provided as the
operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.144
JNC
Jumps If Not Carry (see also JAE and JNB)
Opcode        Instruction     Clocks                Description
73 cb         JNC rel8        3 (true), 1 (false)   Jumps short if not carry (CF = 0).
0F 83 cw/cd   JNC rel16/32    3 (true), 1 (false)   Jumps near if not carry (CF = 0).
Operation
IF CF = 0
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JNC tests the flag set by a previous instruction. If the given condition is true, a jump is made
to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.145
JNE
Jumps If Not Equal (see also JNZ)
Opcode        Instruction     Clocks                Description
75 cb         JNE rel8        3 (true), 1 (false)   Jumps short if not equal (ZF = 0).
0F 85 cw/cd   JNE rel16/32    3 (true), 1 (false)   Jumps near if not equal (ZF = 0).
Operation
IF ZF = 0
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JNE tests the flag set by a previous instruction. If the given condition is true, a jump is made
to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.146
JNG
Jumps If Not Greater (see also JLE)
Opcode        Instruction     Clocks                Description
7E cb         JNG rel8        3 (true), 1 (false)   Jumps short if not greater (ZF = 1 or SF ≠ OF).
0F 8E cw/cd   JNG rel16/32    3 (true), 1 (false)   Jumps near if not greater (ZF = 1 or SF ≠ OF).
Operation
IF ZF = 1 OR SF ≠ OF
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JNG tests the flag set by a previous instruction. ‘Greater’ indicates a signed integer comparison. If the given condition is true, a jump is made to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.147
JNGE
Jumps If Not Greater or Equal (see also JL)
Opcode        Instruction     Clocks                Description
7C cb         JNGE rel8       3 (true), 1 (false)   Jumps short if not greater or equal (SF ≠ OF).
0F 8C cw/cd   JNGE rel16/32   3 (true), 1 (false)   Jumps near if not greater or equal (SF ≠ OF).
Operation
IF SF ≠ OF
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JNGE tests the flag set by a previous instruction. ‘Greater’ indicates a signed integer
comparison. If the given condition is true, a jump is made to the location provided as the
operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.148
JNL
Jumps If Not Less (see also JGE)
Opcode        Instruction     Clocks                Description
7D cb         JNL rel8        3 (true), 1 (false)   Jumps short if not less (SF = OF).
0F 8D cw/cd   JNL rel16/32    3 (true), 1 (false)   Jumps near if not less (SF = OF).
Operation
IF SF = OF
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JNL tests the flag set by a previous instruction. ‘Less’ indicates a signed integer comparison.
If the given condition is true, a jump is made to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.149
JNLE
Jumps If Not Less or Equal (see also JG)
Opcode        Instruction     Clocks                Description
7F cb         JNLE rel8       3 (true), 1 (false)   Jumps short if not less or equal (ZF = 0 and SF = OF).
0F 8F cw/cd   JNLE rel16/32   3 (true), 1 (false)   Jumps near if not less or equal (ZF = 0 and SF = OF).
Operation
IF ZF = 0 AND SF = OF
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JNLE tests the flag set by a previous instruction. ‘Less’ indicates a signed integer comparison. If the given condition is true, a jump is made to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
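The four signed conditions (JNG, JNGE, JNL, JNLE) can be compared side by side with a small Python sketch. It uses the standard signed-condition encodings (JNG: ZF = 1 or SF ≠ OF; JNGE: SF ≠ OF; JNL: SF = OF; JNLE: ZF = 0 and SF = OF). The helper cmp_signed_flags is a hypothetical model of a preceding CMP a, b and is offered only as an illustration.

```python
def cmp_signed_flags(a, b, bits=32):
    """Model ZF, SF, and OF after CMP a, b (which computes a - b)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    zf = int(result == 0)
    sf = int(bool(result & sign))
    # Signed overflow: a and b differ in sign, and the result's sign differs from a's.
    of = int(bool((a ^ b) & (a ^ result) & sign))
    return {"ZF": zf, "SF": sf, "OF": of}

def jng(f):
    """JNG / JLE: jump if ZF = 1 or SF != OF."""
    return f["ZF"] == 1 or f["SF"] != f["OF"]

def jnge(f):
    """JNGE / JL: jump if SF != OF."""
    return f["SF"] != f["OF"]

def jnl(f):
    """JNL / JGE: jump if SF = OF."""
    return f["SF"] == f["OF"]

def jnle(f):
    """JNLE / JG: jump if ZF = 0 and SF = OF."""
    return f["ZF"] == 0 and f["SF"] == f["OF"]
```

Testing SF against OF, rather than SF alone, keeps the result correct when the subtraction overflows, for example when comparing -128 with 1 in 8 bits.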
2.150
JNO
Jumps If Not Overflow
Opcode        Instruction     Clocks                Description
71 cb         JNO rel8        3 (true), 1 (false)   Jumps short if not overflow (OF = 0).
0F 81 cw/cd   JNO rel16/32    3 (true), 1 (false)   Jumps near if not overflow (OF = 0).
Operation
IF OF = 0
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JNO tests the flag set by a previous instruction. If the given condition is true, a jump is made
to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.151
JNP
Jumps If Not Parity (see also JPO)
Opcode        Instruction     Clocks                Description
7B cb         JNP rel8        3 (true), 1 (false)   Jumps short if not parity (PF = 0).
0F 8B cw/cd   JNP rel16/32    3 (true), 1 (false)   Jumps near if not parity (PF = 0).
Operation
IF PF = 0
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JNP tests the flag set by a previous instruction. If the given condition is true, a jump is made
to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.152
JNS
Jumps If Not Sign
Opcode        Instruction     Clocks                Description
79 cb         JNS rel8        3 (true), 1 (false)   Jumps short if not sign (SF = 0).
0F 89 cw/cd   JNS rel16/32    3 (true), 1 (false)   Jumps near if not sign (SF = 0).
Operation
IF SF = 0
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JNS tests the flag set by a previous instruction. If the given condition is true, a jump is made
to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.153
JNZ
Jumps If Not Zero (see also JNE)
Opcode        Instruction     Clocks                Description
75 cb         JNZ rel8        3 (true), 1 (false)   Jumps short if not zero (ZF = 0).
0F 85 cw/cd   JNZ rel16/32    3 (true), 1 (false)   Jumps near if not zero (ZF = 0).
Operation
IF ZF = 0
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JNZ tests the flag set by a previous instruction. If the given condition is true, a jump is made
to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.154
JO
Jumps If Overflow
Opcode        Instruction     Clocks                Description
70 cb         JO rel8         3 (true), 1 (false)   Jumps short if overflow (OF = 1).
0F 80 cw/cd   JO rel16/32     3 (true), 1 (false)   Jumps near if overflow (OF = 1).
Operation
IF OF = 1
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JO tests the flag set by a previous instruction. If the given condition is true, a jump is made
to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.155
JP
Jumps If Parity (see also JPE)
Opcode        Instruction     Clocks                Description
7A cb         JP rel8         3 (true), 1 (false)   Jumps short if parity (PF = 1).
0F 8A cw/cd   JP rel16/32     3 (true), 1 (false)   Jumps near if parity (PF = 1).
Operation
IF PF = 1
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JP tests the flag set by a previous instruction. If the given condition is true, a jump is made
to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.156
JPE
Jumps If Parity Even (see also JP)
Opcode        Instruction     Clocks                Description
7A cb         JPE rel8        3 (true), 1 (false)   Jumps short if parity even (PF = 1).
0F 8A cw/cd   JPE rel16/32    3 (true), 1 (false)   Jumps near if parity even (PF = 1).
Operation
IF PF = 1
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JPE tests the flag set by a previous instruction. If the given condition is true, a jump is made
to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.157
JPO
Jumps If Parity Odd (see also JNP)
Opcode        Instruction     Clocks                Description
7B cb         JPO rel8        3 (true), 1 (false)   Jumps short if parity odd (PF = 0).
0F 8B cw/cd   JPO rel16/32    3 (true), 1 (false)   Jumps near if parity odd (PF = 0).
Operation
IF PF = 0
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JPO tests the flag set by a previous instruction. If the given condition is true, a jump is made
to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
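The parity conditions (JP/JPE and JNP/JPO) depend on how PF is produced by the preceding instruction. This section does not restate that rule; the sketch below assumes the usual x86 behavior, in which PF is set when the low-order byte of a result contains an even number of 1 bits. The function names are hypothetical.

```python
def parity_flag(result):
    """Return PF for a result: 1 if the low byte has an even number of 1 bits."""
    low_byte = result & 0xFF
    return int(bin(low_byte).count("1") % 2 == 0)

def jp(pf):
    """JP / JPE: jump if PF = 1 (parity even)."""
    return pf == 1

def jnp(pf):
    """JNP / JPO: jump if PF = 0 (parity odd)."""
    return pf == 0
```

Only the low byte matters in this model, so a result of 0100h has PF = 1 even though the full value contains a single 1 bit.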
2.158
JS
Jumps If Sign
Opcode        Instruction     Clocks                Description
78 cb         JS rel8         3 (true), 1 (false)   Jumps short if sign (SF = 1).
0F 88 cw/cd   JS rel16/32     3 (true), 1 (false)   Jumps near if sign (SF = 1).
Operation
IF SF = 1
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JS tests the flag set by a previous instruction. If the given condition is true, a jump is made
to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.159
JZ
Jumps If 0 (see also JE)
Opcode        Instruction     Clocks                Description
74 cb         JZ rel8         3 (true), 1 (false)   Jumps short if 0 (ZF = 1).
0F 84 cw/cd   JZ rel16/32     3 (true), 1 (false)   Jumps near if 0 (ZF = 1).
Operation
IF ZF = 1
THEN
EIP ← EIP + SignExtend(rel8/16/32)
IF OperandSize = 16
THEN EIP ← EIP AND 0000FFFFh;
FI;
FI
Description
JZ tests the flag set by a previous instruction. If the given condition is true, a jump is made
to the location provided as the operand.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the limits of the code segment.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The instruction converts all branches into 16-byte code fetches regardless of jump
address or cacheability.
2.160
LAHF
Loads Flags into AH
Opcode   Instruction   Clocks   Description
9F       LAHF          3        Loads the FLAGS register into AH.
Operation
AH ← SF:ZF:xx:AF:xx:PF:xx:CF
Description
The LAHF instruction transfers the low byte of the FLAGS register (bits 7 through 0 of
EFLAGS) to the AH register. After the transfer, the bits shadow the flags as follows:
• AH bit 0 = Carry Flag
• AH bit 2 = Parity Flag
• AH bit 4 = Auxiliary Flag
• AH bit 6 = Zero Flag
• AH bit 7 = Sign Flag
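The bit assignments above can be sketched as a small Python packing function. The name lahf and its keyword parameters are hypothetical. The bit positions shown as xx in the Operation (bits 1, 3, and 5) are left as 0 here purely for illustration; the sketch makes no claim about the values the processor places there.

```python
def lahf(cf=0, pf=0, af=0, zf=0, sf=0):
    """Pack the five status flags into the AH layout SF:ZF:xx:AF:xx:PF:xx:CF."""
    return (
        (sf & 1) << 7
        | (zf & 1) << 6
        | (af & 1) << 4
        | (pf & 1) << 2
        | (cf & 1)
    )
```

For example, with only CF and ZF set the packed value is 41h, and with all five flags set it is D5h.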
Flags Affected
None
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2.161
LAR
Loads Access Rights Byte
Opcode     Instruction     Clocks   Description
0F 02 /r   LAR r16,r/m16   11/11    r16 ← r/m16 masked by FF00h
0F 02 /r   LAR r32,r/m32   11/11    r32 ← r/m32 masked by 00FxFF00h
Description
If the source selector is visible at the current privilege level (modified by the selector’s RPL)
and is a valid descriptor type within the descriptor limits, LAR stores the high-order doubleword of the descriptor masked by 00FxFF00h in the destination register, and sets ZF. The
x indicates that the four bits corresponding to the upper four bits of the limit are undefined
in the value loaded by the LAR instruction. If the selector is invisible or of the wrong type,
LAR clears ZF. If the 32-bit operand size is specified, the entire 32-bit value is loaded into
the 32-bit destination register. If the 16-bit operand size is specified, the lower 16 bits of
this value are stored in the 16-bit destination register. All code and data segment descriptors
are valid for LAR.
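The masking step can be sketched in Python as follows. The function lar_value is hypothetical and models only the masking of the descriptor's high-order doubleword; the visibility, type, and limit checks described above are omitted. Because the x nibble (bits 19 through 16) is undefined, the sketch clears it, which is an assumption made for the sake of a deterministic example.

```python
# 00FxFF00h with the undefined x nibble cleared (an assumption of this sketch).
LAR_MASK = 0x00F0FF00

def lar_value(descriptor_high, operand_size):
    """Return the value LAR would place in the destination register."""
    value = descriptor_high & LAR_MASK
    if operand_size == 16:
        # Only the lower 16 bits are stored, equivalent to masking by FF00h.
        value &= 0xFFFF
    return value
```

For a descriptor whose high-order doubleword is 00CF9A00h, this yields 00C09A00h with a 32-bit operand and 9A00h with a 16-bit operand.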
The valid special segment and gate descriptor types for the LAR instruction are given in
the following table:
Type   Name                     Valid/Invalid
0      Invalid                  Invalid
1      Available 80286 TSS      Valid
2      LDT                      Valid
3      Busy 80286 TSS           Valid
4      80286 call gate          Valid
5      80286/486 task gate      Valid
6      80286 trap gate          Valid
7      80286 interrupt gate     Valid
8      Invalid                  Invalid
9      Available 486 TSS        Valid
A      Invalid                  Invalid
B      Busy 486 TSS             Valid
C      486 call gate            Valid
D      Invalid                  Invalid
E      486 trap gate            Valid
F      486 interrupt gate       Valid
Flags Affected
If the selector is invisible or of the wrong type, LAR clears ZF; otherwise, it sets ZF.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
Invalid Opcode (6) occurs. LAR is unrecognized in Real Address Mode.
Virtual 8086 Mode Exceptions
Invalid Opcode (6) occurs. LAR is unrecognized in Virtual 8086 Mode.
2.162
LDS
Loads Pointer Using DS
Opcode   Instruction      Clocks   Description
C5 /r    LDS r16,m16:16   6/12     Loads DS:r16 with pointer from memory.
C5 /r    LDS r32,m16:32   6/12     Loads DS:r32 with pointer from memory.
Operation
IF (OperandSize = 16)
THEN
r16 ← [Effective Address]; (* 16-bit transfer *)
DS ← [Effective Address + 2]; (* 16-bit transfer *)
ELSE (* OperandSize = 32 *)
r32 ← [Effective Address]; (* 32-bit transfer *)
DS ← [Effective Address + 4]; (* 16-bit transfer *)
FI;
IF Protected Mode and DS is loaded with a non-null selector:
Index is within limits ELSE General Protection Fault(selector);
AR byte indicates data segment ELSE General Protection Fault(selector);
IF data or non-conforming code
THEN RPL and CPL are ≤ DPL in AR byte ELSE General Protection Fault(selector);
Segment must be marked present ELSE Segment Not Present(selector);
Load segment register with selector and RPL bits;
Load segment register with descriptor;
IF Protected Mode and DS is loaded with a null selector:
Load segment register with selector; Clear descriptor valid bit
Description
LDS reads a full pointer from memory and stores it in a register pair consisting of the DS
register and a second operand-specified register. The 16-bit selector is loaded into DS, and the 16- or 32-bit offset (as specified by the operand size) is placed into the register specified
by the r16 or r32 register operand. In memory, the offset is stored first, followed by the selector. The descriptor loaded into the segment register comes from the descriptor
table entry that the selector references. Loading a null selector (values 0000–0003) into DS does not cause
a protection exception, but any subsequent reference to a segment with a null selector
causes a General Protection Fault (13) and no memory reference to the segment occurs.
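The memory layout implied by the Operation above (offset at the effective address, selector at effective address + 2 or + 4) can be sketched in Python. The function names and the use of a bytes object to stand in for memory are hypothetical, and none of the protection checks are modeled. The same layout applies to LES, LFS, and LGS.

```python
import struct

def load_far_pointer(mem, ea, operand_size):
    """Read a far pointer at mem[ea]; return (selector, offset).

    The offset comes first in memory, followed by the 16-bit selector.
    Values are little-endian, as on x86.
    """
    if operand_size == 16:
        offset, selector = struct.unpack_from("<HH", mem, ea)
    else:
        offset, selector = struct.unpack_from("<IH", mem, ea)
    return selector, offset

def is_null_selector(selector):
    """Selector values 0000h through 0003h are null selectors."""
    return selector <= 0x0003
```

For instance, the six bytes 78 56 34 12 08 00 decode to selector 0008h and offset 12345678h with a 32-bit operand size.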
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Invalid Opcode (6) indicates the second
operand is a register. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check
(17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.163
LEA
Loads Effective Address
Opcode   Instruction         Clocks   Description
8D /r    LEA r16,m[16-bit]   1        Stores effective address for m in 16-bit register.
8D /r    LEA r32,m[16-bit]   1        Stores effective address for m in 32-bit register.
8D /r    LEA r16,m[32-bit]   1        Stores effective address for m in 16-bit register.
8D /r    LEA r32,m[32-bit]   1        Stores effective address for m in 32-bit register.
Operation
IF OperandSize = 16 AND AddressSize = 16
THEN r16 ← Addr(m);
IF OperandSize = 16 AND AddressSize = 32
THEN r16 ← Truncate_to_16bits(Addr(m)); (* 32-bit address *)
IF OperandSize = 32 AND AddressSize = 16
THEN r32 ← Truncate_to_16bits(Addr(m)); (* zero-extended to 32 bits *)
IF OperandSize = 32 AND AddressSize = 32
THEN r32 ← Addr(m);
FI
Description
LEA calculates the effective address (offset part) and stores it in the specified register. The
operand-size attribute of the instruction (represented by OperandSize in ‘Operation’ above)
is determined by the chosen register. The address-size attribute (represented by AddressSize) is determined by the USE attribute of the segment containing the second operand.
The address-size and operand-size attributes affect the action performed by the LEA instruction, as follows:
• 16-bit operand, 16-bit address: LEA calculates the effective 16-bit address and stores
it in the 16-bit destination register.
• 16-bit operand, 32-bit address: LEA calculates the effective 32-bit address and stores
the lower 16 bits in the 16-bit destination register.
• 32-bit operand, 16-bit address: LEA calculates the effective 16-bit address, zero extends
it, and stores it in the 32-bit destination register.
• 32-bit operand, 32-bit address: LEA calculates the effective 32-bit address and stores
it in the 32-bit destination register.
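The four combinations above can be sketched in Python. The function lea and its base/index/scale/disp parameters are a hypothetical simplification of the addressing forms; with 16-bit addressing a scale of 1 is assumed.

```python
def lea(base, index, scale, disp, operand_size, address_size):
    """Return the value LEA would store in the destination register."""
    # The effective address wraps to the width given by the address size.
    addr_mask = 0xFFFF if address_size == 16 else 0xFFFFFFFF
    addr = (base + index * scale + disp) & addr_mask
    if operand_size == 16:
        # A 16-bit destination keeps only the lower 16 bits.
        return addr & 0xFFFF
    # A 32-bit destination receives the address; a 16-bit address is
    # already zero-extended because addr was masked to 16 bits above.
    return addr
```

With a 32-bit address of 12345048h, a 16-bit destination receives 5048h, while a 16-bit address computation that wraps past FFFFh is zero-extended into a 32-bit destination.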
Flags Affected
None
Protected Mode Exceptions
Invalid Opcode (6) indicates the second operand is a register.
Real Address Mode Exceptions
Invalid Opcode (6) indicates the second operand is a register.
Virtual 8086 Mode Exceptions
Invalid Opcode (6) indicates the second operand is a register.
2.164
LEAVE
High Level Procedure Exit
Opcode   Instruction   Clocks   Description
C9       LEAVE         5        Sets SP to BP, then pops BP.
C9       LEAVE         5        Sets ESP to EBP, then pops EBP.
Operation
IF StackAddrSize = 16
THEN
SP ← BP;
pop BP;
ELSE (* StackAddrSize = 32 *)
ESP ← EBP;
pop EBP;
FI
Description
The LEAVE instruction reverses the actions of the ENTER instruction. By copying the frame
pointer to the stack pointer, the LEAVE instruction releases the stack space used by a
procedure for its local variables. The old frame pointer is popped into the BP or EBP register,
restoring the caller’s frame. A subsequent RET nn instruction removes any arguments
pushed onto the stack of the exiting procedure.
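The effect of LEAVE on the stack and frame pointers can be sketched in Python. The function leave, the register dictionary, and the bytearray standing in for the stack segment are all hypothetical; the sketch assumes the pop width matches the stack address size and does no limit checking.

```python
def leave(regs, stack, stack_addr_size=32):
    """Model LEAVE: copy BP to SP, then pop the saved frame pointer into BP.

    regs  -- dict with "sp" and "bp" offsets into the stack segment
    stack -- bytes-like object holding the stack segment contents
    """
    width = 2 if stack_addr_size == 16 else 4
    mask = 0xFFFF if stack_addr_size == 16 else 0xFFFFFFFF
    sp = regs["bp"] & mask                       # SP <- BP (or ESP <- EBP)
    bp = int.from_bytes(stack[sp:sp + width], "little")  # pop BP (or EBP)
    sp = (sp + width) & mask
    return {"sp": sp, "bp": bp}
```

After the call, the stack pointer refers to the return address, which is where a subsequent RET expects it.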
Flags Affected
None
Protected Mode Exceptions
Stack Fault (12) indicates the BP or EBP register does not point to a location within the
limits of the current stack segment.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
2.165
LES
Loads Pointer Using ES
Opcode   Instruction      Clocks   Description
C4 /r    LES r16,m16:16   6/12     Loads ES:r16 with pointer from memory.
C4 /r    LES r32,m16:32   6/12     Loads ES:r32 with pointer from memory.
Operation
IF (OperandSize = 16)
THEN
r16 ← [Effective Address]; (* 16-bit transfer *)
ES ← [Effective Address + 2]; (* 16-bit transfer *)
ELSE (* OperandSize = 32 *)
r32 ← [Effective Address]; (* 32-bit transfer *)
ES ← [Effective Address + 4]; (* 16-bit transfer *)
FI;
IF Protected Mode and ES is loaded with a non-null selector:
Index is within limits ELSE General Protection Fault(selector);
AR byte indicates data segment ELSE General Protection Fault(selector);
IF data or non-conforming code
THEN RPL and CPL are ≤ DPL in AR byte ELSE General Protection Fault(selector);
Segment is marked present ELSE Segment Not Present Fault(selector);
Load segment register with selector and RPL bits;
Load segment register with descriptor;
IF Protected Mode and ES is loaded with a null selector:
Load segment register with selector; Clear descriptor valid bit
Description
LES reads a full pointer from memory and stores it in a register pair consisting of the ES
register and a second operand-specified register. The 16-bit selector is loaded into ES, and the 16- or 32-bit offset (as specified by the operand size) is placed into the register specified
by the r16 or r32 register operand. In memory, the offset is stored first, followed by the selector. The descriptor loaded into the segment register comes from the descriptor
table entry that the selector references. Loading a null selector (values 0000–0003) into ES does not cause
a protection exception, but any subsequent reference to a segment with a null selector
causes a General Protection Fault (13) and no memory reference to the segment occurs.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Invalid Opcode (6) indicates the second
operand is a register. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check
(17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.166
LFS
Loads Pointer Using FS
Opcode     Instruction      Clocks   Description
0F B4 /r   LFS r16,m16:16   6/12     Loads FS:r16 with pointer from memory.
0F B4 /r   LFS r32,m16:32   6/12     Loads FS:r32 with pointer from memory.
Operation
IF (OperandSize = 16)
THEN
r16 ← [Effective Address]; (* 16-bit transfer *)
FS ← [Effective Address + 2]; (* 16-bit transfer *)
ELSE (* OperandSize = 32 *)
r32 ← [Effective Address]; (* 32-bit transfer *)
FS ← [Effective Address + 4]; (* 16-bit transfer *)
FI;
IF Protected Mode and FS is loaded with a non-null selector:
Index is within limits ELSE General Protection Fault(selector);
AR byte indicates data segment ELSE General Protection Fault(selector);
IF data or non-conforming code
THEN RPL and CPL are ≤ DPL in AR byte ELSE General Protection Fault(selector);
Segment is marked present ELSE Segment Not Present(selector);
Load segment register with selector and RPL bits;
Load segment register with descriptor;
IF Protected Mode and FS is loaded with a null selector:
Load segment register with selector; Clear descriptor valid bit;
Description
LFS reads a full pointer from memory and stores it in a register pair consisting of the FS
register and a second operand-specified register. The 16-bit selector is loaded into FS, and the 16- or 32-bit offset (as specified by the operand size) is placed into the register specified
by the r16 or r32 register operand. In memory, the offset is stored first, followed by the selector. The descriptor loaded into the segment register comes from the descriptor
table entry that the selector references. Loading a null selector (values 0000–0003) into FS does not cause
a protection exception, but any subsequent reference to a segment with a null selector
causes a General Protection Fault (13) and no memory reference to the segment occurs.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Invalid Opcode (6) indicates the second
operand is a register. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check
(17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.167
LGDT
Loads GDTR
Opcode     Instruction   Clocks   Description
0F 01 /2   LGDT m16&32   11       Loads m into GDTR.
Operation
IF OperandSize = 16
THEN GDTR.Limit:Base ← m16:24; (* 24 bits of base loaded *)
ELSE GDTR.Limit:Base ← m16:32;
FI;
Description
LGDT loads a linear base address and limit value from a 6-byte data operand in memory
into the GDTR. If a 16-bit operand is used with the LGDT instruction, the register is loaded
with a 16-bit limit and a 24-bit base, and the high-order 8 bits of the 6-byte data operand
are not used. If a 32-bit operand is used, a 16-bit limit and a 32-bit base are loaded; the
high-order 8 bits of the 6-byte operand are used as high-order base address bits.
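The handling of the 6-byte operand can be sketched in Python. The function decode_lgdt_operand is hypothetical and shows only how the limit and base fields are extracted for each operand size; it assumes the little-endian layout used by x86, with the 16-bit limit in the first two bytes and the base in the remaining four. The same layout applies to LIDT.

```python
def decode_lgdt_operand(operand, operand_size):
    """Return (limit, base) as they would be loaded from a 6-byte operand."""
    if len(operand) != 6:
        raise ValueError("operand must be exactly 6 bytes")
    limit = int.from_bytes(operand[0:2], "little")
    base = int.from_bytes(operand[2:6], "little")
    if operand_size == 16:
        # Only 24 bits of base are loaded; the high-order byte is not used.
        base &= 0x00FFFFFF
    return limit, base
```

For the bytes FF 07 00 10 80 12, a 32-bit operand size gives limit 07FFh and base 12801000h, while a 16-bit operand size gives the same limit and base 00801000h.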
The SGDT instruction always stores into all 48 bits of the 6-byte data operand. With the
80286 microprocessor, the upper 8 bits are undefined after SGDT executes. With the
Am386DX or Am486 microprocessors, the upper 8 bits are written with the high-order 8
address bits, for both a 16-bit operand and a 32-bit operand. If the LGDT instruction is used
with a 16-bit operand to load the register stored by the SGDT instruction, the upper 8 bits
are stored as zeros.
The LGDT instruction appears in operating system software. It is not used in application
programs. LGDT and LIDT are the only instructions that load a linear address directly (i.e.,
not a segment relative address) in Protected Mode.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates one of three conditions: the current privilege level
is not 0, the result destination is a non-writable segment, or the code or data segments
have an illegal memory-operand effective address. Invalid Opcode (6) indicates the source
operand is a register. Stack Fault (12) indicates an illegal SS segment address. Page Fault
(14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates there is an unaligned
memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Invalid Opcode (6) indicates the source operand is a register.
Note: This instruction is valid in Real Address Mode to allow power-up initialization for
Protected Mode.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Invalid Opcode (6) indicates the source operand is a register.
Page Fault (14) indicates a page fault.
2.168
LGS
Loads Pointer Using GS
Opcode     Instruction      Clocks   Description
0F B5 /r   LGS r16,m16:16   6/12     Loads GS:r16 with pointer from memory.
0F B5 /r   LGS r32,m16:32   6/12     Loads GS:r32 with pointer from memory.
Operation
IF (OperandSize = 16)
THEN
r16 ← [Effective Address]; (* 16-bit transfer *)
GS ← [Effective Address + 2]; (* 16-bit transfer *)
ELSE (* OperandSize = 32 *)
r32 ← [Effective Address]; (* 32-bit transfer *)
GS ← [Effective Address + 4]; (* 16-bit transfer *)
FI;
IF Protected Mode and GS is loaded with a non-null selector:
Index must be within limits ELSE General Protection Fault(selector);
AR byte indicates data segment ELSE General Protection Fault(selector);
IF data or non-conforming code
THEN RPL and CPL are ≤ DPL in AR byte ELSE General Protection Fault(selector);
Segment must be marked present ELSE Segment Not Present(selector);
Load segment register with selector and RPL bits;
Load segment register with descriptor;
IF Protected Mode and GS is loaded with a null selector:
Load segment register with selector; Clear descriptor valid bit;
Description
LGS reads a full pointer from memory and stores it in a register pair consisting of the GS
register and a second operand-specified register. The 16-bit selector is loaded into GS, and the 16- or 32-bit offset (as specified by the operand size) is placed into the register specified
by the r16 or r32 register operand. In memory, the offset is stored first, followed by the selector. The descriptor loaded into the segment register comes from the descriptor
table entry that the selector references. Loading a null selector (values 0000–0003) into GS does not cause
a protection exception, but any subsequent reference to a segment with a null selector
causes a General Protection Fault (13) and no memory reference to the segment occurs.
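The memory layout implied by the Operation listing (offset at the effective address, selector at offset + 2 or + 4) can be sketched in Python. This is only an illustrative model; the function name `decode_far_pointer` is hypothetical and no protection checks are performed.

```python
import struct

def decode_far_pointer(mem: bytes, operand_size: int):
    """Split an m16:16 or m16:32 operand into (selector, offset).

    Hypothetical helper: models only the byte layout, not the
    Protected Mode descriptor checks.
    """
    if operand_size == 16:
        # 16-bit offset at EA, 16-bit selector at EA + 2
        offset, selector = struct.unpack_from("<HH", mem)
    elif operand_size == 32:
        # 32-bit offset at EA, 16-bit selector at EA + 4
        offset, selector = struct.unpack_from("<IH", mem)
    else:
        raise ValueError("operand size must be 16 or 32")
    return selector, offset

# Little-endian bytes for offset 1234h followed by selector 0008h
ptr16 = bytes([0x34, 0x12, 0x08, 0x00])
# Little-endian bytes for offset 12345678h followed by selector 0010h
ptr32 = bytes([0x78, 0x56, 0x34, 0x12, 0x10, 0x00])
```

With these sample bytes, `decode_far_pointer(ptr16, 16)` yields the selector for GS and the value for r16, and `decode_far_pointer(ptr32, 32)` does the same for the r32 form.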
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Invalid Opcode (6) indicates the second
operand is a register. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check
(17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.169
LIDT
Loads IDTR
Opcode      Instruction    Clocks   Description
0F 01 /3    LIDT m16&32    11       Loads m into IDTR.
Operation
IF OperandSize = 16
THEN IDTR.Limit:Base ← m16:24 (* 24 bits of base loaded *)
ELSE IDTR.Limit:Base ← m16:32
FI;
Description
The LIDT instruction loads a linear base address and limit value from a 6-byte data operand
in memory into the IDTR. If a 16-bit operand is used with the LIDT instruction, the register
is loaded with a 16-bit limit and a 24-bit base, and the high-order 8 bits of the 6-byte data
operand are not used. If a 32-bit operand is used, a 16-bit limit and a 32-bit base are loaded;
the high-order 8 bits of the 6-byte operand are used as high-order base address bits.
The SIDT instruction always stores into all 48 bits of the 6-byte data operand. With the
80286 microprocessor, the upper 8 bits are undefined after SIDT executes. With the
Am386DX or Am486 microprocessors, the upper 8 bits are written with the high-order 8
address bits, for both a 16-bit operand and a 32-bit operand. If the LIDT instruction is used
with a 16-bit operand to load the register stored by the SIDT instruction, the upper 8 bits
are stored as zeros.
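The following Python sketch models how the 6-byte operand is interpreted for the two operand sizes, and the SIDT/LIDT round trip described above. The names `lidt_load` and `sidt_store` are hypothetical, and the model assumes the limit occupies the low-order word of the operand.

```python
import struct

def sidt_store(limit: int, base: int) -> bytes:
    """Model SIDT on the Am386DX/Am486: all 48 bits are written."""
    return struct.pack("<HI", limit & 0xFFFF, base & 0xFFFFFFFF)

def lidt_load(operand: bytes, operand_size: int):
    """Return the (limit, base) that LIDT would take from a 6-byte operand."""
    if len(operand) != 6:
        raise ValueError("LIDT operand must be 6 bytes")
    limit, base = struct.unpack("<HI", operand)
    if operand_size == 16:
        # Only 24 bits of base are loaded; the high-order byte is ignored
        base &= 0x00FFFFFF
    return limit, base

stored = sidt_store(0x07FF, 0x80123456)
```

Reloading `stored` with a 32-bit operand size returns the original base, while a 16-bit operand size returns a base whose upper 8 bits are zero.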
The LIDT instruction appears in operating system software. It is not used in application
programs. LGDT and LIDT are the only instructions that directly load a linear address (i.e.,
not a segment relative address) in Protected Mode.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates one of three conditions: the current privilege level
is not 0, the result destination is a non-writable segment, or the code or data segments
have an illegal memory-operand effective address. Invalid Opcode (6) indicates the source
operand is a register. Stack Fault (12) indicates an illegal SS segment address. Page Fault
(14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates there is an unaligned
memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Invalid Opcode (6) indicates the source operand is a register.
Note: This instruction is valid in Real Address Mode to allow power-up initialization for
Protected Mode.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Invalid Opcode (6) indicates the source operand is a register.
Page Fault (14) indicates a page fault.
2.170
LLDT
Loads LDTR
Opcode      Instruction    Clocks   Description
0F 00 /2    LLDT r/m16     11/11    Loads selector r/m16 into LDTR.
Operation
LDTR ← SRC
Description
The LLDT instruction loads the Local Descriptor Table register (LDTR). The word operand
(memory or register) used with the LLDT instruction must contain a selector to the Global
Descriptor Table (GDT). The GDT entry must be a Local Descriptor Table descriptor; the LDTR
is loaded from that entry. The segment registers DS, ES, SS, FS, GS, and CS are not affected. The
LDT field in the task state segment does not change. The selector operand can be 0; if so,
the LDTR is marked invalid, and all subsequent descriptor references to the LDT (except by the
LAR, VERR, VERW, or LSL instructions) cause a General Protection Fault (13).
Note: The LLDT instruction is used in operating system software. It is not used in application
programs.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates one of three conditions: the current privilege level
is not 0, the result destination is a non-writable segment, or the code or data segments
have an illegal memory-operand effective address. General Protection Fault (13) also indicates
that the selector operand does not point into the Global Descriptor Table or that the entry in
the GDT is not a Local Descriptor Table. Segment Not Present (11) indicates the LDT descriptor
is not present. Stack Fault (12) indicates an illegal SS segment address. Page Fault (14)
indicates a page fault.
Real Address Mode Exceptions
Invalid Opcode (6) occurs because the LLDT instruction is not recognized in Real Address
Mode.
Virtual 8086 Mode Exceptions
Invalid Opcode (6) occurs because the LLDT instruction is not recognized in Virtual 8086
Mode.
Note: The operand-size attribute has no effect on this instruction.
2.171
LMSW
Loads Machine Status Word
Opcode      Instruction    Clocks   Description
0F 01 /6    LMSW r/m16     13/13    Loads r/m16 into the machine status word.
Operation
MSW ← r/m16; (* 16 bits are stored in the machine status word *)
Description
The LMSW instruction loads the machine status word (part of the CR0 register) from the
source operand. This instruction can be used to switch to Protected Mode; if so, it must be
followed by an intrasegment jump to flush the instruction queue. The LMSW instruction will
not switch back to Real Address Mode.
Note: The LMSW instruction is used only in operating system software. It is not used in
application programs.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates one of three conditions: the current privilege level
is not 0, the result destination is a non-writable segment, or the code or data segments
have an illegal memory-operand effective address. Stack Fault (12) indicates an illegal SS
segment address. Page Fault (14) indicates a page fault.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault.
Note: The operand-size attribute has no effect on this instruction. This instruction is
provided for compatibility with the 80286 microprocessor; programs for the Am486
microprocessor should use the MOV CR0, ... instruction instead. The LMSW instruction
does not affect the PG or ET bits, and it cannot be used to clear the PE bit.
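The restrictions in the note above can be modelled with a small Python sketch. It assumes that LMSW writes only the low-order four bits of CR0 (PE, MP, EM, and TS); that bit range is an assumption of the sketch rather than a statement from this section, and the name `lmsw` is simply a hypothetical helper.

```python
CR0_PE = 1 << 0    # Protection Enable
CR0_ET = 1 << 4    # Extension Type (not touched by LMSW)
CR0_PG = 1 << 31   # Paging (not touched by LMSW)
MSW_WRITABLE = 0x000F  # assumed writable bits: PE, MP, EM, TS

def lmsw(cr0: int, source: int) -> int:
    """Return the CR0 value after a modelled LMSW with the given operand."""
    new_low = source & MSW_WRITABLE
    # PE is sticky: once set, LMSW cannot clear it
    new_low |= cr0 & CR0_PE
    # All bits outside the writable range keep their previous value
    return (cr0 & ~MSW_WRITABLE & 0xFFFFFFFF) | new_low
```

For example, calling `lmsw` with a source operand of 0 on a CR0 value that already has PE set leaves PE set, which is why returning to Real Address Mode needs a different mechanism.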
2.172
LOCK
Asserts LOCK Signal Prefix
Opcode      Instruction    Clocks   Description
F0          LOCK           1        Asserts LOCK signal for the next instruction.
Description
The LOCK prefix causes the processor to assert the LOCK signal during execution of the
following instruction. In a multiprocessor environment, use of this signal ensures that the
processor has exclusive use of any shared memory while LOCK is asserted. The
read-modify-write sequence typically used to implement test-and-set on the processor is the BTS
instruction.
LOCK functions only with the following instructions:
BTS, BTR, BTC                       mem, reg/imm
XCHG                                reg, mem
XCHG                                mem, reg
ADD, OR, ADC, SBB, AND, SUB, XOR    mem, reg/imm
NOT, NEG, INC, DEC                  mem
CMPXCHG, XADD                       reg/mem, reg
Using the LOCK prefix with any instruction not listed above generates an Invalid Opcode (6)
exception.
The XCHG instruction always asserts LOCK regardless of the presence or absence of the
LOCK prefix. The integrity of the LOCK prefix is not affected by the alignment of the memory
field. Memory locking is observed for arbitrarily misaligned fields.
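As an illustration of how an assembler or emulator might apply the list of lockable instructions, the Python sketch below encodes the instruction and operand forms as a lookup table. The names `LOCKABLE_FORMS` and `lock_prefix_allowed` are hypothetical, and the sketch assumes that CMPXCHG and XADD are lockable only in their memory-destination form.

```python
MEM_REG_IMM = {("mem", "reg"), ("mem", "imm")}

# Mnemonic -> set of operand forms for which LOCK is accepted
LOCKABLE_FORMS = {}
for name in ("BTS", "BTR", "BTC", "ADD", "OR", "ADC", "SBB", "AND", "SUB", "XOR"):
    LOCKABLE_FORMS[name] = MEM_REG_IMM
LOCKABLE_FORMS["XCHG"] = {("reg", "mem"), ("mem", "reg")}
for name in ("NOT", "NEG", "INC", "DEC"):
    LOCKABLE_FORMS[name] = {("mem",)}
for name in ("CMPXCHG", "XADD"):
    # Assumption: only the memory-destination form is treated as lockable
    LOCKABLE_FORMS[name] = {("mem", "reg")}

def lock_prefix_allowed(mnemonic: str, *operands: str) -> bool:
    """True if LOCK may precede this instruction form; otherwise the
    combination would raise Invalid Opcode (6)."""
    return tuple(operands) in LOCKABLE_FORMS.get(mnemonic.upper(), set())
```

A call such as `lock_prefix_allowed("add", "mem", "imm")` is accepted, while a register destination or an unlisted mnemonic such as MOV is rejected.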
Flags Affected
None
Protected Mode Exceptions
Invalid Opcode (6) indicates the LOCK prefix is used with an instruction not listed in the
‘Description’ section above; other exceptions can be generated by the subsequent (locked)
instruction.
Real Address Mode Exceptions
Invalid Opcode (6) indicates the LOCK prefix is used with an instruction not listed in the
‘Description’ section above; other exceptions can be generated by the subsequent (locked)
instruction.
Virtual 8086 Mode Exceptions
Invalid Opcode (6) indicates the LOCK prefix is used with an instruction not listed in the
‘Description’ section above; exceptions can still be generated by the subsequent (locked)
instruction.
2.173
LODS/LODSB/LODSD/LODSW
Loads String Operand
Opcode      Instruction    Clocks   Description
AC          LODS m8        5        Loads byte at (E)SI into AL.
AD          LODS m16       5        Loads word at (E)SI into AX.
AD          LODS m32       5        Loads doubleword at (E)SI into EAX.
AC          LODSB          5        Loads byte at DS:(E)SI into AL.
AD          LODSD          5        Loads doubleword at DS:(E)SI into EAX.
AD          LODSW          5        Loads word at DS:(E)SI into AX.
Operation
IF AddressSize = 16
THEN use SI for source-index
ELSE (* AddressSize = 32 *)
   use ESI for source-index;
FI;
IF byte type of instruction
THEN
   AL ← [source-index]; (* byte load *)
   IF DF = 0 THEN IncDec ← 1 ELSE IncDec ← –1; FI;
ELSE
   IF OperandSize = 16
   THEN
      AX ← [source-index]; (* word load *)
      IF DF = 0 THEN IncDec ← 2 ELSE IncDec ← –2; FI;
   ELSE (* OperandSize = 32 *)
      EAX ← [source-index]; (* doubleword load *)
      IF DF = 0 THEN IncDec ← 4 ELSE IncDec ← –4; FI;
   FI;
FI;
source-index ← source-index + IncDec
Description
LODS loads the memory byte, word, or doubleword at the location pointed to by the source-index
register into the AL, AX, or EAX register. After the transfer, the instruction automatically
advances the source-index register. If DF = 0 (the CLD instruction was executed), the
source index increments; if DF = 1 (the STD instruction was executed), it decrements. The
increment/decrement rate is 1 for a byte, 2 for a word, or 4 for a doubleword. If the address-size
attribute is 16 bits, the SI register is the source-index register; otherwise, the ESI
register is used. The source data address is determined solely by the contents of the source-index
register; load the correct index value into the register before executing LODS. LODSB,
LODSW, and LODSD are synonyms for the byte, word, and doubleword LODS instructions.
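A minimal Python model of one LODS step is shown below. It treats memory as a flat byte string and ignores segmentation, which is a simplification; the function name `lods` and its parameters are hypothetical.

```python
def lods(memory: bytes, si: int, size: int, df: int, address_size: int = 16):
    """Model one LODS step.

    size is 1, 2, or 4 bytes (LODSB, LODSW, LODSD).
    Returns (value loaded into AL/AX/EAX, updated source index).
    """
    # Little-endian load from the location addressed by the source index
    value = int.from_bytes(memory[si:si + size], "little")
    # DF = 0 increments the index, DF = 1 decrements it
    step = -size if df else size
    # SI wraps at 16 bits, ESI at 32 bits
    mask = 0xFFFF if address_size == 16 else 0xFFFFFFFF
    return value, (si + step) & mask

sample = bytes([0x11, 0x22, 0x33, 0x44, 0x55])
```

Calling `lods(sample, 0, 2, 0)` models LODSW with DF = 0 and advances the index by 2; passing `df=1` moves the index backward instead.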
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.174
LOOP/LOOPE/LOOPNE/LOOPNZ/LOOPZ
Loop Control with CX Counter
Opcode      Instruction    Clocks   Description
E2 cb       LOOP rel8      2,6      Decrements count; jumps short if CX ≠ 0.
E1 cb       LOOPE rel8     9,6      Decrements count; jumps short if CX ≠ 0 and ZF = 1.
E0 cb       LOOPNE rel8    9,6      Decrements count; jumps short if CX ≠ 0 and ZF = 0.
E0 cb       LOOPNZ rel8    9,6      Decrements count; jumps short if CX ≠ 0 and ZF = 0.
E1 cb       LOOPZ rel8     9,6      Decrements count; jumps short if CX ≠ 0 and ZF = 1.
Operation
IF AddressSize = 16 THEN CountReg is CX ELSE CountReg is ECX; FI;
CountReg ← CountReg – 1;
IF instruction ≠ LOOP
THEN
   IF (instruction = LOOPE) OR (instruction = LOOPZ)
   THEN BranchCond ← (ZF = 1) AND (CountReg ≠ 0); FI;
   IF (instruction = LOOPNE) OR (instruction = LOOPNZ)
   THEN BranchCond ← (ZF = 0) AND (CountReg ≠ 0); FI;
ELSE (* instruction = LOOP *)
   BranchCond ← (CountReg ≠ 0);
FI;
IF BranchCond
THEN
   IF OperandSize = 16
   THEN
      IP ← IP + SignExtend(rel8);
   ELSE (* OperandSize = 32 *)
      EIP ← EIP + SignExtend(rel8);
   FI;
FI
Description
LOOP instructions provide iteration control, combining loop index management with conditional branching. Load an unsigned iteration count into the count register, then code the
LOOP instruction at the end of the iterative instruction series. Make the LOOP destination
the label at the beginning of the iteration. When executed, LOOP decrements the CX or
ECX register without changing any flags. Then it checks the register and, if required, ZF.
If the conditions are met, LOOP executes a short jump to the label. The address-size
attribute determines whether to use the CX (16-bit) or ECX (32-bit) register as the count
register. The LOOP operand must be in the range from 128 (decimal) bytes before the
instruction to 127 bytes after the instruction.
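The decrement-then-test behavior can be sketched in Python as follows. The helper names `loop_step` and `loop_iterations` are hypothetical; the model covers only the count register and the branch decision, not the jump itself.

```python
def loop_step(kind: str, count: int, zf: int, address_size: int = 16):
    """Model one LOOP-family instruction.

    Returns (new count register value, True if the short jump is taken).
    """
    mask = 0xFFFF if address_size == 16 else 0xFFFFFFFF
    # The count register is decremented first, without changing any flags
    count = (count - 1) & mask
    if kind == "LOOP":
        taken = count != 0
    elif kind in ("LOOPE", "LOOPZ"):
        taken = zf == 1 and count != 0
    elif kind in ("LOOPNE", "LOOPNZ"):
        taken = zf == 0 and count != 0
    else:
        raise ValueError("not a LOOP-family instruction")
    return count, taken

def loop_iterations(initial_cx: int) -> int:
    """Count how many times a loop body closed by LOOP executes."""
    count, executed = initial_cx, 0
    while True:
        executed += 1
        count, taken = loop_step("LOOP", count, zf=0)
        if not taken:
            return executed
```

Because the decrement happens before the test, an initial count of 0 in CX wraps to 0FFFFh and the body runs 65,536 times in this model, so code that may see a zero count usually guards the loop with a separate test.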
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the offset is beyond the current code segment limits.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: The unconditional LOOP instruction takes longer to execute than a 2-instruction
sequence that decrements the count register and jumps if the count does not equal zero.
All branches are converted into 16-byte code fetches regardless of jump address or
cacheability.
2.175
LSL
Loads Segment Limit
Opcode      Instruction      Clocks   Description
0F 03 /r    LSL r16,r/m16    10/10    r16 ← segment limit, selector r/m16 (byte granular)
0F 03 /r    LSL r32,r/m32    10/10    r32 ← segment limit, selector r/m32 (byte granular)
0F 03 /r    LSL r16,r/m16    10/10    r16 ← segment limit, selector r/m16 (page granular)
0F 03 /r    LSL r32,r/m32    10/10    r32 ← segment limit, selector r/m32 (page granular)
Description
If the descriptor for the source selector is visible at the CPL and RPL, and the
descriptor is a type accepted by LSL, the instruction loads a register with an unscrambled
segment limit and sets ZF. Otherwise, ZF is cleared and the destination register is unchanged.
The segment limit loads as a byte-granular value. If the descriptor has a page-granular
segment limit, LSL translates it to a byte limit before loading it into the destination
register (it shifts the 20-bit “raw” limit from the descriptor left by 12 bits, then ORs the
result with 00000FFFh).
The 32-bit forms of the LSL instruction store the 32-bit byte granular limit in the 32-bit
destination register. Code and data segment descriptors are valid for the LSL instruction.
The valid special segment and gate descriptor types for LSL are in the following table:
Type   Name                      Valid/Invalid
0      Invalid                   Invalid
1      Available 80286 TSS       Valid
2      LDT                       Valid
3      Busy 80286 TSS            Valid
4      80286 call gate           Invalid
5      80286/486 task gate       Invalid
6      80286 trap gate           Invalid
7      80286 interrupt gate      Invalid
8      Invalid                   Invalid
9      Available 486 TSS         Valid
A      Invalid                   Invalid
B      Busy 486 TSS              Valid
C      486 call gate             Invalid
D      Invalid                   Invalid
E      486 trap gate             Invalid
F      486 interrupt gate        Invalid
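The page-granular limit translation described in the LSL description (shift the 20-bit raw limit left by 12 bits, then OR with 00000FFFh) can be sketched in Python. The function name `lsl_byte_limit` is hypothetical, and the sketch models only the limit arithmetic, not the visibility or type checks.

```python
def lsl_byte_limit(raw_limit: int, page_granular: bool) -> int:
    """Return the byte-granular limit LSL would load for a descriptor."""
    raw_limit &= 0xFFFFF          # the descriptor holds a 20-bit raw limit
    if page_granular:
        # Scale 4-Kbyte units to bytes and fill the low 12 bits with ones
        return (raw_limit << 12) | 0x00000FFF
    return raw_limit
```

A page-granular raw limit of FFFFFh therefore translates to FFFFFFFFh (a 4-Gbyte segment), and a page-granular raw limit of 0 still covers one full page (00000FFFh).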
Flags Affected
If the selector is invisible or of the wrong type, LSL clears ZF; otherwise, it is set.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
Invalid Opcode (6) occurs because LSL is not recognized in Real Address Mode.
Virtual 8086 Mode Exceptions
Invalid Opcode (6) occurs because LSL is not recognized in Virtual 8086 Mode. If CPL is
3, Alignment Check (17) indicates there is an unaligned memory reference.
2.176
LSS
Loads Pointer Using SS
Opcode      Instruction       Clocks   Description
0F B2 /r    LSS r16,m16:16    6/12     Loads SS:r16 with pointer from memory.
0F B2 /r    LSS r32,m16:32    6/12     Loads SS:r32 with pointer from memory.
Operation
IF (OperandSize = 16)
THEN
r16 ← [Effective Address]; (* 16-bit transfer *)
SS ← [Effective Address + 2]; (* 16-bit transfer *)
(* In Protected Mode, load the descriptor into the segment register *)
ELSE (* OperandSize = 32 *)
r32 ← [Effective Address]; (* 32-bit transfer *)
SS ← [Effective Address + 4]; (* 16-bit transfer *)
(* In Protected Mode, load the descriptor into the segment register *)
FI;
IF selector is null THEN General Protection Fault; FI;
Selector index is in limits ELSE General Protection Fault(selector);
Selector’s RPL = CPL ELSE General Protection Fault(selector);
AR byte indicates a writable data segment
ELSE General Protection Fault(selector);
DPL in the AR byte equals CPL ELSE General Protection Fault(selector);
Segment is marked present ELSE Stack Fault(selector);
Load SS with selector;
Load SS with descriptor;
Description
LSS reads a full pointer from memory and stores it in a register pair consisting of the SS
register and a second operand-specified register. The 16-bit selector is loaded into SS, and
the 16- or 32-bit offset (as specified by the operand size) is placed into the register specified
by the r16 or r32 register operand. The descriptor for the segment register comes from the
descriptor table entry for the selector.
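The Protected Mode checks listed in the Operation section for an SS load can be sketched as a Python function that applies them in the same order. Everything here is a simplified model: the `Descriptor` fields, the dictionary standing in for a descriptor table, and the helper names are hypothetical, and the table-indicator bit of the selector is ignored.

```python
from dataclasses import dataclass

GENERAL_PROTECTION, STACK_FAULT = 13, 12

class Fault(Exception):
    """Raised when a modelled protection check fails."""
    def __init__(self, vector: int, selector: int):
        super().__init__(f"exception {vector}, selector {selector:#06x}")
        self.vector = vector
        self.selector = selector

@dataclass
class Descriptor:
    writable_data: bool   # AR byte indicates a writable data segment
    dpl: int
    present: bool

def check_ss_load(selector: int, cpl: int, table: dict) -> Descriptor:
    """Apply the SS load checks in the order given in the Operation listing."""
    if (selector & 0xFFFC) == 0:                 # null selector
        raise Fault(GENERAL_PROTECTION, 0)
    index = selector >> 3
    if index not in table:                       # index outside table limits
        raise Fault(GENERAL_PROTECTION, selector)
    desc = table[index]
    if (selector & 3) != cpl:                    # RPL must equal CPL
        raise Fault(GENERAL_PROTECTION, selector)
    if not desc.writable_data:
        raise Fault(GENERAL_PROTECTION, selector)
    if desc.dpl != cpl:                          # DPL must equal CPL
        raise Fault(GENERAL_PROTECTION, selector)
    if not desc.present:
        raise Fault(STACK_FAULT, selector)
    return desc

def fault_vector(selector: int, cpl: int, table: dict):
    """Return the exception vector raised by the load, or None if it succeeds."""
    try:
        check_ss_load(selector, cpl, table)
    except Fault as fault:
        return fault.vector
    return None

EXAMPLE_TABLE = {
    1: Descriptor(writable_data=True, dpl=0, present=True),
    2: Descriptor(writable_data=False, dpl=0, present=True),
    3: Descriptor(writable_data=True, dpl=0, present=False),
}
```

Note that a not-present stack segment reports Stack Fault (12) rather than Segment Not Present (11), which is the main difference from the checks applied to the other data segment registers.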
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Invalid Opcode (6) indicates the second
operand is a register. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check
(17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.177
LTR
Loads Task Register
Opcode      Instruction    Clocks   Description
0F 00 /3    LTR r/m16      20/20    Loads EA word into task register.
Description
The LTR instruction loads the task register from the source register or memory location
specified by the operand. The loaded TSS is marked busy. A task switch does not occur.
Note: The LTR instruction is used only in operating system software. It is not used in
application programs.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the current privilege level is not 0 or that
there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. General Protection Fault (13) with a
selector indicates the object named by the source selector is not a TSS or is already busy.
Segment Not Present (11) with a selector indicates the TSS is marked “not present.” Page
Fault (14) indicates a page fault.
Real Address Mode Exceptions
Invalid Opcode (6) occurs because the LTR instruction is not recognized in Real Address
Mode.
Virtual 8086 Mode Exceptions
Invalid Opcode (6) occurs because the LTR instruction is not recognized in Virtual 8086
Mode.
Note: The operand-size attribute has no effect on this instruction.
2.178
MOV
Moves Data/Registers
Opcode      Instruction                Clocks   Description
88 /r       MOV r/m8,r8                1        Moves byte register to r/m byte.
89 /r       MOV r/m16,r16              1        Moves word register to r/m word.
89 /r       MOV r/m32,r32              1        Moves doubleword register to r/m doubleword.
8A /r       MOV r8,r/m8                1        Moves r/m byte to byte register.
8B /r       MOV r16,r/m16              1        Moves r/m word to word register.
8B /r       MOV r32,r/m32              1        Moves r/m doubleword to doubleword register.
8C /r       MOV r/m16,Sreg             3/3      Moves segment register to r/m word.
8E /r       MOV Sreg,r/m16             3/9      Moves r/m word to segment register.
A0          MOV AL,moffs8              1        Moves byte at (seg:offset) to AL.
A1          MOV AX,moffs16             1        Moves word at (seg:offset) to AX.
A1          MOV EAX,moffs32            1        Moves doubleword at (seg:offset) to EAX.
A2          MOV moffs8,AL              1        Moves AL to (seg:offset).
A3          MOV moffs16,AX             1        Moves AX to (seg:offset).
A3          MOV moffs32,EAX            1        Moves EAX to (seg:offset).
B0 + rb     MOV reg8,imm8              1        Moves immediate byte to register.
B8 + rw     MOV reg16,imm16            1        Moves immediate word to register.
B8 + rd     MOV reg32,imm32            1        Moves immediate doubleword to register.
C6          MOV r/m8,imm8              1        Moves immediate byte to r/m byte.
C7          MOV r/m16,imm16            1        Moves immediate word to r/m word.
C7          MOV r/m32,imm32            1        Moves immediate doubleword to r/m doubleword.
Special Registers:
0F 22 /r    MOV CR0,r32                16       Moves (register) to (control register).
0F 20 /r    MOV r32,CR0/CR2/CR3        4        Moves (control register) to (register).
0F 22 /r    MOV CR2/CR3,r32            4        Moves (register) to (control register).
0F 21 /r    MOV r32,DR0/DR1/DR2/DR3    10       Moves (debug register) to (register).
0F 21 /r    MOV r32,DR6/DR7            10       Moves (debug register) to (register).
0F 23 /r    MOV DR0/DR1/DR2/DR3,r32    11       Moves (register) to (debug register).
0F 23 /r    MOV DR6/DR7,r32            11       Moves (register) to (debug register).
0F 24 /r    MOV r32,TR4/TR5/TR6/TR7    4        Moves (test register) to (register).
0F 26 /r    MOV TR4/TR5/TR6/TR7,r32    4        Moves (register) to (test register).
0F 24 /r    MOV r32,TR3                3        Moves (test register 3) to (register).
0F 26 /r    MOV TR3,r32                6        Moves (register) to (test register 3).
Note: moffs8, moffs16, and moffs32 all consist of a simple offset relative to the segment
base. The 8, 16, and 32 refer to the data size. The address-size attribute of the instruction
determines the size of the offset, either 16 or 32 bits.
Operation
DEST ← SRC
Description
The MOV instruction copies the second operand to the first operand. If the destination is
a segment register (DS, ES, SS, etc.), then descriptor data is also loaded into the register.
The data for the register is obtained from the descriptor table entry for the selector given.
You can load a null selector (values 0000–0003) into the DS and ES registers without
causing an exception; however, any subsequent attempt to reference memory through a DS or ES
register that holds a null selector causes a General Protection Fault (13) exception and no
memory reference occurs. A MOV into SS instruction inhibits
all interrupts until after the execution of the next instruction (which is presumably a MOV
into ESP instruction).
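The range 0000–0003 for null selectors follows from the selector layout: a 13-bit index, a table-indicator bit, and a 2-bit RPL. The Python sketch below illustrates this under that assumed layout; the helper names `selector_fields` and `is_null_selector` are hypothetical.

```python
def selector_fields(selector: int) -> dict:
    """Split a 16-bit selector into its index, table indicator, and RPL."""
    selector &= 0xFFFF
    return {
        "index": selector >> 3,       # descriptor table index
        "ti": (selector >> 2) & 1,    # 0 = GDT, 1 = LDT
        "rpl": selector & 3,          # requested privilege level
    }

def is_null_selector(selector: int) -> bool:
    """A null selector names entry 0 of the GDT; the RPL bits are ignored."""
    fields = selector_fields(selector)
    return fields["index"] == 0 and fields["ti"] == 0
```

Selectors 0000 through 0003 differ only in RPL and are all null, whereas selector 0004 refers to entry 0 of the LDT and is not null.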
Loading a segment register under Protected Mode results in special checks and actions,
as described in the following listing:
IF SS is loaded
THEN
   IF selector is null THEN General Protection Fault; FI;
   Index must be within limits ELSE General Protection Fault(selector);
   Selector’s RPL equals CPL ELSE General Protection Fault(selector);
   AR byte indicates a writable data segment
      ELSE General Protection Fault(selector);
   DPL in the AR byte equals CPL ELSE General Protection Fault(selector);
   Segment is marked present ELSE Stack Fault(selector);
   Load SS with selector;
   Load SS with descriptor;
FI;
IF DS, ES, FS or GS is loaded with non-null selector
THEN
   Index is within limits ELSE General Protection Fault(selector);
   AR byte indicates data or readable code segment
      ELSE General Protection Fault(selector);
   IF data or non-conforming code segment
   THEN RPL and CPL are ≤ DPL in AR byte
   ELSE General Protection Fault(selector); FI;
   Segment is marked present ELSE Segment Not Present Fault(selector);
   Load segment register with selector;
   Load segment register with descriptor;
FI;
IF DS, ES, FS or GS is loaded with a null selector
THEN
   Load segment register with selector;
   Clear descriptor valid bit;
FI
The last eleven listed forms of the MOV instruction store or load the following special
registers in or from a general purpose register:
• Control registers CR0, CR2, and CR3
• Debug registers DR0, DR1, DR2, DR3, DR6, and DR7
• Test registers TR3, TR4, TR5, TR6, and TR7
Note: 32-bit operands are always used with these instructions, regardless of the operand-size
attribute.
Flags Affected
MOV data: None
MOV register: OF, SF, ZF, AF, PF, and CF are undefined.
Protected Mode Exceptions
MOV data: General Protection Fault (13), Stack Fault (12), and Segment Not Present (11)
occur if a segment register is being loaded; otherwise, General Protection Fault (13) indicates either that the result is in a non-writable segment or there is an illegal memory-operand
effective address in the code or data segments. Stack Fault (12) indicates an illegal SS
segment address. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17)
indicates there is an unaligned memory reference.
MOV register: General Protection Fault (13) indicates the current privilege level is not 0.
Real Address Mode Exceptions
MOV data: General Protection Fault (13) indicates that part of the operand lies outside the
effective address space: 0 to 0FFFFh.
MOV register: None
Virtual 8086 Mode Exceptions
MOV data: General Protection Fault (13) indicates that part of the operand lies outside the
effective address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3,
Alignment Check (17) indicates there is an unaligned memory reference.
MOV register: General Protection Fault (13) occurs if instruction execution is attempted.
Note: MOV register instructions must be executed at privilege level 0 or in Real Address
Mode; otherwise, a protection exception will be raised. The reg field within the ModR/M
byte specifies which of the special registers in each category is involved. The two bits in
the mod field are always 11. The r/m field specifies the general register involved. Always
set undefined or reserved bits to the value previously read.
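The ModR/M layout described in the note (mod = 11, reg = special register number, r/m = general register) can be illustrated with a small Python encoder. The function name `encode_mov_special` and the lookup tables are hypothetical helpers; the second opcode bytes are taken from the opcode table for this instruction, and the general-register numbering (EAX = 0 through EDI = 7) is the standard encoding order.

```python
GENERAL_REGS = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

# Second opcode byte, keyed by (register class, direction)
SECOND_OPCODE = {
    ("CR", "load"): 0x22, ("CR", "store"): 0x20,
    ("DR", "load"): 0x23, ("DR", "store"): 0x21,
    ("TR", "load"): 0x26, ("TR", "store"): 0x24,
}

def encode_mov_special(reg_class: str, number: int, general: str,
                       load_special: bool) -> bytes:
    """Encode MOV between a special register and a 32-bit general register.

    load_special=True encodes MOV special,r32; False encodes MOV r32,special.
    """
    direction = "load" if load_special else "store"
    # mod field is always 11b; reg selects the special register,
    # r/m selects the general register
    modrm = 0xC0 | ((number & 7) << 3) | GENERAL_REGS.index(general)
    return bytes([0x0F, SECOND_OPCODE[(reg_class, direction)], modrm])
```

For instance, `encode_mov_special("CR", 0, "EAX", True)` produces the bytes 0F 22 C0 for MOV CR0,EAX, and passing `False` gives 0F 20 C0 for MOV EAX,CR0.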
2.179
MOVS/MOVSB/MOVSD/MOVSW
Moves Data from String to String
Opcode      Instruction      Clocks   Description
A4          MOVS m8,m8       7        Moves byte at (E)SI to ES:(E)DI.
A5          MOVS m16,m16     7        Moves word at (E)SI to ES:(E)DI.
A5          MOVS m32,m32     7        Moves doubleword at (E)SI to ES:(E)DI.
A4          MOVSB            7        Moves byte at (E)SI to ES:(E)DI.
A5          MOVSD            7        Moves doubleword at (E)SI to ES:(E)DI.
A5          MOVSW            7        Moves word at (E)SI to ES:(E)DI.
Operation
IF (instruction = MOVSD) OR (instruction has doubleword operands)
THEN OperandSize ← 32;
ELSE OperandSize ← 16;
FI;
IF AddressSize = 16
THEN use SI for source-index and DI for destination-index;
ELSE (* AddressSize = 32 *)
   use ESI for source-index and EDI for destination-index;
FI;
IF byte type of instruction
THEN
   [destination-index] ← [source-index]; (* byte assignment *)
   IF DF = 0 THEN IncDec ← 1 ELSE IncDec ← –1; FI;
ELSE
   IF OperandSize = 16
   THEN
      [destination-index] ← [source-index]; (* word assignment *)
      IF DF = 0 THEN IncDec ← 2 ELSE IncDec ← –2; FI;
   ELSE (* OperandSize = 32 *)
      [destination-index] ← [source-index]; (* doubleword assignment *)
      IF DF = 0 THEN IncDec ← 4 ELSE IncDec ← –4; FI;
   FI;
FI;
source-index ← source-index + IncDec;
destination-index ← destination-index + IncDec
Description
MOVS copies the byte, word, or doubleword at SI or ESI to the byte, word, or doubleword
at ES:DI or ES:EDI. The destination operand must be addressable from the ES register;
no segment override is possible for the destination. You can use a segment override for
the source operand; the default is the DS register. The contents of SI and DI (or ESI and
EDI for a 32-bit address size) determine the source and destination addresses. Load the correct
index values into the SI and DI (or ESI and EDI) registers before executing the MOVS
instruction. After moving the data, MOVS advances the SI and DI (or ESI and EDI) registers
automatically. If the Direction Flag (DF) is 0 (see CLD), the registers increment; if DF is 1
(see STD), the registers decrement. The stepping is 1 for a byte, 2 for a word, or 4 for a
doubleword operand.
MOVSB, MOVSW, and MOVSD are synonyms for the byte, word, and doubleword MOVS
instructions.
You can use the REP prefix with MOVS to move a block of CX (or ECX) bytes, words, or doublewords.
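A repeated MOVS can be modelled in Python as shown below. The sketch treats memory as one flat `bytearray` shared by source and destination and ignores segmentation, which is a simplification; the name `rep_movs` and its parameters are hypothetical.

```python
def rep_movs(memory: bytearray, si: int, di: int, count: int,
             size: int = 1, df: int = 0, address_size: int = 16):
    """Model REP MOVSB/MOVSW/MOVSD; returns the final (SI, DI) values."""
    mask = 0xFFFF if address_size == 16 else 0xFFFFFFFF
    step = -size if df else size        # DF selects the stepping direction
    while count:
        # Copy one element from the source index to the destination index
        memory[di:di + size] = memory[si:si + size]
        si = (si + step) & mask
        di = (di + step) & mask
        count -= 1                      # REP decrements the count each pass
    return si, di

forward = bytearray(b"HELLO.....")
forward_result = rep_movs(forward, 0, 5, 5)          # DF = 0, ascending

backward = bytearray(b"HELLO.....")
backward_result = rep_movs(backward, 4, 9, 5, df=1)  # DF = 1, descending
```

Both calls copy the same five bytes; they differ only in the starting indexes and in where SI and DI end up, which matters when the source and destination regions overlap.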
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.180
MOVSX
Moves with Sign Extension
Opcode      Instruction        Clocks   Description
0F BE /r    MOVSX r16,r/m8     3/3      Moves byte to word with sign-extend.
0F BE /r    MOVSX r32,r/m8     3/3      Moves byte to doubleword with sign-extend.
0F BF /r    MOVSX r32,r/m16    3/3      Moves word to doubleword with sign-extend.
Operation
DEST ← SignExtend(SRC)
Description
The MOVSX instruction reads the contents of the effective address or register as a byte or
a word, sign-extends the value to the operand-size attribute of the instruction (16 or 32
bits), and stores the result in the destination register.
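Sign extension copies the most-significant bit of the source into every added high-order bit. A short Python sketch, using a hypothetical helper named `sign_extend`, shows the result for each MOVSX form:

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend a from_bits-wide value to to_bits, as MOVSX does."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        # Sign bit set: fill every bit above the source width with ones
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value
```

For example, the byte 80h becomes FF80h as a word and FFFFFF80h as a doubleword, while 7Fh is unchanged because its sign bit is clear.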
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.181
MOVZX
Moves with Zero Extension
Opcode      Instruction        Clocks   Description
0F B6 /r    MOVZX r16,r/m8     3/3      Moves byte to word with zero-extend.
0F B6 /r    MOVZX r32,r/m8     3/3      Moves byte to doubleword with zero-extend.
0F B7 /r    MOVZX r32,r/m16    3/3      Moves word to doubleword with zero-extend.
Operation
DEST ← ZeroExtend(SRC)
Description
The MOVZX instruction reads the contents of the effective address or register as a byte or
a word, zero-extends the value to the operand-size attribute of the instruction (16 or 32
bits), and stores the result in the destination register.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.182
MUL
Unsigned Multiply
Opcode   Instruction      Clocks         Description
F6 /4    MUL AL,r/m8      13/18,13/18    Unsigned multiply (AX ← AL ⋅ r/m byte)
F7 /4    MUL AX,r/m16     13/26,13/26    Unsigned multiply (DX:AX ← AX ⋅ r/m word)
F7 /4    MUL EAX,r/m32    13/42,13/42    Unsigned multiply (EDX:EAX ← EAX ⋅ r/m doubleword)
Actual clock count depends on the most-significant bit location in the optimizing multiplier. If the multiplier (m)
= 0, the clock count is 9. Otherwise clock = max (ceiling(log2 |m|), 3) + 6.
Operation
IF byte-size operation
THEN AX ← AL ⋅ r/m8
ELSE (* word or doubleword operation *)
   IF OperandSize = 16
   THEN DX:AX ← AX ⋅ r/m16
   ELSE (* OperandSize = 32 *)
      EDX:EAX ← EAX ⋅ r/m32
   FI;
FI
Description
The MUL instruction performs unsigned multiplication. Its actions depend on the size of its
operand, as follows:
• A byte operand is multiplied by the AL value; the result is left in the AX register. The CF
and OF flags are cleared if the AH value is 0; otherwise, they are set.
• A word operand is multiplied by the AX value; the result is left in the DX:AX register pair.
The DX register contains the high-order 16 bits of the product. CF and OF are cleared
if the DX value is 0; otherwise, they are set.
• A doubleword operand is multiplied by the EAX value; the result is left in the EDX:EAX
register pair. The EDX register contains the high-order 32 bits of the product. CF and OF are
cleared if the EDX value is 0; otherwise, they are set.
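The three cases above follow one pattern: the product is twice as wide as the operand, and CF and OF report whether the upper half is nonzero. A minimal Python sketch, assuming a hypothetical helper `mul` that takes the accumulator value, the operand, and the operand width in bits:

```python
def mul(acc: int, operand: int, bits: int):
    """Unsigned multiply; returns (low_half, high_half, cf_of)."""
    mask = (1 << bits) - 1
    product = (acc & mask) * (operand & mask)
    low = product & mask               # left in AL, AX, or EAX
    high = (product >> bits) & mask    # left in AH, DX, or EDX
    cf_of = 1 if high != 0 else 0      # CF and OF always take the same value
    return low, high, cf_of


print(mul(0x10, 0x10, 8))        # byte form: AL * r/m8
print(mul(0xFFFF, 0xFFFF, 16))   # word form: AX * r/m16
```

Testing CF (or OF) after MUL is therefore a quick way to tell whether the product fits in the lower half alone.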
Flags Affected
OF and CF are cleared if the upper half of the result is 0; otherwise they are set. SF, ZF,
AF, and PF are undefined.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.183
NEG
Two’s Complement Negation
Opcode   Instruction   Clocks   Description
F6 /3    NEG r/m8      1/3      Performs a two’s complement negation of r/m byte.
F7 /3    NEG r/m16     1/3      Performs a two’s complement negation of r/m word.
F7 /3    NEG r/m32     1/3      Performs a two’s complement negation of r/m doubleword.
Operation
IF r/m = 0 THEN CF ← 0 ELSE CF ← 1; FI;
r/m ← –r/m
Description
The NEG instruction replaces the value of a register or memory operand with its two’s
complement. The operand is subtracted from zero and the result is placed in the operand.
NEG sets CF if the operand is not zero. If the operand is zero, NEG clears CF.
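As a rough illustration of the subtraction from zero and the CF rule, the following Python sketch models the operand as an integer of a given width. The helper name `neg` is hypothetical, and the other flags are not modeled.

```python
def neg(value: int, bits: int):
    """Two's complement negation; returns (result, cf)."""
    mask = (1 << bits) - 1
    value &= mask
    result = (0 - value) & mask   # operand subtracted from zero, truncated to width
    cf = 0 if value == 0 else 1   # CF is cleared only for a zero operand
    return result, cf


print(neg(1, 8))
print(neg(0, 8))
print(neg(0x80, 8))
```

The last call shows that the most negative value of a given width negates to itself, because its positive counterpart cannot be represented in that width.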
Flags Affected
CF is set unless the operand is zero. OF, SF, ZF, AF, and PF are set according to the result.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.184
NOP
No Operation
Opcode   Instruction   Clocks   Description
90       NOP           1        No operation is performed.
Description
The NOP instruction performs no operation. The NOP instruction is a 1-byte instruction that
takes up space but affects none of the machine context except the instruction pointer.
The NOP instruction is an alias mnemonic for the XCHG AX, AX or XCHG EAX, EAX
instruction.
Flags Affected
None
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2.185
NOT
One’s Complement Negation
Opcode   Instruction   Clocks   Description
F6 /2    NOT r/m8      1/3      Reverses each bit in r/m byte.
F7 /2    NOT r/m16     1/3      Reverses each bit in r/m word.
F7 /2    NOT r/m32     1/3      Reverses each bit in r/m doubleword.
Operation
r/m ← NOT r/m
Description
The NOT instruction inverts the operand; every 1 becomes a 0, and vice versa.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.186
OR
Logical Inclusive OR
Opcode      Instruction        Clocks   Description
0C ib       OR AL,imm8         1        ORs immediate byte to AL.
0D iw       OR AX,imm16        1        ORs immediate word to AX.
0D id       OR EAX,imm32       1        ORs immediate doubleword to EAX.
80 /1 ib    OR r/m8,imm8       1/3      ORs immediate byte to r/m byte.
81 /1 iw    OR r/m16,imm16     1/3      ORs immediate word to r/m word.
81 /1 id    OR r/m32,imm32     1/3      ORs immediate doubleword to r/m doubleword.
83 /1 ib    OR r/m16,imm8      1/3      ORs sign-extended immediate byte to r/m word.
83 /1 ib    OR r/m32,imm8      1/3      ORs sign-extended immediate byte to r/m doubleword.
08 /r       OR r/m8,r8         1/3      ORs byte register to r/m byte.
09 /r       OR r/m16,r16       1/3      ORs word register to r/m word.
09 /r       OR r/m32,r32       1/3      ORs doubleword register to r/m doubleword.
0A /r       OR r8,r/m8         1/2      ORs r/m byte to byte register.
0B /r       OR r16,r/m16       1/2      ORs r/m word to word register.
0B /r       OR r32,r/m32       1/2      ORs r/m doubleword to doubleword register.
Operation
DEST ← DEST OR SRC;
CF ← 0;
OF ← 0
Description
The OR instruction computes the inclusive OR of its two operands and places the result in
the first operand. Each bit of the result is 0 if both corresponding bits of the operands are
0; otherwise, each bit is 1.
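A short Python sketch of the bitwise rule and the resulting flag values may help. The helper `or_op` and its dictionary of flags are illustrative assumptions; AF is left out because it is undefined after OR.

```python
def or_op(dest: int, src: int, bits: int):
    """Inclusive OR; returns (result, flags) for the defined flags only."""
    mask = (1 << bits) - 1
    result = (dest | src) & mask
    flags = {
        "CF": 0,                                   # always cleared
        "OF": 0,                                   # always cleared
        "ZF": 1 if result == 0 else 0,
        "SF": (result >> (bits - 1)) & 1,          # copy of the top result bit
        # PF is set when the low byte of the result has an even number of 1 bits.
        "PF": 1 if bin(result & 0xFF).count("1") % 2 == 0 else 0,
    }
    return result, flags


print(or_op(0x0F, 0xF0, 8))
print(or_op(0x0000, 0x0000, 16))
```

A common idiom that relies on this behavior is ORing a register with itself, which leaves the value unchanged and sets ZF and SF from it.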
Flags Affected
OF and CF are cleared. SF, ZF, and PF are set according to the result. AF is undefined.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.187
OUT
Outputs to Port
Opcode   Instruction     Description
E6 ib    OUT imm8,AL     Outputs byte AL to immediate port number.
E7 ib    OUT imm8,AX     Outputs word AX to immediate port number.
E7 ib    OUT imm8,EAX    Outputs doubleword EAX to immediate port number.
EE       OUT DX,AL       Outputs byte AL to port number in DX.
EF       OUT DX,AX       Outputs word AX to port number in DX.
EF       OUT DX,EAX      Outputs doubleword EAX to port number in DX.
Clocks* (all forms): rm = 16, vm = 29. If CPL ≤ IOPL, pm = 11,10. If CPL > IOPL, pm = 31,30.
*rm is Real Mode, vm is Virtual 8086 Mode, pm is Protected Mode. For pm, the first number is the
value for the imm8 form, and the second number is for the DX form of the port number.
Operation
IF (PE = 1) AND ((VM = 1) OR (CPL > IOPL))
THEN (* Virtual 8086 Mode, or Protected Mode with CPL > IOPL *)
   IF NOT I/O-Permission (DEST, width(DEST))
   THEN General Protection Fault (13);
   FI;
FI;
[DEST] ← SRC; (* I/O address space used *)
Description
The OUT instruction transfers a data byte, word, or doubleword from the register (AL, AX, or EAX)
given as the second operand to the output port numbered by the first operand. Output to
any port from 0 to 65535 is performed by placing the port number in the DX register and
then using an OUT instruction with the DX register as the first operand. If the instruction
contains an 8-bit port ID, that value is zero-extended to 16 bits.
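The permission test in the Operation pseudocode can be sketched in Python as follows. This is an illustration under stated assumptions: the I/O permission bitmap is modeled as a `bytes` object in which bit n of the map corresponds to port n, ports beyond the end of the supplied map are treated as not permitted, and the function names `io_permitted` and `out_allowed` are hypothetical.

```python
def io_permitted(bitmap: bytes, port: int, width: int) -> bool:
    """True if every permission bit for ports [port, port + width) is 0."""
    for p in range(port, port + width):
        byte_index, bit_index = divmod(p, 8)
        if byte_index >= len(bitmap):
            return False          # assumption: outside the map means denied
        if (bitmap[byte_index] >> bit_index) & 1:
            return False          # a 1 bit denies access to that port
    return True


def out_allowed(pe: int, vm: int, cpl: int, iopl: int,
                bitmap: bytes, port: int, width: int) -> bool:
    """Mirror of the IF (PE = 1) AND ((VM = 1) OR (CPL > IOPL)) test."""
    if pe == 1 and (vm == 1 or cpl > iopl):
        return io_permitted(bitmap, port, width)
    return True                   # Real Mode, or CPL <= IOPL: no bitmap check


bitmap = bytes([0b00000100])      # only port 2 is denied
print(out_allowed(1, 0, 3, 0, bitmap, 0, 2))   # ports 0-1
print(out_allowed(1, 0, 3, 0, bitmap, 1, 2))   # ports 1-2
```

A word or doubleword transfer spans two or four consecutive port addresses, so a single denied bit anywhere in that range is enough to cause the fault.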
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates the current privilege level is higher (has less privilege) than the I/O privilege level, and any of the corresponding I/O permission bits in the
TSS equals 1.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that one of the corresponding I/O permission bits
in the TSS equals 1.
2.188
OUTS/OUTSB/OUTSD/OUTSW
Output String to Port
Opcode   Instruction      Description
6E       OUTS DX,r/m8     Outputs byte at (E)SI to port in DX.
6F       OUTS DX,r/m16    Outputs word at (E)SI to port in DX.
6F       OUTS DX,r/m32    Outputs doubleword at (E)SI to port in DX.
6E       OUTSB            Outputs byte at (E)SI to port in DX.
6F       OUTSW            Outputs word at (E)SI to port in DX.
6F       OUTSD            Outputs doubleword at (E)SI to port in DX.
Clocks (all forms): rm = 17, vm = 30. If CPL ≤ IOPL, pm = 10. If CPL > IOPL, pm = 32.
Operation
IF AddressSize = 16
THEN use SI for source-index;
ELSE (* AddressSize = 32 *)
   use ESI for source-index;
FI;
IF (PE = 1) AND ((VM = 1) OR (CPL > IOPL))
THEN (* Virtual 8086 Mode, or Protected Mode with CPL > IOPL *)
   IF NOT I/O-Permission (DEST, width(DEST))
   THEN General Protection Fault (13);
   FI;
FI;
IF byte type of instruction
THEN
   [DX] ← [source-index]; (* Write byte at DX I/O address *)
   IF DF = 0 THEN IncDec ← 1 ELSE IncDec ← –1; FI;
ELSE
   IF OperandSize = 16
   THEN
      [DX] ← [source-index]; (* Write word at DX I/O address *)
      IF DF = 0 THEN IncDec ← 2 ELSE IncDec ← –2; FI;
   ELSE (* OperandSize = 32 *)
      [DX] ← [source-index]; (* Write doubleword at DX I/O address *)
      IF DF = 0 THEN IncDec ← 4 ELSE IncDec ← –4; FI;
   FI;
FI;
source-index ← source-index + IncDec
Description
OUTS transfers data from the address indicated by the source-index register SI (16-bit
addresses) or ESI (32-bit addresses) to the output port addressed by the DX register. OUTS
does not allow specification of the port number as an immediate value. You must address
the port through the DX register value. Load the correct values into the DX register and the
source-index (SI or ESI) register before executing the OUTS instruction.
After the transfer, the source-index register advances automatically. If the Direction Flag
(DF) is 0 (see CLD), the source-index register increments; if DF is 1 (see STD), it decrements. The increment/decrement rate is 1 for a byte, 2 for a word, or 4 for a doubleword.
OUTSB, OUTSW, and OUTSD are synonyms for the byte, word, and doubleword OUTS
instructions.
You can use the REP prefix with the OUTS instruction for block output of (E)CX bytes, words, or doublewords.
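The interaction of the source index, the Direction Flag, and a repeat count can be sketched in Python. This is a simplified model, not a description of the hardware: memory is a `bytes` object, the port is represented by the list of values written to it, privilege checks are omitted, and the name `rep_outs` and its parameters are hypothetical.

```python
def rep_outs(memory: bytes, si: int, count: int, size: int, df: int,
             addr_bits: int = 16):
    """Return (values written to the port, final source index)."""
    step = -size if df else size            # DF = 1 walks downward
    addr_mask = (1 << addr_bits) - 1        # SI wraps at 16 bits, ESI at 32
    written = []
    while count != 0:
        # x86 stores multi-byte values least-significant byte first.
        written.append(int.from_bytes(memory[si:si + size], "little"))
        si = (si + step) & addr_mask
        count -= 1
    return written, si


data = bytes([0x11, 0x22, 0x33, 0x44])
print(rep_outs(data, 0, 2, 2, df=0))   # two words, ascending
print(rep_outs(data, 2, 2, 2, df=1))   # two words, descending
```

In the descending case the source index must start at the last element to be sent, not one past it.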
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates one of three conditions: the current privilege level
is greater than the I/O privilege level and at least one of the I/O permission bits in TSS
equals 1, the result destination is a non-writable segment, or the code or data segments
have an illegal memory-operand effective address. Stack Fault (12) indicates an illegal SS
segment address. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17)
indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that at least one of the corresponding I/O permission
bits in TSS equals 1. General Protection Fault (13) indicates that part of the operand lies
outside the effective address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
2.189
POP
Pops Word from Stack
Opcode     Instruction   Clocks   Description
8F /0      POP m16       6        Pops top of stack into memory word.
8F /0      POP m32       6        Pops top of stack into memory doubleword.
58 + rw    POP r16       4        Pops top of stack into word register.
58 + rd    POP r32       4        Pops top of stack into doubleword register.
1F         POP DS        3        Pops top of stack into DS.
07         POP ES        3        Pops top of stack into ES.
17         POP SS        3        Pops top of stack into SS.
0F A1      POP FS        3        Pops top of stack into FS.
0F A9      POP GS        3        Pops top of stack into GS.
Operation
IF StackAddrSize = 16
THEN
   IF OperandSize = 16
   THEN
      DEST ← (SS:SP); (* copy a word *)
      SP ← SP + 2;
   ELSE (* OperandSize = 32 *)
      DEST ← (SS:SP); (* copy a doubleword *)
      SP ← SP + 4;
   FI;
ELSE (* StackAddrSize = 32 *)
   IF OperandSize = 16
   THEN
      DEST ← (SS:ESP); (* copy a word *)
      ESP ← ESP + 2;
   ELSE (* OperandSize = 32 *)
      DEST ← (SS:ESP); (* copy a doubleword *)
      ESP ← ESP + 4;
   FI;
FI;
(* Protected Mode execution uses the following special checks and actions *)
IF SS is loaded:
   IF selector is null THEN General Protection Fault;
   Selector index is within its descriptor table limits
      ELSE General Protection Fault(selector);
   Selector’s RPL equals CPL ELSE General Protection Fault(selector);
   AR byte indicates writable data segment
      ELSE General Protection Fault(selector);
   DPL in the AR byte equals CPL ELSE General Protection Fault(selector);
   Segment must be marked present ELSE Stack Fault(selector);
   Load SS register with selector;
   Load SS register with descriptor;
IF DS, ES, FS or GS is loaded with non-null selector:
   AR byte must indicate data or readable code segment
      ELSE General Protection Fault(selector);
   IF data or non-conforming code
   THEN RPL and CPL must be less than or equal to DPL in
      AR byte
   ELSE General Protection Fault (13)(selector) FI;
   Segment must be marked present ELSE Segment Not Present (11)(selector);
   Load segment register with selector;
   Load segment register with descriptor;
IF DS, ES, FS, or GS is loaded with a null selector:
   Load segment register with selector
   Clear valid bit in invisible portion of register
Description
POP loads the word at the top of the processor stack into the destination specified by the
operand. The top of the stack is specified by the contents of SS and either stack pointer
register: SP for 16-bit addresses or ESP for 32-bit addresses. The stack pointer increments
by 2 for a 16-bit operand or by 4 for a 32-bit operand to point to the new top of stack.
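The read-then-increment behavior can be sketched in Python for a register or memory destination. The model is deliberately simple: the stack segment is a `bytearray`, the stack pointer is an offset into it, segment-register loads and all protection checks are left out, and the helper name `pop` is hypothetical.

```python
def pop(stack: bytearray, sp: int, operand_bits: int, addr_bits: int = 16):
    """Return (value popped, new stack pointer)."""
    size = operand_bits // 8                       # 2 or 4 bytes
    value = int.from_bytes(stack[sp:sp + size], "little")
    sp = (sp + size) & ((1 << addr_bits) - 1)      # SP wraps at 16 bits, ESP at 32
    return value, sp


stack = bytearray(16)
stack[8:12] = (0x12345678).to_bytes(4, "little")
print(pop(stack, 8, 16))   # word pop
print(pop(stack, 8, 32))   # doubleword pop
```

The data itself is not erased by the pop; only the stack pointer moves, so the old contents remain in memory until something overwrites them.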
If the destination operand is a segment register (DS, ES, FS, GS, or SS), the value popped
must be a selector. In Protected Mode, loading the selector initiates automatic loading of
the descriptor information associated with that selector into the hidden part of the segment
register; loading also initiates validation of both the selector and the descriptor information.
A null value (0000–0003) may be popped into the DS, ES, FS, or GS register without
causing a protection exception. An attempt to reference a segment whose corresponding
segment register is loaded with a null value causes a General Protection Fault (13) exception. No memory reference occurs. The saved value of the segment register is null.
A POP SS instruction inhibits all interrupts, including NMI, until after execution of the next
instruction. This allows sequential execution of POP SS and POP SP (or POP ESP) instructions without danger of having an invalid stack during an interrupt. However, use of
the LSS instruction is the preferred method of loading the SS and SP (or ESP) registers.
A POP-to-memory instruction that uses the stack pointer as a base register references
memory after the POP. The base is the value of the stack pointer after the instruction
executes.
Note: POP CS is not a 486-processor instruction; use RET to pop from the stack into CS.
Flags Affected
None
Protected Mode Exceptions
Segment Not Present (11) occurs if the segment descriptor indicates the segment is not
present in memory; a Stack Fault (12) and a General Protection Fault (13) occur automatically with this error. By itself, a Stack Fault (12) indicates either that the current top of stack
is not within the stack segment, or that the SS segment address is illegal. By itself, a General
Protection Fault (13) indicates either that the result is in a non-writable segment or there
is an illegal memory-operand effective address in the code or data segments. Page Fault
(14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates there is an unaligned
memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Note: Back-to-back PUSH/POP instruction sequences are allowed without incurring an
additional clock. The SSB bit determines the Stack Address Size. POP ESP instructions
increment the stack pointer (ESP) before data at the old top of stack is written into the
destination.
2.190
POPA
Pops All 16-Bit General Registers
Opcode   Instruction   Clocks   Description
61       POPA          9        Pops DI, SI, BP, BX, DX, CX, and AX.
Operation
DI ← Pop();
SI ← Pop();
BP ← Pop();
Increment SP by 2 (* skip next 2 bytes of stack *)
BX ← Pop();
DX ← Pop();
CX ← Pop();
AX ← Pop()
Description
POPA pops the eight 16-bit general registers, but it discards the SP value instead of loading
it into the SP register. POPA reverses a previous PUSHA, restoring the general registers
to their values before the PUSHA instruction was executed. POPA pops the DI register first.
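A PUSHA/POPA round trip can be sketched in Python to show why the saved SP slot is skipped. The model is an assumption for illustration: registers are held in a dictionary, the stack is a Python list of 16-bit words, and the helper names `pusha` and `popa` are hypothetical.

```python
PUSH_ORDER = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def pusha(regs: dict, stack: list) -> None:
    temp = regs["SP"]                        # the original SP is what gets stored
    for name in PUSH_ORDER:
        stack.append(temp if name == "SP" else regs[name])
        regs["SP"] = (regs["SP"] - 2) & 0xFFFF


def popa(regs: dict, stack: list) -> None:
    for name in reversed(PUSH_ORDER):        # DI is popped first
        value = stack.pop()
        regs["SP"] = (regs["SP"] + 2) & 0xFFFF
        if name != "SP":                     # the saved SP word is discarded
            regs[name] = value


regs = {"AX": 1, "CX": 2, "DX": 3, "BX": 4, "SP": 0x100, "BP": 5, "SI": 6, "DI": 7}
saved = dict(regs)
stack = []
pusha(regs, stack)
regs["AX"] = regs["DI"] = 0xDEAD             # clobber some registers
popa(regs, stack)
print(regs == saved)
```

SP still ends up at its original value because eight words are removed from the stack, not because the stored copy is reloaded.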
Flags Affected
None
Protected Mode Exceptions
Stack Fault (12) indicates the starting or ending stack address is not within the stack
segment. Page Fault (14) indicates a page fault.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault.
2.191
POPAD
Pops All 32-Bit General Registers
Opcode   Instruction   Clocks   Description
61       POPAD         9        Pops EDI, ESI, EBP, EBX, EDX, ECX, and EAX.
Operation
EDI ← Pop();
ESI ← Pop();
EBP ← Pop();
Increment ESP by 4 (* skip next 4 bytes of stack *)
EBX ← Pop();
EDX ← Pop();
ECX ← Pop();
EAX ← Pop()
Description
POPAD pops the eight 32-bit general registers, but discards the ESP value instead of
loading it into the ESP register. POPAD reverses the previous PUSHAD instruction, restoring the general registers to their values before the PUSHAD instruction executed. POPAD
pops the EDI register first.
Flags Affected
None
Protected Mode Exceptions
Stack Fault (12) indicates the starting or ending stack address is not within the stack
segment. Page Fault (14) indicates a page fault.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault.
2.192
POPF/POPFD
Pops Stack into FLAGS or EFLAGS Register
Opcode   Instruction   Clocks      Description
9D       POPF          9, pm = 6   Pops word on top of stack into FLAGS.
9D       POPFD         9, pm = 6   Pops doubleword on top of stack into EFLAGS.
Operation
Flags ← Pop()
Description
POPF and POPFD instructions pop a word or doubleword on the top of the stack and store
the value in the FLAGS or EFLAGS register. If the instruction operand-size attribute is 16
bits, a word is popped and stored in the FLAGS register. If the operand-size attribute is 32
bits, a doubleword is popped and stored in the EFLAGS register.
Note: Bits 16 and 17 of the EFLAGS register, the RF and VM flags,
respectively, are not affected by the POPF or POPFD instruction.
The I/O privilege level is altered only when executing at privilege level 0. The Interrupt Flag
is altered only when executing at a level at least as privileged as the I/O privilege level.
(Real Address Mode is equivalent to privilege level 0.) If a POPF instruction is executed
with insufficient privilege, an exception does not occur and the privileged bits do not change.
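The privilege rules above amount to a mask that decides which bits of the popped value are allowed to replace the current ones. The Python sketch below illustrates this for the doubleword form; it assumes the standard EFLAGS bit positions (IF at bit 9, IOPL at bits 12 and 13, RF at bit 16, VM at bit 17), ignores reserved bits, and uses a hypothetical helper name `popfd_update`.

```python
IF_MASK = 1 << 9
IOPL_MASK = 3 << 12
RF_MASK = 1 << 16
VM_MASK = 1 << 17


def popfd_update(eflags: int, popped: int, cpl: int) -> int:
    """Return the new EFLAGS value after popping `popped` at privilege `cpl`."""
    iopl = (eflags & IOPL_MASK) >> 12
    keep = RF_MASK | VM_MASK          # never changed by POPF/POPFD
    if cpl != 0:
        keep |= IOPL_MASK             # IOPL changes only at privilege level 0
    if cpl > iopl:
        keep |= IF_MASK               # IF changes only when CPL <= IOPL
    # Kept bits come from the old value, all others from the popped value.
    return ((eflags & keep) | (popped & ~keep)) & 0xFFFFFFFF


# At CPL 3 with IOPL 0, an attempt to set IF and raise IOPL is silently ignored.
print(hex(popfd_update(0, IF_MASK | IOPL_MASK, cpl=3)))
print(hex(popfd_update(0, IF_MASK | IOPL_MASK, cpl=0)))
```

Because no exception is raised in Protected Mode, code that depends on changing IF should read the flags back if it needs to confirm the change took effect.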
Flags Affected
All except the VM and RF flags are affected.
Protected Mode Exceptions
Stack Fault (12) indicates the top of stack is not within the stack segment.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
To maintain emulation, General Protection Fault (13) indicates the I/O privilege level is less
than 3.
2.193
PUSH
Pushes Operand onto Stack
Opcode     Instruction   Clocks   Description
FF /6      PUSH m16      4        Pushes memory word
FF /6      PUSH m32      4        Pushes memory doubleword
50 + /r    PUSH r16      1        Pushes register word
50 + /r    PUSH r32      1        Pushes register doubleword
6A         PUSH imm8     1        Pushes immediate byte
68         PUSH imm16    1        Pushes immediate word
68         PUSH imm32    1        Pushes immediate doubleword
0E         PUSH CS       3        Pushes CS
16         PUSH SS       3        Pushes SS
1E         PUSH DS       3        Pushes DS
06         PUSH ES       3        Pushes ES
0F A0      PUSH FS       3        Pushes FS
0F A8      PUSH GS       3        Pushes GS
Operation
IF StackAddrSize = 16
THEN
   IF OperandSize = 16
   THEN
      SP ← SP – 2;
      (SS:SP) ← (SOURCE); (* word assignment *)
   ELSE
      SP ← SP – 4;
      (SS:SP) ← (SOURCE); (* doubleword assignment *)
   FI;
ELSE (* StackAddrSize = 32 *)
   IF OperandSize = 16
   THEN
      ESP ← ESP – 2;
      (SS:ESP) ← (SOURCE); (* word assignment *)
   ELSE
      ESP ← ESP – 4;
      (SS:ESP) ← (SOURCE); (* doubleword assignment *)
   FI;
FI
Description
PUSH decrements the stack pointer by 2 (16-bit operands) or 4 (32-bit operands). Then
PUSH places the operand on the new stack top, indicated by the stack pointer.
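The decrement-then-store order can be sketched in Python. As a simplifying assumption, the stack segment is a `bytearray` indexed by the stack pointer, limit checks are not modeled, and the helper name `push` is hypothetical.

```python
def push(stack: bytearray, sp: int, value: int, operand_bits: int,
         addr_bits: int = 16) -> int:
    """Store value on the stack and return the new stack pointer."""
    size = operand_bits // 8                           # 2 or 4 bytes
    sp = (sp - size) & ((1 << addr_bits) - 1)          # decrement first
    data = (value & ((1 << operand_bits) - 1)).to_bytes(size, "little")
    stack[sp:sp + size] = data                         # then store at the new top
    return sp


stack = bytearray(16)
sp = push(stack, 16, 0x1234, 16)
sp = push(stack, sp, 0xDEADBEEF, 32)
print(sp, stack[sp:].hex())
```

The stack grows toward lower addresses, so the most recently pushed item always sits at the lowest occupied offset.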
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates either that the new value of SP or ESP register is outside the stack
segment limit, or that there is an illegal SS segment address. Page Fault (14) indicates a
page fault. If CPL is 3, Alignment Check (17) indicates there is an unaligned memory
reference.
Real Address Mode Exceptions
None, but if SP or ESP is 1, the processor shuts down due to a lack of stack space.
Virtual 8086 Mode Exceptions
Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates there is
an unaligned memory reference. If SP or ESP is 1, the processor shuts down due to a lack
of stack space.
Note: When used with a memory operand, PUSH takes longer to execute than a two-instruction sequence that moves the operand through a register. Back-to-back PUSH/POP
instruction sequences are allowed without incurring an additional clock. Selective pushes
write only to the top of the stack.
2.194
PUSHA
Pushes All 16-Bit General Registers
Opcode   Instruction   Clocks   Description
60       PUSHA         11       Pushes AX, CX, DX, BX, original SP, BP, SI, and DI.
Operation
Temp ← (SP);
Push(AX);
Push(CX);
Push(DX);
Push(BX);
Push(Temp);
Push(BP);
Push(SI);
Push(DI)
Description
PUSHA saves the 16-bit general registers on the processor stack. PUSHA decrements the
stack pointer (SP) by 16 to accommodate the required 8-word field. Because the registers
are pushed onto the stack in the order in which they were given, they appear in the 16 new
stack bytes in reverse order. The last register pushed is the DI register.
Flags Affected
None
Protected Mode Exceptions
Stack Fault (12) indicates the starting or ending stack address is outside the stack segment
limit. Page Fault (14) indicates a page fault.
Real Address Mode Exceptions
General Protection Fault (13) occurs if SP equals 7, 9, 11, 13, or 15. If the SP register
equals 1, 3, or 5 before executing the PUSHA instruction, the processor shuts down.
Virtual 8086 Mode Exceptions
General Protection Fault (13) occurs if SP equals 7, 9, 11, 13, or 15. If the SP register
equals 1, 3, or 5 before executing the PUSHA instruction, the processor shuts down. Page
Fault (14) indicates a page fault.
2.195
PUSHAD
Pushes All 32-Bit General Registers
Opcode   Instruction   Clocks   Description
60       PUSHAD        11       Pushes EAX, ECX, EDX, EBX, original ESP, EBP, ESI, and EDI.
Operation
Temp ← (ESP);
Push(EAX);
Push(ECX);
Push(EDX);
Push(EBX);
Push(Temp);
Push(EBP);
Push(ESI);
Push(EDI)
Description
PUSHAD saves the 32-bit general registers on the processor stack. PUSHAD decrements
the stack pointer (ESP) by 32 to accommodate the eight doubleword values. Because the
registers are pushed onto the stack in the order in which they were given, they appear in
the 32 new stack bytes in reverse order. The last register pushed is the EDI register.
Flags Affected
None
Protected Mode Exceptions
Stack Fault (12) indicates the starting or ending stack address is outside the stack segment
limit. Page Fault (14) indicates a page fault.
Real Address Mode Exceptions
General Protection Fault (13) occurs if SP equals 7, 9, 11, 13, or 15. If the SP register
equals 1, 3, or 5 before executing the PUSHAD instruction, the processor shuts down.
Virtual 8086 Mode Exceptions
General Protection Fault (13) occurs if SP equals 7, 9, 11, 13, or 15. If the SP register
equals 1, 3, or 5 before executing the PUSHAD instruction, the processor shuts down.
Page Fault (14) indicates a page fault.
2.196
PUSHF/PUSHFD
Pushes FLAGS Register onto the Stack
Opcode   Instruction   Clocks      Description
9C       PUSHF         4, pm = 3   Pushes FLAGS.
9C       PUSHFD        4, pm = 3   Pushes EFLAGS.
Operation
IF OperandSize = 32
THEN push(EFLAGS);
ELSE push(FLAGS);
FI
Description
The PUSHF instruction decrements the stack pointer by 2 and copies the FLAGS register
to the new top of stack; the PUSHFD instruction decrements the stack pointer by 4, and
copies the EFLAGS register to the new stack top pointed to by SS:ESP.
Flags Affected
None
Protected Mode Exceptions
Stack Fault (12) indicates the new value of the ESP register is outside the stack segment
boundaries.
Real Address Mode Exceptions
None; the processor shuts down due to a lack of stack space.
Virtual 8086 Mode Exceptions
To maintain emulation, General Protection Fault (13) indicates the I/O privilege level is less
than 3.
2.197
RCL
Rotates through Carry Left
Opcode      Instruction       Clocks       Description
D0 /2       RCL r/m8,1        3/4          Rotates 9 bits (CF,r/m byte) left once.
D2 /2       RCL r/m8,CL       3–30/9–31    Rotates 9 bits (CF,r/m byte) left CL times.
C0 /2 ib    RCL r/m8,imm8     8–30/9–31    Rotates 9 bits (CF,r/m byte) left imm8 times.
D1 /2       RCL r/m16,1       3/4          Rotates 17 bits (CF,r/m word) left once.
D3 /2       RCL r/m16,CL      8–30/9–31    Rotates 17 bits (CF,r/m word) left CL times.
C1 /2 ib    RCL r/m16,imm8    8–30/9–31    Rotates 17 bits (CF,r/m word) left imm8 times.
D1 /2       RCL r/m32,1       3/4          Rotates 33 bits (CF,r/m doubleword) left once.
D3 /2       RCL r/m32,CL      8–30/9–31    Rotates 33 bits (CF,r/m doubleword) left CL times.
C1 /2 ib    RCL r/m32,imm8    8–30/9–31    Rotates 33 bits (CF,r/m doubleword) left imm8 times.
Operation
temp ← COUNT;
WHILE (temp ≠ 0)
DO
   tmpcf ← high-order bit of (r/m);
   r/m ← r/m ⋅ 2 + CF;
   CF ← tmpcf;
   temp ← temp – 1;
OD;
IF COUNT = 1
THEN
   IF high-order bit of r/m ≠ CF
   THEN OF ← 1;
   ELSE OF ← 0;
   FI;
ELSE OF ← undefined;
FI
Description
RCL shifts CF into the bottom bit and shifts the top bit into CF. The second operand indicates
the number of rotations. The operand is either an immediate number or the CL register
contents. The processor does not allow rotation counts greater than 31, using only the
bottom five bits of the operand if it is greater than 31. Virtual 8086 Mode masks rotation
counts.
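A rotate through carry treats CF as one extra bit above the operand. The Python sketch below illustrates the behavior described above, including the 5-bit count mask; the helper name `rcl` is hypothetical, and OF is not modeled.

```python
def rcl(value: int, cf: int, count: int, bits: int):
    """Rotate (CF, value) left as one (bits + 1)-bit quantity; returns (value, cf)."""
    mask = (1 << bits) - 1
    value &= mask
    count &= 0x1F                               # only the low five bits are used
    for _ in range(count):
        top = (value >> (bits - 1)) & 1         # bit about to leave the operand
        value = ((value << 1) | cf) & mask      # old CF enters at the bottom
        cf = top                                # old top bit becomes CF
    return value, cf


print(rcl(0x80, 0, 1, 8))
print(rcl(0x80, 0, 9, 8))   # nine single-bit steps restore a 9-bit quantity
```

Because the carry takes part in the rotation, RCL can chain a shift across several words: the bit shifted out of one word enters the next word through CF.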
Flags Affected
OF is defined only for single-bit rotations; it is undefined for multi-bit rotations. CF contains the value
of the bit shifted into it. SF, ZF, AF, and PF are not affected.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17)
indicates there is an unaligned memory reference.
2.198
RCR
Rotates through Carry Right
Opcode      Instruction       Clocks       Description
D0 /3       RCR r/m8,1        3/4          Rotates 9 bits (CF,r/m byte) right once.
D2 /3       RCR r/m8,CL       3–30/9–31    Rotates 9 bits (CF,r/m byte) right CL times.
C0 /3 ib    RCR r/m8,imm8     8–30/9–31    Rotates 9 bits (CF,r/m byte) right imm8 times.
D1 /3       RCR r/m16,1       3/4          Rotates 17 bits (CF,r/m word) right once.
D3 /3       RCR r/m16,CL      8–30/9–31    Rotates 17 bits (CF,r/m word) right CL times.
C1 /3 ib    RCR r/m16,imm8    8–30/9–31    Rotates 17 bits (CF,r/m word) right imm8 times.
D1 /3       RCR r/m32,1       3/4          Rotates 33 bits (CF,r/m doubleword) right once.
D3 /3       RCR r/m32,CL      8–30/9–31    Rotates 33 bits (CF,r/m doubleword) right CL times.
C1 /3 ib    RCR r/m32,imm8    8–30/9–31    Rotates 33 bits (CF,r/m doubleword) right imm8 times.
Operation
temp ← COUNT;
WHILE (temp ≠ 0)
DO
   tmpcf ← low-order bit of (r/m);
   r/m ← r/m / 2 + (CF ⋅ 2^(width(r/m) – 1));
   CF ← tmpcf;
   temp ← temp – 1;
OD;
IF COUNT = 1
THEN
   IF (high-order bit of r/m) ≠ (bit next to high-order bit of r/m)
   THEN OF ← 1;
   ELSE OF ← 0;
   FI;
ELSE OF ← undefined;
FI
Description
RCR shifts CF into the top bit and shifts the bottom bit into CF. The second operand indicates
the number of rotations. The operand is either an immediate number or the CL register
contents. The processor does not allow rotation counts greater than 31, using only the
bottom five bits of the operand if it is greater than 31. Virtual 8086 Mode masks rotation
counts.
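The right rotation through carry can be sketched in Python in the same spirit. The helper name `rcr` is hypothetical, the count is masked to five bits as described above, and OF is not modeled.

```python
def rcr(value: int, cf: int, count: int, bits: int):
    """Rotate (CF, value) right as one (bits + 1)-bit quantity; returns (value, cf)."""
    mask = (1 << bits) - 1
    value &= mask
    count &= 0x1F                                   # only the low five bits are used
    for _ in range(count):
        low = value & 1                             # bit about to leave the operand
        value = (value >> 1) | (cf << (bits - 1))   # old CF enters at the top
        cf = low                                    # old bottom bit becomes CF
    return value, cf


print(rcr(0x01, 0, 1, 8))
print(rcr(0x00, 1, 1, 8))
```

Applied to the high word first and then the low word, single-bit RCR steps carry the bit shifted out of one word into the top of the next.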
Flags Affected
OF is defined only for single-bit rotations; it is undefined for multi-bit rotations. CF contains the value
of the bit shifted into it. SF, ZF, AF, and PF are not affected.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17)
indicates there is an unaligned memory reference.
2.199
REP/REPE/REPNE/REPNZ/REPZ Repeats Specified String Operation
Opcode   Instruction             Description
F3 6C    REP INS r/m8,DX         Inputs (E)CX bytes from port DX into ES:(E)DI.
F3 6D    REP INS r/m16,DX        Inputs (E)CX words from port DX into ES:(E)DI.
F3 6D    REP INS r/m32,DX        Inputs (E)CX doublewords from port DX into ES:(E)DI.
   Clocks*: rm = 16+8(E)CX; pm = 10+8(E)CX if CPL≤IOPL, 30+8(E)CX if CPL>IOPL; vm = 29+8(E)CX

F3 A4    REP MOVS m8,m8          Moves (E)CX bytes from (E)SI to ES:(E)DI.
F3 A5    REP MOVS m16,m16        Moves (E)CX words from (E)SI to ES:(E)DI.
F3 A5    REP MOVS m32,m32        Moves (E)CX doublewords from (E)SI to ES:(E)DI.
   Clocks*: 5 if (E)CX = 0; 13 if (E)CX = 1; 12+3(E)CX if (E)CX > 1

F3 6E    REP OUTS DX,r/m8        Outputs (E)CX bytes to port DX from (E)SI.
F3 6F    REP OUTS DX,r/m16       Outputs (E)CX words to port DX from (E)SI.
F3 6F    REP OUTS DX,r/m32       Outputs (E)CX doublewords to port DX from (E)SI.
   Clocks*: rm = 17+5(E)CX; pm = 11+5(E)CX if CPL≤IOPL, 31+5(E)CX if CPL>IOPL; vm = 30+5(E)CX

F2 AC    REP LODS m8             Loads (E)CX bytes from (E)SI to AL.
F2 AD    REP LODS m16            Loads (E)CX words from (E)SI to AX.
F2 AD    REP LODS m32            Loads (E)CX doublewords from (E)SI to EAX.
F3 AA    REP STOS m8             Fills (E)CX bytes at ES:(E)DI with AL.
F3 AB    REP STOS m16            Fills (E)CX words at ES:(E)DI with AX.
F3 AB    REP STOS m32            Fills (E)CX doublewords at ES:(E)DI with EAX.
F3 A6    REPE CMPS m8,m8         Finds nonmatching bytes in ES:(E)DI and (E)SI.
F3 A7    REPE CMPS m16,m16       Finds nonmatching words in ES:(E)DI and (E)SI.
F3 A7    REPE CMPS m32,m32       Finds nonmatching doublewords in ES:(E)DI and (E)SI.
F3 AE    REPE SCAS m8            Finds non-AL byte starting at ES:(E)DI.
F3 AF    REPE SCAS m16           Finds non-AX word starting at ES:(E)DI.
F3 AF    REPE SCAS m32           Finds non-EAX doubleword starting at ES:(E)DI.
F2 A6    REPNE CMPS m8,m8        Finds matching bytes in ES:(E)DI and (E)SI.
F2 A7    REPNE CMPS m16,m16      Finds matching words in ES:(E)DI and (E)SI.
F2 A7    REPNE CMPS m32,m32      Finds matching doublewords in ES:(E)DI and (E)SI.
F2 AE    REPNE SCAS m8           Finds AL, starting at ES:(E)DI.
F2 AF    REPNE SCAS m16          Finds AX, starting at ES:(E)DI.
F2 AF    REPNE SCAS m32          Finds EAX, starting at ES:(E)DI.
   Clocks*: 5 if (E)CX = 0; 7+4(E)CX if (E)CX > 0

*Clock data is grouped by category. Each Clocks line applies to all instructions in the group above it.
Modes: rm = Real, pm = Protected, vm = Virtual. If no Mode is indicated, values apply to all modes.
Operation
IF AddressSize = 16
THEN use CX for CountReg;
ELSE (* AddressSize = 32 *) use ECX for CountReg;
FI;
WHILE CountReg ≠ 0
DO
   service pending interrupts (if any);
   perform primitive string instruction;
   CountReg ← CountReg – 1;
   IF primitive operation is CMPSB, CMPSW, SCASB, or SCASW
   THEN
      IF (instruction is REP/REPE/REPZ) AND (ZF = 0)
      THEN exit WHILE loop
      ELSE
         IF (instruction is REPNZ or REPNE) AND (ZF = 1)
         THEN exit WHILE loop
         FI;
      FI;
   FI;
OD
Description
The REPeat string instructions are prefixes used with string instructions. The prefix causes
the string instruction to repeat the number of times indicated in the count register (CX or
ECX) or (for the REPE/REPZ and REPNE/REPNZ prefixes) until the indicated condition in
ZF is no longer met. You can only apply a REP prefix to one string instruction at a time. To
repeat an instruction block, use the LOOP instruction or another looping construct.
REP begins by checking the address size to select the correct count register: CX (16-bit)
or ECX (32-bit). Then REP checks the count register. If it is zero, execution moves to the
next instruction. REP then allows the processor to acknowledge any pending interrupts.
After interrupt servicing, the processor performs the string operation and decrements the
count register by one. REP checks ZF if the string operation is a SCAS or CMPS instruction.
If the prefix is REPE or REPZ and ZF = 0 (last comparison was not equal), exit the iteration
and continue with the next instruction. If the prefix is REPNE or REPNZ and ZF = 1 (last
comparison was equal), exit the iteration and continue with the next instruction. Otherwise
REP checks the count register to start the next iteration. Repeated CMPS and SCAS
instructions can be exited if either the count goes to 0 or if ZF fails the repeat condition.
You can use either the JCXZ instruction or the conditional jumps that test ZF (the JZ, JNZ,
and JNE instructions) to distinguish why iterations stopped.
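The two exit conditions for a repeated scan can be seen in a small Python model of REPNE SCASB with DF = 0. This is a sketch for illustration only: the function name repne_scasb and its parameters are assumptions, memory is modeled as a bytes object, and interrupt servicing is omitted.

```python
def repne_scasb(memory, di, cx, al, zf=0):
    """Model REPNE SCASB (DF = 0). Returns the final (di, cx, zf)."""
    while cx != 0:                          # a zero count skips the string operation
        zf = 1 if memory[di] == al else 0   # SCASB compares AL with the byte at [DI]
        di += 1                             # index advances because DF = 0
        cx -= 1                             # count is decremented before ZF is tested
        if zf == 1:                         # REPNE stops once a match sets ZF
            break
    return di, cx, zf
```

When the match falls on the last element, the count reaches zero and ZF is 1 at the same time, so testing ZF (rather than the count alone) is what tells a caller whether the value was found.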
Flags Affected
ZF is affected by the REP CMPS and REP SCAS as described above.
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
Note: Not all I/O ports can handle the rate at which REP INS and REP OUTS execute. Do
not use REP with the LOOP instruction; it yields unpredictable results. The processor
ignores REP when it is used with non-string instructions.
2.200
RET
Returns from Procedure
Opcode   Instruction   Clocks        Description
C3       RET           5             Returns near to caller.
CB       RET           13, pm = 18   Returns far to caller at same privilege.
CB       RET           13, pm = 18   Returns far at lesser privilege, switches stacks.
C2 iw    RET imm16     5             Returns near, pops imm16 bytes of parameters.
CA iw    RET imm16     14, pm = 17   Returns far to same privilege, pops imm16 bytes.
CA iw    RET imm16     14, pm = 17   Returns far to lesser privilege, pops imm16 bytes.
Operation
IF instruction = near RET
THEN
   IF OperandSize = 16
   THEN
      IP ← Pop();
      EIP ← EIP AND 0000FFFFh;
   ELSE (* OperandSize = 32 *)
      EIP ← Pop();
   FI;
   IF instruction has immediate operand THEN eSP ← eSP + imm16; FI;
FI;
IF (PE = 0 OR (PE = 1 AND VM = 1)) (* Real Mode or Virtual 8086 Mode *)
   AND instruction = far RET
THEN
   IF OperandSize = 16
   THEN
      IP ← Pop();
      EIP ← EIP AND 0000FFFFh;
      CS ← Pop(); (* 16-bit pop *)
   ELSE (* OperandSize = 32 *)
      EIP ← Pop();
      CS ← Pop(); (* 32-bit pop, high-order 16 bits discarded *)
   FI;
   IF instruction has immediate operand THEN eSP ← eSP + imm16; FI;
FI;
IF (PE = 1 AND VM = 0) (* Protected Mode, not V86 Mode *)
   AND instruction = far RET
THEN
   IF OperandSize = 32
   THEN Third word on stack must be within stack limits ELSE Stack Fault;
   ELSE Second word on stack must be within stack limits ELSE Stack Fault;
   FI;
   Return selector RPL must be ≥ CPL ELSE General Protection Fault(return selector);
   IF return selector RPL = CPL
   THEN GOTO SAME-LEVEL;
   ELSE GOTO OUTER-PRIVILEGE-LEVEL;
   FI;
FI;
SAME-LEVEL:
Return selector must be non-null ELSE General Protection Fault
Selector index is within limits ELSE General Protection Fault(selector)
Descriptor AR byte indicates code segment
ELSE General Protection Fault(selector)
IF non-conforming
THEN code segment DPL must equal CPL;
ELSE General Protection Fault(selector);
FI;
IF conforming
THEN code segment DPL must be ≤ CPL;
ELSE General Protection Fault(selector);
FI;
Code segment must be present ELSE Segment Not Present(selector);
Top word on stack must be within stack limits ELSE Stack Fault;
IP must be in code segment limit ELSE General Protection Fault;
IF OperandSize = 32
THEN
Load CS: EIP from stack
Load CS register with descriptor
Increment eSP by 8 plus the immediate offset if it exists
ELSE (* OperandSize = 16 *)
Load CS:IP from stack
Load CS register with descriptor
Increment eSP by 4 plus the immediate offset if it exists
FI;
OUTER-PRIVILEGE-LEVEL:
IF OperandSize = 32
THEN Top (16 + immediate) bytes on stack must be within stack limits
ELSE Stack Fault;
ELSE Top (8 +immediate) bytes on stack must be within stack limits
ELSE Stack Fault;
FI;
Examine return CS selector and associated descriptor:
Selector must be non-null ELSE General Protection Fault;
Selector index is within limits ELSE Gen.Protection Fault(selector)
Descriptor AR byte indicates code segment
ELSE General Protection Fault(selector);
IF non-conforming
THEN code segment DPL must equal return selector RPL
ELSE General Protection Fault(selector);
FI;
IF conforming
THEN code segment DPL must be ≤ return selector RPL;
ELSE General Protection Fault(selector);
FI;
Segment must be present ELSE Segment Not Present(selector)
Examine return SS selector and associated descriptor:
Selector must be non-null ELSE General Protection Fault;
Selector index is within limits ELSE Gen.Protection Fault (selector);
Selector RPL = RPL of the return CS selector
ELSE General Protection Fault(selector);
Descriptor AR byte indicates a writable data segment
ELSE General Protection Fault(selector);
Descriptor DPL = RPL of the return CS selector
ELSE General Protection Fault(selector);
Segment must be present ELSE Segment Not Present(selector);
IP must be in code segment limit ELSE General Protection Fault;
Set CPL to the RPL of the return CS selector;
IF OperandSize = 32
THEN
Load CS: EIP from stack;
Set CS RPL to CPL;
Increment eSP by 8 plus the immediate offset if it exists;
Load SS:eSP from stack;
ELSE (* OperandSize = 16 *)
Load CS:IP from stack;
Set CS RPL to CPL;
Increment eSP by 4 plus the immediate offset if it exists;
Load SS:eSP from stack;
FI;
Load the CS register with the return CS descriptor;
Load the SS register with the return SS descriptor;
For each of ES, FS, GS, and DS
DO
IF the current register setting is not valid for the outer level,
set the register to null (selector ← AR ← 0);
To be valid, register setting must satisfy the following properties:
Selector index must be within descriptor table limits;
Descriptor AR byte must indicate data or readable code segment;
IF segment is data or non-conforming code, THEN
DPL must be ≥ CPL, or DPL must be ≥ RPL;
FI;
OD
Description
RET transfers control to a return address located on the stack. The address is usually
placed on the stack by a CALL instruction, and the return is made to the instruction that
follows the CALL instruction. The optional numeric parameter to the RET instruction gives
the number of stack bytes to be released after the return address is popped. These items
are typically used as input parameters to the procedure called. For the intrasegment (near)
return, the address on the
stack is a segment offset, which is popped into the instruction pointer. The CS register is
unchanged.
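The near-return case can be sketched in Python as follows. This is an illustrative model only: the function name near_ret, the bytes-based stack, and the returned tuple are assumptions for the example, and stack-limit checks and stack pointer wraparound are not modeled.

```python
def near_ret(stack, sp, imm16=0, operand_size=16):
    """Model a near RET. Returns (instruction pointer, new stack pointer)."""
    nbytes = operand_size // 8
    # Pop the return offset (2 bytes for a 16-bit operand size, 4 for 32-bit).
    ip = int.from_bytes(stack[sp:sp + nbytes], "little")
    sp += nbytes
    # Release imm16 bytes of parameters after the return address is popped.
    sp += imm16
    return ip, sp
```

For example, with a 16-bit return offset of 1234h on top of four bytes of parameters, near_ret(stack, 0, imm16=4) yields the offset and a stack pointer advanced by six bytes.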
For the intersegment (far) return, the address on the stack is a long pointer. The offset is
popped first, followed by the selector. In Real Mode, the CS and IP registers are loaded
directly. In Protected Mode, an intersegment return causes the microprocessor to check
the descriptor addressed by the return selector. The AR byte of the descriptor must indicate
a code segment of equal or lesser privilege (or greater or equal numeric value) than the
current privilege level. Returns to a lesser privilege level cause the stack to be reloaded
from the value saved beyond the parameter block.
The DS, ES, FS, and GS segment registers can be cleared by the RET instruction during
an interlevel transfer. If these registers refer to segments that cannot be used by the new
privilege level, they are cleared to prevent unauthorized access from the new privilege level.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13), Segment Not Present (11), or Stack Fault (12) occur as
described under ‘Operation.’ Page Fault (14) indicates a page fault.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault.
2.201
ROL
Rotates Left
Opcode     Instruction        Clocks   Description
D0 /0      ROL r/m8,1         3/4      Rotates 8 bits r/m byte left once.
D2 /0      ROL r/m8,CL        3/4      Rotates 8 bits r/m byte left CL times.
C0 /0 ib   ROL r/m8,imm8      2/4      Rotates 8 bits r/m byte left imm8 times.
D1 /0      ROL r/m16,1        3/4      Rotates 16 bits r/m word left once.
D3 /0      ROL r/m16,CL       3/4      Rotates 16 bits r/m word left CL times.
C1 /0 ib   ROL r/m16,imm8     2/4      Rotates 16 bits r/m word left imm8 times.
D1 /0      ROL r/m32,1        3/4      Rotates 32 bits r/m doubleword left once.
D3 /0      ROL r/m32,CL       3/4      Rotates 32 bits r/m doubleword left CL times.
C1 /0 ib   ROL r/m32,imm8     2/4      Rotates 32 bits r/m doubleword left imm8 times.
Operation
temp ← COUNT;
WHILE (temp ≠ 0)
DO
   tmpcf ← high-order bit of (r/m);
   r/m ← r/m ⋅ 2 + (tmpcf);
   temp ← temp – 1;
OD;
IF COUNT = 1
THEN
   IF high-order bit of r/m ≠ CF
   THEN OF ← 1;
   ELSE OF ← 0;
   FI;
ELSE OF ← undefined;
FI
Description
ROL shifts the bits upward, except for the top bit, which becomes the bottom bit; ROL also
copies the bit to CF. The second operand indicates the number of rotations. The operand
is either an immediate number or the CL register contents. The processor does not allow
rotation counts greater than 31, using only the bottom five bits of the operand if it is greater
than 31. The 486 processor in Virtual 8086 Mode masks rotation counts.
Flags Affected
OF is defined only for single-bit rotations; otherwise it is undefined. CF contains the value
of the top bit copied into it. SF, ZF, AF, and PF are not affected.
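A short Python model may help make the ROL rules concrete. It is a sketch under stated assumptions: the function name rol and its parameters are hypothetical, and None stands for "CF unchanged" (count of zero) or "OF undefined".

```python
def rol(value, count, width=8):
    """Model ROL. Returns (result, cf, of)."""
    mask = (1 << width) - 1
    value &= mask
    count &= 31                       # only the low five bits of the count are used
    cf = None                         # CF is left alone when no rotation happens
    for _ in range(count):
        top = (value >> (width - 1)) & 1
        value = ((value << 1) & mask) | top   # top bit re-enters at the bottom
        cf = top                              # and is also copied to CF
    # OF is defined only for single-bit rotations: result top bit XOR CF.
    of = (((value >> (width - 1)) & 1) ^ cf) if count == 1 else None
    return value, cf, of
```

Rotating a byte left eight times returns the original value, and a count of 35 behaves like a count of 3 because of the five-bit mask.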
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17)
indicates there is an unaligned memory reference.
2.202
ROR
Rotates Right
Opcode     Instruction        Clocks   Description
D0 /1      ROR r/m8,1         3/4      Rotates 8 bits r/m byte right once.
D2 /1      ROR r/m8,CL        3/4      Rotates 8 bits r/m byte right CL times.
C0 /1 ib   ROR r/m8,imm8      2/4      Rotates 8 bits r/m byte right imm8 times.
D1 /1      ROR r/m16,1        3/4      Rotates 16 bits r/m word right once.
D3 /1      ROR r/m16,CL       3/4      Rotates 16 bits r/m word right CL times.
C1 /1 ib   ROR r/m16,imm8     2/4      Rotates 16 bits r/m word right imm8 times.
D1 /1      ROR r/m32,1        3/4      Rotates 32 bits r/m doubleword right once.
D3 /1      ROR r/m32,CL       3/4      Rotates 32 bits r/m doubleword right CL times.
C1 /1 ib   ROR r/m32,imm8     2/4      Rotates 32 bits r/m doubleword right imm8 times.
Operation
temp ← COUNT;
WHILE (temp ≠ 0)
DO
   tmpcf ← low-order bit of (r/m);
   r/m ← r/m / 2 + (tmpcf ⋅ 2^(width(r/m) – 1));
   temp ← temp – 1;
OD;
IF COUNT = 1
THEN
   IF (high-order bit of r/m) ≠ (bit next to high-order bit of r/m)
   THEN OF ← 1;
   ELSE OF ← 0;
   FI;
ELSE OF ← undefined;
FI
Description
ROR shifts the bits downward, except for the bottom bit, which becomes the top bit; ROR
also copies the bit to CF. The second operand indicates the number of rotations to make.
The operand is either an immediate number or the CL register contents. The processor
does not allow rotation counts greater than 31, using only the bottom five bits of the operand
if it is greater than 31. The 486 processor in Virtual 8086 Mode masks rotation counts.
Flags Affected
OF is defined only for single-bit rotations; otherwise it is undefined. CF contains a copy
of the bit rotated into the top bit. SF, ZF, AF, and PF are not affected.
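The same rules can be modeled for ROR. The Python sketch below is illustrative only; the name ror, its parameters, and the use of None for an unchanged CF or an undefined OF are assumptions made for the example.

```python
def ror(value, count, width=8):
    """Model ROR. Returns (result, cf, of)."""
    mask = (1 << width) - 1
    value &= mask
    count &= 31                       # only the low five bits of the count are used
    cf = None                         # CF is left alone when no rotation happens
    for _ in range(count):
        low = value & 1
        value = (value >> 1) | (low << (width - 1))  # bottom bit becomes the top bit
        cf = low                                     # and is also copied to CF
    if count == 1:
        top = (value >> (width - 1)) & 1
        nxt = (value >> (width - 2)) & 1
        of = top ^ nxt                # OF = top bit XOR next-to-top bit of the result
    else:
        of = None
    return value, cf, of
```

Rotating 12h right by four bit positions swaps the two nibbles of the byte and gives 21h.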
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17)
indicates there is an unaligned memory reference.
2.203
SAHF
Stores AH into Flags
Opcode   Instruction   Clocks   Description
9E       SAHF          2        Stores AH into EFLAGS bits SF, ZF, AF, PF, CF.
Operation
SF:ZF:xx:AF:xx:PF:xx:CF ← AH
Description
The SAHF instruction loads the SF, ZF, AF, PF, and CF bits in the EFLAGS register with
values from the AH register, from bits 7, 6, 4, 2, and 0, respectively.
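The bit assignment can be written out as a small Python sketch. The function name sahf and the dictionary it returns are assumptions for illustration; only the bit positions come from the text above.

```python
def sahf(ah):
    """Return the flag values that SAHF would load from the byte in AH."""
    return {
        "SF": (ah >> 7) & 1,   # bit 7
        "ZF": (ah >> 6) & 1,   # bit 6
        "AF": (ah >> 4) & 1,   # bit 4
        "PF": (ah >> 2) & 1,   # bit 2
        "CF": ah & 1,          # bit 0
    }
```

Bits 5, 3, and 1 of AH (the xx positions in the operation) do not reach any of the five flags.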
Flags Affected
SF, ZF, AF, PF, and CF are loaded with values from the AH register.
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2.204
SAL
Shifts Arithmetic Left
Opcode     Instruction        Clocks   Description
D0 /4      SAL r/m8,1         3/4      Multiplies r/m byte by 2, once.
D2 /4      SAL r/m8,CL        3/4      Multiplies r/m byte by 2, CL times.
C0 /4 ib   SAL r/m8,imm8      2/4      Multiplies r/m byte by 2, imm8 times.
D1 /4      SAL r/m16,1        3/4      Multiplies r/m word by 2, once.
D3 /4      SAL r/m16,CL       3/4      Multiplies r/m word by 2, CL times.
C1 /4 ib   SAL r/m16,imm8     2/4      Multiplies r/m word by 2, imm8 times.
D1 /4      SAL r/m32,1        3/4      Multiplies r/m doubleword by 2, once.
D3 /4      SAL r/m32,CL       3/4      Multiplies r/m doubleword by 2, CL times.
C1 /4 ib   SAL r/m32,imm8     2/4      Multiplies r/m doubleword by 2, imm8 times.
Operation
(* COUNT is the second parameter *)
(temp) ← COUNT;
WHILE (temp ≠ 0)
DO
   CF ← high-order bit of r/m;
   r/m ← r/m ⋅ 2;
   temp ← temp – 1;
OD;
IF COUNT = 1
THEN
   OF ← high-order bit of r/m ≠ (CF);
FI
Description
SAL (or its synonym, SHL) shifts the bits of the operand upward. SAL shifts the high-order
bit into CF and clears the low-order bit. The second operand indicates the number of shifts
to make. The operand is either an immediate number or the CL register contents. The
processor does not allow shift counts greater than 31; it uses only the bottom five bits of
the operand if it is greater than 31.
Flags Affected
OF is defined for single-bit shifts; otherwise, it is undefined. The result determines the CF,
ZF, PF, and SF settings.
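The shift and its CF and OF results can be sketched in Python. This is an illustrative model only; the function name sal and its parameters are assumptions, None stands for an unchanged CF or an undefined OF, and the ZF, PF, and SF results are not modeled.

```python
def sal(value, count, width=8):
    """Model SAL/SHL. Returns (result, cf, of)."""
    mask = (1 << width) - 1
    value &= mask
    count &= 31                            # only the low five bits of the count are used
    cf = None                              # CF is left alone when the count is zero
    for _ in range(count):
        cf = (value >> (width - 1)) & 1    # high-order bit moves into CF
        value = (value << 1) & mask        # low-order bit is cleared
    # For single-bit shifts, OF is set when the top bit of the result differs from CF.
    of = (((value >> (width - 1)) & 1) ^ cf) if count == 1 else None
    return value, cf, of
```

A single shift of 40h gives 80h with OF set, which signals that the sign of the byte changed.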
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.205
SAR
Shifts Arithmetic Right
Opcode     Instruction        Clocks   Description
D0 /7      SAR r/m8,1         3/4      Performs a signed divide* r/m byte by 2 once.
D2 /7      SAR r/m8,CL        3/4      Performs a signed divide* r/m byte by 2 CL times.
C0 /7 ib   SAR r/m8,imm8      2/4      Performs a signed divide* r/m byte by 2 imm8 times.
D1 /7      SAR r/m16,1        3/4      Performs a signed divide* r/m word by 2 once.
D3 /7      SAR r/m16,CL       3/4      Performs a signed divide* r/m word by 2 CL times.
C1 /7 ib   SAR r/m16,imm8     2/4      Performs a signed divide* r/m word by 2 imm8 times.
D1 /7      SAR r/m32,1        3/4      Performs a signed divide* r/m doubleword by 2 once.
D3 /7      SAR r/m32,CL       3/4      Performs a signed divide* r/m doubleword by 2 CL times.
C1 /7 ib   SAR r/m32,imm8     2/4      Performs a signed divide* r/m doubleword by 2 imm8 times.
*Not the same division as IDIV; rounding is toward negative infinity.
Operation
(* COUNT is the second parameter *)
(temp) ← COUNT;
WHILE (temp ≠ 0)
DO
   CF ← low-order bit of r/m;
   r/m ← r/m / 2 (* Signed divide, rounding toward negative infinity *);
   temp ← temp – 1;
OD;
IF COUNT = 1
THEN
   OF ← 0;
FI
Description
SAR shifts the bits of the operand downward. SAR shifts the low-order bit into CF. The
effect is to divide the operand by two. SAR performs a signed divide with rounding toward
negative infinity (unlike IDIV); the high-order bit remains the same. The second operand
indicates the number of shifts to make. The operand is either an immediate number or the
CL register contents. The processor does not allow shift counts greater than 31; it uses only
the bottom five bits of the operand if it is greater than 31.
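The rounding difference from IDIV shows up with negative odd operands. The Python sketch below is illustrative only; the names sar and to_signed are assumptions, operands are held as unsigned two's-complement bit patterns, and Python's int(a / b) is used to stand in for a divide that truncates toward zero.

```python
def sar(value, count, width=8):
    """Model SAR on a two's-complement bit pattern. Returns (result, cf)."""
    mask = (1 << width) - 1
    value &= mask
    sign = value >> (width - 1)        # high-order bit, preserved by every shift
    count &= 31                        # only the low five bits of the count are used
    cf = None                          # CF is left alone when the count is zero
    for _ in range(count):
        cf = value & 1                 # low-order bit moves into CF
        value = (value >> 1) | (sign << (width - 1))
    return value, cf


def to_signed(value, width=8):
    """Interpret a bit pattern as a signed two's-complement number."""
    return value - (1 << width) if value >> (width - 1) else value
```

Shifting F9h (–7) right once gives FCh (–4), whereas dividing –7 by 2 with truncation toward zero gives –3.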
Flags Affected
OF is cleared for single shifts; otherwise, it is undefined. The result determines the CF, ZF,
PF, and SF settings.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.206
SBB
Integer Subtract with Borrow
Opcode     Instruction         Clocks   Description
1C ib      SBB AL,imm8         1        Subtracts immediate byte from AL with borrow.
1D iw      SBB AX,imm16        1        Subtracts immediate word from AX with borrow.
1D id      SBB EAX,imm32       1        Subtracts immediate doubleword from EAX with borrow.
80 /3 ib   SBB r/m8,imm8       1/3      Subtracts immediate byte from r/m byte with borrow.
81 /3 iw   SBB r/m16,imm16     1/3      Subtracts immediate word from r/m word with borrow.
81 /3 id   SBB r/m32,imm32     1/3      Subtracts imm. doubleword from r/m doubleword with borrow.
83 /3 ib   SBB r/m16,imm8      1/3      Subtracts sign-extended imm. byte from r/m word with borrow.
83 /3 ib   SBB r/m32,imm8      1/3      Subtracts sign-ext. imm. byte from r/m doubleword with borrow.
18 /r      SBB r/m8,r8         1/3      Subtracts byte register from r/m byte with borrow.
19 /r      SBB r/m16,r16       1/3      Subtracts word register from r/m word with borrow.
19 /r      SBB r/m32,r32       1/3      Subtracts doubleword register from r/m doubleword with borrow.
1A /r      SBB r8,r/m8         1/3      Subtracts r/m byte from byte register with borrow.
1B /r      SBB r16,r/m16       1/2      Subtracts r/m word from word register with borrow.
1B /r      SBB r32,r/m32       1/2      Subtracts r/m doubleword from doubleword register with borrow.
Operation
IF SRC is a byte and DEST is a word or doubleword
THEN DEST ← DEST – (SignExtend(SRC) + CF)
ELSE DEST ← DEST – (SRC + CF)
FI
Description
The SBB instruction adds the second operand (SRC) to CF and subtracts the result from
the first operand (DEST). The result of the subtraction is assigned to the first operand
(DEST) and the flags are set accordingly.
Note: When an immediate byte value is subtracted from a word or doubleword operand, the
immediate value is first sign-extended.
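A typical use of subtract-with-borrow is chaining narrower subtractions into a wider one. The Python sketch below is a hedged illustration, not AMD code: the names sbb and sub32_with_16bit_ops are assumptions, only the result and CF are modeled, and both inputs to the second function are expected to fit in 32 bits.

```python
def sbb(dest, src, cf, width=16):
    """Model DEST ← DEST – (SRC + CF). Returns (result, borrow out in CF)."""
    mask = (1 << width) - 1
    result = dest - (src + cf)
    borrow = 1 if result < 0 else 0    # a negative result means a borrow occurred
    return result & mask, borrow


def sub32_with_16bit_ops(a, b):
    """Subtract two 32-bit values using two 16-bit steps linked by CF."""
    lo, cf = sbb(a & 0xFFFF, b & 0xFFFF, 0)   # low words, no incoming borrow
    hi, cf = sbb(a >> 16, b >> 16, cf)        # high words consume the borrow
    return (hi << 16) | lo, cf
```

Subtracting 1 from 00010000h borrows from the high word, so the low step yields FFFFh with CF set and the high step then yields 0000h.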
Flags Affected
OF, SF, ZF, AF, PF, and CF are set according to the result.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.207
SCAS/SCASB/SCASD/SCASW
Compares String Data
Opcode   Instruction   Clocks   Description
AE       SCAS m8       6        Compares bytes AL – ES:(E)DI, updates (E)DI.
AF       SCAS m16      6        Compares words AX – ES:(E)DI, updates (E)DI.
AF       SCAS m32      6        Compares doublewords EAX – ES:(E)DI, updates (E)DI.
AE       SCASB         6        Compares bytes AL – ES:(E)DI, updates (E)DI.
AF       SCASD         6        Compares doublewords EAX – ES:(E)DI, updates (E)DI.
AF       SCASW         6        Compares words AX – ES:(E)DI, updates (E)DI.
Operation
IF AddressSize = 16
THEN use DI for dest-index;
ELSE (* AddressSize = 32 *) use EDI for dest-index;
FI;
IF byte type of instruction
THEN
   AL – [dest-index]; (* Compare byte in AL and dest *)
   IF DF = 0 THEN IncDec ← 1 ELSE IncDec ← –1;
   FI;
ELSE
   IF OperandSize = 16
   THEN
      AX – [dest-index]; (* Compare word in AX and dest *)
      IF DF = 0 THEN IncDec ← 2 ELSE IncDec ← –2;
      FI;
   ELSE (* OperandSize = 32 *)
      EAX – [dest-index]; (* Compare doubleword in EAX and dest *)
      IF DF = 0 THEN IncDec ← 4 ELSE IncDec ← –4;
      FI;
   FI;
FI;
dest-index ← dest-index + IncDec
Description
SCAS subtracts the memory byte, word, or doubleword at the destination register from the
AL, AX, or EAX register. The result is discarded; only the flags are set. The operand must
be addressable from the ES segment; no segment override is possible. The address size
determines whether the index register is DI (16-bit address) or EDI (32-bit address). The
contents of the destination register determine the address of the memory data being compared, not the SCAS instruction operand. The operand validates ES segment addressability
and determines the data type. Load the correct index value into the DI or EDI register before
executing the SCAS instruction.
After the comparison, the destination index register automatically updates. If the Direction
Flag (DF) is 0 (see CLD), the destination index register increments; if DF is 1 (see STD),
it decrements. The increment/decrement amount is 1 for bytes, 2 for words, or 4 for
doublewords.
The SCASB, SCASW, and SCASD instructions are synonyms for the byte, word, and
doubleword SCAS instructions that do not require operands. They are simpler to code, but
provide no type or segment checking.
You can precede SCAS with the REPE or REPNE prefix for a block search of CX or ECX
bytes, words, or doublewords.
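A single SCAS step, including the index update, can be modeled in Python as follows. This is a sketch for illustration: the function name scas and its parameters are assumptions, memory is a bytes object standing in for the ES segment, and only ZF and CF of the six result flags are computed.

```python
def scas(memory, di, acc, size=1, df=0):
    """Model one SCAS step. size is 1, 2, or 4 bytes. Returns (new di, zf, cf)."""
    mask = (1 << (8 * size)) - 1
    operand = int.from_bytes(memory[di:di + size], "little")
    diff = ((acc & mask) - operand) & mask    # result is discarded; only flags matter
    zf = 1 if diff == 0 else 0
    cf = 1 if (acc & mask) < operand else 0   # borrow out of the subtraction
    step = -size if df else size              # DF selects decrement or increment
    return di + step, zf, cf
```

With DF = 0 a word compare advances the index by 2; with DF = 1 the same compare moves it back by 2.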
Flags Affected
OF, SF, ZF, AF, PF, and CF are set according to the result.
Protected Mode Exceptions
General Protection Fault (13) indicates that there is an illegal memory-operand effective
address in the ES segment. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.208
SETcc
Sets Byte on Condition (see list below)
Opcode   Instruction    Clocks   Description
0F 97    SETA r/m8      4/3      Sets byte if above (CF = 0 and ZF = 0).
0F 93    SETAE r/m8     4/3      Sets byte if above or equal (CF = 0).
0F 92    SETB r/m8      4/3      Sets byte if below (CF = 1).
0F 96    SETBE r/m8     4/3      Sets byte if below or equal (CF = 1 or ZF = 1).
0F 92    SETC r/m8      4/3      Sets byte if carry (CF = 1).
0F 94    SETE r/m8      4/3      Sets byte if equal (ZF = 1).
0F 9F    SETG r/m8      4/3      Sets byte if greater (ZF = 0 and SF = OF).
0F 9D    SETGE r/m8     4/3      Sets byte if greater or equal (SF = OF).
0F 9C    SETL r/m8      4/3      Sets byte if less (SF≠OF).
0F 9E    SETLE r/m8     4/3      Sets byte if less or equal (ZF = 1 or SF≠OF).
0F 96    SETNA r/m8     4/3      Sets byte if not above (CF = 1 or ZF = 1).
0F 92    SETNAE r/m8    4/3      Sets byte if not above or equal (CF = 1).
0F 93    SETNB r/m8     4/3      Sets byte if not below (CF = 0).
0F 97    SETNBE r/m8    4/3      Sets byte if not below or equal (CF = 0 and ZF = 0).
0F 93    SETNC r/m8     4/3      Sets byte if not carry (CF = 0).
0F 95    SETNE r/m8     4/3      Sets byte if not equal (ZF = 0).
0F 9E    SETNG r/m8     4/3      Sets byte if not greater (ZF = 1 or SF≠OF).
0F 9C    SETNGE r/m8    4/3      Sets byte if not greater or equal (SF≠OF).
0F 9D    SETNL r/m8     4/3      Sets byte if not less (SF = OF).
0F 9F    SETNLE r/m8    4/3      Sets byte if not less or equal (ZF = 0 and SF = OF).
0F 91    SETNO r/m8     4/3      Sets byte if not overflow (OF = 0).
0F 9B    SETNP r/m8     4/3      Sets byte if not parity (PF = 0).
0F 99    SETNS r/m8     4/3      Sets byte if not sign (SF = 0).
0F 95    SETNZ r/m8     4/3      Sets byte if not zero (ZF = 0).
0F 90    SETO r/m8      4/3      Sets byte if overflow (OF = 1).
0F 9A    SETP r/m8      4/3      Sets byte if parity (PF = 1).
0F 9A    SETPE r/m8     4/3      Sets byte if parity even (PF = 1).
0F 9B    SETPO r/m8     4/3      Sets byte if parity odd (PF = 0).
0F 98    SETS r/m8      4/3      Sets byte if sign (SF = 1).
0F 94    SETZ r/m8      4/3      Sets byte if zero (ZF = 1).
Operation
IF condition THEN r/m8 ← 1 ELSE r/m8 ← 0; FI
Description
SETcc loads a 1 (condition met) or a 0 (not met) into the r/m byte specified by the operand.
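How a condition maps onto the flags can be sketched in Python for a few of the mnemonics listed above. The CONDITIONS table and the setcc function are hypothetical names for this example, and the flags are passed in as a plain dictionary; the conditions themselves are those given in the table.

```python
# Flag tests for a subset of the SETcc conditions, keyed by mnemonic suffix.
CONDITIONS = {
    "A":  lambda f: f["CF"] == 0 and f["ZF"] == 0,
    "AE": lambda f: f["CF"] == 0,
    "B":  lambda f: f["CF"] == 1,
    "BE": lambda f: f["CF"] == 1 or f["ZF"] == 1,
    "E":  lambda f: f["ZF"] == 1,
    "G":  lambda f: f["ZF"] == 0 and f["SF"] == f["OF"],
    "GE": lambda f: f["SF"] == f["OF"],
    "L":  lambda f: f["SF"] != f["OF"],
    "LE": lambda f: f["ZF"] == 1 or f["SF"] != f["OF"],
}


def setcc(cond, flags):
    """Return the byte SETcc would store: 1 if the condition is met, else 0."""
    return 1 if CONDITIONS[cond](flags) else 0
```

After comparing 3 with 5 (CF = 1, ZF = 0, SF = 1, OF = 0), both the unsigned test SETB and the signed test SETL store 1, while SETA and SETGE store 0.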
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space.
Virtual 8086 Mode Exceptions
Same as Real Mode. Page Fault (14) indicates a page fault.
2.209
SGDT
Store Global Descriptor Table Register
Opcode     Instruction   Clocks   Description
0F 01 /0   SGDT m        10       Store GDTR to m
Operation
DEST ← 48-bit BASE/LIMIT register contents
Description
SGDT copies the contents of the global descriptor table register (GDTR) to the six bytes of
memory indicated by the operand. The LIMIT field of the register is stored in the first word
at the effective address. If the operand-size attribute is 16 bits, the next three bytes receive
the BASE field of the register and the fourth byte is undefined. Otherwise, if the operand-size
attribute is 32 bits, the next four bytes receive the 32-bit BASE field of the register.
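The six-byte layout can be decoded with a short Python sketch. This is an illustration only: the function name parse_sgdt_image and its operand_size parameter are assumptions, and the code simply interprets a copy of the six stored bytes rather than executing SGDT.

```python
import struct


def parse_sgdt_image(image, operand_size=32):
    """Decode the six bytes stored by SGDT. Returns (limit, base)."""
    # Little-endian: a 16-bit LIMIT followed by a 32-bit BASE, with no padding.
    limit, base = struct.unpack("<HI", bytes(image[:6]))
    if operand_size == 16:
        base &= 0x00FFFFFF    # only three BASE bytes are meaningful; ignore the fourth
    return limit, base
```

Masking the fourth BASE byte for the 16-bit case avoids depending on a value the text above describes as undefined.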
Note: The SGDT instruction is used only in operating system software. It is not used in
application programs.
Flags Affected
None
Protected Mode Exceptions
Invalid Opcode (6) indicates the destination operand is a register. General Protection Fault
(13) indicates either that the destination is in a non-writable segment or there is an illegal
memory-operand effective address in the CS, DS, ES, FS, or GS segments. Stack Fault
(12) indicates an illegal address is in the SS segment. Page Fault (14) indicates a page
fault. If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
Invalid Opcode (6) indicates the destination operand is a register. General Protection Fault
(13) indicates that part of the operand lies outside of the effective address space from 0 to
0FFFFh.
Virtual 8086 Mode Exceptions
Invalid Opcode (6) indicates the destination operand is a register. General Protection Fault
(13) indicates that part of the operand lies outside of the effective address space from 0 to
0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates
there is an unaligned memory reference.
Note: The 16-bit form of the SGDT instruction is compatible with the 286 processor if
the value in the upper eight bits is not referenced. The 286 processor stores a 1 in each of
the upper bits, whereas Am386 and Am486 microprocessors store a 0 if the operand-size
attribute is 16 bits.
2.210
SHL
Shift Left
Opcode      Instruction       Clocks  Description
D0 /4       SHL r/m8,1        3/4     Multiplies r/m byte by 2 once.
D2 /4       SHL r/m8,CL       3/4     Multiplies r/m byte by 2 CL times.
C0 /4 ib    SHL r/m8,imm8     2/4     Multiplies r/m byte by 2 imm8 times.
D1 /4       SHL r/m16,1       3/4     Multiplies r/m word by 2 once.
D3 /4       SHL r/m16,CL      3/4     Multiplies r/m word by 2 CL times.
C1 /4 ib    SHL r/m16,imm8    2/4     Multiplies r/m word by 2 imm8 times.
D1 /4       SHL r/m32,1       3/4     Multiplies r/m doubleword by 2 once.
D3 /4       SHL r/m32,CL      3/4     Multiplies r/m doubleword by 2 CL times.
C1 /4 ib    SHL r/m32,imm8    2/4     Multiplies r/m doubleword by 2 imm8 times.
Operation
(* COUNT is the second parameter *)
temp ← COUNT;
WHILE (temp ≠ 0)
DO
   CF ← high-order bit of r/m;
   r/m ← r/m ⋅ 2;
   temp ← temp – 1;
OD;
IF COUNT = 1
THEN
   OF ← high-order bit of r/m ≠ (CF);
FI
Description
SHL (or its synonym, SAL) shifts the bits of the operand upward. SHL shifts the high-order
bit into CF and clears the low-order bit. The second operand indicates the number of shifts
to make. The second operand is either an immediate number or the CL register contents. The
processor does not allow shift counts greater than 31; if the count is greater than 31, only
its bottom five bits are used.
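The behavior described above can be modeled in a few lines of Python. This is a sketch for checking expectations, not a description of the hardware: the function name `shl` and the flag dictionary are inventions of this example. Leaving the flags untouched for a zero count is an assumption about standard x86 behavior that this section does not state. The model also returns a CF value for counts larger than the operand size, where the manual leaves CF undefined.

```python
def shl(value: int, count: int, size: int = 32):
    """Model SHL on a size-bit operand (8, 16, or 32); returns (result, flags)."""
    mask = (1 << size) - 1
    msb = 1 << (size - 1)
    value &= mask
    count &= 0x1F              # only the low five bits of the count are used
    flags = {}
    if count == 0:
        return value, flags    # assumption: a zero count leaves the flags untouched
    for _ in range(count):
        flags["CF"] = int(bool(value & msb))   # bit shifted out of the top
        value = (value << 1) & mask            # multiply by 2; low-order bit cleared
    if count == 1:
        # OF is defined only for single-bit shifts: set when the sign bit changed
        flags["OF"] = int(bool(value & msb)) ^ flags["CF"]
    flags["ZF"] = int(value == 0)
    flags["SF"] = int(bool(value & msb))
    flags["PF"] = int(bin(value & 0xFF).count("1") % 2 == 0)  # even parity of low byte
    return value, flags
```

For example, `shl(1, 33)` shifts by one position only, because 33 is reduced to its low five bits.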
Flags Affected
OF is defined for single-bit shifts; otherwise, it is undefined. The result determines the CF,
ZF, PF, and SF settings. CF is undefined if the shift lengths are greater than the size of the
shifted operand.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.211
SHLD
Double Precision Shift Left
Opcode   Instruction            Clocks  Description
0F A4    SHLD r/m16,r16,imm8    2/3     r/m16 gets SHL of r/m16 concatenated with r16.
0F A4    SHLD r/m32,r32,imm8    2/3     r/m32 gets SHL of r/m32 concatenated with r32.
0F A5    SHLD r/m16,r16,CL      3/4     r/m16 gets SHL of r/m16 concatenated with r16.
0F A5    SHLD r/m32,r32,CL      3/4     r/m32 gets SHL of r/m32 concatenated with r32.
Operation
(* count is an unsigned integer corresponding to the last operand of the
instruction, either an immediate byte or the byte in register CL *)
ShiftAmt ← count MOD 32;
inBits ← register; (* Allow overlapped operands *)
IF ShiftAmt = 0
THEN no operation
ELSE
   IF ShiftAmt ≥ OperandSize
   THEN (* Bad parameters *)
      r/m ← UNDEFINED;
      CF, OF, SF, ZF, AF, PF ← UNDEFINED;
   ELSE (* Perform the shift *)
      CF ← BIT[r/m, OperandSize – ShiftAmt];
      (* Last bit shifted out on exit *)
      FOR i ← OperandSize – 1 DOWNTO ShiftAmt
      DO
         BIT[r/m, i] ← BIT[r/m, i – ShiftAmt];
      OD;
      FOR i ← ShiftAmt – 1 DOWNTO 0
      DO
         BIT[r/m, i] ← BIT[inBits, i – ShiftAmt + OperandSize];
      OD;
      Set SF, ZF, PF (r/m);
      (* SF, ZF, PF are set according to the value of the result *)
      AF ← UNDEFINED;
   FI;
FI
Description
SHLD shifts the r/m word/doubleword specified by the first operand to the left as many bits
as indicated by the count operand, specified by an immediate byte or the CL register. The
second operand, a word/doubleword register (r16/r32), provides the bits to shift in from the
right: its high-order bits fill the vacated low-order positions of the first operand, starting
with bit 0. SHLD then stores the result back into the r/m word/doubleword specified by the
first operand. The register remains unaltered.
The count operand is taken modulo 32 to provide a number between 0 and 31 by which to
shift. Because the bits to shift are provided by the specified registers, the operation is useful
for multiprecision shifts (64 bits or more).
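The multiprecision use mentioned above can be sketched in Python as follows. The helper names `shld` and `shl64` are hypothetical. The model raises an error where the manual says the result is undefined, and it returns `None` for CF when no shift takes place.

```python
def shld(dest: int, src: int, count: int, size: int = 32):
    """Model SHLD on a size-bit operand (16 or 32); returns (result, CF)."""
    mask = (1 << size) - 1
    dest &= mask
    src &= mask
    shift = count % 32                       # count is taken modulo 32
    if shift == 0:
        return dest, None                    # no operation; CF is not changed
    if shift >= size:
        raise ValueError("result and flags are undefined for this count")
    cf = (dest >> (size - shift)) & 1        # last bit shifted out
    # High-order bits of src fill the vacated low-order bits of dest.
    result = ((dest << shift) | (src >> (size - shift))) & mask
    return result, cf

def shl64(high: int, low: int, count: int):
    """Shift the 64-bit value high:low left by 1 to 31 bits (SHLD, then SHL)."""
    high, _ = shld(high, low, count, 32)     # upper half takes bits from the lower half
    low = (low << count) & 0xFFFFFFFF        # lower half is shifted with an ordinary SHL
    return high, low
```

The upper half is shifted first because SHLD leaves its register operand unaltered, so the lower half still holds the bits that must move up.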
Flags Affected
SF, ZF, and PF are set according to the result. CF is set to the value of the last bit shifted
out. OF is valid for a shift of one bit position only: 0 = no sign change occurred; 1 = sign
change occurred; for a multibit shift, OF is undefined. AF is undefined.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.212
SHR
Shift Right
Opcode      Instruction       Clocks  Description
D0 /5       SHR r/m8,1        3/4     Performs unsigned divide of r/m byte by 2 once.
D2 /5       SHR r/m8,CL       3/4     Performs unsigned divide of r/m byte by 2 CL times.
C0 /5 ib    SHR r/m8,imm8     2/4     Performs unsigned divide of r/m byte by 2 imm8 times.
D1 /5       SHR r/m16,1       3/4     Performs unsigned divide of r/m word by 2 once.
D3 /5       SHR r/m16,CL      3/4     Performs unsigned divide of r/m word by 2 CL times.
C1 /5 ib    SHR r/m16,imm8    2/4     Performs unsigned divide of r/m word by 2 imm8 times.
D1 /5       SHR r/m32,1       3/4     Performs unsigned divide of r/m doubleword by 2 once.
D3 /5       SHR r/m32,CL      3/4     Performs unsigned divide of r/m doubleword by 2 CL times.
C1 /5 ib    SHR r/m32,imm8    2/4     Performs unsigned divide of r/m doubleword by 2 imm8 times.
Operation
(* COUNT is the second parameter *)
temp ← COUNT;
WHILE (temp ≠ 0)
DO
   CF ← low-order bit of r/m;
   r/m ← r/m / 2; (* Unsigned divide *)
   temp ← temp – 1;
OD;
IF COUNT = 1
THEN
   OF ← high-order bit of original operand;
FI
Description
SHR shifts the bits of the operand downward. SHR shifts the low-order bit into CF. The
effect is to divide the operand by 2. SHR performs an unsigned divide and clears the
high-order bit. The second operand indicates the number of shifts to make. The second
operand is either an immediate number or the CL register contents. The processor does not
allow shift counts greater than 31; if the count is greater than 31, only its bottom five
bits are used.
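A short Python sketch makes the unsigned-divide reading concrete. The name `shr` and the flag dictionary are assumptions of this example. Setting OF only for single-bit shifts, and leaving the flags untouched for a zero count, follow standard x86 behavior and are labeled as assumptions because this section does not spell them out.

```python
def shr(value: int, count: int, size: int = 32):
    """Model SHR on a size-bit operand (8, 16, or 32); returns (result, flags)."""
    mask = (1 << size) - 1
    value &= mask
    count &= 0x1F                       # only the low five bits of the count are used
    flags = {}
    if count == 0:
        return value, flags             # assumption: flags untouched for a zero count
    original_msb = (value >> (size - 1)) & 1
    for _ in range(count):
        flags["CF"] = value & 1         # low-order bit goes to CF
        value >>= 1                     # unsigned divide by 2; high-order bit cleared
    if count == 1:
        flags["OF"] = original_msb      # assumption: meaningful for 1-bit shifts only
    flags["ZF"] = int(value == 0)
    flags["SF"] = int(bool(value & (1 << (size - 1))))
    flags["PF"] = int(bin(value & 0xFF).count("1") % 2 == 0)
    return value, flags
```

For in-range counts the result equals `value // 2**count`, and CF holds the last bit discarded by the division.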
Flags Affected
OF is set to the high-order bit of the original operand. The result determines the CF, ZF,
PF, and SF settings. CF is undefined if the shift lengths are greater than the size of the
shifted operand.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.213
SHRD
Double Precision Shift Right
Opcode   Instruction            Clocks  Description
0F AC    SHRD r/m16,r16,imm8    2/3     r/m16 gets SHR of r/m16 concatenated with r16.
0F AC    SHRD r/m32,r32,imm8    2/3     r/m32 gets SHR of r/m32 concatenated with r32.
0F AD    SHRD r/m16,r16,CL      3/4     r/m16 gets SHR of r/m16 concatenated with r16.
0F AD    SHRD r/m32,r32,CL      3/4     r/m32 gets SHR of r/m32 concatenated with r32.
Operation
(* count is an unsigned integer corresponding to the last operand of the
instruction, either an immediate byte or the byte in register CL *)
ShiftAmt ← count MOD 32;
inBits ← register; (* Allow overlapped operands *)
IF ShiftAmt = 0
THEN no operation
ELSE
   IF ShiftAmt ≥ OperandSize
   THEN (* Bad parameters *)
      r/m ← UNDEFINED;
      CF, OF, SF, ZF, AF, PF ← UNDEFINED;
   ELSE (* Perform the shift *)
      CF ← BIT[r/m, ShiftAmt – 1]; (* Last bit shifted out on exit *)
      FOR i ← 0 TO OperandSize – 1 – ShiftAmt
      DO
         BIT[r/m, i] ← BIT[r/m, i + ShiftAmt];
      OD;
      FOR i ← OperandSize – ShiftAmt TO OperandSize – 1
      DO
         BIT[r/m, i] ← BIT[inBits, i + ShiftAmt – OperandSize];
      OD;
      (* SF, ZF, PF are set according to the value of the result *)
      Set SF, ZF, PF (r/m);
      AF ← UNDEFINED;
   FI;
FI
Description
SHRD shifts the r/m word/doubleword specified by the first operand to the right as many
bits as indicated by the count operand, specified by an immediate byte or the CL register.
The second operand, a word/doubleword register (r16/r32), provides the bits to shift in from
the left: its low-order bits fill the vacated high-order positions of the first operand
(ending at bit 15 for a word or bit 31 for a doubleword). SHRD then stores the result back
into the r/m word/doubleword specified by the first operand. The register remains unaltered.
The count operand is taken modulo 32 to provide a number between 0 and 31 by which to
shift. Because the bits to shift are provided by the specified registers, the operation is useful
for multiprecision shifts (64 bits or more).
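A Python sketch of the same idea for right shifts follows. The names `shrd` and `shr64` are hypothetical. As in the pseudocode, counts that are not smaller than the operand size are treated as undefined, and the model signals them with an error.

```python
def shrd(dest: int, src: int, count: int, size: int = 32):
    """Model SHRD on a size-bit operand (16 or 32); returns (result, CF)."""
    mask = (1 << size) - 1
    dest &= mask
    src &= mask
    shift = count % 32                       # count is taken modulo 32
    if shift == 0:
        return dest, None                    # no operation; CF is not changed
    if shift >= size:
        raise ValueError("result and flags are undefined for this count")
    cf = (dest >> (shift - 1)) & 1           # last bit shifted out
    # Low-order bits of src fill the vacated high-order bits of dest.
    result = ((dest >> shift) | (src << (size - shift))) & mask
    return result, cf

def shr64(high: int, low: int, count: int):
    """Shift the 64-bit value high:low right by 1 to 31 bits (SHRD, then SHR)."""
    low, _ = shrd(low, high, count, 32)      # lower half takes bits from the upper half
    high = (high & 0xFFFFFFFF) >> count      # upper half is shifted with an ordinary SHR
    return high, low
```

Here the lower half is shifted first, mirroring the left-shift case in the opposite direction.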
Flags Affected
SF, ZF, and PF are set according to the result. CF is set to the value of the last bit shifted
out. OF is valid for a shift of one bit position only: 0 = no sign change occurred; 1 = sign
change occurred; for a multibit shift, OF is undefined. AF is undefined.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.214
SIDT
Stores Interrupt Descriptor Table Register
Opcode      Instruction   Clocks   Description
0F 01 /1    SIDT m        10       Stores IDTR to m.
Operation
DEST ← 48-bit BASE/LIMIT register contents
Description
The SIDT instruction copies the contents of the Interrupt Descriptor Table Register (IDTR)
to the 6 bytes of memory indicated by the operand. The LIMIT field of the register is stored
in the first word at the effective address. If the operand-size attribute is 16 bits, the
next 3 bytes are assigned the BASE field of the register and the fourth byte is undefined.
Otherwise, if the operand-size attribute is 32 bits, the next 4 bytes are assigned the 32-bit
BASE field of the register.
SIDT is only used in operating system software. It should not be used in application
programs.
Flags Affected
None
Protected Mode Exceptions
Invalid Opcode (6) indicates the destination operand is a register. General Protection Fault
(13) indicates the destination is in a non-writable segment or there is an illegal
memory-operand effective address in the code or data segments. Stack Fault (12) indicates an
illegal SS segment address. Page Fault (14) indicates a page fault. Alignment Check (17)
indicates an unaligned memory reference if the current privilege level is 3.
Real Address Mode Exceptions
Invalid Opcode (6) indicates the destination operand is a register. General Protection Fault
(13) indicates that part of the operand is referenced outside the effective address space
from 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
Invalid Opcode (6) indicates the destination operand is a register. General Protection Fault
(13) indicates that part of the operand is referenced outside the effective address space
from 0 to 0FFFFh. Page Fault (14) indicates a page fault. Alignment Check (17) indicates
an unaligned memory reference if the current privilege level is 3.
Note: The 16-bit form of the SIDT instruction is compatible with the 286 processor if
the value in the upper eight bits is not referenced. The 286 processor stores a 1 in each of
the upper bits, whereas Am386 and Am486 microprocessors store a 0 if the operand-size
attribute is 16 bits.
2.215
SLDT
Stores Local Descriptor Table Register
Opcode      Instruction   Clocks   Description
0F 00 /0    SLDT r/m16    2/3      Stores LDTR to EA word.
Operation
r/m16 ← LDTR
Description
The SLDT instruction stores the Local Descriptor Table Register (LDTR) in the 2-byte
register or memory location indicated by the effective address operand. This register is a
selector that points into the Global Descriptor Table.
Note: The SLDT instruction is used only in operating system software. It is not used in
application programs.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
Invalid Opcode (6) occurs. SLDT is not recognized in Real Address Mode.
Virtual 8086 Mode Exceptions
Invalid Opcode (6) occurs. SLDT is not recognized in Virtual 8086 Mode. Page Fault (14)
indicates a page fault. If CPL is 3, Alignment Check (17) indicates there is an unaligned
memory reference.
Note: The operand-size attribute has no effect on the operation of the instruction.
2.216
SMSW
Stores Machine Status Word
Opcode      Instruction   Clocks   Description
0F 01 /4    SMSW r/m16    2/3      Stores machine status word to EA word.
Operation
r/m16 ← MSW
Description
The SMSW instruction stores the machine status word (part of the CR0 register) in the
2-byte register or memory location indicated by the effective address operand.
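Once the word has been stored, software typically tests individual bits. The Python sketch below names the low-order CR0 bits that fall inside the machine status word. The bit positions follow the standard CR0 layout rather than anything stated in this section, so treat them as an assumption to verify against the register description; `decode_msw` is a hypothetical helper.

```python
# Assumed positions of the CR0 bits that lie within the machine status word.
MSW_BITS = {"PE": 0, "MP": 1, "EM": 2, "TS": 3, "ET": 4, "NE": 5}

def decode_msw(msw: int) -> dict:
    """Return the named MSW bits of a 16-bit value stored by SMSW."""
    msw &= 0xFFFF                      # SMSW stores a 2-byte value
    return {name: (msw >> bit) & 1 for name, bit in MSW_BITS.items()}
```

For instance, a stored value of 0011h decodes to PE = 1 and ET = 1 with the other named bits clear.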
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Note: This instruction is provided for compatibility with the 80286 microprocessor; programs
for the Am486 microprocessor should use the MOV ..., CR0 instruction.
2.217
STC
Sets Carry Flag
Opcode   Instruction   Clocks   Description
F9       STC           2        Sets Carry Flag.
Operation
CF ← 1
Description
The STC instruction sets CF.
Flags Affected
CF is set.
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2.218
STD
Sets Direction Flag
Opcode   Instruction   Clocks   Description
FD       STD           2        Sets Direction Flag to make the Source Index (SI or ESI)
                                and/or the Destination Index (DI or EDI) registers decrement.
Operation
DF ← 1
Description
The STD instruction sets the Direction Flag, causing all subsequent string operations to
decrement the index registers on which they operate: SI (16-bit address) or ESI
(32-bit address), and/or DI (16-bit address) or EDI (32-bit address).
Flags Affected
DF is set. No other flags or registers are affected.
Protected Mode Exceptions
None
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
None
2.219
STI
Sets Interrupt-Enable Flag
Opcode   Instruction   Clocks   Description
FB       STI           5        Sets Interrupt-enable Flag to enable interrupts at the end
                                of the next instruction.
Operation
IF ← 1
Description
STI sets the Interrupt-enable Flag (IF). The processor responds to external interrupts after
executing the next instruction if that instruction does not clear IF. If external interrupts are
disabled and the program executes STI before a RET instruction (such as at the end of a
subroutine), RET executes before processing any external interrupts. If external interrupts
are disabled and the program executes STI before a CLI instruction, no external interrupts
are processed because CLI clears IF.
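The one-instruction delay can be pictured with a toy Python simulation. It is only a sketch: `interrupt_taken_after` is a hypothetical helper, instructions are reduced to mnemonic strings, and the model assumes an external interrupt is pending from the start and that IF is initially clear.

```python
def interrupt_taken_after(program):
    """Return the index of the instruction after which a pending external
    interrupt is first serviced, or None if it is never serviced."""
    if_flag = 0                                  # assumption: interrupts start disabled
    for i, op in enumerate(program):
        was_enabled = if_flag
        if op == "STI":
            if_flag = 1
        elif op == "CLI":
            if_flag = 0
        # An STI that has just enabled interrupts delays recognition by one instruction.
        delayed = (op == "STI" and not was_enabled)
        if if_flag and not delayed:
            return i
    return None
```

With `["STI", "RET"]` the interrupt is serviced only after RET, and with `["STI", "CLI", "NOP"]` it is never serviced, matching the two cases described above.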
Flags Affected
IF is set.
Protected Mode Exceptions
General Protection Fault (13) indicates the current privilege level is greater (has less privilege) than the I/O privilege level.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates the current privilege level is greater (has less privilege) than the I/O privilege level.
Note: If an NMI, trap, or fault occurs following STI, the interrupt will be processed before
executing the next sequential instruction in the code.
2.220
STOS/STOSB/STOSD/STOSW
Stores String Data
Opcode   Instruction   Clocks   Description
AA       STOS m8       5        Stores AL in byte ES:(E)DI, updates (E)DI.
AB       STOS m16      5        Stores AX in word ES:(E)DI, updates (E)DI.
AB       STOS m32      5        Stores EAX in doubleword ES:(E)DI, updates (E)DI.
AA       STOSB         5        Stores AL in byte ES:(E)DI, updates (E)DI.
AB       STOSD         5        Stores EAX in doubleword ES:(E)DI, updates (E)DI.
AB       STOSW         5        Stores AX in word ES:(E)DI, updates (E)DI.
Operation
IF AddressSize = 16
THEN use ES:DI for DestReg
ELSE (* AddressSize = 32 *) use ES:EDI for DestReg;
FI;
IF byte type of instruction
THEN
   (ES:DestReg) ← AL;
   IF DF = 0
   THEN DestReg ← DestReg + 1;
   ELSE DestReg ← DestReg – 1;
   FI;
ELSE IF OperandSize = 16
   THEN
      (ES:DestReg) ← AX;
      IF DF = 0
      THEN DestReg ← DestReg + 2;
      ELSE DestReg ← DestReg – 2;
      FI;
   ELSE (* OperandSize = 32 *)
      (ES:DestReg) ← EAX;
      IF DF = 0
      THEN DestReg ← DestReg + 4;
      ELSE DestReg ← DestReg – 4;
      FI;
   FI;
FI
Description
STOS transfers the contents of the AL, AX, or EAX register to the memory byte, word, or
doubleword given by the destination register (DI for 16-bit addresses, EDI for 32-bit
addresses) relative to the ES segment. The destination operand must be addressable from
the ES register. A segment override is not possible. The contents of the destination
register, not the explicit operand of STOS, determine the destination address. The explicit
operand only validates ES segment addressability and determines the data type. You must load
the correct index value into the destination register before executing the STOS instruction.
After the transfer, STOS automatically updates the Destination Index (DI or EDI) register.
If the Direction Flag (DF) is 0 (see CLD), the register increments; if DF is 1 (see STD), the
register decrements. The increment or decrement is 1 for a byte, 2 for a word, or 4 for a
doubleword.
STOSB, STOSW, and STOSD are synonyms for the byte, word, and doubleword STOS
instructions. These forms do not require an operand and are simpler to use, but provide
no type or segment checking.
You can precede STOS with the REP prefix for a block fill of CX or ECX bytes, words, or
doublewords.
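A block fill with REP STOS can be sketched in Python as below. The function `rep_stos` and its parameters are assumptions of this example: memory is a flat `bytearray` standing in for the ES segment, `width` selects byte, word, or doubleword, and all addresses are assumed to stay inside the buffer.

```python
def rep_stos(memory: bytearray, edi: int, ecx: int, eax: int,
             width: int = 1, df: int = 0):
    """Model REP STOSB/STOSW/STOSD; returns the final (EDI, ECX)."""
    step = -width if df else width                    # DF = 1 decrements, DF = 0 increments
    data = (eax & ((1 << (8 * width)) - 1)).to_bytes(width, "little")  # AL, AX, or EAX
    while ecx != 0:
        memory[edi:edi + width] = data                # store at ES:(E)DI
        edi += step                                   # update the destination index
        ecx -= 1                                      # REP counts down (E)CX
    return edi, ecx
```

With DF set, the index must start at the last element to be written, because each store is followed by a decrement.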
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the ES segment. Page Fault (14)
indicates a page fault. If CPL is 3, Alignment Check (17) indicates there is an unaligned
memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.221
STR
Stores Task Register
Opcode      Instruction   Clocks   Description
0F 00 /1    STR r/m16     2/3      Stores task register to EA word.
Operation
r/m ← task register
Description
The contents of the task register are copied to the 2-byte register or memory location
indicated by the effective address operand.
Note: The STR instruction is used only in operating system software. It is not used in
application programs.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
Invalid Opcode (6) occurs. STR is not recognized in Real Address Mode.
Virtual 8086 Mode Exceptions
Invalid Opcode (6) occurs. STR is not recognized in Virtual 8086 Mode.
Note: The operand-size attribute has no effect on this instruction.
2.222
SUB
Integer Subtraction
Opcode      Instruction         Clocks  Description
2C ib       SUB AL,imm8         1       Subtracts immediate byte from AL.
2D iw       SUB AX,imm16        1       Subtracts immediate word from AX.
2D id       SUB EAX,imm32       1       Subtracts immediate doubleword from EAX.
80 /5 ib    SUB r/m8,imm8       1/3     Subtracts immediate byte from r/m byte.
81 /5 iw    SUB r/m16,imm16     1/3     Subtracts immediate word from r/m word.
81 /5 id    SUB r/m32,imm32     1/3     Subtracts immediate doubleword from r/m doubleword.
83 /5 ib    SUB r/m16,imm8      1/3     Subtracts sign-ext. immediate byte from r/m word.
83 /5 ib    SUB r/m32,imm8      1/3     Subtracts sign-ext. immediate byte from r/m doubleword.
28 /r       SUB r/m8,r8         1/3     Subtracts byte register from r/m byte.
29 /r       SUB r/m16,r16       1/3     Subtracts word register from r/m word.
29 /r       SUB r/m32,r32       1/3     Subtracts doubleword register from r/m doubleword.
2A /r       SUB r8,r/m8         1/2     Subtracts r/m byte from byte register.
2B /r       SUB r16,r/m16       1/2     Subtracts r/m word from word register.
2B /r       SUB r32,r/m32       1/2     Subtracts r/m doubleword from doubleword register.
Operation
IF SRC is a byte and DEST is a word or doubleword
THEN DEST ← DEST – SignExtend(SRC);
ELSE DEST ← DEST – SRC;
FI
Description
The SUB instruction subtracts the second operand (SRC) from the first operand (DEST).
The first operand is assigned the result of the subtraction and the flags are set accordingly.
If an immediate byte value is subtracted from a word or doubleword operand, the immediate
value is first sign-extended to the size of the destination operand.
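The sign extension and flag results can be checked with a small Python model. The names `sign_extend` and `sub` are hypothetical, and the flag formulas are the usual two's-complement definitions rather than text taken from this section.

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend a from_bits-wide value to to_bits (returned as unsigned)."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        value -= 1 << from_bits                       # interpret as negative
    return value & ((1 << to_bits) - 1)

def sub(dest: int, src: int, size: int = 32):
    """Model SUB on size-bit operands; returns (result, flags)."""
    mask = (1 << size) - 1
    msb = 1 << (size - 1)
    dest &= mask
    src &= mask
    result = (dest - src) & mask
    flags = {
        "CF": int(dest < src),                                  # borrow out of the top bit
        "ZF": int(result == 0),
        "SF": int(bool(result & msb)),
        "OF": int(bool((dest ^ src) & (dest ^ result) & msb)),  # signed overflow
        "AF": int((dest & 0xF) < (src & 0xF)),                  # borrow out of bit 3
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),      # even parity of low byte
    }
    return result, flags
```

For example, subtracting the immediate byte FFh from a word operand subtracts FFFFh (that is, -1), so 1000h becomes 1001h.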
Flags Affected
OF, SF, ZF, AF, PF, and CF are set according to the result.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.223
TEST
Logical Compare
Opcode      Instruction          Clocks  Description
A8 ib       TEST AL,imm8         1       AND immediate byte with AL
A9 iw       TEST AX,imm16        1       AND immediate word with AX
A9 id       TEST EAX,imm32       1       AND immediate doubleword with EAX
F6 /0 ib    TEST r/m8,imm8       1/2     AND immediate byte with r/m byte
F7 /0 iw    TEST r/m16,imm16     1/2     AND immediate word with r/m word
F7 /0 id    TEST r/m32,imm32     1/2     AND immediate doubleword with r/m doubleword
84 /r       TEST r/m8,r8         1/2     AND byte register with r/m byte
85 /r       TEST r/m16,r16       1/2     AND word register with r/m word
85 /r       TEST r/m32,r32       1/2     AND doubleword register with r/m doubleword
Operation
DEST ← LeftSRC AND RightSRC;
CF ← 0;
OF ← 0
Description
The TEST instruction computes the bit-wise logical AND of its two operands. Each bit of
the result is 1 if both of the corresponding bits of the operands are 1; otherwise, each bit
is 0. The result of the operation is discarded and only the flags are modified.
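A minimal Python sketch of the flag computation follows. The name `logical_compare` is an invention of this example; the function returns only flags, reflecting that the AND result itself is discarded.

```python
def logical_compare(left: int, right: int, size: int = 32) -> dict:
    """Model TEST on size-bit operands; returns the resulting flags only."""
    mask = (1 << size) - 1
    result = left & right & mask                 # bit-wise AND; never written back
    return {
        "CF": 0,                                 # always cleared
        "OF": 0,                                 # always cleared
        "ZF": int(result == 0),
        "SF": int(bool(result & (1 << (size - 1)))),
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),  # even parity of low byte
    }
```

A common use is testing a bit mask: ZF = 1 means none of the masked bits are set in the operand.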
Flags Affected
OF and CF are cleared; SF, ZF, and PF are set according to the result.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.224
VERR/VERW
Verifies Segment for Read/Write
Opcode      Instruction   Clocks   Description
0F 00 /4    VERR r/m16    11/11    Sets ZF = 1 if segment readable, selector in r/m16.
0F 00 /5    VERW r/m16    11/11    Sets ZF = 1 if segment writable, selector in r/m16.
Operation
IF segment with selector at (r/m) is accessible
   with current protection level
   AND ((segment is readable for VERR) OR
        (segment is writable for VERW))
THEN ZF ← 1;
ELSE ZF ← 0;
FI
Description
The r/m word operand of VERR and VERW contains a selector value. The instructions
determine whether the segment pointed to by the selector is accessible from the current
privilege level and whether it is readable (VERR) or writable (VERW). If the segment is
accessible and usable, the processor sets the Zero Flag (ZF); if the segment is not
accessible or usable, ZF is cleared. The following conditions must all be met to set ZF:
• The selector must denote a descriptor within the bounds of the descriptor table (GDT
  or LDT); the selector must be “defined.”
• The selector must denote a code or data segment descriptor (not a task state segment,
  LDT, or gate).
• For VERR, the segment must be readable. For VERW, the segment must be a writable
  data segment.
• If the code segment is usable and conforming, the descriptor privilege level (DPL) can
  be any value for the VERR instruction. Otherwise, the DPL must be greater than or equal
  to (have less or the same privilege as) both the current privilege level and the selector’s
  RPL.
Validation is the same as that used for reading/writing segments loaded into the DS, ES,
FS, or GS register. ZF stores the validation result. Because the selector’s value cannot
cause a protection exception, software can use these instructions to anticipate segment
access problems.
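The conditions listed above can be expressed as a short Python sketch. Everything here is a simplification for illustration: `Descriptor` and `verify` are hypothetical names, the descriptor table is a plain dictionary keyed by selector index, the GDT/LDT choice is ignored, and the selector is split using the standard layout (index in bits 15 to 3, RPL in bits 1 to 0), which this section does not restate.

```python
from dataclasses import dataclass

@dataclass
class Descriptor:
    is_code: bool             # True for a code segment, False for a data segment
    readable: bool            # data segments are always readable
    writable: bool            # meaningful for data segments only
    conforming: bool          # meaningful for code segments only
    dpl: int                  # descriptor privilege level, 0 to 3
    is_system: bool = False   # task state segment, LDT, or gate

def verify(selector: int, table: dict, cpl: int, write: bool = False) -> int:
    """Return the ZF value for VERR (write=False) or VERW (write=True)."""
    index, rpl = selector >> 3, selector & 3
    desc = table.get(index)
    if desc is None or desc.is_system:
        return 0                          # undefined selector or wrong descriptor type
    if write:
        if desc.is_code or not desc.writable:
            return 0                      # VERW needs a writable data segment
    elif not desc.readable:
        return 0                          # VERR needs a readable segment
    if not (desc.is_code and desc.conforming):
        if desc.dpl < max(cpl, rpl):
            return 0                      # DPL must be >= both CPL and RPL
    return 1
```

Like the instructions themselves, the function reports failure through its return value and never raises an error for a bad selector.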
Flags Affected
ZF is set if the segment is accessible, and cleared if it is not.
Protected Mode Exceptions
No faults attributable to the selector operand are generated. General Protection Fault (13)
indicates either that the result is in a non-writable segment or there is an illegal
memory-operand effective address in the code or data segments. Stack Fault (12) indicates an
illegal SS segment address. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
Invalid Opcode (6) occurs. VERR and VERW are not recognized in Real Address Mode.
Virtual 8086 Mode Exceptions
Invalid Opcode (6) occurs. VERR and VERW are not recognized in Virtual 8086 Mode.
2.225
WAIT
Wait
Opcode   Instruction   Clocks   Description
9B       WAIT          1–3      Causes processor to check for numeric exceptions.
Description
WAIT causes the microprocessor to check for pending unmasked numeric exceptions before proceeding.
Flags Affected
None
Protected Mode Exceptions
Coprocessor Not Available (7) occurs if both MP and TS in CR0 are set.
Real Address Mode Exceptions
Coprocessor Not Available (7) occurs if both MP and TS in CR0 are set.
Virtual 8086 Mode Exceptions
Coprocessor Not Available (7) occurs if both MP and TS in CR0 are set.
Note: Coding WAIT after an ESC instruction ensures that any unmasked floating-point
exceptions the instruction may cause are handled before the microprocessor has a chance
to modify the instruction’s results. FWAIT is an alternate mnemonic for WAIT.
2.226
WBINVD
Writes Back and Invalidates Cache
Opcode   Instruction   Clocks   Description
0F 09    WBINVD        5        Flushes the internal cache, then signals the external
                                cache to write its contents back to memory and flush
                                itself.
Operation
FLUSH INTERNAL CACHE
SIGNAL EXTERNAL CACHE TO WRITE-BACK
SIGNAL EXTERNAL CACHE TO FLUSH
Description
The internal cache is flushed and a special-function bus cycle is issued to cause the external
cache to write its contents to main memory. Another special-function bus cycle follows,
directing the external cache to flush itself.
Flags Affected
None
Protected Mode Exceptions
The WBINVD instruction is a privileged instruction; General Protection Fault (13) indicates
the current privilege level is not 0.
Real Address Mode Exceptions
None
Virtual 8086 Mode Exceptions
General Protection Fault (13) occurs. WBINVD instruction is a privileged instruction.
Note: This instruction is implementation-dependent; its function may be implemented
differently on future AMD microprocessors. Hardware designers should ensure that their
systems respond to the external cache write-back and flush indications. This instruction is
not supported by 386 microprocessors.
2.227
XADD
Exchanges and Adds
Opcode      Instruction       Clocks  Description
0F C0 /r    XADD r/m8,r8      4       Exchanges byte register and r/m byte; loads sum into
                                      r/m byte.
0F C1 /r    XADD r/m16,r16    4       Exchanges word register and r/m word; loads sum into
                                      r/m word.
0F C1 /r    XADD r/m32,r32    4       Exchanges doubleword register and r/m doubleword;
                                      loads sum into r/m doubleword.
Operation
TEMP ← SRC + DEST
SRC ← DEST
DEST ← TEMP
Description
The XADD instruction loads DEST into SRC and then loads the sum of DEST and the
original value of SRC into DEST.
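The three steps in the Operation listing can be modeled in a few lines. The following is a minimal sketch in Python, not processor code; the function name xadd and its bits parameter are illustrative assumptions, and flag updates are not modeled.

```python
def xadd(dest, src, bits=32):
    """Model XADD for an operand width of `bits`; returns (new_dest, new_src)."""
    mask = (1 << bits) - 1
    temp = (src + dest) & mask   # TEMP <- SRC + DEST (wraps at the operand width)
    new_src = dest & mask        # SRC  <- DEST (the original destination value)
    new_dest = temp              # DEST <- TEMP
    return new_dest, new_src


# Typical use: fetch-and-add on a counter. The register receives the old
# counter value while the counter itself is incremented.
counter, old_value = xadd(dest=10, src=1)
```

Because the source register ends up holding the previous destination value, a single XADD can serve as a fetch-and-add primitive.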
Flags Affected
CF, PF, AF, SF, ZF, and OF are affected as if an ADD instruction had been executed.
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
Note: You can use a LOCK prefix with this instruction. You cannot use this instruction with
386 microprocessors.
2.228
XCHG
Exchange
Opcode    Instruction         Clocks   Description
90 + r    XCHG AX,r16         3        Exchanges word register with AX.
90 + r    XCHG r16,AX         3        Exchanges AX with word register.
90 + r    XCHG EAX,r32        3        Exchanges doubleword register with EAX.
90 + r    XCHG r32,EAX        3        Exchanges EAX with doubleword register.
86 /r     XCHG r/m8,r8        3/5      Exchanges byte register with r/m byte.
86 /r     XCHG r8,r/m8        3/5      Exchanges r/m byte with byte register.
87 /r     XCHG r/m16,r16      3/5      Exchanges word register with r/m word.
87 /r     XCHG r16,r/m16      3/5      Exchanges r/m word with word register.
87 /r     XCHG r/m32,r32      3/5      Exchanges doubleword register with r/m doubleword.
87 /r     XCHG r32,r/m32      3/5      Exchanges r/m doubleword with doubleword register.
Operation
temp ← DEST
DEST ← SRC
SRC ← temp
Description
The XCHG instruction exchanges two operands. The operands can be in either order. If a
memory operand is involved, the LOCK signal is asserted for the duration of the exchange,
regardless of the presence or absence of the LOCK prefix or of the value of the IOPL.
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
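Note: To swap the two bytes of a 16-bit register, use XCHG on its byte halves (for example,
XCHG AL,AH) instead of BSWAP, which operates on 32-bit registers.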
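One reading of this note is that exchanging the two byte halves of a 16-bit register (such as AL and AH) reverses its byte order. The sketch below models that effect in Python; the function name xchg_al_ah is an illustrative assumption.

```python
def xchg_al_ah(ax):
    """Return the 16-bit value AX with its low byte (AL) and high byte (AH) exchanged."""
    al = ax & 0xFF           # low byte
    ah = (ax >> 8) & 0xFF    # high byte
    return (al << 8) | ah    # old AL becomes the high byte, old AH the low byte


swapped = xchg_al_ah(0x1234)
```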
2.229
XLAT/XLATB
Table Look-Up Translation
Opcode    Instruction    Clocks   Description
D7        XLAT m8        4        Sets AL to memory byte DS:[(E)BX + unsigned AL].
D7        XLATB          4        Sets AL to memory byte DS:[(E)BX + unsigned AL].
Operation
IF AddressSize = 16
THEN
    AL ← (BX + ZeroExtend (AL));
ELSE (* AddressSize = 32 *)
    AL ← (EBX + ZeroExtend (AL));
FI
Description
XLAT changes the AL register from the table index to the table entry. The AL register should
be an unsigned index into a table addressed by the DS:BX register pair (for a 16-bit address)
or the DS:EBX register pair (for a 32-bit address).
The XLAT operand allows for the possibility of a segment override, but the instruction uses
the contents of the BX register even if they differ from the offset of the operand. Load the
operand offset into the (E)BX register and the table index into AL before executing XLAT.
Use the no-operand form, XLATB, if the table referenced by (E)BX resides in the DS
segment.
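The lookup can be illustrated with a small model. This is a sketch in Python under stated assumptions: memory is a flat byte array, seg_base stands for the base of the segment in use, and the names xlat, memory, and table are hypothetical. Address wraparound for 16-bit addressing is not modeled.

```python
def xlat(memory, seg_base, bx, al):
    """Return the new AL: the byte at seg_base + (E)BX + zero-extended AL."""
    index = al & 0xFF                     # AL is treated as an unsigned index
    return memory[seg_base + bx + index]


# Hypothetical table placed at offset 16: maps values 0..15 to ASCII hex digits.
table = b"0123456789ABCDEF"
memory = bytearray(64)
memory[16:32] = table

new_al = xlat(memory, seg_base=0, bx=16, al=0x0A)   # table entry for index 10
```

Here (E)BX holds the table offset and AL the index, as the description requires before executing XLAT.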
Flags Affected
None
Protected Mode Exceptions
General Protection Fault (13) indicates an illegal memory-operand effective address in the
code or data segments. Stack Fault (12) indicates an illegal SS segment address. Page
Fault (14) indicates a page fault. If CPL is 3, Alignment Check (17) indicates there is an
unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
2.230
XOR
Logical Exclusive OR
Opcode       Instruction           Clocks   Description
34 ib        XOR AL, imm8          1        XOR immediate byte to AL
35 iw        XOR AX, imm16         1        XOR immediate word to AX
35 id        XOR EAX, imm32        1        XOR immediate doubleword to EAX
80 /6 ib     XOR r/m8, imm8        1/3      XOR immediate byte to r/m byte
81 /6 iw     XOR r/m16, imm16      1/3      XOR immediate word to r/m word
81 /6 id     XOR r/m32, imm32      1/3      XOR immediate doubleword to r/m doubleword
83 /6 ib     XOR r/m16, imm8       1/3      XOR sign-extended immediate byte with r/m word
83 /6 ib     XOR r/m32, imm8       1/3      XOR sign-extended immediate byte with r/m doubleword
30 /r        XOR r/m8, r8          1/3      XOR byte register to r/m byte
31 /r        XOR r/m16, r16        1/3      XOR word register to r/m word
31 /r        XOR r/m32, r32        1/3      XOR doubleword register to r/m doubleword
32 /r        XOR r8, r/m8          1/2      XOR r/m byte to byte register
33 /r        XOR r16, r/m16        1/2      XOR r/m word to word register
33 /r        XOR r32, r/m32        1/2      XOR r/m doubleword to doubleword register
Operation
DEST ← LeftSRC XOR RightSRC
CF ← 0
OF ← 0
Description
XOR computes the exclusive OR of the two operands. If corresponding bits of the operands
are different, the resulting bit is 1. If the bits are the same, the result is 0. The answer
replaces the first operand.
Flags Affected
XOR clears CF and OF. The result sets or resets SF, ZF, and PF as required. XOR does
not affect AF.
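The flag behavior described above can be sketched as follows. This is a minimal Python model, not processor code; the function name xor_flags and the dictionary layout are illustrative assumptions. PF is computed over the low byte of the result, and AF is left out of the model.

```python
def xor_flags(left, right, bits=8):
    """Model XOR for an operand width of `bits`; returns (result, flags)."""
    mask = (1 << bits) - 1
    result = (left ^ right) & mask
    flags = {
        "CF": 0,                                   # always cleared
        "OF": 0,                                   # always cleared
        "SF": (result >> (bits - 1)) & 1,          # copy of the result's top bit
        "ZF": int(result == 0),                    # set when the result is zero
        # PF is set when the low byte of the result has an even number of 1 bits.
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),
    }
    return result, flags


# XOR of a value with itself yields zero, a common way to clear a register.
cleared, cleared_flags = xor_flags(0x5A, 0x5A)
```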
Protected Mode Exceptions
General Protection Fault (13) indicates either that the result is in a non-writable segment
or there is an illegal memory-operand effective address in the code or data segments. Stack
Fault (12) indicates an illegal SS segment address. Page Fault (14) indicates a page fault.
If CPL is 3, Alignment Check (17) indicates there is an unaligned memory reference.
Real Address Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh.
Virtual 8086 Mode Exceptions
General Protection Fault (13) indicates that part of the operand lies outside the effective
address space: 0 to 0FFFFh. Page Fault (14) indicates a page fault. If CPL is 3, Alignment
Check (17) indicates there is an unaligned memory reference.
APPENDIX A
GENERAL GUIDELINES FOR PROGRAMMING
A.1
GENERAL
An Am486 microprocessor communicates with the outside world through programming. If
you look at a description of its pinouts, you discover that the interface itself is not very
complex. The major lines of two-way communication are the 32 data lines (D31–D0), the
30 address lines (A31–A2), the four byte enable lines (BE3–BE0), and the four parity lines
(DP3–DP0). The CLK input provides the basic data timing signal—the heartbeat of the
computer. The only lines that activate the processor are the RESET, INTR, and NMI lines.
The RESET signal initializes the processor to a known state. The INTR and NMI lines come
from system hardware to signal either that a peripheral device needs service or that an
error or failure has occurred. The remaining input signals are control signals that tell the
microprocessor when to access its bus (RDY, HOLD, BRDY, and BOFF), manipulate the
internal cache (KEN, FLUSH, AHOLD, and EADS), use fewer than the 32 available data bits
in a data transfer (BS16 or BS8), emulate the 1-Mbyte address wraparound of the 8086 (A20M), or ignore numeric
errors (IGNNE). So where does programming come in?
Programming defines the values (1s and 0s) that are placed on the data lines. Circuits
inside the microprocessor define how these values are interpreted by the microprocessor.
These circuits define the microprocessor “instruction code.” Corresponding data signals
activate specific processing by the microprocessor.
There are three major types of programming that correspond to the basic requirements of
a personal computer:
■ Basic input/output system (BIOS) software
■ Operating system (OS) software
■ Application software
All of these types of software use the same instruction set to perform operations within a
personal computer system. The major difference between them is the level at which they
operate and the operations they perform.
A.1.1
BIOS Software
BIOS software is stored in a stable memory storage device (some type of ROM or FLASH
RAM). This software usually performs at two levels: system initialization and peripheral
interface (input/output). Initialization begins when the microprocessor receives a RESET
signal. The signal starts an internal “hard-wired” program to test and initialize the internal
registers (data transfer and storage locations) in the processor and load a test program
into the system memory. This Power-On Self-Test program (POST) evaluates the operational status of the system components. When the tests are complete, the BIOS loads the
lower part of memory with a set of address maps that reference the input/output (I/O) part
of the BIOS software; and, if specified by stored system parameter values, loads the I/O
programs themselves into locations in the system memory (BIOS shadowing). Finally, the
BIOS turns over control to the operating system software by issuing an INT 19h instruction.
The actual location of the BIOS memory references in the lower part of memory is based
on the original IBM standards and subsequent industry developments. See Appendix H for
a description of the memory map. The I/O software referred to in the memory map can be
read directly from its source or from the system memory if the software is shadowed.
Shadowing the BIOS software provides faster system response. The I/O software assumes
further that the I/O devices themselves have a specific physical address through which
they are addressed. See Appendix J for a list of the standard I/O addresses.
A.1.2
OS Software
Operating system software provides a more user-friendly level of operation. It provides a
base set of programs that allow the user to access information retained on bulk storage
devices (defining manageable sets of data using files and directories), adjust system information (such as system time and date), and invoke application programs. There is a variety
of operating systems available, but the two most common among IBM-compatible personal
computer users are the command-oriented DOS and the icon-oriented graphical interface
Microsoft® Windows™. In addition to providing a basic interface between the user and the
personal computer system, the OS software also integrates special programs called drivers
to allow you to expand and customize the number and types of peripheral devices used
with your system. These may include special video drivers to accommodate newer types
of video cards and monitors, as well as user input devices (scanners, digitizers, mouse
devices, trackballs, etc.), communication devices (fax/modems), network interfaces, and a
myriad of emerging multimedia devices.
A.1.3
Application Software
Application software includes a variety of specific-function packages. With these packages,
a personal computer can be a documentation production unit, an animation studio, a musical
instrument, a tutor, a drafting tool, a communication base, an accounting division, a game
arcade, or almost anything imaginable. More programs become available every day.
A.1.4
Software Overview
Regardless of the level of complexity, all of the types of programming share a common
base. They all use the same concepts of program development, use the same instruction
set, and have access to the same general registers. However, the microprocessor provides
internal divisions of memory access (segmentation) and coding (privilege levels) that allow
segregation of the operation of the various types of programming.
A.2
BASIC PROGRAMMING MODEL
To create effective and efficient software, a programmer must have a good understanding
of the following environmental elements established by the microprocessor architecture:
■ Operating modes
■ Memory organization
■ Internal system protection
■ Data types
■ Registers
■ Instruction format
■ Operand selection
■ Interrupts and exceptions
■ I/O operations
A.2.1
Operating Modes
For user convenience, industry-wide compatibility, and general acceptance in the personal
computer market, the Am486 microprocessor must support a variety of programs originally
written for 8086, 8088, 286, 386, and 486 microprocessors. The Am486 microprocessor
uses three operating modes to provide this level of compatibility:
■ Protected Mode: the highest operating mode level. This mode supports the full 32-bit
instruction set with all of its architectural features.
■ Real Mode: the basic 8086 emulation mode. This mode limits the processor to real
addresses (from 0 to 1 Mbyte) only with no translation. Some extensions of the 8086
mode are provided, such as the ability to break out of the mode.
Note: Reset initialization always places the processor into Real Mode; the operating system
needs to change bit 0 in CR0 to a 1 to go to Protected Mode.
■ Virtual Mode: a modified 8086 emulation mode. This mode is compatible with available
protection and memory management. The processor can enter the Virtual Mode from
Protected Mode to run programs written for the 8086 processor, and then return to
Protected Mode to execute 32-bit instructions without having to undergo system reset
and initialization.
Note: Bit 17 of the EFLAGS register is the VM bit. Setting the bit to a 1 places the processor
in Virtual Mode. Resetting the bit to 0 returns to Protected Mode.
Whenever execution occurs, the current operating mode determines the extent to which a
program implements a specific instruction. Chapter 2 includes for each instruction the
exceptions that it may generate depending on the operating mode. In general, both Real
Mode and Virtual Mode are limited to 16-bit default operations and 1-Mbyte maximum addressing
limits. Most memory management features, such as segmentation and paging, are not
available to Real Mode or Virtual Mode operation. In these two modes, addressing is linear
and direct. These two modes can access the instructions added by later processors with
the restrictions described above.
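The two control bits mentioned in the notes above (bit 0 of CR0 and bit 17 of EFLAGS) are enough to classify the operating mode. The following Python sketch shows the decision; the constant and function names are illustrative assumptions, and the register values are passed in as plain integers.

```python
CR0_PE = 1 << 0        # bit 0 of CR0: set to 1 to enter Protected Mode
EFLAGS_VM = 1 << 17    # bit 17 of EFLAGS: the VM bit


def operating_mode(cr0, eflags):
    """Classify the operating mode from the CR0 and EFLAGS register values."""
    if not cr0 & CR0_PE:
        return "Real Mode"         # state after reset initialization
    if eflags & EFLAGS_VM:
        return "Virtual Mode"      # VM bit set while protection is enabled
    return "Protected Mode"


mode_after_reset = operating_mode(cr0=0, eflags=0)
```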
A.2.2
Memory Organization
A microprocessor requires external memory to store the values (both data and programming
code) that are loaded into and out of the microprocessor through the data lines. Although
a personal computer system uses physical memory chips organized as a series of 8-bit
bytes located at unique sequential physical addresses, the programmer has a variety of
methods available to access a specific memory location. These optional memory access
methods are controlled by the microprocessor memory management system.
The memory management system lets operating systems control the environments in which
programs run. If several programs run at the same time, they each need an independent
address space to avoid having to perform difficult and time-consuming checks to avoid
interfering with each other. To accomplish this, the memory management system in Am486
processors uses two memory control mechanisms: segmentation and paging. Segmentation gives each program several independent and protected address spaces. Paging supports an environment where large address spaces are simulated using a small amount of
RAM and some disk storage. System designers may choose to use either or both of these
mechanisms.
A.2.2.1
Segmentation
Segmentation can allow memory to be completely unstructured and simple, like the memory
model of an 8-bit microprocessor, or highly structured with address translation and protection. The microprocessor implements this concept by dividing memory into units called
segments. Each segment is an independent, protected address space. Access to segments
is controlled by a data set that describes its size, the privilege level required for access,
the kinds of memory references allowed to it (instruction fetch, stack push or pop, read
operation, write operation, etc.), and whether it is present in memory (this final feature
allows segment contents to be swapped between memory and disk space).
In addition to controlling memory access, segmentation can also simplify the linkage of
object code modules. There is no reason to write position-dependent code when full use
is made of the segmentation mechanism, because all memory references can be made
relative to the base addresses of a module’s code and data segments. Segmentation can
be used to create ROM-based software modules in which fixed addresses (fixed, in the
sense that they cannot be changed) are offsets from a segment’s base address. Different
software systems can have the ROM modules at different physical addresses because the
segmentation mechanism will direct all memory references to the right place.
A.2.2.1.1
Simple Memory Architecture
In a simple memory architecture, all addresses refer to the same address space. This is
the memory model used by 8-bit microprocessors such as the 8080 microprocessor, where
the logical address is the physical address. The Am486 microprocessor can be used in this
way by mapping all segments into the same address space and keeping paging disabled.
This might be done where an older design is being updated to 32-bit technology without
also adopting the new architectural features.
A.2.2.1.2
Partial Segmentation Use
An application can also make partial use of segmentation. A common cause of software
failures is the growth of the stack into the instruction code or data used by the program.
Proper use of segmentation can prevent this. The stack can be put in an address space
separate from the address space for both code and data. Stack addresses always refer to
memory in the stack segment, while data addresses always refer to memory in the data
segment. The stack segment has a hardware-controlled maximum limit. Any attempt to
exceed this limit generates an exception.
A.2.2.1.3
Full Segmentation Implementation
A complex system of programs may make full use of segmentation and have precise control
of access to shared data. This creates an environment in which the programs can interact
by manipulating data used throughout the system without causing exceptions or overwriting
code or data that is in use. Real Mode can implement full segmentation within the overall
memory limits.
A.2.2.2
Paging
Paging simulates a large, unsegmented address space using a small, fragmented address
space and some disk storage. Paging provides access to data structures larger than the
available memory space by keeping them partly in memory and partly on disk. The microprocessor creates memory units of 4 Kbytes called pages. When a program attempts to
access a page stored on disk, a special exception occurs. Unlike other exceptions and
interrupts, an address translation exception restores the contents of the microprocessor
registers to values that allow the exception-generating instruction to be re-executed. This special
action is called instruction restart. It allows the operating system to read the page from disk,
update the mapping of linear addresses to physical addresses for that page, and restart
the program. This process is transparent to the program.
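The restart sequence can be sketched in Python. This is a simplified model under stated assumptions: a flat dictionary (page_map) stands in for the real page tables, load_page stands in for the operating system routine that brings a page in from disk, and all names are hypothetical.

```python
PAGE_SIZE = 4096   # pages are 4 Kbytes


class PageFault(Exception):
    """Raised when the referenced page is not present in memory."""

    def __init__(self, page):
        super().__init__(f"page {page} not present")
        self.page = page


def translate(linear, page_map):
    """Map a linear address to a physical address through page_map."""
    page, offset = divmod(linear, PAGE_SIZE)
    if page not in page_map:
        raise PageFault(page)
    return page_map[page] * PAGE_SIZE + offset


def access_with_restart(linear, page_map, load_page):
    """Translate an address; on a fault, load the page and retry the access."""
    try:
        return translate(linear, page_map)
    except PageFault as fault:
        page_map[fault.page] = load_page(fault.page)   # OS reads the page from disk
        return translate(linear, page_map)             # restart the faulting access


page_map = {0: 5}                                       # linear page 0 -> physical frame 5
physical = access_with_restart(0x1234, page_map, load_page=lambda page: 9)
```

The caller of access_with_restart sees only the final physical address, which mirrors the statement that the process is transparent to the program.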
If an operating system or memory manager never sets bit 31 of the CR0 register (the PG
bit), the paging mechanism is not enabled. Linear addresses are read as physical addresses
directly. This might be desirable if you are updating a 16-bit processor design for use with
a 32-bit microprocessor. The 16-bit processor operating system does not use paging because its address space is so small (64 Kbytes) and it is more efficient to swap entire
segments between RAM and disk, rather than individual pages. Paging is enabled for
operating systems that can support demand-paged virtual memory, such as UNIX. Paging
is transparent to application software, so an operating system intended to support application programs written for 16-bit microprocessors may run those programs with paging
enabled. Unlike paging, segmentation is not transparent to application programs. Programs
that use relocatable codes (i.e., hard coded segments) must be run with the segments they
were designed to use. Segmentation hardware translates a segmented (logical) address
into an address for a continuous, unsegmented address space, called a linear address. If
paging is enabled, paging hardware translates a linear address into a physical address. If
paging is not enabled, the linear address is used as the physical address. The physical
address appears on the address bus coming out of the microprocessor.
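The two translation stages described above can be summarized in a short Python sketch. It assumes a flat dictionary (page_map) in place of the real page tables and 4-Kbyte pages; the function and parameter names are illustrative, and limit and protection checks are omitted.

```python
def logical_to_physical(seg_base, offset, paging_enabled=False, page_map=None):
    """Translate a logical address (segment base + offset) to a physical address."""
    # Stage 1: segmentation produces the linear address.
    linear = (seg_base + offset) & 0xFFFFFFFF
    if not paging_enabled:
        return linear                       # linear address is used as physical
    # Stage 2: paging maps the linear page to a physical page frame.
    page, page_offset = divmod(linear, 4096)
    return page_map[page] * 4096 + page_offset


without_paging = logical_to_physical(0x10000, 0x234)
with_paging = logical_to_physical(0x10000, 0x234, paging_enabled=True,
                                  page_map={0x10: 0x80})
```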
A.2.2.3
Selecting a Segmentation Model
A model for the segmentation of memory is chosen on the basis of reliability and performance. For example, a system that has several programs sharing data in real time would
get maximum performance from a model that checks memory references in hardware. This
would be a multisegment model. At the other extreme, a system that has just one program
may get higher performance from an unsegmented or “flat” model. The elimination of “far”
pointers and segment override prefixes reduces code size and increases execution speed.
Context switching is faster because the contents of the segment registers no longer have
to be saved or restored. Some of the benefits of segmentation also can be provided by
paging. For example, data can be shared by mapping the same pages onto the address
space of each program.
A.2.2.3.1
Flat Model
The simplest model is the flat model. In this model, all segments are mapped to the entire
physical address space. A segment offset can refer to either code or data areas. To the
greatest extent possible, this model removes the segmentation mechanism from the architecture seen by either the system designer or the application programmer. This might be
done for a programming environment like UNIX, which supports paging but does not support
segmentation. A segment is defined by a segment descriptor. At least two segment descriptors must be created for a flat model, one for code references and one for data references. Both descriptors have the same base address value. Whenever memory is accessed, the contents of one of the segment registers is used to select a segment descriptor.
The segment descriptor provides the base address of the segment and its limit, as well as
access control information (see Figure A-1).
Figure A-1
Flat Memory Model
ROM usually is put at the top of the physical address space because the microprocessor
begins execution at 0FFFFFFF0h. RAM is placed at the bottom of the address space
because the initial base address for the DS data segment after reset initialization is 0. For
a flat model, each descriptor has a base address of 0 and a segment limit of 4 Gbytes. By
setting the segment limit to 4 Gbytes, the segmentation mechanism is kept from generating
exceptions for memory references that fall outside of a segment. Exceptions could still be
generated by the paging or segmentation protection mechanisms, but these also can be
removed from the memory model.
A.2.2.3.2
Protected Flat Model
The protected flat model is similar to the flat model, except the segment limits are set to
include only the range of addresses for which memory actually exists. A general protection
exception is generated by any attempt to access unimplemented memory. This provides a
minimum level of hardware protection against unexpected programming results when the
paging mechanism is disabled in a system.
In this model, the segmentation hardware prevents programs from addressing nonexistent
memory locations. The consequences of being allowed access to these memory locations
are hardware-dependent. For example, if the microprocessor does not receive a READY
signal (the signal used to acknowledge and terminate a bus cycle), the bus cycle does not
terminate and program execution stops. Although no program should make an attempt to
access these memory locations, an attempt may occur as a result of programming errors.
Without hardware checking of addresses, it is possible that an unexpected programming result
could suddenly stop program execution. With hardware checking, programs fail in a controlled way. A diagnostic message can appear and recovery procedures can be attempted.
An example of a protected flat model is shown in Figure A-2. Here, segment descriptors
have been set up to cover only those ranges of memory that exist. A code and a data
segment cover the EPROM and DRAM of physical memory. The code segment limit can
be optionally set to allow access to the DRAM area. The data segment limit must be set to the
sum of EPROM and DRAM sizes. If memory-mapped I/O is used, it can be addressed just
beyond the end of the DRAM area.
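The limit check that the protected flat model relies on can be sketched in Python. The memory sizes below are hypothetical, the layout is assumed to be contiguous from address 0, and the limit is expressed as the last valid offset (total size minus one); the function name access_allowed is an illustrative assumption.

```python
DRAM_SIZE = 0x100000     # hypothetical: 1 Mbyte of DRAM
EPROM_SIZE = 0x20000     # hypothetical: 128 Kbytes of EPROM

# Assumed data segment limit: covers DRAM plus EPROM, as the last valid offset.
DATA_LIMIT = DRAM_SIZE + EPROM_SIZE - 1


def access_allowed(offset, size, limit=DATA_LIMIT):
    """Return True if every byte of a `size`-byte access at `offset` is within the limit."""
    return offset >= 0 and offset + size - 1 <= limit


# An access that fails this check is where the hardware would raise a
# general protection exception instead of starting a bus cycle.
inside = access_allowed(0x1000, 4)
outside = access_allowed(DATA_LIMIT + 1, 1)
```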
Figure A-2
Protected Flat Memory Model
A.2.2.3.3
Multisegment Model
The most sophisticated model is the multisegment model. Here the full capabilities of the
segmentation mechanism are used. Each program is given its own table of segment descriptors, and its own segments. The segments can be completely private to the program,
or they can be shared with specific other programs. Access between programs and particular segments can be individually controlled. Up to six segments can be ready for immediate
use. These are the segments that have segment selectors loaded in the segment registers.
Other segments are accessed by loading their segment selectors into the segment registers
(see Figure A-3).
Figure A-3
Multisegment Memory Model
Each segment is a separate address space. Even though they may be placed in adjacent
blocks of physical memory, the segmentation mechanism prevents access to the contents
of one segment by reading beyond the end of another. Every memory operation is checked
against the limit specified for the segment it uses. An attempt to address memory beyond
the end of the segment generates a general-protection exception.
The segmentation mechanism only enforces the address range specified in the segment
descriptor. It is the responsibility of the operating system to allocate separate address
ranges to each segment. There may be situations in which it is desirable to have segments
that share the same range of addresses. For example, a system may have both code and
data stored in a ROM. A code segment descriptor is used when the ROM is accessed for
instruction fetches. A data segment descriptor is used when the ROM is accessed as data.
A.2.2.4
Segment Translation
A logical address consists of the 16-bit segment selector for its segment and a 32-bit offset
into the segment. A logical address is translated into a linear address by adding the offset
to the base address of the segment. The base address comes from the segment descriptor,
a data structure in memory that provides the size and location of a segment, as well as
access control information. The segment descriptor comes from one of two tables, the
global descriptor table (GDT) or the local descriptor table (LDT). There is one GDT for all
programs in the system, and one LDT for each separate program being run. If the operating
system allows, different programs can share the same LDT. The system also may be set
up with no LDTs; all programs will then use the GDT.
Every logical address is associated with a segment (even if the system maps all segments
into the same linear address space). Although a program may have thousands of segments,
only six may be available for immediate use. These are the six segments whose segment
selectors are loaded in the microprocessor. The segment selector holds information used
to translate the logical address into the corresponding linear address.
Separate segment registers exist in the microprocessor for each kind of memory reference
(code space, stack space, and data spaces). They hold the segment selectors for the
segments currently in use. Access to other segments requires loading a segment register
using a form of the MOV instruction. Up to four data spaces may be available at the same
time, thus providing a total of six segment registers.
When a segment selector is loaded, the base address, segment limit, and access control
information also are loaded into the segment register. The microprocessor does not reference the descriptor tables again until another segment selector is loaded. The information
saved in the microprocessor allows it to translate addresses without making extra bus
cycles. In systems in which multiple microprocessors have access to the same descriptor
tables, it is the responsibility of software to reload the segment registers when the descriptor
tables are modified. If this is not done, an old segment descriptor cached in a segment
register might be used after its memory-resident version has been modified.
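The stale-descriptor hazard described above can be shown with a small Python model. The class name SegmentRegister, the dictionary-based descriptor table, and the field names are illustrative assumptions; the table is indexed by the upper 13 bits of the selector.

```python
class SegmentRegister:
    """Model of a segment register with a cached (invisible) descriptor copy."""

    def __init__(self):
        self.selector = 0
        self.cached = None

    def load(self, selector, table):
        """Load a selector and copy its descriptor from the table."""
        self.selector = selector
        self.cached = dict(table[selector >> 3])   # snapshot taken at load time

    def linear(self, offset):
        """Translate an offset using the cached base, without reading the table."""
        return self.cached["base"] + offset


table = {1: {"base": 0x1000, "limit": 0xFFFF}}
ds = SegmentRegister()
ds.load(0x08, table)

table[1]["base"] = 0x2000     # descriptor table is modified in memory
stale = ds.linear(0x10)       # register still translates with the old base

ds.load(0x08, table)          # software reloads the segment register
fresh = ds.linear(0x10)       # translation now uses the new base
```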
The segment selector contains a 13-bit index into one of the descriptor tables. The index
is scaled by 8 (the number of bytes in a segment descriptor) and added to the 32-bit base
address of the descriptor table. The base address comes from either the global descriptor
table register (GDTR) or the local descriptor table register (LDTR). These registers hold
the linear address of the beginning of the descriptor tables. A bit in the segment selector
specifies which table to use (see Figure A-4).
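The descriptor lookup arithmetic can be written out as a short Python sketch. It assumes the index occupies bits 15 to 3 of the selector and the table-selection bit is bit 2, with a clear bit selecting the GDT; the function and parameter names are illustrative.

```python
def descriptor_address(selector, gdtr_base, ldtr_base):
    """Return the linear address of the descriptor that a selector refers to."""
    index = (selector >> 3) & 0x1FFF      # 13-bit index into the descriptor table
    use_ldt = (selector >> 2) & 1         # table-selection bit: 0 = GDT, 1 = LDT
    table_base = ldtr_base if use_ldt else gdtr_base
    return (table_base + index * 8) & 0xFFFFFFFF   # each descriptor is 8 bytes


# Hypothetical table base addresses.
gdt_entry = descriptor_address(0x0008, gdtr_base=0x00100000, ldtr_base=0x00200000)
ldt_entry = descriptor_address(0x000C, gdtr_base=0x00100000, ldtr_base=0x00200000)
```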
Figure A-4
TI Bit Selects Descriptor Table
Figure A-5
Segment Translation
The translated address is the linear address (see Figure A-5). If paging is not used, the
translated address is also the physical address. If paging is used, a second level of address
translation produces the physical address. This translation is described in Section A.2.2.5.
A.2.2.4.1
Segment Registers
Each kind of memory reference is associated with a segment register. Code, data, and
stack references each access the segment specified by their segment register contents.
More segments can be made available by loading their segment selectors into these registers during program execution. Every segment register has a “visible” part and an “invisible” part (see Figure A-6). There are forms of the MOV instruction to load the visible part
of these segment registers. The invisible part is loaded by the microprocessor.
The operations that load these registers are instructions for application programs (described
in Chapter 2). There are two kinds of these instructions:
■ Direct load instructions such as the MOV, POP, LDS, LES, LFS, LGS, and LSS instructions. These instructions explicitly reference the segment registers.
■ Implied load instructions such as the far pointer versions of the CALL and JMP instructions. These instructions change the contents of the CS register as an incidental part of
their function.
Figure A-6
Segment Registers
When these instructions are used, the visible part of the segment register is loaded with a
segment selector. The microprocessor automatically fetches the base address, limit, type,
and other information from the descriptor table and loads the invisible part of the segment
register.
Because most instructions refer to segments whose selectors already have been loaded
into segment registers, the microprocessor can add the logical-address offset to the segment base address with no performance penalty.
A.2.2.4.2
Segment Selectors
A segment selector points to the information that defines a segment, called a segment
descriptor. A program may have more segments than the six whose segment selectors
occupy segment registers. When this is true, the program uses forms of the MOV instruction
to change the contents of these registers when it needs to access a new segment.
A segment selector identifies a segment descriptor by specifying a descriptor table and a
descriptor within that table. Segment selectors are visible to application programs as a part
of a pointer variable, but the values of selectors are usually assigned or modified by link
editors or linking loaders, not application programs. Figure A-7 shows the format of a
segment selector.
Figure A-7. Segment Selector
■ Index: Selects one of 8192 descriptors in a descriptor table. The microprocessor multiplies the index value by 8 (the number of bytes in a segment descriptor) and adds the
result to the base address of the descriptor table (from the GDTR or LDTR register).
■ Table Indicator bit: Specifies the descriptor table to use. A clear bit selects the GDT; a
set bit selects the current LDT.
■ Requester Privilege Level: When this field contains a privilege level having a greater
value (i.e., less privileged) than the program, it overrides the program’s privilege level.
When a program uses a less privileged segment selector, memory accesses take place
at the lesser privilege level. This is used to guard against a security violation in which a
less privileged program uses a more privileged program to access protected data.
For example, system utilities or device drivers must run with a high level of privilege in order
to access protected facilities such as the control registers of peripheral interfaces. But they
must not interfere with other protected facilities, even if a request to do so is received from
a less privileged program. If a program requested reading a sector of disk into memory
occupied by a more privileged program, such as the operating system, the RPL can be
used to generate a general-protection exception when the less privileged segment selector
is used. This exception occurs even though the program using the segment selector would
have a sufficient privilege level to perform the operation on its own.
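The selector fields and the descriptor lookup described above can be sketched in C. This is an illustration only, not code from this manual: the function names are hypothetical, and the bit positions (RPL in bits 0 and 1, Table Indicator in bit 2, Index in bits 3 through 15) are assumed to match Figure A-7.

```c
#include <stdint.h>

/* Hypothetical helpers; bit positions are assumed to match Figure A-7. */
static inline unsigned sel_rpl(uint16_t sel)   { return sel & 0x3u; }        /* bits 0-1  */
static inline unsigned sel_ti(uint16_t sel)    { return (sel >> 2) & 0x1u; } /* bit 2     */
static inline unsigned sel_index(uint16_t sel) { return sel >> 3; }          /* bits 3-15 */

/* Linear address of the descriptor: table base + index * 8. */
static inline uint32_t descriptor_address(uint16_t sel,
                                          uint32_t gdt_base, uint32_t ldt_base)
{
    uint32_t table_base = sel_ti(sel) ? ldt_base : gdt_base;
    return table_base + (uint32_t)sel_index(sel) * 8u;
}

/* A null selector has an index of 0 and a table indicator of 0. */
static inline int sel_is_null(uint16_t sel) { return (sel & 0xFFFCu) == 0; }

/* The numerically greater (less privileged) of CPL and RPL applies. */
static inline unsigned effective_privilege(unsigned cpl, unsigned rpl)
{
    return cpl > rpl ? cpl : rpl;
}
```

For example, a selector of 001BH decodes as index 3, GDT, RPL 3, so with a GDT base of 1000H the descriptor would be fetched from linear address 1018H.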
Because the first entry of the GDT is not used by the microprocessor, a selector that has
an index of 0 and a table indicator of 0 (i.e., a selector that points to the first entry of the
GDT) is used as a “null selector.” The microprocessor does not generate an exception when
a segment register (other than the CS or SS registers) is loaded with a null selector. It does,
however, generate an exception when a segment register holding a null selector is used
to access memory. This feature can be used to initialize unused segment registers.
A.2.2.4.3
Segment Descriptors
A segment descriptor is a data structure in memory that provides the microprocessor with
the size and location of a segment, as well as control and status information. Descriptors
are typically created by compilers, linkers, loaders, or the operating system, but not application programs. Figure A-8 illustrates the general descriptor format.
Figure A-8. Segment Descriptor
All types of segment descriptors take one of these formats:
■ Base: Defines the location of the segment within the 4-Gbyte linear address space.
The microprocessor puts together the three base address fields to form a single 32-bit
value. Segment base values should be aligned to 16-byte boundaries to allow programs
to maximize performance by aligning code/data on 16-byte boundaries.
■ Granularity bit: Turns on scaling of the limit field by a factor of 4096 (2^12). When the bit
is clear, the segment limit is interpreted in units of 1 byte; when set, the segment limit
is interpreted in units of 4 Kbytes (one page). Note that the twelve least-significant bits
of the address are not tested when scaling is used. For example, a limit of 0 with the
Granularity bit set results in valid offsets from 0 to 4095. Also note that only the Limit
field is affected. The base address remains byte-granular.
■ Limit: Defines the size of the segment. The microprocessor puts together the two limit
fields to form a 20-bit value. The microprocessor interprets the limit in one of two ways,
depending on the setting of the Granularity bit:
— If the Granularity bit is clear, the limit has a value from 1 byte to 1 Mbyte, in increments
of 1 byte.
— If the Granularity bit is set, the Limit has a value from 4 Kbytes to 4 Gbytes, in
increments of 4 Kbytes.
■ Offset: For most segments, a logical address may have an offset ranging from 0 to the
limit. Other offsets generate exceptions. Expand-down segments reverse the sense of
the Limit field; they may be addressed with any offset except those from 0 to the limit
(see the Type field, below). This is done to allow segments to be created in which
decreasing the value held in the Limit field allocates new memory at the bottom of the
segment’s address space, rather than at the top. Expand-down segments are intended
to hold stacks, but it is not necessary to use them. If a stack is going to be put in a
segment that does not need to change size, it can be a normal data segment.
■ S bit: Determines whether a given segment is a system segment or a code or data
segment. If the S bit is set, then the segment is either a code or a data segment. If it is
clear, then the segment is a system segment.
■ D bit: The code segment D bit indicates the default length for operands and effective
addresses. If the D bit is set, then 32-bit operands and 32-bit effective addressing modes
are assumed. If it is clear, then 16-bit operands and addressing modes are assumed.
■ Type: The interpretation of this field depends on whether the segment descriptor is for
an application segment or a system segment. System segments have a slightly different
descriptor format. The Type field of a memory descriptor specifies the kind of access
that may be made to a segment, and its direction of growth (see Table A-1).
Table A-1. Application Segment Types
Number  E  W  A  Descriptor Type  Description
0       0  0  0  Data             Read-Only
1       0  0  1  Data             Read-Only, accessed
2       0  1  0  Data             Read/Write
3       0  1  1  Data             Read/Write, accessed
4       1  0  0  Data             Read-Only, expand-down
5       1  0  1  Data             Read-Only, expand-down, accessed
6       1  1  0  Data             Read/Write, expand-down
7       1  1  1  Data             Read/Write, expand-down, accessed
Number  C  R  A  Descriptor Type  Description
8       0  0  0  Code             Execute-Only
9       0  0  1  Code             Execute-Only, accessed
10      0  1  0  Code             Execute/Read
11      0  1  1  Code             Execute/Read, accessed
12      1  0  0  Code             Execute-Only, conforming
13      1  0  1  Code             Execute-Only, conforming, accessed
14      1  1  0  Code             Execute/Read, conforming
15      1  1  1  Code             Execute/Read, conforming, accessed
For data segments, the three lowest bits of the type field can be interpreted as expand-down (E), write-enable (W), and accessed (A). For code segments, the three lowest bits
of the type field can be interpreted as conforming (C), read-enable (R), and accessed (A).
Data segments can be read-only or read/write. Stack segments are data segments that
must be read/write. Loading the SS register with a segment selector for any other type of
segment generates a general-protection exception. If the stack segment needs to be able
to change size, it can be an expand-down data segment. The meaning of the segment limit
is reversed for an expand-down segment. While an offset in the range from 0 to the segment
limit is valid for other kinds of segments (outside this range a general-protection exception
is generated), in an expand-down segment these offsets are the ones that generate exceptions. The valid offsets in an expand-down segment are those that generate exceptions
in the other kinds of segments. Expand-up segments must be addressed by offsets that
are equal to or less than the segment limit. Offsets into expand-down segments always
must be greater than the segment limit. This interpretation of the segment limit causes
memory space to be allocated at the bottom of the segment when the segment limit is
decreased, which is correct for stack segments because they grow toward lower addresses.
If the stack is given a segment that does not change size, it does not need to be an expand-down segment.
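The limit check for expand-up and expand-down segments, including the effect of the Granularity bit, can be sketched as follows. The function names are hypothetical, only a 1-byte access is considered, and the upper bound of an expand-down segment is assumed to be FFFFFFFFH (a 32-bit segment), which this section does not state.

```c
#include <stdint.h>
#include <stdbool.h>

/* Expand the 20-bit Limit field into a byte-granular limit. */
static inline uint32_t effective_limit(uint32_t limit20, bool granularity)
{
    limit20 &= 0xFFFFFu;
    /* With G set, the twelve least-significant bits are not tested,
       so they behave as if they were all ones. */
    return granularity ? ((limit20 << 12) | 0xFFFu) : limit20;
}

/* True when a 1-byte access at 'offset' passes the limit check.
   Assumes an expand-down segment extends up to FFFFFFFFH. */
static inline bool offset_is_valid(uint32_t offset, uint32_t limit20,
                                   bool granularity, bool expand_down)
{
    uint32_t limit = effective_limit(limit20, granularity);
    return expand_down ? (offset > limit) : (offset <= limit);
}
```

With a Limit field of 0 and the Granularity bit set, offsets 0 through 4095 pass the check in an expand-up segment; in an expand-down segment the same offsets are exactly the ones that fail.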
Code segments can be execute-only or execute/read. An execute/read segment might be
used, for example, when constants have been placed with instruction code in a ROM. In
this case, the constants can be read either by using an instruction with a CS override prefix
or by placing a segment selector for the code segment in a segment register for a data
segment.
Code segments can be either conforming or non-conforming. A transfer of execution into
a more privileged conforming segment keeps the current privilege level. A transfer into a
non-conforming segment at a different privilege level results in a general-protection exception, unless a call gate or task gate is used. System utilities that do not access protected facilities, such
as data-conversion functions (e.g., EBCDIC/ASCII translation, Huffman encoding/decoding, math library), and handlers for some types of exceptions (e.g., Divide Error, INTO-detected overflow,
and BOUND range exceeded) may be loaded in conforming code segments.
The Type field also reports whether the segment has been accessed. Segment descriptors
initially report a segment as having been accessed. If the Type field then is set to a value
for a segment that has not been accessed, the microprocessor restores the value if the
segment is accessed. By clearing and testing the Low bit of the Type field, software can
monitor segment usage (the Low bit of the Type field also is called the Accessed bit).
For example, a program development system might clear all of the Accessed bits for the
segments of an application. If the application crashes, the states of these bits can be used
to generate a map of all the segments accessed by the application. Unlike the breakpoints
provided by the debugging mechanism, the usage information applies to segments rather
than physical addresses.
The microprocessor may update the Type field when a segment is accessed, even if the
access is a read cycle. If the descriptor tables have been put in ROM, it may be necessary
for hardware to prevent the ROM from being enabled onto the data bus during a write cycle.
It also may be necessary to return the READY signal to the microprocessor when a write
cycle to ROM occurs, otherwise the cycle does not terminate. These features of the hardware design are necessary for using ROM-based descriptor tables with the Am386DX
microprocessor, which always sets the Accessed bit when a segment descriptor is loaded.
The Am486 microprocessor, however, only sets the Accessed bit if it is not already set.
Writes to descriptor tables in ROM can be avoided by setting the Accessed bits in every
descriptor.
■ DPL (Descriptor Privilege Level): Defines the privilege level of the segment. This is used
to control access to the segment, using the protection mechanism described in
Section A.2.3.
■ Segment-Present bit: If this bit is clear, the microprocessor generates a segment-not-present exception when a selector for the descriptor is loaded into a segment register.
This is used to detect access to segments that have become unavailable. A segment
can become unavailable when the system needs to create free memory. Items in memory, such as character fonts or device drivers, which currently are not being used are
deallocated. An item is deallocated by marking the segment “not present” (this is done
by clearing the Segment-Present bit). The memory occupied by the segment then can
be put to another use. The next time the deallocated item is needed, the segment-not-present exception will indicate the segment needs to be loaded into memory. When this
kind of memory management is provided in a manner invisible to application programs,
it is called virtual memory. A system may maintain a total amount of virtual memory far
larger than physical memory by keeping only a few segments present in physical memory
at any one time.
Figure A-9 shows the format of a descriptor when the Segment-Present bit is clear. When
this bit is clear, the operating system is free to use the locations marked Available to store
its own data, such as information regarding the whereabouts of the missing segment.
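The way the split Base and Limit fields are reassembled, together with the control bits described above, can be sketched in C. The structure and function names are hypothetical. The bit positions follow the standard x86 descriptor layout and are assumed to match Figure A-8, which is not reproduced in this text; `lo` and `hi` stand for the low and high doublewords of the 8-byte descriptor.

```c
#include <stdint.h>

/* Decoded view of an 8-byte segment descriptor (hypothetical type). */
struct seg_desc {
    uint32_t base;     /* 32-bit base built from three fields */
    uint32_t limit;    /* 20-bit limit built from two fields  */
    unsigned type;     /* 4-bit Type field (see Table A-1)    */
    unsigned s;        /* 1 = code or data, 0 = system        */
    unsigned dpl;      /* Descriptor Privilege Level, 0 to 3  */
    unsigned present;  /* Segment-Present bit                 */
    unsigned d;        /* D bit                               */
    unsigned g;        /* Granularity bit                     */
};

/* Bit positions assumed from the standard layout (Figure A-8). */
static inline struct seg_desc decode_descriptor(uint32_t lo, uint32_t hi)
{
    struct seg_desc d;
    d.base    = (lo >> 16) | ((hi & 0xFFu) << 16) | (hi & 0xFF000000u);
    d.limit   = (lo & 0xFFFFu) | (hi & 0x000F0000u);
    d.type    = (hi >> 8) & 0xFu;
    d.s       = (hi >> 12) & 1u;
    d.dpl     = (hi >> 13) & 3u;
    d.present = (hi >> 15) & 1u;
    d.d       = (hi >> 22) & 1u;
    d.g       = (hi >> 23) & 1u;
    return d;
}
```

When the Segment-Present bit decodes as 0, the remaining fields should be treated as operating-system data rather than as a base and limit (see Figure A-9).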
Figure A-9. Segment Descriptor (Segment Not Present)
A.2.2.4.4
Segment Descriptor Tables
A segment descriptor table is an array of segment descriptors. There are two kinds of
descriptor tables:
■ The global descriptor table (GDT)
■ The local descriptor tables (LDT)
There is one GDT for all tasks, and an LDT for each task being run (see Figure A-10).
A descriptor table is variable in length and may contain up to 8192 (2^13) descriptors. The
first descriptor in the GDT is not used by the microprocessor. A segment selector to this
“null descriptor” does not generate an exception when loaded into a segment register, but
it always generates an exception when an attempt is made to access memory using the
descriptor. By initializing the segment registers with this segment selector, accidental reference to unused segment registers can be guaranteed to generate an exception.
Figure A-10. Descriptor Tables
Figure A-11. Pseudo-Descriptor Format
A.2.2.4.5
Descriptor Table Base Registers
The microprocessor finds the global descriptor table (GDT) and interrupt descriptor table
(IDT) using the GDTR and IDTR registers. These registers hold 32-bit base addresses for
tables in the linear address space. They also hold 16-bit limit values for the size of these
tables. When the registers are loaded or stored, a 48-bit “pseudo-descriptor” is accessed
in memory (see Figure A-11). The GDT and IDT should be aligned on a 16-byte boundary
to maximize performance due to cache line fills. The limit value is expressed in bytes. As
with segments, the limit value is added to the base address to get the address of the last
valid byte. A limit value of 0 results in exactly one valid byte. Because segment descriptors
are always 8 bytes, the limit should always be one less than an integral multiple of eight
(that is, 8N – 1). The LGDT and SGDT instructions load and store the GDTR register, respectively; the
LIDT and SIDT instructions load and store the IDTR register.
A third descriptor table is the local descriptor table (LDT). It is identified by a 16-bit segment
selector held in the LDTR register. The LLDT and SLDT instructions load and store the
segment selector in the LDTR register, respectively. The LDTR register also holds the base address and
limit for the LDT, but these are loaded automatically by the microprocessor from the segment
descriptor for the LDT. The LDT should be aligned on a 16-byte boundary to maximize
performance due to cache line fills.
Alignment check faults may be generated by storing a pseudo-descriptor in user mode
(privilege level 3). User-mode programs normally do not store pseudo-descriptors, but the
possibility of generating an alignment check fault in this way can be avoided by placing the
pseudo-descriptor at an odd word address (i.e., an address which is 2 MOD 4). This causes
the microprocessor to store an aligned word, followed by an aligned doubleword.
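The 8N – 1 rule for table limits and the 2 MOD 4 placement of a pseudo-descriptor reduce to two small checks, sketched below. The function names are hypothetical, and the sketch assumes the 16-bit limit occupies the first two bytes of the pseudo-descriptor with the 32-bit base following it, as Figure A-11 is understood to show.

```c
#include <stdint.h>
#include <stdbool.h>

/* Limit value for a table holding n 8-byte descriptors (1 <= n <= 8192):
   the offset of the last valid byte, that is, 8N - 1. */
static inline uint16_t table_limit(unsigned n)
{
    return (uint16_t)(8u * n - 1u);
}

/* True when 'addr' is 2 MOD 4. The 16-bit limit then sits on a word
   boundary and the 32-bit base that follows it (at addr + 2) sits on
   a doubleword boundary. */
static inline bool pseudo_descriptor_aligned(uint32_t addr)
{
    return (addr & 3u) == 2u;
}
```

A full table of 8192 descriptors gives a limit of FFFFH, the largest value the 16-bit limit field can hold.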
A.2.2.5
Page Translation
A linear address is a 32-bit address into a uniform, unsegmented address space. This
address space may be a large physical address space (i.e., an address space composed
of 4 Gbytes of RAM), or paging can be used to simulate this address space using a small
amount of RAM and some disk storage. When paging is used, a linear address is translated
into its corresponding physical address or an exception is generated. The exception gives
the operating system a chance to read the page from disk (perhaps sending a different
page out to disk in the process), then restart the instruction that generated the exception.
Paging differs from segmentation by its use of small, fixed-size pages. Unlike segments,
which vary in size depending on the data structures they hold, Am486 microprocessor
pages are always 4 Kbytes. If segmentation is the only form of address translation that is
used, a data structure present in physical memory has all of its parts in memory. If paging
is used, a data structure may be partly in memory and partly in disk storage.
Information that maps linear addresses into physical addresses and exceptions is held in
data structures in memory called page tables. As with segmentation, this information is
cached in microprocessor registers to minimize the number of bus cycles required for
address translation. Unlike segmentation, these microprocessor registers are completely
invisible to application programs. For testing purposes, however, these registers are visible
to programs running with maximum privileges.
The paging mechanism treats the 32-bit linear address as having three parts, two 10-bit
indexes into the page tables and a 12-bit offset into the page addressed by the page tables.
Because both the virtual pages in the linear address space and the physical pages of
memory are aligned to 4-Kbyte page boundaries, there is no need to modify the Low 12
bits of the address. These 12 bits pass straight through the paging hardware, whether
paging is enabled or not. Note that this is different from segmentation, because segments
can start at any byte address.
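The three-way split of a linear address can be written as three shift-and-mask helpers. The names are hypothetical; the field widths (10, 10, and 12 bits) are those given above.

```c
#include <stdint.h>

/* Upper 10 bits: index into the page directory. */
static inline uint32_t lin_directory(uint32_t linear) { return linear >> 22; }

/* Middle 10 bits: index into the second-level page table. */
static inline uint32_t lin_table(uint32_t linear) { return (linear >> 12) & 0x3FFu; }

/* Low 12 bits: offset within the 4-Kbyte page, passed through unchanged. */
static inline uint32_t lin_offset(uint32_t linear) { return linear & 0xFFFu; }
```

For example, linear address 00403025H splits into directory index 1, table index 3, and offset 025H.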
The upper 20 bits of the address are used to index into the page tables. If every page in
the linear address space were mapped by a single page table in RAM, 4 Mbytes would be
needed. This is not done. Instead, two levels of page tables are used. The top level page
table is called the page directory. It maps the upper 10 bits of the linear address to the
second level of page tables. The second level of page tables maps the middle 10 bits of
the linear address to the base address of a page in physical memory (called a page frame
address).
An exception may be generated based on the contents of the page table or the page
directory. An exception gives the operating system a chance to bring in a page table from
disk storage. By allowing the second-level page tables to be sent to disk, the paging mechanism can support mapping of the entire linear address space using only a few pages in
memory.
The CR3 register holds the page frame address of the page directory. For this reason, it
also is called the Page Directory Base Register or PDBR. The upper 10 bits of the linear
address are scaled by four (the number of bytes in a page table entry) and added to the
value in the PDBR register to get the physical address of an entry in the page directory.
Because the page frame address is always clear in its lowest 12 bits, this addition is
performed by concatenation (replacement of the Low 12 bits with the scaled index).
When the entry in the page directory is accessed, several checks are performed. Exceptions
may be generated if the page table it refers to is protected or is not present in memory. If no exception is
generated, the upper 20 bits of the page directory entry are used as the page frame address
of a second-level page table. The middle 10 bits of the linear address are scaled by four
(again, the size of a page table entry) and concatenated with the page frame address to
get the physical address of an entry in the second-level page table.
Again, access checks are performed and exceptions may be generated. If no exception
occurs, the upper 20 bits of the second-level page table entry are concatenated with the
lowest 12 bits of the linear address to form the physical address of the operand (data) in
memory.
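The two-level walk described above can be modeled in software. The following is a minimal sketch, not a description of the hardware implementation: protection checks and the TLB are omitted, the Present bit is assumed to be bit 0 of an entry (per Figure A-14), and `read_phys_fn`, `translate`, `demo_read`, and the `XLATE_FAULT` sentinel are hypothetical names introduced for illustration.

```c
#include <stdint.h>

#define PTE_PRESENT 0x00000001u   /* bit 0; position assumed from Figure A-14 */
#define FRAME_MASK  0xFFFFF000u   /* upper 20 bits: page frame address        */
#define XLATE_FAULT 0xFFFFFFFFu   /* sketch-only marker for "page fault"; a
                                     real design would report it separately   */

/* Hypothetical accessor: reads one 32-bit entry at a physical address. */
typedef uint32_t (*read_phys_fn)(uint32_t phys_addr);

static inline uint32_t translate(uint32_t cr3, uint32_t linear,
                                 read_phys_fn read_phys)
{
    /* Directory entry: PDBR frame concatenated with DIRECTORY index * 4. */
    uint32_t pde = read_phys((cr3 & FRAME_MASK) | ((linear >> 22) << 2));
    if (!(pde & PTE_PRESENT))
        return XLATE_FAULT;

    /* Second-level entry: table frame concatenated with TABLE index * 4. */
    uint32_t pte = read_phys((pde & FRAME_MASK)
                             | (((linear >> 12) & 0x3FFu) << 2));
    if (!(pte & PTE_PRESENT))
        return XLATE_FAULT;

    /* Physical address: page frame concatenated with the 12-bit OFFSET. */
    return (pte & FRAME_MASK) | (linear & 0xFFFu);
}

/* Demo memory: page directory at 0000H, one page table at 1000H. */
static inline uint32_t demo_read(uint32_t phys_addr)
{
    if (phys_addr == 0x0000u + 1u * 4u) return 0x00001000u | PTE_PRESENT;
    if (phys_addr == 0x1000u + 3u * 4u) return 0x00ABC000u | PTE_PRESENT;
    return 0u;                     /* every other entry is not present */
}
```

With this demo memory and CR3 equal to 0, linear address 00403025H translates to physical address 00ABC025H, while any address whose directory or table entry is not present yields the fault marker.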
Although this process may seem complex, it requires very little overhead. The microprocessor has a cache for page table entries called the Translation Lookaside Buffer (TLB).
The TLB satisfies most requests for reading the page tables. Extra bus cycles occur only
when a new page is accessed. The page size (4 Kbytes) is large enough so that very few
bus cycles are made to the page tables, compared to the number of bus cycles made to
instructions and data. At the same time, the page size is small enough to make efficient
use of memory. (No matter how small a data structure is, it occupies at least one page of
memory.)
A.2.2.5.1
PG Bit Enables Paging
If paging is enabled, a second stage of address translation is used to generate the physical
address from the linear address. If paging is not enabled, the linear address is used as the
physical address. Paging is enabled when bit 31 (the PG bit) of the CR0 register is set.
This bit usually is set by the operating system during software initialization. The PG bit must
be set if the operating system is running more than one program in Virtual 8086 Mode or
if demand-paged virtual memory is used.
A.2.2.5.2
Linear Address
Figure A-12 shows the format of a linear address.
Figure A-12. Linear Address Format
Figure A-13 shows how the microprocessor translates the DIRECTORY, TABLE, and OFFSET fields of a linear address into the physical address using two levels of page tables.
The paging mechanism uses the DIRECTORY field as an index into a page directory, the
TABLE field as an index into the page table determined by the page directory, and the
OFFSET field to address an operand within the page specified by the page table.
Figure A-13. Page Translation
A.2.2.5.3
Page Tables
A page table is an array of 32-bit entries. A page table is itself a page, and contains 4096
bytes of memory or, at most, 1K 32-bit entries. All pages, including page directories and
page tables, are aligned to 4-Kbyte boundaries.
A page of memory uses a two-tier reference system. The top tier is the page directory. The
page directory addresses up to 1K or 2^10 page tables, the second tier. A page table addresses up to 1K or 2^10 pages in physical memory. Therefore, one page directory can
address 1M or 2^20 pages. Because each page contains 4K or 2^12 bytes, one page directory
can span the entire linear address space of the Am486 microprocessor (4G or 2^32 bytes).
The physical address of the current page directory is stored in the CR3 register, also called
the Page Directory Base Register (PDBR). Memory management software has the option
of using one page directory for all tasks, one page directory for each task, or some combination of the two.
Figure A-14. Page Table Entry Format
A.2.2.5.4
Page Table Entries
Entries in either level of page tables have the same format, except that the page directory
has no Dirty bit. Figure A-14 illustrates this format. The bit position of the D bit is reserved
for future AMD use.
A.2.2.5.5
Page Frame Address
The page frame address is the base address of a page. In a page table entry, the upper
20 bits are used to specify a page frame address, and the lowest 12 bits specify control
and status bits for the page. In a page directory, the page frame address is the address of
a page table. In a second-level page table, the page frame address is the address of a
page containing instructions or data.
A.2.2.5.6
Present Bit
The Present bit indicates whether the page frame address in a page table entry maps to a
page in physical memory. When set, the page is in memory.
When the Present bit is clear, the page is not in memory, and the rest of the page table
entry is available for the operating system, for example, to store information regarding the
whereabouts of the missing page. Figure A-15 illustrates the format of a page table entry
when the Present bit is clear.
Figure A-15. Page Table Entry Format for a Not-Present Page
If the Present bit is clear in either level of page tables when an attempt is made to use a
page table entry for address translation, a page-fault exception is generated. In systems
that support demand-paged virtual memory, the following sequence of events then occurs:
1. The operating system copies the page from disk storage into physical memory.
2. The operating system loads the page frame address into the page table entry and sets
its Present bit. Other bits, such as the R/W bit, may be set as well.
3. Because a copy of the old page table entry may still exist in the translation lookaside
buffer (TLB), the operating system empties it.
4. The program that caused the exception is then restarted.
Since there is no Present bit in CR3 to indicate when the page directory is not resident in
memory, the page directory pointed to by CR3 should always be present in physical memory.
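The demand-paging sequence in steps 1 through 4 can be sketched as a handler body. Everything named here is hypothetical: `read_page_from_disk` and `flush_tlb` stand in for operating-system services, and the sketch assumes the operating system chose to keep a disk slot number in bits 1 through 31 of the not-present entry. The Present and R/W bit positions (bits 0 and 1) are assumed from Figure A-14.

```c
#include <stdint.h>

#define PTE_PRESENT 0x1u          /* bit 0; positions assumed from Figure A-14 */
#define PTE_RW      0x2u          /* bit 1 */
#define FRAME_MASK  0xFFFFF000u

static unsigned tlb_flushes;                /* lets the sketch be observed   */
static uint32_t demo_entry = 42u << 1;      /* not present, disk slot 42     */

/* Hypothetical stand-ins for operating-system services. */
static inline void read_page_from_disk(uint32_t disk_slot, uint32_t frame)
{
    (void)disk_slot; (void)frame;           /* a real system performs I/O    */
}
static inline void flush_tlb(void) { tlb_flushes++; }

/* Steps 1 to 3. Returning from the exception handler then restarts the
   program that caused the exception (step 4). */
static inline void service_page_fault(uint32_t *entry, uint32_t free_frame,
                                      int writable)
{
    read_page_from_disk(*entry >> 1, free_frame);          /* step 1 */
    *entry = (free_frame & FRAME_MASK) | PTE_PRESENT       /* step 2 */
           | (writable ? PTE_RW : 0u);
    flush_tlb();                                           /* step 3 */
}
```

After `service_page_fault(&demo_entry, 0x00ABC000, 1)` the entry holds 00ABC003H: the page frame address with the Present and R/W bits set.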
A.2.2.5.7
Accessed and Dirty Bits
These bits provide data about page usage in both levels of page tables. The Accessed bit
is used to report read or write access to a page or second-level page table. The Dirty bit is
used to report write access to a page.
With the exception of the Dirty bit in a page directory entry, these bits are set by the hardware;
however, the microprocessor does not clear either of these bits. The microprocessor sets
the Accessed bits in both levels of page tables before a read or write operation to a page.
The microprocessor sets the Dirty bit in the second-level page table before a write operation
to an address mapped by that page table entry. The Dirty bit in directory entries is undefined.
The operating system may use the Accessed bit when it needs to create some free memory
by sending a page or second-level page table to disk storage. By periodically clearing the
Accessed bits in the page tables, it can see which pages have been used recently. Pages
that have not been used are candidates for sending out to disk.
The operating system may use the Dirty bit when a page is sent back to disk. By clearing
the Dirty bit when the page is brought into memory, the operating system can see if it has
received any write access. If there is a copy of the page on disk and the copy in memory
has not received any writes, there is no need to update disk from memory.
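One way an operating system might use these bits when choosing pages to send to disk is sketched below. The helper names are hypothetical, and the bit positions (Present in bit 0, Accessed in bit 5, Dirty in bit 6) are assumed to match Figure A-14.

```c
#include <stdint.h>
#include <stdbool.h>

#define PTE_PRESENT  0x01u   /* bit 0; positions assumed from Figure A-14 */
#define PTE_ACCESSED 0x20u   /* bit 5 */
#define PTE_DIRTY    0x40u   /* bit 6 */

/* Start of a sampling interval: software clears the Accessed bit,
   because the microprocessor sets it but never clears it. */
static inline uint32_t pte_clear_accessed(uint32_t pte)
{
    return pte & ~PTE_ACCESSED;
}

/* End of the interval: a present page that was not touched is a
   candidate for sending out to disk. */
static inline bool is_eviction_candidate(uint32_t pte)
{
    return (pte & PTE_PRESENT) && !(pte & PTE_ACCESSED);
}

/* A page needs to be written back only if it has received a write
   since the Dirty bit was cleared. */
static inline bool needs_writeback(uint32_t pte)
{
    return (pte & PTE_DIRTY) != 0;
}
```

Because translations may be cached in the TLB, an operating system that clears these bits would normally also flush the affected TLB entries, as described in Section A.2.2.5.10.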
A.2.2.5.8
Read/Write and User/Supervisor Bits
The Read/Write and User/Supervisor bits are used for protection checks applied to pages,
which the microprocessor performs at the same time as address translation. See Section
A.2.3.1 for more information on protection.
A.2.2.5.9
Page-Level Cache Control Bits
The PCD and PWT bits are used for page-level cache management. Software can control
the caching of individual pages or second-level page tables using these bits.
A.2.2.5.10
Translation Lookaside Buffer (TLB)
The microprocessor stores the most recently used page table entries in an on-chip cache
called the translation lookaside buffer or TLB. Most paging is performed using the contents
of the TLB. Bus cycles to the page tables are performed only when a new page is used.
The TLB is invisible to application programs, but not to operating systems. Operating system
programmers must flush the TLB (dispose of its page table entries) when entries in the
page tables are changed. If this is not done, old data that has not received the changes
might get used for address translation. A change to an entry for a page that is not present
in memory does not require flushing the TLB, because entries for not-present pages are
not cached.
The TLB is flushed when the CR3 register is loaded. The CR3 register can be loaded in
either of two ways:
■ Explicit loading using MOV instructions, such as: MOV CR3, EAX
■ Implicit loading by a task switch that changes the contents of the CR3 register
An individual entry in the TLB can be flushed using an INVLPG instruction. This is useful
when the mapping of an individual page is changed.
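Why a stale TLB entry matters can be seen in a small software model. This is a sketch for illustration only: it is direct-mapped for simplicity and does not represent the organization of the on-chip TLB, the 32-slot size is chosen arbitrarily, and all names are hypothetical. `tlb_flush_page` models the effect of INVLPG and `tlb_flush_all` models the effect of loading CR3.

```c
#include <stdint.h>
#include <stdbool.h>

#define TLB_SLOTS 32u            /* size chosen for this sketch only   */
#define TLB_MISS  0xFFFFFFFFu    /* marker: translation is not cached  */

struct tlb_slot { bool valid; uint32_t page; uint32_t frame; };
static struct tlb_slot tlb[TLB_SLOTS];

/* Cache the translation for the page containing 'linear';
   returns the slot that was used. */
static inline unsigned tlb_fill(uint32_t linear, uint32_t frame)
{
    uint32_t page = linear >> 12;
    unsigned slot = page % TLB_SLOTS;
    tlb[slot].valid = true;
    tlb[slot].page  = page;
    tlb[slot].frame = frame & 0xFFFFF000u;
    return slot;
}

/* Return the cached frame address, or TLB_MISS when the page tables
   would have to be read. */
static inline uint32_t tlb_lookup(uint32_t linear)
{
    uint32_t page = linear >> 12;
    const struct tlb_slot *s = &tlb[page % TLB_SLOTS];
    return (s->valid && s->page == page) ? s->frame : TLB_MISS;
}

/* Models INVLPG: drop the entry for one page, if it is cached. */
static inline bool tlb_flush_page(uint32_t linear)
{
    uint32_t page = linear >> 12;
    struct tlb_slot *s = &tlb[page % TLB_SLOTS];
    bool hit = s->valid && s->page == page;
    if (hit)
        s->valid = false;
    return hit;
}

/* Models a load of CR3: drop every entry. */
static inline void tlb_flush_all(void)
{
    for (unsigned i = 0; i < TLB_SLOTS; i++)
        tlb[i].valid = false;
}
```

In this model, once a translation has been filled, `tlb_lookup` keeps returning the old frame even if software rewrites the page table entry in memory; only a flush forces the next access to read the updated entry.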
A.2.2.6
Combining Segment and Page Translation
Figure A-16 summarizes both stages of translation from a logical address to a physical
address when paging is enabled. Options available in both stages of address translation
can be used to support several different styles of memory management.
Figure A-16. Combining Segment and Page Address Translation
A.2.2.6.1
Flat Model
When the Am486 microprocessor is used to run software written without segments, it may
be desirable to remove the segmentation features of the Am486 microprocessor. The
Am486 microprocessor does not have a mode bit for disabling segmentation, but the same
effect can be achieved by mapping the stack, code, and data spaces to the same range of
linear addresses. The 32-bit offsets used by Am486 microprocessor instructions can cover
the entire linear address space.
When paging is used, the segments can be mapped to the entire linear address space. If
more than one program is being run at the same time, the paging mechanism can be used
to give each program a separate address space.
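A flat model needs descriptors whose base is 0 and whose limit, with the Granularity bit set, reaches the top of the linear address space. The sketch below packs such descriptors. The helper names are hypothetical; the field positions follow the standard descriptor layout (assumed to match Figure A-8), and the access bytes 9AH (present, DPL 0, Execute/Read code) and 92H (present, DPL 0, Read/Write data) are example choices built from Table A-1.

```c
#include <stdint.h>

/* Pack a base, a 20-bit limit, an access byte (P, DPL, S, Type), and the
   G and D bits into a 64-bit descriptor image. Hypothetical helper. */
static inline uint64_t make_descriptor(uint32_t base, uint32_t limit20,
                                       uint8_t access, int g, int d)
{
    uint32_t lo = (limit20 & 0xFFFFu) | (base << 16);
    uint32_t hi = ((base >> 16) & 0xFFu)
                | ((uint32_t)access << 8)
                | (limit20 & 0x000F0000u)
                | ((uint32_t)(d != 0) << 22)
                | ((uint32_t)(g != 0) << 23)
                | (base & 0xFF000000u);
    return ((uint64_t)hi << 32) | lo;
}

/* Base 0, limit FFFFFH in 4-Kbyte units: offsets 0 through FFFFFFFFH. */
static inline uint64_t flat_code_descriptor(void)
{
    return make_descriptor(0u, 0xFFFFFu, 0x9Au, 1, 1);
}

/* Shared by data and stack references in a flat model. */
static inline uint64_t flat_data_descriptor(void)
{
    return make_descriptor(0u, 0xFFFFFu, 0x92u, 1, 1);
}
```

Loading CS with a selector for the code descriptor, and the data segment registers and SS with a selector for the data descriptor, maps code, data, and stack onto the same range of linear addresses.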
A.2.2.6.2
Segments Spanning Several Pages
The architecture allows segments that are larger than the size of a page (4 Kbytes). For
example, a large data structure may span thousands of pages. If paging were not used,
access to any part of the data structure would require the entire data structure to be present
in physical memory. With paging, only the page containing the part being accessed needs
to be in memory.
A.2.2.6.3
Pages Spanning Several Segments
Segments also may be smaller than the size of a page. If one of these segments is placed
in a page that is not shared with another segment, the extra memory is wasted. For example,
a small data structure, such as a 1-byte semaphore, occupies 4 Kbytes if it is placed in a
page by itself. If many semaphores are used, it is more efficient to pack them into a single
page.
A.2.2.6.4
Non-Aligned Page and Segment Boundaries
The architecture does not enforce any correspondence between the boundaries of pages
and segments. A page may contain the end of one segment and the beginning of another.
Likewise, a segment may contain the end of one page and the beginning of another.
A.2.2.6.5
Aligned Page and Segment Boundaries
Memory-management software may be simpler and more efficient if it enforces some alignment between page and segment boundaries. For example, if a segment that may fit in
one page is placed in two pages, there may be twice as much paging overhead to support
access to that segment.
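The cost of a misaligned segment can be estimated by counting the pages it touches. The function below is a hypothetical helper, assuming a segment of at least 1 byte that does not wrap past the top of the 4-Gbyte linear address space.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Number of 4-Kbyte pages touched by a segment of 'size' bytes
   (size >= 1) that starts at linear address 'base'. */
static inline uint32_t pages_spanned(uint32_t base, uint32_t size)
{
    uint32_t first = base / PAGE_SIZE;
    uint32_t last  = (base + size - 1u) / PAGE_SIZE;
    return last - first + 1u;
}
```

A 4-Kbyte segment based at 1000H occupies one page, but the same segment based at 1800H occupies two.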
A.2.2.6.6
Page-Table Per Segment
An approach to combining paging and segmentation that simplifies memory management
software is to give each segment its own page table (see Figure A-17). This gives the
segment a single entry in the page directory that provides the access control information
for paging the segment.
Figure A-17. Separate Page Tables for Each Segment
A.2.3
Internal System Protection
The internal system protection mechanism allows the programmer to prevent interference
between tasks. Protection can keep one task from overwriting the instructions or data of
another task. During program development, the protection mechanism can also give a
clearer picture of program bugs. When a program makes an unexpected reference to the
wrong memory space, the protection mechanism can block the event and report its
occurrence.
In end-user systems, the protection mechanism can guard against the possibility of software
failures caused by undetected program bugs. If a program fails, its effects can be confined
to a limited domain, protecting the operating system against damage. With the proper
exception routines, the system can record diagnostic information and attempt automatic
recovery.
Programmers can also apply protection to segments and pages. Two bits in a microprocessor register define the privilege level of the program currently running (called the current
privilege level or CPL). The CPL is checked during address translation for segmentation
and paging.
Although there is no control register or mode bit for turning off the protection mechanism,
the same effect can be achieved by assigning privilege level 0 (the highest level of privilege)
to all segment selectors, segment descriptors, and page table entries.
A.2.3.1
Segment-Level Protection
Protection provides the ability to limit the amount of interference that a malfunctioning
program can inflict on other programs and their data. Protection is a valuable aid in software
development because it allows software tools (operating system, debugger, etc.) to survive
in memory, undamaged. When an application program fails, the software is available to
report diagnostic messages and the debugger is available for post-mortem analysis of
memory and registers. In production, protection can make software more reliable by giving
the system an opportunity to initiate recovery procedures.
Each memory reference is checked to verify that it satisfies the protection checks. All checks
are made before the memory cycle is started; any violation prevents the cycle from starting
and results in an exception. Because checks are performed in parallel with address translation, there is no performance penalty. There are five protection checks:
n
Type check
n
Limit check
n
Restriction of addressable domain
n
Restriction of procedure entry points
n
Restriction of instruction set
A protection violation results in an exception. This chapter describes the protection violations that lead to exceptions.
A.2.3.2
Segment Descriptors and Protection
Figure A-18 shows the fields of a segment descriptor which are used by the protection
mechanism. Individual bits in the Type field also are referred to by the names of their
functions.
Figure A-18
Descriptor Fields Used for Protection
Protection parameters are placed in the descriptor when it is created. In general, application
programmers do not need to be concerned about protection parameters. When a program
loads a segment selector into a segment register, the microprocessor loads both the base
address of the segment and the protection information. The invisible part of each segment
register stores the base, limit, type, and privilege level. While this information is resident in
the segment register, subsequent protection checks on the same segment can be performed with no performance penalty.
A.2.3.2.1
Type Checking
In addition to the descriptors for application code and data segments, the Am486 microprocessor has descriptors for system segments and gates. These are data structures used
for managing tasks and exceptions/interrupts. Table A-2 lists all the types defined for system
segments and gates.
Note: Not all descriptors define segments; gate descriptors hold pointers to procedure entry
points.
Table A-2
System Segment and Gate Types
Type   Description
0      Reserved
1      Available 80286 TSS
2      LDT
3      Busy 80286 TSS
4      Call Gate
5      Task Gate
6      80286 Interrupt Gate
7      80286 Trap Gate
8      Reserved
9      Available Am486 processor TSS
10     Reserved
11     Busy Am486 processor TSS
12     Am486 processor Call Gate
13     Reserved
14     Am486 processor Interrupt Gate
15     Am486 processor Trap Gate
The Type fields of code and data segment descriptors include bits that further define the
purpose of the segment (see Figure A-18):
n
The Writable bit in a data-segment descriptor controls whether programs can write to
the segment.
n
The Readable bit in an executable-segment descriptor specifies whether programs can
read from the segment (e.g., to access constants stored in the code space). A readable,
executable segment may be read in two ways:
— With the CS register, by using a CS override prefix
— By loading a selector for the descriptor into a data-segment register (the DS, ES, FS,
or GS registers)
Type checking can detect programming errors due to attempts to use segments in ways
not intended by the programmer. The microprocessor examines type information under two
circumstances:
n
When a selector for a descriptor is loaded into a segment register. Certain segment
registers can contain only certain descriptor types; for example:
— The CS register only can be loaded with a selector for an executable segment.
— Selectors of executable segments that are not readable cannot be loaded into data-segment registers.
— Only selectors of writable data segments can be loaded into the SS register.
n
Certain segments can be used by instructions only in certain predefined ways; for
example:
— No instruction may write into an executable segment.
— No instruction may write into a data segment if the writable bit is not set.
— No instruction may read an executable segment unless the readable bit is set.
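The load-time type rules above can be sketched in C. This is only an illustration, not the processor's actual logic. The Type-field bit positions (bit 3 for executable, bit 1 for Writable or Readable) are assumed from the conventional descriptor layout, and the names `seg_reg` and `type_ok_for_load` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed Type-field bits for code and data segment descriptors. */
#define TYPE_EXECUTABLE 0x8u /* set: code segment; clear: data segment */
#define TYPE_RW         0x2u /* data: Writable bit; code: Readable bit */

/* Hypothetical names for the segment registers. */
enum seg_reg { REG_CS, REG_SS, REG_DS, REG_ES, REG_FS, REG_GS };

/* Returns true if a descriptor with this Type field may be loaded into
 * the given segment register (type check only, no privilege check). */
bool type_ok_for_load(enum seg_reg reg, uint8_t type)
{
    bool executable = (type & TYPE_EXECUTABLE) != 0;
    bool rw = (type & TYPE_RW) != 0;

    switch (reg) {
    case REG_CS:
        return executable;          /* CS: executable segments only */
    case REG_SS:
        return !executable && rw;   /* SS: writable data segments only */
    default:
        return !executable || rw;   /* DS/ES/FS/GS: no execute-only code */
    }
}
```

For example, `type_ok_for_load(REG_DS, 0x8)` rejects an execute-only code segment, while `type_ok_for_load(REG_DS, 0xA)` accepts a readable one.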
A.2.3.2.2
Limit Checking
The Limit field of a segment descriptor prevents programs from addressing outside the
segment. The effective value of the limit depends on the setting of the G bit (Granularity
bit). For data segments, the limit also depends on the E bit (Expansion Direction bit). The
E bit is a designation for one bit of the Type field, when referring to data segment descriptors.
When the G bit is clear, the limit is the value of the 20-bit Limit field in the descriptor. In this
case, the limit ranges from 0 to 0FFFFFh (2^20 – 1 or 1 Mbyte). When the G bit is set, the
microprocessor scales the value in the Limit field by a factor of 2^12. In this case, the limit
ranges from 0FFFh (2^12 – 1 or 4 Kbytes) to 0FFFFFFFFh (2^32 – 1 or 4 Gbytes).
Note: When scaling is used, the lower twelve bits of the address are not checked against
the limit; when the G bit is set and the segment limit is 0, valid offsets within the segment
are 0 through 4095.
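The scaling described above can be expressed as a short C function. This is a sketch under the stated rules only; the function name `effective_limit` is hypothetical.

```c
#include <stdint.h>

/* Effective byte limit derived from the 20-bit Limit field and the G bit. */
uint32_t effective_limit(uint32_t limit_field, int g_bit)
{
    limit_field &= 0xFFFFFu;                 /* keep only the 20-bit field */
    if (g_bit)
        /* 4-Kbyte units; the low 12 address bits are not checked. */
        return (limit_field << 12) | 0xFFFu;
    return limit_field;                      /* byte units */
}
```

With the G bit set and a Limit field of 0, the function returns 0FFFh, which matches the valid offset range of 0 through 4095 given in the note.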
For all types of segments except expand-down data segments (stack segments), the value
of the limit is one less than the size of the segment in bytes. The microprocessor causes
a general-protection exception in any of these cases:
n
Attempt to access a memory byte at an address > limit
n
Attempt to access a memory word at an address > (limit – 1)
n
Attempt to access a memory doubleword at an address > (limit – 3)
For expand-down data segments, the limit has the same function but is interpreted differently. In these cases, the range of valid offsets is from (limit + 1) to 2^32 – 1. An expand-down
segment has maximum size when the segment limit is 0.
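The limit rules for both segment directions can be sketched as a single C predicate. This is a simplified model: `limit` is assumed to be the effective (already scaled) limit, `size` is assumed to be 1, 2, or 4, and the name `limit_check` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if an access of `size` bytes at `offset` passes the limit check;
 * false corresponds to a general-protection exception. */
bool limit_check(uint32_t offset, uint32_t size, uint32_t limit, bool expand_down)
{
    uint64_t last = (uint64_t)offset + size - 1;  /* last byte touched */

    if (!expand_down)
        return last <= limit;                     /* valid: 0 .. limit */

    /* Expand-down: valid offsets run from limit + 1 to 2^32 - 1. */
    return offset > limit && last <= 0xFFFFFFFFull;
}
```

With a limit of 0FFFh, a doubleword access at offset 0FFCh passes, and one at 0FFDh fails because it exceeds (limit – 3).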
Limit checking catches programming errors such as runaway subscripts and invalid pointer
calculations. These errors are detected when they occur, so identification of the cause is
easier. Without limit checking, these errors could overwrite critical memory in another module, and the existence of these errors would not be discovered until the damaged module
crashed, an event that may occur long after the actual error. Protection can block these
errors and report their source.
In addition to limit checking on segments, there is limit checking on the descriptor tables.
The GDTR and IDTR registers contain a 16-bit limit value. It is used by the microprocessor
to prevent programs from selecting a segment descriptor outside the descriptor table. The
limit of a descriptor table identifies the last valid byte of the table. Because each descriptor
is 8 bytes long, a table that contains up to N descriptors should have a limit of 8N – 1.
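The 8N – 1 rule can be illustrated with two small C helpers. The names `table_limit` and `descriptor_in_table` are hypothetical, and the sketch assumes a table of at least one descriptor.

```c
#include <stdbool.h>
#include <stdint.h>

#define DESCRIPTOR_SIZE 8u

/* Limit value for a table holding n descriptors (n >= 1): 8N - 1. */
uint16_t table_limit(uint32_t n)
{
    return (uint16_t)(n * DESCRIPTOR_SIZE - 1u);
}

/* True if all 8 bytes of descriptor `index` lie within the table limit. */
bool descriptor_in_table(uint32_t index, uint16_t limit)
{
    return index * DESCRIPTOR_SIZE + (DESCRIPTOR_SIZE - 1u) <= limit;
}
```

A table of three descriptors therefore has a limit of 23; index 2 is the last valid entry and index 3 falls outside the table.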
A selector may be given a zero value. This refers to the first descriptor in the GDT, which
is not used. Although this selector may be loaded into a segment register, any attempt
to reference memory using this descriptor generates a general-protection exception.
A.2.3.2.3
Privilege Levels
The protection mechanism recognizes four privilege levels: from 0 to 3. The greater numbers have lower privilege. If all other protection checks are satisfied, a general-protection
exception occurs if a program with a higher privilege number attempts to access a segment
with a lower privilege number. Although no control register or mode bit exists to disable the
protection mechanism, you can achieve the same effect by assigning privilege level 0 to all segment selectors, segment descriptors, and page table entries.
(The PE bit in the CR0 register does not enable the protection mechanism alone; it enables
Protected Mode, the full 32-bit architecture execution mode. When Protected Mode is
disabled, the microprocessor operates in Real Address Mode.)
You can use privilege levels to improve operating system reliability. Giving the operating
system the highest privilege level protects it from damage by bugs in other programs.
If a program crashes, the operating system can generate a diagnostic message and attempt
recovery procedures. Another level of privilege can be established for other parts of the
system software, such as the programs that handle peripheral devices, both in BIOS and
specific device drivers. Device drivers should be given an intermediate privilege level between the operating system and the application programs. This protects the operating
system from errors in the drivers or BIOS, and it protects the drivers from bugs in application
programs. Application programs are given the lowest privilege level.
Figure A-19 shows how these levels of privilege can be interpreted as rings of protection.
The center is for the segments containing the most critical software, usually the kernel of
an operating system. Outer rings are for less critical software.
Figure A-19
Protection Rings
The following data structures contain privilege levels:
n
The lowest two bits of the CS segment register hold the current privilege level (CPL).
This is the privilege level of the program being run. The lowest two bits of the SS register
also hold a copy of the CPL. Normally, the CPL is equal to the privilege level of the code
segment from which instructions are being fetched. The CPL changes when control is
transferred to a code segment with a different privilege level.
n
Segment descriptors contain a field called the descriptor privilege level (DPL). The DPL
is the privilege level applied to a segment.
n
Segment selectors contain a field called the requester privilege level (RPL). The RPL
is intended to represent the privilege level of the procedure that created the selector. If
the RPL is a less privileged level than the CPL, it overrides the CPL. When a more
privileged program receives a segment selector from a less privileged program, the RPL
causes the memory access to take place at the less privileged level.
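The selector fields can be extracted with simple shifts and masks. The sketch below assumes the standard selector layout (RPL in bits 1..0, table indicator in bit 2, index in bits 15..3). All function names are hypothetical, and `selector_with_caller_rpl` only illustrates the kind of adjustment the RPL field is meant to carry.

```c
#include <stdint.h>

/* Field extraction for a 16-bit segment selector. */
unsigned selector_rpl(uint16_t sel)   { return sel & 0x3u; }        /* bits 1..0: RPL */
unsigned selector_ti(uint16_t sel)    { return (sel >> 2) & 0x1u; } /* bit 2: 0 = GDT, 1 = LDT */
unsigned selector_index(uint16_t sel) { return sel >> 3; }          /* bits 15..3: table index */

/* Weakens a selector received from a caller so that its RPL is at least
 * as large (as unprivileged) as the caller's privilege number. */
uint16_t selector_with_caller_rpl(uint16_t sel, unsigned caller_cpl)
{
    if (selector_rpl(sel) < caller_cpl)
        sel = (uint16_t)((sel & ~0x3u) | (caller_cpl & 0x3u));
    return sel;
}
```

A selector of 001Bh decodes to index 3 in the GDT with an RPL of 3.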
Privilege levels are checked when the selector of a descriptor is loaded into a segment
register. The checks used for data access differ from those used for transfers of execution
among executable segments; therefore, the two types of access are considered separately
in the following sections.
A.2.3.3
Restricting Access to Data
To address operands in memory, a segment selector for a data segment must be loaded
into a data-segment register (the DS, ES, FS, GS, or SS registers). The microprocessor
checks the segment’s privilege levels. The check is performed when the segment selector
is loaded. As Figure A-20 shows, three different privilege levels enter into this type of
privilege check.
Figure A-20
Privilege Check for Data Access
The three privilege levels that are checked are:
1. The CPL (current privilege level) of the program—this is held in the two least-significant
bit positions of the CS register.
2. The DPL (descriptor privilege level) of the segment descriptor of the segment containing
the operand
3. The RPL (requester's privilege level) of the selector used to specify the segment containing the operand—this is held in the two lowest bit positions of the segment register
used to access the operand (the SS, DS, ES, FS, or GS registers). If the operand is in
the stack segment, the RPL is the same as the CPL.
Instructions may load a segment register only if the DPL of the segment is the same or a
less privileged level (greater privilege number) than the less privileged of the CPL and the
selector's RPL.
The addressable domain of a task varies as its CPL changes. When the CPL is 0, data
segments at all privilege levels are accessible; when the CPL is 1, only data segments at
privilege levels 1 through 3 are accessible; when the CPL is 3, only data segments at
privilege level 3 are accessible.
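The data-access rule reduces to one comparison: the DPL must be numerically greater than or equal to the larger of the CPL and the RPL. The following is a minimal C sketch; the function name is hypothetical.

```c
#include <stdbool.h>

/* True if a data-segment register may be loaded with a selector for a
 * segment of privilege level `dpl`: DPL >= MAX(CPL, RPL). */
bool data_access_allowed(unsigned cpl, unsigned rpl, unsigned dpl)
{
    unsigned effective = cpl > rpl ? cpl : rpl; /* less privileged of the two */
    return dpl >= effective;
}
```

Note that a program at CPL 0 that uses a selector with an RPL of 3 is refused access to a DPL 2 segment, because the RPL lowers its effective privilege.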
It may be desirable to store data in a code segment, for example, when both code and data
are provided in ROM. Code segments may legitimately hold constants; it is not possible to
write to a segment defined as a code segment, unless a data segment is mapped to the
same address space. The following methods of accessing data in code segments are
possible:
1. Load a data-segment register with a segment selector for a non-conforming, readable,
executable segment.
2. Load a data-segment register with a segment selector for a conforming, readable, executable segment.
3. Use a code-segment override prefix to read a readable, executable segment whose
selector already is loaded in the CS register.
The same rules for access to data segments apply to case 1. Case 2 is always valid because
the privilege level of a code segment with a set Conforming bit is effectively the same as
the CPL, regardless of its DPL. Case 3 is always valid because the DPL of the code segment
selected by the CS register is the CPL.
A.2.3.4
Restricting Control Transfers
Control transfers are provided by the JMP, CALL, RET, INT, and IRET instructions, as well
as by the exception and interrupt mechanisms. This section discusses only the JMP, CALL,
and RET instructions.
The “near” forms of the JMP, CALL, and RET instructions transfer program control within
the current code segment, and therefore are subject only to limit checking. The microprocessor checks that the destination of the JMP, CALL, or RET instruction does not exceed
the limit of the current code segment. This limit is cached in the CS register, so protection
checks for near transfers require no performance penalty.
The operands of the “far” forms of the JMP and CALL instructions refer to other segments,
so the microprocessor performs privilege checking. There are two ways a JMP or CALL
instruction can refer to another segment:
n
The operand selects the descriptor of another executable segment.
n
The operand selects a call gate descriptor.
Figure A-21
Privilege Check for Control Transfer Without Gate
As Figure A-21 shows, two different privilege levels enter into a privilege check for a control
transfer that does not use a call gate:
n
The CPL (current privilege level)
n
The DPL of the descriptor of the destination code segment
Normally the CPL is equal to the DPL of the segment that the microprocessor is currently
executing. The CPL may, however, be greater (less privileged) than the DPL if the current
code segment is a conforming segment (as indicated by the Type field of its segment
descriptor). A conforming segment runs at the privilege level of the calling procedure. The
microprocessor keeps a record of the CPL cached in the CS register; this value can be
different from the DPL in the segment descriptor of the current code segment.
The microprocessor only permits a JMP or CALL instruction directly into another segment
if one of the following privilege rules is satisfied:
n
The DPL of the segment is equal to the current CPL.
n
The segment is a conforming code segment, and its DPL is less (higher privilege) than
the current CPL.
Conforming segments are used for programs, such as math libraries and some kinds of
exception handlers, that support applications but do not require access to protected system
facilities. When control is transferred to a conforming segment, the CPL does not change,
even if the selector used to address the segment has a different RPL. This is the only
condition in which the CPL may be different from the DPL of the current code segment.
Most code segments are non-conforming. For these segments, control can be transferred
without a gate only to other code segments at the same level of privilege. It is sometimes
necessary, however, to transfer control to higher privilege levels. This is accomplished with
the CALL instruction using call-gate descriptors. The JMP instruction may never transfer
control to a non-conforming segment whose DPL does not equal the CPL.
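The two rules for a far JMP or CALL that does not use a gate can be sketched as follows. The function name `direct_transfer_allowed` is hypothetical, and only the privilege rules are modeled (no type or limit checks).

```c
#include <stdbool.h>

/* True if a far JMP or CALL may enter the destination code segment
 * directly, without a call gate. */
bool direct_transfer_allowed(unsigned cpl, unsigned target_dpl, bool conforming)
{
    if (target_dpl == cpl)
        return true;                        /* same privilege level */
    return conforming && target_dpl < cpl;  /* conforming, more privileged */
}
```

A program at CPL 3 can therefore reach a conforming DPL 0 library directly, but needs a call gate to reach a non-conforming DPL 0 segment.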
A.2.3.5
Gate Descriptors
To provide protection for control transfers among executable segments at different privilege
levels, the Am486 microprocessor uses gate descriptors. There are four kinds of gate
descriptors:
n
Task gates
n
Trap gates
n
Interrupt gates
n
Call gates
Task gates are used for task switching. Trap gates and interrupt gates are used by exceptions and interrupts. Call gates are a form of protected control transfer. They are used for
control transfers between different privilege levels. They only need to be used in systems
in which more than one privilege level is used.
Figure A-22 illustrates the format of a call gate.
Figure A-22
Call Gate Format
A call gate has two main functions:
n
To define an entry point of a procedure
n
To specify the privilege level required to enter a procedure
Call gate descriptors are used by CALL and JMP instructions in the same manner as
code segment descriptors. When the hardware recognizes that the destination segment
selector refers to a gate descriptor, the call gate contents determine the operation of the
instruction. A call gate descriptor may reside in the GDT or in an LDT, but not in the IDT.
The selector and offset fields of a gate form a pointer to the entry point of a procedure. A
call gate guarantees that all control transfers to other segments go to a valid entry point,
rather than to the middle of a procedure (or worse, to the middle of an instruction). The
operand of the control transfer instruction is not the segment selector and offset within
the segment to the procedure’s entry point. Instead, the segment selector points to a gate
descriptor, and the offset is not used. Figure A-23 shows this form of addressing. As shown
in Figure A-24, four different privilege levels are used to check the validity of a control
transfer through a call gate.
Figure A-23
Call Gate Mechanism
Figure A-24
Privilege Check for Control Transfer with Call Gate
The privilege levels checked during a transfer of execution through a call gate are:
n
The CPL (current privilege level)
n
The RPL (requester’s privilege level) of the segment selector used to specify the call gate
n
The DPL (descriptor privilege level) of the gate descriptor
n
The DPL of the segment descriptor of the destination code segment
The DPL field of the gate descriptor determines the privilege levels that can access the
gate. One code segment can have procedures used by different privilege levels. For example, an operating system may have some services used by both the operating system
and application software, such as routines to handle character I/O, while other services
may be for use only by the operating system itself, such as routines to initialize device
drivers.
Gates can be used for control transfers to higher privilege levels or to the same privilege
level (though they are not necessary for same-level transfers). Only CALL instructions can
use gates to transfer to higher privilege levels. A JMP instruction may use a gate only to
transfer control to a code segment with the same privilege level, or to a conforming code
segment with the same or a higher privilege level.
To use a JMP instruction to transfer to a non-conforming segment, both of the following
privilege rules must be satisfied; otherwise, a general-protection exception occurs:
n
MAX (CPL,RPL) ≤ gate DPL
n
Destination code segment DPL = CPL
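The gate rules in this section differ only in the final comparison: a JMP to a non-conforming segment requires the destination DPL to equal the CPL, while a CALL (or a JMP to a conforming segment) requires the destination DPL to be less than or equal to the CPL. The C sketch below models both cases; the function name and the `same_level_only` flag are hypothetical.

```c
#include <stdbool.h>

/* Privilege check for a control transfer through a call gate.
 * same_level_only: true for a JMP to a non-conforming segment,
 * false for a CALL or a JMP to a conforming segment. */
bool gate_transfer_allowed(unsigned cpl, unsigned rpl, unsigned gate_dpl,
                           unsigned target_dpl, bool same_level_only)
{
    unsigned effective = cpl > rpl ? cpl : rpl;   /* MAX(CPL, RPL) */

    if (effective > gate_dpl)
        return false;                             /* gate not accessible */
    return same_level_only ? target_dpl == cpl    /* DPL = CPL */
                           : target_dpl <= cpl;   /* DPL <= CPL */
}
```

A CALL from CPL 3 through a DPL 3 gate into a DPL 0 segment passes both checks, whereas the same transfer attempted with a JMP fails the second check.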
For a CALL instruction (or for a JMP instruction to a conforming segment), both of the
following privilege rules must be satisfied; otherwise, a general-protection exception occurs:
n
MAX (CPL,RPL) ≤ gate DPL
n
Destination code segment DPL ≤ CPL
A.2.3.5.1
Stack Switching
A procedure call to a more privileged level does the following:
n
Changes the CPL
n
Transfers control (execution)
n
Switches stacks
All inner protection rings (privilege levels 0, 1, and 2) have their own stacks for receiving
calls from less privileged levels. If the caller were to provide the stack and the stack were
too small, the called procedure might fail due to insufficient stack space. The system design
avoids this problem by creating a new stack when a call is made to a more privileged level.
The mechanism creates a new stack, copies the parameters from the old stack, and saves
the register contents; then execution proceeds normally. When the procedure returns, the
contents of the saved registers restore the original stack.
Figure A-25
Initial Stack Pointers in a TSS
The microprocessor finds the space to create new stacks using the task state segment
(TSS) (see Figure A-25). Each task has its own TSS. The TSS contains initial stack pointers
for the inner protection rings. The operating system is responsible for creating each TSS
and initializing its stack pointers. An initial stack pointer consists of a segment selector and
an initial value for the ESP register (an initial offset into the segment). The initial stack
pointers are strictly read-only values. The microprocessor does not change them while the
task runs. These stack pointers are used only to create new stacks when calls are made
to more privileged levels. These stacks disappear when the called procedure returns. The
next time the procedure is called, a new stack is created using the initial stack pointer.
When a call gate is used to change privilege levels, a new stack is created by loading an
address from the TSS. The microprocessor uses the DPL of the destination code segment
(the new CPL) to select the initial stack pointer for privilege level 0, 1, or 2.
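Selecting the initial stack pointer can be pictured as an array lookup indexed by the new CPL. The structures below are a hypothetical, simplified view of the stack-pointer fields of a TSS and do not reproduce the actual TSS layout shown in Figure A-25.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one initial stack pointer: selector plus ESP value. */
struct stack_ptr { uint16_t ss; uint32_t esp; };

/* Hypothetical view of the initial stack pointers for levels 0, 1, and 2. */
struct tss_stacks { struct stack_ptr level[3]; };

/* Returns the initial stack pointer for the new CPL, or NULL for level 3,
 * which has no entry in the TSS. */
const struct stack_ptr *select_stack(const struct tss_stacks *tss, unsigned new_cpl)
{
    return new_cpl < 3 ? &tss->level[new_cpl] : NULL;
}
```

Because the entries are never modified by the lookup, each call to a more privileged level starts from the same initial stack pointer.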
The DPL of the new stack segment must equal the new CPL; if not, a stack-fault exception
is generated. It is the responsibility of the operating system to create stacks and stack-segment descriptors for all privilege levels that are used. The stacks must be read/write as
specified in the Type field of their segment descriptors. They must contain enough space,
as specified in the Limit field, to hold the contents of the SS and ESP registers, the return
address, and the parameters and temporary variables required by the called procedure.
As with calls within a privilege level, parameters for the procedure are placed on the stack.
The parameters are copied to the new stack. The parameters can be accessed within the
called procedure using the same relative addresses that would have been used if no stack
switching had occurred. The count field of a call gate tells the microprocessor how many
doublewords (up to 31) to copy from the caller’s stack to the stack of the called procedure.
If the count is 0, no parameters are copied.
If more than 31 doublewords of data need to be passed to the called procedure, one of the
parameters can be a pointer to a data structure, or the saved contents of the SS and ESP
registers may be used to access parameters in the old stack space.
The microprocessor performs the following stack-related steps in executing a procedure
call between privilege levels:
n
The stack of the called procedure is checked to make certain it is large enough to hold
the parameters and the saved contents of registers; if not, a stack exception is generated.
n
The old contents of the SS and ESP registers are pushed onto the stack of the called
procedure as two doublewords (the 16-bit SS register is zero-extended to 32 bits; the
zero-extended upper word is AMD reserved; do not use).
n
The parameters are copied from the stack of the caller to the stack of the called
procedure.
n
A pointer to the instruction after the CALL instruction (the old contents of the CS and
EIP registers) is pushed onto the new stack. The contents of the SS and ESP registers
after the call point to this return pointer on the stack.
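The steps above can be modeled with a small C simulation of the new stack. This is a sketch only: the stack is represented as an array of doublewords that grows toward lower indexes, and the names `struct stack`, `push`, and `build_call_frame` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of the inner-level stack: `top` is the index of the
 * most recently pushed doubleword, and the stack grows downward. */
struct stack { uint32_t *mem; uint32_t top; };

static void push(struct stack *s, uint32_t v) { s->mem[--s->top] = v; }

/* Builds the frame for an interlevel CALL. params[0] is the parameter at
 * the lowest address on the caller's stack. Returns 0 on success, or -1
 * where the processor would raise a stack exception. */
int build_call_frame(struct stack *new_stack,
                     uint16_t old_ss, uint32_t old_esp,
                     const uint32_t *params, unsigned count,
                     uint16_t old_cs, uint32_t old_eip)
{
    if (count > 31 || new_stack->top < count + 4)
        return -1;                        /* not enough room for the frame */

    push(new_stack, old_ss);              /* zero-extended to 32 bits */
    push(new_stack, old_esp);
    for (unsigned i = count; i > 0; i--)  /* keep the same relative order */
        push(new_stack, params[i - 1]);
    push(new_stack, old_cs);              /* return pointer: CS, then EIP */
    push(new_stack, old_eip);
    return 0;
}
```

After a successful call, the top of the new stack holds the return EIP, followed at higher addresses by the return CS, the copied parameters, the old ESP, and the old SS.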
Figure A-26 illustrates the stack frame before, during, and after a successful interlevel
procedure call and return.
Figure A-26
Stack Frame During Interlevel CALL
The TSS does not have a stack pointer for a privilege level 3 stack, because a procedure
at privilege level 3 cannot be called by a less privileged procedure. The stack for privilege
level 3 is preserved by the contents of the SS and ESP registers that have been saved on
the stack of the privilege level called from level 3.
A call using a call gate does not check the values of the words copied onto the new stack.
The called procedure should check each parameter for validity. A later section discusses
how the ARPL, VERR, VERW, LSL, and LAR instructions can be used to check pointer
values.
A.2.3.5.2
Returning from a Procedure
The “near” forms of the RET instruction only transfer control within the current code segment
and therefore are subject only to limit checking. The microprocessor checks the offset to
ensure that it does not exceed the current code segment limit.
The “far” form of the RET instruction pops the return address that was pushed onto the
stack by an earlier far CALL instruction. Under normal conditions, the return pointer is valid.
Nevertheless, the microprocessor performs privilege checking because the current procedure can alter the pointer or fail to maintain the stack properly. The RPL of the code-segment
selector popped off the stack should have the privilege level of the calling procedure.
A return to another segment can change privilege levels, but only to a lower privilege level.
When RET encounters a saved CS value whose RPL is numerically greater than the CPL
(less privileged level), a return across privilege levels occurs. A return of this kind performs
these steps:
n
The checks shown in Table A-3 are made, and the CS, EIP, SS, and ESP registers are
loaded with their former values, which were saved on the stack.
n
The old contents of the SS and ESP registers (from the top of the current stack) are
adjusted by the number of bytes indicated in the RET instruction. The resulting ESP
value is not checked against the limit of the stack segment. An ESP value beyond the
limit is not recognized until the next stack operation. (The returning procedure SS and
ESP register contents are not preserved; normally, their values equal those in the TSS.)
n
The DS, ES, FS, and GS segment register contents are checked. If any of these registers
refer to segments whose DPL is less than the new CPL (excluding conforming code
segments), the segment register is loaded with the null selector (Index = 0, TI = 0). The
RET instruction itself does not signal exceptions in these cases, but any subsequent
memory reference using a segment register with the null selector causes a general-protection exception. This prevents less privileged code from accessing more privileged
segments using selectors left in the segment registers by a more privileged procedure.
Table A-3
Interlevel Return Checks
Type of Check                                                          Exception Type   Error Code
Top-of-stack must be within stack segment limit                        stack            0
Top-of-stack + 7 must be within stack segment limit                    stack            0
RPL of return code segment must be greater than the CPL                protection       return CS
Return code segment must be non-null                                   protection       return CS
Return code segment descriptor must be within descriptor table limit   protection       return CS
Return segment descriptor must be a code segment                       protection       return CS
Return code segment must be present                                    protection       return CS
Return non-conforming code segment DPL must equal return code          protection       return CS
  segment selector RPL; or return conforming code segment DPL
  must be less than or equal to the return code segment selector RPL
ESP + RET operand + 15 must be within the stack segment limit          protection       return CS
Segment descriptor at ESP + RET operand + 12 must be non-null          protection       return CS
Segment descriptor at ESP + RET operand + 12 must be within            protection       return CS
  descriptor table limit
Stack segment must be read/write                                       protection       return CS
Stack segment must be present                                          protection       return CS
Old stack segment DPL must equal old code segment RPL                  protection       return CS
Old stack selector RPL must equal old stack segment CPL                protection       return CS
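The treatment of the DS, ES, FS, and GS registers during an interlevel return can be sketched in C. The structure `loaded_seg` is a hypothetical summary of what a data-segment register currently refers to, and `scrub_on_return` is a hypothetical name for the check applied to each register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of the segment loaded in a data-segment register. */
struct loaded_seg {
    uint16_t selector;
    unsigned dpl;
    bool conforming_code;
};

/* Applies the return-time check to one of DS, ES, FS, or GS: a segment
 * more privileged than the new CPL (and not conforming code) is replaced
 * by the null selector. */
void scrub_on_return(struct loaded_seg *reg, unsigned new_cpl)
{
    if (!reg->conforming_code && reg->dpl < new_cpl)
        reg->selector = 0;   /* null selector: Index = 0, TI = 0 */
}
```

Returning to CPL 3 with DS still referring to a DPL 0 data segment leaves DS holding the null selector, so the next memory reference through DS faults instead of exposing privileged data.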
A.2.3.6
Instructions Reserved for the Operating System
Instructions that can affect the protection mechanism or influence general system performance can only be executed by trusted procedures. The Am486 microprocessor has two
classes of such instructions:
n
Privileged instructions: those used for system control
n
Sensitive instructions: those used for I/O and I/O-related activities
A.2.3.6.1
Privileged Instructions
The instructions that affect protected facilities can be executed only when the CPL is 0
(most privileged). If one of these instructions is executed when the CPL is not 0, a general-protection exception is generated. These instructions include:
n
CLTS – Clear Task-Switched Flag
n
HLT – Halt Microprocessor
n
INVD – Invalidate Cache
n
INVLPG – Invalidate TLB Entry
n
LGDT – Load GDT Register
n
LIDT – Load IDT Register
n
LLDT – Load LDT Register
n
LMSW – Load Machine Status Word
n
LTR – Load Task Register
n
MOV CR0 – Move to/from Control Register 0
n
MOV DRn – Move to/from Debug Register n
n
MOV TRn – Move to/from Test Register n
n
WBINVD – Write Back and Invalidate Cache
A.2.3.6.2
Sensitive Instructions
Instructions that deal with I/O need to be protected, but they also need to be used by
procedures executing at privilege levels other than 0 (the most privileged level). The mechanisms for protection of I/O operations are covered in detail in Section A.2.9.
A.2.3.7
Instructions for Pointer Validation
Pointer validation is necessary for maintaining isolation between privilege levels. It consists
of the following steps:
1. Checks if the supplier of the pointer is allowed to access the segment.
2. Checks if the segment type is compatible with its use.
3. Checks if the pointer offset exceeds the segment limit.
Although the Am486 microprocessor automatically performs checks 2 and 3 during instruction execution, software must assist in performing the first check. The ARPL instruction is
provided for this purpose. Software also can use steps 2 and 3 to check for potential
violations, rather than waiting for an exception to be generated. The LAR, LSL, VERR, and
VERW instructions are provided for this purpose.
An additional check, the alignment check, can be applied in user mode. When both the AM
bit in CR0 and the AC flag are set, unaligned memory references generate exceptions.
This is useful for programs that use the low two bits of pointers to identify the type of data
structure they address. For example, a subroutine in a math library may accept pointers to
numeric data structures. If the type of this structure is assigned a code of 10 (binary) in the
lowest two bits of pointers to this type, math subroutines can correct for the type code by
adding a displacement of –10 (binary). If the subroutine should ever receive the wrong
pointer type, an unaligned reference would be produced, which would generate an exception. Alignment checking accelerates the processing of programs written in symbolic-processing (i.e., Artificial Intelligence) languages such as Lisp, Prolog, Smalltalk, and C++. It
can be used to speed up pointer tag type checking.
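The tag-checking idea above can be sketched in a few lines of Python. This is only a model under stated assumptions: the function names are hypothetical, addresses are plain integers, and the alignment check is reduced to a doubleword-alignment test.

```python
TAG_MASK = 0b11        # low two bits of a pointer carry the type tag
NUMERIC_TAG = 0b10     # tag value used in the example in the text


def make_tagged(address, tag):
    """Attach a 2-bit type tag to a doubleword-aligned address."""
    if address & TAG_MASK:
        raise ValueError("address must be doubleword aligned")
    return address | tag


def effective_address(tagged_pointer, expected_tag):
    """Remove the expected tag, as a displacement of -tag would."""
    return tagged_pointer - expected_tag


def would_fault(address, operand_size=4):
    """Simplified alignment check: True for an unaligned reference."""
    return address % operand_size != 0


good = make_tagged(0x1000, NUMERIC_TAG)   # pointer of the expected type
bad = make_tagged(0x1000, 0b01)           # pointer of some other type

print(would_fault(effective_address(good, NUMERIC_TAG)))  # False
print(would_fault(effective_address(bad, NUMERIC_TAG)))   # True
```

A pointer carrying the wrong tag leaves a nonzero residue in the low two bits after the displacement is applied, so the reference is unaligned and the type error is caught without an explicit comparison.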
LAR (Load Access Rights) is used to verify that a pointer refers to a segment of a compatible
privilege level and type. The LAR instruction has one operand, a segment selector for the
descriptor whose access rights are to be checked. The segment descriptor must be readable at a privilege level (its DPL) that is numerically greater than or equal to (the same as or less privileged than) both the CPL and the
selector's RPL. If the descriptor is readable, the LAR instruction gets the second doubleword
of the descriptor, masks this value with 00FxFF00h, stores the result into the specified 32-bit destination register, and sets the Zero Flag (ZF). (The x indicates that the corresponding
four bits of the stored value are undefined.) Once loaded, the access rights can be tested.
All valid descriptor types can be tested by the LAR instruction. If the RPL or CPL is greater
than the DPL, or if the segment selector would exceed the limit for the descriptor table, no
access rights are returned and ZF is cleared. Conforming code segments may be accessed
from any privilege level.
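The LAR behavior described above can be modeled as follows. This is a sketch, not the processor's implementation: the function names are hypothetical, the bit positions assume the standard layout of the second descriptor doubleword (type in bits 8 to 11, S in bit 12, DPL in bits 13 and 14), the undefined x bits are simply cleared, and the descriptor-table limit check is omitted.

```python
ACCESS_RIGHTS_MASK = 0x00F0FF00   # 00FxFF00h with the undefined x bits cleared


def dpl_of(high_dword):
    """Descriptor privilege level, bits 13-14 of the second doubleword."""
    return (high_dword >> 13) & 0b11


def is_conforming_code(high_dword):
    """True for a code segment descriptor with the conforming bit set."""
    s_bit = (high_dword >> 12) & 1        # 1 = code or data segment
    code = (high_dword >> 11) & 1
    conforming = (high_dword >> 10) & 1
    return bool(s_bit and code and conforming)


def lar(high_dword, cpl, rpl):
    """Return (zf, value); value is None when no access rights are returned."""
    if not is_conforming_code(high_dword):
        if max(cpl, rpl) > dpl_of(high_dword):
            return 0, None
    return 1, high_dword & ACCESS_RIGHTS_MASK


FLAT_DATA = 0x00CF9200          # read/write data segment, DPL 0
CONFORMING_CODE = 0x00CF9E00    # readable conforming code segment, DPL 0

print(lar(FLAT_DATA, cpl=0, rpl=0))        # (1, 12620288), i.e. 0x00C09200
print(lar(FLAT_DATA, cpl=3, rpl=0))        # (0, None)
print(lar(CONFORMING_CODE, cpl=3, rpl=3))  # ZF set: conforming segment
```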
LSL (Load Segment Limit) allows software to test the limit of a segment descriptor. If the
descriptor referenced by the segment selector (in memory or a register) is readable at the
CPL, the LSL instruction loads the specified 32-bit register with a 32-bit, byte granular limit
calculated from the concatenated limit fields and the G bit of the descriptor. This can
be done only for descriptors that describe segments (data, code, task state, and local descriptor
tables); gate descriptors are inaccessible. (Table A-4 lists in detail which types are valid
and which are not.) Interpreting the limit is a function of the segment type. For example,
downward-expandable data segments (stack segments) treat the limit differently than other
kinds of segments. For both the LAR and LSL instructions, ZF is set if the load was successful; otherwise, ZF is cleared.
Table A-4
Valid Descriptor Types for LSL Instruction
Type Code   Descriptor Type                    Valid?
0           Reserved                           no
1           Reserved                           no
2           LDT                                yes
3           Reserved                           no
4           Reserved                           no
5           Task Gate                          no
6           Reserved                           no
7           Reserved                           no
8           Reserved                           no
9           Available Am486 processor TSS      yes
A           Reserved                           no
B           Busy Am486 processor TSS           yes
C           Am486 processor Call Gate          no
D           Reserved                           no
E           Am486 processor Interrupt Gate     no
F           Am486 processor Trap Gate          no
Note: Conforming segments are not checked for privilege level.
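The byte-granular limit that LSL returns can be computed as in the sketch below. The function name is hypothetical, and the bit positions assume the standard descriptor layout: limit bits 0 to 15 in the first doubleword, limit bits 16 to 19 and the G bit (bit 23) in the second. Privilege and type checks are left out.

```python
def byte_granular_limit(low_dword, high_dword):
    """Combine the two limit fields and scale by the G bit."""
    limit = (high_dword & 0x000F0000) | (low_dword & 0x0000FFFF)
    g_bit = (high_dword >> 23) & 1
    if g_bit:
        # Page granular: the 20-bit limit counts 4-Kbyte units, and the
        # low 12 bits of the byte limit are all ones.
        limit = (limit << 12) | 0xFFF
    return limit


print(hex(byte_granular_limit(0x0000FFFF, 0x00CF9200)))  # 0xffffffff
print(hex(byte_granular_limit(0x0000FFFF, 0x004F9200)))  # 0xfffff
```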
A.2.3.7.1
Descriptor Validation
The Am486 microprocessor has two instructions, VERR and VERW, which determine
whether a segment selector points to a segment that can be read or written using the CPL.
Neither instruction causes a protection fault if the segment cannot be accessed.
VERR (Verify for Reading) verifies a segment for reading and sets ZF if that segment is
readable using the CPL. The VERR instruction checks the following:
n
The segment selector points to a segment descriptor within the bounds of the GDT or
an LDT.
n
The segment selector indexes to a code or data segment descriptor.
n
The segment is readable and has a compatible privilege level.
n
The privilege check for data segments and non-conforming code segments verifies that
the DPL is the same as or less privileged than both the CPL and the selector’s RPL.
VERW (Verify for Writing) provides the same capability as the VERR instruction for verifying
writability. Like the VERR instruction, the VERW instruction sets ZF if the segment can be
written. The instruction verifies that the descriptor is within bounds, is a segment descriptor, is
writable, and has a DPL that is the same as or less privileged than both the CPL and the selector’s
RPL. Code segments are never writable, whether conforming or not.
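The VERR and VERW checks can be modeled as below. This is an illustrative sketch: the function names are hypothetical, the access-rights byte is assumed to follow the standard layout (DPL in bits 5 and 6, S in bit 4, code/data in bit 3), and the table-bounds and present-bit checks are omitted.

```python
def _fields(high_dword):
    """Decode the access-rights byte of the second descriptor doubleword."""
    access = (high_dword >> 8) & 0xFF
    return {
        "dpl": (access >> 5) & 0b11,
        "is_segment": bool(access & 0x10),  # S bit: code or data segment
        "is_code": bool(access & 0x08),
        "bit2": bool(access & 0x04),        # conforming (code) / expand-down (data)
        "bit1": bool(access & 0x02),        # readable (code) / writable (data)
    }


def verr(high_dword, cpl, rpl):
    """Return the ZF value VERR would produce (1 = readable)."""
    f = _fields(high_dword)
    if not f["is_segment"]:
        return 0
    if f["is_code"]:
        if not f["bit1"]:
            return 0                         # execute-only code segment
        if f["bit2"]:
            return 1                         # conforming: no privilege check
    return 1 if f["dpl"] >= max(cpl, rpl) else 0


def verw(high_dword, cpl, rpl):
    """Return the ZF value VERW would produce (1 = writable)."""
    f = _fields(high_dword)
    if not f["is_segment"] or f["is_code"] or not f["bit1"]:
        return 0
    return 1 if f["dpl"] >= max(cpl, rpl) else 0


DATA_RW_DPL0 = 0x00CF9200
CODE_READABLE_DPL0 = 0x00CF9A00
print(verr(DATA_RW_DPL0, 0, 0), verw(DATA_RW_DPL0, 0, 0))              # 1 1
print(verr(DATA_RW_DPL0, 3, 3), verw(CODE_READABLE_DPL0, 0, 0))        # 0 0
```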
A.2.3.7.2
Pointer Integrity and RPL
The requester’s privilege level (RPL) can prevent the accidental use of pointers that could
lock up the system when less privileged code passes them to more privileged code.
A common example is a file system procedure, FREAD (file_id, n_bytes, buffer_ptr). This
hypothetical procedure reads data from a disk file into a buffer, overwriting whatever is
already there. It services requests from programs operating at the application level, but it
must run in a privileged mode (not level 3) in order to read from the system I/O buffer. If
the application program passes the procedure a bad buffer pointer that points to critical
code or data in a privileged address space, the procedure can lock up the system.
Use of the RPL can avoid this problem. The RPL allows a privilege override to be assigned
to a selector. This privilege override is the privilege level of the code segment that generates
the segment selector. In the above example, the RPL is the CPL of the application program
that called the system level procedure. The Am486 microprocessor automatically checks
any segment selector loaded into a segment register to determine whether its RPL allows
access.
To take advantage of the microprocessor’s checking of the RPL, the called procedure need
only check that all segment selectors passed to it have an RPL for the same or a less
privileged level as the original caller’s CPL. This guarantees that the segment selectors are
not more privileged than their source. If a selector is used to access a segment that the
source would not be able to access directly (i.e., the RPL is less privileged than the segment’s DPL), a general-protection exception is generated when the selector is loaded into
a segment register.
ARPL (Adjust Requested Privilege Level) adjusts the RPL field of a segment selector to
be the larger (less privileged) of its original value and the value of the RPL field for a segment
selector stored in a general register. The RPL fields are the two least-significant bits of the
segment selector and the register. The latter normally is a copy of the caller’s CS register
on the stack. If the adjustment changes the selector’s RPL, ZF is set; otherwise, ZF is
cleared.
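The effect of ARPL on a selector can be sketched as follows. The function name and the sample selector values are hypothetical; only the RPL adjustment and the ZF result described above are modeled.

```python
RPL_MASK = 0b11   # the RPL is the two least-significant bits of a selector


def arpl(selector, caller_cs):
    """Return (adjusted_selector, zf) for a selector and the caller's CS."""
    dest_rpl = selector & RPL_MASK
    src_rpl = caller_cs & RPL_MASK
    if dest_rpl < src_rpl:
        # Raise the RPL to the caller's (less privileged) level.
        return (selector & 0xFFFC) | src_rpl, 1
    return selector, 0


print(arpl(0x0010, 0x001B))   # (19, 1): RPL raised from 0 to 3, ZF set
print(arpl(0x0023, 0x0008))   # (35, 0): RPL already 3, selector unchanged
```

A system procedure would apply this to each selector it receives before loading it into a segment register, so that the processor's own RPL check rejects any segment the original caller could not have reached.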
A.2.3.8
Page-Level Protection
Protection applies to both segments and pages. When the flat model for memory segmentation has been used, page-level protection prevents programs from interfering with each
other.
Each memory reference is checked to verify that it satisfies the protection checks. All checks
are made before the memory cycle is started; any violation prevents the cycle from starting
and results in an exception. Because checks are performed in parallel with address translation, there is no performance penalty. There are two page-level protection checks:
n
Restriction of addressable domain
n
Type checking
A protection violation results in an exception. See Section A.2.8 for an explanation of the
exception mechanism. This section describes the protection violations that lead to
exceptions.
A.2.3.8.1
Page-Table Entries Hold Protection Parameters
Figure A-27 highlights the fields of a page table entry that control access to pages. The
protection checks are applied for both first and second-level page tables.
Figure A-27
Protection Holds
Privilege is interpreted differently for pages and segments. With segments, there are four
privilege levels, ranging from 0 (most privileged) to 3 (least privileged). With pages, there
are two levels of privilege:
n
Supervisor level (U/S = 0): for the operating system, other system software (such as
device drivers), and protected system data (such as page tables).
n
User level (U/S = 1): for application code and data.
The privilege levels used for segmentation are mapped into the privilege levels used for paging. If the CPL is 0, 1, or 2,
the microprocessor is running at supervisor level. If the CPL is 3, the microprocessor is
running at user level. When the microprocessor is running at supervisor level, all pages
are accessible. When the microprocessor is running at user level, only pages from the
user level are accessible.
Only two types of pages are recognized by the protection mechanism:
n
Read-only access (R/W = 0)
n
Read/write access (R/W = 1)
When the microprocessor is running at supervisor level with the WP bit in the CR0 register
clear (its state following reset initialization), all pages are both readable and writable (write protection is ignored). When the microprocessor is running at user level, only pages that
belong to user level and are marked for read/write access are writable. User-level pages
that are read/write or read-only are readable. Pages from the supervisor level are neither
readable nor writable from user level. A page-fault exception is generated on any
attempt to violate the page-level protection rules.
Unlike the Am386DX microprocessor, the Am486 microprocessor allows user-mode pages
to be write-protected against supervisor mode access. Setting the WP bit in the CR0 register
enables supervisor-mode sensitivity to user-mode, write-protected pages. This feature is
useful for implementing the copy-on-write strategy used by some operating systems, such
as UNIX, for task creation (also called forking or spawning).
When a new task is created, it is possible to copy the entire address space of the parent
task. This gives the child task a complete, duplicate set of the parent’s segments and pages.
The copy-on-write strategy saves memory space and time by mapping the child’s segments
and pages to the same segments and pages used by the parent task. A private copy of a
page gets created only when one of the tasks writes to the page.
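The page-level rules in this section can be condensed into one predicate. The sketch below is a simplified model: the function and parameter names are hypothetical, it considers the attributes of a single page only (not the combination of directory and table entries), and the WP behavior follows the description above, in which WP affects supervisor writes to write-protected user pages.

```python
def page_access_allowed(cpl, write, user_page, writable_page, wp=0):
    """Return True if the access passes the page-level protection check.

    cpl           -- current privilege level (0 to 3)
    write         -- True for a write access, False for a read
    user_page     -- U/S bit of the page (True = user level)
    writable_page -- R/W bit of the page (True = read/write)
    wp            -- WP bit of CR0
    """
    supervisor = cpl < 3            # CPL 0, 1, and 2 map to supervisor level
    if supervisor:
        if write and wp and user_page and not writable_page:
            return False            # WP set: honor user-page write protection
        return True
    if not user_page:
        return False                # supervisor pages are invisible to user level
    return writable_page or not write


# Copy-on-write relies on the last case: with WP set, even the operating
# system faults when it writes to a shared read-only user page.
print(page_access_allowed(0, True, True, False, wp=0))   # True
print(page_access_allowed(0, True, True, False, wp=1))   # False
print(page_access_allowed(3, False, False, True))        # False
```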
A.2.3.8.2
Combining Protection of Both Levels of Page Tables
For any one page, the protection attributes of its page directory entry (first-level page table)
may differ from those of its second-level page table entry. The Am486 microprocessor
checks the protection for a page by examining the protection specified in both the page
directory (first-level page table) and the second-level page table. Table A-5 shows the
protection provided by the possible combinations of protection attributes when the WP bit
is clear.
Table A-5
Combined Page Directory and Page Table Protection
Page Directory Entry         Page Table Entry             Combined Effect
Privilege     Access Type    Privilege     Access Type    Privilege     Access Type
User          Read-Only      User          Read-Only      User          Read-Only
User          Read-Only      User          Read/Write     User          Read-Only
User          Read/Write     User          Read-Only      User          Read-Only
User          Read/Write     User          Read/Write     User          Read/Write
User          Read-Only      Supervisor    Read-Only      User          Read-Only
User          Read-Only      Supervisor    Read/Write     User          Read-Only
User          Read/Write     Supervisor    Read-Only      User          Read-Only
User          Read/Write     Supervisor    Read/Write     User          Read/Write
Supervisor    Read-Only      User          Read-Only      User          Read-Only
Supervisor    Read-Only      User          Read/Write     User          Read-Only
Supervisor    Read/Write     User          Read-Only      User          Read-Only
Supervisor    Read/Write     User          Read/Write     User          Read/Write
Supervisor    Read-Only      Supervisor    Read-Only      Supervisor    Read-Only
Supervisor    Read-Only      Supervisor    Read/Write     Supervisor    Read/Write
Supervisor    Read/Write     Supervisor    Read-Only      Supervisor    Read/Write
Supervisor    Read/Write     Supervisor    Read/Write     Supervisor    Read/Write
A.2.3.8.3
Overrides to Page Protection
Certain accesses are checked as if they are privilege level 0 accesses, for any value of CPL:
n
Access to segment descriptors (LDT, GDT, TSS and IDT)
n
Access to inner stack during a CALL instruction, or exceptions and interrupts, when a
change of privilege level occurs
A.2.3.9
Combining Page and Segment Protection
When paging is enabled, the Am486 microprocessor first evaluates segment protection,
then evaluates page protection. If the microprocessor detects a protection violation at either
the segment level or the page level, the operation does not go through; an exception occurs
instead. If an exception is generated by segmentation, no paging exception is generated
for the operation.
For example, it is possible to define a large data segment which has some parts that are
read-only and other parts that are read/write. In this case, the page directory (or page table)
entries for the read-only parts would have the U/S and R/W bits specifying no write access
for all the pages described by that directory entry (or for individual pages specified in the
second-level page tables). This technique might be used, for example, to define a large
data segment, part of which is read-only (for shared data or constants in ROM). This
approach defines a single “flat” data space in one large segment that uses “flat” pointers,
but protects shared data that is mapped into the same virtual space using page-defined
supervisor areas.
A.2.4
Data Types
There are two ways in which data is stored and used by the Am486 processor family. When
stored in memory, the microprocessor accesses data using bytes, words, and doublewords
(see Figure A-28). Instructions use the accessed data in multiple ways, both by accessing
multiple sets of bytes, words, or doublewords and by reinterpreting the data stored in them
(such as strings, signed and unsigned integers, BCD values, and real/floating-point
numbers).
Figure A-28
Data Types in Memory
A.2.4.1
Data Types in Memory
Byte—8 bits. The bits are numbered 0 through 7, bit 0 being the least-significant bit (LSB).
Word—two bytes occupying any two consecutive addresses. A word contains 16 bits. The
bits of a word are numbered from 0 through 15, bit 0 again being the least-significant bit.
The byte containing bit 0 of the word is called the Low byte; the byte containing bit 15 is
called the High byte. On the Am486 microprocessor, the Low byte is stored in the byte with
the lower address. The address of the Low byte also is the address of the word. The address
of the High byte is used only when the upper half of the word is being accessed separately
from the lower half.
Doubleword—four bytes occupying any four consecutive addresses. A doubleword contains 32 bits. The bits of a doubleword are numbered from 0 through 31, bit 0 again being
the least-significant bit. The word containing bit 0 of the doubleword is called the Low word;
the word containing bit 31 is called the High word. The Low word is stored in the two bytes
with the lower addresses. The address of the lowest byte is the address of the doubleword.
The higher addresses are used only when the upper word is being accessed separately
from the lower word, or when individual bytes are being accessed. Figure A-29 illustrates
the arrangement of bytes within words and doublewords.
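The byte ordering described above can be checked with Python's standard struct module, whose "<" prefix selects the same Low-byte-first layout. The sample value is arbitrary.

```python
import struct

# Store the doubleword 12345678h the way the microprocessor does:
# the Low byte goes to the lowest address.
memory = struct.pack("<I", 0x12345678)
print(memory.hex())                    # 78563412

# The same four bytes read back as two words: the Low word occupies
# the two bytes with the lower addresses.
low_word, high_word = struct.unpack("<HH", memory)
print(hex(low_word), hex(high_word))   # 0x5678 0x1234
```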
Figure A-29
Bytes, Words, and Doublewords in Memory
Note: Words do not need to be aligned at even-numbered addresses and doublewords do not need to be
aligned at addresses evenly divisible by four. This allows maximum flexibility in data structures (e.g., records
containing mixed byte, word, and doubleword items) and efficiency in memory utilization. Because the Am486
microprocessor has a 32-bit data bus, communication between microprocessor and memory takes place as
doubleword transfers aligned to addresses evenly divisible by four; the microprocessor converts doubleword
transfers aligned to other addresses into multiple transfers. These unaligned operations reduce speed by
requiring extra bus cycles. For maximum speed, data structures (especially stacks) should be designed so that,
whenever possible, word operands are aligned to even addresses and doubleword operands are aligned to
addresses evenly divisible by four.
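The cost of an unaligned operand can be estimated by counting the aligned doublewords it touches. The sketch below is a simplified model with a hypothetical function name; it counts doubleword-aligned transfers only and ignores caching and burst cycles.

```python
def bus_transfers(address, size, bus_width=4):
    """Number of aligned bus-width transfers needed to cover an operand."""
    first = address // bus_width
    last = (address + size - 1) // bus_width
    return last - first + 1


print(bus_transfers(0x1000, 4))   # 1: aligned doubleword
print(bus_transfers(0x1001, 4))   # 2: doubleword straddles two aligned doublewords
print(bus_transfers(0x1003, 2))   # 2: word crosses a doubleword boundary
print(bus_transfers(0x1001, 2))   # 1: odd-addressed word inside one doubleword
```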
Figure A-30
Data Types
A.2.4.2
Operand Formats
Although bytes, words, and doublewords represent the way the microprocessor accesses
data in memory, specialized instructions can interpret and manipulate this digital information
in different forms. These operand forms include the following types (shown in Figure A-30):
n
Integer—a signed binary number held in a 32-bit doubleword, 16-bit word, or 8-bit byte.
All operations assume a two's complement representation. The sign bit is located in bit
7 in a byte, bit 15 in a word, and bit 31 in a doubleword. The sign bit is set for negative
integers, clear for positive integers and zero. The value of an 8-bit integer is from –128
to +127; a 16-bit integer from –32,768 to +32,767; a 32-bit integer from –2^31 to +2^31 – 1.
When used by the FPU, they are automatically converted to the 80-bit extended real
format, shown as a signed 79-bit integer in Figure A-30. All binary integers are exactly
representable in the extended real format.
n
Ordinal—an unsigned binary number contained in a 32-bit doubleword, 16-bit word, or
8-bit byte. The value of an 8-bit ordinal is from 0 to 255; a 16-bit ordinal from 0 to 65,535;
a 32-bit ordinal from 0 to 2^32 – 1.
n
Pointer—an offset address, or segment plus offset, used by a JUMP, conditional JUMP,
LOOP, or conditional LOOP instruction.
— Near Pointer: A 32-bit logical address. A near pointer is an offset within a segment.
Near pointers are used for all pointers in a flat memory model, or for references within
a segment in a segmented model.
— Far Pointer: A 48-bit logical address consisting of a 16-bit segment selector and a
32-bit offset. Far pointers are used in a segmented memory model to access other
segments.
n
String—a contiguous sequence of bits, bytes, words, or doublewords. A string may
contain from zero to 2^32 – 1 bytes (4 Gbytes). The bit sequences can be one of two types:
— Bit field: A contiguous sequence of bits. A bit field may begin at any bit position of
any byte and may contain up to 32 bits.
— Bit string: A contiguous sequence of bits. A bit string may begin at any bit position of
any byte and may contain up to 2^32 – 1 bits.
n
BCD—a representation of a binary-coded decimal (BCD) digit in the range 0–9. Unpacked decimal numbers are stored as unsigned byte quantities. One digit is stored in
each byte. The magnitude of the number is the binary value of the Low-order half-byte;
values 0–9 are valid and are interpreted as the value of a digit. The High-order half-byte
must be zero during multiplication and division; it may contain any value during addition
and subtraction. Packed BCD formats use a representation of binary-coded decimal
digits, each in the range 0–9. One digit is stored in each half-byte, two digits in each
byte. The digit in bits 4–7 is more significant than the digit in bits 0–3. Values 0–9 are
valid for a digit.
n
Real—the Am486 microprocessor represents real numbers of the form:
(–1)^s ⋅ 2^E ⋅ (b_0 ∆ b_1 b_2 b_3 ... b_{p–1})
where:
s = 0 or 1
E = any integer between Emin and Emax, inclusive
b_i = 0 or 1
∆ = implicit binary point
p = number of bits of precision
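The Integer, Ordinal, and BCD formats listed above can be illustrated with a few helper functions. The names are hypothetical and the sketch covers only value ranges and single-byte BCD encoding.

```python
def integer_range(bits):
    """Range of a two's complement integer of the given width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def ordinal_range(bits):
    """Range of an unsigned (ordinal) value of the given width."""
    return 0, 2 ** bits - 1


def to_packed_bcd(tens, ones):
    """Pack two decimal digits into one byte: bits 4-7 hold the more
    significant digit, bits 0-3 the less significant one."""
    if not (0 <= tens <= 9 and 0 <= ones <= 9):
        raise ValueError("BCD digits must be in the range 0-9")
    return (tens << 4) | ones


def from_packed_bcd(byte):
    """Decode one packed BCD byte into its decimal value."""
    high, low = byte >> 4, byte & 0x0F
    if high > 9 or low > 9:
        raise ValueError("invalid BCD digit")
    return high * 10 + low


print(integer_range(16))             # (-32768, 32767)
print(ordinal_range(8))              # (0, 255)
print(hex(to_packed_bcd(4, 2)))      # 0x42
print(from_packed_bcd(0x42))         # 42
```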
The Am486 microprocessor stores real numbers in a three-field binary format that resembles scientific, or exponential, notation. The format consists of the following fields:
n
The number’s significant digits are held in the significand field, b_0 ∆ b_1 b_2 b_3 ... b_{p–1} (the term
“significand” is analogous to the term “mantissa” used to describe floating-point numbers
on some computers; ∆ indicates the implicit binary point in the bit field).
n
The exponent field, e = E + bias, locates the binary point within the significant digits (and
therefore determines the number's magnitude). The term “exponent” is analogous to
the term “characteristic” used to describe floating-point numbers on some computers.
n
The 1-bit sign field indicates whether the number is positive or negative. Negative numbers differ from positive numbers only in the sign bits of their significands.
Table A-6 shows how the real number 178.125 (decimal) is stored in the single real format.
The table lists a progression of equivalent notations that express the same value to show
how a number can be converted from one form to another. (The ASM386/486 and PL/M386/486
language translators perform a similar process when they encounter programmer-defined real number constants.)
Table A-6
Real Number Notation
Notation                               Value
Ordinary Decimal                       178.125
Scientific Decimal                     1∆78125E2
Scientific Binary                      1∆0110010001E111
Scientific Binary (Biased Exponent)    1∆0110010001E10000110
Single Format (Normalized)             Sign: 0
                                       Biased Exponent: 10000110
                                       Significand: 01100100010000000000000 (leading 1∆ is implicit)
Note: Not every decimal fraction has an exact binary equivalent. The decimal number
1/10, for example, cannot be expressed exactly in binary (just as the number 1/3 cannot
be expressed exactly in decimal). When a translator encounters such a value, it produces
a rounded binary approximation of the decimal value.
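The single-format fields shown in Table A-6 can be reproduced with Python's standard struct module, which encodes floats in the same 32-bit single format. The helper name is hypothetical.

```python
import struct


def single_fields(value):
    """Split a number's single-format encoding into its three fields."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF    # e = E + 127 for single format
    fraction = bits & 0x7FFFFF               # significand without the implicit 1
    return sign, format(biased_exponent, "08b"), format(fraction, "023b")


print(single_fields(178.125))
# (0, '10000110', '01100100010000000000000')
```

The biased exponent 10000110 (134 decimal) corresponds to E = 7, matching the scientific binary form 1∆0110010001E111.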
A.2.5
Application Registers
The Am486 microprocessor contains sixteen registers that may be used by an application
programmer. As Figure A-31 shows, these registers may be grouped as:
n
General Registers: These eight 32-bit registers are free for use by the programmer.
n
Segment Registers: These registers hold segment selectors associated with different
forms of memory access. For example, there are separate segment registers for access
to code and stack space. These six registers determine, at any given time, which segments of memory are currently available.
n
Status and Control Registers: These registers report and allow modification of the state
of the Am486 microprocessor.
Figure A-31
Application Register Set
In Am486DX and DX2 processors, there are also the following registers that are part of the
floating-point unit (FPU) in the microprocessor:
n
FPU Register Stack: eight 80-bit numeric registers that are organized as a register
stack.
n
Status and Control Registers: 16-bit registers that contain the FPU status, control, and
tag words.
n
Error Pointers: five registers including two 16-bit registers that hold selectors for the
last 16-bit operation, two 32-bit registers that hold selectors for the last 32-bit operation,
and one 11-bit register that contains the opcode of the last non-control FPU instruction.
A.2.5.1
General Registers
The general registers are the 32-bit registers EAX, EBX, ECX, EDX, EBP, ESP, ESI, and
EDI. These registers are used to hold operands for logical and arithmetic operations. They
also may be used to hold operands for address calculations (except the ESP register cannot
be used as an index operand). The names of these registers are derived from the names
of the general registers on the 8086 microprocessor, the AX, BX, CX, DX, BP, SP, SI, and
DI registers. As Table A-7 shows, the Low 16 bits of the general registers can be referenced
using these names.
Table A-7
Register Names
8 Bit      16 Bit    32 Bit
AL, AH     AX        EAX
BL, BH     BX        EBX
CL, CH     CX        ECX
DL, DH     DX        EDX
           SI        ESI
           DI        EDI
           BP        EBP
           SP        ESP
Note: The 8-bit registers are the upper and lower bytes of the first four 16-bit registers.
The first four 16-bit registers are the lower words in the first four 32-bit registers. The position
in this table is designed to suggest the register interrelationships.
Each byte of the 16-bit registers AX, BX, CX, and DX also has its own name. The byte
registers are named AH, BH, CH, and DH (High bytes) and AL, BL, CL, and DL (Low bytes).
All of the general-purpose registers are available for address calculations and for the results
of most arithmetic and logical operations; however, a few instructions assign specific registers to hold operands. For example, string instructions use the contents of the ECX, ESI,
and EDI registers as operands. By assigning specific registers for these functions, the
instruction set can be encoded more compactly. The instructions using specific registers
include: double-precision multiply and divide, I/O, strings, translate, loop, variable shift and
rotate, and stack operations.
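The relationship between the 32-bit, 16-bit, and 8-bit register names can be modeled with masks and shifts. The function names below are hypothetical; the register value is treated as a plain 32-bit integer.

```python
def sub_registers(eax):
    """Return the AX, AH, and AL views of a 32-bit EAX value."""
    ax = eax & 0xFFFF               # Low 16 bits of EAX
    return {"AX": ax, "AH": (ax >> 8) & 0xFF, "AL": ax & 0xFF}


def write_al(eax, value):
    """Writing AL changes only the lowest byte of EAX."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)


views = sub_registers(0x12345678)
print({name: hex(v) for name, v in views.items()})
# {'AX': '0x5678', 'AH': '0x56', 'AL': '0x78'}
print(hex(write_al(0x12345678, 0xFF)))   # 0x123456ff
```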
A.2.5.2
Segment Registers
Segmentation gives system designers the flexibility to choose among various models of
memory organization. Implementation of memory models is the subject of Section A.2.2.
The segment registers contain 16-bit segment selectors, which index into tables in memory.
The tables hold the base address for each segment, as well as other information regarding
memory access. An unsegmented model is created by mapping each segment to the same
place in physical memory (see Figure A-32).
Figure A-32
Unsegmented Memory
At any instant, up to six segments of memory are immediately available. The segment
registers CS, DS, SS, ES, FS, and GS hold the segment selectors for these six segments.
Each register is associated with a particular kind of memory access (code, data, or stack).
Each register specifies a segment (from among the six possible segments available to each
program) used for its particular type of access (see Figure A-33). Other segments can be
used by loading their segment selectors into the segment registers.
The segment containing the instructions being executed is called the code segment. Its
segment selector is held in the CS register. The Am486 microprocessor fetches instructions
from the code segment, using the contents of the EIP register as an offset into the segment.
The CS register is loaded by interrupts, exceptions, and instructions that transfer control
between segments (e.g., the CALL, IRET, and JMP instructions).
Figure A-33
Segmented Memory
All stack operations use the SS register to find the stack segment. Unlike the CS register,
the SS register can be loaded explicitly, which permits application programs to set up stacks.
Before a procedure is called, a stack must be set up in the segment selected by the SS register to hold the return address, parameters
passed by the calling routine, and temporary variables allocated by the procedure.
The DS, ES, FS, and GS registers allow as many as four data segments to be available
simultaneously. Four data segments give efficient and secure access to different types of
data structures. For example, separate data segments can be created for the data structures
of the current module, data exported from a higher-level module, a dynamically created
data structure, and data shared with another program. If a bug causes a program to run
wild, the segmentation mechanism can limit the damage to only those segments allocated
to the program. An operand within a data segment is addressed by specifying its offset
either in an instruction or a general register.
Depending on the structure of data (i.e., the way data is partitioned into segments), a
program may require access to more than four data segments. To access additional segments, the DS, ES, FS, and GS registers can be loaded by an application program during
execution. The only requirement is to load the appropriate segment register before accessing data in its segment.
A base address is kept for each segment. To address data within a segment, a 32-bit offset
is added to the segment’s base address. Once a segment is selected (by loading the
segment selector into a segment register), an instruction only needs to specify the offset.
Simple rules define which segment register is used to form an address when only an offset
is specified.
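The address formation described above amounts to adding the offset to the selected segment's base. The sketch below uses hypothetical names and sample values, and adds a simple limit check as an assumption about how an out-of-range offset would be rejected.

```python
def linear_address(segment_base, offset, limit=0xFFFFFFFF):
    """Add a 32-bit offset to a segment base, rejecting offsets past the limit."""
    if offset > limit:
        raise ValueError("offset exceeds segment limit")
    return (segment_base + offset) & 0xFFFFFFFF


# A segment based at 00100000h with a 64-Kbyte limit (hypothetical values).
print(hex(linear_address(0x00100000, 0x1234, limit=0xFFFF)))   # 0x101234

# In the flat (unsegmented) model every base is 0, so the offset is the address.
print(hex(linear_address(0x00000000, 0x1234)))                 # 0x1234
```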
Stack operations are supported by three registers:
n
Stack Segment (SS) Register: Stacks reside in memory. The number of stacks in a
system is limited only by the maximum number of segments. A stack may be up to
4 Gbytes long, the maximum size of a segment on the Am486 microprocessor. One
stack is available at a time—the stack whose segment selector is held in the SS register.
This is the current stack, often referred to simply as “the stack.” The SS register is used
automatically by the microprocessor for all stack operations.
n
Stack Pointer (ESP) Register: The ESP register holds an offset to the top-of-stack (TOS)
in the current stack segment. It is used by PUSH and POP operations, subroutine calls
and returns, exceptions, and interrupts. When an item is pushed onto the stack (see
Figure A-34), the microprocessor decrements the ESP register, then writes the item at
the new TOS. When an item is popped off the stack, the microprocessor copies it from
the TOS, then increments the ESP register. In other words, the stack grows down in
memory toward lesser addresses.
n
Stack-Frame Base Pointer (EBP) Register: The EBP register typically is used to access
data structures passed on the stack. For example, on entering a subroutine, the stack
contains the return address and some number of data structures passed to the subroutine. The subroutine adds to the stack whenever it needs to create space for temporary
local variables. As a result, the stack pointer moves around as temporary variables are
pushed and popped. If the stack pointer is copied into the base pointer before anything
is pushed on the stack, the base pointer can be used to reference data structures with
fixed offsets. If this is not done, the offset to access a particular data structure would
change whenever a temporary variable is allocated or deallocated.
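The push and pop sequence described for the ESP register can be modeled as follows. The class is a hypothetical illustration that handles 32-bit items only and represents memory as a dictionary keyed by address.

```python
class Stack:
    """Minimal model of a 32-bit stack that grows toward lesser addresses."""

    def __init__(self, esp):
        self.esp = esp
        self.memory = {}

    def push(self, value):
        # Decrement ESP first, then write the item at the new top of stack.
        self.esp = (self.esp - 4) & 0xFFFFFFFF
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        # Copy the item from the top of stack, then increment ESP.
        value = self.memory[self.esp]
        self.esp = (self.esp + 4) & 0xFFFFFFFF
        return value


stack = Stack(esp=0x2000)
stack.push(0x11111111)
stack.push(0x22222222)
print(hex(stack.esp))      # 0x1ff8
print(hex(stack.pop()))    # 0x22222222
print(hex(stack.esp))      # 0x1ffc
```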
Figure A-34
Stacks
When the EBP register is used to address memory, the current stack segment is selected
(i.e., the SS segment). Because the stack segment does not have to be specified, instruction
encoding is more compact. The EBP register also can be used to address other segments.
Instructions such as ENTER and LEAVE are provided to set up the EBP register automatically for convenient access to variables.
Instructions that use the stack implicitly (for example: POP EAX) also have a stack address-size attribute of either 16 or 32 bits. Instructions with a stack address-size attribute of 16
use the 16-bit SP stack pointer register; instructions with a stack address-size attribute of
32 bits use the 32-bit ESP register to form the address of the top of the stack. The stack
address-size attribute is controlled by the B bit of the data-segment descriptor in the SS
register. A value of zero in the B bit selects a stack address-size attribute of 16; a value of
one selects a stack address-size attribute of 32.
A.2.5.3
Status and Control Registers
The status and control registers include the 16-bit FLAGS and the 32-bit EFLAGS registers
and the 16-bit IP and the 32-bit EIP registers. The 16-bit registers provide compatibility with
systems using 16-bit memory access.
A.2.5.3.1
Flags Register
Condition codes (e.g., carry, sign, overflow) and mode bits are kept in a 32-bit register
named EFLAGS. Figure A-35 defines the bits within this register. The flags control certain
operations and indicate the status of the Am486 microprocessor. The flags may be considered in three groups: status flags, control flags, and system flags.
Figure A-35
EFLAGS Register
The status flags of the EFLAGS register report the kind of result produced from the execution
of arithmetic instructions. The MOV instruction does not affect these flags. Conditional
jumps and subroutine calls allow a program to sense the state of the status flags and
respond to them. For example, when the counter controlling a loop is decremented to zero,
the state of ZF changes, and this change can be used to suppress the conditional jump to
the start of the loop. The status flags are shown in Table A-8.
Table A-8
Status Flags
Name    Purpose            Condition Reported
OF      overflow           Result exceeds positive or negative limit of number range
SF      sign               Result is negative (less than zero)
ZF      zero               Result is zero
AF      auxiliary carry    Carry out of bit position 3 (used for BCD)
PF      parity             Low byte of result has even parity (even number of set bits)
CF      carry              Carry out of most-significant bit of result
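The conditions in Table A-8 can be illustrated for an 8-bit addition. The function below is a hypothetical model of how the flags would be derived for ADD on byte operands, not a description of the hardware.

```python
def add8_flags(a, b):
    """Add two 8-bit values and derive the six status flags."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    return {
        "result": result,
        "CF": int(total > 0xFF),                          # carry out of bit 7
        "ZF": int(result == 0),
        "SF": result >> 7,                                # copy of the sign bit
        "AF": int((a & 0xF) + (b & 0xF) > 0xF),           # carry out of bit 3
        "PF": int(bin(result).count("1") % 2 == 0),       # even number of set bits
        # Overflow: both operands have the same sign and the result's differs.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }


print(add8_flags(0x7F, 0x01))
# {'result': 128, 'CF': 0, 'ZF': 0, 'SF': 1, 'AF': 1, 'PF': 0, 'OF': 1}
print(add8_flags(0xFF, 0x01))
# {'result': 0, 'CF': 1, 'ZF': 1, 'SF': 0, 'AF': 1, 'PF': 1, 'OF': 0}
```

The first case overflows as a signed addition (127 + 1) without a carry; the second carries as an unsigned addition (255 + 1) without signed overflow.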
When set, the control flag DF of the EFLAGS register causes string instructions to auto-decrement
(i.e., to process strings from High addresses to Low addresses). Clearing DF causes string
instructions to auto-increment (i.e., to process strings from Low addresses to High
addresses).
A.2.5.3.2
Instruction Pointer
The instruction pointer (EIP) register contains the offset in the current code segment for
the next instruction to execute. The instruction pointer is not directly available to the programmer; it is controlled implicitly by control transfer instructions (jumps, returns, etc.),
interrupts, and exceptions. The EIP register advances from one instruction boundary to the
next.
Because of instruction prefetching, the instruction boundary is only an approximate indication of the bus activity that loads instructions into the microprocessor. The Am486 microprocessor does not fetch single instructions. The microprocessor prefetches aligned 128-bit blocks of instruction code in advance of instruction execution (an aligned 128-bit block
begins at an address that is clear in its Low four bits). These blocks are fetched without
regard to the boundaries between instructions. By the time an instruction starts to execute,
it already has been loaded into the microprocessor and decoded. This is a performance
feature, because it allows instruction execution to be overlapped with instruction prefetch
and decode.
When a jump or call executes, the microprocessor prefetches the entire aligned block
containing the destination address and discards instructions that are already prefetched or
decoded. This can be a benefit because the microprocessor does not generate an exception
until the causative code actually executes. So, if the originally prefetched sequence
includes something that could generate an exception, such as code that lies beyond the
end of the code segment, a jump or call that discards that sequence prevents the
exception from occurring.
In Real Address Mode, prefetching may cause the microprocessor to access addresses
not anticipated by programmers. In Protected Mode, exceptions are correctly reported when
these addresses are executed. There may not be hardware mechanisms that account for
Real Address Mode behavior of the microprocessor. For example, if a system does not
return the READY signal (the signal that terminates a bus cycle) for bus cycles to unimplemented addresses, prefetching must not reference these addresses. If a system implements parity checking, prefetching must not access addresses beyond the end of parity-protected memory. (Alternatively, the hardware design can cause READY to be returned
even for unimplemented address bus cycles, and parity errors can be ignored for prefetches
beyond the end of parity-protected memory.)
Prefetching can be kept from referencing a particular address by placing enough distance
between the address and the last executable byte. For example, to keep prefetching away
from addresses in the block from 10000h to 1000Fh, the last executable byte should be no
closer than 0FFEEh. This places one free byte followed by one free, aligned, 128-bit block
between the last byte of the last instruction and the address that must not be referenced.
The prefetching behavior of the Am486 microprocessor is implementation-dependent; future AMD products may have different prefetching behavior.
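The spacing rule above (one free byte followed by one free, aligned, 128-bit block) can be expressed as a small calculation. The following Python sketch is illustrative only; the function name last_safe_executable_byte is hypothetical, and the rule it encodes is the one stated in this section for the Am486 microprocessor, which may not hold for other implementations.

```python
BLOCK = 16  # bytes in one aligned 128-bit prefetch block


def last_safe_executable_byte(forbidden_addr):
    """Return the highest address the last instruction byte may occupy.

    forbidden_addr is any address inside the block that prefetching
    must never reference.
    """
    forbidden_block = forbidden_addr & ~(BLOCK - 1)  # align down to 16 bytes
    # Leave one free aligned block and one free byte below the forbidden block.
    return forbidden_block - BLOCK - 2


if __name__ == "__main__":
    print(hex(last_safe_executable_byte(0x10000)))
```

For the block from 10000h to 1000Fh this yields 0FFEEh, matching the example in the text.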
A.2.5.4
FPU Registers
The FPU uses the following registers, shown in Figure A-36:
■ Eight individually-addressable 80-bit numeric registers, organized as a register stack
■ Three 16-bit registers:
— FPU Status Word
— FPU Control Word
— Tag Word
■ Error pointers:
— Two 16-bit registers containing selectors for the last instruction and operand
— Two 32-bit registers containing offsets for the last instructions and operand
— One 11-bit register containing the opcode of the last non-control FPU instruction
The FPU instructions use the contents of these registers for their operations.
Figure A-36
Am486 Microprocessor FPU Register Set
A.2.5.4.1
FPU Register Stack
The FPU register stack is shown in Figure A-36. Each of the eight numeric registers in the
stack is 80 bits wide and is divided into fields corresponding to the Am486 microprocessor's
extended real data type.
Numeric instructions address the data registers relative to the register on the top of the
stack. At any point in time, this top-of-stack register is indicated by the TOP (stack TOP)
field in the FPU status word. Load or push operations decrement TOP by one and load a
value into the new top register. A store-and-pop operation stores the value from the current
TOP register and then increments TOP by one. Like stacks in memory, the FPU register
stack grows down toward lower-addressed registers.
Many numeric instructions have several addressing modes that permit the programmer to
implicitly operate on the top of the stack, or to explicitly operate on specific registers relative
to the TOP. The ASM386/486 assembler supports these register addressing modes, using
the expression ST(0), or simply ST, to represent the current Stack Top and ST(i) to specify
the ith register from TOP in the stack (0 ≤ i ≤ 7). For example, if TOP contains 011B (register
3 is the top of the stack), the following statement would add the contents of two registers
in the stack (registers 3 and 5):
FADD ST, ST(2)
The stack organization and top-relative addressing of the numeric registers simplify subroutine programming by allowing routines to pass parameters on the register stack. By
using the stack to pass parameters rather than using “dedicated” registers, calling routines
gain more flexibility in how they use the stack. As long as the stack is not full, each routine
simply loads the parameters onto the stack before calling a particular subroutine to perform
a numeric calculation. The subroutine then addresses its parameters as ST, ST(1), etc.,
even though TOP may, for example, refer to physical register 3 in one invocation and
physical register 5 in another.
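The top-relative addressing described in this section can be modeled in a few lines. The following Python sketch is a simplified illustration: the class name FpuStack and its methods are hypothetical, and the model omits tag checking and stack overflow or underflow detection.

```python
class FpuStack:
    """Minimal model of the eight-register FPU stack and its TOP field."""

    def __init__(self):
        self.regs = [None] * 8  # physical registers 0-7; None marks empty
        self.top = 0            # 3-bit TOP field of the status word

    def physical(self, i):
        """Physical register number addressed by ST(i)."""
        return (self.top + i) & 7

    def push(self, value):
        self.top = (self.top - 1) & 7  # decrement TOP, wrapping modulo 8
        self.regs[self.top] = value

    def pop(self):
        value = self.regs[self.top]
        self.regs[self.top] = None
        self.top = (self.top + 1) & 7  # increment TOP, wrapping modulo 8
        return value


if __name__ == "__main__":
    stack = FpuStack()
    stack.top = 0b011
    print(stack.physical(0), stack.physical(2))
```

With TOP equal to 011B, ST maps to physical register 3 and ST(2) maps to physical register 5, as in the FADD example above.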
A.2.5.4.2
FPU Status and Control Registers
The three 16-bit status and control registers perform control and monitoring functions for
the FPU. They include:
FPU Status Word—this 16-bit status word reflects the overall state of the FPU (see Figure
A-37). This status word may be stored into memory using the FSTSW/FNSTSW, FSTENV/
FNSTENV, and FSAVE/FNSAVE instructions, and can be transferred into the AX register
with the FSTSW AX/FNSTSW AX instructions, allowing the FPU status to be inspected by
the Integer Unit.
Figure A-37
FPU Status Word
Note: The B-bit (bit 15) is included for 8087 compatibility only. It reflects the contents of the ES bit (bit 7 of the
status word).
The four FPU condition code bits (C3–C0) are similar to the other status flags. The Am486
microprocessor updates these bits to reflect the outcome of arithmetic operations. Table
A-9 summarizes the effect of the FPU instructions on the condition code bits. The condition
code bits are used principally for conditional branching. The FSTSW AX instruction stores
the FPU status word directly into the AX register, where integer instructions can inspect
it. The SAHF instruction can then copy C3–C0 directly to Am486 microprocessor flag
bits to simplify conditional branching. Table A-10 shows the mapping of these bits to the
flag bits.
Table A-9
Condition Code Interpretation
Instruction | C0 | C3 | C2 | C1
FCOM, FCOMP, FCOMPP, FTST, FUCOM, FUCOMP, FUCOMPP, FICOM, FICOMP | Result of comparison | Result of comparison | Operand is not comparable | Zero or O/U
FXAM | Operand class | Operand class | Operand class | Sign or O/U
FPREM, FPREM1 | Q2 | Q0 | 0 = reduction complete; 1 = reduction incomplete | Q1 or O/U
FIST, FBSTP, FRNDINT, FST, FSTP, FADD, FMUL, FDIV, FDIVR, FSUB, FSUBR, FSCALE, FSQRT, FPATAN, F2XM1, FYL2X, FYL2XP1 | Undefined | Undefined | Undefined | Roundup or O/U
FPTAN, FSIN, FCOS, FSINCOS | Undefined | Undefined | 0 = reduction complete; 1 = reduction incomplete | Roundup or O/U (Undefined if C2 = 1)
FCHS, FABS, FXCH, FINCSTP, FLD, FILD, Constant Loads (FLDxx), FXTRACT, FBLD, FSTP (ext. real) | Undefined | Undefined | Undefined | Zero or O/U
FLDENV, FRSTOR | Each bit loaded from memory | Each bit loaded from memory | Each bit loaded from memory | Each bit loaded from memory
FLDCW, FSTENV, FSTCW, FSTSW, FCLEX | Undefined | Undefined | Undefined | Undefined
FINIT, FSAVE | Zero | Zero | Zero | Zero
Notes:
O/U: When both IE and SF bits of the status word are set, indicating a stack exception, this bit distinguishes
between a stack overflow (C1 = 1) and underflow (C1 = 0).
Reduction: If FPREM or FPREM1 produces a remainder less than the modulus, reduction is complete.
Incomplete reduction leaves a partial remainder value at the top of the stack. This remainder can be used
for further reduction. For FPTAN, FSIN, FCOS, and FSINCOS, the bit is set if the operand at the top of stack
is too large; for this case, the original operand remains at the top of the stack.
Undefined: No specific value is defined for these bits.
Table A-10
Correspondence between FPU Flags and Processor Flag Bits
FPU Flag | Processor Flag
C0 | CF
C1 | (none)
C2 | PF
C3 | ZF
Bits 11–13 of the status word point to the FPU register that is the current Top of Stack
(TOP). The significance of the stack top has been described in the prior section on the
register stack.
Figure A-37 shows the six exception flags in bits 0–5 of the status word. Bit 7 is the exception
summary status (ES) bit. ES is set if any unmasked exception bits are set, and is cleared
otherwise. Bits 0–5 indicate whether the FPU has detected one of six possible exception
conditions since these status bits were last cleared or reset. They are “sticky” bits, and can
only be cleared by the instructions FINIT, FCLEX, FLDENV, FSAVE, and FRSTOR.
Bit 6 is the stack fault (SF) bit. This bit distinguishes invalid operations due to stack overflow
or underflow from other kinds of invalid operations. When SF is set, bit 9 (C1) distinguishes
between stack overflow (C1 = 1) and underflow (C1 = 0).
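The status word fields described above can be unpacked with shifts and masks. The following Python sketch is illustrative; the function names are hypothetical. The positions of bits 0–5, SF, ES, C1, TOP, and B come from the text, while the positions of C0 (bit 8), C2 (bit 10), and C3 (bit 14) are assumed from the conventional x87 status word layout, because Figure A-37 is not reproduced here.

```python
def decode_status_word(sw):
    """Split a 16-bit FPU status word into its fields."""
    return {
        "exceptions": sw & 0x3F,  # bits 0-5: sticky exception flags
        "SF": (sw >> 6) & 1,      # bit 6: stack fault
        "ES": (sw >> 7) & 1,      # bit 7: exception summary
        "C0": (sw >> 8) & 1,      # assumed position (bit 8)
        "C1": (sw >> 9) & 1,      # bit 9
        "C2": (sw >> 10) & 1,     # assumed position (bit 10)
        "TOP": (sw >> 11) & 7,    # bits 11-13: top-of-stack pointer
        "C3": (sw >> 14) & 1,     # assumed position (bit 14)
        "B": (sw >> 15) & 1,      # bit 15: kept for 8087 compatibility
    }


def flags_after_sahf(sw):
    """Model FSTSW AX followed by SAHF, per the mapping in Table A-10."""
    ah = (sw >> 8) & 0xFF  # high byte of the status word lands in AH
    return {"CF": ah & 1, "PF": (ah >> 2) & 1, "ZF": (ah >> 6) & 1}


if __name__ == "__main__":
    print(decode_status_word(0x5900))
    print(flags_after_sahf(0x5900))
```

C1 has no counterpart in the second function because, as Table A-10 shows, it does not map to a processor flag.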
A.2.5.4.3
Control Word
The FPU provides the programmer with several processing options, which are selected by
loading a word from memory into the control word. Figure A-38 shows the format and
encoding of the fields in the control word.
Figure A-38
FPU Control Word Format
The Low-order byte of this control word configures the numerical exception masking. Bits
0–5 of the control word contain individual masks for each of the six floating-point exception
conditions recognized by the Am486 microprocessor. The High-order byte of the control
word configures the FPU processing options, including:
■ Precision control
■ Rounding control
The precision-control bits (bits 8–9) can be used to set the FPU internal operating precision
at less than the default precision (64-bit significand). These control bits can be used to
provide compatibility with the earlier-generation arithmetic processors having less precision
than the 486 microprocessor or 387 math coprocessor. The precision control bits affect the
results of only the following five arithmetic instructions: ADD, SUB(R), MUL, DIV(R), and
SQRT. No other operations are affected by precision control.
The rounding-control bits (bits 10–11) provide for the common round-to-nearest mode, as
well as directed rounding and true chop. Rounding control affects the arithmetic instructions
(refer to Chapter 2 for lists of arithmetic and non-arithmetic instructions) and certain non-arithmetic instructions, namely the FLD constant and FST(P) mem instructions.
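The control word fields can be decoded in the same way as the status word. The following Python sketch is illustrative; the function name is hypothetical. The bit positions come from the text, but the two-bit encodings in PRECISION and ROUNDING are assumed from the conventional x87 control word definition, because Figure A-38 is not reproduced here.

```python
# Assumed encodings (conventional x87 values; Figure A-38 is not shown here).
PRECISION = {
    0b00: "24 bits (single)",
    0b01: "reserved",
    0b10: "53 bits (double)",
    0b11: "64 bits (extended)",
}
ROUNDING = {
    0b00: "round to nearest",
    0b01: "round down",
    0b10: "round up",
    0b11: "chop (toward zero)",
}


def decode_control_word(cw):
    """Split a 16-bit FPU control word into masks and processing options."""
    return {
        "masks": cw & 0x3F,                     # bits 0-5: exception masks
        "precision": PRECISION[(cw >> 8) & 3],  # bits 8-9: precision control
        "rounding": ROUNDING[(cw >> 10) & 3],   # bits 10-11: rounding control
    }


if __name__ == "__main__":
    print(decode_control_word(0x037F))
```

The value 037Fh used in the demonstration masks all six exceptions and selects 64-bit precision with round-to-nearest.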
A.2.5.4.4
FPU Tag Word
The tag word indicates the contents of each register in the stack (see Figure A-39). The
tag word is used by the FPU itself to distinguish between empty and nonempty register
locations. Programmers of exception handlers may use this tag information to check the
contents of a numeric register without performing complex decoding of the actual data in
the register. The tag values from the tag word correspond to physical registers 0–7. Programmers must use the current top-of-stack (TOP) pointer stored in the FPU status word
to associate these tag values with the relative stack registers ST(0)–ST(7).
Figure A-39
Tag Word Format
The exact values of the tags are generated during execution of the FSTENV and FSAVE
instructions according to the actual contents of the non-empty stack locations. During execution of other instructions, the Am486 microprocessor updates the tag values only to
indicate whether a stack location is empty or non-empty.
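Associating tag values with stack-relative registers requires combining the tag word with TOP, as described above. The following Python sketch is illustrative; the function name is hypothetical. It assumes the conventional x87 layout of two tag bits per physical register (register 0 in bits 0–1) and the conventional tag encodings, because Figure A-39 is not reproduced here.

```python
# Assumed tag encodings (conventional x87 values; Figure A-39 is not shown here).
TAGS = {0b00: "valid", 0b01: "zero", 0b10: "special", 0b11: "empty"}


def tags_by_stack_position(tag_word, top):
    """Return the tags for ST(0) through ST(7), given the tag word and TOP."""
    tags = []
    for i in range(8):
        phys = (top + i) & 7                     # physical register behind ST(i)
        tags.append(TAGS[(tag_word >> (2 * phys)) & 3])
    return tags


if __name__ == "__main__":
    # Physical register 3 holds a valid number, register 4 holds zero.
    print(tags_by_stack_position(0xFD3F, top=3))
```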
A.2.5.4.5
Numeric Instruction and Data Pointers
The instruction and data pointers provide support for programmed exception-handlers.
These registers are accessed by the ESC instructions FLDENV, FSTENV, FSAVE, and
FRSTOR. Whenever the Am486 microprocessor decodes an ESC instruction, it saves the
instruction address, the operand address (if present), and the instruction opcode.
When stored in memory, the instruction and data pointers appear in one of four formats,
depending on the operating mode of the microprocessor (Protected Mode or Real Address
Mode) and depending on the operand-size attribute in effect (32-bit operand or 16-bit
operand). In Virtual 8086 Mode, the Real Address Mode formats are used. Figures A-40
through A-43 show these pointers as they are stored following an FSTENV instruction.
Figure A-40
Protected Mode Numeric Instruction and Data Pointer Image in Memory, 32-Bit Format
Figure A-41
Real Mode Numeric Instruction and Data Pointer Image in Memory, 32-Bit Format
Figure A-42
Protected Mode Numeric Instruction and Data Pointer Image in Memory, 16-Bit Format
Figure A-43
Real Mode Numeric Instruction and Data Pointer Image in Memory, 16-Bit Format
The FSTENV and FSAVE instructions store this data into memory, allowing exception
handlers to determine the precise nature of any numeric exceptions that may be
encountered.
The saved instruction address points to any prefixes that preceded the instruction, as in
the 387 and 287 math coprocessors. This is different from the 8087 coprocessor, for which
the instruction address points only to the ESC instruction opcode.
Note: The microprocessor control instructions FINIT, FLDCW, FSTCW, FSTSW, FCLEX,
FSTENV, FLDENV, FSAVE, and FRSTOR do not affect the data pointer. Also, except for the
instructions just mentioned, the value of the data pointer is undefined if the prior ESC
instruction did not have a memory operand.
A.2.5.4.6
Opcode Field of Last Instruction
The opcode field in Figure A-44 describes the 11-bit format of the last non-control FPU
instruction executed. The first and second instruction bytes (after all prefixes) are combined
to form the opcode field. Since all floating-point instructions share the same 5 upper bits
in the first instruction byte (following prefixes), they are not stored in the opcode field. Note
that the second instruction byte is actually located in the Low-order byte of the stored opcode
field.
Figure A-44
Opcode Field
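The construction of the 11-bit opcode field can be sketched as follows. This Python example is illustrative; the function name fpu_opcode_field is hypothetical. It relies on the fact that the five upper bits shared by all floating-point (ESC) opcodes are 11011B, so the first byte lies in the range D8h through DFh.

```python
def fpu_opcode_field(byte1, byte2):
    """Combine the two instruction bytes (after prefixes) into 11 bits."""
    if byte1 & 0xF8 != 0xD8:  # upper five bits must be 11011B
        raise ValueError("not an ESC (floating-point) opcode byte")
    # Low three bits of the first byte, then the whole second byte.
    return ((byte1 & 0x07) << 8) | byte2


if __name__ == "__main__":
    # FADD ST, ST(2) is encoded as D8 C2; FSQRT is encoded as D9 FA.
    print(hex(fpu_opcode_field(0xD8, 0xC2)))
    print(hex(fpu_opcode_field(0xD9, 0xFA)))
```

The second instruction byte ends up in the low-order byte of the result, as noted above.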
A.2.6
Instruction Format
The instruction format uses a combination of explicit and implicit conditions to set the
environment for executing a specific command. For example, when executing an instruction, the Am486 microprocessor can address memory using either 16- or 32-bit addresses.
Accordingly, each instruction that uses memory addresses has an associated address-size
attribute of either 16 or 32 bits. Using a 16-bit address implies both the use of 16-bit
displacements in instructions and the generation of 16-bit address offsets (segment relative
addresses) as the result of the effective address calculations. Using 32-bit addresses implies the use of 32-bit displacements and the generation of 32-bit address offsets. Similarly,
an instruction that accesses words (16 bits) or doublewords (32 bits) has an operand-size
attribute of either 16 or 32 bits.
The attributes are determined by a combination of defaults, instruction prefixes, and (for
programs executing in Protected Mode) size-specification bits in segment descriptors. The
information encoded in an instruction includes a specification of the operation to be performed, the type of the operands to be manipulated, and the location of these operands. If
an operand is located in memory, the instruction also must select, explicitly or implicitly, the
segment that contains the operand.
Chapter 2 provides a complete listing and description of Am486 microprocessor instructions. All non-floating-point instruction encodings are subsets of the general instruction
format shown in Figure A-45.
Figure A-45
General Instruction Format
Instructions consist of:
■ Instruction prefixes (optional)
■ Primary opcode bytes (one or two)
■ Address specifier with mod r/m byte and Scale Index Base (s-i-b) byte, if required
■ Displacement, if required
■ Immediate data field, if required
Floating-point instruction mnemonics all begin with the letter “F”. The instructions have a basic 2-byte format that
may have a 1- or 2-byte optional address specifier field. The basic FPU instruction layout
is included in Figure A-46.
Figure A-46
Floating-Point Instruction Formats
Table A-11
Address Mode Field (mod/rm) Definitions (no s-i-b present)
Value (mod r/m =) | Effective Address, 16-Bit Address Mode | Effective Address, 32-Bit Address Mode
00 000 | DS:[BX + SI] | DS:[EAX]
00 001 | DS:[BX + DI] | DS:[ECX]
00 010 | SS:[BP + SI] | DS:[EDX]
00 011 | SS:[BP + DI] | DS:[EBX]
00 100 | DS:[SI] | s-i-b present (see Tables A-12 through A-14)
00 101 | DS:[DI] | DS:immediate doubleword
00 110 | DS:immediate word | DS:[ESI]
00 111 | DS:[BX] | DS:[EDI]
01 000 | DS:[BX + SI + immediate byte] | DS:[EAX + immediate byte]
01 001 | DS:[BX + DI + immediate byte] | DS:[ECX + immediate byte]
01 010 | SS:[BP + SI + immediate byte] | DS:[EDX + immediate byte]
01 011 | SS:[BP + DI + immediate byte] | DS:[EBX + immediate byte]
01 100 | DS:[SI + immediate byte] | s-i-b present (see Tables A-12 through A-14)
01 101 | DS:[DI + immediate byte] | SS:[EBP + immediate byte]
01 110 | SS:[BP + immediate byte] | DS:[ESI + immediate byte]
01 111 | DS:[BX + immediate byte] | DS:[EDI + immediate byte]
10 000 | DS:[BX + SI + immediate word] | DS:[EAX + immediate doubleword]
10 001 | DS:[BX + DI + immediate word] | DS:[ECX + immediate doubleword]
10 010 | SS:[BP + SI + immediate word] | DS:[EDX + immediate doubleword]
10 011 | SS:[BP + DI + immediate word] | DS:[EBX + immediate doubleword]
10 100 | DS:[SI + immediate word] | s-i-b present (see Tables A-12 through A-14)
10 101 | DS:[DI + immediate word] | SS:[EBP + immediate doubleword]
10 110 | SS:[BP + immediate word] | DS:[ESI + immediate doubleword]
10 111 | DS:[BX + immediate word] | DS:[EDI + immediate doubleword]
The following values specify General Registers:
Value (mod r/m =) | 16-Bit Data Operations, w = 0 | 16-Bit Data Operations, w = 1 | 32-Bit Data Operations, w = 0 | 32-Bit Data Operations, w = 1
11 000 | AL | AX | AL | EAX
11 001 | CL | CX | CL | ECX
11 010 | DL | DX | DL | EDX
11 011 | BL | BX | BL | EBX
11 100 | AH | SP | AH | ESP
11 101 | CH | BP | CH | EBP
11 110 | DH | SI | DH | ESI
11 111 | BH | DI | BH | EDI
Table A-12
Scale Field (ss) Definitions
Value (ss=) | Scale Factor
00 | x1
01 | x2
10 | x4
11 | x8
Table A-13
Index Field (index) Definitions
Value (index=) | Indexed Register
000 | EAX
001 | ECX
010 | EDX
011 | EBX
100 | no index register
101 | EBP
110 | ESI
111 | EDI
Note: When index = 100, the ss field must equal 00. If not, the effective address is undefined.
Table A-14
Base Field (base) Definitions
mod r/m = | Value (base=) | Effective Address
00 100 | 000 | DS:[EAX + (scaled index)]
00 100 | 001 | DS:[ECX + (scaled index)]
00 100 | 010 | DS:[EDX + (scaled index)]
00 100 | 011 | DS:[EBX + (scaled index)]
00 100 | 100 | SS:[ESP + (scaled index)]
00 100 | 101 | DS:[immediate doubleword + (scaled index)]
00 100 | 110 | DS:[ESI + (scaled index)]
00 100 | 111 | DS:[EDI + (scaled index)]
01 100 | 000 | DS:[EAX + (scaled index) + immediate byte]
01 100 | 001 | DS:[ECX + (scaled index) + immediate byte]
01 100 | 010 | DS:[EDX + (scaled index) + immediate byte]
01 100 | 011 | DS:[EBX + (scaled index) + immediate byte]
01 100 | 100 | SS:[ESP + (scaled index) + immediate byte]
01 100 | 101 | SS:[EBP + (scaled index) + immediate byte]
01 100 | 110 | DS:[ESI + (scaled index) + immediate byte]
01 100 | 111 | DS:[EDI + (scaled index) + immediate byte]
10 100 | 000 | DS:[EAX + (scaled index) + immediate doubleword]
10 100 | 001 | DS:[ECX + (scaled index) + immediate doubleword]
10 100 | 010 | DS:[EDX + (scaled index) + immediate doubleword]
10 100 | 011 | DS:[EBX + (scaled index) + immediate doubleword]
10 100 | 100 | SS:[ESP + (scaled index) + immediate doubleword]
10 100 | 101 | SS:[EBP + (scaled index) + immediate doubleword]
10 100 | 110 | DS:[ESI + (scaled index) + immediate doubleword]
10 100 | 111 | DS:[EDI + (scaled index) + immediate doubleword]
A.2.6.1
Instruction Prefixes
Allowable instruction prefix codes include:
■ REP/REPE/REPNE/REPNZ/REPZ: Repeat instruction codes used with string
instructions
■ LOCK: Forces the system to invoke the LOCK signal
■ Segment Override: Requires the instruction to use the specified segment register (CS,
DS, ES, FS, GS, or SS)
■ Operand size override: Requires the instruction to use the specified operand size instead
of the default value
■ Address size override: Requires the instruction to use the specified address size instead
of the default value
Note: For programs running in Protected Mode, the D bit in executable-segment descriptors
specifies the default attribute for both address size and operand size. These default
attributes apply to the execution of all instructions in the segment. A clear D bit sets the
default address size and operand size to 16 bits; a set D bit, to 32 bits. Programs that
execute in Real Mode or Virtual 8086 Mode have 16-bit addresses and operands by default.
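The interaction between the D bit and the size-override prefixes can be sketched as follows. This Python example is illustrative; the function name effective_sizes is hypothetical. The prefix byte values (66h for operand size, 67h for address size) are the standard x86 encodings and are not stated in this section. For Real Mode or Virtual 8086 Mode, pass a D bit of zero.

```python
OPERAND_SIZE_PREFIX = 0x66  # standard x86 operand-size override byte
ADDRESS_SIZE_PREFIX = 0x67  # standard x86 address-size override byte


def effective_sizes(d_bit, prefixes=()):
    """Return (operand_size, address_size) in bits for one instruction."""
    default = 32 if d_bit else 16
    other = 16 if d_bit else 32
    # Each override prefix selects the non-default size for its attribute.
    operand = other if OPERAND_SIZE_PREFIX in prefixes else default
    address = other if ADDRESS_SIZE_PREFIX in prefixes else default
    return operand, address


if __name__ == "__main__":
    print(effective_sizes(1))
    print(effective_sizes(1, [0x66]))
    print(effective_sizes(0, [0x66, 0x67]))
```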
A.2.6.2
Opcode Fields
The opcode fields define the operation, but all can have smaller encoding fields within them
that define the operation direction, displacement sizes, the register encoding, or sign extension; encoding fields vary depending on the class of operation.
A.2.6.3
Address Specifier
Most instructions that can refer to an operand in memory have an addressing form byte
after the primary opcode byte(s). This byte, called the mod r/m byte, specifies the address
form to be used. Certain encodings of the mod r/m byte indicate a second addressing byte,
the s-i-b byte, which follows the mod r/m byte and is required to fully specify the addressing
form. Addressing forms can include a displacement immediately following either the mod
r/m or s-i-b byte. If a displacement is present, it can be 8, 16, or 32 bits. The 8-bit form is
used in the common case when the displacement is sufficiently small. The microprocessor
extends an 8-bit displacement to 16 or 32 bits, taking into account the sign.
The mod r/m and s-i-b bytes contain the following information:
■ The indexing type or register number to be used in the instruction
■ The register to be used, or more information to select the instruction
■ The base, index, and scale information
The mod r/m byte contains three fields of information:
■ The mod field, which occupies the two most-significant bits of the byte, combines with
the r/m field to form 32 possible values: eight registers and 24 indexing modes.
■ The reg field, which occupies the next three bits following the mod field, specifies either
a register number or three more bits of opcode information. The meaning of the reg field
is determined by the first (opcode) byte of the instruction.
■ The r/m field, which occupies the three least-significant bits of the byte, can specify a
register as the location of an operand, or can form part of the addressing-mode encoding
in combination with the mod field as described above.
The based indexed and scaled indexed forms of 32-bit addressing require the s-i-b byte.
The presence of the s-i-b byte is indicated by certain encodings of the mod r/m byte. The
s-i-b byte then includes the following fields:
■ The ss field (the two most-significant bits of the byte) specifies the scale factor
■ The index field (the next three bits after the ss field) specifies the index register number
■ The base field (the three least-significant bits of the byte) specifies the base register
number
Figure A-47 shows the formats of the mod r/m and s-i-b bytes. (See also Tables A-11–A-14.)
Figure A-47
mod R/M and s-i-b Byte Formats
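The field layout described in this section, together with the 32-bit columns of Tables A-11 through A-14, can be exercised with a short decoder. The following Python sketch is illustrative; all function names are hypothetical, and the labels disp8 and disp32 stand for what the tables call an immediate byte and an immediate doubleword. It covers 32-bit address mode only.

```python
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def split_modrm(byte):
    """Return the (mod, reg, r/m) fields of a mod r/m byte."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


def split_sib(byte):
    """Return the (ss, index, base) fields of an s-i-b byte."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


def describe_32bit(modrm, sib=None):
    """Describe the operand selected by mod r/m (and s-i-b) in 32-bit mode."""
    mod, _reg, rm = split_modrm(modrm)
    if mod == 3:
        return REG32[rm]  # register operand (w = 1, 32-bit data)
    disp = {0: "", 1: " + disp8", 2: " + disp32"}[mod]
    if rm != 4:  # no s-i-b byte: Table A-11
        if mod == 0 and rm == 5:
            return "DS:[disp32]"
        seg = "SS" if rm == 5 else "DS"
        return f"{seg}:[{REG32[rm]}{disp}]"
    if sib is None:
        raise ValueError("this mod r/m encoding requires an s-i-b byte")
    ss, index, base = split_sib(sib)  # Tables A-12 through A-14
    if mod == 0 and base == 5:
        seg, parts = "DS", ["disp32"]
    else:
        seg = "SS" if base in (4, 5) else "DS"
        parts = [REG32[base]]
    if index != 4:  # index = 100 means no index register
        parts.append(f"{REG32[index]}*{1 << ss}")
    return f"{seg}:[{' + '.join(parts)}{disp}]"


if __name__ == "__main__":
    print(describe_32bit(0x04, 0x88))
    print(describe_32bit(0x45))
```

The segment shown is the default one; a segment-override prefix, which this sketch does not model, would replace it.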
A.2.6.4
Immediate Operand
If the instruction specifies an immediate operand, the immediate operand always follows
any displacement bytes. The immediate operand, if specified, is always the last field of the
instruction. Immediate operands may be bytes, words, or doublewords. In cases where an
8-bit immediate operand is used with a 16- or 32-bit operand, the microprocessor extends
the 8-bit operand to an integer of the same sign and magnitude in the larger size. In the
same way, a 16-bit operand is extended to 32 bits.
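Extension "to an integer of the same sign and magnitude" is ordinary sign extension. The following Python sketch is illustrative; the function name sign_extend is hypothetical.

```python
def sign_extend(value, from_bits, to_bits):
    """Sign-extend a from_bits-wide value to to_bits, as an unsigned pattern."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):  # sign bit set: the value is negative
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)


if __name__ == "__main__":
    print(hex(sign_extend(0xFE, 8, 32)))  # the 8-bit immediate -2
    print(hex(sign_extend(0x7F, 8, 16)))  # the 8-bit immediate +127
```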
A.2.7
Operand Selection
An instruction acts on zero or more operands. An example of a zero-operand instruction is
the NOP instruction (no operation). An operand can be held in any of these places:
■ In the instruction itself (an immediate operand)
■ In a register (in the case of 32-bit operands, EAX, EBX, ECX, EDX, ESI, EDI, ESP, or
EBP; in the case of 16-bit operands AX, BX, CX, DX, SI, DI, SP, or BP; in the case of
8-bit operands AH, AL, BH, BL, CH, CL, DH, or DL; the segment registers; or the
EFLAGS register for flag operations). When the default operand size is 32 bits, use of
16-bit register operands requires the operand-size prefix (a byte with the value 66h
preceding the instruction).
■ In memory
■ At an I/O port
Access to operands is very fast. Register and immediate operands are available on-chip
(the latter because they are prefetched as part of interpreting the instruction). Memory
operands residing in the on-chip cache can be accessed just as fast.
Of the instructions that have operands, some specify operands implicitly; others specify
operands explicitly; still others use a combination of both. For example:
■ Implicit operand: AAM
— By definition, AAM (ASCII adjust for multiplication) operates on the contents of the
AX register
■ Explicit operand: XCHG EAX, EBX
— The operands to be exchanged are encoded in the instruction with the opcode
■ Implicit and explicit operands: PUSH COUNTER
— The memory variable COUNTER (the explicit operand) is copied to the top of the
stack (the implicit operand)
Note: Most instructions have implicit operands. All arithmetic instructions, for example,
update the EFLAGS register.
An instruction can explicitly reference one or two operands. Two-operand instructions, such
as MOV, ADD, and XOR, generally overwrite one of the two participating operands with
the result. This is the difference between the source operand (the one unaffected by the
operation) and the destination operand (the one overwritten by the result).
For most instructions, one of the two explicitly specified operands—either the source or the
destination—can be either in a register or in memory. The other operand must be in a
register or it must be an immediate source operand. This puts the explicit two-operand
instructions into the following groups:
■ Register to register
■ Register to memory
■ Memory to register
■ Immediate to register
■ Immediate to memory
Certain string instructions and stack manipulation instructions, however, transfer data from
memory to memory. Both operands of some string instructions are in memory and are
specified implicitly. Push and pop stack operations allow transfer between memory operands and the memory-based stack. Several three-operand instructions are provided, such
as the IMUL, SHRD, and SHLD instructions. Two of the three operands are specified
explicitly, as for the two-operand instructions, while a third is taken from the CL register
or supplied as an immediate value. Other three-operand instructions, such as the string
instructions when used with a repeat prefix, take all their operands from registers.
For programs running in Protected Mode, the D bit in executable-segment descriptors
specifies the default attribute for both address size and operand size. These default attributes apply to the execution of all instructions in the segment. A clear D bit sets the default
address size and operand size to 16 bits; a set D bit, to 32 bits. Programs that execute in
Real Mode or Virtual 8086 Mode have 16-bit addresses and operands by default.
A.2.7.1
Immediate Operands
Certain instructions use data from the instruction itself as one (and sometimes two) of the
operands. Such an operand is called an immediate operand. It may be a byte, word, or
doubleword. For example:
SHR PATTERN, 2
One byte of the instruction holds the value 2, the number of bits by which to shift the variable
PATTERN.
TEST PATTERN, 0FFFF00FFh
A doubleword of the instruction holds the mask that is used to test the variable PATTERN.
IMUL CX, MEMWORD, 3
A word in memory is multiplied by an immediate 3 and stored into the CX register.
All arithmetic instructions (except divide) allow the source operand to be an immediate
value. When the destination is the EAX or AL register, the instruction encoding is one byte
shorter than with the other general registers.
A.2.7.2
Register Operands
Operands may be located in one of the 32-bit general registers (EAX, EBX, ECX, EDX,
ESI, EDI, ESP, or EBP), in one of the 16-bit general registers (AX, BX, CX, DX, SI, DI, SP,
or BP), or in one of the 8-bit general registers (AH, BH, CH, DH, AL, BL, CL, or DL). The
Am486 microprocessor has instructions for referencing the segment registers (CS, DS, ES,
SS, FS, and GS). These instructions are used by application programs only if system
designers have chosen a segmented memory model. The Am486 microprocessor also has
instructions for changing the state of individual flags in the EFLAGS register. Instructions
have been provided for setting and clearing flags that often need to be accessed. The other
flags, which are not accessed so often, can be changed by pushing the contents of the
EFLAGS register on the stack, making changes to it while it’s on the stack, and popping it
back into the register.
A.2.7.3
Memory Operands
Instructions with explicit operands in memory must reference the segment containing the
operand and the offset from the beginning of the segment to the operand. Segments are
specified using a segment-override prefix, which is a byte placed at the beginning of an
instruction. If no segment is specified, simple rules assign the segment by default. The
offset is specified in one of the following ways:
■ Most instructions that access memory contain a byte for specifying the addressing method of the operand. The byte, called the mod r/m byte, comes after the opcode and
specifies whether the operand is in a register or in memory. If the operand is in memory,
the address is calculated from a segment register and any of the following values: a
base register, an index register, a scaling factor, and a displacement. When an index
register is used, the mod r/m byte is also followed by another byte (the s-i-b byte) that
specifies the index register and scaling factor. This form of addressing is the most flexible.
■ A few instructions use implied address modes:
A MOV instruction with the AL or EAX register as either source or destination can address
memory with a doubleword encoded in the instruction. This special form of the MOV
instruction allows no base register, index register, or scaling factor to be used. This form
is one byte shorter than the general-purpose form.
String operations address memory in the DS segment using the ESI register (the MOVS,
CMPS, OUTS, and LODS instructions) or in the ES segment using the EDI register (the
MOVS, CMPS, INS, SCAS, and STOS instructions).
Stack operations address memory in the SS segment using the ESP register (the PUSH,
POP, PUSHA, PUSHAD, POPA, POPAD, PUSHF, PUSHFD, POPF, POPFD, CALL,
LEAVE, RET, IRET, and IRETD instructions, exceptions, and interrupts).
A.2.7.3.1
Segment Selection
Explicit specification of a segment is optional. If a segment is not specified by a segment-override prefix, the microprocessor automatically chooses a segment according to the rules
of Table A-15. (If a flat model of memory organization is used, the rules for selecting
segments are not apparent to application programs.) Different kinds of memory access
have different default segments. Data operands usually use the main data segment (the
DS segment). However, the ESP and EBP registers are used for addressing the stack, so
when either register is used, the stack segment (the SS segment) is selected.
Table A-15
Default Segment Selection Rules
Type of Reference     Segment Used      Register Used   Default Selection Rule
Instructions          Code Segment      CS Register     Automatic with instruction fetch
Stack                 Stack Segment     SS Register     All stack PUSHes and POPs. Any memory reference
                                                        that uses ESP or EBP as a base register.
Local Data            Data Segment      DS Register     All data references except when relative to stack or
                                                        string destination
Destination Strings   E-Space Segment   ES Register     Destination of string instructions
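The default rules in Table A-15 can be summarized as a small lookup. The following Python sketch is illustrative only: the function name and the reference-kind labels ("fetch", "stack", "string_dest", "data") are hypothetical and do not correspond to any processor interface.

```python
# Hypothetical model of the default segment rules in Table A-15.
def default_segment(kind, base=None):
    """Return the default segment register name for a memory reference.

    kind: "fetch", "stack", "string_dest", or "data" (illustrative labels).
    base: name of the base register for "data" references, or None.
    """
    if kind == "fetch":
        return "CS"                      # instruction fetches
    if kind == "stack":
        return "SS"                      # PUSH and POP operations
    if kind == "string_dest":
        return "ES"                      # destination of string instructions
    if kind == "data":
        # ESP or EBP as the base register selects the stack segment.
        return "SS" if base in ("ESP", "EBP") else "DS"
    raise ValueError(f"unknown reference kind: {kind}")


print(default_segment("data", base="EBX"))   # DS
print(default_segment("data", base="EBP"))   # SS
```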
Segment-override prefixes are provided for each of the segment registers. Only the following
special cases have a default segment selection that is not affected by a segment-override prefix:
• Destination strings in string instructions use the ES segment
• Destination of a push or source of a pop uses the SS segment
• Instruction fetches use the CS segment
A.2.7.3.2
Effective-Address Computation
The mod r/m byte provides the most flexible form of addressing. Instructions that have a
mod r/m byte after the opcode are the most common in the instruction set. For memory
operands specified by a mod r/m byte, the offset within the selected segment is the sum
of three components:
• Displacement
• Base register
• Index register (the index register may be multiplied by a factor of 2, 4, or 8)
The offset that results from adding these components is called an effective address. Each
of these components may have either a positive or negative value. Figure A-48 illustrates
the full set of possibilities for mod r/m addressing.
Figure A-48
Effective Address Computation
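As an illustration of the sum described above, the following Python sketch computes an effective address from its components. It assumes a 32-bit address size, so the result wraps modulo 2^32; the function name and parameters are hypothetical, and segment translation is not modeled.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit address size


def effective_address(base=0, index=0, scale=1, disp=0):
    """Return base + index*scale + disp as a 32-bit offset within the segment."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    # Components may be negative; the sum is truncated to 32 bits.
    return (base + index * scale + disp) & MASK32


# Element 5 of a doubleword array that starts at offset 1000h.
print(hex(effective_address(index=5, scale=4, disp=0x1000)))   # 0x1014
# A local variable 8 bytes below a frame pointer value of 2000h.
print(hex(effective_address(base=0x2000, disp=-8)))            # 0x1ff8
```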
The displacement component, because it is encoded in the instruction, is useful for relative
addressing by fixed amounts, such as:
• Location of simple scalar operands
• Beginning of a statically allocated array
• Offset to a field within a record
The base and index components have similar functions. Both use the same set of general
registers. Both can be used for addressing that changes during program execution, such as:
• Location of procedure parameters and local variables on the stack
• The beginning of one record among several occurrences of the same record type or in
an array of records
• The beginning of one dimension of a multidimensional array
• The beginning of a dynamically allocated array
The uses of general registers as base or index components differ in the following respects:
• The ESP register cannot be used as an index register.
• When the ESP or EBP register is used as the base, the SS segment is the default
selection. In all other cases, the DS segment is the default selection.
• The scaling factor permits efficient indexing into an array when the array elements are
2, 4, or 8 bytes. The scaling of the index register is done in hardware at the time the
address is evaluated. This eliminates an extra shift or multiply instruction.
The base, index, and displacement components may be used in any combination; any of
these components may be null. A scale factor can be used only when an index also is used.
Each possible combination is useful for data structures commonly used by programmers
in high-level languages and assembly language. Suggested uses for some combinations
of address components are described below:
• Displacement: indicates the offset of the operand. This form of addressing is used to
access a statically allocated scalar operand. A byte, word, or doubleword displacement
can be used.
• Base: the offset to the operand is specified indirectly in one of the general registers, as
for “based” variables.
• Base + Displacement: a register and a displacement can be used together for two
distinct purposes:
  - Index into a static array when the element size is not 2, 4, or 8 bytes. The displacement
  component encodes the offset of the beginning of the array. The register holds the
  results of a calculation to determine the offset to a specific element within the array.
  - Access a field of a record. The base register holds the address of the beginning of
  the record, while the displacement is an offset to the field.
Note: An important special case of this combination is access to parameters in a procedure
activation record. A procedure activation record is the stack frame created when a
subroutine is entered. In this case, the EBP register is the best choice for the base register,
because it automatically selects the stack segment. This is a compact encoding for this
common function.
• (Index ⋅ Scale) + Displacement: this combination is an efficient way to index into a static
array when the element size is 2, 4, or 8 bytes. The displacement addresses the beginning
of the array, the index register holds the subscript of the desired array element,
and the microprocessor automatically converts the subscript into an index by applying
the scaling factor.
• Base + Index + Displacement: two registers used together that support either a
two-dimensional array (the displacement holds the address of the beginning of the array) or
one of several instances of an array of records (the displacement is an offset to a field
within the record).
• Base + (Index ⋅ Scale) + Displacement: provides efficient indexing of a two-dimensional
array when the elements of the array are 2, 4, or 8 bytes in size.
A.2.8
Interrupts and Exceptions
Interrupts and exceptions are forced transfers of execution to a task or a procedure. The
task or procedure is called a handler. Interrupts occur at random times during the execution
of a program in response to signals from hardware. Exceptions occur when instructions
that provoke exceptions are executed. Usually, the servicing of interrupts and exceptions
is performed in a manner transparent to application programs. Interrupts are used to handle
events external to the microprocessor, such as requests to service peripheral devices.
Exceptions handle conditions detected by the microprocessor in the course of executing
instructions, such as division by 0.
There are two sources for interrupts and two sources for exceptions:
• Interrupts
  - Maskable interrupts: invoked by a signal to the INTR input if not masked by IF
  - Non-maskable interrupts: invoked by a signal to the NMI input
• Exceptions
  - Microprocessor-detected exceptions: faults, traps, and aborts
  - Programmed exceptions: triggered by INTO, INT 3h, INT nh, and BOUND instructions
Application programmers normally are not concerned with handling exceptions or interrupts. The operating system, monitor, or device driver handles them. Certain kinds of exceptions, however, are relevant to application programming, and many operating systems
give application programs the opportunity to service these exceptions. However, the operating system defines the interface between the application program and the exception
mechanism of the Am486 microprocessor.
Table A-16 lists the exceptions and interrupts.
Table A-16
Exceptions and Interrupts
Vector Number   Description
0               Divide Error
1               Debugger Call
2               NMI
3               Breakpoint
4               INTO-detected Overflow
5               BOUND Range Exceeded
6               Invalid Opcode
7               Device Not Available
8               Double Fault
9               Reserved
10              Invalid Task State Segment
11              Segment Not Present
12              Stack Exception
13              General Protection
14              Page Fault
15              Reserved
16              Floating-Point Error
17              Alignment Check
18–31           Reserved
32–255          Maskable Interrupts
• A divide-error exception results when the DIV or IDIV instruction is executed with a zero
denominator or when the quotient is too large for the destination operand.
• A debug exception may be sent back to an application program if it results from the Trap
Flag (TF).
• A breakpoint exception results when an INT3 instruction is executed. This instruction is
used by some debuggers to stop program execution at specific points.
• An overflow exception results when the INTO instruction is executed and the Overflow
Flag (OF) is set.
• A bounds-check exception results when the BOUND instruction is executed with an
array index that falls outside the bounds of the array.
• The device-not-available exception occurs whenever the microprocessor encounters an
escape instruction and either the TS (task switched) or the EM (emulate coprocessor)
bit of the CR0 control register is set.
• An alignment-check exception is generated for unaligned memory operations in user
mode (privilege level 3), provided both AM and AC are set. Memory operations at supervisor
mode (privilege levels 0, 1, and 2), or memory operations that default to supervisor
mode, do not generate this exception.
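The two conditions that raise a divide-error exception can be shown with a small model. The following Python sketch assumes the unsigned doubleword form of DIV, in which EDX:EAX is divided by a 32-bit operand and the quotient must fit in EAX. The function name is hypothetical, and Python exceptions stand in for the processor's vector 0 exception.

```python
def div32(edx, eax, divisor):
    """Model unsigned 32-bit DIV: EDX:EAX / divisor -> (quotient, remainder)."""
    if divisor == 0:
        # First condition: zero denominator.
        raise ZeroDivisionError("divide error (vector 0): zero divisor")
    dividend = (edx << 32) | eax
    quotient, remainder = divmod(dividend, divisor)
    if quotient > 0xFFFFFFFF:
        # Second condition: quotient too large for the destination (EAX).
        raise OverflowError("divide error (vector 0): quotient too large")
    return quotient, remainder


print(div32(0, 100, 7))   # (14, 2)
```

Calling `div32(1, 0, 1)` in this model raises `OverflowError`, because the 64-bit dividend 1_0000_0000h divided by 1 does not fit in 32 bits.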
The INT instruction generates an interrupt whenever it is executed; the microprocessor
treats this interrupt as an exception. Its effects (and the effects of all other exceptions) are
determined by exception handler routines in the application program or the operating
system.
Exceptions caused by segmentation and paging are handled differently than interrupts.
Normally, the contents of the program counter (EIP register) are saved on the stack when
an exception or interrupt is generated. But exceptions resulting from segmentation and
paging restore the contents of some microprocessor registers to the state they held prior
to instruction interpretation. The saved contents of the program counter address the instruction that caused the exception, rather than the instruction after it. This lets the operating
system fix the exception-generating condition and restart the program at the instruction
that generated the exception. This mechanism is completely transparent to the program.
A.2.9
Input/Output
This section explains the input/output architecture of the Am486 microprocessor.
Input/output is accomplished through I/O ports, which are registers connected to peripheral
devices. An I/O port can be an input port, an output port, or a bidirectional port. Some I/O
ports are used for carrying data, such as the transmit and receive registers of a serial
interface. Other I/O ports are used to control peripheral devices, such as the control registers
of a disk controller.
The Am486 microprocessor always synchronizes I/O instruction execution with external
bus activity. All previous instructions are completed before an I/O operation begins. In
particular, all writes held pending in the Am486 CPU write buffers are completed before an
I/O read or write is performed.
The input/output architecture is the programmer’s model of how these ports are accessed.
The discussion of this model includes:
• Methods of addressing I/O ports
• Instructions that perform I/O operations
• The I/O protection mechanism
A.2.9.1
I/O Addressing
The Am486 microprocessor allows I/O ports to be addressed in either of two ways:
• Through a separate I/O address space accessed using I/O instructions
• Through memory-mapped I/O, where I/O ports appear in the address space of physical
memory
The use of a separate I/O address space is supported by special instructions and a hardware
protection mechanism. When memory-mapped I/O is used, the general purpose instruction
set can be used to access I/O ports, and protection is provided using segmentation or
paging. Some system designers may prefer to use the I/O facilities built into the microprocessor, while others may prefer the simplicity of a single physical address space.
If segmentation or paging is used for protection of the I/O address space, the AVL fields in
segment descriptors or page-table entries may be used to mark pages containing I/O as
unrelocatable and unswappable. The AVL fields are provided for this kind of use, where a
system programmer needs to make an extension to the address translation and protection
mechanisms.
Hardware designers use these ways of mapping I/O ports into the address space when
they design the address decoding circuits of a system. I/O ports can be mapped so that
they appear in the I/O address space or the address space of physical memory (or both).
System programmers may need to discuss with hardware designers the kind of I/O addressing they would like to have.
A.2.9.1.1
I/O Address Space
The Am486 microprocessor provides a separate I/O address space, distinct from the address space for physical memory, where I/O ports can be placed. The I/O address space
consists of 2^16 (64K) individually addressable 8-bit ports; any two consecutive 8-bit ports
can be treated as a 16-bit port, and any four consecutive ports can be a 32-bit port. Extra
bus cycles are required if a port crosses the boundary between two doublewords in physical
memory.
The M/IO pin on the Am486 microprocessor indicates when a bus cycle to the I/O address
space occurs. When a separate I/O address space is used, it is the responsibility of the
hardware designer to make use of this signal to select I/O ports rather than memory. In
fact, the use of the separate I/O address space simplifies the hardware design because
these ports can be selected by a single signal; unlike other microprocessors, it is not
necessary to decode a number of upper address lines in order to set up a separate I/O
address space.
A program can specify the address of a port in two ways. With an immediate byte constant,
the program can specify:
• 256 8-bit ports numbered 0–255
• 128 16-bit ports numbered 0, 2, 4, . . . , 252, 254
• 64 32-bit ports numbered 0, 4, 8, . . . , 248, 252
Using a value in the DX register, the program can specify:
• 8-bit ports numbered 0–65535
• 16-bit ports numbered 0, 2, 4, . . . , 65532, 65534
• 32-bit ports numbered 0, 4, 8, . . . , 65528, 65532
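The two ways of naming a port can be sketched as a simple check. In the following Python fragment, the function name and the labels "immediate" and "DX" are illustrative; the sketch reports which forms can reach a given port number and whether the port is naturally aligned for its width.

```python
def port_forms(port, width_bits):
    """Return (forms, aligned) for a port number and a width of 8, 16, or 32 bits."""
    if width_bits not in (8, 16, 32):
        raise ValueError("width must be 8, 16, or 32 bits")
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port numbers range from 0 to 65535")
    forms = ["DX"]                      # DX can name any of the 64K ports
    if port <= 0xFF:
        forms.insert(0, "immediate")    # an immediate byte reaches ports 0-255 only
    aligned = port % (width_bits // 8) == 0
    return forms, aligned


print(port_forms(0x60, 8))     # (['immediate', 'DX'], True)
print(port_forms(0x3F8, 8))    # (['DX'], True)
print(port_forms(0x1F1, 16))   # (['DX'], False)
```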
The Am486 microprocessor can transfer 8, 16, or 32 bits to a device in the I/O space. Like
words in memory, 16-bit ports should be aligned to even addresses so that all 16 bits can
be transferred in a single bus cycle. Like doublewords in memory, 32-bit ports should be
aligned to addresses that are multiples of 4. The microprocessor supports data transfers
to unaligned ports, but there is a performance penalty because an extra bus cycle must be
used.
• The IN and OUT instructions move data between a register and a port in the I/O address
space. The instructions INS and OUTS move strings of data between the memory
address space and ports in the I/O address space.
• I/O port addresses 0F8h through 0FFh are reserved for use by AMD. Do not assign
I/O ports to these addresses.
• The exact order of bus cycles used to access ports that require more than one bus cycle
is undefined. For example, an OUT instruction that loads an unaligned doubleword port
at location 2h accesses the word at 4h before accessing the word at 2h. This behavior
is neither defined, nor guaranteed to remain the same in future AMD products.
• If software needs to produce a particular order of bus cycles, this order must be specified
explicitly. For example, to load a word-length port at 4h followed by loading a word port
at 2h, two word-length instructions must be used, rather than a single doubleword
instruction.
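The points about unaligned ports and explicit ordering can be illustrated as follows. This Python sketch is a simplified model that assumes a 32-bit data bus and ignores dynamic bus sizing. Both helper names are hypothetical. The first helper tests whether a port spans two doublewords and therefore needs more than one bus cycle. The second splits a doubleword into two word-length values so that software can issue them in a chosen order.

```python
def crosses_doubleword(port, width_bits):
    """True if the port's bytes span two doublewords (more than one bus cycle)."""
    nbytes = width_bits // 8
    return port // 4 != (port + nbytes - 1) // 4


def split_doubleword(value, low_port):
    """Map each word-length port to its 16-bit half (low word at the lower port)."""
    return {low_port: value & 0xFFFF, low_port + 2: (value >> 16) & 0xFFFF}


print(crosses_doubleword(2, 32))   # True: bytes 2-5 span two doublewords
print(crosses_doubleword(4, 32))   # False

# Explicit ordering: write the word port at 4h first, then the word port at 2h.
words = split_doubleword(0x12345678, 2)
sequence = [(4, words[4]), (2, words[2])]
print([(hex(p), hex(v)) for p, v in sequence])   # [('0x4', '0x1234'), ('0x2', '0x5678')]
```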
Note: Although the Am486 microprocessor automatically masks parity errors for certain
types of bus cycles, such as interrupt acknowledge cycles, it does not mask parity for bus
cycles to the I/O address space. Programmers may need to be aware of this behavior as
a possible source of spurious parity errors.
A.2.9.1.2
Memory-Mapped I/O
I/O devices may be placed in the address space for physical memory. This is called memory-mapped I/O. As long as the devices respond like memory components, they can be used
with memory-mapped I/O.
Memory-mapped I/O provides additional programming flexibility. Any instruction that references memory may be used to access an I/O port located in the memory space. For
example, the MOV instruction can transfer data between any register and a port. The AND,
OR, and TEST instructions may be used to manipulate bits in the control and status registers
of peripheral devices (see Figure A-49). Memory-mapped I/O can use the full instruction
set and the full complement of addressing modes to address I/O ports.
Figure A-49
Memory Mapped I/O
To optimize performance, the Am486 CPU allows reads to be re-ordered ahead of buffered
writes in certain precisely-defined circumstances. Using memory-mapped I/O on the Am486
CPU therefore creates the possibility that an I/O read will be performed before the memory
write of a previous instruction. To eliminate this possibility, use an I/O instruction for the
read. Using an I/O instruction for an I/O write can also be advantageous because it guarantees that the write will be completed before the next instruction begins execution. If I/O
writes are used to control system hardware, then this sequence of events is desirable, since
it guarantees that the next instruction will be executed in the new state.
• If caching is enabled, either external hardware or the paging mechanism (the PCD bit
in the page table entry) must be used to prevent caching of I/O data.
• Memory-mapped I/O, like any other memory reference, is subject to access protection
and control. See Section A.2.3 for a discussion of memory protection.
A.2.9.2
I/O Instructions
The I/O instructions of the Am486 microprocessor provide access to the microprocessor’s
I/O ports for the transfer of data. These instructions have the address of a port in the I/O
address space as an operand. There are two kinds of I/O instructions:
• Those that transfer a single item (byte, word, or doubleword) to or from a register.
• Those that transfer strings of items (strings of bytes, words, or doublewords) located in
memory. These are known as “string I/O instructions” or “block I/O instructions.”
These instructions cause the M/IO signal to be driven Low (logic 0) during a bus cycle, which
indicates to external hardware that access to the I/O address space is taking place.
If memory-mapped I/O is used, there is no reason to use I/O instructions.
A.2.9.3
Register I/O Instructions
The I/O instructions IN and OUT move data between I/O ports and the EAX register (32-bit I/O), the AX register (16-bit I/O), or the AL (8-bit I/O) register. The IN and OUT instructions
address I/O ports either directly, with the address of one of 256 port addresses coded in
the instruction, or indirectly using an address in the DX register to select one of 64K port
addresses. These instructions synchronize program execution to external hardware. The
Am486 microprocessor write buffers are cleared and program execution delayed until the
last ready of the last bus cycle has been returned.
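The register usage of IN and OUT can be modeled with a toy port space. The following Python sketch is a minimal illustration under stated assumptions: the class and method names are hypothetical, and each port simply stores the last value written, which real devices need not do. The point it shows is that byte and word transfers use only AL or AX and leave the upper bits of EAX unchanged.

```python
class PortSpace:
    """Toy I/O address space: 64K byte-wide ports that store what is written."""

    def __init__(self):
        self.ports = bytearray(0x10000)

    def _check(self, port, width_bits):
        nbytes = width_bits // 8
        if width_bits not in (8, 16, 32) or not 0 <= port <= 0x10000 - nbytes:
            raise ValueError("bad port number or width")
        return nbytes

    def out(self, port, eax, width_bits):
        """OUT: write AL, AX, or EAX (the low bits of eax) to the port."""
        nbytes = self._check(port, width_bits)
        value = eax & ((1 << width_bits) - 1)
        self.ports[port:port + nbytes] = value.to_bytes(nbytes, "little")

    def inp(self, port, eax, width_bits):
        """IN: return the new EAX after loading AL, AX, or EAX from the port."""
        nbytes = self._check(port, width_bits)
        value = int.from_bytes(self.ports[port:port + nbytes], "little")
        mask = (1 << width_bits) - 1
        return (eax & ~mask & 0xFFFFFFFF) | value


io = PortSpace()
io.out(0x300, 0x12345678, 16)              # word OUT writes AX = 5678h
print(hex(io.inp(0x300, 0xAABBCCDD, 8)))   # 0xaabbcc78 (only AL replaced)
print(hex(io.inp(0x300, 0xAABBCCDD, 16)))  # 0xaabb5678 (only AX replaced)
```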
• IN (Input from Port): transfers a byte, word, or doubleword from an input port to the AL,
AX, or EAX registers. A byte IN instruction transfers 8 bits from the selected port to the
AL register. A word IN instruction transfers 16 bits from the port to the AX register. A
doubleword IN instruction transfers 32 bits from the port to the EAX register.
• OUT (Output to Port): transfers a byte, word, or doubleword from the AL, AX, or EAX
registers to an output port. A byte OUT instruction transfers 8 bits from the AL register
to the selected port. A word OUT instruction transfers 16 bits from the AX register to the
port. A doubleword OUT instruction transfers 32 bits from the EAX register to the port.
A.2.9.4
Block I/O Instructions
The INS and OUTS instructions move blocks of data between I/O ports and memory. Block
I/O instructions use an address in the DX register to address a port in the I/O address
space. These instructions use the DX register to specify:
• 8-bit ports numbered 0–65535
• 16-bit ports numbered 0, 2, 4, . . . , 65532, 65534
• 32-bit ports numbered 0, 4, 8, . . . , 65528, 65532
Block I/O instructions use either the ESI or EDI register to address memory. For each transfer,
the ESI or EDI register is incremented or decremented, as specified by DF.
The INS and OUTS instructions, when used with repeat prefixes, perform block input or
output operations. The repeat prefix REP modifies the INS and OUTS instructions to transfer
blocks of data between an I/O port and memory. These block I/O instructions are string
instructions. They simplify programming and increase the speed of data transfer by eliminating the need to use a separate LOOP instruction or an intermediate register to hold the
data. The string I/O instructions operate on byte strings, word strings, or doubleword strings.
After each transfer, the memory address in the ESI or EDI registers is incremented or
decremented by 1 for byte operands, by 2 for word operands, or by 4 for doubleword
operands. DF controls whether the register is incremented (DF is clear) or decremented
(DF is set).
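The repeat behavior described above can be sketched as a loop. The following Python model of REP INS is an illustration only: the function name is hypothetical, `read_port` is an assumed callable that returns the next value from the device, and `memory` is a bytearray standing in for a flat ES segment.

```python
def rep_ins(read_port, memory, edi, ecx, width_bits, df=0):
    """Model REP INSB/INSW/INSD: store ECX elements read from one port at ES:EDI."""
    nbytes = width_bits // 8
    mask = (1 << width_bits) - 1
    step = -nbytes if df else nbytes     # DF clear: increment; DF set: decrement
    while ecx:
        value = read_port() & mask
        memory[edi:edi + nbytes] = value.to_bytes(nbytes, "little")
        edi += step
        ecx -= 1
    return edi, ecx                      # final register values


data = iter([0x1111, 0x2222, 0x3333])
mem = bytearray(16)
print(rep_ins(lambda: next(data), mem, edi=0, ecx=3, width_bits=16))   # (6, 0)
print(mem[:6].hex())                                                   # 111122223333
```

With DF set, the same loop fills memory from higher addresses toward lower ones, so the starting EDI value must point at the last element of the buffer.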
• INS (Input String from Port): transfers a byte, word, or doubleword string element from
an input port to memory. The INSB instruction transfers a byte from the selected port to
the memory location addressed by the ES and EDI registers. The INSW instruction
transfers a word. The INSD instruction transfers a doubleword. A segment-override
prefix cannot be used to specify an alternate destination segment. Combined with a
REP prefix, an INS instruction makes repeated read cycles to the port, and puts the
data into consecutive locations in memory.
• OUTS (Output String to Port): transfers a byte, word, or doubleword string element
from memory to an output port. The OUTSB instruction transfers a byte from the memory
location addressed by the DS and ESI registers to the selected port. The OUTSW
instruction transfers a word. The OUTSD instruction transfers a doubleword. A
segment-override prefix can be used to specify an alternate source segment. Combined with
a REP prefix, an OUTS instruction reads consecutive locations in memory and writes
the data to an output port.
A.2.9.5
Protection and I/O
The I/O architecture has two protection mechanisms:
• The IOPL field in the EFLAGS register controls access to the I/O instructions.
• The I/O permission bit map of a TSS segment controls access to individual ports in the
I/O address space.
These protection mechanisms are available only when a separate I/O address space is
used. When memory-mapped I/O is used, protection is provided using segmentation or
paging.
A.2.9.5.1
I/O Privilege Level
In systems that use I/O protection, the IOPL field in the EFLAGS register controls access
to I/O instructions. This permits the operating system to adjust the privilege level needed
to perform I/O operations. In a typical protection ring model, privilege levels 0 and 1 have
access to the I/O instructions. This lets the operating system and the device drivers perform
I/O, but keeps applications and less privileged device drivers from accessing the I/O address
space. Applications access I/O through the operating system. The following instructions
can be executed only if CPL ≤ IOPL:
IN      Input
INS     Input String
OUT     Output
OUTS    Output String
CLI     Clear Interrupt-Enable Flag
STI     Set Interrupt-Enable Flag
These instructions are called “sensitive” instructions, because they are sensitive to the
IOPL field. In Virtual-8086 Mode, IOPL is not used; only the I/O permission bit map limits
access to I/O ports.
To use sensitive instructions, a procedure must run at a privilege level at least as privileged
as that specified by the IOPL field. Any attempt by a less privileged procedure to use a
sensitive instruction results in a general-protection exception. Because each task has its
own copy of the EFLAGS register, each task can have a different IOPL.
A task can change IOPL only with the POPF instruction; however, such changes are privileged. No procedure may change its IOPL unless it is running at privilege level 0. An attempt
by a less privileged procedure to change the IOPL does not result in an exception; the IOPL
simply remains unchanged.
The POPF instruction also may be used to change the state of IF (as can the CLI and STI
instructions); however, changes to IF using the POPF instruction are IOPL-sensitive. A
procedure may change the setting of IF with a POPF instruction only if it runs with a CPL
at least as privileged as the IOPL. An attempt by a less privileged procedure to change IF
does not result in an exception; IF simply remains unchanged.
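The IOPL rules above can be summarized in a short model. The following Python sketch assumes the standard EFLAGS layout, with IF at bit 9 and IOPL at bits 12 and 13. The function names are hypothetical, and the model ignores all other flags and modes. It shows that POPF silently preserves IOPL unless CPL is 0, and preserves IF unless CPL ≤ IOPL.

```python
IF_BIT = 1 << 9               # assumption: standard EFLAGS bit positions
IOPL_SHIFT = 12
IOPL_MASK = 0b11 << IOPL_SHIFT


def iopl(eflags):
    return (eflags & IOPL_MASK) >> IOPL_SHIFT


def sensitive_allowed(cpl, eflags):
    """IN, INS, OUT, OUTS, CLI, and STI execute only if CPL <= IOPL."""
    return cpl <= iopl(eflags)


def popf(eflags, popped, cpl):
    """Return the new EFLAGS after POPF; protected fields keep their old values."""
    protected = 0
    if cpl != 0:
        protected |= IOPL_MASK          # only privilege level 0 may change IOPL
    if cpl > iopl(eflags):
        protected |= IF_BIT             # changing IF requires CPL <= IOPL
    return (eflags & protected) | (popped & ~protected)


current = (1 << IOPL_SHIFT) | IF_BIT    # IOPL = 1, IF set
print(hex(popf(current, 0x3000, cpl=3)))   # 0x1200: IOPL and IF unchanged
print(hex(popf(current, 0x3000, cpl=1)))   # 0x1000: IF cleared, IOPL unchanged
print(hex(popf(current, 0x3000, cpl=0)))   # 0x3000: both changed
```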
A.2.9.5.2
I/O Permission Bit Map
The Am486 microprocessor can generate exceptions for references to specific I/O addresses. These addresses are specified in the I/O permission bit map in the TSS (see Figure A-50). The size of the map and its location in the TSS are variable. The microprocessor finds
the I/O permission bit map with the I/O map base address in the TSS. The base address
is a 16-bit offset into the TSS. This is an offset to the beginning of the bit map. The limit of
the TSS is the limit on the size of the I/O permission bit map.
Figure A-50
I/O Permission Bit Map
Because each task has its own TSS, each task has its own I/O permission bit map. Access
to individual I/O ports can be granted to individual tasks.
If CPL is less than or equal to IOPL in Protected Mode, then the microprocessor allows
I/O operations to proceed. If CPL is greater than IOPL, or if the microprocessor is operating
in Virtual 8086 Mode, then the microprocessor checks the I/O permission map. Each bit in
the map corresponds to an I/O port byte address; for example, the control bit for address
41 (decimal) in the I/O address space is found at bit position 1 of the sixth byte in the bit
map. The microprocessor tests all the bits corresponding to the I/O port being addressed;
for example, a doubleword operation tests four bits corresponding to four adjacent byte
addresses. If any tested bit is set, a general-protection exception is generated. If all tested
bits are clear, the I/O operation proceeds.
Because I/O ports that are not aligned to word and doubleword boundaries are permitted,
it is possible that the microprocessor may need to access two bytes in the bit map when
I/O permission is checked. For maximum speed, the microprocessor has been designed
to read two bytes for every access to an I/O port. To prevent exceptions from being generated when the ports with the highest addresses are accessed, an extra byte needs to come
after the table. This byte must have all of its bits set, and it must be within the segment limit.
It is not necessary for the I/O permission bit map to represent all the I/O addresses. I/O
addresses not spanned by the map are treated as if they had set bits in the map. For
example, if the TSS segment limit is 10 bytes past the bit map base address, the map has
11 bytes and the first 80 I/O ports are mapped. Higher addresses in the I/O address space
generate exceptions.
If the I/O bit map base address is greater than or equal to the TSS segment limit, there is
no I/O permission map, and all I/O instructions generate exceptions. The base address
must be less than or equal to 0DFFFh.
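The permission check described in this section can be sketched as follows. In this simplified Python model, the function name and parameters are hypothetical: `tss` is a bytes-like image of the TSS, `tss_limit` is the offset of its last valid byte, and `io_map_base` is passed in directly rather than read from the TSS. The model reads two map bytes for every access, as the text describes, and treats any port whose bytes fall outside the limit as denied.

```python
def io_permitted(tss, tss_limit, io_map_base, port, width_bits,
                 cpl, iopl, v86=False):
    """Return True if the I/O access proceeds, False for a general-protection fault."""
    if not v86 and cpl <= iopl:
        return True                         # IOPL alone allows the access
    first = io_map_base + port // 8
    if first + 1 > tss_limit:               # both bytes must lie within the TSS limit
        return False
    bits = tss[first] | (tss[first + 1] << 8)
    nports = width_bits // 8                # one bit per byte address touched
    mask = ((1 << nports) - 1) << (port % 8)
    return bits & mask == 0                 # any set bit denies the access


base = 0x68                                 # illustrative map offset
limit = base + 10                           # 11 map bytes: 80 ports plus terminator
tss = bytearray(limit + 1)
tss[limit] = 0xFF                           # terminating byte, all bits set
tss[base + 5] |= 1 << 1                     # deny port 41: bit 1 of the sixth byte

print(io_permitted(tss, limit, base, 41, 8, cpl=3, iopl=0))   # False
print(io_permitted(tss, limit, base, 79, 8, cpl=3, iopl=0))   # True
print(io_permitted(tss, limit, base, 80, 8, cpl=3, iopl=0))   # False: not mapped
```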
A.3
DEBUGGING
The Am486 microprocessor has advanced debugging facilities that are particularly important for sophisticated software systems, such as multitasking operating systems. The failure
conditions for these software systems can be very complex and time-dependent. The debugging features of the Am486 microprocessor give the system programmer valuable tools
for looking at the dynamic state of the microprocessor.
The debugging support is accessed through the debug registers. They hold the addresses
of memory locations, called breakpoints, that invoke debugging software. An exception is
generated when a memory operation is made to one of these addresses. A breakpoint is
specified for a particular form of memory access, such as an instruction fetch or a doubleword write operation. The debug registers support both instruction breakpoints and data
breakpoints.
With other microprocessors, instruction breakpoints are set by replacing normal instructions
with breakpoint instructions. When the breakpoint instruction is executed, the debugger is
called. But with the debug registers of the Am486 microprocessor, this is not necessary.
By eliminating the need to write into the code space, the debugging process is simplified
(there is no need to set up a data segment mapped to the same memory as the code
segment) and breakpoints can be set in ROM-based software. In addition, breakpoints can
be set on reads and writes to data, which allows real-time monitoring of variables.
A.3.1
Debugging Support
The features of the architecture that support debugging are:
• Reserved Debug Interrupt Vector: specifies a procedure or task to call when an event
for the debugger occurs
• Debug Address Registers: specify the addresses of up to four breakpoints
• Debug Control Register: specifies the forms of memory access for the breakpoints
• Debug Status Register: reports conditions in effect at the time of the exception
• Trap Bit of TSS (T-bit): generates a debug exception when an attempt is made to
perform a task switch to a task with this bit set in its TSS
• Resume Flag (RF): suppresses multiple exceptions to the same instruction
• Trap Flag (TF): generates a debug exception after every execution of an instruction
• Breakpoint Instruction: calls the debugger (generates a debug exception). This instruction
is an alternative way to set code breakpoints. It is especially useful when more than
four breakpoints are desired, or when breakpoints are placed in the source code.
• Reserved Interrupt Vector for Breakpoint Exception: calls a procedure or task when a
breakpoint instruction is executed
These features allow a debugger to be called either as a separate task or as a procedure
in the context of the current task. The following conditions are used to call the debugger:
• Task switch to a specific task
• Execution of the breakpoint instruction
• Execution of any instruction
• Execution of an instruction at a specified address
• Read or write of a byte, word, or doubleword at a specified address
• Write to a byte, word, or doubleword at a specified address
• Attempt to change the contents of a debug register
A.3.2
Debug Registers
Six registers control debugging. The registers are accessed by a MOV instruction. A debug
register can be the source or destination operand for the instruction. Debug registers are
privileged resources; MOV instructions that access them can execute only at privilege level
0. An attempt to read or write the debug registers from any other privilege level generates
a general-protection exception. Figure A-51 shows the debug register format.
A.3.2.1
Debug Address Registers (DR3–DR0)
Each of these registers holds the linear address for one of the four breakpoints. That is,
breakpoint comparisons are made before physical address translation occurs. Each breakpoint condition is specified further by the contents of the DR7 register.
A.3.2.2
Debug Control Register (DR7)
The debug control register shown in Figure A-51 specifies the sort of memory access
associated with each breakpoint. Each address in registers DR3–DR0 corresponds to a
field R/W3–R/W0 in the DR7 register. The microprocessor interprets these bits as follows:
• 00: Break on instruction execution only
• 01: Break on data writes only
• 10: Undefined
• 11: Break on data reads or writes but not instruction fetches
Figure A-51
Debug Registers
The LEN3–LEN0 fields in the DR7 register specify the size of the breakpoint. The length
fields are interpreted as follows:
■ 00: One-byte length
■ 01: Two-byte length
■ 10: Undefined
■ 11: Four-byte length
Note: If R/Wn is 00 (instruction execution), then LENn should also be 00. The effect of using
any other length is undefined.
The GD bit enables the debug register protection condition that is flagged by BD of DR6.
Note that GD is cleared at entry to the debug exception handler by the microprocessor.
This allows the handler free access to the debug registers.
The low 8 bits of the DR7 register (fields L3–L0 and G3–G0) individually enable the four
address breakpoint conditions. There are two levels of enabling: the local (L3–L0) and
global (G3–G0) levels. The local enable bits are automatically cleared by the microprocessor on every task switch to avoid unwanted breakpoint conditions in the new task. They
are used to set breakpoint conditions in a single task. The global enable bits are not cleared
by a task switch. They are used to enable breakpoint conditions that apply to all tasks.
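The field encodings above can be combined into a DR7 image as in the following minimal C sketch. It assumes the conventional DR7 bit layout depicted in Figure A-51 (Ln at bit 2n, Gn at bit 2n+1, R/Wn at bits 16+4n and 17+4n, LENn at bits 18+4n and 19+4n). The function and constant names are illustrative only, and loading the result into DR7 still requires a MOV executed at privilege level 0.

```c
#include <stdint.h>

/* R/Wn encodings from the text (10 is undefined). */
enum { BP_EXEC = 0x0, BP_WRITE = 0x1, BP_READ_WRITE = 0x3 };
/* LENn encodings from the text (10 is undefined). */
enum { BP_LEN_1 = 0x0, BP_LEN_2 = 0x1, BP_LEN_4 = 0x3 };

/* Return a copy of dr7 with breakpoint n (0..3) reprogrammed.
 * Assumed layout: Ln = bit 2n, Gn = bit 2n+1,
 * R/Wn = bits 16+4n..17+4n, LENn = bits 18+4n..19+4n. */
static uint32_t dr7_set_breakpoint(uint32_t dr7, unsigned n, unsigned rw,
                                   unsigned len, int local, int global)
{
    unsigned enable_shift = (n & 3u) * 2u;
    unsigned ctrl_shift = 16u + (n & 3u) * 4u;

    dr7 &= ~((uint32_t)0x3u << enable_shift);   /* clear Ln and Gn      */
    dr7 &= ~((uint32_t)0xFu << ctrl_shift);     /* clear R/Wn and LENn  */

    if (local)
        dr7 |= (uint32_t)1u << enable_shift;
    if (global)
        dr7 |= (uint32_t)2u << enable_shift;
    dr7 |= (uint32_t)(rw & 3u) << ctrl_shift;
    dr7 |= (uint32_t)(len & 3u) << (ctrl_shift + 2u);
    return dr7;
}
```

For example, a locally enabled 4-byte write breakpoint in slot 0 yields 000D0001h under these assumptions.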
The Am486 microprocessor always uses exact data breakpoint matching in debugging.
That is, if any of the Ln/Gn bits are set, the microprocessor slows execution so that data
breakpoints are reported for the instruction that triggers the breakpoint, rather than the next
instruction. In such a case, one-clock instructions that access memory take two clocks to
execute.
In the Am386 microprocessor, exact data breakpoint matching does not occur unless it is
enabled by setting either the LE or the GE bit. The Am486 microprocessor ignores these
bits.
A.3.2.3
Debug Status Register (DR6)
The debug status register shown in Figure A-51 reports conditions sampled at the time the
debug exception was generated. Among other information, it reports which breakpoint
triggered the exception. The register is updated only when the exception is taken; at that
point, all of its bits are updated.
When an enabled breakpoint generates a debug exception, the microprocessor loads the low four bits of
this register (B0–B3) before entering the debug exception handler. The B bit is set if the
condition described by the DR, LEN, and R/W bits is true, even if the breakpoint is not
enabled by the L and G bits. The microprocessor sets the B bits for all breakpoints that
match the conditions present at the time the debug exception is generated, whether or not
they are enabled.
The BT bit is associated with the T bit (debug trap bit) of the TSS. The microprocessor sets
the BT bit before entering the debug handler if a task switch has occurred to a task with a
set T bit in its TSS. There is no bit in the DR7 register to enable or disable this exception;
the T bit of the TSS is the only enabling bit.
The BS bit is associated with TF. The BS bit is set if the debug exception is triggered by
the single-step execution mode (TF set). The single-step mode is the highest-priority debug
exception; when the BS bit is set, any of the other debug status bits may also be set.
The BD bit is set if the next instruction reads or writes one of the eight debug registers while
it is being used by in-circuit emulation.
Note: The contents of the DR6 register are never cleared by the microprocessor. To avoid
any confusion in identifying debug exceptions, the debug handler should clear the register
before returning.
A.3.2.4
Breakpoint Field Recognition
The address and LEN bits for each of the four breakpoint conditions define a range of
sequential byte addresses for a data breakpoint. The LEN bits permit specification of a
1-, 2-, or 4-byte range. Align 2-byte ranges on word boundaries (addresses that are multiples
of 2) and 4-byte ranges on doubleword boundaries (addresses that are multiples of 4).
These requirements are enforced by the microprocessor; it uses the LEN bits to mask the
lower address bits in the debug registers. Unaligned code or data breakpoint addresses
do not yield the expected results.
A data breakpoint for reading or writing is triggered if any of the bytes participating in a
memory access is within the range defined by a breakpoint address register and its LEN
bits. A data breakpoint for an unaligned operand can be made from two entry sets in the
breakpoint registers where each entry is byte-aligned, and the two entries cover the operand. This breakpoint generates exceptions for the operand, not for any neighboring bytes.
Instruction breakpoint addresses must have a 1-byte length specification (LEN = 00); the
behavior of code breakpoints for other operand sizes is undefined.
Table A-17
Breakpoint Examples
Comment                             Address   Length in bytes
DR0 Contents                        A0001h    1 (LEN0 = 00)
DR1 Contents                        A0002h    1 (LEN1 = 00)
DR2 Contents                        B0002h    2 (LEN2 = 01)
DR3 Contents                        C0000h    4 (LEN3 = 11)
Memory Operations That Trap         A0001h    1
                                    A0002h    1
                                    A0001h    2
                                    A0002h    2
                                    B0002h    2
                                    B0001h    4
                                    C0000h    4
                                    C0001h    2
                                    C0003h    1
Memory Operations That Do Not Trap  A0000h    1
                                    A0003h    4
                                    B0000h    2
                                    C0004h    4
Table A-17 gives some examples of combinations of addresses and fields with memory
references that do and do not cause traps. The processor recognizes an instruction breakpoint address only when it points to the first byte of an instruction. If the instruction has any
prefixes, the breakpoint address must point to the first prefix.
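The data-breakpoint range rule can be modeled in software as in the following sketch. It is an illustration of the matching rule described in this section, not a description of the hardware implementation; the function names are hypothetical, and address wraparound at the top of the 4-Gbyte space is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode a 2-bit LEN field into a byte count; 0 marks the undefined value 10. */
static unsigned len_field_to_bytes(unsigned len_field)
{
    switch (len_field & 3u) {
    case 0:  return 1;
    case 1:  return 2;
    case 3:  return 4;
    default: return 0;
    }
}

/* True if a memory access of `size` bytes starting at `addr` touches any byte
 * of the range defined by a debug address register value and its LEN field. */
static bool data_bp_matches(uint32_t dr, unsigned len_field,
                            uint32_t addr, unsigned size)
{
    unsigned bytes = len_field_to_bytes(len_field);
    if (bytes == 0 || size == 0)
        return false;

    /* The LEN bits mask the low address bits, forcing an aligned range. */
    uint32_t lo = dr & ~(uint32_t)(bytes - 1u);
    uint32_t hi = lo + bytes - 1u;
    uint32_t last = addr + size - 1u;

    return addr <= hi && last >= lo;
}
```

Checked against Table A-17, a 4-byte access at B0001h matches the 2-byte breakpoint at B0002h, while a 4-byte access at A0003h matches neither 1-byte breakpoint at A0001h or A0002h.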
A.3.3
Debug Exceptions
Two of the interrupt vectors of the Am486 microprocessor are reserved for debug exceptions. The debug exception is the usual way to invoke debuggers designed for the Am486
microprocessor. The breakpoint exception is intended to put breakpoints in debuggers.
A.3.3.1
Interrupt 1—Debug Exceptions
The handler for this exception usually is a debugger or part of a debugging system. The
microprocessor generates a debug exception for any of several conditions. The debugger
can check flags in the DR6 and DR7 registers to determine which condition caused the
exception and which other conditions also might apply. Table A-18 shows the states of
these bits for each kind of breakpoint condition.
Table A-18
Debug Exception Conditions
Flags Tested                      Description
BS = 1                            Single-step trap
B0 = 1 and (G0 = 1 or L0 = 1)     Breakpoint defined by DR0, LEN0, and R/W0
B1 = 1 and (G1 = 1 or L1 = 1)     Breakpoint defined by DR1, LEN1, and R/W1
B2 = 1 and (G2 = 1 or L2 = 1)     Breakpoint defined by DR2, LEN2, and R/W2
B3 = 1 and (G3 = 1 or L3 = 1)     Breakpoint defined by DR3, LEN3, and R/W3
BD = 1                            Debug registers in use for in-circuit emulation
BT = 1                            Task switch
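A debug exception handler might apply the tests in Table A-18 as sketched below. The sketch assumes the conventional register layout (B0–B3 in DR6 bits 0 to 3, BD in bit 13, BS in bit 14, BT in bit 15; Ln and Gn in DR7 bits 2n and 2n+1). The cause constants and the function name are hypothetical, and reading DR6 and DR7 themselves requires privileged MOV instructions that are not shown.

```c
#include <stdint.h>

/* Hypothetical cause flags returned to the caller. */
enum {
    DBG_BP0            = 1u << 0,
    DBG_BP1            = 1u << 1,
    DBG_BP2            = 1u << 2,
    DBG_BP3            = 1u << 3,
    DBG_GENERAL_DETECT = 1u << 4,
    DBG_SINGLE_STEP    = 1u << 5,
    DBG_TASK_SWITCH    = 1u << 6
};

/* Apply the flag tests of Table A-18 to saved DR6 and DR7 images. */
static unsigned debug_causes(uint32_t dr6, uint32_t dr7)
{
    unsigned causes = 0;

    for (unsigned n = 0; n < 4; n++) {
        unsigned hit = (dr6 >> n) & 1u;            /* Bn               */
        unsigned enabled = (dr7 >> (2u * n)) & 3u; /* Ln or Gn set     */
        if (hit && enabled)
            causes |= 1u << n;
    }
    if (dr6 & (1u << 13)) causes |= DBG_GENERAL_DETECT; /* BD */
    if (dr6 & (1u << 14)) causes |= DBG_SINGLE_STEP;    /* BS */
    if (dr6 & (1u << 15)) causes |= DBG_TASK_SWITCH;    /* BT */
    return causes;
}
```

Because the B bits are set for matching conditions even when a breakpoint is not enabled, the check against the enable bits in DR7 is what separates a real breakpoint hit from a coincidental match.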
Instruction breakpoints are faults; other debug exceptions are traps. The debug exception
may report either or both at one time. The following sections present details for each class
of debug exception.
A.3.3.1.1
Instruction-Breakpoint Fault
The microprocessor reports an instruction breakpoint before it executes the breakpointed
instruction (i.e., a debug exception caused by an instruction breakpoint is a fault).
The Resume Flag (RF) permits the debug exception handler to restart instructions that
cause faults other than debug faults. When a debug fault occurs, the system software writer
must set the RF bit in the copy of the EFLAGS register that is pushed on the stack in the
debug exception handler routine. This bit is set in preparation of resuming the program’s
execution at the breakpoint address without generating another breakpoint fault on the
same instruction.
Note: RF does not cause breakpoint traps nor other kinds of faults to be ignored.
The microprocessor clears RF at the successful completion of every instruction except after
the IRET instruction, the POPF instruction, POPFD instruction, and JMP, CALL or INT
instructions that cause a task switch. These instructions set RF to the value specified by
the saved copy of the EFLAGS register.
The microprocessor sets RF in the copy of the EFLAGS register pushed on the stack before
entry into any fault handler. When the fault handler is entered for instruction breakpoints,
for example, RF is set in the copy of the EFLAGS register pushed on the stack; therefore,
the IRET instruction that returns control from the exception handler sets RF in the EFLAGS
register and execution resumes at the breakpointed instruction without generating another
breakpoint for the same instruction.
After a debug fault, when RF is set and the debug handler retries the faulting instruction, it is
possible that retrying the instruction will generate other faults. The restart of the instruction
after these faults also occurs with RF set, so repeated debug faults continue to be suppressed. The microprocessor clears RF only after successful completion of the instruction.
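The RF handling described above amounts to editing the EFLAGS image saved on the handler's stack before executing IRET. The following sketch assumes the standard EFLAGS bit positions (TF at bit 8, RF at bit 16); the helper names are illustrative, and locating the saved image on the stack depends on the handler's frame layout, which is not shown.

```c
#include <stdint.h>

#define EFLAGS_TF (1u << 8)   /* trap flag (single-step)        */
#define EFLAGS_RF (1u << 16)  /* resume flag (assumed position) */

/* Image to return with after an instruction-breakpoint fault: RF set so the
 * breakpointed instruction can execute once without faulting again. */
static uint32_t eflags_resume_after_fault(uint32_t saved_eflags)
{
    return saved_eflags | EFLAGS_RF;
}

/* Image to return with when the debugger wants to single-step (or stop
 * single-stepping) the interrupted program. */
static uint32_t eflags_set_single_step(uint32_t saved_eflags, int enable)
{
    return enable ? (saved_eflags | EFLAGS_TF) : (saved_eflags & ~EFLAGS_TF);
}
```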
A.3.3.1.2
Data-Breakpoint Trap
A data-breakpoint exception is a trap (i.e., the processor generates an exception for a data
breakpoint after executing the instruction that accesses the breakpointed memory location).
The Am486 microprocessor always does exact data breakpoint matching, regardless of
GE/LE bit settings. Exact reporting is provided by forcing the Am486 microprocessor execution unit to wait for completion of data operand transfers before beginning execution of
the next instruction.
If a debugger needs to save the contents of a write-breakpoint location, it should save the
original contents before setting the breakpoint. Because data breakpoints are traps, the
original data is overwritten before the trap exception is generated. The handler can report
the saved value after the breakpoint is triggered. The data in the debug registers can be
used to address the new value stored by the instruction that triggered the breakpoint.
A.3.3.1.3
General-Detect Fault
The general-detect fault occurs when an attempt is made to use the debug registers at the
same time they are being used by in-circuit emulation. This additional protection feature is
provided to guarantee emulators can have full control over the debug registers when required. The exception handler can detect this condition by checking the state of the BD bit
of the DR6 register.
A.3.3.1.4
Single-Step Trap
This trap occurs after an instruction is executed if TF was set before the instruction was
executed. Note the exception does not occur after an instruction that sets TF. For example,
if the POPF instruction is used to set TF, a single-step trap does not occur until after the
instruction following the POPF instruction.
The microprocessor clears TF before calling the exception handler. If TF was set in a TSS
at the time of a task switch, the exception occurs after the first instruction is executed in
the new task.
The single-step flag normally is not cleared by changing privilege levels inside a task. INT
instructions do, however, clear TF. Therefore, software debuggers that single-step code
must recognize and emulate INTn or INTO instructions rather than executing them directly.
To maintain protection, the operating system should check the current execution privilege
level after any single-step trap to see if single stepping should continue at the current
privilege level.
The interrupt priorities guarantee that if an external interrupt occurs, single stepping stops.
When both an external interrupt and a single-step interrupt occur together, the single-step
interrupt is processed first. This clears TF. After saving the return address or switching
tasks, the external interrupt input is examined before the first instruction of the single-step
handler executes. If the external interrupt is still pending, then it is serviced. The external
interrupt handler does not run in single-step mode. To single step an interrupt handler,
single step an INTn instruction that calls the interrupt handler.
A.3.3.1.5
Task-Switch Trap
The debug exception also occurs after a task switch if the T bit of the new task’s TSS is
set. The exception occurs after control has passed to the new task, but before the first
instruction of that task is executed. The exception handler can detect this condition by
examining the BT bit of the DR6 register.
Note: If the debug exception handler is a task, the T bit of its TSS should not be set. Failure
to observe this rule will put the microprocessor in a loop.
A.3.3.2
Interrupt 3—Breakpoint Instruction
The breakpoint trap is caused by execution of the INT 3h instruction. Typically, a debugger
prepares a breakpoint by replacing the first opcode byte of an instruction with the opcode
for the breakpoint instruction. When execution of the INT 3h instruction calls the exception
handler, the return address points to the first byte of the instruction following the INT 3h
instruction.
With older microprocessors, this feature is used extensively for setting instruction breakpoints. With the Am486 microprocessor, this use is more easily handled using the debug
registers. However, the breakpoint exception still is useful for breakpointing debuggers,
because the breakpoint exception can call an exception handler other than itself. The
breakpoint exception also can be useful when it is necessary to set a greater number of
breakpoints than permitted by the debug registers, or when breakpoints are being placed
in the source code of a program under development.
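The opcode-replacement technique can be sketched over an in-memory copy of the code, as below. This is a simplified model: the structure and function names are hypothetical, the one-byte breakpoint opcode is assumed to be CCh (the INT 3 entry of the one-byte opcode map), and a real debugger would also need write access to the code segment.

```c
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCCu   /* one-byte breakpoint instruction */

/* Bookkeeping for one software breakpoint (illustrative only). */
struct soft_bp {
    size_t  offset;  /* position of the patched opcode byte */
    uint8_t saved;   /* original first opcode byte          */
    int     armed;
};

/* Replace the first opcode byte at `offset` with INT 3, saving the original.
 * Returns 0 on success, -1 if the offset is outside the buffer. */
static int soft_bp_arm(uint8_t *code, size_t code_len,
                       struct soft_bp *bp, size_t offset)
{
    if (offset >= code_len)
        return -1;
    bp->offset = offset;
    bp->saved = code[offset];
    bp->armed = 1;
    code[offset] = INT3_OPCODE;
    return 0;
}

/* Restore the original byte. The return address saved by the exception points
 * to the byte after INT 3, so the resume offset is one byte earlier. */
static size_t soft_bp_disarm(uint8_t *code, struct soft_bp *bp,
                             size_t return_offset)
{
    code[bp->offset] = bp->saved;
    bp->armed = 0;
    return return_offset - 1u;
}
```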
A.4
CACHING
The Am486 microprocessor has an on-chip internal cache for storing 8 Kbytes of instructions and data. The cache raises system performance by satisfying an internal read request
more quickly than a bus cycle to memory. This also reduces the microprocessor’s use of
the external bus. The internal cache is transparent to program operation.
The Am486 microprocessor can use an external second-level cache outside of the processor chip. An external cache normally improves performance and reduces bus bandwidth
required by the Am486 microprocessor.
Caches require special consideration in multiprocessor systems. When one microprocessor
accesses data cached in another microprocessor, it must not receive incorrect data. If it
modifies data, all other microprocessors that access that data must receive the modified
data. This property is called cache consistency. The Am486 microprocessor provides mechanisms that maintain cache consistency in the presence of multiple microprocessors and
external caches.
The operation of internal and external caches is transparent to application software, but
knowledge of the behavior of these caches may be useful in optimizing software performance. In multiprocessor systems, maintenance of cache consistency may require intervention by system software.
The cache is available in all execution modes: Real Mode, Protected Mode, and Virtual
8086 Mode. For properly designed single-processor systems, the cache can be initially
enabled and not require further control.
A.4.1
Introduction to Caching
Caches are often implemented as associative memories. An associative memory has extra
storage for each unit of memory, called a tag. When an address is applied to an associative
memory, each tag simultaneously compares itself against the address. If a tag matches
the address, access is provided to the unit of memory associated with the tag.
This is called a cache hit. If no match occurs, the cache signals a cache miss. A cache
miss requires a bus cycle to access main memory. To gain efficiency in the implementation
of the internal cache, storage is allocated in chunks of 128 bits, called cache lines. External
caches are not likely to use cache lines smaller than those of the internal cache.
The cache of the Am486 microprocessor does not support partially-filled cache lines, so
caching a single doubleword requires caching four doublewords. This would be an inefficient
use of the cache if it were not for the fact that the microprocessor rarely accesses random
locations in memory. Over any small span of time, the microprocessor usually accesses a
small number of areas in memory, such as the code segment or the stack, and it usually
accesses many neighboring addresses in these areas.
To simplify the hardware implementation, cache lines can only be mapped to aligned 128-bit blocks of main memory. (An aligned 128-bit block begins at an address that is clear in
its low four bits.) When a new cache line is allocated, the microprocessor loads a block
from main memory into the cache line. This operation is called a cache line fill. Allocated
cache lines are said to be valid. Unallocated cache lines are invalid.
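The alignment rule can be expressed in a few lines of C. The sketch below only models the address arithmetic for 128-bit (16-byte) lines; the names are illustrative and nothing here interacts with the cache itself.

```c
#include <stdint.h>

#define CACHE_LINE_BYTES 16u   /* 128 bits per cache line */

/* Address of the aligned 16-byte block that contains addr. */
static uint32_t cache_line_base(uint32_t addr)
{
    return addr & ~(CACHE_LINE_BYTES - 1u);   /* clear the low four bits */
}

/* Number of cache lines spanned by an access of `size` bytes at `addr`. */
static unsigned cache_lines_touched(uint32_t addr, uint32_t size)
{
    if (size == 0)
        return 0;
    uint32_t first = cache_line_base(addr);
    uint32_t last = cache_line_base(addr + size - 1u);
    return (unsigned)((last - first) / CACHE_LINE_BYTES) + 1u;
}
```

A doubleword at 1000Ch lies within one line, whereas a doubleword at 1000Eh straddles two lines, which is one reason aligned data is preferable.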
Caching can be write-through or write-back. On reads, both forms of caching operate as
described above. On writes, write-through caching updates both cache memory and main
memory; write-back caching updates only the cache memory. Write-back caching updates
main memory when a write-back operation is performed. Write-back operations are triggered when cache lines need to be deallocated, such as when new cache lines are being
allocated in a cache that is already full. Write-back operations also are triggered by the
mechanisms used to maintain cache consistency.
The internal cache of the Am486 microprocessor is a write-through cache. It can be used
with external caches that are write-through, write-back, or a mixture of both.
A.4.2
Operation of the Internal Cache
Software controls the operating mode of the cache. Caching can be enabled (its state
following reset initialization), caching can be disabled while valid cache lines exist (a mode
in which the cache acts like a fast, internal RAM), or caching can be fully disabled.
Precautions must be followed when disabling the cache. Whenever CD is set to 1, the
Am486 microprocessor does not read external memory if a copy is still in the cache. Whenever NW is set to 1, the Am486 microprocessor does not write to external memory if the
data is in the cache. This means stale data can develop in the Am486 CPU cache. This
stale data is not written to external memory if NW is later set to 0 or that cache line is later
overwritten as a result of a cache miss. In general, the cache should be flushed when
disabled. It is possible to freeze data in the cache by loading it using test registers while
CD and NW are set. This is useful to provide guaranteed cache hits for time critical interrupt
code and data.
Note: All segments should start on 16-byte boundaries to allow programs to align code/
data in cache lines.
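System software that inspects the cache mode can decode the CD and NW bits as in the following sketch, which mirrors the four combinations summarized in Table A-19. It assumes the conventional CR0 positions (NW at bit 29, CD at bit 30); the enumeration and function names are hypothetical, and reading CR0 itself requires a privileged MOV that is not shown.

```c
#include <stdint.h>

#define CR0_NW (1u << 29)   /* not write-through (assumed position) */
#define CR0_CD (1u << 30)   /* cache disable (assumed position)     */

enum cache_mode {
    CACHE_ENABLED,                 /* CD = 0, NW = 0 */
    CACHE_INVALID_SETTING,         /* CD = 0, NW = 1 */
    CACHE_NO_NEW_FILLS,            /* CD = 1, NW = 0 */
    CACHE_DISABLED_LINES_RESPOND   /* CD = 1, NW = 1 */
};

/* Classify a CR0 image according to its CD and NW bits. */
static enum cache_mode cache_mode_from_cr0(uint32_t cr0)
{
    int cd = (cr0 & CR0_CD) != 0;
    int nw = (cr0 & CR0_NW) != 0;

    if (cd)
        return nw ? CACHE_DISABLED_LINES_RESPOND : CACHE_NO_NEW_FILLS;
    return nw ? CACHE_INVALID_SETTING : CACHE_ENABLED;
}
```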
A.4.2.1
Cache Disabling Bits
Table A-19 summarizes the modes enabled by the CD and NW bits.
Table A-19
Cache Operating Modes
CD  NW  Description
1   1   Caching is disabled, but valid cache lines continue to respond. To disable
        the cache completely, enter this mode and perform a cache flush. To use
        the cache as fast internal RAM, preload the cache with valid cache lines
        by carefully choosing memory operations or by using the test registers.
        In this mode, writes to valid cache lines update the cache, but do not
        update main memory.
1   0   No new cache lines are allocated, but valid cache lines continue to
        respond.
0   1   Invalid setting. A general-protection exception with an error code of 0
        occurs.
0   0   Caching is enabled.
A.4.2.2
Cache Management Instructions
The INVD and WBINVD instructions are used to invalidate the contents of the internal and
external caches. The INVD instruction flushes the internal cache and generates a special
bus cycle that indicates that external caches also should be flushed. (The response of
hardware to receiving a cache flush bus cycle is implementation dependent; hardware
might use some other mechanism for maintaining cache consistency.)
There is only one difference between the WBINVD and INVD instructions. The WBINVD
instruction generates a special bus cycle that indicates external, write-back caches should
write-back modified data to main memory. This cycle is produced immediately before the
cycle to flush the cache.
A.4.2.3
Self-Modifying Code
A write to an instruction in the cache modifies it in both cache and memory, but if the
instruction is prefetched before the write, the old version of the instruction can be the one
executed. To prevent this, flush the instruction prefetch unit by coding a jump instruction
immediately after any write that modifies an instruction.
A.4.3
Page-Level Cache Management
The Am486 microprocessor defines two bits in entries in the page directory and second-level page tables that are reserved on Am386 microprocessors. These bits drive
microprocessor output pins and are used to manage the caching of pages.
The PCD and PWT bits control caching on a page-by-page basis. The PCD bit (page-level
cache disable) affects the operation of the internal cache. Both the PCD bit and the PWT
bit (page-level write-through) drive microprocessor output pins for controlling external caches. The treatment of these signals by external hardware is implementation dependent; for
example, some hardware systems may control the caching of pages by decoding some of
the high address bits.
There are three potential sources of the bits used to drive the PCD and PWT outputs of
the microprocessor: the CR3 register, the page directory, and the second-level page tables.
The microprocessor outputs are driven by the CR3 register for bus cycles where paging is
not used to generate the address, such as the loading of an entry in the page directory.
The outputs are driven by a page directory entry when an entry from a second-level page
table is accessed. The outputs are driven by a second-level page table entry when instructions or data in memory are accessed. When paging is disabled, these bits are ignored
(CPU assumes PCD = 0 and PWT = 0).
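The selection rule for the PCD and PWT outputs can be modeled as follows. This sketch assumes PWT occupies bit 3 and PCD bit 4 in the CR3 register as well as in page directory and page table entries; the type and function names are hypothetical, and the model covers only the source selection described above, not the actual bus timing.

```c
#include <stdint.h>

#define PG_PWT (1u << 3)   /* page-level write-through */
#define PG_PCD (1u << 4)   /* page-level cache disable */

/* What the bus cycle is fetching (illustrative classification). */
enum bus_access {
    ACCESS_PAGE_DIRECTORY,   /* loading a page directory entry      */
    ACCESS_PAGE_TABLE,       /* loading a second-level table entry  */
    ACCESS_MEMORY            /* instruction or data access          */
};

struct cache_pins { int pcd; int pwt; };

/* Select the register or entry that drives the PCD and PWT outputs. */
static struct cache_pins cache_pins_for(int paging_enabled,
                                        enum bus_access access,
                                        uint32_t cr3, uint32_t pde,
                                        uint32_t pte)
{
    struct cache_pins pins = { 0, 0 };   /* paging disabled: both assumed 0 */
    uint32_t source;

    if (!paging_enabled)
        return pins;

    switch (access) {
    case ACCESS_PAGE_DIRECTORY: source = cr3; break;
    case ACCESS_PAGE_TABLE:     source = pde; break;
    default:                    source = pte; break;
    }
    pins.pcd = (source & PG_PCD) != 0;
    pins.pwt = (source & PG_PWT) != 0;
    return pins;
}
```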
A.4.3.1
PCD Bit
When a page table entry has a set PCD bit (bit position 4), caching of the page is disabled,
even if hardware is requesting caching by asserting the KEN input. When the PCD bit is
clear, caching may be requested by hardware on a cycle-by-cycle basis. Disabling caching
is necessary for pages that contain memory-mapped I/O ports. It also is useful for pages
that do not provide a performance benefit when cached, such as initialization software.
Regardless of the page-table entries, the Am486 microprocessor ignores the PCD output
(assumes PCD = 0) whenever the CD (Cache Disable) bit in CR0 is set.
A.4.3.2
PWT Bit
When a page table entry has a set PWT bit (bit position 3), a write-through caching policy
is specified for data in the corresponding page. Clearing the PWT bit allows the possibility
of using a write-back policy for the page. Since the internal cache of the Am486 microprocessor is a write-through cache, it is not affected by the state of the PWT bit. External
caches however may use write-back caching, and so they can use the output signal driven
by the PWT bit to control caching policy on a page-by-page basis.
In multiprocessor systems, enabling write-through may be advantageous for shared memory, particularly for memory locations written infrequently by one microprocessor, but read
often by many microprocessors.
APPENDIX B
OPCODE MAP
B.1
GENERAL
The opcode tables aid in interpreting the 486 processor object code. Use the high-order
four bits of the opcode as an index to a row of the opcode table; use the low order four bits
as an index to a column of the table. If the opcode is 0Fh, refer to the two-byte opcode
table and use the second byte of the opcode to index the rows and columns of that table.
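The table lookup described above can be sketched in C as follows. The sketch assumes that any instruction prefixes have already been skipped, so that the first byte passed in is the opcode itself; the structure and function names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

struct opcode_index {
    int      two_byte;   /* nonzero: use the two-byte (0Fh) opcode table */
    unsigned row;        /* high-order four bits of the indexing byte    */
    unsigned column;     /* low-order four bits of the indexing byte     */
};

/* Compute the table, row, and column for the opcode at `code`.
 * Returns 0 on success, -1 if too few bytes are available. */
static int opcode_map_index(const uint8_t *code, size_t len,
                            struct opcode_index *out)
{
    uint8_t byte;

    if (len < 1)
        return -1;
    out->two_byte = (code[0] == 0x0F);
    if (out->two_byte) {
        if (len < 2)
            return -1;
        byte = code[1];      /* second byte indexes the two-byte table */
    } else {
        byte = code[0];
    }
    out->row = byte >> 4;
    out->column = byte & 0x0Fu;
    return 0;
}
```

For the byte CCh this gives row C, column C of the one-byte table; for the sequence 0Fh 08h it gives row 0, column 8 of the two-byte table.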
B.2
KEY TO ABBREVIATIONS
Operands are identified by a two-character code of the form Zz. The uppercase letter
specifies the addressing method; the lowercase letter specifies the type of operand.
B.3
CODES FOR ADDRESSING METHOD
A  Direct address; the instruction has no mod R/M byte; the operand address is encoded in
   the instruction; no base register, index register, or scaling factor can be applied; for example,
   JMP (EA)
C  The reg field of the mod R/M byte selects a control register; for example, MOV (0F20, 0F22)
D  The reg field of the mod R/M byte selects a debug register; for example, MOV (0F21, 0F23)
E  A mod R/M byte follows the opcode and specifies the operand, either a general register or
   a memory address. If a memory address, it is computed from a segment register and any
   of the following values: a base register, an index register, a scaling factor, a displacement.
F  Flags Register
G  The reg field of the mod R/M byte selects a general register; for example, ADD (00)
I  Immediate data. The value of the operand is encoded in subsequent bytes of the instruction.
J  The instruction contains a relative offset to be added to the instruction pointer register; for
   example, JMP short, LOOP.
M  The mod R/M byte only refers to memory; for example, BOUND, LES, LDS, LSS, LFS, LGS.
O  The instruction has no mod R/M byte; the offset of the operand is coded as a word or
   doubleword (depending on address size attribute) in the instruction. No base register, index
   register, or scaling factor can be applied; for example, MOV (A0–A3).
R  The mod field of the mod R/M byte may refer only to a general register; for example, MOV
   (0F20–0F24, 0F26).
S  The reg field of the mod R/M byte selects a segment register; for example, MOV (8C, 8E).
T  The reg field of the mod R/M byte selects a test register; for example, MOV (0F24, 0F26)
X  Memory addressed by the DS:SI register pair; for example, MOVS, CMPS, OUTS, LODS,
   SCAS.
Y  Memory addressed by the ES:DI register pair; for example, MOVS, CMPS, INS, STOS.
B.4
CODES FOR OPERAND TYPE
a  Two one-word operands in memory or two doubleword operands in memory, depending on
   operand size attribute (used only by BOUND)
b  Byte (regardless of operand size attribute)
c  Byte or word, depending on operand size attribute
d  Doubleword (regardless of operand size attribute)
p  32-bit or 48-bit pointer, depending on operand size attribute
s  6-byte pseudo-descriptor
v  Word or doubleword, depending on operand size attribute
w  Word (regardless of operand size attribute)
B.5
REGISTER CODES
The register name in the opcode indicates whether the register is 32-, 16-, or 8-bits wide.
A register identifier in the form eXX indicates the width of the register depends on the
operand size; for example eAX indicates the AX register (16 bit) or the EAX register (32 bit).
One-Byte Opcode Map
0
1
2
3
4
5
6
7
Gv,Ev
AL,Ib
eAX,lv
PUSH ES
POP ES
Gv,Ev
AL,Ib
eAX,lv
PUSH SS
POP SS
Gv,Ev
AL,lb
eAX,lv
SEG=ES
DAA
Gv,Ev
AL,lb
eAX,lv
SEG=SS
AAA
eBP
eSI
eSI
ADD
0
Eb,Gb
Ev,Gv
Gb,Eb
ADC
1
Eb,Gb
Ev,Gv
Gb,Eb
AND
2
Eb,Gb
Ev,Gv
Gb,Eb
XOR
3
Eb,Gb
Ev,Gv
Gb,Eb
INC general register
4
eAX
eCX
eDX
eBX
eSP
PUSH general register
5
6
eAX
eCX
eDX
eBX
eSP
eBP
eSI
eSI
PUSHA
POPA
BOUND
Gv,Ma
ARPL
Ew,Rw
SEG=FS
SEG=GS
Operand
Size
Address
Size
JBE
JNBE
Short-displacement jump on condition (Jxx)
7
JO
JNO
Immediate Grpl
JB
JNB
MOVB
Grpl
Ev,Ib
8
Eb,Ib
Ev,Iv
AL,imm8
JZ
JNZ
TEST
Eb,Gb
XCHG
Ev,Gv
Eb,Gb
Ev,Gv
XCHG word or doubleword register with eAX
9
NOP
eCX
eDX
eBX
eSP
eBP
eSI
eDI
Ob,AL
Ov,eAX
MOVSB
Xb,Yb
MOVSW/D
Xv,Yv
CMPSB
Xb,Yb
CMPSW/D
Xv,Yv
DH
BH
MOV
A
AL,Ob
eAX,Ov
MOV immediate byte into byte register
B
AL
CL
DL
Shift Grp2
BL
RET near
C
Eb,Ib
Ev,Ib
Iw
AH
CH
LES
Gv,Mp
LDS
GV,Mp
AAM
AAD
MOV
Eb,Ib
Ev,Iv
Shift Grp2
D
E
F
Eb,1
Ev,1
Eb,CL
Ev,CL
LOOPNE
Jb
LOOPE
Jb
LOOP
Jb
JCXZ
Jb
REPNE
REP
REPE
LOCK
XLAT
IN
OUT
AL,lb
eAX
HLT
CMC
Ib,AL
Ib,eAX
Unary Grp3
Eb
Ev
One-Byte Opcode Map
8
9
A
B
C
D
E
F
Gv,Ev
AL,Ib
eAX,lv
PUSH CS
POP CS
Gv,Ev
AL,Ib
eAX,lv
PUSH DS
POP DS
Gv,Ev
AL,lb
eAX,lv
SEG=CS
DAS
Gv,Ev
AL,lb
eAX,lv
SEG=DS
AAS
eBP
eSI
eSI
OR
0
Eb,Gb
Ev,Gv
Gb,Eb
SBB
1
Eb,Gb
Ev,Gv
Gb,Eb
SUB
2
Eb,Gb
Ev,Gv
Gb,Eb
CMP
3
Eb,Gb
Ev,Gv
Gb,Eb
DEC general register
4
eAX
eCX
eDX
eBX
eSP
POP into general register
5
6
eAX
eCX
eDX
eBX
eSP
eBP
eSI
eSI
PUSH
Iv
IMUL
GvEvIv
PUSH
Ib
IMUL
GvEvIb
INSB
Yb,DX
INSW/D
Yv,DX
OUTSB
DX,Xb
OUTSW/D
DX,Xv
Short-displacement jump on condition (Jxx)
7
JS
JNS
JP
JNP
JL
JNL
JLE
JNLE
MOV
Eb,Gb
Ev,Gv
Gb,Eb
Gv,Ev
MOV
Ew,Sw
LEA
Gv,M
MOV
Sw,Ew
POP
Ev
CBW
CWD
CALL Ap
WAIT
PUSHF Fv
POPF Fv
SAHF
LAHF
eAX,Iv
STOSB
Yb,AL
STOSW/D
Yv,eAX
LODSB
AL,Xb
LODSW/D
eAX,Xv
SCASB
AL,Xb
SCASW/D
eAX,Xv
8
9
TEST
A
AL,Ib
MOV immediate word or doubleword into word or doubleword register
B
C
eAX
eCX
ENTER
Iw,iB
LEAVE
F
eBX
eSP
eBP
eSI
eDI
INT 3
INT Ib
INTO
IRET
RET far
Iw
D
E
eDX
ESC (Escape to coprocessor instruction set)
JMP
IN
OUT
CALL
Jv
JV
AP
Jb
AL,DX
eAX,DX
DX,AL
DX,eAX
CLC
STC
CLI
STI
CLD
STD
INC/DEC
Grp4
INC/DEC
Grp5
Two-Byte Opcode Map (first byte is 0Fh)
0
0
1
2
3
Grp6
Grp7
LAR
Gv,Ew
LSL
Gv,Ew
MOV
Cd,Rd
MOV
Dd,Rd
MOV
Rd,Cd
MOV
Td,Rd
4
5
6
7
CLTS
1
2
MOV
Rd,Td
3
4
5
6
7
Long-displacement jump on condition (Jxx)
8
JO
JNO
JB
JNB
JZ
JNZ
JBE
JNBE
Byte Set on condition (Eb)
9
A
SETO
SETNO
PUSH
FS
POP
FS
LSS
Mp
B
C
SETB
XADD
Eb,Gb
SETNB
SETZ
SETNZ
SETBE
SETNBE
BT
Ev,Gv
SHLD
EvGvIb
SHLD
EvGvCL
CMPXCHG
Eb,Gb
CMPXCHG
Ev,Gv
BTR
Ev,Gv
LFS
Mp
LGS
Mp
XADD
Ev,Gv
D
E
F
MOVZX
Gv,Eb
Gv,Ew
Two-Byte Opcode Map (first byte is 0Fh)
0
8
9
INVD
WBINVD
A
B
C
D
E
F
JNL
JLE
JNLE
SETLE
SETNLE
1
2
3
4
5
6
7
Long-displacement jump on condition (Jxx)
8
JS
JNS
JP
JNP
JL
Byte Set on condition (Eb)
9
A
SETS
SETNS
PUSH
GS
POP
GS
B
C
BSWAP
EAX
BSWAP
ECX
SETP
SETNP
SETL
SETNL
BTS
Ev,Gv
SHRD
EvGvIb
SHRD
EvGvCL
Grp8
Ev,Ib
BTC
Ev,Gv
BSF
Gv,Ev
BSR
Gv,Ev
Gv,Eb
Gv,Ew
BSWAP
EDX
BSWAP
EBX
BSWAP
ESP
BSWAP
EBP
BSWAP
ESI
BSWAP
EDI
IMUL
Gv,Ev
MOVSX
D
E
F
Opcodes determined by bits 5, 4, 3 of mod R/M byte:
mod R/M byte fields: mod | nnn | R/M
Group  000          001          010          011          100          101          110          111
1      ADD          OR           ADC          SBB          AND          SUB          XOR          CMP
2      ROL          ROR          RCL          RCR          SHL          SHR          SHL          SAR
3      TEST Ib/Iv   TEST Ib/Iv   NOT          NEG          MUL AL/eAX   IMUL AL/eAX  DIV AL/eAX   IDIV AL/eAX
4      INC Eb       DEC Eb
5      INC Ev       DEC Ev       CALL Ev      CALL Ep      JMP Ev       JMP Ep       PUSH Ev
6      SLDT Ew      STR Ew       LLDT Ew      LTR Ew       VERR Ew      VERW Ew
7      SGDT Ms      SIDT Ms      LGDT Ms      LIDT Ms      SMSW Ew                   LMSW Ew
8                                                          BT           BTS          BTR          BTC
APPENDIX C
FLAG CROSS-REFERENCE
C.1
KEY TO CODES
T      =  Instruction tests flag
M      =  Instruction modifies flag (either sets or resets depending on operands)
0      =  Instruction resets flag
1      =  Instruction sets flag
—      =  Instruction’s effect on flag is undefined
R      =  Instruction restores prior value of flag
blank  =  Instruction does not affect flag
Instruction
AAA
AAD
AAM
AAS
ADC
ADD
AND
ARPL
BOUND
BSF/BSR
BSWAP
BT/BTS/BTR/BTC
CALL
CBW
CLC
CLD
CLI
CLTS
CMC
CMP
CMPS
CMPXCHG
CWD
DAA
DAS
DEC
DIV
ENTER
ESC
HLT
IDIV
IMUL
IN
INC
INS
INT
INTO
INVD
INVLPG
IRET
Jcond
JCXZ
JMP
OF
SF
ZF
AF
PF
CF
—
—
—
—
M
M
0
—
M
M
—
M
M
M
—
M
M
—
M
M
M
M
TM
—
—
TM
M
M
—
—
M
M
—
M
M
M
M
—
—
M
TM
M
0
—
—
M
—
—
—
—
—
—
—
—
M
TF
IF
DF
NT
RF
0
0
0
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
M
—
—
M
—
M
M
M
—
M
M
M
—
TM
TM
M
—
M
M
M
—
TM
TM
—
M
—
—
—
—
—
—
—
—
—
M
M
M
M
M
M
T
—
T
T
R
T
0
0
0
0
R
T
R
T
R
R
T
R
T
R
R
R
T
Instruction
LAHF
LAR
LDS/LES/LSS/LFS/LGS
LEA
LEAVE
LGDT/LIDT/LLDT/LMSW
LOCK
LODS
LOOP
LOOPE/LOOPNE
LSL
LTR
MOV
MOV control, debug
MOVS
MOVSX/MOVZX
MUL
NEG
NOP
NOT
OR
OUT
OUTS
POP/POPA
POPF
PUSH/PUSHA/PUSHF
RCL/RCR 1
RCL/RCR count
REP/REPE/REPNE
RET
ROL/ROR 1
ROL/ROR count
SAHF
SAL/SAR/SHL/SHR 1
SAL/SAR/SHL/SHR count
SBB
SCAS
SET cond
SGDT/SIDT/SLDT/SMSW
SHLD/SHRD
STC
STD
STI
STOS
STR
SUB
TEST
VERR/VERW
WAIT
WBINVD
XADD
XCHG
XLAT
XOR
OF
SF
ZF
AF
PF
CF
TF
IF
DF
NT
M
T
T
M
—
—
—
—
—
—
T
M
M
—
M
—
M
—
M
—
M
M
M
0
M
M
—
M
0
R
R
R
R
R
R
M
—
TM
TM
M
—
M
M
R
M
M
TM
M
T
M
—
M
M
T
R
M
M
M
M
T
R
M
M
M
M
T
R
—
—
M
M
R
M
M
M
M
T
—
M
M
—
M
R
R
T
R
R
T
M
1
1
1
T
M
0
M
M
M
M
M
M
—
M
M
M
0
M
M
M
M
M
M
0
M
M
—
M
0
Flag Definitions:
OF = Overflow Flag: When set, the number of digits in the result exceeds the destination operand size.
SF = Sign Flag: When set, the result is negative.
ZF = Zero Flag: When set, the result is zero.
AF = Adjust Flag: When set, there is a carry from or borrow to the low order 4 bits of AL in decimal.
PF = Parity Flag: When set, the low order byte of the result has an even number of 1 bits.
CF = Carry Flag: When set, there is a carry out of or a borrow into the high order bit of the result.
TF = Trap Flag: When set, the processor goes into single-step mode for debugging.
IF = Interrupt Enable Flag: When set, the processor can respond to maskable interrupt requests.
DF = Direction Flag: When set, the processor decrements the index registers ESI and EDI.
NT = Nested Task Flag: Used to control chaining of interrupted and called tasks.
RF = Resume Flag: When set, temporarily disables debug exceptions to allow normal running.
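
As an illustration of the status-flag definitions, the following minimal Python sketch computes the six arithmetic flags for an 8-bit addition. The function name add8_flags and the dictionary layout are assumptions made for this example; they are not part of the processor documentation.

```python
def add8_flags(a, b):
    """Return (result, flags) for the 8-bit addition a + b.

    Hypothetical helper for illustration only; it models the flag
    definitions for a byte-sized ADD.
    """
    a &= 0xFF
    b &= 0xFF
    full = a + b
    result = full & 0xFF
    flags = {
        # Carry out of the high-order bit (bit 7).
        "CF": int(full > 0xFF),
        # Result is zero.
        "ZF": int(result == 0),
        # High-order bit of the result.
        "SF": (result >> 7) & 1,
        # Signed overflow: both operands differ in sign from the result.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
        # Carry out of the low-order 4 bits.
        "AF": int(((a & 0x0F) + (b & 0x0F)) > 0x0F),
        # Even number of 1 bits in the low-order byte.
        "PF": int(bin(result).count("1") % 2 == 0),
    }
    return result, flags


result, flags = add8_flags(0x7F, 0x01)
print(hex(result), flags)
```

Adding 7FH and 01H gives 80H: the signed result does not fit in a byte, so OF is set while CF stays clear.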
RF
APPENDIX D
CONDITION CODES
D.1 CONDITION CODES FOR CONDITIONAL JUMP AND SET INSTRUCTIONS
Mnemonic  Meaning                    Instruction Subcode  Condition Tested
A         Above                      0111                 (CF or ZF) = 0
AE        Above or equal             0011                 CF = 0
B         Below                      0010                 CF = 1
BE        Below or equal             0110                 (CF or ZF) = 1
E         Equal                      0100                 ZF = 1
G         Greater                    1111                 ((SF xor OF) or ZF) = 0
GE        Greater or equal           1101                 (SF xor OF) = 0
L         Less                       1100                 (SF xor OF) = 1
LE        Less or equal              1110                 ((SF xor OF) or ZF) = 1
NA        Not above                  0110                 (CF or ZF) = 1
NAE       Neither above nor equal    0010                 CF = 1
NB        Not below                  0011                 CF = 0
NBE       Not below or equal         0111                 (CF or ZF) = 0
NE        Not equal                  0101                 ZF = 0
NG        Not greater                1110                 ((SF xor OF) or ZF) = 1
NGE       Neither greater nor equal  1100                 (SF xor OF) = 1
NL        Not less                   1101                 (SF xor OF) = 0
NLE       Neither less nor equal     1111                 ((SF xor OF) or ZF) = 0
NO        No overflow                0001                 OF = 0
NP        No parity                  1011                 PF = 0
NS        No sign                    1001                 SF = 0
NZ        Not zero                   0101                 ZF = 0
O         Overflow                   0000                 OF = 1
P         Parity                     1010                 PF = 1
PE        Parity even                1010                 PF = 1
PO        Parity odd                 1011                 PF = 0
S         Sign                       1000                 SF = 1
Z         Zero                       0100                 ZF = 1
Note: The terms “above” and “below” refer to the relation between two unsigned
values (neither the SF flag nor the OF flag is tested). The terms “greater” and “less”
refer to the relation between two signed values (the SF and OF flags are tested).
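
The table can be read as eight base conditions selected by the upper three bits of the subcode, with the low-order bit inverting the test. The Python sketch below evaluates a subcode against a set of flag values on that reading. The function name condition_met and the flag dictionary are assumptions for illustration only.

```python
def condition_met(subcode, flags):
    """Evaluate a 4-bit condition subcode against a dict of flag bits.

    Hypothetical helper: `flags` maps "OF", "SF", "ZF", "PF", "CF" to 0 or 1.
    """
    base_conditions = [
        flags["OF"],                                  # 000x: O  / NO
        flags["CF"],                                  # 001x: B  / NB
        flags["ZF"],                                  # 010x: E  / NE
        flags["CF"] | flags["ZF"],                    # 011x: BE / NBE
        flags["SF"],                                  # 100x: S  / NS
        flags["PF"],                                  # 101x: P  / NP
        flags["SF"] ^ flags["OF"],                    # 110x: L  / NL
        (flags["SF"] ^ flags["OF"]) | flags["ZF"],    # 111x: LE / NLE
    ]
    base = bool(base_conditions[(subcode >> 1) & 0b111])
    negate = bool(subcode & 1)
    return base != negate


# Flags as they would stand after comparing FFH with 01H as bytes:
# unsigned 255 is above 1, but signed -1 is less than 1.
flags = {"OF": 0, "SF": 1, "ZF": 0, "PF": 0, "CF": 0}
print(condition_met(0b0111, flags))  # A  (above)
print(condition_met(0b1100, flags))  # L  (less)
print(condition_met(0b1111, flags))  # G  (greater)
```

The same flag values satisfy both "above" and "less", which is the unsigned versus signed distinction described in the note.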
APPENDIX E
INSTRUCTION FORMAT AND TIMING
E.1 INSTRUCTION ENCODING AND CLOCK COUNT SUMMARY
To calculate elapsed time for an instruction, multiply the instruction clock count, as listed
in Table E-1, by the processor clock period. For more detailed information on the encodings
of instructions, refer to Section E.3, Instruction Encodings. Section E.3 explains the general
structure of instruction encodings and defines the exact encodings of all fields contained
within the instruction.
The Am486 microprocessor instruction clock count tables give clock counts, assuming data
and instruction accesses hit in the cache. A separate penalty column defines clocks to add
if a data access misses in the cache. The combined instruction and data cache hit rate is
over 90%.
A cache miss forces the Am486 microprocessor to run an external bus cycle. The Am486
microprocessor 32-bit burst bus is defined as r-b-w, where:
• r = The number of clocks in the first cycle of a burst read or the number of clocks per
  data cycle in a non-burst read.
• b = The number of clocks for the second and subsequent cycles in a burst read.
• w = The number of clocks for a write.
The fastest bus the Am486 microprocessor can support is 2-1-2, assuming 0 wait states.
The clock counts in the cache miss penalty column assume a 2-1-2 bus. For slower buses,
add r-2 clocks to the cache miss penalty for the first dword accessed.
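
The arithmetic described in this section can be sketched as follows. This is a minimal Python illustration, not a timing model: the function names and the 50-MHz clock frequency are assumptions chosen for the example, and the instruction figures (2 clocks, miss penalty of 2) are those listed in Table E-1 for ADD memory to register.

```python
def elapsed_ns(clock_count, clock_mhz):
    """Elapsed time = clock count x processor clock period (in ns)."""
    clock_period_ns = 1000.0 / clock_mhz
    return clock_count * clock_period_ns


def miss_penalty(table_penalty, r):
    """Adjust a Table E-1 cache miss penalty for a bus slower than 2-1-2.

    The table assumes r = 2; a slower bus adds r - 2 clocks for the
    first dword accessed.
    """
    return table_penalty + max(r - 2, 0)


# ADD memory to register: 2 clocks on a cache hit, penalty 2 on a miss.
hit_clocks = 2
miss_clocks = hit_clocks + miss_penalty(2, r=3)   # hypothetical 3-1-2 bus

print(elapsed_ns(hit_clocks, 50))    # cache hit at an assumed 50 MHz
print(elapsed_ns(miss_clocks, 50))   # cache miss on the slower bus
```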
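E.2 FACTORS THAT AFFECT INSTRUCTION CLOCK COUNTS
The clock counts in Table E-1 assume the following conditions. Where a condition does not
hold, adjust the count as noted.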
1. The external bus is available for reads or writes at all times. Otherwise, add clocks to
   reads until the bus is available.
2. Accesses are aligned. Add three clocks to each misaligned access.
3. Cache fills complete before subsequent accesses to the same line. If a read misses
   the cache during a cache fill due to a previous read or prefetch, the read must wait
   for the cache fill to complete. If a read or write accesses a cache line still being filled,
   it must wait for the fill to complete.
4. If an effective address is calculated, the base register is not the destination register
   of the preceding instruction. If the base register is the destination register of the
   preceding instruction, add 1 to the clock counts shown. Back-to-back PUSH and POP
   instructions are not affected by this rule.
5. An effective address calculation uses one base register and does not use an index
   register. However, if the effective address calculation uses an index register, one
   clock may be added to the clock count shown.
6. The target of a jump is in the cache. If not, add r clocks for accessing the destination
   instruction of a jump. If the destination instruction is not completely contained in the
   first dword read, add a maximum of 3b clocks. If the destination instruction is not
   completely contained in the first 16-byte burst, add a maximum of another r+3b clocks.
7. No write buffer delay occurs. Add w clocks only when all write buffers are full; this
   case rarely occurs.
8. Displacement and immediate are not used together. If displacement and immediate
   are used together, one clock can be added to the clock count shown.
9. No invalidate cycles. Add a delay of one clock for each invalidate cycle if the invalidate
   cycle contends for the internal cache/external bus when the Am486 CPU needs to
   use it.
10. Page translation hits in TLB. A TLB miss adds 13, 21, or 28 clocks to the instruction,
    depending on whether the accessed and/or dirty bit in neither, one, or both of the
    page entries needs to be set in memory. This assumes that neither page entry is in
    the data cache and a page fault does not occur on the address translation.
11. No exceptions are detected during instruction execution.
12. Instructions that read multiple consecutive data items (e.g., task switch, POPA)
    and miss the cache are assumed to start the first access on a 16-byte boundary. If
    not, an extra cache line fill might be necessary and might add up to (r+3b) clocks to
    the cache miss penalty.
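
Several of the factors above are fixed additive adjustments, so they can be folded into a rough upper-bound estimate. The Python sketch below covers factors 2, 4, 5, 8, and 10 only; the function and parameter names are assumptions for illustration, and factors 5 and 8 are treated as always costing one clock even though the text says a clock "may" or "can" be added.

```python
# Factor 10: clocks added by a TLB miss, keyed by how many of the two
# page entries need the accessed and/or dirty bit set in memory.
TLB_MISS_CLOCKS = {0: 13, 1: 21, 2: 28}


def adjusted_clocks(base_clocks,
                    misaligned_accesses=0,
                    base_reg_dependency=False,
                    uses_index_register=False,
                    disp_and_immediate=False,
                    tlb_entries_to_update=None):
    """Return an upper-bound clock estimate for one instruction.

    Hypothetical helper; `base_clocks` is the Table E-1 cache-hit count.
    `tlb_entries_to_update` is None when the translation hits in the TLB.
    """
    clocks = base_clocks
    clocks += 3 * misaligned_accesses            # factor 2
    if base_reg_dependency:                      # factor 4
        clocks += 1
    if uses_index_register:                      # factor 5 (worst case)
        clocks += 1
    if disp_and_immediate:                       # factor 8 (worst case)
        clocks += 1
    if tlb_entries_to_update is not None:        # factor 10
        clocks += TLB_MISS_CLOCKS[tlb_entries_to_update]
    return clocks


# A 3-clock instruction with one misaligned access and an index register.
print(adjusted_clocks(3, misaligned_accesses=1, uses_index_register=True))
```

Bus availability, cache fills, jump targets, write buffers, and invalidate cycles (factors 1, 3, 6, 7, and 9) depend on system state and are left out of this sketch.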
Table E-1 Instruction Clock Count Summary
INSTRUCTION                            FORMAT                                      Clocks if   Penalty if   Notes
                                                                                   Cache Hit   Cache Miss
AAA = ASCII adjust AL after add        00110111                                    3
AAD = ASCII adjust AX before divide    11010101  00001010                          14
AAM = ASCII adjust AX after multiply   11010100  00001010                          15
AAS = ASCII adjust AL after subtract   00111111                                    3
ADC = Add with carry
  reg1 to reg2                         0001000w  11 reg1 reg2                      1
  reg2 to reg1                         0001001w  11 reg1 reg2                      1
  memory to register                   0001001w  mod reg r/m                       2           2
  register to memory                   0001000w  mod reg r/m                       3           6/2          No LOCK/LOCK
  immediate to register                100000sw  11 010 reg  immediate data        1
  immediate to accumulator             0001010w  immediate data                    1
  immediate to memory                  100000sw  mod 010 r/m  immediate data       3           6/2          No LOCK/LOCK
ADD = Add
  reg1 to reg2                         0000000w  11 reg1 reg2                      1
  reg2 to reg1                         0000001w  11 reg1 reg2                      1
  memory to register                   0000001w  mod reg r/m                       2           2
  register to memory                   0000000w  mod reg r/m                       3           6/2          No LOCK/LOCK
  immediate to register                100000sw  11 000 reg  immediate data        1
  immediate to accumulator             0000010w  immediate data                    1
  immediate to memory                  100000sw  mod 000 r/m  immediate data       3           6/2          No LOCK/LOCK
Address Size Prefix                    01100111                                    1
AND = Logical AND
  reg1 to reg2                         0010000w  11 reg1 reg2                      1
  reg2 to reg1                         0010001w  11 reg1 reg2                      1
  memory to register                   0010001w  mod reg r/m                       2           2
  register to memory                   0010000w  mod reg r/m                       3           6/2          No LOCK/LOCK
  immediate to register                100000sw  11 100 reg  immediate data        1
  immediate to accumulator             0010010w  immediate data                    1
  immediate to memory                  100000sw  mod 100 r/m  immediate data       3           6/2          No LOCK/LOCK
ARPL = Adjust RPL field of selector
  From Register                        01100011  11 reg1 reg2                      9
  From Memory                          01100011  mod reg r/m                       9
BOUND = Check array index bounds (generates INT 5 if out of bounds)
If in range
If out of range
Real Mode
Protected Mode:
Int/Trap Gate, same level
Int/Trap Gate, diff. level
Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
Virtual Mode:
Int/Trap Gate, diff. level
Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
01100010
01100010
mod reg r/m
mod reg r/m
7
7
50
7
68
95
7
7
223
204
201
7
7
7
106
7
223
204
201
7
7
7
Variables:
BSF = Bit Scan Forward
reg1, reg2
00001111
10111100
11 reg2 reg1
6 to 42
memory, reg
00001111
10111100
mod reg r/m
7 to 43
BSR = Bit Scan Reverse
b = number of bytes not 0 (0–3)
i = number of nibbles/byte not 0 (0–1)
n = number of bits/nibble not 0 (0–3)
If operand2 is 0, clocks = 6. Else, clocks = 8 + 4(b+1) + 3(i+1) + 3(n+1)
2
If operand2 is 0, clocks = 7. Else, clocks = 9 + 4(b+1) + 3(i+1) + 3(n+1)
Variable: n = bit position number (0–31)
reg1, reg2
00001111
10111101
11 reg2 reg1
6 to 103
memory, reg
00001111
10111101
mod reg r/m
7 to 104
00001111
11001 reg
BSWAP = Byte Swap
Add 11 clocks for each unaccessed descriptor load.


If operand2 is 0, clocks = 6. Else, clocks = 7 + 3(32 – n)
1
If operand2 is 0, clocks = 7. Else, clocks = 8 + 3(32 – n)
1
Clocks if
Cache Hit
Penalty if
Cache Miss
INSTRUCTION
FORMAT
BT = Bit Test
register, immediate
memory, immediate
reg1, reg2
memory, reg
00001111
00001111
00001111
00001111
10111010
10111010
10100011
10100011
11 100 reg imm. byte
mod 100 r/m imm. byte
11 reg2 reg1
mod reg r/m
3
3
3
8
BTC = Bit Test and Complement
register, immediate
memory, immediate
reg1, reg2
memory, reg
00001111
00001111
00001111
00001111
10111010
10111010
10111011
10111011
11 111 reg imm. byte
mod 111 r/m imm. byte
11 reg2 reg1
mod reg r/m
6
8
6
13
BTR = Bit Test and Reset
register, immediate
memory, immediate
reg1, reg2
memory, reg
00001111
00001111
00001111
00001111
10111010
10111010
10110011
10110011
11 110 reg imm. byte
mod 110 r/m imm. byte
11 reg2 reg1
mod reg r/m
6
8
6
13
BTS = Bit Test and Set
register, immediate
memory, immediate
reg1, reg2
memory, reg
00001111
00001111
00001111
00001111
10111010
10111010
10101011
10101011
11 101 reg imm. byte
mod 101 r/m imm. byte
11 reg2 reg1
mod reg r/m
6
8
6
13
CALL = Call Procedure
Within Segment
Direct
Register indirect
Memory indirect
11101000
11111111
11111111
full displacement
11 010 reg
mod 010 r/m
3
5
5
Direct Intersegment
10011010
unsigned full offset, selector
18
2
20
35
69
3
6
17
77 + 4(x)
17 + 4(x)
199
180
177
3
3
3
200
181
178
3
3
3
same level
thru Gate — same level
inner level, no parameters
inner level, x parameters (d)
words
to TSS:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
thru Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
Notes
1
2
2/0
No LOCK/LOCK
3/1
No LOCK/LOCK
2/0
No LOCK/LOCK
3/1
No LOCK/LOCK
2/0
No LOCK/LOCK
3/1
No LOCK/LOCK
See factor 6 in Section E.2.
5
Real Mode; assumes memory read, stack push/pop, and branch in different cache sets;
clocks include 1 for displacement + immediate


Add 11 clocks for each unaccessed descriptor load.





INSTRUCTION
FORMAT
CALL (continued)
Indirect Intersegment
11111111
mod 011 r/m
same level
thru Gate — same level
inner level, no parameters
inner level, x = number of
parameter words
to TSS:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
thru Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
Clocks if
Cache Hit
Penalty if
Cache Miss
17
8
Real Mode; assumes memory read, stack push/pop, and branch in different cache sets;
clocks include 1 for displacement + immediate
20
35
69
10
13
24
77 + 4(x)
24 + 4(x)
199
180
177
10
10
10
200
181
178
10
10
10

Add 11 clocks for each unaccessed descriptor load.







CBW = Convert Byte to Word
10011000
3
CDQ = Convert Dword to Qword
10011001
3
CLC = Clear Carry Flag
11111000
2
CLD = Clear Direction Flag
11111100
2
CLI = Clear Interrupt-Enable Flag
11111010
2
CLTS = Clear Task Switched Flag
00001111
CMC = Complement Carry Flag
11110101
CMP = Compare
reg1 with reg2
reg2 with reg1
memory with register
register with memory
immediate with register
immediate with accumulator
immediate with memory
0011100w
0011101w
0011100w
0011101w
100000sw
0011110w
100000sw
CMPS = Compare 2 Strings
CMPSB = Compare 2 Bytes
CMPSD = Compare 2 Dwords
CMPSW = Compare 2 Words
00000110
7
2
2
11 reg1 reg2
11 reg1 reg2
mod reg r/m
mod reg r/m
11 111 reg
immediate data
mod 111 r/m
immediate data
immediate data
1010011w
CMPXCHG = Compare/Exchange
reg1, reg2
00001111
memory, reg:
00001111
equal
not equal
Notes
1
1
2
2
1
1
2
8
1011000w
1011000w
11 reg2 reg1
mod reg r/m
2
6
16
6
7
10
CWD = Convert Word to Dword
10011001
3
CWDE = Convert Word to Dword
10011000
3
DAA = Decimal Adjust after Add
00100111
2
DAS = Decimal Adjust after
Subtract
00101111
2
2
2
2
2
INSTRUCTION
FORMAT
DEC = Decrement
reg
or
memory
1111111w
01001 reg
1111111w
11 001 reg
1111011w
11 110 reg
DIV = Divide (unsigned)
accumulator by reg.
divisor-byte
divisor-word
divisor-dword
accumulator by mem.
divisor-byte
divisor-word
divisor-dword
mod 001 r/m
Clocks if
Cache Hit
Penalty if
Cache Miss
Notes
1
1
3
6/2
No LOCK/LOCK
6(n)
n = number of
words copied to
new stack frame
16
24
40
1111011w
mod 110 r/m
16
24
40
ENTER = Enter Procedure
Level = 0
Level = 1
Level (L) > 1
11001000
16-bit displacement, 8-bit level
F2XM1 = Compute 2^ST(0) – 1
11011001
11110000
Avg. (range)
242 (140–279)
FABS = Absolute Value of ST(0)
11011001
11100001
3
14
17
17 + 3(L)
Concurr. Exec.
2
Continuous INT
polling to ensure
short interrupt
latency.
FADD = Add Real to ST(0)
ST(0) ← ST(0) + 32-bit memory
ST(0) ← ST(0) + 64-bit memory
ST(d) ← ST(0) + ST(i)
s-i-b/displacement
s-i-b/displacement
Avg. (range)
10 (8–20)
10 (8–20)
10 (8–20)
11011 000
11011 100
11011 d00
mod 000 r/m
mod 000 r/m
11100 ST(i)
FADDP = Add Floating-Point and
Pop Stack
11011 110
11000 ST(i)
FBLD = Load BCD to ST(0)
11011 111
mod 100 r/m
s-i-b/displacement
Avg. (range)
75 (70–103)
FBSTP = Store BCD & Pop Stack
11011 111
mod 110 r/m
s-i-b/displacement
Avg. (range)
175 (172–176)
FCHS = Change Sign
11011 001
1110 0000
FCLEX = Clear Exceptions after
Checking for FPU Error
No error pending
Error pending
1001 1011
11011 011
2
3
Avg. (range)
10 (8–10)
Concurr. Exec.
Avg. (range)
7 (5–17)
7 (5–17)
7 (5–17)
Concurr. Exec.
Avg. (range)
7 (5–17)
4
Concurr. Exec.
Avg. (range)
7.7 (2–8)
6
1110 0010
7
24
FCOM = Compare ST and Real
32-bit memory
64-bit memory
ST(i)
11011 000
11011 100
11011 000
mod 010 r/m
mod 010 r/m
11010 ST(i)
s-i-b/displacement
s-i-b/displacement
4
4
4
2
3
Concurr. Exec.
1
1
1
FCOMP = Compare Real and Pop
32-bit memory
64-bit memory
ST(i)
11011 000
11011 100
11011 000
mod 011 r/m
mod 011 r/m
11011 ST(i)
s-i-b/displacement
s-i-b/displacement
4
4
4
2
3
Concurr. Exec.
1
1
1
Clocks if
Cache Hit
Penalty if
Cache Miss
Notes
INSTRUCTION
FORMAT
FCOMPP = Compare Real and
Pop Stack Twice
11011 110
1101 1001
5
Concurr. Exec.
1
FCOS = Cosine ST(0)
11011 001
1111 1111
Avg. (range)
241 (193–279)
If |ST(0)| > π/4,
add n, where
n = [ST(0)/(π/4)]
Concurr. Exec.
2
Continuous INT
polling to ensure
short interrupt
latency.
FDECSTP = Decrement Stack
Pointer
11011 001
1111 0110
3
11011 000
mod 110 r/m
s-i-b/displacement
11011 100
mod 110 r/m
s-i-b/displacement
11011 d00
11111 ST(i)
11011 110
111111 ST(i)
11011 000
mod 111 r/m
s-i-b/displacement
11011 100
mod 111 r/m
s-i-b/displacement
11011 d00
11110 ST(i)
11011 110
111110 ST(i)
73
35
62
11011 101
11000 ST(i)
3
11011 110
11011 010
mod 000 r/m
mod 000 r/m
s-i-b/displacement
s-i-b/displacement
Avg. (range)
24 (20–35)
22.5 (19–32)
2
2
Concurr. Exec.
Avg. (range)
7 (5–17)
7 (5–17)
FICOM = Compare Integer
16-bit memory
32-bit memory
11011 110
11011 010
mod 010 r/m
mod 010 r/m
s-i-b/displacement
s-i-b/displacement
Avg. (Range)
18 (16–20)
16.5 (15–17)
2
2
Concurr. Exec.
1
1
FICOMP = Compare Integer and
Pop Stack
16-bit memory
32-bit memory
11011 110
11011 010
mod 011 r/m
mod 011 r/m
s-i-b/displacement
s-i-b/displacement
Avg. (Range)
18 (16–20)
16.5 (15–17)
2
2
Concurr. Exec.
1
1
FDIV = Divide Real
ST(0) ← ST(0) / 32-bit mem
24-bit precision
53-bit precision
ST(0) ← ST(0) / 64-bit mem
24-bit precision
53-bit precision
ST(d) ← ST(0) / ST(i)
24-bit precision
53-bit precision
FDIVP = Divide Real and Pop
24-bit precision
53-bit precision
FDIVR = Reverse Divide Real
ST(0) ← 32-bit mem / ST(0)
24-bit precision
53-bit precision
ST(0) ← 64-bit mem / ST(0)
24-bit precision
53-bit precision
ST(d) ← ST(i) / ST(0)
24-bit precision
53-bit precision
FDIVRP = Reverse Divide Real
and Pop Stack
24-bit precision
53-bit precision
FFREE = Free Floating-Point
Register
73
35
62
73
35
62
73
35
62
2
3
Concurr. Exec.
70
32
59
73
35
62
73
35
62
73
35
62
73
35
62
2
3
Concurr. Exec.
70
32
59
70
32
59
70
32
59
Concurr. Exec.
70
32
59
FIADD = Add Integer
ST(0) ← ST(0) + 16-bit memory
ST(0) ← ST(0) + 32-bit memory
Concurr. Exec.
70
32
59
70
32
59
70
32
59
INSTRUCTION
FIDIV = Divide Integer
ST(0) ← ST(0) / 16-bit memory
24-bit precision
53-bit precision
ST(0) ← ST(0) / 32-bit memory
24-bit precision
53-bit precision
FIDIVR = Reverse Divide Integer
ST(0) ← 16-bit memory / ST(0)
24-bit precision
53-bit precision
ST(0) ← 32-bit memory / ST(0)
24-bit precision
53-bit precision
FORMAT
11011 110
mod 110 r/m
s-i-b/displacement
11011 010
mod 110 r/m
s-i-b/displacement
11011 110
mod 111 r/m
s-i-b/displacement
11011 010
mod 111 r/m
s-i-b/displacement
Clocks if
Cache Hit
Penalty if
Cache Miss
Avg. (range)
87 (85–89)
49 (47–51)
76 (74–78)
85.5 (84–86)
47.5 (46–48)
74.5 (73–75)
2
2
2
2
2
2
Concurr. Exec.
70
32
59
70
32
59
Avg. (range)
87 (85–89)
49 (47–51)
76 (74–78)
85.5 (84–86)
47.5 (46–48)
74.5 (73–75)
2
2
2
2
2
2
Concurr. Exec.
70
32
59
70
32
59
2
2
3
Concurr. Exec.
Avg. (range)
4
4 (2–4)
7.8 (2–8)
2
2
Concurr. Exec.
8
8
2
2
Concurr. Exec.
Avg. (range)
7 (5–17)
7 (5–17)
2
2
Concurr. Exec.
Avg. (range)
7 (5–17)
7 (5–17)
FILD = Load Integer ST(0)
11011111
11011011
11011111
mod 000 r/m
mod 000 r/m
mod 101 r/m
s-i-b/displacement
s-i-b/displacement
s-i-b/displacement
Avg. (range)
14.5 (13–16)
11.5 (9–12)
16.8 (10–18)
11011 110
11011 010
mod 001 r/m
mod 001 r/m
s-i-b/displacement
s-i-b/displacement
Avg. (range)
25 (23–27)
23.5 (22–24)
FINCSTP = Increment Stack
Pointer
11011 001
1111 0111
FINIT = Initialize FPU after
Checking for Unmasked Error
No error pending
Error pending
1001 1011
11011 011
32-bit memory
64-bit memory
80-bit memory
FIMUL = Multiply Integer
ST(0) ← ST(0) ⋅ 16-bit mem
ST(0) ← ST(0) ⋅ 32-bit mem
3
1110 0011
17
34
FIST = Store Integer from ST(0)
16-bit memory
32-bit memory
11011 111
11011 011
mod 010 r/m
mod 010 r/m
s-i-b/displacement
s-i-b/displacement
Avg. (range)
33.4 (29–34)
32.4 (28–34)
FISTP = Store Integer and Pop
Stack
16-bit memory
32-bit memory
64-bit memory
11011 111
11011 011
11011 111
mod 011 r/m
mod 011 r/m
mod 111 r/m
s-i-b/displacement
s-i-b/displacement
s-i-b/displacement
Avg. (range)
33.4 (29–34)
33.4 (29–34)
33.4 (29–34)
s-i-b/displacement
s-i-b/displacement
Avg. (range)
24 (20–35)
22.5 (19–32)
Avg. (range)
24 (20–35)
22.5 (19–32)
FISUB = Subtract Integer
ST(0) ← ST(0) – 16-bit memory
ST(0) ← ST(0) – 32-bit memory
11011 110
11011 010
mod 100 r/m
mod 100 r/m
FISUBR = Reverse Subtr. Integer
ST(0) ← 16-bit memory – ST(0)
ST(0) ← 32-bit memory – ST(0)
11011 110
11011 010
mod 101 r/m
mod 101 r/m
s-i-b/displacement
s-i-b/displacement
FLD = Load Real to ST(0)
32-bit memory
64-bit memory
80-bit memory
ST(i)
11011001
11011101
11011011
11011001
mod 000 r/m
mod 000 r/m
mod 101 r/m
11000 ST(i)
s-i-b/displacement
s-i-b/displacement
s-i-b/displacement
FLD1 = Load Constant +1.0
11011 001
1110 1000
FLDCW = Load Control Word
11011 001
mod 101 r/m
Notes
Avg. (lo–hi)
3
3
6
4
2
3
4
4
s-i-b/displacement
4
2
INSTRUCTION
FORMAT
Clocks if
Cache Hit
Penalty if
Cache Miss
44
44
34
34
2
2
2
2
Notes
FLDENV = Load FPU Environment 11011 001
Real/Virtual Mode 16-bit addr.
Real/Virtual Mode 32-bit addr.
Protected Mode 16-bit address
Protected Mode 32-bit address
mod 100 r/m
FLDL2E = Load Constant log2(e)
11011 001
1110 1010
8
Concurr. Exec. 2
FLDL2T = Load Constant log2(10)
11011 001
1110 1001
8
Concurr. Exec. 2
FLDLG2 = Load Constant log10(2)
11011 001
1110 1100
8
Concurr. Exec. 2
FLDLN2 = Load Constant loge(2)
11011 001
1110 1101
8
Concurr. Exec. 2
11011 001
1110 1011
8
Concurr. Exec. 2
FLDZ = Load Constant +0.0
11011 001
1110 1110
4
FMUL = Multiply Real
ST(0) ← ST(0) ⋅ 32-bit mem
ST(0) ← ST(0) ⋅ 64-bit mem
ST(d) ← ST(0) ⋅ ST(i)
11011 000
11011 100
11011 d00
mod 001 r/m
mod 001 r/m
11001 ST(i)
FMULP = Multiply Real and Pop
Stack
11011 110
11001 ST(i)
16
FNCLEX = Clear Exceptions
without Checking for Error
11011 011
1110 0010
7
FNINIT = Initialize FPU without
Checking for Error
11011 011
1110 0011
17
FNOP = No operation
11011 001
1101 0000
3
FNSAVE = Store FPU State without Checking for Error
Real/Virtual Mode 16-bit addr.
Real/Virtual Mode 32-bit addr.
Protected Mode 16-bit address
Protected Mode 32-bit address
11011 101
mod 110 r/m
FNSTCW = Store Control Word
without Checking for Error
11011 001
mod 111 r/m
s-i-b/displacement
FNSTENV = Store FPU Environment without Checking for Error
Real/Virtual Mode 16-bit addr.
Real/Virtual Mode 32-bit addr.
Protected Mode 16-bit address
Protected Mode 32-bit address
11011 001
mod 110 r/m
s-i-b/displacement
FLDPI = Load Constant π
s-i-b/displacement
s-i-b/displacement
s-i-b/displacement
11
14
16
2
3
Concurr. Exec.
8
11
13
Concurr. Exec.
13
s-i-b/displacement
154
154
143
143
3
67
67
56
56
FNSTSW = Store Status Word
without Checking for Error
Into AX
Into memory
11011 111
11011 101
1110 0000
mod 111 r/m
FPATAN = Partial Arctangent
11011 001
1111 0011
Avg. (range)
289 (218–303)
Concurr. Exec.
Avg. (range)
5 (2–17)
Continuous INT
polling to ensure
short interrupt latency.
FPREM = Partial Remainder
11011 001
1111 1000
Avg. (range)
84 (70–138)
Concurr. Exec.
Avg. (range)
2 (2–8)
s-i-b/displacement
3
3
Clocks if
Cache Hit
Penalty if
Cache Miss
Notes
INSTRUCTION
FORMAT
FPREM1 = Partial Remainder
(IEEE 754 compliant)
11011 001
1111 0101
Avg. (range)
94.5 (72–167)
Concurr. Exec.
Avg. (range)
5.5 (2–18)
FPTAN = Partial Tangent
11011 001
1111 0010
Avg. (range)
244 (200–273)
If |ST(0)| > π/4,
add n, where
n = [ST(0)/(π/4)]
Concurr. Exec.
70
Continuous INT
polling to ensure
short interrupt latency.
FRNDINT = Round to Integer
11011 001
1111 1100
Avg. (range)
29.1 (21–30)
Concurr. Exec.
Avg. (range)
5.5 (2–18)
FRSTOR = Restore FPU State
Real/Virtual Mode 16-bit addr.
Real/Virtual Mode 32-bit addr.
Protected Mode 16-bit address
Protected Mode 32-bit address
11011 101
mod 100 r/m
FSAVE = Store FPU State after
checking for error
Real/Virtual Mode 16-bit addr.
No error pending
Error pending
Real/Virtual Mode 32-bit addr.
No error pending
Error pending
Protected Mode 16-bit address
No error pending
Error pending
Protected Mode 32-bit address
No error pending
Error pending
1001 1011
FSCALE = Scale
11011 001
1111 1101
Avg. (range)
31 (30–32)
Concurr. Exec.
2
FSIN = Sine
11011 001
1111 1110
Avg. (range)
241 (193–279)
If |ST(0)| > π/4,
add n, where
n = [ST(0)/(π/4)]
Concurr. Exec.
2
Continuous INT
polling to ensure
short interrupt latency.
FSINCOS = Sine and Cosine
11011 001
1111 1011
Avg. (range)
291(243–329)
If |ST(0)| > π/4,
add n, where
n = [ST(0)/(π/4)]
Concurr. Exec.
2
Continuous INT
polling to ensure
short interrupt latency.
FSQRT = Square Root
11011 001
1111 1010
Avg. (range)
85.5 (83–87)
Concurr. Exec.
70
FST = Store Real
32-bit memory
64-bit memory
ST(i)
11011 001
11011 101
11011 101
mod 010 r/m
mod 010 r/m
11010 ST(i)
s-i-b/displacement
s-i-b/displacement
7
8
3
If op.=0, clks=27
If op.=0, clks=28
1001 1011
11011 001
mod 111 r/m s-i-b/displacement
FSTCW = Store Control Word
after checking for error
No error pending
Error pending
s-i-b/displacement
131
131
120
120
11011 101
23
27
23
27
mod 110 r/m s-i-b/displacement
154
171
154
171
143
160
143
160
3
21
INSTRUCTION
FORMAT
FSTENV = Store Environment
after checking for error
Real/Virtual Mode/16-bit Addr.
No error pending
Error pending
Real/Virtual Mode/32-bit Addr.
No error pending
Error pending
Protected Mode/16-bit Addr.
No error pending
Error pending
Protected Mode/32-bit Addr.
No error pending
Error pending
1001 1011
FSTP = Store Real and Pop Stack
32-bit memory
64-bit memory
80-bit memory
ST(i)
FSTSW = Store Status Word after
checking for error
Into AX
No error pending
Error pending
In memory
No error pending
Error pending
Clocks if
Cache Hit
11011 001
Penalty if
Cache Miss
mod 110 r/m s-i-b/displacement
67
84
67
84
56
70
56
70
11011 001
11011 101
11011 011
11011 101
mod 011 r/m
mod 011 r/m
mod 111 r/m
11001 ST(i)
s-i-b/displacement
s-i-b/displacement
s-i-b/displacement
1001 1011
11011 111
1110 0000
7
8
6
3
If op.=0, clks=27
If op.=0, clks=28
3
21
1001 1011
11011 101
mod 111 r/m s-i-b/displacement
3
21
FSUB = Subtract Real
ST(0) ← ST(0) – 32-bit memory
ST(0) ← ST(0) – 64-bit memory
ST(d) ← ST(0) – ST(i)
FSUBP = Subtract Real and Pop
Stack
ST(i) ← ST(0) – ST(i)
11011 000
11011 100
11011 d00
mod 100 r/m
mod 100 r/m
11101 ST(i)
11011 110
11001 ST(i)
s-i-b/displacement
s-i-b/displacement
Avg. (range)
10 (8–20)
10 (8–20)
10 (8–20)
2
3
Avg. (range)
10 (8–10)
s-i-b/displacement
s-i-b/displacement
Avg. (range)
10 (8–20)
10 (8–20)
10 (8–20)
Concurr. Exec.
Avg. (range)
7 (5–17)
7 (5–17)
7 (5–17)
Concurr. Exec.
Avg. (range)
7 (5–17)
FSUBR = Reverse Subtract Real
ST(0) ← 32-bit memory – ST(0)
ST(0) ← 64-bit memory – ST(0)
ST(d) ← ST(i) – ST(0)
Notes
2
3
Concurr. Exec.
Avg. (range)
7 (5–17)
7 (5–17)
7 (5–17)
11011 000
11011 100
11011 d00
mod 101 r/m
mod 101 r/m
11100 ST(i)
FSUBRP = Reverse Subtract
Real and Pop Stack
ST(i) ← ST(i) – ST(0)
11011 110
11100 ST(i)
Avg. (range)
10 (8–10)
Concurr. Exec.
Avg. (range)
7 (5–17)
FTST = Compare ST(0) to 0.0
11011 001
1110 0100
4
Concurr. Exec. 1
FUCOM = Unordered Compare
Real – ST(0) to ST(i)
11011 101
11100 ST(i)
4
Concurr. Exec. 1
FUCOMP = Unordered Compare
Real and Pop Stack
11011 101
11101 ST(i)
4
Concurr. Exec. 1
FUCOMPP = Unordered
Compare Real and Pop Stack
Twice
11011 101
1110 1001
5
Concurr. Exec. 1
FWAIT = Wait
10011011
FXAM = Examine
11011 001
1 to 3
1110 0101
8
Clocks if
Cache Hit
Penalty if
Cache Miss
Notes
INSTRUCTION
FORMAT
FXCH = Exchange ST(0) and ST(i)
11011 001
11001 ST(i)
4
FXTRACT = Extract Exponent
and Significand
11011 001
1111 0100
Avg. (range)
19 (16–20)
FYL2X = Compute ST(1) ⋅ log2(ST(0))
11011 001
1111 0001
Avg. (range)
311 (196–329)
Concurr. Exec.
13
Continuous INT
polling to ensure
short interrupt latency.
FYL2XP1 = Compute ST(1) ⋅ log2(ST(0) + 1)
11011 001
1111 1001
Avg. (range)
313(171–326)
Concurr. Exec.
13
Continuous INT
polling to ensure
short interrupt latency.
HLT = Halt
11110100
IDIV = Integer Divide (signed)
accumulator by register
Divisor:
Byte
Word
Dword
accumulator by memory
Divisor
Byte
Word
Dword
1111011w
4
11 111 reg
19
27
43
1111011w
mod 111 r/m
20
28
44
Concurr. Exec.
Avg. (range)
4 (2–4)
INSTRUCTION
IMUL = Integer Multiply (signed)
accumulator with register
Multiplier:
Byte
Word
Dword
accumulator with memory
Multiplier:
Byte
Word
Dword
reg1 with reg2
Multiplier:
Byte
Word
Dword
register with memory
Multiplier:
Byte
Word
Dword
reg1 with imm. to reg2
Multiplier:
Byte
Word
Dword
mem. with imm. to reg.
Multiplier:
Byte
Word
Dword
IN = Input from Port
Fixed Port
Real Mode
Protected Mode:
CPL ≤ IOPL
CPL > IOPL
Virtual Mode
Variable Port
Real Mode
Protected Mode:
CPL ≤ IOPL
CPL > IOPL
Virtual Mode
INC = Increment
reg
or
memory
INS = Input String from Port
INSB = Input Byte from Port
INSD = Input Dword from Port
INSW = Input Word from Port
Real Mode
Protected Mode:
CPL ≤ IOPL
CPL > IOPL
Virtual Mode
Clocks if
Cache Hit
FORMAT
1111011w
Penalty if
Cache Miss
11 101 reg
13 to 18
13 to 26
13 to 42
1111011w
For all cases, clocks = 10 + max(log2(|m|), n), where m = multiplier and n = 3/5 for ± m.
If m = 0, clocks = 13.
mod 101 r/m
13 to 18
13 to 26
13 to 42
00001111
Notes
10101111
11 reg1 reg2
13 to 18
13 to 26
13 to 42
00001111
10101111
mod reg r/m
13 to 18
13 to 26
13 to 42
011010s1
11 reg1 reg2
1
1
1
immediate data
13 to 18
13 to 26
13 to 42
011010s1
mod reg r/m
immediate data
13 to 18
13 to 26
13 to 42
1110010w
2
2
2
port number
14
9
29
27
1110110w
14
8
28
27
1111111w
01000 reg
1111111w
11 000 reg
mod 000 r/m
1
1
3
6/2
No LOCK/LOCK
0110110w
17
10
32
30
INSTRUCTION
INT = Call to Interrupt Procedure
INT n = Interrupt Type n
Real Mode
Protected Mode:
Int/Trap Gate, same level
Int/Trap Gate, diff. level
Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
Virtual Mode:
Int/Trap Gate, diff. level
Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
INT 3 = Interrupt Type 3
Real Mode
Protected Mode:
Int/Trap Gate, same level
Int/Trap Gate, diff. level
Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
Virtual Mode:
Int/Trap Gate, diff. level
Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
Clocks if
Cache Hit
FORMAT
11001101
Notes
type
26
44
71
199
180
177







Add 11 clocks for each unaccessed descriptor load.
82
199
180
177
11001100
26
44
71
199
180
177







Add 11 clocks for each unaccessed descriptor load.
82
199
180
177
Hardware Interrupts:
External Interrupt
Real Mode
Protected Mode:
Int/Trap Gate, same level
Int/Trap Gate, diff. level
Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
Virtual Mode:
Int/Trap Gate, diff. level
Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
Penalty if
Cache Miss
37
55
82
210
191
188
93
210
191
188







Add 11 clocks for each unaccessed descriptor load.
INSTRUCTION
Clocks if
Cache Hit
FORMAT
Hardware Interrupts (continued):
NMI
Real Mode
Protected Mode:
Int/Trap Gate, same level
Int/Trap Gate, diff. level
Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
Virtual Mode:
Int/Trap Gate, diff. level
Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
Penalty if
Cache Miss
Notes
29
47
74
202
183
180
Add 11 clocks for each unaccessed descriptor load.


85
202
183
180
Page Fault
Real Mode
Protected Mode:
Int/Trap Gate, same level
Int/Trap Gate, diff. level
Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
Virtual Mode:
Int/Trap Gate, diff. level
Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
50
68
95
223
204
201







Add 11 clocks for each unaccessed descriptor load.
106
223
204
201
INTO = Interrupt 4 if OF=1
Taken:
Real Mode
Protected Mode:
Int/Trap Gate, same level
Int/Trap Gate, diff. level
Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
Virtual Mode:
Int/Trap Gate, diff. level
Task Gate:
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
Not taken
11001110
INVD = Invalidate Cache
00001111
00001000
INVLPG = Invalidate TLB Entry
Hit
No hit
00001111
00000001
28
46
73
201
182
179







Add 11 clocks for each unaccessed descriptor load.
84
201
182
179
3
4
mod 111 r/m
12
11
INSTRUCTION
FORMAT
IRET/IRETD = Interrupt Return
Real Mode/Virtual Mode
Protected Mode:
To same level
To outer level
To nested task (NT=1):
VM/486/286 TSS to 486 TSS
VM/486/286 TSS to 286 TSS
VM/486/286 TSS to VM TSS
11001111
JA = Jump if Above
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JAE = Jump if Above/Equal
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JB = Jump if Below
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JBE = Jump if Below/Equal
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JC = Jump if Carry
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JCXZ = Jump Short if CX=0
8-bit displacement
Jump taken
Jump not taken
JE = Jump if Equal
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
01110111
00001111
Clocks if
Cache Hit
Penalty if
Cache Miss
15
8
20
36
11
19
194
175
172
59
35
41
3
1
See factor 6,
p. E-1
Notes







Add 11
clocks for each
unaccessed
descriptor
load.
8-bit displacement
10000111
full displacement
3
1
01110011
8-bit displacement
3
1
00001111
10000011
See factor 6,
p. E-1
full displacement
3
1
01110010
8-bit displacement
3
1
00001111
10000010
See factor 6,
p. E-1
full displacement
3
1
01110110
8-bit displacement
3
1
00001111
10000110
See factor 6,
p. E-1
full displacement
3
1
01110010
8-bit displacement
3
1
00001111
10000010
See factor 6,
p. E-1
full displacement
3
1
11100011
01110100
00001111
8-bit displacement
8
5
See factor 6,
p. E-1
3
1
See factor 6,
p. E-1
8-bit displacement
10000100
full displacement
3
1
Add 11
clocks for each
unaccessed
descriptor
load.
INSTRUCTION
JECXZ = Jump Short if ECX=0
8-bit displacement
Jump taken
Jump not taken
JG = Jump if Greater
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JGE = Jump if Greater/Equal
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JL = Jump if Less
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JLE = Jump if Less/Equal
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
FORMAT
11100011
01111111
00001111
Clocks if
Cache Hit
Penalty if
Cache Miss
8
5
See factor 6,
p. E-1
3
1
See factor 6,
p. E-1
Notes
8-bit displacement
8-bit displacement
10001111
full displacement
3
1
01111101
8-bit displacement
3
1
00001111
10001101
See factor 6,
p. E-1
full displacement
3
1
01111100
8-bit displacement
3
1
00001111
10001100
See factor 6,
p. E-1
full displacement
3
1
01111110
8-bit displacement
3
1
00001111
10001110
See factor 6,
p. E-1
full displacement
3
1
INSTRUCTION
FORMAT
JMP = Jump
within segment
Short
Direct
Register indirect
Memory indirect
11101011
11101001
11111111
11111111
11101010
direct intersegment
Clocks if
Cache Hit
Penalty if
Cache Miss
8-bit displacement
full displacement
11 100 reg
mod 100 r/m
3
3
5
5
See factor 6,
p. E-1
unsigned full offset, selector
17
2
19
32
3
6
204
185
182
3
3
3
205
186
183
3
3
3
13
9
18
31
10
13
203
184
181
10
10
10
204
185
182
10
10
10
3
1
See factor 6,
p. E-1
to same level
thru Call Gate to same level
thru TSS
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
thru Task Gate
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
indirect intersegment
11111111
mod 101 r/m
to same level
thru Call Gate to same level
thru TSS
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
thru Task Gate
VM/486/286 to 486 TSS
VM/486/286 to 286 TSS
VM/486/286 to VM TSS
JNA = Jump if Not Above
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
01110110
00001111
5
8-bit displacement
10000110
full displacement
3
1
Notes
Assumes mem.
rd, stack push/
pop, and branch
in diff. cache
sets.
Real Mode;
assumes mem.
rd, stack push/
pop, and branch
in diff. cache
sets; clocks
include 1 for
displacement+
immediate










Add 11
clocks for each
unaccessed
descriptor
load.
Real Mode;
assumes mem.
rd, stack push/
pop, and branch
in diff. cache
sets; add 11
clocks for each
unaccessed
descriptor load.










Add 11
clocks for each
unaccessed
descriptor
load.
INSTRUCTION
JNAE = Jump if Not Above/Equal
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JNB = Jump if Not Below
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JNBE = Jump if Not Below/Equal
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JNC = Jump if Not Carry
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JNE = Jump if Not Equal
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JNG = Jump if Not Greater
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JNGE = Jump if Not Greater/Equal
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
FORMAT
01110010
00001111
Clocks if
Cache Hit
Penalty if
Cache Miss
3
1
See factor 6,
p. E-1
Notes
8-bit displacement
10000010
full displacement
3
1
01110011
8-bit displacement
3
1
00001111
10000011
See factor 6,
p. E-1
full displacement
3
1
01110111
8-bit displacement
3
1
00001111
10000111
See factor 6,
p. E-1
full displacement
3
1
01110011
8-bit displacement
3
1
00001111
10000011
See factor 6,
p. E-1
full displacement
3
1
01110101
8-bit displacement
3
1
00001111
10000101
See factor 6,
p. E-1
full displacement
3
1
01111110
8-bit displacement
3
1
00001111
10001110
See factor 6,
p. E-1
full displacement
3
1
01111100
8-bit displacement
3
1
00001111
10001100
See factor 6,
p. E-1
full displacement
3
1
INSTRUCTION
JNL = Jump if Not Less
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JNLE = Jump if Not Less/Equal
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JNO = Jump if Not Overflow
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JNP = Jump if Not Parity
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JNS = Jump if Not Sign
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JNZ = Jump if Not Zero
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JO = Jump if Overflow
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
FORMAT
01111101
00001111
Clocks if
Cache Hit
Penalty if
Cache Miss
3
1
See factor 6,
p. E-1
8-bit displacement
10001101
full displacement
3
1
01111111
8-bit displacement
3
1
00001111
10001111
See factor 6,
p. E-1
full displacement
3
1
01110001
8-bit displacement
3
1
00001111
10000001
See factor 6,
p. E-1
full displacement
3
1
01111011
8-bit displacement
3
1
00001111
10001011
See factor 6,
p. E-1
full displacement
3
1
01111001
8-bit displacement
3
1
00001111
10001001
See factor 6,
p. E-1
full displacement
3
1
01110111
8-bit displacement
3
1
00001111
10000111
See factor 6,
p. E-1
full displacement
3
1
01110101
8-bit displacement
3
1
00001111
10000101
full displacement
3
1
See factor 6,
p. E-1
Notes
INSTRUCTION
JP = Jump if Parity
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JPE = Jump if Parity Even
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JPO = Jump if Parity Odd
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JS = Jump if Sign
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
JZ = Jump if Zero
8-bit displacement
Jump taken
Jump not taken
Full displacement
Jump taken
Jump not taken
FORMAT
01111010
00001111
Clocks if
Cache Hit
Penalty if
Cache Miss
3
1
See factor 6,
p. E-1
Notes
8-bit displacement
10001010
full displacement
3
1
01111010
8-bit displacement
3
1
00001111
10001010
See factor 6,
p. E-1
full displacement
3
1
01111011
8-bit displacement
3
1
00001111
10001011
See factor 6,
p. E-1
full displacement
3
1
01111000
8-bit displacement
3
1
00001111
10001000
See factor 6,
p. E-1
full displacement
3
1
01110100
8-bit displacement
3
1
00001111
10000100
See factor 6,
p. E-1
full displacement
3
1
LAHF = Load Flags into AH
10011111
3
LAR = Load Access Rights Byte
From Register
From Memory
00001111
00001111
00000010
00000010
LDS = Load Pointer Using DS
Real and Virtual Mode
Protected Mode
11000101
mod reg r/m
LEA = Load EA to Register
no index register
with index register
10001101
LEAVE = Leave Procedure
11001001
LES = Load Pointer Using ES
11000100
11 reg1 reg2
mod reg r/m
11
11
3
5
6
12
7
10
Add 11 clocks
for each
unaccessed
descriptor load.
mod reg r/m
1
2
5
1
6
12
7
10
mod reg r/m
Add 11 clocks
for each
unaccessed
descriptor load.
INSTRUCTION
FORMAT
LFS = Load Pointer Using FS
00001111
10110100
Clocks if
Cache Hit
Penalty if
Cache Miss
Notes
6
12
7
10
Add 11 clocks
for each
unaccessed
descriptor load.
12
5
6
12
7
10
mod reg r/m
LGDT = Load Global Descriptor
Table Register
00001111
00000001
mod 010 r/m
LGS = Load Pointer Using GS
00001111
10110101
mod reg r/m
LIDT = Load Interrupt Descriptor
Table Register
00001111
00000001
mod 011 r/m
12
5
LLDT = Load Local Descriptor
Table Register from Register
Table Register from Memory
00001111
00001111
00000000
00000000
11 010 reg
mod 010 r/m
11
11
3
6
00000001
00000001
11 110 reg
mod 110 r/m
13
13
1
Add 11 clocks
for each
unaccessed
descriptor load.
LMSW = Load Machine Status Word
From Register
From Memory
00001111
00001111
LOCK = Assert LOCK Signal
11110000
1
LODS = Load String
LODSB = Load String Byte
LODSD = Load String Dword
LODSW = Load String Word
1010110w
5
2
LOOP = Loop CX times
Loop
No loop
11100010
7
6
See factor 6,
p. E-1.
LOOPE = Loop if Equal
Loop
No loop
11100001
9
6
See factor 6,
p. E-1.
LOOPNE = Loop if Not Equal
Loop
No loop
11100000
9
6
See factor 6,
p. E-1.
LOOPNZ = Loop if Not Zero
Loop
No loop
11100000
9
6
See factor 6,
p. E-1.
LOOPZ = Loop if Zero
Loop
No loop
11100001
9
6
See factor 6,
p. E-1.
10
10
3
6
6
12
7
10
8-bit displacement
8-bit displacement
8-bit displacement
8-bit displacement
8-bit displacement
LSL = Load Segment Limit
From Register
From Memory
00001111
00001111
00000011
00000011
11 reg1 reg2
mod reg r/m
LSS = Load Pointer using SS
00001111
10110010
mod reg r/m
LTR = Load Task Register
From Register
From Memory
00001111
00001111
Prefix
00000000
00000000
11 001 reg
mod 001 r/m
20
20
Add 11 clocks
for each
unaccessed
descriptor load.
INSTRUCTION
MOV = Move
reg1 to reg2
reg2 to reg1
memory to reg
reg to memory
immediate to reg
or
Immediate to Memory
Memory to Accumulator
Accumulator to Memory
reg to segment reg
Real and Virtual Mode
Protected Mode
memory to segment reg
Real and Virtual Mode
Protected Mode
segment reg to reg
segment reg to memory
CR0 from Register
CR2 from Register
CR3 from Register
Register from CR0
Register from CR2
Register from CR3
DR0 from Register
DR1 from Register
DR2 from Register
DR3 from Register
DR6 from Register
DR7 from Register
Register from DR0
Register from DR1
Register from DR2
Register from DR3
Register from DR6
Register from DR7
TR3 from Register
TR4 from Register
TR5 from Register
TR6 from Register
TR7 from Register
Register from TR3
Register from TR4
Register from TR5
Register from TR6
Register from TR7
MOVS = Move String to String
MOVSB = Move Byte to Byte
MOVSD = Move Dword to Dword
MOVSW = Move Word to Word
Clocks if
Cache Hit
FORMAT
1000100W
1000101W
1000101w
1000100w
1100011w
1011w reg
1100011w
1010000w
1010001w
10001110
10001110
10001100
10001100
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
00001111
11 reg1 reg2
11 reg1 reg2
mod reg r/m
mod reg r/m
11000 reg
immediate data
mod 000 r/m
full displacement
full displacement
11 sreg3 reg
immediate data
displacement immediate
1
1
1
1
1
1
1
1
1
Penalty if
Cache Miss
2
2
3
9
0
3
3
9
3
3
17
4
4
4
4
4
10
10
10
10
10
10
9
9
9
9
9
9
4
4
4
4
4
3
4
4
4
4
2
5
7
2
mod sreg3 r/m
11 sreg3 reg
mod sreg3 r/m
00100010
00100010
00100010
00100000
00100000
00100000
00100011
00100011
00100011
00100011
00100011
00100011
00100001
00100001
00100001
00100001
00100001
00100001
00100110
00100110
00100110
00100110
00100110
00100100
00100100
00100100
00100100
00100100
11 000 reg
11 010 reg
11 011 reg
11 000 reg
11 010 reg
11 011 reg
11 000 reg
11 001 reg
11 010 reg
11 011 reg
11 110 reg
11 111 reg
11 000 reg
11 001 reg
11 010 reg
11 011 reg
11 110 reg
11 111 reg
11 011 reg
11 100 reg
11 101 reg
11 110 reg
11 111 reg
11 011 reg
11 100 reg
11 101 reg
11 110 reg
11 111 reg
1010010w
Notes
2
 Add 11
 clocks for each
 unaccessed
 descriptor
 load.
For sreg3:
CS = 001
DS = 011
ES = 000
FS = 100
GS = 101
SS = 010
Assumes the two
string addresses
fall in different
cache sets.
MOVSX = Move with Sign Extension
reg2 to reg1
memory to reg
00001111
00001111
1011111w
1011111w
11 reg1 reg2
mod reg r/m
3
3
2
1011011w
1011011w
11 reg1 reg2
mod reg r/m
3
3
2
MOVZX = Move with Zero Extension
reg2 to reg1
memory to reg
00001111
00001111
INSTRUCTION
MUL = Multiply (unsigned)
accumulator with register
Multiplier:
Byte
Word
Dword
accumulator with memory
Multiplier:
Byte
Word
Dword
Clocks if
Cache Hit
FORMAT
1111011w
13 to 18
13 to 26
13 to 42
11 100 reg
NOP = No Operation
10010000
NOT = Logical Complement
reg
memory
1111011w
1111011w
Operand Size
01100110
OR = Logical OR
reg1 to reg2
reg2 to reg1
memory to register
register to memory
immediate to register
immediate to accumulator
immediate to memory
0000100w
0000101w
0000101w
0000100w
100000sw
0000110w
100000sw
11 reg1 reg2
11 reg1 reg2
mod reg r/m
mod reg r/m
11 001 reg
immediate data
mod 001 r/m
1110011w
port number
13 to 18
13 to 26
13 to 42
1
1
1
For all cases,
clocks = 10 +
max(log2(|m|),n)
where
m = multiplier
n = 3/5 for ± m
if m = 0,
clocks = 13
1
3
6/2
No LOCK/LOCK
6/2
No LOCK/LOCK
mod 100 r/m
1111011w
1111011w
OUTS = Output String to Port
OUTSB = Output Byte to Port
OUTSD = Output Dword to Port
OUTSW = Output Word to Port
Real Mode
Protected Mode:
CPL ≤ IOPL
CPL > IOPL
Virtual Mode
Notes
11 100 reg
NEG = Negate
reg
memory
OUT = Output to Port
Fixed Port
Real Mode
Protected Mode:
CPL ≤ IOPL
CPL > IOPL
Virtual Mode
Variable Port
Real Mode
Protected Mode:
CPL ≤ IOPL
CPL > IOPL
Virtual Mode
Penalty if
Cache Miss
11 011 reg
mod 011 r/m
1
11 010 reg
mod 010 r/m
1
3
1
immediate register
immediate data
1
1
2
3
1
1
3
Prefix
2
6/2
No LOCK/LOCK
6/2
No LOCK/LOCK
16
11
31
29
1110111w
16
10
30
29
0110111w
17
2
10
32
30
2
2
2
INSTRUCTION
POP = Pop
reg
or
memory
segment registers:
CS
Real or Virtual Mode
Protected Mode
DS
Real or Virtual Mode
Protected Mode
ES
Real or Virtual Mode
Protected Mode
FS
Real or Virtual Mode
Protected Mode
GS
Real or Virtual Mode
Protected Mode
SS
Real or Virtual Mode
Protected Mode
POPA = Pop All (16-bit)
FORMAT
10001111
01011 reg
10001111
11 000 reg
mod 000 r/m
Clocks if
Cache Hit
Penalty if
Cache Miss
4
1
5
1
2
2
3
9
2
5
3
9
2
5
3
9
2
5
3
9
2
5
3
9
2
5
3
9
2
5
9
7/15
16/32
1
Assumes operand and stack
addresses are in
different cache
sets.
000 01 111
000 11 111
000 00 111
00001111
00001111
10 100 001
10 101 001
000 10 111
01100001
Notes





 Add 11
 clocks for each
 unaccessed
 descriptor
 load.








POPAD = Pop All (32-bit)
POPF = Pop into FLAGS
Virtual and Real Mode
Protected Mode
10011101
9
6
POPFD = Pop into EFLAGS
PUSH = Push
reg
or
memory
11111111
01010 reg
11111111
11 110 reg
mod 110 r/m
4
1
4
immediate
segment registers:
CS
DS
ES
FS
GS
SS
011010s0
immediate data
1
PUSHA = Push All (16-bit)
000 01 110
000 11 110
000 00 110
00001111
00001111
000 10 110
10 100 000
10 101 000
01100000
3
3
3
3
3
3
11
PUSHAD = Push All (32-bit)
PUSHF = Push FLAGS
Real and Virtual Mode
Protected Mode
10011100
4
3
PUSHFD = Push EFLAGS
Clocks if
Cache Hit
Penalty if
Cache Miss
11 010 reg
mod 010 r/m
3
4
6
1101001w
1101001w
1100000w
1100000w
11 010 reg
mod 010 r/m
11 010 reg
mod 010 r/m
8 to 30
9 to 31
8 to 30
9 to 31
1101000w
1101000w
11 011 reg
mod 011 r/m
3
4
1101001w
1101001w
1100000w
1100000w
11 011 reg
mod 011 r/m
11 011 reg
mod 011 r/m
8 to 30
9 to 31
8 to 30
9 to 31
11110010
1010110w
INSTRUCTION
FORMAT
RCL = Rotate thru Carry Left
reg by 1
memory by 1
1101000w
1101000w
reg by CL
memory by CL
reg by immediate count
mem by immediate count
RCR = Rotate thru Carry Right
reg by 1
memory by 1
reg by CL
memory by CL
reg by immediate count
mem by immediate count
REP = Repeat String Instruction
REP LODS = Load String
c=0
c>0
immediate 8-bit data
immediate 8-bit data
immediate 8-bit data
immediate 8-bit data
if CL ≤ op. length,
clocks = 8 (mem)
or 9 (reg)
else clocks = 9 +
(CL/op. length)*7
if CL ≤ op. length,
clocks = 8 (mem)
or 9 (reg)
else clocks = 9 +
(CL/op. length)*7
c = (E)CX count
5
7 + 4c
REP INS = Input String
Real Mode
Protected Mode:
CPL ≤ IOPL
CPL > IOPL
Virtual Mode
11110010
REP MOVS = Move String
c=0
c=1
11110010
6 per 16 bytes on
first load
Assumes string
addresses in diff.
cache sets.
1
Assumes string
addresses in diff.
cache sets.
Assumes string
addresses in diff.
cache sets.
0110110w
16 + 8c
10 + 8c
30 + 8c
29 + 8c
1010010w
5
13
c>1
REP OUTS = Output String
Real Mode
Protected Mode:
CPL ≤ IOPL
CPL > IOPL
Virtual Mode
11110010
REP STOS = Store String
c=0
c>0
11110010
6
Notes
12 + 3c
4 per 16 bytes; 1
on first move and
3 on second
17 + 5c
2 per 16 bytes
11 + 5c
31 + 5c
30 + 5c
2 per 16 bytes
2 per 16 bytes
2 per 16 bytes
0110111w
1010101w
5
7 + 4c
For all
REP OUTS, the
entire penalty is
on the second
operation.
INSTRUCTION
REPE = Repeat if Equal
REPE CMPS = Compare String
c=0
c>0
REPE SCAS = Scan String
c=0
c>0
REPNE = Repeat if Not Equal
REPNE CMPS = Comp. String
c=0
c>0
REPNE SCAS = Scan String
c=0
c>0
REPNZ = Repeat if Not Zero
REPNZ CMPS = Comp. String
c=0
c>0
REPNZ SCAS = Scan String
c=0
c>0
REPZ = Repeat if Zero
REPZ CMPS = Compare String
c=0
c>0
REPZ SCAS = Scan String
c=0
c>0
Clocks if
Cache Hit
FORMAT
Penalty if
Cache Miss
Notes
c = (E)CX count
11110011
1010011w
5
7 + 7c
11110011
6 per 16 bytes; all
on first compare
1010111w
5
7 + 5c
Assumes string
addresses fall in
different cache
sets
4 per 16 bytes; 2
on first and 2 on
second compare
c = (E)CX count
11110010
1010011w
5
7 + 7c
11110010
6 per 16 bytes; all
on first compare
1010111w
5
7 + 5c
Assumes string
addresses fall in
different cache
sets
4 per 16 bytes; 2
on first and 2 on
second compare
c = (E)CX count
11110010
1010011w
5
7 + 7c
11110010
6 per 16 bytes; all
on first compare
1010111w
5
7 + 5c
Assumes string
addresses fall in
different cache
sets
4 per 16 bytes; 2
on first and 2 on
second compare
c = (E)CX count
11110011
1010011w
5
7 + 7c
11110011
6 per 16 bytes; all
on first compare
1010111w
5
7 + 5c
Assumes string
addresses fall in
different cache
sets
4 per 16 bytes; 2
on first and 2 on
second compare
INSTRUCTION
FORMAT
RET = Return
within segment
Adding imm. to SP
intersegment
11000011
11000010
11001011
16-bit displacement
to same level
to outer level
intersegment add imm. to SP
11001010
16-bit displacement
to same level
to outer level
ROL = Rotate Left
reg by 1
memory by 1
reg by CL
memory by CL
reg by immediate count
mem by immediate count
1101000w
1101000w
1101001w
1101001w
1100000w
1100000w
11 000 reg
mod 000 r/m
11 000 reg
mod 000 r/m
11 000 reg
mod 000 r/m
ROR = Rotate Right
reg by 1
memory by 1
reg by CL
memory by CL
reg by immediate count
mem by immediate count
1101000w
1101000w
1101001w
1101001w
1100000w
1100000w
11 001 reg
mod 001 r/m
11 001 reg
mod 001 r/m
11 001 reg
mod 001 r/m
SAHF = Store AH into Flags
10011110
SAL = Shift Arithmetic Left
reg by 1
memory by 1
reg by CL
memory by CL
reg by immediate count
mem by immediate count
1101000w
1101000w
1101001w
1101001w
1100000w
1100000w
11 100 reg
mod 100 r/m
11 100 reg
mod 100 r/m
11 100 reg
mod 100 r/m
SAR = Shift Arithmetic Right
reg by 1
memory by 1
reg by CL
memory by CL
reg by immediate count
mem by immediate count
1101000w
1101000w
1101001w
1101001w
1100000w
1100000w
11 111 reg
mod 111 r/m
11 111 reg
mod 111 r/m
11 111 reg
mod 111 r/m
Clocks if
Cache Hit
Penalty if
Cache Miss
5
5
13
5
5
8
17
35
9
12
14
8
18
36
9
12
immediate 8-bit data
immediate 8-bit data
3
4
3
4
2
4
immediate 8-bit data
immediate 8-bit data
3
4
3
4
2
4
6
6
6
6
6
6
2
immediate 8-bit data
immediate 8-bit data
3
4
3
4
2
4
immediate 8-bit data
immediate 8-bit data
3
4
3
4
2
4
6
6
6
6
6
6
Notes
Real Mode;
assumes mem.
rd, stack push/
pop, and branch
in diff. cache
sets.
Protected Mode;
add 11 clocks per
unaccessed
descriptor load.
Real Mode;
assumes mem.
rd, stack push/
pop, and branch
in diff. cache
sets.
Protected Mode;
add 11 clocks per
unaccessed
descriptor load.
INSTRUCTION
FORMAT
SBB = Subtract with Borrow
reg1 to reg2
reg2 to reg1
memory to register
register to memory
immediate to register
immediate to accumulator
immediate to memory
0001100w
0001101w
0001101w
0001100w
100000sw
0001110w
100000sw
SCAS = Scan String
SCASB = Scan Byte
SCASD = Scan Dword
SCASW = Scan Word
Segment Override
CS
DS
ES
FS
GS
SS
SETA = Set Byte if Above
Register
True
False
Memory
True
False
SETAE = Set Byte if Above or
Equal
Register
True
False
Memory
True
False
SETB = Set Byte if Below
Register
True
False
Memory
True
False
SETBE = Set Byte if Below or
Equal
Register
True
False
Memory
True
False
Clocks if
Cache Hit
11 reg1 reg2
11 reg1 reg2
mod reg r/m
mod reg r/m
11 011 reg
immediate data
mod 011 r/m
immediate register
immediate data
1
1
2
3
1
1
3
1010111w
6
00101110
00111110
00100110
01100100
01100101
00110110
1
1
1
1
1
1
Penalty if
Cache Miss
Notes
2
6/2
No LOCK/LOCK
6/2
No LOCK/LOCK
2
Prefix
00001111
10010111
11 000 reg
4
3
00001111
10010111
mod 000 r/m
3
4
00001111
10010011
11 000 reg
4
3
00001111
10010011
mod 000 r/m
3
4
00001111
10010010
11 000 reg
4
3
00001111
10010010
mod 000 r/m
3
4
00001111
10010110
11 000 reg
4
3
00001111
10010110
mod 000 r/m
3
4
INSTRUCTION
SETC = Set Byte if Carry
Register
True
False
Memory
True
False
SETE = Set Byte if Equal
Register
True
False
Memory
True
False
SETG = Set Byte if Greater
Register
True
False
Memory
True
False
SETGE = Set Byte if Greater or
Equal
Register
True
False
Memory
True
False
SETL = Set Byte if Less
Register
True
False
Memory
True
False
SETLE = Set Byte if Less or Equal
Register
True
False
Memory
True
False
SETNA = Set Byte if Not Above
Register
True
False
Memory
True
False
Clocks if
Cache Hit
FORMAT
00001111
10010010
11 000 reg
4
3
00001111
10010010
mod 000 r/m
3
4
00001111
10010100
11 000 reg
4
3
00001111
10010100
mod 000 r/m
3
4
00001111
10011111
11 000 reg
4
3
00001111
10011111
mod 000 r/m
3
4
00001111
10011101
11 000 reg
4
3
00001111
10011101
mod 000 r/m
3
4
00001111
10011100
11 000 reg
4
3
00001111
10011100
mod 000 r/m
3
4
00001111
10011110
11 000 reg
4
3
00001111
10011110
mod 000 r/m
3
4
00001111
10010110
11 000 reg
4
3
00001111
10010110
mod 000 r/m
3
4
Penalty if
Cache Miss
Notes
INSTRUCTION
SETNAE = Set Byte if Not Above
or Equal
Register
True
False
Memory
True
False
SETNB = Set Byte if Not Below
Register
True
False
Memory
True
False
SETNBE = Set Byte if Not Below
or Equal
Register
True
False
Memory
True
False
SETNC = Set Byte if Not Carry
Register
True
False
Memory
True
False
SETNE = Set Byte if Not Equal
Register
True
False
Memory
True
False
SETNG = Set Byte if Not Greater
Register
True
False
Memory
True
False
SETNGE = Set Byte if Not Greater
or Equal
Register
True
False
Memory
True
False
Clocks if
Cache Hit
FORMAT
00001111
10010010
Penalty if
Cache Miss
Notes
11 000 reg
4
3
00001111
10010010
mod 000 r/m
3
4
00001111
10010011
11 000 reg
4
3
00001111
10010011
mod 000 r/m
3
4
00001111
10010111
11 000 reg
4
3
00001111
10010111
mod 000 r/m
3
4
00001111
10010011
11 000 reg
4
3
00001111
10010011
mod 000 r/m
3
4
00001111
10010101
11 000 reg
4
3
00001111
10010101
mod 000 r/m
3
4
00001111
10011110
11 000 reg
4
3
00001111
10011110
mod 000 r/m
3
4
00001111
10011100
11 000 reg
4
3
00001111
10011100
mod 000 r/m
3
4
INSTRUCTION
SETNL = Set Byte if Not Less
Register
True
False
Memory
True
False
SETNLE = Set Byte if Not Less/
Equal
Register
True
False
Memory
True
False
SETNO = Set Byte if Not Overflow
Register
True
False
Memory
True
False
SETNP = Set Byte if Not Parity
Register
True
False
Memory
True
False
SETNS = Set Byte if Not Sign
Register
True
False
Memory
True
False
SETNZ = Set Byte if Not Zero
Register
True
False
Memory
True
False
SETO = Set Byte if Overflow
Register
True
False
Memory
True
False
Clocks if
Cache Hit
FORMAT
00001111
10011101
11 000 reg
4
3
00001111
10011101
mod 000 r/m
3
4
00001111
10011111
11 000 reg
4
3
00001111
10011111
mod 000 r/m
3
4
00001111
10010001
11 000 reg
4
3
00001111
10010001
mod 000 r/m
3
4
00001111
10011011
11 000 reg
4
3
00001111
10011011
mod 000 r/m
3
4
00001111
10011001
11 000 reg
4
3
00001111
10011001
mod 000 r/m
3
4
00001111
10010101
11 000 reg
4
3
00001111
10010101
mod 000 r/m
3
4
00001111
10010000
11 000 reg
4
3
00001111
10010000
mod 000 r/m
3
4
Penalty if
Cache Miss
Notes
INSTRUCTION
SETP = Set Byte if Parity
Register
True
False
Memory
True
False
SETPE = Set Byte if Parity Even
Register
True
False
Memory
True
False
SETPO = Set Byte if Parity Odd
Register
True
False
Memory
True
False
SETS = Set Byte if Sign
Register
True
False
Memory
True
False
SETZ = Set Byte if Zero
Register
True
False
Memory
True
False
Clocks if
Cache Hit
FORMAT
00001111
10011010
Penalty if
Cache Miss
Notes
11 000 reg
4
3
00001111
10011010
mod 000 r/m
3
4
00001111
10011010
11 000 reg
4
3
00001111
10011010
mod 000 r/m
3
4
00001111
10011011
11 000 reg
4
3
00001111
10011011
mod 000 r/m
3
4
00001111
10011000
11 000 reg
4
3
00001111
10011000
mod 000 r/m
3
4
00001111
10010100
11 000 reg
4
3
00001111
10010100
mod 000 r/m
3
4
SGDT = Store Global Descriptor
Table Register
00001111
00000001
SHL = Shift Logical Left
reg by 1
memory by 1
reg by CL
memory by CL
reg by immediate count
mem by immediate count
1101000w
1101000w
1101001w
1101001w
1100000w
1100000w
SHLD = Shift Left Double
Precision
reg by immediate count
mem by immediate count
reg by CL
memory by CL
00001111
00001111
00001111
00001111
mod 000 r/m
10
11 100 reg
mod 100 r/m
11 100 reg
mod 100 r/m
11 100 reg
mod 100 r/m
immediate 8-bit data
immediate 8-bit data
3
4
3
4
2
4
10100100
10100100
10100101
10100101
11 reg2 reg1 imm. 8-bit
mod reg r/m imm. 8-bit
11 reg2 reg1
mod reg r/m
2
3
3
4
6
6
6
6
5
INSTRUCTION
FORMAT
SHR = Shift Logical Right
reg by 1
memory by 1
reg by CL
memory by CL
reg by immediate count
mem by immediate count
1101000w
1101000w
1101001w
1101001w
1100000w
1100000w
Clocks if
Cache Hit
11 101 reg
mod 101 r/m
11 101 reg
mod 101 r/m
11 101 reg
mod 101 r/m
immediate 8-bit data
immediate 8-bit data
3
4
3
4
2
4
00001111
00001111
00001111
00001111
10101100
10101100
10101101
10101101
11 reg2 reg1 imm. 8-bit
mod reg r/m imm. 8-bit
11 reg2 reg1
mod reg r/m
2
3
3
4
SIDT = Store Interrupt Descriptor
Table Register
00001111
00000001
mod 001 r/m
10
SLDT = Store Local Descriptor
Table Register to register
Table Register to memory
00001111
00001111
00000000
00000000
11 000 reg
mod 000 r/m
2
3
00000001
00000001
11 100 reg
mod 000 r/m
2
3
Penalty if
Cache Miss
Notes
6
6
6
SHRD = Shift Right Double Precision
reg by immediate count
mem by immediate count
reg by CL
memory by CL
6
5
SMSW = Store Machine Status Word
To register
To memory
00001111
00001111
STC = Set Carry Flag
11111001
2
STD = Set Direction Flag
11111101
2
STI = Set Interrupt-Enable Flag
11111011
2
STOS = Store String
STOSB = Store String Byte
STOSD = Store String Dword
STOSW = Store String Word
1010101w
5
STR = Store Task Register
To register
To memory
00001111
00001111
00000000
00000000
SUB = Subtract
reg1 to reg2
reg2 to reg1
memory to register
register to memory
immediate to register
immediate to accumulator
immediate to memory
0010100w
0010101w
0010101w
0010100w
100000sw
0010110w
100000sw
11 reg1 reg2
11 reg1 reg2
mod reg r/m
mod reg r/m
11 101 reg
immediate data
mod 101 r/m
TEST = Logical Compare
reg1 and reg2
memory and register
immediate and register
immediate and accumulator
immediate and memory
1000010w
1000010w
1111011w
1010100w
1111011w
11 reg1 reg2
mod reg r/m
11 000 reg
immediate data
mod 000 r/m
immediate data
immediate data
1
2
1
1
2
VERR = Verify Read
Register
Memory
00001111
00001111
00000000
00000000
11 100 r/m
mod 100 r/m
11
11
3
7
VERW = Verify Write
Register
Memory
00001111
00001111
00000000
00000000
11 101 reg
mod 101 r/m
11
11
3
7
WAIT = Wait
10011011
11 001 reg
mod 001 r/m
immediate register
immediate data
2
3
1
1
2
3
1
1
3
1 to 3
2
6/2
No LOCK/LOCK
6/2
No LOCK/LOCK
2
2
Clocks if
Cache Hit
INSTRUCTION
FORMAT
WBINVD = Writeback and
Invalidate Cache
00001111
00001001
00001111
00001111
1100000w
1100000w
1000011w
10010 reg
1000011w
11 reg1 reg2
XADD = Exchange and Add
reg1, reg2
memory, reg
Penalty if
Cache Miss
Notes
6/2
No LOCK/LOCK
5
11 reg2 reg1
mod reg r/m
3
4
XCHG = Exchange
reg1 with reg2
Accumulator with reg
Memory with reg
XLAT/XLATB = Table Look-Up
Translation
XOR = Logical Exclusive OR
reg1 to reg2
reg2 to reg1
memory to register
register to memory
immediate to register
immediate to accumulator
immediate to memory
3
3
5
mod reg r/m
11010111
0011000w
0011001w
0011001w
0011000w
100000sw
0011010w
100000sw
4
11 reg1 reg2
11 reg1 reg2
mod reg r/m
mod reg r/m
11 110 reg
immediate data
mod 110 r/m
immediate register
immediate data
1
1
2
3
1
1
3
2
2
6/2
No LOCK/LOCK
6/2
No LOCK/LOCK
E.3 General Instruction Encoding
Figure E-1  General Instruction Format
Figure E-1 shows the general instruction format. All instruction encodings are subsets of
this format. Instructions include one or two primary opcode bytes, possibly an address
specifier consisting of the “mod r/m” byte and “scale-index-base” byte, a displacement if
required, and an immediate data field if required. Within the primary opcode or opcodes,
smaller encoding fields can be defined. These fields vary according to the class of operation.
The fields define such information as direction of the operation, size of the displacements,
register encoding, or sign extension.
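
As a minimal sketch of how such sub-fields sit inside a primary opcode, the Python fragment below extracts the d and w bits from the register/memory forms of MOV listed in Table E-1 (1000100w and 1000101w, that is, 100010dw). The bit positions (w in bit 0, d in bit 1) follow from those encodings; the function name split_mov_opcode is hypothetical and used only for illustration.

```python
def split_mov_opcode(opcode: int) -> dict:
    """Split a MOV opcode of the form 100010dw into its d and w fields."""
    # The upper six bits must match the fixed part of the encoding.
    if opcode & 0b11111100 != 0b10001000:
        raise ValueError("not a 100010dw MOV opcode")
    return {
        "d": (opcode >> 1) & 1,  # 0: reg is the source, 1: reg is the destination
        "w": opcode & 1,         # 0: byte operand, 1: word/dword operand
    }


print(split_mov_opcode(0b10001001))  # d=0, w=1
print(split_mov_opcode(0b10001010))  # d=1, w=0
```

The same masking approach applies to other opcodes that carry w, d, or s bits, but the bit positions must be taken from each instruction's own encoding.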
Almost all instructions that refer to an operand in memory have an addressing mode byte
following the primary opcode byte(s). This byte, the mod r/m byte, specifies the addressing
mode to be used. Certain encodings of the mod r/m byte indicate that a second addressing
byte, the scale-index-base byte, follows the mod r/m byte to fully specify the addressing
mode. Addressing modes can include a displacement immediately following the mod r/m
byte or the scale-index-base byte. If a displacement is present, its size is 8, 16, or 32
bits. If the instruction specifies an immediate operand, it follows any displacement bytes
and is always the last field of the instruction.
Several smaller fields also appear in certain instructions, sometimes within the opcode
bytes themselves. Table E-2 is a complete list of all fields appearing in the Am486 microprocessor instruction set. Detailed tables for each field follow this table.
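As an illustration of how these fields are packed, the following Python sketch splits a mod r/m byte and a scale-index-base byte into their fields. It assumes the standard bit layout (mod or ss in bits 7-6, reg or index in bits 5-3, r/m or base in bits 2-0). The function names are hypothetical and are not part of any AMD tool.

```python
def split_modrm(byte):
    """Return the (mod, reg, r/m) fields of a mod r/m byte."""
    mod = (byte >> 6) & 0b11     # 2-bit address mode field
    reg = (byte >> 3) & 0b111    # 3-bit general register field
    rm = byte & 0b111            # 3-bit register/memory field
    return mod, reg, rm


def split_sib(byte):
    """Return the (ss, index, base) fields of a scale-index-base byte."""
    ss = (byte >> 6) & 0b11      # 2-bit scale field
    index = (byte >> 3) & 0b111  # 3-bit index register field
    base = byte & 0b111          # 3-bit base register field
    return ss, index, base


def w_bit(opcode):
    """Return the w field when it occupies bit 0 of the opcode byte."""
    return opcode & 1
```

For example, the byte pair 31h C8h matches the register form of XOR (0011000w followed by a mod r/m byte with mod = 11): the opcode has w = 1, and the mod r/m byte C8h splits into 11 001 000.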
Table E-2
Instruction Fields
Field Name   Description                                                          No. of bits          Ref. Table
w            Specifies whether data is byte or full size word/dword               1                    E-3
d            Specifies direction of data operation                                1                    E-4
s            Specifies whether the immediate field must be sign-extended          1                    E-5
reg          Specifies general register                                           3                    E-6
mod r/m      Specifies address mode (effective address can be general register)   2 (mod) or 3 (r/m)   E-7
ss           Specifies scale factor for scaled indexed address mode               2                    E-8
index        Specifies General Register to use as Index Register                  3                    E-9
base         Specifies General Register to use as Base Register                   3                    E-10
Table E-3
Operand Length Field (w) Definitions
Value (w=)   16-Bit Operations   32-Bit Operations
0            8 bits              8 bits
1            16 bits             32 bits

Table E-4
Direction Field (d) Definitions
Value (d=)   Operation Direction
0            Register/Memory ← Register
             “reg” = Source operand
             “mod r/m” or “mod s-i-b” = Destination operand
1            Register ← Register/Memory
             “reg” = Destination operand
             “mod r/m” or “mod s-i-b” = Source operand

Table E-5
Sign-Extend Field (s) Definitions
Value (s=)   Effect on Immediate Byte                                        Effect on Immediate Word/Dword
0            None                                                            None
1            Sign-extend immediate byte to fill word or dword destination.   None

Table E-6
General Register Field (reg) Definitions
               General Register Selected
               16-Bit Data Operations        32-Bit Data Operations
Value (reg=)   No w field   w=0   w=1        No w field   w=0   w=1
000            AX           AL    AX         EAX          AL    EAX
001            CX           CL    CX         ECX          CL    ECX
010            DX           DL    DX         EDX          DL    EDX
011            BX           BL    BX         EBX          BL    EBX
100            SP           AH    SP         ESP          AH    ESP
101            BP           CH    BP         EBP          CH    EBP
110            SI           DH    SI         ESI          DH    ESI
111            DI           BH    DI         EDI          BH    EDI
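
The register selection in Table E-6 can be expressed as a small lookup. The Python sketch below is illustrative only: the function name and the `operand_size` parameter are assumptions introduced here, with `w=None` standing for an instruction that has no w field.

```python
# Register names indexed by the 3-bit reg field value (000 through 111).
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
REG32 = ["E" + name for name in REG16]


def reg_name(reg, w=None, operand_size=32):
    """Name the general register selected by a reg field value.

    w=None models an instruction without a w field.
    operand_size is 16 or 32 (hypothetical parameter for the data size).
    """
    if w == 0:
        return REG8[reg]  # w=0 always selects a byte register
    return REG32[reg] if operand_size == 32 else REG16[reg]
```

Note that the value 100 selects AH for byte operations but SP or ESP otherwise, so the w field must be known before the register can be named.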
Table E-7
Address Mode Field (mod r/m) Definitions (no s-i-b present)
Value               Effective Address
(mod r/m =)         16-Bit Address Mode             32-Bit Address Mode
00 000              DS:[BX + SI]                    DS:[EAX]
00 001              DS:[BX + DI]                    DS:[ECX]
00 010              SS:[BP + SI]                    DS:[EDX]
00 011              SS:[BP + DI]                    DS:[EBX]
00 100              DS:[SI]                         s-i-b present (see Tables E-8 through E-10)
00 101              DS:[DI]                         DS:immediate dword
00 110              DS:immediate word               DS:[ESI]
00 111              DS:[BX]                         DS:[EDI]
01 000              DS:[BX + SI + immediate byte]   DS:[EAX + immediate byte]
01 001              DS:[BX + DI + immediate byte]   DS:[ECX + immediate byte]
01 010              SS:[BP + SI + immediate byte]   DS:[EDX + immediate byte]
01 011              SS:[BP + DI + immediate byte]   DS:[EBX + immediate byte]
01 100              DS:[SI + immediate byte]        s-i-b present (see Tables E-8 through E-10)
01 101              DS:[DI + immediate byte]        SS:[EBP + immediate byte]
01 110              SS:[BP + immediate byte]        DS:[ESI + immediate byte]
01 111              DS:[BX + immediate byte]        DS:[EDI + immediate byte]
10 000              DS:[BX + SI + immediate word]   DS:[EAX + immediate dword]
10 001              DS:[BX + DI + immediate word]   DS:[ECX + immediate dword]
10 010              SS:[BP + SI + immediate word]   DS:[EDX + immediate dword]
10 011              SS:[BP + DI + immediate word]   DS:[EBX + immediate dword]
10 100              DS:[SI + immediate word]        s-i-b present (see Tables E-8 through E-10)
10 101              DS:[DI + immediate word]        SS:[EBP + immediate dword]
10 110              SS:[BP + immediate word]        DS:[ESI + immediate dword]
10 111              DS:[BX + immediate word]        DS:[EDI + immediate dword]

The following values specify General Registers:
Value               16-Bit Data Operations          32-Bit Data Operations
(mod r/m =)         w=0      w=1                    w=0      w=1
11 000              AL       AX                     AL       EAX
11 001              CL       CX                     CL       ECX
11 010              DL       DX                     DL       EDX
11 011              BL       BX                     BL       EBX
11 100              AH       SP                     AH       ESP
11 101              CH       BP                     CH       EBP
11 110              DH       SI                     DH       ESI
11 111              BH       DI                     BH       EDI
Table E-8
Scale Field (ss) Definitions
Value (ss=)   Scale Factor
00            x1
01            x2
10            x4
11            x8

Table E-9
Index Field (index) Definitions
Value (index=)   Index Register
000              EAX
001              ECX
010              EDX
011              EBX
100              no index register
101              EBP
110              ESI
111              EDI
Note: When index = 100, the ss field must equal 00. If not, the effective address is undefined.
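
To show how the ss and index fields combine, the following Python sketch computes the 32-bit offset part of a scaled indexed address. It is a sketch under stated assumptions: `regs` is a hypothetical dictionary of register values, `base_value` is the value of whatever base Table E-10 selects, and segment selection is left out.

```python
MASK32 = 0xFFFFFFFF
SCALE_FACTOR = {0b00: 1, 0b01: 2, 0b10: 4, 0b11: 8}
# Index value 100 is deliberately absent: it means "no index register".
INDEX_REGISTER = {0b000: "EAX", 0b001: "ECX", 0b010: "EDX", 0b011: "EBX",
                  0b101: "EBP", 0b110: "ESI", 0b111: "EDI"}


def scaled_index(ss, index, regs):
    """Return the scaled index contribution for the given ss and index fields."""
    if index == 0b100:
        if ss != 0b00:
            # The manual calls the effective address undefined in this case.
            raise ValueError("index = 100 requires ss = 00")
        return 0
    return (regs[INDEX_REGISTER[index]] * SCALE_FACTOR[ss]) & MASK32


def effective_offset(base_value, ss, index, regs, displacement=0):
    """Add base, scaled index, and displacement, wrapping to 32 bits."""
    return (base_value + scaled_index(ss, index, regs) + displacement) & MASK32
```

With ESI holding 3 as the index, ss = 10 (x4), and a base value of 1000h, the resulting offset is 100Ch.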
Table E-10
Base Field (base) Definitions
mod r/m =   Value (base=)   Effective Address
00 100      000             DS:[EAX + (scaled index)]
00 100      001             DS:[ECX + (scaled index)]
00 100      010             DS:[EDX + (scaled index)]
00 100      011             DS:[EBX + (scaled index)]
00 100      100             SS:[ESP + (scaled index)]
00 100      101             DS:[immediate dword + (scaled index)]
00 100      110             DS:[ESI + (scaled index)]
00 100      111             DS:[EDI + (scaled index)]
01 100      000             DS:[EAX + (scaled index) + immediate byte]
01 100      001             DS:[ECX + (scaled index) + immediate byte]
01 100      010             DS:[EDX + (scaled index) + immediate byte]
01 100      011             DS:[EBX + (scaled index) + immediate byte]
01 100      100             SS:[ESP + (scaled index) + immediate byte]
01 100      101             SS:[EBP + (scaled index) + immediate byte]
01 100      110             DS:[ESI + (scaled index) + immediate byte]
01 100      111             DS:[EDI + (scaled index) + immediate byte]
10 100      000             DS:[EAX + (scaled index) + immediate dword]
10 100      001             DS:[ECX + (scaled index) + immediate dword]
10 100      010             DS:[EDX + (scaled index) + immediate dword]
10 100      011             DS:[EBX + (scaled index) + immediate dword]
10 100      100             SS:[ESP + (scaled index) + immediate dword]
10 100      101             SS:[EBP + (scaled index) + immediate dword]
10 100      110             DS:[ESI + (scaled index) + immediate dword]
10 100      111             DS:[EDI + (scaled index) + immediate dword]
E.4
ENCODING OF FLOATING-POINT INSTRUCTION FIELDS
Instructions for the FPU assume one of the five forms shown in Figure E-2. The s-i-b
(scale index base) byte and displacement are optionally present in instructions that have
mod and r/m fields. Their presence depends on the values of mod and r/m.
Figure E-2
Floating-Point Instruction Format
APPENDIX
F
NUMERIC EXCEPTION SUMMARY
The following table lists the numeric (floating point; real) instruction mnemonics in alphabetical order. For each mnemonic, it summarizes the exceptions that the instruction can
generate. When writing numeric programs that may be used in an environment that employs
numeric exception handlers, assembly-language programmers should be aware of the
possible exceptions generated by each instruction in order to determine the need for exception synchronization.
Table F-1
Exception Summary for Floating-Point Instructions
Mnemonic
Instruction
IS
I
D
Y
Y
Y
Y
Z
F2XM1
2^x – 1
Y
FABS
Absolute value
Y
FADD(P)
Add real (and pop)
Y
FBLD
Load BCD
Y
FBSTP
Store BCD and pop
Y
FCHS
Change sign
Y
FCLEX
Clear exceptions
FCOM(P)(P)
Compare real (and pop) (and pop)
Y
Y
Y
FCOS
Cosine
Y
Y
Y
FDECSTP
Decrement stack top pointer
FDIV(R)(P)
Divide real (or reverse divide) (and pop)
FFREE
Free register
Y
Y
Y
Y
FIADD
Add integer
Y
Y
Y
FICOM(P)
Compare integer (and pop)
Y
Y
FIDIV
Divide integer
Y
FIDIVR
Reverse divide integer
FILD
O
Y
U
P
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Load integer
Y
Y
Y
Y
Y
Y
Y
FIMUL
Multiply integer
Y
Y
Y
Y
Y
Y
Y
FINCSTP
Increment stack top pointer
Y
Y
Y
Y
Y
Y
Y
FINIT
Initialize FPU
Y
Y
Y
Y
Y
Y
Y
FIST(P)
Store integer (and pop)
Y
Y
Y
Y
Y
Y
Y
FISUB(R)
Subtract integer (or reverse subtract)
Y
Y
Y
Y
Y
Y
Y
FLD1
Load constant +1.0
Y
Y
Y
Y
Y
Y
Y
FLD
Load real
Y
Y
Y
Y
Y
Y
Y
FLDCW
Load control word
Y
Y
Y
Y
Y
Y
Y
FLDENV
Load FPU environment
Y
Y
Y
Y
Y
Y
Y
FLDL2E
Load constant log2(e)
Y
Y
Y
Y
Y
Y
Y
FLDL2T
Load constant log2(10)
Y
Y
Y
Y
Y
Y
Y
Table F-1
Exception Summary for Floating-Point Instructions (continued)
Mnemonic
Instruction
I
D
Z
O
U
P
FLDLG2
Load constant log10(2)
Y
Y
Y
Y
Y
Y
Y
FLDLN2
Load constant loge(2)
Y
Y
Y
Y
Y
Y
Y
FLDPI
Load constant π
Y
Y
Y
Y
Y
Y
Y
FLDZ
Load constant + 0.0
Y
Y
Y
Y
Y
Y
Y
FMUL(P)
Multiply real (and pop)
Y
Y
Y
Y
Y
Y
Y
FNOP
No operation
Y
Y
Y
Y
Y
Y
Y
FPATAN
Partial arctangent
Y
Y
Y
Y
Y
Y
Y
FPREM1
Partial remainder (IEEE 754 compliant)
Y
Y
Y
Y
Y
Y
Y
FPREM
Partial remainder
Y
Y
Y
Y
Y
Y
Y
FPTAN
Partial tangent
Y
Y
Y
Y
Y
Y
Y
FRNDINT
Round to integer
Y
Y
Y
Y
Y
Y
Y
FRSTOR
Restore state
Y
Y
Y
Y
Y
Y
Y
FSAVE
Store state
Y
Y
Y
Y
Y
Y
Y
FSCALE
Scale
Y
Y
Y
Y
Y
Y
Y
FSIN
Sine
Y
Y
Y
Y
Y
Y
Y
FSINCOS
Sine and cosine
Y
Y
Y
Y
Y
Y
Y
FSQRT
Square root
Y
Y
Y
Y
Y
Y
Y
FST(P)
stack or extended
Store real (and pop)
Y
Y
Y
Y
Y
Y
Y
FST(P)
single or double
Store real (and pop)
Y
Y
Y
Y
Y
Y
Y
FSTCW
Store control word
Y
Y
Y
Y
Y
Y
Y
FSTENV
Store environment
Y
Y
Y
Y
Y
Y
Y
FSTSW
Store status word
Y
Y
Y
Y
Y
Y
Y
FSUB(R)(P)
Subtract real (or reverse subtract) (and pop)
Y
Y
Y
Y
Y
Y
Y
FTST
Test
Y
Y
Y
Y
Y
Y
Y
FUCOM
Unordered compare real
Y
Y
Y
Y
Y
Y
Y
FUCOMP
Unordered compare real and pop
Y
Y
Y
Y
Y
Y
Y
FUCOMPP
Unordered compare real and pop twice
Y
Y
Y
Y
Y
Y
Y
FWAIT
Wait
Y
Y
Y
Y
Y
Y
Y
FXAM
Examine stack top
Y
Y
Y
Y
Y
Y
Y
FXCH
Exchange register
Y
Y
Y
Y
Y
Y
Y
FXTRACT
Extract exponent and significand
Y
Y
Y
Y
Y
Y
Y
FYL2X
y ⋅ log2(x)
Y
Y
Y
Y
Y
Y
Y
FYL2XP1
y ⋅ log2(x+1)
Y
Y
Y
Y
Y
Y
Y
Exception Description:
IS: Invalid operand due to stack overflow/underflow
I: Invalid operand due to other cause
D: Denormal operand
Z: Zero-divide
O: Overflow
U: Underflow
P: Inexact result (precision)
APPENDIX
G
CODE OPTIMIZATION
The Am486 processor is binary-compatible with 386 processors. Only three new application-level instructions, useful in special situations, are added. Any existing 8086/8088,
80286, and 386 processor applications can execute on the 486 processor immediately
without any modification or recompilation. Any compiler that currently generates code for
the 386 processor family can generate code that runs on the 486 processor without modification. There are, however, certain code-optimization techniques that make applications
execute faster on the 486 processor. The techniques rely upon instruction sequence selection and instruction scheduling to take advantage of the 486 processor internal pipelined
execution units and the on-chip cache.
G.1
ADDRESSING MODES
Like 386 processors, the 486 processor needs an additional clock cycle to generate an
effective address when using an index register. Therefore, if you use only one indexing
component and no scaling is necessary, it is faster to use a register as the base. For
example:
MOV EAX, [ESI]      ; use ESI as base
MOV EAX, [ESI*1]    ; use ESI as index, 1 clock penalty
If you use a base and an index, or if you require scale indexing, it is faster to use the
combined addressing mode, in spite of the one clock penalty.
When you use a register as the base component, you use an additional clock cycle if the
register is the destination of the immediately preceding instruction (assuming all instructions
are in the prefetch queue). For best performance, separate the two instructions by at least
one other instruction. For example:
ADD ESI, EAX; ESI is destination register
MOV EAX, [ESI]; ESI is base, 1 clock penalty
There are other hidden or implicit usages of destination and base registers, primarily the
stack pointer register ESP. The ESP register is the implicit base of all PUSH/POP/RET
instructions and the implicit destination of the CALL/ENTER/LEAVE/RET/PUSH/POP
instructions. Therefore, a LEAVE instruction followed immediately by a RET instruction
uses one additional clock. If the LEAVE and RET are rearranged so that they are separated
by another instruction, no such penalty is incurred. (See other recommendations
regarding the LEAVE instruction.) It is not necessary to separate back-to-back PUSH/POP
instructions. The 486 processor allows this sequence without incurring an additional clock.
Such instruction rearrangements do not affect the performance of
386 processors.
The 486 processor also takes an additional clock to execute an instruction that has both
an immediate data field and a memory offset field. For example:
MOV dword ptr FOO, 1234h    ; both immediate and memory offset
MOV dword ptr BAZ, 1234h
MOV [EBP–200], 1234h
When it is necessary to use constants, it would still be more efficient to use immediate data
instead of loading the constant into a register first. But if the same immediate data is used
more than once, then it would be faster to load the constant in a register and then use the
register multiple times. This optimization will not affect the performance of 386 processors.
The following sequence is faster than the one above if all instructions are in the prefetch
queue. Because the instructions are shorter, they are also easier to prefetch:
MOV EAX, 1234h
MOV dword ptr FOO, EAX      ; FOO IS VARIABLE
MOV dword ptr BAZ, EAX      ; BAZ IS VARIABLE
MOV [EBP–200], EAX
G.2
PREFETCH UNIT
The 486 processor prefetch unit accesses the on-chip cache to fill the prefetch queue
whenever the cache is idle, and there is enough room in the queue for another cache line
(16 bytes). If the prefetch queue becomes empty, it can take up to three additional clocks
to start the next instruction. The prefetch queue is 32 bytes in size (2 cache lines).
Because data accesses always have priority over prefetch requests, keeping the cache
busy with data accesses can lock out the prefetch unit.
Therefore, arrange instructions so that the memory bus is not used continuously by a series
of memory reference instructions. Arrange the instructions so that there is a non-memory-referencing
instruction (such as a register/register instruction) at least two clocks before
the prefetch queue becomes exhausted. This allows the prefetch unit to transfer a cache line into
the queue. For example:
Instruction            Length
MOV mem, 1234567h      10 bytes
MOV mem, 1234567h      10 bytes
MOV mem, 1234567h      10 bytes
MOV mem, 1234567h      10 bytes
MOV mem, 1234567h      10 bytes
ADD reg, reg           2 bytes
If the prefetch queue started out full, then by the third MOV instruction, there is enough
room for another cache line in the queue, but because the memory bus is continuously
used, there is no time for the transfer from the cache to the prefetch queue. If you do not
insert a non-memory instruction before or after the third MOV instruction, the queue is
exhausted by the fourth MOV instruction. In this case, rearrange the instructions so that
the ADD instruction is before or after the third MOV instruction. This allows the cache to
transfer another instruction line to the prefetch unit.
Note: Rearranging the instructions has no effect on 386 processor performance.
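
The byte counts behind this example can be traced with a short Python sketch. It assumes the queue starts full (32 bytes) and receives no refills while the memory bus is busy. The function and constant names are illustrative, and the model ignores actual clock timing.

```python
QUEUE_BYTES = 32  # prefetch queue size (2 cache lines)
LINE_BYTES = 16   # one cache line


def trace_queue(lengths, start=QUEUE_BYTES):
    """For each instruction, report the queue state just before it is decoded.

    Returns a list of (bytes_in_queue, room_for_line, starved) tuples,
    assuming no cache line is transferred into the queue along the way.
    """
    remaining = start
    states = []
    for length in lengths:
        room_for_line = (QUEUE_BYTES - remaining) >= LINE_BYTES
        starved = remaining < length  # instruction not fully in the queue
        states.append((remaining, room_for_line, starved))
        remaining = max(remaining - length, 0)
    return states


# Five 10-byte MOV instructions followed by a 2-byte ADD.
states = trace_queue([10, 10, 10, 10, 10, 2])
```

Under these assumptions `states[2]` shows room for a cache line when the third MOV is reached, and `states[3]` shows that only 2 bytes remain for the 10-byte fourth MOV, which matches the description above.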
G.3
CACHE AND CODE ALIGNMENT
The prefetch unit in a 386 processor fetches four bytes at a time on aligned boundaries;
therefore, align the destination of any JUMP/CALL/RET instruction on a 0-mod-4 address
to help the prefetch unit fill the prefetch queue as quickly as possible. The 486 processor
fetches 16 bytes by using the on-chip cache; therefore, align JUMP/CALL/RET destinations
at 0-mod-16 addresses for better performance.
The drawback of the 0-mod-16 alignment is that it causes code to grow bigger, requiring
you to balance execution speed and code size. The recommended compromise is to align
function entry addresses (that is, CALL destinations) on a 0-mod-16 address, but to align
labels (that is, JUMP destinations) on a 0-mod-4 address.
On the 486 processor, it takes up to five additional clocks to start executing an instruction
if it splits across two 16-byte cache lines. For example, if a CALL instruction ends at address
0x0000000E and the next instruction is a multiple-byte instruction, then when processing
returns from the CALL, the processor takes five additional clocks to fill the prefetch queue
if the target instruction is not already in the cache. Even if the instruction is in the cache,
the processor requires two clocks to transfer it into the prefetch unit. In this situation, it is
faster to insert a filler instruction (either by rearranging the instructions or adding a NOP
instruction) so that the multiple-byte instruction starts on an aligned address. This instruction
alignment also improves 386 processor performance.
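
The alignment arithmetic in this section can be sketched in Python as follows. The helper names are hypothetical. The 16-byte line size comes from the text above, and the 3-byte instruction length in the usage note is an assumed value for illustration.

```python
def padding_to_align(address, boundary):
    """Number of filler bytes needed to reach the next 0-mod-boundary address."""
    return (-address) % boundary


def splits_cache_line(address, length, line_bytes=16):
    """True if an instruction of the given length crosses a cache line boundary."""
    first_line = address // line_bytes
    last_line = (address + length - 1) // line_bytes
    return first_line != last_line
```

If a CALL ends at 0000000Eh, the next instruction starts at 0000000Fh. A 3-byte instruction there straddles two cache lines, and `padding_to_align(0x0F, 16)` reports that one filler byte moves it to an aligned address.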
G.4
NOP INSTRUCTIONS
Sometimes programs need fillers between instructions to align them. On 386 and 486
processors, the one-byte NOP instruction (an exchange of EAX with EAX) performs this
function. You can also use other instructions to provide fillers of different lengths that
execute in a single clock, as shown below:
INC reg           ; 1 byte: modifies register and flags
MOV reg,reg       ; 2 bytes: true NOP
LEA reg,0[reg]    ; 3 bytes: true NOP, uses 8-bit displacement
MOV EAX,0         ; 5 bytes: modifies eax register
ADD EAX,0         ; 5 bytes: modifies flags
LEA reg,0[EAX]    ; 6 bytes: true NOP, uses 32-bit displacement
Many of the 386/486 instructions can perform this function, using several forms and lengths,
different-sized immediate data, or different-sized memory offsets. Some of the instructions
have shorter forms if the destination register is EAX/AX/AL. The different forms may use
different clocks. For example, PUSH/POP instructions use one clock in the 1-byte form,
but use four clocks when coded in the 2-byte form.
NOP replacement instructions execute faster than the XCHG instruction on 386 processors.
Using different forms of the same instruction does not affect 386 processor performance.
G.5
INTEGER INSTRUCTIONS
Most frequently used 486 processor instructions execute in one clock. However, unlike 386
processors, some memory operations take more clocks than corresponding register instructions. For example, for the PUSH MEM instruction:
Instruction      386 Processor Clocks   486 Processor Clocks
MOV reg,mem      4                      1
PUSH reg         2                      1
PUSH mem         5                      4
For the 486 processor, loading a value from memory into a register and then pushing the
register results in a net saving of two clocks. The same sequence in a 386 processor
imposes a one-clock penalty. If however, available registers are limited, you may choose
to sacrifice efficiency to save reusable data stored in the registers.
Another example of 386 versus 486 differences is shown by the LEAVE instruction:
Instruction      386 Processor Clocks   486 Processor Clocks
MOV ESP,EBP      2                      1
POP EBP          4                      1 + 1 (ESP penalty)
LEAVE            4                      5
For the 486 processor, executing the MOV/POP sequence results in a net saving of two
clocks over the LEAVE instruction. On the 386 processor, LEAVE is both faster and shorter.
You can increase the efficiency on the 486 processor by separating the MOV and POP
instruction by one instruction. Because the MOV instruction uses ESP as the destination
register and the POP instruction implicitly uses ESP register as a base, there is an inherent
one-clock penalty. Separating the instructions with a useful instruction results in a net
savings of three clocks over the LEAVE instruction.
Because the 486 processors access operands from registers faster than from memory, it
is important for any compiler to have good register allocation and value tracking optimization
capability. However, unlike RISC architecture, there is no advantage to loading every possible value before using it. The processor performs reg,mem type ALU operations just as
fast as load/op/store sequences. For example, for the assignment:
mem1 = mem1 + mem2
you can use the following instruction sequences, which yield varying total clock counts (11
or 12) on 386 processors, but identical total clock counts (4) on a 486 processor:
Instruction      386 Processor Clocks   486 Processor Clocks
MOV EAX,mem1     4                      1
MOV EBX,mem2     4                      1
ADD EAX,EBX      2                      1
MOV mem1,EAX     2                      1

MOV EAX,mem1     4                      1
ADD EAX,mem2     6                      2
MOV mem1,EAX     2                      1

MOV EAX,mem2     4                      1
ADD mem1,EAX     7                      3
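
The totals quoted before the table (11 or 12 clocks on a 386 processor, 4 on a 486 processor) follow from summing the per-instruction counts. The Python sketch below only restates that arithmetic. The sequence labels are descriptive names chosen here, not terms from the manual.

```python
# (386 clocks, 486 clocks) for each instruction, in table order.
SEQUENCES = {
    "four instructions, register add": [(4, 1), (4, 1), (2, 1), (2, 1)],
    "three instructions, add from memory": [(4, 1), (6, 2), (2, 1)],
    "two instructions, add to memory": [(4, 1), (7, 3)],
}


def total_clocks(sequence):
    """Return (386 total, 486 total) for a list of per-instruction clock pairs."""
    return (sum(c386 for c386, _ in sequence),
            sum(c486 for _, c486 in sequence))


totals = {name: total_clocks(seq) for name, seq in SEQUENCES.items()}
```

The three sequences total 12, 12, and 11 clocks on a 386 processor and 4 clocks each on a 486 processor, so on the 486 processor the choice can be made on code size or register pressure alone.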
The MOVZX instruction is another example in which the 486 processor executes faster
using simple instructions if the destination is a byte addressable register. For example:
Instruction      386 Processor Clocks   486 Processor Clocks
MOVZX EAX,mem1   6                      3 + 1 (0Fh prefix)

XOR EAX,EAX      2                      1
MOV AL,mem1      4                      1
For the 486 processor, clearing the register first and then loading the byte value may result
in a net savings of two clocks (depending on whether the prefix decode clock can overlap
the previous instruction), though there is no comparable difference on the 386 processor.
G.6
CONDITION CODES
In some high-level languages, it is sometimes necessary to convert the result of a boolean
condition (e.g., equal, greater than, or less than) to a true/false (1/0) value. The flags
register in 386 and 486 processors normally holds comparison results. To
convert a comparison result to a true/false value, you must convert the flag settings to an
integer value.
The conditional SET instructions can perform the conversions, but require, on a 486 processor, three to four clocks to execute depending on whether the condition tested is true
or false. When comparing unsigned values for greater-than or less-than, there is an optional
sequence to use. For example, if “x” and “y” are both unsigned values loaded into registers
EAX and ECX, respectively, then you can generate the code for “(x < y)” in several ways:
Instruction      386 Processor Clocks   486 Processor Clocks
CMP EAX,ECX      2                      1
MOV EAX,0        2                      1
JNB L1           7 + m/3                3/1
MOV EAX,1        2                      1
L1:

CMP EAX,ECX      2                      1
SETB AL          4/5                    4/3
MOVSX EAX,AL     3                      3

CMP EAX,ECX      2                      1
SBB EAX,EAX      2                      1
NEG EAX          2                      1
Using the SBB instruction to capture the flag settings of an unsigned compare gives the
fastest performance. Because there are no jumps, it does not break the prefetch pipeline.
Although this is specific for the “(x < y)” condition, it is possible to transform other tests to
this form by either negating the condition or by exchanging the operands.
These condition code instruction replacements also improve 386 processor performance.
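
To see why the CMP/SBB/NEG sequence yields a 0/1 value, the following Python sketch models the three instructions on 32-bit values. It is a model for illustration only. The function name is hypothetical, and only the carry flag is tracked.

```python
MASK32 = 0xFFFFFFFF


def unsigned_less_than(x, y):
    """Model CMP EAX,ECX / SBB EAX,EAX / NEG EAX with x in EAX and y in ECX."""
    eax = x & MASK32
    ecx = y & MASK32
    # CMP EAX,ECX: the carry flag is set when the unsigned subtraction borrows.
    carry = ((eax - ecx) >> 32) & 1
    # SBB EAX,EAX: EAX - EAX - CF gives 0 or FFFFFFFFh.
    eax = (eax - eax - carry) & MASK32
    # NEG EAX: turns FFFFFFFFh into 1 and leaves 0 unchanged.
    eax = (-eax) & MASK32
    return eax
```

The result is 1 exactly when x is below y as an unsigned value, and 0 otherwise, with no branch involved.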
G.7
STRING INSTRUCTIONS
Like a 386 processor, a 486 processor executes string instructions more slowly than the
equivalent load/store instructions. For example, for the LODS instruction:
Instruction      386 Processor Clocks   486 Processor Clocks
MOV EAX,[ESI]    4                      1
ADD ESI,4        2                      1

LODS             5                      4
The LODS instruction loads the string element and updates the ESI register. If the register
update is unnecessary, the MOV instruction alone saves one clock on the 386 processor and
three clocks on the 486 processor. If
code length is more important, however, the LODS instruction is shorter than MOV.
In a non-REPeated instruction, individual MOV instructions are always faster than MOVS.
Even in a REPeated loop, if the loop is small enough, it is faster to use individual load/store
instructions than to set up REPeated MOVS instructions. The tradeoff is speed versus code
space. The REP MOVS loop is shorter, but slower.
Another consideration is that a long sequence of load/store instructions prevents the
prefetch unit from filling the prefetch queue, which slows the processor. To prevent this, do
not move more than 16 bytes using load/store instructions within any sequence. Insert a
non-memory instruction to allow the prefetch unit to access the cache.
Similar optimizations can be made for STOS and other string instructions. Such optimizations also improve 386 processor performance.
G.8
FLOATING-POINT INSTRUCTIONS
Like the 386 processor/387 coprocessor combination, the floating-point unit in the 486
processor is a separate independent execution unit that operates in parallel with the integer
unit. Any instruction sequence that allows the two independent units to execute in parallel
is faster than one that uses sequential processing.
Do not place floating-point instructions in direct sequence. Rearrange instructions so that
non-floating-point instructions separate the floating-point instructions to allow both execution units to operate in parallel. Schedule the integer instructions (by clock counts) so that
they can execute without causing the floating-point unit to wait for its next instruction. These
rearrangements also improve 386/387 processor/coprocessor performance, but the clock
counts for 387 operations are much higher than those for the 486 floating-point unit.
Note: Use the integer unit and integer instructions for simple floating-point value
arrangement or movement. FWAITs are never required around simple floating-point
instructions.
G.9
PREFIX OPCODES
On 386 and 486 processors, all prefix opcodes require an additional clock to decode. You
can overlap this clock with the execution of the previous instruction if that instruction takes
more than one clock to execute. Because of the decode clock requirement, it is faster to
expand 16-bit operands to 32-bit operands instead of using the 66h prefix to operate on
16-bit operands, for example. Another reason for the conversion is that if an instruction
with a 16-bit destination is followed by an instruction with a 32-bit operand register, there
is another one-clock penalty. If you must use this combination, separate the instructions
with another instruction.
If you must use prefix opcodes, try to rearrange the instructions so that the prefixed instruction executes after a multiple-clock instruction.
G.10
OVERLAPPED CLOCKS
As mentioned before, an instruction may require an extra clock to execute. However, some
of the clock penalties can overlap. In particular, the following combinations overlap:
• Having an index register and an immediate field with a memory offset field only incurs
a one-clock penalty.
• Having a prefix opcode and using the result register of the previous instruction as a base
only incurs a one-clock penalty.
• Having a prefix opcode after a multiclock instruction does not incur any clock penalty.
G.11
MISCELLANEOUS GUIDELINES
The 386 processor instruction design considered certain programming practices. Many of
these considerations apply to 486 processor programming and are applicable to compiler
design as well.
• Use the EAX register when possible. Many instructions are 1 byte shorter when using
this register, such as loads and stores to memory with absolute addressing, transfers
between registers with XCHG, and operations using immediate operands.
• Use the DS register when possible. Instructions that use the DS register are 1 byte
shorter than instructions using the other data segment registers; no data segment prefix
is required.
• Use short 1-, 2-, and 3-byte instructions when possible. Because 486 processor instructions begin and end on byte boundaries, many instruction encodings are more compact
than those in word-aligned instruction sets. Byte alignment reduces code size and increases execution speed.
• Use MOVSX and MOVZX to access 16-bit data. These instructions sign-extend and
zero-extend word operands to doubleword length, eliminating the need for an extra
instruction to initialize the high word.
• Use the NMI interrupt when possible for faster interrupt response.
• Instead of ENTER at lexical level 0, use a code sequence like:
PUSH EBP
MOV EBP,ESP
SUB ESP,byte_count
This executes in seven clock cycles instead of the ten required to execute ENTER.
Optimize systems using the following techniques to enhance system speed after the basic
functions are implemented:
• If supported by your assembler and acceptable for your application, use the short form
of the JUMP instruction. The short form uses an immediate byte for relative jumps in
the range from 128 bytes back to 127 bytes forward. The assembler generates an error
if it does not support the function. Some assemblers perform this optimization
automatically.
• Use the ESP register to reference the stack in the deepest level of subroutines. Do not
set up the EBP register and stack frame.
• For fastest task switching, switch tasks in software; this saves and restores a smaller
processor state.
• Use the LEA instruction to add registers. If you use a base register and index register,
LEA loads the destination with the sum. You can scale the register contents by 2, 4, or 8.
• Use the LEA instruction to add a constant to a register. If you use a base register and
a displacement, LEA loads the destination with their sum. You can use LEA with a base
register, index register, scale factor, and displacement.
• Use integer move instructions to transfer floating-point data.
• Use RET in the form that takes an immediate value for byte count, rather than an ADD
ESP instruction. It saves one clock cycle and 3 bytes on every subroutine call.
• If you need to make several references to a variable addressed with a displacement,
load the displacement into a register.
• For PUSH/POP instructions using an operand in memory, use an equivalent two-instruction sequence to move the operand through a general register before pushing it on the
stack. This saves two clock cycles.
• For LOOP instructions, use an equivalent decrement and conditional jump instruction
combination. This saves two clock cycles.
• For JECXZ instructions, use an equivalent compare and conditional jump instruction
combination. This saves one clock cycle.
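
The short-jump range mentioned in the guidelines above (128 bytes back to 127 bytes forward) can be checked as in the Python sketch below. It assumes the displacement is measured from the address of the instruction that follows the jump. The function names are illustrative.

```python
def short_jump_displacement(next_instruction_address, target_address):
    """Signed distance from the instruction after the jump to the target."""
    return target_address - next_instruction_address


def fits_short_jump(next_instruction_address, target_address):
    """True if the target is reachable with a one-byte signed displacement."""
    rel = short_jump_displacement(next_instruction_address, target_address)
    return -128 <= rel <= 127


def encode_rel8(rel):
    """Encode a displacement in the short-jump range as an unsigned byte value."""
    if not -128 <= rel <= 127:
        raise ValueError("displacement does not fit in one byte")
    return rel & 0xFF
```

An assembler that performs this optimization automatically applies a test like `fits_short_jump` to each jump and falls back to the longer form when the test fails.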
APPENDIX
H
BIOS DATA AREA MAP
When an IBM-compatible personal computer system initializes, the microprocessor, under
the direction of the POST in the BIOS software, creates a BIOS data map at location
000400h. This map is 256 bytes in length (address range from 000400h to 0004FFh). The
BIOS software uses this memory space to store data and environmental control variables.
Programs can access and change the values stored in this area to change the conditions
under which the system operates. The following table identifies the standard contents of
the BIOS data area locations:
Table H-1
BIOS Map Contents
Address    BIOS Service   Description
000400h    INT 14h        Serial Port (COM) 1, least-significant byte
000401h    INT 14h        Serial Port (COM) 1, most-significant byte
000402h    INT 14h        Serial Port (COM) 2, least-significant byte
000403h    INT 14h        Serial Port (COM) 2, most-significant byte
000404h    INT 14h        Serial Port (COM) 3, least-significant byte
000405h    INT 14h        Serial Port (COM) 3, most-significant byte
000406h    INT 14h        Serial Port (COM) 4, least-significant byte
000407h    INT 14h        Serial Port (COM) 4, most-significant byte
000408h    INT 17h        Parallel Port (LPT) 1, least-significant byte
000409h    INT 17h        Parallel Port (LPT) 1, most-significant byte
00040Ah    INT 17h        Parallel Port (LPT) 2, least-significant byte
00040Bh    INT 17h        Parallel Port (LPT) 2, most-significant byte
00040Ch    INT 17h        Parallel Port (LPT) 3, least-significant byte
00040Dh    INT 17h        Parallel Port (LPT) 3, most-significant byte
00040Eh    POST           Extended BIOS Data Area Segment, least-significant byte
00040Fh    POST           Extended BIOS Data Area Segment, most-significant byte
Table H-1
BIOS Map Contents (continued)
Address              BIOS Service   Description
000410h – 000411h    INT 11h        Equipment List:
                                    Bits      Definition
                                    15 – 14   Number of installed parallel adapters
                                              00 = None
                                              01 = One
                                              10 = Two
                                              11 = Three
                                    13 – 12   Reserved
                                    11 – 9    Number of installed serial adapters
                                              000 = None
                                              001 = One
                                              010 = Two
                                              011 = Three
                                              100 = Four
                                              101 to 111 = Reserved, not used
                                    8         Reserved
                                    7 – 6     Number of diskette drives
                                              00 = One drive
                                              01 = Two drives
                                              10 to 11 = Reserved
                                    5 – 4     Initial video mode
                                              00 = EGA or PGA
                                              01 = 40 x 25 color
                                              10 = 80 x 25 color
                                              11 = 80 x 25 monochrome
                                    3         Reserved
                                    2         PS/2-type pointing device
                                              0 = Not present
                                              1 = Present
                                    1         Math coprocessor
                                              0 = Not present
                                              1 = Present
                                    0         Diskette drive A
                                              0 = Not present
                                              1 = Present
000412h              POST           Interrupt Flag used in POST
000413h              INT 12h        Memory size in KB, least-significant byte
000414h              INT 12h        Memory size in KB, most-significant byte
000415h – 000416h                   Reserved
000417h              INT 16h        Keyboard Status Byte:
                                    Bits   Definition
                                    7      Insert mode:              0 = Off   1 = On
                                    6      Caps Lock mode:           0 = Off   1 = On
                                    5      Num Lock mode:            0 = Off   1 = On
                                    4      Scroll Lock mode:         0 = Off   1 = On
                                    3      Alt key pressed:          0 = No    1 = Yes
                                    2      Ctrl key pressed:         0 = No    1 = Yes
                                    1      Left Shift key pressed:   0 = No    1 = Yes
                                    0      Rt Shift key pressed:     0 = No    1 = Yes
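
As an illustration of reading these bit fields, the Python sketch below unpacks the equipment list word (000410h to 000411h) and the keyboard status byte (000417h) using the bit positions listed above. The function names and dictionary keys are assumptions made for this example. The sketch works on values already read from memory and does not access hardware.

```python
def decode_equipment(word):
    """Split the 16-bit equipment list word into its documented fields."""
    return {
        "parallel_adapters": (word >> 14) & 0b11,      # bits 15-14
        "serial_adapters": (word >> 9) & 0b111,        # bits 11-9
        "diskette_drives_field": (word >> 6) & 0b11,   # bits 7-6 (00 = one drive)
        "initial_video_mode": (word >> 4) & 0b11,      # bits 5-4
        "pointing_device": bool(word & 0x0004),        # bit 2
        "math_coprocessor": bool(word & 0x0002),       # bit 1
        "diskette_drive_a": bool(word & 0x0001),       # bit 0
    }


KEYBOARD_STATUS_BITS = ["right_shift", "left_shift", "ctrl", "alt",
                        "scroll_lock", "num_lock", "caps_lock", "insert"]


def decode_keyboard_status(byte):
    """Map each keyboard status bit (bit 0 first) to True or False."""
    return {name: bool(byte & (1 << bit))
            for bit, name in enumerate(KEYBOARD_STATUS_BITS)}
```

For instance, the word 4463h decodes to one parallel adapter, two serial adapters, a diskette field of 01, video mode 10, and both a math coprocessor and diskette drive A present.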
000418h
INT 16h
Extended Keyboard Status Byte:
Bits  Definition
7  Ins key pressed: 0 = No; 1 = Yes
6  Caps Lock pressed: 0 = No; 1 = Yes
5  Num Lock pressed: 0 = No; 1 = Yes
4  Scroll Lock pressed: 0 = No; 1 = Yes
3  Ctrl / NumLock active: 0 = No; 1 = Yes
2  SysRq key pressed: 0 = No; 1 = Yes
1  Left Alt key pressed: 0 = No; 1 = Yes
0  Left Ctrl key pressed: 0 = No; 1 = Yes
000419h
Reserved
00041Ah –
00041Bh
INT 16h
Pointer to the address of the next character in the keyboard buffer
00041Ch –
00041Dh
INT 16h
Pointer to the address of the last character in the keyboard buffer
00041Eh –
00043Dh
INT 16h
Keyboard buffer (32 bytes) — if the address in 00041Ah is the same as
the address in 00041Ch, the buffer is empty. If the address in 00041Ch is
two bytes from the address in 00041Ah, the buffer is full.
00043Eh
INT 13h
Diskette Drive Calibration Status:
Bits  Definition
7–4  Reserved, should be 0000
3–2  Reserved
1  Drive B recalibration required? 0 = Yes; 1 = No
0  Drive A recalibration required? 0 = Yes; 1 = No
00043Fh
INT 13h
Diskette Drive Motor Status:
Bits  Definition
7  Current operation: 0 = Write or Format; 1 = Read or Verify
6  Reserved
5–4  Drive select: 00 = Drive A selected; 01 = Drive B selected; 10 to 11 = Reserved
3–2  Reserved
1  Drive A: 0 = Motor is off; 1 = Motor is on
0  Drive B: 0 = Motor is off; 1 = Motor is on
000440h
INT 13h
Diskette Drive Motor Timeout: The system uses the INT 08h timer interrupt
(occurs at a rate of 18.2 times per second) to decrement this value. When
the value goes to zero, the system turns off the drive motor power. The
signal applies to the last drive accessed.
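The empty and full rules given for the keyboard buffer (00041Ah to 00043Dh) can be illustrated with a short sketch. It is only an illustration: it assumes two-byte entries, the usual buffer bounds of 01Eh and 03Eh listed at 000480h and 000482h, and pointer values that have already been read from the BIOS data area by some other means. The function names are hypothetical.

```python
# Sketch of the keyboard ring-buffer rules, using offsets within segment 40h.
# Assumption: the buffer occupies 01Eh..03Dh and each entry takes two bytes.
BUF_START = 0x1E  # first byte of the buffer (00041Eh)
BUF_END = 0x3E    # one past the last byte of the buffer (00043Dh + 1)


def advance(pointer):
    """Move a buffer pointer forward by one two-byte entry, wrapping at the end."""
    nxt = pointer + 2
    return BUF_START if nxt >= BUF_END else nxt


def buffer_state(head, tail):
    """head = word at 00041Ah (next character), tail = word at 00041Ch."""
    if head == tail:
        return "empty"
    if advance(tail) == head:
        return "full"
    return "partial"


def pending_keys(head, tail):
    """Number of two-byte entries waiting between head and tail."""
    return ((tail - head) % (BUF_END - BUF_START)) // 2
```

Because one entry is always left unused to distinguish full from empty, a 32-byte buffer holds at most 15 keystrokes under these assumptions.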
000441h
INT 13h
Status for last accessed Diskette Drive:
Bits  Definition
7  Ready Status: 0 = Ready; 1 = Not ready
6  Seek Error: 0 = None detected; 1 = Error detected
5  Drive Failure: 0 = None detected; 1 = Failure detected
4–0  Error Codes:
00000 = No error
00001 = Illegal function
00010 = Address mark not found
00011 = Write protect error
00100 = Sector not found
00101 = Reserved
00110 = Drive door open
00111 = Reserved
01000 = DMA overrun error
01001 = DMA boundary error
01010 to 01011 = Reserved
01100 = Unknown media
01101 to 01111 = Reserved
10000 = CRC failed on read
10001 to 11111 = Reserved
000442h –
000448h
INT 13h
Diskette drive command and status bytes
000449h
INT 10h
Current video display mode
00044Ah –
00044Bh
INT 10h
Number of text columns per line of current video mode
00044Ch –
00044Dh
INT 10h
Current page size in bytes
00044Eh –
00044Fh
INT 10h
Offset address of current display page, relative to the start of video RAM
— video RAM starts at B800h in CGA and B000h in MDA.
000450h –
00045Fh
INT 10h
Current cursor position for each of the eight possible video display pages
— two bytes store the current cursor position for each page: the MSB
specifies the row (line) value; the LSB specifies the column value.
Note: DO NOT CHANGE THE VALUES AT THIS LOCATION!
Use INT 10h functions to change the video page values.
000460h
INT 10h
Starting line of the cursor
000461h
INT 10h
Ending line of the cursor
000462h
INT 10h
Current video display page number
000463h –
000464h
INT 10h
I/O port address of the video display adapter:
3B4h = monochrome adapter
3D4h = color adapter
000465h
INT 10h
Video display adapter mode register:
3B8h = monochrome adapter
3D8h = CGA adapter
3D9h = EGA or VGA adapter
000466h
INT 10h
Current palette color
000467h –
00046Bh
Adapter ROM address
00046Ch –
00046Fh
INT 1Ah
Counter for INT 1Ah — the system increments this counter using the INT
08h timer interrupt (occurs 18.2 times per second). After 24 hours, the
system resets the timer to 0.
000470h
INT 1Ah
Timer 24-hour flag
Bits  Definition
7–1  Reserved
0  Flag value: 0 = Timer value is 0 – 24 hours; 1 = Timer value > 24 hours (requires manual reset)
000471h
INT 16h
Break Status
Bits  Definition
7  0 = No break signaled; 1 = Ctrl & Break or Ctrl & C keys pressed
6–0  Reserved, not used
000472h –
000473h
POST
Soft reset flag — if value = 1234h, reboot skips the memory test.
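As an illustration of the INT 1Ah counter at 00046Ch to 00046Fh, the following sketch converts a tick count into elapsed hours, minutes, and seconds. It assumes the four bytes are stored least-significant byte first and uses the rounded rate of 18.2 ticks per second quoted in the table, so the result is approximate. The function names are hypothetical.

```python
# Sketch: interpret the four-byte timer counter and convert it to a time.
# Assumption: 18.2 ticks per second, the rounded figure quoted in the table.


def ticks_from_bytes(raw):
    """Combine the four bytes at 00046Ch..00046Fh (least-significant first)."""
    return int.from_bytes(raw, "little")


def ticks_to_hms(ticks):
    """Convert a tick count to (hours, minutes, seconds) since the last reset."""
    seconds = ticks * 10 // 182  # integer form of ticks / 18.2
    return seconds // 3600, (seconds % 3600) // 60, seconds % 60
```

For example, `ticks_to_hms(65520)` corresponds to one hour at the quoted rate.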
000474h
INT 13h
Status of last hard drive operation:
Value  Definition
00h  No error
01h  Invalid function request
02h  Address mark not found
03h  Reserved
04h  Sector not found
05h  Reset failed
06h  Reserved
07h  Drive parameter activity failed
08h  DMA overrun on operation
09h  Data boundary error
0Ah  Bad sector flag selected
0Bh  Bad track detected
0Ch  Reserved
0Dh  Invalid number of sectors on format
0Eh  Control data address mark detected
0Fh  DMA arbitration level out of range
10h  Uncorrectable ECC or CRC error
11h  ECC corrected data error
12h – 1Fh  Reserved
20h  General controller failure
21h – 3Fh  Reserved
40h  Seek operation failure
41h – 7Fh  Reserved
80h  Timeout
81h – A9h  Reserved
AAh  Drive not ready
ABh – BAh  Reserved
BBh  Undefined error occurred
BCh – CBh  Reserved
CCh  Write fault on selected drive
CDh – DFh  Reserved
E0h  Status error, or error register is 0
E1h – FEh  Reserved
FFh  Sense operation failed
000475h
INT 13h
Number of hard drives
000476h –
000477h
INT 13h
Hard drive work area
000478h
INT 17h
Parallel Port (LPT) 1 timeout counter
000479h
INT 17h
Parallel Port (LPT) 2 timeout counter
00047Ah
INT 17h
Parallel Port (LPT) 3 timeout counter
00047Bh
Reserved
00047Ch
INT 14h
Serial Port (COM) 1 timeout counter
00047Dh
INT 14h
Serial Port (COM) 2 timeout counter
00047Eh
INT 14h
Serial Port (COM) 3 timeout counter
00047Fh
INT 14h
Serial Port (COM) 4 timeout counter
000480h –
000481h
INT 16h
Starting address of the keyboard buffer (usually 01Eh)
000482h –
000483h
INT 16h
Ending address of the keyboard buffer (usually 03Eh)
000484h
INT 10h
Number of displayed character rows minus one
000485h –
000486h
INT 10h
Height of character matrix
000487h
INT 10h
Video Status:
Bits  Definition
7  Equals bit 7 of the video mode number passed through INT 10h by the programmer
6–4  Video RAM size: 000 = 64K; 001 = 128K; 010 = 192K; 011 = 256K; 100 = 512K; 101 = Reserved; 110 = 1024K; 111 = Reserved
3  Video subsystem status: 0 = Active; 1 = Inactive
2  Reserved
1  Monitor type: 0 = Color; 1 = Monochrome
0  Alphanumeric cursor emulation: 0 = Disabled; 1 = Enabled
000488h
INT 13h
Hard disk drive data transmission speed
000489h
INT 10h
VGA Video Flags:
Bits  Definition
7 & 4  Mode: 0XX0 = 350 lines; 0XX1 = 400 lines; 1XX0 = 200 lines; 1XX1 = Reserved
6  Display switch: 0 = Disabled; 1 = Enabled
5  Reserved
4  See 7 & 4 above
3  Default palette loading: 0 = Disabled; 1 = Enabled
2  Monitor type: 0 = Color; 1 = Monochrome
1  Grayscale summing: 0 = Disabled; 1 = Enabled
0  VGA: 0 = Inactive; 1 = Active
00048Ah – 00048Bh
Reserved
00048Ch – 000495h
INT 13h
Hard drive and diskette drive variables
000496h
INT 16h
Extended Keyboard Status:
Bits  Definition
7  Read ID in progress: 0 = No; 1 = Yes
6  Last code was 1st ID: 0 = No; 1 = Yes
5  Forced Num Lock: 0 = No; 1 = Yes
4  101/102 keyboard: 0 = No; 1 = Yes
3  Right Alt key active: 0 = No; 1 = Yes
2  Right Ctrl key active: 0 = No; 1 = Yes
1  Last code was E0h: 0 = No; 1 = Yes
0  Last code was E1h: 0 = No; 1 = Yes
000497h
INT 16h
Extended Keyboard Status:
Bits  Definition
7  Keyboard error: 0 = No; 1 = Yes
6  LED updating: 0 = No; 1 = Yes
5  Resend code received: 0 = No; 1 = Yes
4  Ack. code received: 0 = No; 1 = Yes
3  Reserved
2  Caps Lock LED on: 0 = No; 1 = Yes
1  Num Lock LED on: 0 = No; 1 = Yes
0  Scroll Lock LED on: 0 = No; 1 = Yes
000498h – 000499h
Segment part of user wait flag address
00049Ah – 00049Bh
Offset part of user wait flag address
00049Ch – 00049Fh
Wait count
0004A0h
INT 1Ah
Wait active flag:
Bits  Definition
7  Wait time elapsed: 0 = No; 1 = Yes
6–1  Reserved
0  INT 15h AH = 86h occurred: 0 = No; 1 = Yes
0004A1h – 0004A7h
Reserved
0004A8h – 0004ABh
INT 10h
Pointer to EGA and VGA parameter control block
0004ACh –
0004EFh
Reserved
0004F0h –
0004FFh
Intra-applications communication area — stores data available to application programs.
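The bit layout of the Equipment List word at 000410h to 000411h lends itself to a small decoding sketch. The field positions follow the table; the function name, the dictionary keys, and the choice to return `None` for reserved diskette-count encodings are illustrative assumptions, and the word is assumed to have been obtained already (for example, from INT 11h).

```python
# Sketch: decode the 16-bit Equipment List word using the bit fields in Table H-1.
VIDEO_MODES = {
    0b00: "EGA or PGA",
    0b01: "40 x 25 color",
    0b10: "80 x 25 color",
    0b11: "80 x 25 monochrome",
}
DISKETTE_COUNT = {0b00: 1, 0b01: 2}  # encodings 10 and 11 are reserved


def decode_equipment(word):
    """Split the equipment word into named fields (hypothetical key names)."""
    return {
        "parallel_adapters": (word >> 14) & 0b11,           # bits 15-14
        "serial_adapters": (word >> 9) & 0b111,             # bits 11-9
        "diskette_drives": DISKETTE_COUNT.get((word >> 6) & 0b11),  # bits 7-6
        "initial_video_mode": VIDEO_MODES[(word >> 4) & 0b11],      # bits 5-4
        "pointing_device": bool(word & 0b100),              # bit 2
        "math_coprocessor": bool(word & 0b010),             # bit 1
        "diskette_a_present": bool(word & 0b001),           # bit 0
    }
```

A word of 4263h, for instance, decodes to one parallel adapter, one serial adapter, two diskette drives, 80 x 25 color, and a math coprocessor present.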
APPENDIX
I
TYPICAL CMOS RAM MAP
IBM-compatible personal computer systems that conform to the ISA standard have at least
64 bytes of CMOS RAM to store system initialization and configuration parameter values.
Typically, the values are set using a BIOS setup utility. The setup utility is usually ROM- or
Flash RAM-based. Some utilities can only be accessed at system startup; others can be
invoked at any time from the DOS prompt using a “hot key” combination, such as Alt + Ctrl
+ Esc or Alt + Ctrl + S.
The following table identifies the elements in a typical CMOS RAM map:
Table I-1
Example CMOS RAM Map
Offset
Description
00h
Real-Time Clock — Seconds. Contains the seconds value for the current time.
01h
Real-Time Clock — Seconds alarm. Contains the seconds value for the RTC alarm.
02h
Real-Time Clock — Minutes. Contains the minutes value for the current time.
03h
Real-Time Clock — Minutes alarm. Contains the minutes value for the RTC alarm.
04h
Real-Time Clock — Hours. Contains the hours value for the current time.
05h
Real-Time Clock — Hours alarm. Contains the hours value for the RTC alarm.
06h
Real-Time Clock — Day of the Week. Contains the current day of the week.
07h
Real-Time Clock — Date. Contains the day (1 – 31) of the current month.
08h
Real-Time Clock — Month. Contains the current month (1 – 12).
09h
Real-Time Clock — Year. Contains the current year (00 – 99).
0Ah
Status Register A
Bits:  Description:
7  Update in progress (cannot read date/time): 0 = No; 1 = Yes
6–4  Selects the clock divider frequency. Default = 010 (32.768 kHz)
3–0  Selects the output frequency and periodic interrupt rate. Default = 0110 (1.024 kHz and 976.562 µs)
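The relationship between the rate-select field of Status Register A and the periodic interrupt rate can be sketched as follows. This is a sketch under assumptions: it presumes an MC146818-compatible real-time clock running from the default 32.768 kHz time base, where selections 0011 through 1111 halve the frequency at each step starting from 8.192 kHz, and where selections 0001 and 0010 give 256 Hz and 128 Hz. The function names are hypothetical, and the register value is assumed to have been read already.

```python
# Sketch: derive the periodic interrupt rate from Status Register A (offset 0Ah).
# Assumption: MC146818-compatible RTC with the default 32.768 kHz time base.


def update_in_progress(status_a):
    """Bit 7 set means the date/time registers should not be read yet."""
    return bool(status_a & 0x80)


def periodic_rate_hz(status_a):
    """Return the periodic interrupt frequency in Hz, or None if disabled."""
    rate_select = status_a & 0x0F  # bits 3-0
    if rate_select == 0:
        return None
    if rate_select in (1, 2):      # assumed special cases: 256 Hz and 128 Hz
        return 256 >> (rate_select - 1)
    return 32768 >> (rate_select - 1)


def periodic_period_us(status_a):
    """Return the periodic interrupt period in microseconds, or None."""
    rate = periodic_rate_hz(status_a)
    return None if rate is None else 1_000_000 / rate
```

With the default register value of 26h (divider 010, rate 0110), the sketch gives 1024 Hz and a period of 976.5625 microseconds.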
0Bh
Status Register B
Bits:  Description:
7  Halt cycle to set clock: 0 = Updates counter once per second; 1 = Halts the counter to set the clock
6  Periodic interrupt: 0 = Disable; 1 = Enable
5  Alarm Interrupt: 0 = Disable; 1 = Enable
4  Update-Ended Interrupt: 0 = Disable; 1 = Enable
3  Square Wave: 0 = Disable; 1 = Use square wave rate set by Status Register A
2  Date and Time Mode: 0 = Use BCD format; 1 = Use binary format
1  24/12-Hour Mode: 0 = Use 12-hour mode; 1 = Use 24-hour mode
0  Daylight Savings Time: 0 = Disable; 1 = Enable
0Ch
Status Register C
Bits:  Description:
7  IRQ Flag (read only)
6  Periodic Interrupt Flag (read only)
5  Alarm Interrupt Flag (read only)
4  Update Interrupt Flag (read only)
3–0  Reserved, should always be 0000
0Dh
Status Register D
Bits:  Description:
7  CMOS RAM valid: 0 = Battery low, CMOS RAM not valid; 1 = Battery good, CMOS RAM valid
6–0  Reserved, should always be 000 0000
0Eh
Diagnostic Status
Bits:  Description:
7  RTC Chip Power: 0 = Power valid; 1 = Power invalid
6  CMOS RAM Checksum error: 0 = Checksum valid; 1 = Checksum invalid
5  CMOS RAM Configuration Mismatch: 0 = Configuration match; 1 = CMOS RAM configuration does not match system configuration
4  CMOS RAM Memory Size Mismatch: 0 = Memory matches configuration; 1 = CMOS RAM memory size does not match detected size
3  Hard drive C: initialization: 0 = Drive initialized, attempting to boot; 1 = Drive failed to initialize; no boot attempted
2  Time Status indicator: 0 = Time is valid; 1 = Time is invalid
1–0  Reserved, should always be 00
0Fh
Shutdown Status. When the processor switches from protected mode to real mode, it saves
the contents of its registers to memory and performs a reset. If a program requests a shutdown
(by requesting a DWORD JMP instruction), the processor stores the segment address at 40:67h
and the offset address at 40:69h. Before performing the reset, the processor writes a shutdown
code to the CMOS RAM offset 0Fh. This allows the programmer to determine the cause of the
shutdown after the system resets.
Code Value
Description
00h
Normal POST execution
01h
Chipset initialization for Real Mode reentry
02h – 03h
Used internally by BIOS
04h
Jump to bootstrap code
05h
User-defined shutdown. The routine issues an EOI, flushes the
keyboard buffer, initializes the interrupt controller and math
coprocessor, and jumps to the doubleword pointer at 40:67h.
06h
Jump to the doubleword pointer at 40:67h without issuing an EOI
07h
Return to INT 15h Function 87h
08h
Return to POST memory test
09h
INT 15h Function 87h Block Move shutdown request
0Ah
User-defined shutdown requested. The BIOS jumps to the doubleword
pointer at 40:67h without issuing an EOI or initializing the interrupt
controller or math coprocessor.
0Bh
Return through the doubleword pointer at 40:67h
The remainder of the possible codes are not defined.
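The shutdown codes listed for offset 0Fh, together with the 40:67h pointer they refer to, can be summarized in a short lookup sketch. The descriptions are taken from the list above; the function names are hypothetical, and the address helper simply applies the standard real-mode rule of segment times 16 plus offset.

```python
# Sketch: look up a shutdown code read from CMOS offset 0Fh and locate the
# doubleword pointer at 40:67h. Names are illustrative, not a BIOS API.
SHUTDOWN_CODES = {
    0x00: "Normal POST execution",
    0x01: "Chipset initialization for Real Mode reentry",
    0x02: "Used internally by BIOS",
    0x03: "Used internally by BIOS",
    0x04: "Jump to bootstrap code",
    0x05: "User-defined shutdown (EOI issued, jump through 40:67h)",
    0x06: "Jump through 40:67h without issuing an EOI",
    0x07: "Return to INT 15h Function 87h",
    0x08: "Return to POST memory test",
    0x09: "INT 15h Function 87h Block Move shutdown request",
    0x0A: "User-defined shutdown (no EOI, no controller initialization)",
    0x0B: "Return through the doubleword pointer at 40:67h",
}


def describe_shutdown(code):
    """Return the description for a shutdown code, or 'not defined'."""
    return SHUTDOWN_CODES.get(code, "not defined")


def real_mode_linear(segment, offset):
    """Real-mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset
```

Under this rule, 40:67h corresponds to linear address 000467h in the BIOS data area.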
10h
Diskette Drive Type:
Bits:  Description:
7–4  Drive A type: 0000 = No drive; 0001 = 360 Kbyte drive; 0010 = 1.2 Mbyte drive; 0011 = 720 Kbyte drive; 0100 = 1.44 Mbyte drive; 0101 – 1111 = Undefined
3–0  Drive B type: 0000 = No drive; 0001 = 360 Kbyte drive; 0010 = 1.2 Mbyte drive; 0011 = 720 Kbyte drive; 0100 = 1.44 Mbyte drive; 0101 – 1111 = Undefined
11h
Advance Setup Options:
Bits:  Description:
7  PS/2 mouse: 0 = Disable; 1 = Enable
6  Test memory above 1 Mbyte: 0 = Disable; 1 = Enable
5  Memory test tick sound: 0 = Disable; 1 = Enable
4  Memory parity error check: 0 = Disable; 1 = Enable
3  Message display during boot: 0 = Disable; 1 = Enable
2  User-defined hard disk type: 0 = Store at 0:300h; 1 = Store in upper 1 Kbyte of DOS area
1  Wait for F1 key message if error occurs: 0 = Disable; 1 = Enable
0  Num Lock at boot: 0 = Off; 1 = On
12h
Hard Drive Type:
Bits:  Description:
7–4  Drive C type
3–0  Drive D type
Values for both:
0000 = No drive
0001 = Type 1
0010 = Type 2
0011 = Type 3
0100 = Type 4
0101 = Type 5
0110 = Type 6
0111 = Type 7
1000 = Type 8
1001 = Type 9
1010 = Type 10
1011 = Type 11
1100 = Type 12
1101 = Type 13
1110 = Type 14
1111 = Types 16 – 46 (actual value stored in 19h for C or 1Ah for D)
13h
Keyboard Typematic Data:
Bits:  Description:
7  Typematic Function: 0 = Disable; 1 = Enable
6–5  Typematic rate delay: 00 = 250 ms; 01 = 500 ms; 10 = 750 ms; 11 = 1000 ms
4–2  Typematic Rate: 000 = 6 cps; 001 = 8 cps; 010 = 10 cps; 011 = 12 cps; 100 = 15 cps; 101 = 20 cps; 110 = 24 cps; 111 = 30 cps
14h
Equipment Byte
Bits:  Description:
7–6  Number of Diskette Drives: 00 = None; 01 = One; 10 = Two; 11 = Reserved, not used
5–4  Monitor Type: 00 = Not CGA or MDA; 01 = 40x25 CGA; 10 = 80x25 CGA; 11 = MDA (monochrome)
3  Display: 0 = Not installed; 1 = Installed
2  Keyboard: 0 = Not installed; 1 = Installed
1  Math coprocessor: 0 = Not installed; 1 = Installed
0  Diskette drive installed, always 1
15h
Base memory in 1K increments, least-significant byte
16h
Base memory in 1K increments, most-significant byte
17h
Extended memory in 1K increments, least-significant byte
18h
Extended memory in 1K increments, most-significant byte
19h
Hard drive C: drive type if 12h, bits 7 – 4 = 1111
Values 00h to 0Fh are reserved; 10h to 2Eh equal drive types 16 – 46, respectively
1Ah
Hard drive D: drive type if 12h, bits 3 – 0 = 1111
Values 00h to 0Fh are reserved; 10h to 2Eh equal drive types 16 – 46, respectively
1Bh
Hard drive C: Least-significant byte of the cylinder number for user-defined hard drive type
1Ch
Hard drive C: Most-significant byte of the cylinder number for user-defined hard drive type
1Dh
Hard drive C: Head number for user-defined hard drive type
1Eh
Hard drive C: Least-significant byte of the write-precompensation cylinder number for user-defined hard drive type
1Fh
Hard drive C: Most-significant byte of the write-precompensation cylinder number for user-defined hard drive type
20h
Hard drive C: Control byte (= 80h if head number ≥ 8) for user-defined hard drive type
21h
Hard drive C: Least-significant byte of the landing zone number for user-defined hard drive type
22h
Hard drive C: Most-significant byte of the landing zone number for user-defined hard drive type
23h
Hard drive C: Number of sectors for user-defined hard drive type
24h
Hard drive D: Least-significant byte of the cylinder number for user-defined hard drive type
25h
Hard drive D: Most-significant byte of the cylinder number for user-defined hard drive type
26h
Hard drive D: Head number for user-defined hard drive type
27h
Hard drive D: Least-significant byte of the write-precompensation cylinder number for user-defined hard drive type
28h
Hard drive D: Most-significant byte of the write-precompensation cylinder number for user-defined hard drive type
29h
Hard drive D: Control byte (= 80h if head number ≥ 8) for user-defined hard drive type
2Ah
Hard drive D: Least-significant byte of the landing zone number for user-defined hard drive type
2Bh
Hard drive D: Most-significant byte of the landing zone number for user-defined hard drive type
2Ch
Hard drive D: Number of sectors for user-defined hard drive type
2Dh
Miscellaneous BIOS options:
Bits:  Description:
7  Weitek coprocessor: 0 = Not installed; 1 = Present
6  Diskette Drive Seek: 0 = Disabled for fast boot; 1 = Enabled
5  System Boot Sequence: 0 = C:, then A:; 1 = A:, then C:
4  System Speed at Bootup: 0 = Fast; 1 = Slow
3  External Cache Memory Test: 0 = Disable (use if no external cache installed); 1 = Enable
2  Internal Cache Memory Test: 0 = Disable; 1 = Enable
1  Fast Gate A20: 0 = Disable (use if system does not use Fast Gate A20); 1 = Enable
0  Turbo Switch: 0 = Disable; 1 = Enable
2Eh
Standard CMOS checksum, most-significant byte
2Fh
Standard CMOS checksum, least-significant byte
30h
Extended memory found by BIOS, least-significant byte
31h
Extended memory found by BIOS, most-significant byte
32h
Century byte — the BCD value for the current century
33h
Information Flag
Bits:  Description:
7  BIOS Length: 0 = 64K; 1 = 128K
6–1  Reserved, should be 000 000. Used as scratchpad during POST by chipsets.
0  POST Cache Test results: 0 = Cache bad; 1 = Cache good
34h
Shadowing and Password:
Bits:  Description:
7  Boot sector virus protection: 0 = Disabled; 1 = Enabled
6  Password: 0 = Disabled; 1 = Enabled
5  C8000h Shadow 16K Adaptor ROM: 0 = Disabled; 1 = Enabled
4  CC000h Shadow 16K Adaptor ROM: 0 = Disabled; 1 = Enabled
3  D0000h Shadow 16K Adaptor ROM: 0 = Disabled; 1 = Enabled
2  D4000h Shadow 16K Adaptor ROM: 0 = Disabled; 1 = Enabled
1  D8000h Shadow 16K Adaptor ROM: 0 = Disabled; 1 = Enabled
0  DC000h Shadow 16K Adaptor ROM: 0 = Disabled; 1 = Enabled
35h
Shadowing:
Bits:  Description:
7  E0000h Shadow 16K Adaptor ROM: 0 = Disabled; 1 = Enabled
6  E4000h Shadow 16K Adaptor ROM: 0 = Disabled; 1 = Enabled
5  E8000h Shadow 16K Adaptor ROM: 0 = Disabled; 1 = Enabled
4  EC000h Shadow 16K Adaptor ROM: 0 = Disabled; 1 = Enabled
3  F0000h Shadow 16K Adaptor ROM: 0 = Disabled; 1 = Enabled
2  C0000h Shadow 16K Adaptor ROM: 0 = Disabled; 1 = Enabled
1  C4000h Shadow 16K Adaptor ROM: 0 = Disabled; 1 = Enabled
0  Math Coprocessor Test: 0 = Disabled; 1 = Enabled
36h
Chipset specific information
37h
Password Seed and Color Option:
Bits:  Description:
7–4  Password seed used in the password encryption algorithm. DO NOT CHANGE!
3–0  Setup Screen Color: if used, colors are BIOS dependent
38h – 3Dh
Encrypted Password
3Eh
MSB of Extended CMOS Checksum
3Fh
LSB of Extended CMOS Checksum
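Two of the multi-byte conventions in Table I-1 can be illustrated with a short sketch: the hard drive type at 12h, which defers to 19h or 1Ah when a nibble is 1111, and the two-byte memory sizes stored least-significant byte first. The sketch assumes the CMOS contents are already available as a byte sequence indexed by offset; how that image is obtained, and the function names, are outside the table and purely illustrative.

```python
# Sketch: interpret fields of a CMOS RAM image (a bytes-like object indexed
# by the offsets in Table I-1). Reading the image from hardware is not shown.


def hard_drive_types(cmos):
    """Return (drive C type, drive D type); None means no drive."""
    packed = cmos[0x12]

    def resolve(nibble, extended_offset):
        if nibble == 0x0:
            return None                   # 0000 = No drive
        if nibble < 0xF:
            return nibble                 # Types 1 to 14 are stored directly
        return cmos[extended_offset]      # 1111: type 16 to 46 stored in 19h/1Ah

    return resolve(packed >> 4, 0x19), resolve(packed & 0x0F, 0x1A)


def word_at(cmos, lsb_offset):
    """Combine a least-significant/most-significant byte pair into one value."""
    return cmos[lsb_offset] | (cmos[lsb_offset + 1] << 8)
```

For example, `word_at(cmos, 0x15)` yields the base memory size in 1K increments, and `word_at(cmos, 0x17)` the extended memory size.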
APPENDIX
J
STANDARD I/O PORT ADDRESSING
IBM-compatible personal computer systems communicate with internal and external peripheral devices using a system of industry standard port addresses. System and peripheral
device designers use a variety of decoding techniques to convert these address signals
into a chip select signal that enables communications with a specific peripheral device.
Because the addresses are standardized within the personal computer industry, peripheral
devices from a variety of manufacturers can operate with a variety of personal computers.
New addresses continue to be defined as new peripheral devices become available. This
appendix provides a cross-reference for addressing only. Refer to the peripheral device
data sheet to determine how individual bits are used in a specific application or design.
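The idea of decoding an address into a chip select can be modeled in a few lines. This is a simplified software sketch, not a description of any particular decoder circuit: it assumes the device occupies a power-of-two block of ports and that only the low 10 address bits are decoded, which is common on ISA adapters but is an assumption here. The function name and parameters are hypothetical.

```python
# Sketch: model an address decoder that asserts a chip select when the
# decoded address bits match the device's base address.
# Assumptions: block size is a power of two; 10 address bits are decoded.


def chip_select(address, base, size, address_bits=10):
    """Return True if `address` falls in the device's block as the decoder sees it."""
    decoded = (1 << address_bits) - 1   # address lines the decoder looks at
    mask = decoded & ~(size - 1)        # ignore the lines that pick a register
    return (address & mask) == (base & mask)
```

With an eight-port block at 3F8h, port 3FDh selects the device and 2F8h does not. Under 10-bit decoding, 7F8h also selects it, which illustrates why partially decoded addresses alias; passing `address_bits=16` removes the alias.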
Table J-1 is an I/O address map that includes the most typical peripheral devices.
Table J-1
Standard I/O Port Addresses
I/O Port
Read/Write
Description
000h
R/W
DMA channel 0 address bytes 0 and 1
001h
R/W
DMA channel 0 word count bytes 0 and 1
002h
R/W
DMA channel 1 address bytes 0 and 1
003h
R/W
DMA channel 1 word count bytes 0 and 1
004h
R/W
DMA channel 2 address bytes 0 and 1
005h
R/W
DMA channel 2 word count bytes 0 and 1
006h
R/W
DMA channel 3 address bytes 0 and 1
007h
R/W
DMA channel 3 word count bytes 0 and 1
008h
R
DMA channel 0 – 3 Status Register
W
DMA channel 0 – 3 Command Register
009h
W
DMA channel 0 – 3 Request Register
00Ah
R/W
DMA channel 0 – 3 Mask Register
00Bh
W
DMA channel 0 – 3 Mode Register
00Ch
W
DMA channel 0 – 3 Clear Byte Pointer Flip/Flop
00Dh
R
DMA channel 0 – 3 Temporary Register
00Eh
W
DMA channel 0 – 3 Clear Mask Register
00Fh
W
DMA channel 0 – 3 Write Mask Register
010h – 01Fh
Reserved or not assigned
020h
R
Interrupt Controller 1
Interrupt Request Register (IRR)
or
In-Service Register (ISR)
(as selected by OCW3)
W
Interrupt Controller 1
Initialization Command Word 1 (ICW1) Register (if bit 4 = 1)
or
Operational Command Word 3 (OCW3) Register (if bit 4 = 0 and bit 2 = 1)
021h
R/W
Interrupt Controller 1
Operation Control Word 1 (OCW1) Register (Mask Register)
W
Interrupt Controller 1
Initialization Control Word 2 (ICW2) Register
Initialization Control Word 3 (ICW3) Register
Initialization Control Word 4 (ICW4) Register (if enabled by ICW1)
Operation Control Word 2 (OCW2) Register (if bit 4 = 0 and bit 3 = 0)
022h – 03Fh
Reserved or not assigned
040h
R/W
Programmable Counter/Timer 0
041h
R/W
Programmable Counter/Timer 1
042h
R/W
Programmable Counter/Timer 2
043h
W
Programmable Counter/Timer Control Word Register
044h – 05Fh
Reserved or not assigned
060h
R
Keyboard Controller Data Port or Keyboard Input Buffer
W
Keyboard Output Port
061h
R/W
Port B Control Register
062h – 063h
Reserved or not assigned
064h
R
Keyboard Controller Status Register or Keyboard Input Buffer
W
Keyboard Output Port (alternate)
065h – 06Fh
Reserved or not assigned
070h
R
RTC Register (bits 6 – 0) and NMI Mask (bit 7)
071h
R/W
CMOS RAM Data Register Port
072h – 07Fh
Reserved or not assigned
080h
R
Manufacturing test port (for POST checkpoints)
R/W
DMA Page Register temporary storage
081h
R/W
DMA channel 2 address byte 2
082h
R/W
DMA channel 3 address byte 2
083h
R/W
DMA channel 1 address byte 2
084h
R/W
Additional DMA page register
085h
R/W
Additional DMA page register
086h
R/W
Additional DMA page register
087h
R/W
DMA channel 0 address byte 2
088h
R/W
Additional DMA page register
089h
R/W
DMA channel 6 address byte 2
08Ah
R/W
DMA channel 7 address byte 2
08Bh
R/W
DMA channel 5 address byte 2
08Ch
R/W
Additional DMA page register
08Dh
R/W
Additional DMA page register
08Eh
R/W
Additional DMA page register
08Fh
R/W
DMA refresh page register
090h – 09Fh
Reserved or not assigned
0A0h
R
Interrupt Controller 2
Interrupt Request Register (IRR)
or
In-Service Register (ISR)
(as selected by OCW3)
W
Interrupt Controller 2
Initialization Command Word 1 (ICW1) Register (if bit 4 = 1)
or
Operational Command Word 3 (OCW3) Register (if bit 4 = 0 and bit 2 = 1)
0A1h
R/W
Interrupt Controller 2
Operation Control Word 1 (OCW1) Register (Mask Register)
W
Interrupt Controller 2
Initialization Control Word 2 (ICW2) Register
Initialization Control Word 3 (ICW3) Register
Initialization Control Word 4 (ICW4) Register (if enabled by ICW1)
Operation Control Word 2 (OCW2) Register (if bit 4 = 0 and bit 3 = 0)
0A2h – 0BFh
Reserved or not assigned
0C0h – 0C1h
R/W
DMA channel 4 address bytes 0 and 1
0C2h – 0C3h
R/W
DMA channel 4 word count bytes 0 and 1
0C4h – 0C5h
R/W
DMA channel 5 address bytes 0 and 1
0C6h – 0C7h
R/W
DMA channel 5 word count bytes 0 and 1
0C8h – 0C9h
R/W
DMA channel 6 address bytes 0 and 1
0CAh – 0CBh
R/W
DMA channel 6 word count bytes 0 and 1
0CCh – 0CDh
R/W
DMA channel 7 address bytes 0 and 1
0CEh – 0CFh
R/W
DMA channel 7 word count bytes 0 and 1
0D0h – 0D1h
R
DMA channel 4 – 7 Status Register
W
DMA channel 4 – 7 Command Register
0D2h – 0D3h
W
DMA channel 4 – 7 Request Register
0D4h – 0D5h
R/W
DMA channel 4 – 7 Mask Register
0D6h – 0D7h
W
DMA channel 4 – 7 Mode Register
0D8h – 0D9h
W
DMA channel 4 – 7 Clear Byte Pointer Flip/Flop
0DAh – 0DBh
R
DMA channel 4 – 7 Temporary Register
0DCh – 0DDh
W
DMA channel 4 – 7 Clear Mask Register
0DEh – 0DFh
W
DMA channel 4 – 7 Write Mask Register
0E0h – 0EFh
Reserved or not assigned
0F0h
Math coprocessor clear busy latch
0F1h
Math coprocessor reset
0F2h – 0FFh
R/W
Math coprocessor
100h – 16Fh
Reserved or not assigned
170h
R/W
Hard drive 1 Data Register
171h
R
Hard drive 1 Error Register
W
Hard drive 1 Write Precompensation Register
172h
R/W
Hard drive 1 Sector Count
173h
R/W
Hard drive 1 Sector Number
174h
R/W
Hard drive 1 Cylinder Number (low byte)
175h
R/W
Hard drive 1 Cylinder Number (high byte)
176h
R/W
Hard drive 1 Drive/Head Number
177h
R
Hard drive 1 Status Register
W
Hard drive 1 Command Register
178h – 1EFh
Reserved or not assigned
1F0h
R/W
Hard drive 0 Data Register
1F1h
R
Hard drive 0 Error Register
W
Hard drive 0 Write Precompensation Register
1F2h
R/W
Hard drive 0 Sector Count
1F3h
R/W
Hard drive 0 Sector Number
1F4h
R/W
Hard drive 0 Cylinder Number (low byte)
1F5h
R/W
Hard drive 0 Cylinder Number (high byte)
1F6h
R/W
Hard drive 0 Drive/Head Number
1F7h
R
Hard drive 0 Status Register
W
Hard drive 0 Command Register
1F8h – 1FFh
Reserved or not assigned
200h – 20Bh
R/W
Game controller ports
20Ch – 277h
Reserved or not assigned
278h
R/W
Parallel Port 2 Data Port
279h
R/W
Parallel Port 2 Status Port
27Ah
R/W
Parallel Port 2 Control Port
27Bh
R/W
Reserved or not assigned
27Ch
R/W
Parallel Port 2 Data Port (DUP)
27Dh
R/W
Parallel Port 2 Status Port (DUP)
27Eh
R/W
Parallel Port 2 Control Port (DUP)
27Fh – 2E7h
Reserved or not assigned
2E8h
R
Serial Port 4 Receiver Buffer Register
R/W
Serial Port 4 Divisor Latch Low Byte
2E9h
R/W
Serial Port 4 Interrupt Enable Register
2EAh
R
Serial Port 4 Interrupt ID Register
2EBh
R/W
Serial Port 4 Line Control Register
2ECh
R/W
Serial Port 4 Modem Control Register
2EDh
R
Serial Port 4 Line Status Register
2EEh
R
Serial Port 4 Modem Status Register
2EFh
R/W
Serial Port 4 Scratch Register
2F0h – 2F7h
Reserved or not assigned
2F8h
R
Serial Port 2 Receiver Buffer Register
R/W
Serial Port 2 Divisor Latch Low Byte
2F9h
R/W
Serial Port 2 Interrupt Enable Register
2FAh
R
Serial Port 2 Interrupt ID Register
2FBh
R/W
Serial Port 2 Line Control Register
2FCh
R/W
Serial Port 2 Modem Control Register
2FDh
R
Serial Port 2 Line Status Register
2FEh
R
Serial Port 2 Modem Status Register
2FFh
R/W
Serial Port 2 Scratch Register
300h – 31Fh
Reserved for Prototype Card
320h – 371h
Reserved or not assigned
372h
W
Diskette Drive Controller 2 Digital Output Register
373h
Reserved or not assigned
374h
R
Diskette Drive Controller 2 Status Register
375h
R/W
Diskette Drive Controller 2 Data Register
376h
R
Diskette Drive Controller 2 Control Port
377h
R
Diskette Drive Controller 2 Digital Input Register
W
Diskette Drive Controller 2 Select Register for Data Transfer Rate
378h
R/W
Parallel Port 1 Data Port
379h
R/W
Parallel Port 1 Status Port
37Ah
R/W
Parallel Port 1 Control Port
37Bh
R/W
Hercules-compatibility Configuration Switch Registers
37Ch
R/W
Parallel Port 1 Data Port (DUP)
37Dh
R/W
Parallel Port 1 Status Port (DUP)
37Eh
R/W
Parallel Port 1 Control Port (DUP)
37Fh
Reserved or not assigned
380h
R/W
Bisynchronous Device 2 Port A (8255A-5)
381h
R/W
Bisynchronous Device 2 Port B (8255A-5)
382h
R/W
Bisynchronous Device 2 Port C (8255A-5)
383h
W
Bisynchronous Device 2 Mode Set Register (8255)
384h
R/W
Bisynchronous Device 2 Counter 0 (8253)
385h
R/W
Bisynchronous Device 2 Counter 1 (8253)
386h
R/W
Bisynchronous Device 2 Counter 2 (8253)
387h
R/W
Bisynchronous Device 2 Control Word/Mode Register (8253/5)
388h
R
Bisynchronous Device 2 Status Register (8253)
W
Bisynchronous Device 2 Command Register (8273)
389h
R
Bisynchronous Device 2 Parameter Result (8273)
38Ah
R/W
Bisynchronous Device 2 Transmit INT Status (8273)
38Bh
R/W
Bisynchronous Device 2 Receive INT Status (8273)
38Ch
R/W
Bisynchronous Device 2 Data (8273)
38Dh – 39Fh
Reserved or not assigned
3A0h
R/W
Bisynchronous Device 1 Port A (8255)
3A1h
R/W
Bisynchronous Device 1 Port B (8255)
3A2h
R/W
Bisynchronous Device 1 Port C (8255)
3A3h
W
Bisynchronous Device 1 Mode Set Register (8255)
3A4h
R/W
Bisynchronous Device 1 Counter 0 (8253)
3A5h
R/W
Bisynchronous Device 1 Counter 1 (8253)
3A6h
R/W
Bisynchronous Device 1 Counter 2 (8253)
3A7h
R/W
Bisynchronous Device 1 Control Word/Mode Register (8253/5)
3A8h
W
Bisynchronous Device 1 Data Select (8253/5)
3A9h
R/W
Bisynchronous Device 1 Mode Instruction and Command Instruction (8253/5)
3AAh – 3B3h
Reserved or not assigned
3B4h
R/W
MDA CRTC Index Register
3B5h
R/W
MDA Video CRTC data registers:
Index  Function
00h  Horizontal total
01h  Horizontal displayed
02h  Horizontal sync position
03h  Horizontal sync pulse width
04h  Vertical total
05h  Vertical displayed
06h  Vertical sync position
07h  Vertical sync pulse width
08h  Interleaved mode
09h  Maximum scan lines
0Ah  Cursor start
0Bh  Cursor end
0Ch  Start address (high byte)
0Dh  Start address (low byte)
0Eh  Cursor location (high byte)
0Fh  Cursor location (low byte)
10h  Light pen (high byte)
11h  Light pen (low byte)
12h – FFh  Undefined
3B6h – 3B7h
Reserved or not assigned
3B8h
W
MDA Mode Control Register
3B9h
Reserved or not assigned
3BAh
R
CRT Status Port
3BBh
Reserved or not assigned
3BCh
R/W
Parallel Port 3 Data Port
3BDh
R/W
Parallel Port 3 Status Port
3BEh
R/W
Parallel Port 3 Control Port
3BFh – 3C1h
Reserved or not assigned
3C2h
R
CGA Input Status Register
3C3h
R/W
Video Subsystem Enable
3C4h
R/W
CGA Sequencer Index Register
3C5h
R/W
CGA Sequencer Data Registers
3C6h – 3C9h
Reserved or not assigned
3CAh
R
CGA Feature Control Register
3CBh – 3CFh
Reserved or not assigned
3D0h – 3D1h
R/W
6845 Registers
3D2h – 3D3h
Reserved or not assigned
3D4h
W
CGA Video CRTC index register
3D5h
W
CGA Video CRTC data registers:
Index  Function
00h  Horizontal total
01h  Horizontal displayed
02h  Horizontal sync position
03h  Horizontal sync pulse width
04h  Vertical total
05h  Vertical displayed
06h  Vertical sync position
07h  Vertical sync pulse width
08h  Interleaved mode
09h  Maximum scan lines
0Ah  Cursor start
0Bh  Cursor end
0Ch  Start address (high byte)
0Dh  Start address (low byte)
0Eh  Cursor location (high byte)
0Fh  Cursor location (low byte)
10h  Light pen (high byte)
11h  Light pen (low byte)
12h – FFh  Undefined
3D6h – 3D7h
Reserved or not assigned
3D8h
R/W
CGA Mode Control Register
3D9h
R/W
CGA Palette Register
3DAh
R/W
CRT Status Register
3DBh
W
Clear Light Pen Latch
3DCh
W
Preset Light Pen Latch
3DDh – 3E7h
Reserved or not assigned
3E8h
R
Serial Port 3 Receiver Buffer Register
R/W
Serial Port 3 Divisor Latch Low Byte
3E9h
R/W
Serial Port 3 Interrupt Enable Register
3EAh
R
Serial Port 3 Interrupt ID Register
3EBh
R/W
Serial Port 3 Line Control Register
3ECh
R/W
Serial Port 3 Modem Control Register
3EDh
R
Serial Port 3 Line Status Register
AMD
Table J-1
Standard I/O Port Addresses (continued)
I/O Port
Read/Write
Description
3EEh          R     Serial Port 3 Modem Status Register
3EFh          R/W   Serial Port 3 Scratch Register
3F0h – 3F1h         Reserved or not assigned
3F2h          W     Diskette Drive Controller 1 Digital Output Register
3F3h                Reserved or not assigned
3F4h          R     Diskette Drive Controller 1 Status Register
3F5h          R/W   Diskette Drive Controller 1 Data Register
3F6h          R     Diskette Drive Controller 1 Control Port
3F7h          R     Diskette Drive Controller 1 Digital Input Register
              W     Diskette Drive Controller 1 Select Register for Data Transfer Rate
3F8h          R     Serial Port 1 Receiver Buffer Register
              R/W   Serial Port 1 Divisor Latch Low Byte
3F9h          R/W   Serial Port 1 Interrupt Enable Register
3FAh          R     Serial Port 1 Interrupt ID Register
3FBh          R/W   Serial Port 1 Line Control Register
3FCh          R/W   Serial Port 1 Modem Control Register
3FDh          R     Serial Port 1 Line Status Register
3FEh          R     Serial Port 1 Modem Status Register
3FFh          R/W   Serial Port 1 Scratch Register
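The serial port entries in Table J-1 follow a regular pattern: each port occupies eight consecutive I/O addresses starting at a base address (3F8h for Serial Port 1, 3E8h for Serial Port 3). The following Python sketch restates that pattern. The names UART_BASE, UART_REGISTER_OFFSET, and uart_register_port are illustrative only and are not part of any AMD or BIOS interface.

```python
# Base I/O addresses taken from Table J-1.
UART_BASE = {
    "Serial Port 1": 0x3F8,
    "Serial Port 3": 0x3E8,
}

# Offset of each register from the base address, as listed in Table J-1.
UART_REGISTER_OFFSET = {
    "Receiver Buffer": 0,
    "Divisor Latch Low Byte": 0,   # shares the base address with the receiver buffer
    "Interrupt Enable": 1,
    "Interrupt ID": 2,
    "Line Control": 3,
    "Modem Control": 4,
    "Line Status": 5,
    "Modem Status": 6,
    "Scratch": 7,
}


def uart_register_port(port_name: str, register: str) -> int:
    """Return the I/O address of a serial port register (hypothetical helper)."""
    return UART_BASE[port_name] + UART_REGISTER_OFFSET[register]


if __name__ == "__main__":
    address = uart_register_port("Serial Port 1", "Line Status")
    print(f"{address:03X}h")
```

The helper only computes addresses; actually reading or writing a port requires IN and OUT instructions executed with sufficient I/O privilege.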
CHAPTER
GLOSSARY
Abort
An unrecoverable exception.
Address
See I/O Address, Logical Address, Linear Address, and Physical Address.
Address Line
A signal line that is part of an address bus. For Am486 CPU-based systems, the
bus uses 32 address lines to connect to memory or devices on the I/O bus. The
processor uses the M/IO signal to specify whether the microprocessor is addressing memory or an I/O device.
Address Space
The range of addressable memory locations.
AddressSize Prefix
Optional programming code used before an instruction that defines the size of
address offsets, which can be 16 or 32 bits in length. The D bit in the code
segment descriptor defines a default AddressSize, but the prefix overrides that default.
Address Translation
Remapping of memory locations that allows the same physical memory address
space to be used by multiple applications. Segmentation and paging use address
translation to protect memory locations from being overwritten. Paging uses the
Present bit to swap data between disk storage and memory, expanding the translation capability.
Alignment
The placement of code or data on a 2-, 4-, 8-, 16-, or 32-byte boundary depending
on the operand or cache-line size.
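As a minimal sketch of how alignment can be tested, the following Python functions check whether an address lies on a given boundary and round an address up to the next boundary. The function names are illustrative, and the boundary is assumed to be a power of two, as in the sizes listed above.

```python
def is_aligned(address: int, boundary: int) -> bool:
    """Return True if address lies on a boundary-byte boundary.

    Assumes boundary is a power of two (2, 4, 8, 16, or 32).
    """
    return address & (boundary - 1) == 0


def align_up(address: int, boundary: int) -> int:
    """Round address up to the next boundary-byte boundary."""
    return (address + boundary - 1) & ~(boundary - 1)


if __name__ == "__main__":
    print(is_aligned(0x1000, 16))
    print(hex(align_up(0x1001, 16)))
```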
Application Program
A higher level user program generally assigned the highest privilege number and
lowest privilege level. Application programs require an operating system and, for
some applications, an interface program, such as Microsoft Windows, to run
correctly.
ASCII
American Standard Code for Information Interchange. An international standard
for coding text characters that uses 7 or 8 bits per character. The standard set
of characters uses the first 128 value combinations (0 to 127 decimal); some
older serial communication protocols only used 7 bits of data (bits 6–0) per byte,
reserving the top bit (7) for control purposes. The extended ASCII character set
uses all eight bits per byte and assigns 128 additional characters for the values
128 to 255 decimal.
Base Address
A defined address that indicates the beginning of a data structure or table in
memory. Using a base address allows greater flexibility to locate and access
segments, descriptor tables, pages, page tables, and for input/output devices,
configuration tables.
Base Register
A register that stores a base address for a set of data. Data within the data set
is addressed via offsets from the address in the base register.
Baud
A variable unit of measure used for serial transmission of binary data across data
lines; usually equal to one bit per second.
Biased Exponent
The form of the exponent used by the floating-point unit. The biased exponent is
interpreted as an unsigned, positive number. The value is computed by adding
a constant (the bias) to the true exponent of the real number. To get the true
exponent for a non-zero number, subtract the bias for the precision level (127 for
single, 1023 for double, and 16383 for extended) from the value in the exponent
field.
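The subtraction described above can be illustrated with a short Python sketch that extracts the exponent field from single- and double-precision values and removes the bias. The function names are illustrative; the field positions (8 exponent bits above a 23-bit significand for single, 11 above 52 for double) are those of the standard IEEE formats and are an assumption not stated in this glossary entry. The result is meaningful only for normalized, non-zero numbers.

```python
import struct


def true_exponent_single(value: float) -> int:
    """Return the unbiased exponent of a single-precision value."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    biased = (bits >> 23) & 0xFF      # 8-bit biased exponent field
    return biased - 127               # single-precision bias


def true_exponent_double(value: float) -> int:
    """Return the unbiased exponent of a double-precision value."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    biased = (bits >> 52) & 0x7FF     # 11-bit biased exponent field
    return biased - 1023              # double-precision bias


if __name__ == "__main__":
    print(true_exponent_single(8.0))
    print(true_exponent_double(0.5))
```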
Binary
A number system based on the value of two. It is the system used by computers
at a circuit level because the basic computer circuit has two states, On and Off,
that are interpreted as numerical 1’s and 0’s.
Binary Coded Decimal (BCD)
A method of representing base 10 (decimal) numbers using binary encoding.
Each decimal digit uses four bits; the values 1010 through 1111 are not used.
Standard BCD format encodes the four bits as part of a byte, ignoring the upper
four bits. The Am486 microprocessor floating-point unit supports a fixed-length
(18 digits) packed BCD format that stores two decimal digits per byte, using
both the lower and upper four bits of the byte.
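The following Python sketch shows one way to build and read an 18-digit packed BCD value. It assumes the 10-byte layout used by the x87 packed decimal format (least significant digit pair in the first byte, sign in bit 7 of the last byte); that layout is not spelled out in this glossary entry, and the function names are illustrative.

```python
def to_packed_bcd(value: int) -> bytes:
    """Encode an integer as 10 bytes of packed BCD (assumed x87 layout)."""
    if abs(value) >= 10**18:
        raise ValueError("packed BCD holds at most 18 decimal digits")
    digits = f"{abs(value):018d}"             # most significant digit first
    out = bytearray()
    for i in range(16, -2, -2):               # walk digit pairs, least significant first
        high, low = int(digits[i]), int(digits[i + 1])
        out.append((high << 4) | low)         # two digits per byte
    out.append(0x80 if value < 0 else 0x00)   # sign byte
    return bytes(out)


def from_packed_bcd(data: bytes) -> int:
    """Decode 10 bytes of packed BCD back into an integer."""
    value = 0
    for byte in reversed(data[:9]):
        value = value * 100 + (byte >> 4) * 10 + (byte & 0x0F)
    return -value if data[9] & 0x80 else value


if __name__ == "__main__":
    encoded = to_packed_bcd(1234)
    print(encoded.hex())
    print(from_packed_bcd(encoded))
```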
Binary Integer
A whole number represented in the binary (base 2) form using only the symbols
0 and 1.
Binary Point
The binary equivalent of the decimal point in real number format.
BIOS
Basic Input/Output System. The system drivers that define the default system
handlers for system interrupts and exceptions. The BIOS software is stored on
a static memory device, such as ROM or Flash memory, that retains data with
no power supplied. Whenever the microprocessor is reset to its initial state, it
begins operation by reading the BIOS into memory and executing power-on self-tests, loading the vector addresses into low memory, and loading the handlers
into memory at the vector-referenced locations. When the tests are complete and
the vector addresses and handlers are loaded into memory, the BIOS relinquishes
control of the computer to the operating system software.
Bit
The smallest unit of information storage in a computer system. The basic computer circuit used to represent logical values has two states: On and Off. The
output from this circuit is typically interpreted as a logical 1 for On and a logical
0 for Off. Multiple bits are read in parallel in groups of 8 (called a byte), 16 (called
a word), 32 (called a doubleword or dword), and 64 (called a quadword or qword).
Although the actual number representations of the bits are in binary form (base
2 number system), typically users and programmers read the bits in sets of four
using the hexadecimal number system (base 16), which is closer to the more
common base ten and easier to evaluate mathematically than long binary strings.
Bit Field
A sequence of 1 to 32 bits starting at any position in a byte address.
Bit String
A sequence of 1 to 2^32 – 1 bits starting at any position in a byte address.
Boot
The common term for restarting a computer, shortened from the “bootstrap” routines required to start older mainframe computers. Personal computers typically
use a change-of-state of the POWER GOOD signal from the computer power
supply (COLD BOOT) or the keyboard controller (WARM BOOT) to initiate a
microprocessor reset. Applying a signal to the RESET input of the microprocessor
causes it to return to a known state and initialize by reloading the BIOS and
restarting system operations.
Breakpoint
A defined address range used by debugging handlers to trap information for
evaluation by designers and programmers. Four breakpoints can be defined in
the Debug Registers. The user can specify the breakpoint for a particular form
of memory access.
Bus
A set of signal lines that transmit electronic signal sets between devices in a
computer. This can be a data bus that transmits data and code between components or an address bus that selects memory locations or system devices. Some
devices use a single address/data bus that alternates transmissions between addressing and data/code. Other control signals determine how the system interprets the bus information.
Bus Speed
The clock speed used to transmit data across a system bus.
Byte
8 bits. Memory and disk storage capacities are normally defined in bytes.
C3 – C0
The condition code bits in the floating-point unit (FPU) Status Word. These bits
define the status of the outcome of some of the FPU instructions.
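As a minimal sketch, the condition code bits can be pulled out of a Status Word value as shown below. The bit positions (C0 in bit 8, C1 in bit 9, C2 in bit 10, and C3 in bit 14) are taken from the x87 Status Word layout and are an assumption not stated in this glossary entry; the function name is illustrative.

```python
def condition_codes(status_word: int) -> dict:
    """Extract C3-C0 from a 16-bit FPU Status Word value."""
    return {
        "C0": (status_word >> 8) & 1,
        "C1": (status_word >> 9) & 1,
        "C2": (status_word >> 10) & 1,
        "C3": (status_word >> 14) & 1,
    }


if __name__ == "__main__":
    print(condition_codes(0x4100))
```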
Cache
Special fast memory that can be both internal and external to the microprocessor.
The cache memory retains copies of the most recently read memory contents
for quick reaccess by the microprocessor.
Cache Flush
Clearing the cache memory by forcing the microprocessor to read from system
memory and overwrite the contents before accessing the cache.
Cache Hit
A request for data that is available in the cache memory.
Cache Line
The smallest unit of cache storage. The internal cache of the Am486 microprocessor has a 128-bit cache line size.
Cache Miss
A request for data that is not available in the cache memory, requiring a read
from the system memory.
Call Gate
A gate descriptor used by a CALL, JMP, Jcc, or LOOPcc instruction.
Cascade
A method of linking controller circuits that can only input one value into the microprocessor. For example, an interrupt controller evaluates up to eight input
signals, prioritizes them, and presents one interrupt signal at a time to the microprocessor. By taking a second similar controller and tying its output to one of the
eight inputs to the first controller, you can process 15 interrupt signals. DMA
controllers use a similar scheme to expand the DMA channel processing
capability.
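The arithmetic behind the cascade scheme can be sketched as follows: each secondary controller consumes one input of the primary controller and contributes its own full set of inputs. The function name and parameters are illustrative.

```python
def cascaded_inputs(inputs_per_controller: int, secondary_controllers: int) -> int:
    """Return the number of usable inputs when secondary controllers
    are tied to inputs of a single primary controller."""
    if not 0 <= secondary_controllers <= inputs_per_controller:
        raise ValueError("each secondary controller needs one primary input")
    # Inputs of the primary controller that remain free for devices.
    primary_free = inputs_per_controller - secondary_controllers
    return primary_free + secondary_controllers * inputs_per_controller


if __name__ == "__main__":
    print(cascaded_inputs(8, 1))   # two 8-input interrupt controllers
    print(cascaded_inputs(4, 1))   # two 4-channel DMA controllers
```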
CD-ROM
A data storage method that uses laser technology and an encoded disk to store
digital data. The method uses the same technology as music recording.
CGA
Color Graphics Adapter. The earliest color display controller that supported four-color graphic displays.
Clocking
The method by which data is transferred and sampled in a digital circuit. Data is
detected when a voltage level changes state (from 1 to 0, or 0 to 1); the clock
signal causes this change. The clock pulse rate determines how quickly a digital
circuit can move between sets of information. Typically, as newer microprocessor
clock rates increased, constraints (signal loss) on the expansion buses required
that a separate, slower clocking rate be used for I/O data transfers. Newer bus
designs (VESA LB and PCI) allow the buses to transfer data using the same input
clock and, therefore, clocking rate as the microprocessor.
CMOS Memory
Although the term refers to a specific memory manufacturing type, Complementary Metal-Oxide Semiconductor, this phrase commonly refers to the battery-backed up memory associated with the Real-Time Clock Circuit that stores the
basic computer configuration data, such as memory size and type, drive sizes
and types, video interface, etc. used by the computer when it starts to select the
correct BIOS handlers and parameter values.
Code
Also known as instructions or instruction code. A type of binary information transmitted to a processing device that activates specific functionality within the processing device.
Code Descriptor
The data table in memory that defines the type of information in the specified
code segment.
Code Segment
An address space containing instructions; also called an executable segment.
An instruction fetch cycle must address a code segment.
COM
The DOS-assigned name for a serial port. In later versions of DOS, a port can
be designated as COM1, COM2, COM3, or COM4. The name implies a specific
I/O address for the set of operational registers associated with the serial port.
Command
A user-entered instruction name, typically associated with an operating system
or command line driven application program. Typically used DOS commands
include COPY, MAKEDIR, DEL, and so forth.
Condition Code
See C3 – C0.
Configuration
The specific details used by a computer to interact correctly with its built-in and
installed devices. The configuration information may be read from hardware (such
as ROM-based information or specific jumper or switch settings) or may be stored
in configuration files, such as those maintained in the CMOS battery-backed up
memory.
Conforming Segment
A code segment that executes with the Requested Privilege Level (RPL) of the
segment selector or the Current Privilege Level (CPL) of the calling program,
whichever has a lower privilege level (higher value).
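Because a lower privilege level corresponds to a higher number, the rule above amounts to taking the numerically larger of the two levels. The following sketch assumes the four privilege levels 0 through 3; the function name is illustrative.

```python
def effective_privilege_level(rpl: int, cpl: int) -> int:
    """Return the less privileged (numerically higher) of RPL and CPL."""
    for level in (rpl, cpl):
        if not 0 <= level <= 3:
            raise ValueError("privilege levels range from 0 to 3")
    return max(rpl, cpl)


if __name__ == "__main__":
    print(effective_privilege_level(rpl=3, cpl=0))
```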
Control Word
A 16-bit register used by the floating-point unit (FPU). The user can define which
modes the FPU uses and the interrupts that are enabled.
Controller
An electronic device or circuit used to provide a hardware interface between the
microprocessor and other system devices. Examples include drive controllers,
keyboard controllers, and video controllers.
CPU
Central Processor Unit. See microprocessor.
Current Privilege Level (CPL)
The privilege level assigned to the currently executing program. Typically, the
level is the Descriptor Privilege Level of the code segment descriptor assigned
to the program. If, however, execution has been transferred to a conforming code
segment (in which case the CPL is carried from the previous execution), the CPL
may be different from the current DPL assigned to the executing code.
Data Line
One of the individual signal connections in the Data Bus (see Bus).
Data Segment
An address space containing data. The microprocessor provides four segment
registers (DS, ES, FS, and GS) to access data segments. The respective segment
descriptors describe the type of information stored in each segment.
Data Structure
A memory area defined for particular use by hardware or software, such as a
page table or task state segment (TSS).
Debug Registers
A set of registers used to define hardware breakpoints for debugging.
Decimal Integer
A whole number represented in BCD form.
Descriptor Privilege Level (DPL)
The privilege level assigned to a segment through the DPL field in the segment
descriptor.
Descriptor Table
An array of segment descriptors. The Global Descriptor Table (GDT) defines the
overall memory layout. The Local Descriptor Tables (LDT) define individual memory segments.
Device Driver
A special program designed to manage the interface between the microprocessor
and a peripheral device (such as a video adapter).
Dirty Bit
A bit used when the microprocessor is set for Write-Back cache mode to indicate
that the microprocessor wrote to the cache, but that the new value has not yet
been written to memory. When the new data is transferred to memory, the microprocessor resets the dirty bit (to 0).
Disk
A data storage medium with embedded data recording tracks. Hard drives use
one or more of these disks.
Diskette
A portable magnetic data medium that fits into a diskette drive. Like the disk, the
diskette uses a single metal disk embedded with data recording tracks, but it is
stored in a plastic sleeve or housing that protects the diskette contents. After
recording data on the diskette, you can lock its contents mechanically to make
the diskette read-only.
Diskette Drive
A data storage device that uses removable diskettes to store and read data. It has
a drive motor, a set of read and write heads, and a mechanism (stepping
motor/actuator) to move the heads across the diskette surfaces. Typically, diskette
drives support 3-1/2" (720 Kbytes, 1.44 Mbytes, or 2.88 Mbytes of storage) and/
or 5-1/4" (360 Kbytes or 1.2 Mbytes of storage) diskettes.
Displacement
A constant used to calculate an effective address. A displacement modifies the
address independently of any scaled indexing. Displacement is often used to
indicate the address of operands that have a fixed relation to some other address,
such as a base address or a record field in an array.
DMA
Direct Memory Access. A method of buffering information between the I/O bus,
which typically uses slower clocking, and the memory bus connected directly to
the microprocessor.
DMA Channels
The circuits in a DMA controller that allow it to handle multiple devices. A typical
DMA controller provides four channels for data transfer. By using a cascade
approach, two controllers can support seven DMA channels. Typically, the first
controller supports 8-bit transfers and the second controller supports 16-bit transfers.
DMA Controller
A device that provides the interface between I/O devices that require DMA support
to transfer data between the I/O bus and the memory bus.
Doubleword (dword)
32 bits.
DRAM
Dynamic Random Access Memory. Memory that can be accessed and programmed by a computer system, but loses the last value written to the memory
if it loses power. This form of memory is less expensive than SRAM, but it is slower
and uses additional power through the refresh cycles required to maintain its
contents.
Driver
A software program that allows the operating system software to communicate
with a specific device, such as a video circuit or printer.
Effective Address
The results of a calculated address. The calculation method depends on the
addressing method used.
EFLAGS
The Extended FLAGS register added by 32-bit processors. The FLAGS register
is embedded in the lower 16 bits of EFLAGS. The flag bits in the upper word of
the register add functions required by the 32-bit (386- and 486-type) and higher
level processors.
EGA
Enhanced Graphics Adapter. A color video controller that supports 16-color
graphic displays.
EISA
Extended Industry Standard Architecture. Standard for personal computer
expansion slots. The EISA slots use the same basic footprint as ISA slots on a
motherboard, but provide two levels of contacts that allow expansion of the data
bus from 16 bits to 32 bits.
Enable
Make available for use by the computer. Typically, this term is used in relation to
turning on a specific functionality, such as a serial or parallel port or interrupt
capability.
Exception
A fault, trap, abort, or software-initiated interrupt that causes the microprocessor
to execute a recovery subroutine. Exception causes can include dividing by zero,
stack overflow, undefined opcode, and memory protection violation.
Far Pointer
A memory reference that includes both a segment selector and an offset value.
Fault
An exception reported at the instruction boundary before the instruction that
generates the exception. After the exception handler fixes the source of the
exception, such as a segment or page not present in memory, execution restarts.
Fax/Modem
A serial device that supports the transmission and receipt of document facsimiles
and other serial communications via telephone transmission lines.
FDC
Floppy Drive Controller. A device that controls the data interface between the
computer system and a diskette drive.
Firmware
A program or set of programs recorded on a static memory device (such as ROM
or Flash RAM) that is required for system or device functionality. The system level
firmware is the BIOS. Essentially, because it is physically installed, firmware is a
piece of hardware that performs software functions.
FLAGS
A status register developed by the x86 microprocessor family. The FLAGS register stores a set of 1-bit system, status, and control flags.
Flag
A bit whose value reflects the status of the computer system or the result of a
particular operation, such as the Zero Flag (ZF) or Carry Flag (CF) in the EFLAGS
or FLAGS register.
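A flag is tested by masking its bit out of the register value. The sketch below assumes the x86 bit positions for the two flags named above (CF in bit 0 and ZF in bit 6), which this glossary entry does not state; the constant and function names are illustrative.

```python
CF_MASK = 1 << 0   # Carry Flag, bit 0 (assumed x86 bit position)
ZF_MASK = 1 << 6   # Zero Flag, bit 6 (assumed x86 bit position)


def flag_is_set(flags: int, mask: int) -> bool:
    """Return True if the flag selected by mask is set in flags."""
    return bool(flags & mask)


if __name__ == "__main__":
    flags = 0x0041
    print(flag_is_set(flags, ZF_MASK), flag_is_set(flags, CF_MASK))
```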
Flash Memory
A reprogrammable type of long-term memory storage device used to store programs required by systems at startup or reset, such as BIOS, that does not require
continuous power or refresh to maintain its contents.
Flat Model
A memory management scheme in which all six segments are mapped to the
same linear address range. Essentially, this scheme eliminates segmentation.
Floating-Point Unit (FPU)
The part of the Am486 processor that contains the FPU registers and performs
the operations requested by the floating-point instructions.
Gate Descriptor
The segment descriptor that can be the destination of a CALL, JMP, Jcc, or
LOOPcc instruction. You can also use a gate descriptor to invoke a procedure or
task at another privilege level. The four types of gate descriptors are: call gates,
trap gates, interrupt gates, and task gates.
General Register
The Am486 microprocessor supports eight 32-bit general registers: EAX, EBX,
ECX, EDX, EBP, EDI, ESI, and ESP. You can access the lower word in each of
these registers as eight 16-bit registers: AX, BX, CX, DX, BP, DI, SI, and SP. In
addition, you can access the high (H) and low (L) bytes of the first four 16-bit
registers as eight 8-bit registers: AH, AL, BH, BL, CH, CL, DH, and DL.
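The relationship between a 32-bit register and its 16-bit and 8-bit parts can be sketched with masks and shifts. The example uses EAX; the function name is illustrative.

```python
def sub_registers(eax: int) -> dict:
    """Return the AX, AH, and AL views of a 32-bit EAX value."""
    ax = eax & 0xFFFF          # lower word of EAX
    return {
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,  # high byte of AX
        "AL": ax & 0xFF,         # low byte of AX
    }


if __name__ == "__main__":
    views = sub_registers(0x12345678)
    print({name: hex(value) for name, value in views.items()})
```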
Global Descriptor Table (GDT)
An array of segment descriptors for all programs in a system. There is only one
GDT in a system.
Graphical Interface
A user program, such as Microsoft Windows, that provides icons that, when accessed by a mouse or other pointing device, initiate execution of the programs referenced by the icons.
Handler
A program called as a result of an exception or interrupt.
Hard Drive
A data storage device that uses one or more disks, a drive motor to spin the disks,
a set of read and write heads, and a mechanism (stepping motor/actuator) to
move the heads across the disk surfaces.
Hertz (Hz)
The unit of frequency used to describe clock speeds. It is equal to 1 cycle per
second.
Hexadecimal
A number system based on the number 16. This system uses the standard decimal digits 0 through 9 and adds the alphabetic characters A to F to provide 16
symbols. This document indicates numbers in the hexadecimal form by adding
“h” after the number (e.g., 007Fh).
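As a small illustration of the notation, the following sketch converts a number written with the trailing "h" into its decimal value. The function name is illustrative.

```python
def parse_hex(text: str) -> int:
    """Convert a number written in the manual's 'h' notation to an integer."""
    if not text.lower().endswith("h"):
        raise ValueError("expected a trailing 'h'")
    return int(text[:-1], 16)


if __name__ == "__main__":
    print(parse_hex("007Fh"))
```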
Icon
A simple picture representation of a function or program used by a Graphical
Interface package, such as Microsoft Windows, to initiate program execution.
IDE
Integrated Drive Electronics. A type of hard drive in which the controller electronics are built into the drive.
Immediate Operand
Data encoded into the instruction.
Index
A number used to access a table. An index is scaled (multiplied by shifting left)
to account for the size of the operand. The scaled index is added to the base
address of the table to get the address of the table entry.
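The scaling described above can be sketched as a shift followed by an addition. The function name is illustrative, and the operand size is assumed to be 1, 2, 4, or 8 bytes so that the scaling is a left shift.

```python
def table_entry_address(base: int, index: int, operand_size: int) -> int:
    """Return base + index scaled by operand_size (a power of two)."""
    if operand_size not in (1, 2, 4, 8):
        raise ValueError("operand size must be 1, 2, 4, or 8 bytes")
    shift = operand_size.bit_length() - 1   # 1 -> 0, 2 -> 1, 4 -> 2, 8 -> 3
    return base + (index << shift)


if __name__ == "__main__":
    print(hex(table_entry_address(0x1000, 3, 8)))
```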
Input Device
A device used by the operator to input data into a personal computer. The keyboard was the first basic input device used with the personal computer, but with
the development of graphical interface software, the user can now input data
using a mouse or other pointing device (trackball, for example). The newest
developments in input devices include handwriting recognition devices and voice
recognition devices.
Instruction
A set of encoded symbol sequences that cause a microprocessor to perform a
requested function. At the lowest level, an instruction can be a variable length
binary code fed into the microprocessor on the data bus. Typically programmers
enter the code as hexadecimal or alphanumeric values, which are recoded (compiled) to the binary level required by the microprocessor. For example, to add two
numbers, 4 and 3, a programmer might have to load the first number into a register
(MOV AL, 4) and then add the second number (ADD AL, 3). A compiler recodes
these instructions as:
1100 0110 1100 0000 0000 0100 (MOV AL, 4), and
1000 0000 1100 0000 0000 0011 (ADD AL, 3).
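Programmers usually read such encodings in hexadecimal rather than binary. The following sketch groups the binary strings shown above into bytes; the function name is illustrative.

```python
def bits_to_bytes(bit_string: str) -> bytes:
    """Convert a string of binary digits (spaces allowed) into bytes."""
    bits = bit_string.replace(" ", "")
    if len(bits) % 8 != 0:
        raise ValueError("bit string must contain whole bytes")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    for encoding in ("1100 0110 1100 0000 0000 0100",
                     "1000 0000 1100 0000 0000 0011"):
        print(bits_to_bytes(encoding).hex(" ").upper())
```

In both encodings the last byte is the immediate operand (04h and 03h).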
Integer
A number (positive, negative, or zero) that is finite and has no fractional part.
Interrupt
A forced transfer of program control to a handler. In a personal computer system,
the signal can come from the interrupt controller (hardware generated) or be
induced by the INT instruction.
Interrupt Controller
A device that provides an interface between peripheral devices requiring service
and the microprocessor. An interrupt is a hardware signal sent by the device to
communicate with the microprocessor. Because the microprocessor only has one
device interrupt line, the controller must handle and prioritize the multiple inputs
and generate only a single signal at a time. Typically, an interrupt controller can
handle up to eight interrupt lines, but by using a cascade scheme, you can combine two controllers to support up to 15 interrupt lines (although several are
reserved for system use). Typically, users refer to an interrupt as an IRQ, such
as IRQ4 which is typically used as the interrupt line for the COM1 serial port. The
interrupt controller tells the microprocessor the interrupt type by transmitting a
vector number associated with the correct handler.
Interrupt Descriptor Table (IDT)
An array of gate descriptors that invoke exception and interrupt handlers.
Interrupt Gate
A gate descriptor used to invoke a specific interrupt handler. An interrupt gate is
different from a trap gate only in its effect on the IF flag. An interrupt gate clears
IF for the duration of the handler.
Invalid Operations
The general exception condition for the FPU that includes stack overflow, stack
underflow, NaN inputs, illegal infinite inputs, out-of-range inputs, and inputs in
unsupported formats.
I/O Address
A combination of signal values fed across the Address Bus to initiate contact with
an input/output device on the bus. Typically, the device uses an input decode
circuit to evaluate the address lines and generate a single chip select signal to
activate the device connection.
I/O Device
A device that performs input and/or output functions in the personal computer
system. The device may be part of a supporting chipset on the motherboard, on
an expansion card installed in a slot on the motherboard, or, in some newer
designs, integrated into the microprocessor chip.
ISA Bus
Industry Standard Architecture. Standard for personal computer expansion
slot connections. The original specification was designed by IBM to support the
8-bit external bus on the early x86-based machines. When later processors expanded the bus to a 16-bit interface, the standard added contacts in a separate
interface connector to provide the additional data lines.
Jcc
A conditional JUMP instruction. A JUMP instruction that occurs only if the condition specified by the Jcc instruction is true.
JUMP
A programming instruction that causes the processor to stop executing instructions consecutively within a set of sequential address locations and transfer operation to an instruction at the address specified by the JUMP instruction.
Kbyte
1024 bytes.
Keyboard
An input device that is based on a typewriter keyboard. The signals from the
keyboard report the x-y location of the key within the keyboard matrix, along with
the current status of other control keys (such as Num Lock, Shift, Ctrl, Alt, etc.).
The keyboard controller uses translation tables to convert the input into a character set recognized by the computer system. Early keyboards used an 85-key
layout, but newer keyboards implement some variation of the 101-key (102-key
in Europe) layout. Some keyboards used with portable computers implement the
101-key layout using fewer physical keys by implementing an embedded keyboard concept that overlays the functionality of the numeric keypad on the standard QWERTY layout.
Keyboard Controller
The device that provides an interface between a keyboard and the computer
system. Typically, the keyboard controller is a general purpose 8-bit processor.
Newer controllers add additional functionality including power management support and PS/2 mouse and/or trackball support.
Limit Checking
One of the five protection checks provided by segmentation. All segment descriptors include a 16-bit limit value that sets the lower or upper segment limit (depending on whether the Direction Flag is set to forward or reverse accessing).
Linear Address
A 32-bit address in a large unsegmented address space. If paging is enabled,
the linear address is translated into a physical address. If paging is disabled, the
linear address is the physical address.
Local Bus
A 32-bit expansion bus that uses the microprocessor input clock for data clocking.
This provides higher transfer rates than allowed by the standard ISA bus.
Local Descriptor Table (LDT)
An array of segment descriptors used by a particular program. A program may
have a unique LDT, share an LDT with other programs, or have no LDT (it uses
the GDT only).
LOCK
An optional instruction prefix used with selected string operations that invokes
the LOCK signal. This prefix can reduce required clock counts in some situations.
Logical Address
Computed from a 16-bit segment selector and a 32-bit offset. The segment selector specifies an independent, protected address space. The offset defines an
address within that segment. The segmentation handling in the Am486 processor
converts the logical address to a linear address.
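The conversion can be sketched as a table lookup followed by an addition. In the sketch below, the descriptor table is reduced to a hypothetical dictionary that maps a descriptor index to a segment base address, and the index is assumed to occupy the upper 13 bits of the selector (the x86 selector layout, not stated in this entry). Limit and privilege checks are omitted.

```python
def linear_address(selector: int, offset: int, segment_bases: dict) -> int:
    """Convert a logical address (selector:offset) to a linear address.

    segment_bases is a hypothetical stand-in for a descriptor table,
    mapping descriptor index -> 32-bit segment base address.
    """
    index = selector >> 3                    # drop the low 3 selector bits
    base = segment_bases[index]
    return (base + offset) & 0xFFFFFFFF      # linear addresses are 32 bits


if __name__ == "__main__":
    bases = {1: 0x00000000, 2: 0x00100000}
    print(hex(linear_address(0x10, 0x1234, bases)))
```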
LOOPcc
Conditional Loop Instruction. A loop that repeats until the specified condition is
satisfied.
LPT
The DOS-assigned name for a parallel port. Depending on system design and
BIOS support, a port can be designated as LPT1, LPT2, or LPT3. The name
implies a specific I/O address for the set of operational registers associated with
the parallel port.
Mask/Masking
By setting a bit in a control register, you can mask (disable) a particular function.
For example, you can mask the six FPU exceptions through the FPU control word.
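As a minimal sketch of masking, the following code sets and clears exception mask bits in an FPU control word value. The bit assignments (bits 0 through 5 of the control word) follow the x87 layout and are an assumption not stated in this glossary entry; the names are illustrative.

```python
# Exception mask bits in the FPU control word (assumed x87 layout).
FPU_EXCEPTION_MASKS = {
    "invalid operation": 0x01,
    "denormalized operand": 0x02,
    "zero divide": 0x04,
    "overflow": 0x08,
    "underflow": 0x10,
    "precision": 0x20,
}


def mask_all_exceptions(control_word: int) -> int:
    """Set all six mask bits, disabling the six FPU exceptions."""
    return control_word | 0x3F


def unmask_exception(control_word: int, name: str) -> int:
    """Clear one mask bit, enabling the named exception."""
    return control_word & ~FPU_EXCEPTION_MASKS[name] & 0xFFFF


if __name__ == "__main__":
    cw = mask_all_exceptions(0x0340)
    print(hex(cw), hex(unmask_exception(cw, "zero divide")))
```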
Mbyte
1,048,576 bytes.
Memory
Electronic circuits used to store binary data for use by the microprocessor and
other devices in a computer system.
Memory Management
A method of controlling access to memory. The Am486 microprocessor allows
you to address memory directly or indirectly and provides two basic protection
methods: segmentation and paging. Segmentation allows you to divide memory
into independent and protected address spaces. Paging allows you to increase
the virtual memory size by swapping data between memory and disk storage.
MGA
Monochrome Graphics Adapter. An early video display type that supported one-color graphic displays.
Microprocessor
The main execution device in a personal computer that executes instructions.
Modem
Modulator-Demodulator. A circuit that encodes and decodes serially transmitted
digital data, typically across telephone communication lines.
modR/M byte
A byte following an instruction opcode that specifies instruction operands.
Monographics
One-color video displays.
Motherboard
A printed circuit board that includes a microprocessor (or at least a socket for a
microprocessor), memory (or memory sockets), and circuits to link other devices
to the microprocessor and memory. Typically, the motherboard has expansion
slots or other interface devices or connectors to allow you to expand its basic
functionality. With the continuing growth of integrated circuits in microchips, the
motherboard contains a greater array of components including support circuitry
(chipsets) that handles the basic I/O interface requirements, including: interrupt
and DMA control as well as serial and parallel ports, keyboard control, drive
control (diskette, hard drive, CD-ROM drive, etc.), video control, and so forth.
Mouse
A hand-held pointing device that connects to the computer through a special bus
device, a serial port, or a special PS/2 type interface. The mouse movements
(translated by the movement of a ball on the underside of the mouse against
electronic switches into an electronic input into the computer) control the position
of the cursor on the video display. The buttons on the mouse allow the user to
execute programs through icons on the display.
Multimedia
Developments in computer technology that allow the computer system to incorporate graphics, audio, video, and animation. The expanded storage space provided by CD-ROM technology has made the large amounts of data required for
multimedia programming available.
Multisegmented Model
A memory organization in which different segments are mapped to different ranges of linear addresses. This protects data structures from damage caused by
program execution errors.
NaN
Abbreviation for “Not a Number”. This floating-point quantity does not represent
any numeric or infinite quantity.
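As a brief illustration, a NaN can be recognized because it compares unequal to every value, including itself. This is a minimal sketch that assumes Python floats as a stand-in for FPU values; the helper name is_nan is hypothetical.

```python
import math

def is_nan(x: float) -> bool:
    """A NaN is the only floating-point value that is not equal to itself."""
    return x != x

nan = float("nan")
print(is_nan(nan), math.isnan(nan))   # the self-comparison and the library check agree
print(is_nan(float("inf")))           # infinity is a separate quantity, not a NaN
```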
Near Pointer
A memory reference that includes an offset only without a segment selector.
Network
A linkage system that allows individual workstations to communicate and share
storage space.
Nibble
4 bits. A half byte.
NMI
Non-Maskable Interrupt. The single NMI input line on the microprocessor that
cannot be masked by the microprocessor.
Offset
A 16-bit or 32-bit number that specifies a memory location relative to the base
address in the segment. The code segment descriptor specifies a default value,
but the programmer can override the default by adding an AddressSize prefix
before the instruction opcode.
Opcode
The numeric representation of an instruction.
Operand
Data in a register or memory that an instruction reads or writes.
OperandSize
An optional prefix placed before an instruction that selects the size of
integer operands: either 8 bits and 16 bits, or 8 bits and 32 bits.
The D bit in the descriptor of the instruction's code segment defines a default
OperandSize, but the prefix overrides that default.
Operating System
A computer program that provides the principal user interface to the microprocessor. Typically, the operating system converts its commands to the instruction
format used by the microprocessor. Application programs can also use the
operating system command set to interact with the microprocessor.
Overflow
Numeric: A floating-point exception that occurs when a result is finite, but is too
large to be represented in the destination format.
Stack: An exception caused by attempting to push down a non-empty stack
location.
Page
A 4-Kbyte block of consecutive memory locations used as the base size by the
system paging mechanism.
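Because a page is 4 Kbytes, the low 12 bits of an address select a byte within the page and the remaining high bits select the page. The following sketch shows that split; the function name split_address is hypothetical and the address is treated as a plain 32-bit integer.

```python
PAGE_SIZE = 4 * 1024   # 4-Kbyte page, as stated in the entry above
OFFSET_BITS = 12       # log2(4096)

def split_address(address):
    """Return (page number, offset within page) for a 32-bit address."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    return address >> OFFSET_BITS, address & (PAGE_SIZE - 1)

page, offset = split_address(0x12345678)
print(hex(page), hex(offset))   # 0x12345 0x678
```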
Paging
A form of memory management used to simulate a large, unsegmented address
space by swapping data between memory and disk storage.
Parallel Interface
An interface that uses a data bus (multiple data lines) to transfer information
instead of a single data line. Typically a parallel interface transfers information in
bytes (8 bits), words (16 bits), or doublewords (32 bits). Newer technology can
implement quadword (64 bits) or larger parallel transfers. Parallel transfers use
one clock to transfer a set of data instead of one bit at a time as in serial transfers,
and therefore provide faster data transfer.
Parallel Port
A hardware connector used to transfer data via a parallel interface. Typically,
computer systems use a 25-pin D-shell connector. Printers that use this interface
typically have a Centronics-type connector.
Parity
A method used to verify the accuracy of stored or transferred data. Data is typically
stored and transferred between devices as bytes. Typical parity schemes use a
ninth bit called the parity bit. In an even parity scheme, if you add the eight bits
and the parity bit, the result is always even if the data is correct. For odd parity,
the result is always odd.
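The even parity scheme described above can be sketched as follows. This is an illustration only; the function names are hypothetical, and real parity generation is done in hardware.

```python
def even_parity_bit(data):
    """Return the ninth bit that makes the total number of 1 bits even."""
    if not 0 <= data <= 0xFF:
        raise ValueError("data must be a single byte")
    return bin(data).count("1") % 2

def check_even_parity(data, parity):
    """True when the eight data bits plus the parity bit sum to an even number."""
    return (bin(data).count("1") + parity) % 2 == 0

byte = 0b10110000                       # three 1 bits
bit = even_parity_bit(byte)             # parity bit must be 1 to make the sum even
print(bit, check_even_parity(byte, bit))
print(check_even_parity(byte ^ 0x01, bit))   # a single flipped bit is detected
```

For odd parity, the stored bit is simply the complement of the value returned by even_parity_bit.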
PCD Bit
Page-Level Cache Disable, Bit 4 in CR3. This bit drives the value of the PCD
output pin on the microprocessor during unpaged bus cycles, such as interrupt
acknowledge, when paging is enabled, and all bus cycles when paging is disabled. The resulting PCD pin output controls caching in an external cache on a
cycle-by-cycle basis.
PCI Bus
A newer expansion bus design that uses a local bus transfer rate, typically using
the microprocessor input clock. It is faster than the conventional ISA bus, and
also requires defined storage space for device configuration information.
PCI Device
A device that conforms to the PCI specification both in the hardware
interface it uses on the PCI bus and in the set of configuration information
stored in the device and accessible by the microprocessor through the PCI space
defined by the specification.
Peripheral Device
Any input, output, or input/output device connected to a personal computer. Typically, this includes a hard drive, diskette drive, display unit, mouse, trackball, or
similar devices.
Physical Address
The address on the local bus.
Physical Memory
The address space on the local bus; hardware implementation of memory.
Pointers
A value that references an address location. See also Far Pointer and Near
Pointer.
Power Good Signal
An output signal provided by most computer power supplies to indicate the status
of the power provided by the supply. Computer designs typically use the change
of state of the Power Good signal that occurs when power is turned on as one
of the inputs to the Reset pin on the microprocessor that initiates processor
initialization.
Power Management
A method of controlling overall power usage in a computer. Interest in power
management techniques grew out of the increased market interest in battery
operated portable computers (laptops, notebooks, subnotebooks, palmtops,
etc.). The need for longer battery life and smaller systems pushed technology to
develop ways to shut down power automatically to devices not being used. As
the possibilities for power management became apparent, the environmental movement recognized that the same principles could be applied to
desktop systems to conserve power on a larger scale. This led to the U.S.
Environmental Protection Agency’s initiative calling for an “Energy Star” program
and the vision of the “green” PC.
Precision
The number of bits used by the FPU as the significand of a real number in the
floating-point format. The FPU can represent a real number using one of three
precision levels: single precision (24 bits), double precision (53 bits), or extended
precision (64 bits).
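The practical effect of the three significand widths can be estimated with a little arithmetic. The sketch below is not taken from the manual; it assumes the usual binary floating-point relationship in which a significand of p bits gives a spacing of 2^-(p-1) just above 1.0 and roughly p x log10(2) decimal digits. The function names are hypothetical.

```python
import math

SIGNIFICAND_BITS = {"single": 24, "double": 53, "extended": 64}

def relative_spacing(bits):
    """Gap between 1.0 and the next representable number: 2**-(bits - 1)."""
    return 2.0 ** -(bits - 1)

def decimal_digits(bits):
    """Approximate number of decimal digits carried by a significand."""
    return bits * math.log10(2)

for name, bits in SIGNIFICAND_BITS.items():
    print(f"{name:9s} {bits:2d} bits  spacing {relative_spacing(bits):.3e}  "
          f"about {decimal_digits(bits):.1f} decimal digits")
```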
Prefix
An optional instruction byte that a programmer can add to the instruction format.
The Am486 processor supports four prefix types: OperandSize, AddressSize,
Segment Override, and Instruction (REP, REPcc, LOCK). The prefixes override the
default settings of a specific instruction.
Present Bit
A bit in the Task Segment Descriptor that indicates whether the segment is
present in memory. The Present Bit allows the system to generate an exception,
store the data currently in the segment location, and restore data from a hard
drive or other storage device, and then execute the requested task.
Privilege Level
A protection value assigned to segments and segment selectors. There are four
privilege levels, ranging from 0 (most privileged) to 3 (least privileged).
Programmable Device
A device for which you can change the interface characteristics through programming. Typically, the driver software allows you to select a specific interrupt (IRQ)
line, DMA channel, and a memory location to shadow the device BIOS. Some
bus designs (such as EISA or PCI) include software configuration tables that
cross-check the device configurations in a system to reduce the possibility of
device conflict and system lockup.
Programmable Register
A register whose bit values control the operation or configuration of a device or
function and that the user can access and program; that is, the user can
select specific bit values in the register.
Programming
A designed sequence of instruction code that performs a desired function.
Protected Mode
An execution mode in which the full 32-bit architecture is available.
Protection
A mechanism used to protect the operating system and application programs
from execution errors. Protection includes defining the types of address available
to a program, the kind of memory references that can be made, and the privilege
level required for access. A violation of these limits causes a general protection
exception.
PWT Bit
Page-Level Writes Transparent, bit 3 of CR3. This bit drives the value of the PWT
output pin on the microprocessor during unpaged bus cycles, such as interrupt
acknowledge, when paging is enabled, and all bus cycles when paging is disabled. The resulting PWT pin output controls write-through in an external cache
on a cycle-by-cycle basis.
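The bit positions given in the PCD Bit and PWT Bit entries (bit 4 and bit 3 of CR3) can be tested with simple masks. The sketch below works on a CR3 value that has already been read into an integer; the constant and function names are hypothetical, and reading CR3 itself requires privileged code that is not shown.

```python
CR3_PWT = 1 << 3   # Page-Level Writes Transparent, bit 3 of CR3
CR3_PCD = 1 << 4   # Page-Level Cache Disable, bit 4 of CR3

def cr3_cache_bits(cr3):
    """Return the PWT and PCD bits of a CR3 value as booleans."""
    return {"PWT": bool(cr3 & CR3_PWT), "PCD": bool(cr3 & CR3_PCD)}

print(cr3_cache_bits(0x00001018))   # both bits set
print(cr3_cache_bits(0x00001008))   # only PWT set
```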
Quadword (qword)
64 bits.
RAM
Random Access Memory. A type of memory that can be organized within an
addressable array so that any memory location can be accessed electronically
by address, rather than by any mechanical access method (such as that provided
by drives). There are a variety of types of this memory including DRAM (Dynamic
RAM), SRAM (Static RAM), and Flash RAM.
Read Only
Typically used to describe a register or a protected memory field that can only
be read. Some read-only registers are not writable because they control critical
operations within a system. Others simply reflect status and do not affect operation. Some read-only registers share an I/O address with a write-only register.
Read/Write
A register or memory field that is both readable and writable.
Read/Write (R/W) Bit
A bit in the page directory entry or the page table entry that indicates the type of
access accorded to the program accessing the pages. This bit is used with the
User/Supervisor bit. If the operation is in User Mode (U/S = 1), only pages belonging to the current user have read/write access (R/W = 1). For all other pages,
the user has read-only access (R/W = 0).
Real Address Mode
An execution mode in which the microprocessor emulates the architecture of an
8086 processor; also called Real or Virtual Mode.
Real Number
Any finite value (negative, positive, or zero) that can be represented by a possibly
infinite numeric expansion.
Reboot
Initialize the system. You can initialize the microprocessor by applying a signal
(such as POWER GOOD) to the microprocessor RESET pin.
Register
A defined set of bits in a microprocessor or other device, or a defined space within
memory. Typically, a register is a set of 8 bits (byte register), 16 bits (word register),
or 32 bits (doubleword register), but it can be any length from 1 bit up. Microprocessor registers are addressed by name (or register code value). Other registers
require an I/O address for accessing. Typically, registers are assigned names for
reference convenience. Individual bit positions and bit sets or fields within the
register may also have names.
Requested Privilege Level (RPL)
The privilege level assigned to a segment selector. If the RPL is less privileged
(numerically greater) than the CPL, access to a segment occurs at the RPL level.
This prevents lower-privilege applications from gaining access to more
privileged software, protecting operating systems and BIOS software.
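Because privilege levels run from 0 (most privileged) to 3 (least privileged), the rule above amounts to taking the numerically larger of the CPL and the RPL. A minimal sketch, with a hypothetical function name:

```python
def effective_privilege(cpl, rpl):
    """Return the privilege level at which a segment access is checked.

    The access is made at the less privileged (numerically larger) of the
    current privilege level and the selector's requested privilege level.
    """
    for level in (cpl, rpl):
        if level not in range(4):
            raise ValueError("privilege levels range from 0 to 3")
    return max(cpl, rpl)

print(effective_privilege(cpl=0, rpl=3))   # 3: a selector passed in by an application
print(effective_privilege(cpl=3, rpl=0))   # 3: the RPL cannot raise privilege
```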
Reset
System Level: The signal input that causes the microprocessor to reinitialize and
go to a known state.
Bit: To force a bit to the 0 level.
Rounding
A numerical operation that converts an extended fraction to a fixed length. Rounding control in the microprocessor's numeric functions allows the user to select a
particular type of rounding control: Up toward +∞, Down toward -∞, or Truncate.
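The three directed rounding types named in the Rounding entry behave differently for negative values. The sketch below illustrates this by rounding to an integer with Python library functions; this is an assumption made for illustration, since the FPU applies the selected mode to its own result formats.

```python
import math

def round_all_ways(value):
    """Round a value to an integer using each directed rounding type."""
    return {
        "up": math.ceil(value),         # toward +infinity
        "down": math.floor(value),      # toward -infinity
        "truncate": math.trunc(value),  # toward zero
    }

print(round_all_ways(2.5))    # {'up': 3, 'down': 2, 'truncate': 2}
print(round_all_ways(-2.5))   # {'up': -2, 'down': -3, 'truncate': -2}
```

Truncation matches rounding down for positive values and rounding up for negative values.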
RTC
Real Time Clock. This circuit supports current day, month, year, day of the week,
hour, minute, and second for display and use by the computer system. Because
the circuit requires battery support to maintain the current time when the system
is turned off, the clock circuits have traditionally been used to store system configuration information required at startup.
SCSI Bus
Small Computer Systems Interface Bus. An external communications system
that allows the interconnection of several types of external devices to a computer
through sets of daisy-chain cables. As many as 256 devices can be interconnected to a single system through daisy-chained SCSI controllers.
SCSI Device
A device that has the built-in circuitry and connectors to attach to a SCSI bus.
Like IDE drives, SCSI hard drives have built-in controller circuitry as well as the
SCSI interface.
Segment
An independent, protected address space. A program can access as many as
16,383 segments, each of which can be as large as 4 Gbytes.
Segment Descriptor
A 64-bit data structure used for segmentation. It includes the segment base
address, its size (limit), its type, and protection information. It is set up by operating
system software and accessed by segmentation hardware.
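The fields named in the Segment Descriptor entry can be unpacked from the 64-bit value as sketched below. The bit positions used here are an assumption drawn from the standard x86 code/data descriptor layout rather than from this entry, and the function name is hypothetical.

```python
def decode_descriptor(descriptor):
    """Unpack base, limit, and protection fields from a 64-bit descriptor."""
    limit = (descriptor & 0xFFFF) | (((descriptor >> 48) & 0xF) << 16)
    base = ((descriptor >> 16) & 0xFFFFFF) | (((descriptor >> 56) & 0xFF) << 24)
    granular = bool((descriptor >> 55) & 1)      # G bit: limit counted in 4-Kbyte units
    return {
        "base": base,
        "limit": limit,
        "type": (descriptor >> 40) & 0xF,
        "dpl": (descriptor >> 45) & 0x3,         # descriptor privilege level
        "present": bool((descriptor >> 47) & 1),
        "size_bytes": (limit + 1) * (4096 if granular else 1),
    }

flat_code = decode_descriptor(0x00CF9A000000FFFF)   # example value chosen for illustration
print(hex(flat_code["base"]), hex(flat_code["limit"]), flat_code["size_bytes"])
```

With a 20-bit limit of FFFFFh and 4-Kbyte granularity, the segment size works out to 4 Gbytes, the maximum cited in the Segment entry.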
Segment Override Prefix
Optional programming code used before an instruction to override the default
segment selection. There are six segment override prefixes, one for each segment register.
Segment Register
One of six special purpose registers: CS, SS, DS, ES, FS, and GS. The registers
store the segment selectors that identify the independent address spaces addressable by a particular program. The CS register defines the address space in
memory for code storage. The SS register defines the stack space, that is, the
area used as temporary storage registers and accessed by the PUSH and POP
instructions. The remaining four registers (DS, ES, FS, and GS) define four independent address spaces to store data for program use.
Segment Selector
A 16-bit number used to specify an address space (segment). Bits 15–3 are the
index into the descriptor table. Bit 2 specifies whether to use the GDT or an LDT.
Bits 1–0 define the RPL as an additional protection check.
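The three fields of a segment selector can be separated with shifts and masks, as sketched below. The function name is hypothetical; the sketch also assumes the standard x86 convention that bit 2 = 0 selects the GDT and bit 2 = 1 selects the current LDT, which the entry does not spell out.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into index, table, and RPL."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    return {
        "index": selector >> 3,                        # bits 15-3
        "table": "LDT" if selector & 0x4 else "GDT",   # bit 2
        "rpl": selector & 0x3,                         # bits 1-0
    }

print(decode_selector(0x001B))   # {'index': 3, 'table': 'GDT', 'rpl': 3}
print(decode_selector(0x000F))   # {'index': 1, 'table': 'LDT', 'rpl': 3}
```

The 13-bit index addresses 8,192 entries per table; two tables less the unusable first GDT entry gives the 16,383 segments cited in the Segment entry.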
Segmentation
A form of memory management that provides multiple, independent, protected
address spaces. Segmentation allows you to define as many as 16,383 segments,
each of which can be as large as 4 Gbytes.
Serial Interface
An interface that uses a single data line. A serial interface transfers one bit at a
time.
Serial Port
A hardware connector used to transfer data via a serial interface. Typically, computer systems use a 25-pin or 9-pin D-shell connector, but can also use an RJ11 telephone connector.
SETcc
A conditional set byte command. If the condition is met, the specified byte is set
to a value of 1. If the condition is not met, the specified byte is 0.
Setup
A program that is typically part of BIOS that allows the user to program the system
configuration information stored in the CMOS in the RTC.
Shadow Register
A device register that contains the same information as another, typically write-only, register. For some power management solutions, designs may include shadow registers to save and restore the contents of write-only registers when normal
operation resumes after a system has been in one of the reduced-power modes.
Shadowing
Copying information from one source to another. In older personal computer
systems, accessing BIOS or other firmware programs from the static memory
devices was slower than accessing data from RAM. One solution was to “shadow”
the ROM contents into RAM for more efficient computer operation. Shadowing
is now also used to preserve the contents of write-only registers; see Shadow
Register.
s-i-b Byte
A byte following an instruction opcode and modR/M bytes that specifies a scale
factor, index, and base register.
SIMM
Single In-Line Memory Module. Sets of memory chips mounted on an easily
installed circuit board for quick access and servicing.
SMM
System Management Mode. A special operational mode accessible only by BIOS
and other firmware that is used to develop power management or security support
systems for the computer.
SRAM
Static Random Access Memory. Memory that can be accessed and programmed
in a computer system and that, unlike DRAM, does not require periodic refresh to
maintain the last value written to the memory while power is applied.
Stack
A set of consecutive memory locations used as scratch space by application and
other programs. The FPU has an internal stack consisting of eight 80-bit registers.
Stack Fault
A special case of the invalid-operand exception indicated by the SF bit in the FPU
Status Word. It is usually caused by a stack underflow or overflow.
Stack Segment
A memory segment used to hold a stack. Only one stack segment is available to
the microprocessor at a time, the segment whose descriptor is currently in the
SS register. The segment descriptor defines the segment.
Status Word
A 16-bit FPU register indicating the current FPU status. It contains the condition
codes, the FPU stack pointer, busy and interrupt bits, and exception flags.
String
A sequence of bytes, words, or doublewords that may start at any byte address
in memory.
Tag Word
A 16-bit FPU register that indicates, for each register in the FPU stack, whether
that register holds a number and what type of number it is.
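With eight stack registers and a 16-bit Tag Word, each register has a 2-bit tag. The sketch below decodes those tags. The encoding used (00 valid, 01 zero, 10 special, 11 empty) and the placement of register 0 in the two low-order bits are assumptions taken from the standard x87 architecture, not from this entry; the names are hypothetical.

```python
TAG_NAMES = ("valid", "zero", "special", "empty")   # 2-bit tag values 0 through 3

def decode_tag_word(tag_word):
    """Return the tag name for each of the eight FPU registers, register 0 first."""
    if not 0 <= tag_word <= 0xFFFF:
        raise ValueError("the Tag Word is a 16-bit value")
    return [TAG_NAMES[(tag_word >> (2 * reg)) & 0x3] for reg in range(8)]

print(decode_tag_word(0xFFFF))   # every register empty
print(decode_tag_word(0xFFFC))   # register 0 holds a valid number
```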
Task Register (TR)
A register that holds the segment selector for the current task. The selector
references a task state segment (TSS). Like the segment registers, the TR has
a visible part and an invisible part. The visible part holds the segment selector;
the invisible part holds information cached from the segment descriptor for the
TSS.
Task State Segment (TSS)
A segment that stores the processor state during a task switch. If a separate I/O
address space is used, the TSS holds permission bits that control access to the
I/O space.
Task Switch
A transfer of execution between tasks. The TSS saves most of the processor
state.
Task
A program running (executing).
Test Register
One of five registers that provide test and status information of the internal cache
functionality of the Am486 microprocessor.
TOP
The 3-bit field in the FPU Status Word that indicates which FPU register is at the
top of the stack.
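The TOP field can be extracted from a stored Status Word with a shift and a mask. The sketch below assumes the standard x87 layout, in which TOP occupies bits 13 through 11; this entry states only that the field is 3 bits wide. The function name is hypothetical.

```python
TOP_SHIFT = 11     # assumed position of the low bit of TOP
TOP_MASK = 0x7     # 3-bit field

def fpu_top(status_word):
    """Return the number of the FPU register currently at the top of the stack."""
    if not 0 <= status_word <= 0xFFFF:
        raise ValueError("the Status Word is a 16-bit value")
    return (status_word >> TOP_SHIFT) & TOP_MASK

print(fpu_top(0x3800))   # 7
print(fpu_top(0x1800))   # 3
```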
Translation Lookaside Buffer (TLB)
The on-chip cache for page table entries.
Trap
An exception that is reported at the instruction boundary immediately following
the instruction that generated the exception.
Trap Gate
A gate descriptor used to invoke an exception handler. A trap gate is different
from an interrupt gate only in its effect on the IF flag. Unlike an interrupt gate,
which clears the flag for the duration of the handler, a trap gate leaves the flag
unchanged.
Type Checking
One of the five types of protection provided by segmentation. The TYPE field in
the system segment descriptor defines the capabilities allowed to a particular
program and generates exceptions if the program attempts to perform executions
not allowed by the specified TYPE information.
Underflow
Numeric: An exception condition in which the correct answer is not zero, but is
so small that it cannot be represented as a real number by the floating-point unit.
Stack: An exception caused by trying to read an operand from an empty stack
location.
User/Supervisor (U/S) Bit
A bit in the page directory entry or the page table entry that indicates the level of
privilege accorded to the program accessing the pages. An application with a
CPL = 3 is assigned a user status (U/S = 1). Any other CPL value is considered
supervisory (U/S = 0).
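The U/S bit and the R/W bit described in this glossary can be read from a page directory or page table entry as sketched below. The bit positions (Present = bit 0, R/W = bit 1, U/S = bit 2) are an assumption based on the standard x86 entry format, and the function name is hypothetical. The sketch decodes a single entry only; it does not model how the processor combines the directory and table entries.

```python
ENTRY_PRESENT = 1 << 0   # assumed position of the Present bit
ENTRY_RW = 1 << 1        # assumed position of the R/W bit
ENTRY_US = 1 << 2        # assumed position of the U/S bit

def decode_page_entry(entry):
    """Report the access-related bits of one page directory or page table entry."""
    return {
        "present": bool(entry & ENTRY_PRESENT),
        "access": "read/write" if entry & ENTRY_RW else "read-only",
        "level": "user" if entry & ENTRY_US else "supervisor",
    }

print(decode_page_entry(0x00001007))   # present, read/write, user
print(decode_page_entry(0x00001001))   # present, read-only, supervisor
```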
Vcc
The typical notation used to indicate the input dc power for computer logical
operations. Typically, the voltage is in the range from +3 to +5 volts.
Vector
A number used by the microprocessor to access interrupt and exception handlers.
VESA
Video Electronics Standards Association. Video and local bus standards codified
by the association to support development of compatible video components industry-wide.
VGA
Video Graphics Array. A video controller that supports high resolution color graphic displays. Actual support varies depending on the type and size of the RAMDAC
used in the controller and the amount of video memory available.
Video Controller
The electronic circuits that provide the interface between a personal computer
system and a display unit (CRT, LCD or other display type).
Virtual Mode
Similar to Real Mode, the processor emulates the 8086 processor architecture.
VL Bus
One of several types of Local Bus design protocols that allow a device to transfer
data and code to and from the microprocessor at the same speed as the processor
input clock. The clocking rate (typically 25 MHz and higher) is faster than transfer
clocking used by the older ISA or EISA bus standard (4.77 MHz to 12 MHz;
typically 8 MHz). See also VESA.
Vss
The typical notation used to indicate a ground bus pinout in a chip.
Word
16 bits.
Wait States
Programmed or hardware-determined delays that allow slow transfer processes
to finish their operation. This is typically used with slow hard drives and DMA
controllers. Most new microprocessors support Zero Wait State operation, that
is, operation without induced processor delay states.
Write Back
A form of caching in which memory writes load only the cache memory. Data is
transferred to the system memory before a memory read accesses the location.
Write Only
A register in a microprocessor or controller that can be loaded with a write command, but which cannot be read by a read command. Typically, this register
shares an I/O address location with a read-only register.
Write Through
A form of caching in which memory writes load both the cache and system
memory.
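The difference between the Write Back and Write Through policies can be shown with a toy model. This is a deliberately simplified sketch, not a description of the Am486 cache: the class, its dictionary-based storage, and the explicit flush step are all hypothetical.

```python
class ToyCache:
    """A one-level cache model that tracks when system memory is updated."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.lines = {}      # address -> value held in the cache
        self.dirty = set()   # addresses written to the cache but not yet to memory
        self.memory = {}     # stands in for system memory

    def write(self, address, value):
        self.lines[address] = value
        if self.write_back:
            self.dirty.add(address)        # write back: memory is updated later
        else:
            self.memory[address] = value   # write through: memory is updated now

    def flush(self, address):
        """Copy a dirty line to system memory, as must happen before memory is read."""
        if address in self.dirty:
            self.memory[address] = self.lines[address]
            self.dirty.discard(address)

wt = ToyCache(write_back=False)
wt.write(0x100, 7)
print(0x100 in wt.memory)    # True: the write reached system memory immediately

wb = ToyCache(write_back=True)
wb.write(0x100, 7)
print(0x100 in wb.memory)    # False: only the cache holds the value so far
wb.flush(0x100)
print(wb.memory[0x100])      # 7
```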
ASCII Codes (based on ANSI X3.4-1968)

Hex  Character               Hex  Character  Hex  Character  Hex  Character
     (Control Kybd Equiv.)
00   NUL (@)                 20   SP         40   @          60   `
01   SOH (A)                 21   !          41   A          61   a
02   STX (B)                 22   "          42   B          62   b
03   ETX (C)                 23   #          43   C          63   c
04   EOT (D)                 24   $          44   D          64   d
05   ENQ (E)                 25   %          45   E          65   e
06   ACK (F)                 26   &          46   F          66   f
07   BEL (G)                 27   '          47   G          67   g
08   BS (H)                  28   (          48   H          68   h
09   HT (I)                  29   )          49   I          69   i
0A   LF (J)                  2A   *          4A   J          6A   j
0B   VT (K)                  2B   +          4B   K          6B   k
0C   FF (L)                  2C   ,          4C   L          6C   l
0D   CR (M)                  2D   -          4D   M          6D   m
0E   SO (N)                  2E   .          4E   N          6E   n
0F   SI (O)                  2F   /          4F   O          6F   o
10   DLE (P)                 30   0          50   P          70   p
11   DC1 (Q)                 31   1          51   Q          71   q
12   DC2 (R)                 32   2          52   R          72   r
13   DC3 (S)                 33   3          53   S          73   s
14   DC4 (T)                 34   4          54   T          74   t
15   NAK (U)                 35   5          55   U          75   u
16   SYN (V)                 36   6          56   V          76   v
17   ETB (W)                 37   7          57   W          77   w
18   CAN (X)                 38   8          58   X          78   x
19   EM (Y)                  39   9          59   Y          79   y
1A   SUB (Z)                 3A   :          5A   Z          7A   z
1B   ESC ([)                 3B   ;          5B   [          7B   {
1C   FS (\)                  3C   <          5C   \          7C   |
1D   GS (])                  3D   =          5D   ]          7D   }
1E   RS (^)                  3E   >          5E   ^          7E   ~
1F   US (_)                  3F   ?          5F   _          7F   DEL
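Two regularities in the table can be expressed as bit operations: each control code equals its keyboard equivalent (the character in parentheses) with the upper bits cleared, and an uppercase letter differs from its lowercase form only in the bit with value 20h. The sketch below illustrates both; the function names are hypothetical.

```python
def control_code(key):
    """Return the control code produced by Ctrl plus a key in the range '@' to '_'."""
    code = ord(key)
    if not 0x40 <= code <= 0x5F:
        raise ValueError("key must be in the range '@' through '_'")
    return code & 0x1F          # keep the low five bits

def toggle_case(letter):
    """Switch an ASCII letter between uppercase and lowercase."""
    if not letter.isascii() or not letter.isalpha():
        raise ValueError("an ASCII letter is required")
    return chr(ord(letter) ^ 0x20)

print(hex(control_code("A")))   # 0x1 (SOH)
print(hex(control_code("[")))   # 0x1b (ESC)
print(toggle_case("a"), toggle_case("Z"))
```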
